
QuantizeLinear and DequantizeLinear nodes for Conv2D get folded as constants. #1719

@srihari-humbarwadi

Description


Describe the bug
I have taken these representations from the TensorRT QAT presentation.

Figure(1): [image]
As shown in Figure(1) above, I added QuantizeAndDequantizeV2 nodes before the conv2d op and on the conv2d kernel in my model.
However, after converting the model with tf2onnx, the QuantizeLinear and DequantizeLinear nodes for the conv2d kernel are gone. As shown in Figure(2) below, I expected tf2onnx to keep them rather than fold them into constants.
Figure(2): [image]

Figure(3): [image]
TensorBoard visualization of my model, showing QuantizeAndDequantizeV2 ops for both the input and the weights. A sketch of the kind of layer that produces this graph is shown below.
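For reference, a minimal sketch of such a layer (the class name, the fixed activation range, and the per-channel kernel ranges are illustrative placeholders, not my exact code):

```python
import tensorflow as tf

class QDQConv2D(tf.keras.layers.Conv2D):
    """Conv2D with QuantizeAndDequantizeV2 on both the input and the kernel.

    Illustrative sketch only; ranges are placeholders, not calibrated values.
    """

    def call(self, inputs):
        # Fake-quantize the activations (per-tensor, 8-bit, fixed range).
        x = tf.quantization.quantize_and_dequantize_v2(
            inputs, input_min=-6.0, input_max=6.0,
            num_bits=8, range_given=True)
        # Fake-quantize the kernel (per-channel over the output-channel axis).
        kernel = tf.quantization.quantize_and_dequantize_v2(
            self.kernel,
            input_min=tf.reduce_min(self.kernel, axis=[0, 1, 2]),
            input_max=tf.reduce_max(self.kernel, axis=[0, 1, 2]),
            num_bits=8, range_given=True, axis=3)
        outputs = tf.nn.conv2d(x, kernel,
                               strides=list(self.strides),
                               padding=self.padding.upper())
        if self.use_bias:
            outputs = tf.nn.bias_add(outputs, self.bias)
        return outputs
```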

Figure(4): [image]
Netron visualization of the ONNX model, showing QuantizeLinear and DequantizeLinear nodes only for the conv2d input.

Folding the QuantizeLinear and DequantizeLinear nodes for the weights into a constant seems logical, as described in #1394. But as shown in Figure(5) below, TensorRT requires the QuantizeLinear and DequantizeLinear nodes for both the conv2d input and the weights!

Figure(5): [image]

Urgency
Blocked use case: TensorFlow 2.x QAT -> ONNX -> TensorRT.

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04
  • TensorFlow version: 2.6
  • Python version: 3.8

To Reproduce
Adding QuantizeAndDequantizeV2 ops to TF 2.x graphs is not trivial, so it is difficult for me to include all of the code here. A rough sketch of a minimal repro is below.
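This is an approximation of my setup, not the actual training code; it reuses the illustrative QDQConv2D layer sketched above and converts the model with tf2onnx:

```python
import tensorflow as tf
import tf2onnx

# Build a toy model with QDQ on both the conv input and the kernel,
# using the illustrative QDQConv2D layer from the sketch above.
inp = tf.keras.Input(shape=(32, 32, 3))
out = QDQConv2D(filters=16, kernel_size=3, padding="same")(inp)
model = tf.keras.Model(inp, out)

# Convert to ONNX. Opset >= 10 is required for
# QuantizeLinear/DequantizeLinear; 13 is used here.
model_proto, _ = tf2onnx.convert.from_keras(
    model, opset=13, output_path="qdq_model.onnx")
```

If the bug reproduces, opening qdq_model.onnx in Netron should show the QuantizeLinear/DequantizeLinear pair only on the conv input, as in Figure(4).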
