Describe the bug
I have taken these representations from the TensorRT QAT presentation.
Figure(1)
As shown in Figure(1) above, I added `QuantizeAndDequantizeV2` nodes before the `conv2d` op and for the `conv2d` kernel in my model.
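Roughly, the insertion looks like the sketch below (a minimal illustration only: the `QDQConv2D` name and the fixed ranges are placeholders for my actual setup, which calibrates the ranges, and depending on the TF version the recorded graph op may be `QuantizeAndDequantizeV2` or a newer variant):

```python
import tensorflow as tf

class QDQConv2D(tf.keras.layers.Layer):
    """Conv2D with QDQ (fake-quantization) on both the input and the kernel."""

    def __init__(self, filters, kernel_size, **kwargs):
        super().__init__(**kwargs)
        self.filters = filters
        self.kernel_size = kernel_size

    def build(self, input_shape):
        self.kernel = self.add_weight(
            name="kernel",
            shape=(self.kernel_size, self.kernel_size,
                   input_shape[-1], self.filters))

    def call(self, x):
        # QDQ on the conv input (per-tensor; fixed range for illustration).
        x = tf.quantization.quantize_and_dequantize_v2(
            x, input_min=-1.0, input_max=1.0, range_given=True)
        # QDQ on the conv kernel (per-channel over the output-channel axis).
        w = tf.quantization.quantize_and_dequantize_v2(
            self.kernel,
            input_min=[-1.0] * self.filters,
            input_max=[1.0] * self.filters,
            range_given=True, axis=3)
        return tf.nn.conv2d(x, w, strides=1, padding="SAME")
```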
But after converting it with tf2onnx, I cannot find the `QuantizeLinear` and `DequantizeLinear` nodes for the `conv2d` kernel. As shown in Figure(2) below, I was expecting tf2onnx to keep them rather than fold them into constants.
Figure(2)
Figure(3)
TensorBoard visualization of my model, showing `QuantizeAndDequantizeV2` ops for both input and weights.
Figure(4)
Netron visualization of the ONNX model, showing `QuantizeLinear` and `DequantizeLinear` nodes only for the `conv2d` input.
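For reference, the conversion was a standard tf2onnx invocation along these lines (the path and opset here are illustrative):

```
python -m tf2onnx.convert --saved-model saved_model_dir --opset 13 --output model.onnx
```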
It seems logical to fold the `QuantizeLinear` and `DequantizeLinear` nodes for weights into a constant, as described in #1394. But looking at Figure(5) below, it appears that TensorRT requires the `QuantizeLinear` and `DequantizeLinear` nodes for both the `conv2d` input and the weights!
Urgency
Blocked use case: TensorFlow 2.x QAT -> ONNX -> TensorRT.
System information
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04
- TensorFlow version: 2.6
- Python version: 3.8
To Reproduce
Adding `QuantizeAndDequantizeV2` nodes into TF 2.x graphs is not trivial, so it is difficult to include all of the code here; the sketch below outlines the approach.
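A minimal end-to-end sketch (illustrative only, reusing the `QDQConv2D` layer sketched above; my real model and ranges differ):

```python
import tensorflow as tf
import tf2onnx

# Tiny model wrapping the QDQConv2D layer sketched above.
inputs = tf.keras.Input(shape=(32, 32, 3))
outputs = QDQConv2D(filters=8, kernel_size=3)(inputs)
model = tf.keras.Model(inputs, outputs)

# Convert the Keras model directly to ONNX.
spec = (tf.TensorSpec((None, 32, 32, 3), tf.float32, name="input"),)
onnx_model, _ = tf2onnx.convert.from_keras(
    model, input_signature=spec, opset=13, output_path="model.onnx")
```

With my actual model, inspecting the result in Netron gives Figure(4): the input QDQ survives, but the kernel's QDQ pair is folded into a constant.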