
Inference Layer by Layer or feature extraction #6025

@IzanCatalan

Hi everyone, I would like to know whether it is possible to run layer-by-layer inference on a pre-trained model (in fp32 or int8) on a GPU with CUDA 11.2.

My idea is to take several fp32 and int8-quantized models from the ONNX Model Zoo repo and run inference layer by layer to perform feature extraction. I would then modify each layer's output and feed it as the input to the following layer, with the last layer's output equal to the output of the original model.

The approximate code would look something like this:

import numpy as np
import onnxruntime as ort

model_path = "model.onnx"
ort_session = ort.InferenceSession(model_path)

input_data = np.random.randn(1, 3, 32, 32).astype(np.float32)

# Run the first layer ('input1' and 'input2' are placeholder names;
# a real model only exposes its actual graph inputs)
conv1_output = ort_session.run(None, {'input1': input_data})[0]

# Feed the (possibly modified) intermediate output to the next layer
conv2_output = ort_session.run(None, {'input2': conv1_output})[0]
# Now I can work with intermediate outputs, modify them, and use them as new inputs

However, I tried to reproduce this with a ResNet50 pre-trained model from the ONNX Model Zoo repo, but it seems this model, like the rest of the pre-trained models, has only one input and one output, so there is no way to access intermediate outputs.
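
One idea I had is to promote every intermediate tensor to a graph output by editing the model, so that a single InferenceSession.run call returns all the activations. This is only a rough, untested sketch (I am not sure it is valid for the int8-quantized models, and the tensor names are model-dependent):

import onnx

model = onnx.load("model.onnx")
# Promote each node's output tensor to a graph output so that
# InferenceSession also returns the intermediate activations
for node in model.graph.node:
    for tensor_name in node.output:
        model.graph.output.extend([onnx.ValueInfoProto(name=tensor_name)])
onnx.save(model, "model_all_outputs.onnx")

To feed a modified activation back into the rest of the network, I believe onnx.utils.extract_model can split the model at a given tensor name, but I have not verified that on quantized models either.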

So, is there any way I could do this? I have seen the Evaluation Step by Step documentation. However, I am unsure whether ReferenceEvaluator also works for pre-trained/quantized models and, more importantly, whether it can be used to measure accuracy on a dataset like ImageNet.
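
For reference, this is how I understand ReferenceEvaluator would be used to inspect intermediate results (assuming the model's input tensor is named "data"; the verbose flag prints per-node results):

from onnx.reference import ReferenceEvaluator
import numpy as np

# verbose=2 prints each intermediate result as the graph is evaluated
sess = ReferenceEvaluator("model.onnx", verbose=2)
x = np.random.randn(1, 3, 224, 224).astype(np.float32)
outputs = sess.run(None, {"data": x})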

Thank you!
