Skip to content

Conversation

jcwchen
Copy link
Member

@jcwchen jcwchen commented Nov 19, 2021

Description

  • By default, test generation should produce test models with since_version
  • Enable test generation to specify opset_version if needed
  • Add test generation in Windows-CI, Linux-CI and Mac-CI to verify consistent node models
  • Update some tests to use make_model_gen_version with specified opset_version (e.g., simple tests)

P.S. since_version is the version of the operator set that this operator was initially declared in.

More details about changed behaviors please see the comment here.

Motivation and Context
Closes #2629 Currently test generation will always produce models with the latest opset_version. However, if we want to test "test generation" in CIs with the same situation when they were uploaded, test generation should be always consistent regardless of which platform. To achieve that, make_model_gen_version with specified since_version is needed.

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>
Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>
Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>
Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>
@jcwchen jcwchen requested a review from a team as a code owner November 19, 2021 16:39
@jcwchen jcwchen added the run release CIs Use this label to trigger release tests in CI label Nov 19, 2021
Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>
Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>
Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>
@jcwchen jcwchen changed the title Make test generation produced models with since_version Make test/node generation produce models with since_version by default Nov 19, 2021
if 'opset_imports' not in kwargs:
# To make sure the model will be produced with the same opset_version after opset changes
# By default, it uses since_version as opset_version for produced models
since_version = onnx.defs.get_schema(node.op_type, node.domain).since_version
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use max_inclusive_version for fetching schema to make sure we fetch the right schema for tests IF they target an older opset. Today none of the tests do but it is best if we add this check.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I have updated get_schema with max_inclusive_version and make target opset_version as use_max_opset_version, which is more accurate. It takes some time for to inspect whether there is any updated model. Will update them later.

Copy link
Collaborator

@askhade askhade Dec 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It takes some time for to inspect whether there is any updated model.

If you run your script and do a git diff that should tell you right?

Copy link
Member Author

@jcwchen jcwchen Dec 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There might be other changes due to system difference... To be careful, I will go through all of them (perhaps write a simple script to detect it).

Copy link
Member Author

@jcwchen jcwchen Dec 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eventually I wrote a script for only updating models/output.pb if their opset_version have been changed by this PR. However, there are 566 models which need to be updated (864 models in total). This PR will impact projects which are relying on backend test data.

One of the reasons why so many models are affected is -- typically an opset bump only involves some conditions and mostly only they were updated. Then the other models of other conditions would still use an old opset_version.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in that case we can change the test to include max_inclusive_version parameter to restrict the model to older opset

Copy link
Collaborator

@askhade askhade Dec 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

going forward this should not be a problem... we should ask the authors to update all the models for the changing op... infact it will be gated by the git diff check in the PR.
So basically the author will have 2 options - update the test to include max_inclusive_version or simply update all the test data.

askhade
askhade previously requested changes Dec 8, 2021
Copy link
Collaborator

@askhade askhade left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added comments

…e targeted opset version

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>
Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>
Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>
Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>
@jcwchen jcwchen force-pushed the jcw/generate-since branch from 4593922 to 7a8d7a7 Compare January 7, 2022 00:45
@@ -1 +1 @@
{"atol": 1e-07, "model_name": "zfnet512", "rtol": 0.001, "url": "https://s3.amazonaws.com/download.onnx/models/opset_9/zfnet512.tar.gz"}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the specified opset version here:

base_model_opset_version = 10
, updating opset_10 here should be correct. I was surprise that these haven't been updated for a while. Since now we do test all models from ONNX model Zoo weekly, perhaps we can consider to remove/refactor these old models.

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>
@jcwchen
Copy link
Member Author

jcwchen commented Jan 29, 2022

This PR will forward after 1.11 release since it involves too many files and makes a fundamental change about data generation.

Here is the summary that data generation behavior will be changed as follows:

Previous

  1. python onnx/backend/test/cmd_tools.py generate-data always produces node models with the "latest" opset version. There is no way to generate a model with older opset version by newer ONNX.
  2. Because of 1., newer ONNX always updates many unrelated models by bumping their opset version to the latest one (although they are not updated)
  3. Because of 1., newer ONNX cannot created models with older opset version. Users need to get older models by older ONNX package.
  4. Many old models were not updated when its opset version has bumped.

New

  1. By default, python onnx/backend/test/cmd_tools.py generate-data will now produce the node models with "since_version".
  2. Users can specify a desired opset version for generate-data. It will find the latest version ("since_version") before desired opset version.
  3. Due to 2., now generate-data is able to create a whole set of node models with an older opset version.
  4. Developers can specify older opset_version by giving opset_imports while writing a node test. It will use since_version based on specified opset_version.
  5. Node models are all updated by their opset's "since_version".
  6. Due to 5., now CI should be able to check the consistency of all generated data.

@jcwchen
Copy link
Member Author

jcwchen commented May 10, 2022

Update: I am trying to test the consistency of test generation in CIs. It seems that only Windows CI can have exactly the same node tests as this PR (I updated these node test models on Windows). I suspect there is an issue like this one: #3120. Still investigating.

jcwchen added 8 commits May 17, 2022 16:14
Signed-off-by: jcwchen <jacky82226@gmail.com>
Signed-off-by: jcwchen <jacky82226@gmail.com>
Signed-off-by: jcwchen <jacky82226@gmail.com>
Signed-off-by: jcwchen <jacky82226@gmail.com>
Signed-off-by: jcwchen <jacky82226@gmail.com>
Signed-off-by: jcwchen <jacky82226@gmail.com>
Signed-off-by: jcwchen <jacky82226@gmail.com>
Signed-off-by: jcwchen <jacky82226@gmail.com>
@jcwchen
Copy link
Member Author

jcwchen commented May 18, 2022

Update: I am trying to test the consistency of test generation in CIs. It seems that only Windows CI can have exactly the same node tests as this PR (I updated these node test models on Windows). I suspect there is an issue like this one: #3120. Still investigating.

This issue exists because NumPy functions might behave differently on different platforms. Therefore skip all output.pb and some input.pb for now. I have added test generation in Windows CI, Linux CI and Mac CI in this PR. At least all ONNX models produced by test generation are consistent.

jcwchen added 3 commits May 18, 2022 17:05
Signed-off-by: jcwchen <jacky82226@gmail.com>
Signed-off-by: jcwchen <jacky82226@gmail.com>
Signed-off-by: jcwchen <jacky82226@gmail.com>
@jcwchen
Copy link
Member Author

jcwchen commented May 19, 2022

One last update: I tried to bump the latest opset version and see whether any test has updated. Then I figured out simple tests still used make_model without specifying opset version so it will keep changing if the latest opset version bumps. To solve this, I further use make_mode_gen_version and specify opset version to make this test generation consistent.

@askhade
Copy link
Collaborator

askhade commented May 20, 2022

overall the PR looks good to me. I suggest using this branch in ort CIs and make sure all pipelines are green.

Signed-off-by: jcwchen <jacky82226@gmail.com>
@jcwchen
Copy link
Member Author

jcwchen commented May 20, 2022

Thanks for the review! I have verified these updated models once in ORT CIs and there isn't any related failure: microsoft/onnxruntime#11214

@jcwchen jcwchen enabled auto-merge (squash) May 20, 2022 17:52
@jcwchen jcwchen merged commit acaa175 into onnx:main May 20, 2022
@jcwchen jcwchen deleted the jcw/generate-since branch May 20, 2022 18:11
justinchuby pushed a commit to justinchuby/onnx that referenced this pull request May 20, 2022
onnx#3855)

* add since_version for data generation

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* specify opset_version

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* int str opset_version

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* fix typecheck

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* flake 8

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* do not set opset_imports if it has been specified

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* fix typecheck

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* use_max_opset_version for the latest opset vesion that supports before targeted opset version

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* fix bug use_max_opset_version

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* update 566 models

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* apply max_inclusive_version to opset_imports

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* fix flake8

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* revert onehot's input_1.pb change

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* update onehot because 0d update

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* update helper and tools for older ir_version

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* fix helper

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* use IR_VERSION according to the opset_version

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* fix flake8

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* if unknown newer opset_version, use default ir_version

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* fix undefined opset_version globally

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* use make_model_gen_version

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* remove useless '' handling

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* handle newer opset_version for undefined IR_VERSION

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* use _make_test_model_gen_version to handle unrelease opset version

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* traverse opset_imports correctly

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* ignore reasonable unpack case

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* directly use specified opset version; try CI

Signed-off-by: jcwchen <jacky82226@gmail.com>

* test Windows CI as well

Signed-off-by: jcwchen <jacky82226@gmail.com>

* update layer norm by Windows

Signed-off-by: jcwchen <jacky82226@gmail.com>

* .astype(np.float32) for acosh

Signed-off-by: jcwchen <jacky82226@gmail.com>

* update operator.doc

Signed-off-by: jcwchen <jacky82226@gmail.com>

* test_coverage.md

Signed-off-by: jcwchen <jacky82226@gmail.com>

* float64 instead of float32

Signed-off-by: jcwchen <jacky82226@gmail.com>

* int64 for negativeloss and apply Mac CI, exclude .output.pb

Signed-off-by: jcwchen <jacky82226@gmail.com>

* remove target version for now

Signed-off-by: jcwchen <jacky82226@gmail.com>

* use ! instead of exclude on Mac

Signed-off-by: jcwchen <jacky82226@gmail.com>

* skip test_log's input due to numpy.random

Signed-off-by: jcwchen <jacky82226@gmail.com>

* specify opset_imports for simple tests for consistency

Signed-off-by: jcwchen <jacky82226@gmail.com>

* make_model_gen_version

Signed-off-by: jcwchen <jacky82226@gmail.com>

* update operators and testcoverage

Signed-off-by: jcwchen <jacky82226@gmail.com>

* nit: better error messages

Signed-off-by: jcwchen <jacky82226@gmail.com>
Signed-off-by: Justin Chu <justinchuby@users.noreply.github.com>
broune pushed a commit to broune/onnx that referenced this pull request May 6, 2023
onnx#3855)

* add since_version for data generation

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* specify opset_version

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* int str opset_version

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* fix typecheck

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* flake 8

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* do not set opset_imports if it has been specified

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* fix typecheck

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* use_max_opset_version for the latest opset vesion that supports before targeted opset version

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* fix bug use_max_opset_version

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* update 566 models

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* apply max_inclusive_version to opset_imports

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* fix flake8

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* revert onehot's input_1.pb change

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* update onehot because 0d update

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* update helper and tools for older ir_version

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* fix helper

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* use IR_VERSION according to the opset_version

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* fix flake8

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* if unknown newer opset_version, use default ir_version

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* fix undefined opset_version globally

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* use make_model_gen_version

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* remove useless '' handling

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* handle newer opset_version for undefined IR_VERSION

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* use _make_test_model_gen_version to handle unrelease opset version

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* traverse opset_imports correctly

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* ignore reasonable unpack case

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* directly use specified opset version; try CI

Signed-off-by: jcwchen <jacky82226@gmail.com>

* test Windows CI as well

Signed-off-by: jcwchen <jacky82226@gmail.com>

* update layer norm by Windows

Signed-off-by: jcwchen <jacky82226@gmail.com>

* .astype(np.float32) for acosh

Signed-off-by: jcwchen <jacky82226@gmail.com>

* update operator.doc

Signed-off-by: jcwchen <jacky82226@gmail.com>

* test_coverage.md

Signed-off-by: jcwchen <jacky82226@gmail.com>

* float64 instead of float32

Signed-off-by: jcwchen <jacky82226@gmail.com>

* int64 for negativeloss and apply Mac CI, exclude .output.pb

Signed-off-by: jcwchen <jacky82226@gmail.com>

* remove target version for now

Signed-off-by: jcwchen <jacky82226@gmail.com>

* use ! instead of exclude on Mac

Signed-off-by: jcwchen <jacky82226@gmail.com>

* skip test_log's input due to numpy.random

Signed-off-by: jcwchen <jacky82226@gmail.com>

* specify opset_imports for simple tests for consistency

Signed-off-by: jcwchen <jacky82226@gmail.com>

* make_model_gen_version

Signed-off-by: jcwchen <jacky82226@gmail.com>

* update operators and testcoverage

Signed-off-by: jcwchen <jacky82226@gmail.com>

* nit: better error messages

Signed-off-by: jcwchen <jacky82226@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
run release CIs Use this label to trigger release tests in CI topic: test
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

[CI] Check backend tests have been generated
3 participants