gs-olive
Collaborator

@gs-olive gs-olive commented Jun 1, 2024

Description

  • Generative inference with Hugging Face (HF) text-generation models such as gpt2 can fail when graph segmentation causes a symbolic integer to be passed from a Torch segment to a TRT segment: the Torch output is an integer, while TRT expects a tensor
  • Added logic to the modules to address this case
  • Added test cases to validate generation with both Python and C++ runtimes
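
As a minimal sketch of the idea behind the fix (not the code in this PR): before a TRT segment runs, any bare integer handed over from a Torch segment can be wrapped in a tensor. The helper name `coerce_scalar_inputs` and the injected `to_tensor` callable are hypothetical, introduced only so the sketch runs without TensorRT installed.

```python
from typing import Any, Callable, List, Sequence


def coerce_scalar_inputs(
    inputs: Sequence[Any], to_tensor: Callable[[int], Any]
) -> List[Any]:
    """Return a copy of `inputs` with bare Python ints wrapped as tensors.

    `to_tensor` is an assumption of this sketch: any callable that turns an
    int into the tensor type the downstream module expects.
    """
    coerced = []
    for value in inputs:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, int) and not isinstance(value, bool):
            coerced.append(to_tensor(value))
        else:
            # Anything that is not a bare int is passed through unchanged.
            coerced.append(value)
    return coerced
```

With PyTorch available, passing `torch.tensor` as `to_tensor` would turn an integer such as a sequence length into a 0-dimensional tensor while leaving existing tensor inputs untouched.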

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Checklist:

  • [ x ] My code follows the style guidelines of this project (You can use the linters)
  • [ x ] I have performed a self-review of my own code
  • [ x ] I have commented my code, particularly in hard-to-understand areas and hacks
  • [ x ] I have made corresponding changes to the documentation
  • [ x ] I have added tests to verify my fix or my feature
  • [ x ] New and existing unit tests pass locally with my changes
  • [ x ] I have added the relevant labels to my PR so that relevant reviewers are notified

@gs-olive gs-olive requested a review from peri044 June 1, 2024 04:30
@gs-olive gs-olive self-assigned this Jun 1, 2024
@github-actions github-actions bot added component: tests Issues re: Tests component: api [Python] Issues re: Python API component: runtime component: dynamo Issues relating to the `torch.compile` or `torch._dynamo.export` paths labels Jun 1, 2024
@gs-olive
Collaborator Author

gs-olive commented Jun 5, 2024

--> Cherry pick to release/2.3

@github-actions github-actions bot requested a review from narendasan June 5, 2024 17:58
@gs-olive gs-olive force-pushed the dynamic_shapes_integer_input_bugfix branch from ef83457 to 5906ab4 Compare June 7, 2024 00:36
@github-actions github-actions bot added documentation Improvements or additions to documentation component: lowering Issues re: The lowering / preprocessing passes component: conversion Issues re: Conversion stage component: core Issues re: The core compiler component: converters Issues re: Specific op converters component: torch_compile labels Jun 7, 2024
@gs-olive gs-olive changed the base branch from dyn_llama_cp to main June 7, 2024 00:38
@gs-olive gs-olive force-pushed the dynamic_shapes_integer_input_bugfix branch from 5906ab4 to 7b0810d Compare June 7, 2024 00:38
@github-actions github-actions bot removed documentation Improvements or additions to documentation component: lowering Issues re: The lowering / preprocessing passes component: conversion Issues re: Conversion stage component: core Issues re: The core compiler component: converters Issues re: Specific op converters component: torch_compile labels Jun 7, 2024
@gs-olive gs-olive force-pushed the dynamic_shapes_integer_input_bugfix branch 3 times, most recently from 046291c to 343d610 Compare June 17, 2024 19:06
@gs-olive gs-olive force-pushed the dynamic_shapes_integer_input_bugfix branch from f960816 to e99514b Compare June 18, 2024 19:08
gs-olive added 10 commits June 24, 2024 15:55
- Generative inference with HF text generation models such as gpt2 can
fail if graph segmentation causes a symbolic integer to be passed from
Torch to TRT, since the Torch output is an integer, while TRT expects a
tensor
- Added logic to the modules to address the above case
- Added test cases to validate generation with both Python and C++
runtimes
@gs-olive gs-olive force-pushed the dynamic_shapes_integer_input_bugfix branch from 89a960e to 0d8ae44 Compare June 24, 2024 22:56
@gs-olive gs-olive merged commit caf3a92 into main Jun 25, 2024
@gs-olive gs-olive deleted the dynamic_shapes_integer_input_bugfix branch June 25, 2024 01:23
Labels
cla signed component: api [Python] Issues re: Python API component: dynamo Issues relating to the `torch.compile` or `torch._dynamo.export` paths component: runtime component: tests Issues re: Tests needs-release-cherrypick
2 participants