
Conversation

@SalmanMohammadi (Contributor) commented Sep 23, 2024

Fixes #135439

This PR adds support for the is_inference method on torch tensors, which allows the following example fn to compile without graph breaks:

def fn_simple(x):
    if x.is_inference():
        return x.sum()
    else:
        return x.min()
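
For illustration, here's a minimal sketch of exercising this end to end (my sketch, not the PR's test code; fullgraph=True turns any graph break into an error):

    import torch

    def fn_simple(x):
        if x.is_inference():
            return x.sum()
        else:
            return x.min()

    compiled_fn = torch.compile(fn_simple, fullgraph=True)

    with torch.inference_mode():
        x_inf = torch.randn(2, 2)  # inference tensor: x_inf.is_inference() is True
    x = torch.randn(2, 2)          # regular tensor: x.is_inference() is False

    # With fullgraph=True a graph break would raise, so both calls completing
    # shows that is_inference is traced rather than falling back to eager.
    print(compiled_fn(x_inf))  # takes the sum() branch
    print(compiled_fn(x))      # takes the min() branch (recompiles on the guard)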

I've also tried to add guards on the tensor's is_inference state. I wasn't 100% sure where these should go, so please don't hesitate to correct me.

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @chenyang78 @kadeng @chauhang @amjames @rec @ezyang

pytorch-bot bot commented Sep 23, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/136450

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (5 Unrelated Failures)

As of commit 6d69b45 with merge base a0a1873:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed, but the failures were already present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

linux-foundation-easycla bot commented Sep 23, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

x = torch.randn(2, 2)
fn(x)

self.assertEqual(cnts.frame_count, 3) # Recompile! inference_mode changed
Contributor commented on the snippet above:

You're actually only testing here that the inference mode state is guarded on, not that whether a tensor is an inference tensor causes a recompile. Move the allocation of the tensor inside the mode, but do the fn call outside of it, to test this.

Since you are diverging the semantics of the program on the inside, you could also just check whether the result equals the eager result.
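
For example, a sketch of the suggested pattern (names like fn and x_inference are illustrative, mirroring the test under review):

    import torch
    import torch._dynamo.testing

    def fn(x):
        if x.is_inference():
            return x.sum()
        else:
            return x.min()

    # Allocate the tensor *inside* inference mode...
    with torch.inference_mode():
        x_inference = torch.randn(2, 2)

    # ...but call the function *outside* it, so the only thing being tested is
    # the tensor's inference-ness, not the global inference-mode state.
    eager_result = fn(x_inference)

    cnts = torch._dynamo.testing.CompileCounter()
    opt_fn = torch.compile(fn, backend=cnts, fullgraph=True)

    # The two branches compute different values, so matching the eager result
    # confirms the compiled function took the is_inference() branch.
    assert torch.equal(opt_fn(x_inference), eager_result)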

SalmanMohammadi (Contributor Author) replied:

I think this makes sense - thanks!

@ezyang (Contributor) commented Sep 24, 2024

Are you sure we're not guarding on the inference-mode-ness of a Tensor already? I checked the implementation of is_inference:

    bool is_inference() {
      bool no_ADInplaceOrView = !key_set_.has_any(c10::inplace_or_view_ks);
      bool no_Autograd = !key_set_.has_any(c10::autograd_dispatch_keyset);
      TORCH_INTERNAL_ASSERT_DEBUG_ONLY(
          no_ADInplaceOrView == no_Autograd,
          "ADInplaceOrView and Autograd keys must be on/off at the same time.");
      return no_ADInplaceOrView && no_Autograd;
    }

This seems to be derived entirely from the dispatch key set. But we DO guard on that right now:

TensorCheck::TensorCheck(
    const LocalState& state,
    PyTypeObject* pt,
    const at::Tensor& v,
    std::vector<std::optional<c10::SymInt>> dynamic_dims_sizes,
    std::vector<std::optional<c10::SymInt>> dynamic_dims_strides)
    : pytype(pt),
      dispatch_key_(state.apply(v.key_set()).raw_repr()),
      dtype_(v.dtype().toScalarType()),

??

@SalmanMohammadi (Contributor Author) commented:

Yeah I had the same thought here (#135439 (comment))

V0922 18:59:34.045000 493191 torch/_dynamo/guards.py:2830] [0/1] [__recompiles] Recompiling function fn_simple in /home/salman/pytorch/build/test.py:16
V0922 18:59:34.045000 493191 torch/_dynamo/guards.py:2830] [0/1] [__recompiles] triggered by the following guard failure(s):
V0922 18:59:34.045000 493191 torch/_dynamo/guards.py:2830] [0/1] [__recompiles] - 0/0: GLOBAL_STATE changed: grad_mode
V0922 18:59:34.082000 493191 torch/_dynamo/guards.py:2830] [0/2] [__recompiles] Recompiling function fn_simple in /home/salman/pytorch/build/test.py:16
V0922 18:59:34.082000 493191 torch/_dynamo/guards.py:2830] [0/2] [__recompiles] triggered by the following guard failure(s):
V0922 18:59:34.082000 493191 torch/_dynamo/guards.py:2830] [0/2] [__recompiles] - 0/1: tensor 'L['x']' dispatch key set mismatch. expected DispatchKeySet(CPU, BackendSelect), actual DispatchKeySet(CPU, BackendSelect, ADInplaceOrView, AutogradCPU)
V0922 18:59:34.082000 493191 torch/_dynamo/guards.py:2830] [0/2] [__recompiles] - 0/0: GLOBAL_STATE changed: grad_mode
Does this mean the guards on the dispatch key set pick up the inference-mode-ness?

However, I'm a noob and I wasn't sure how specific guard semantics should be, i.e. are there other things that would invalidate the dispatch key set guard, and if so, is this fine? That's why I placed the check for is_inference to be triggered first, but that did feel a bit weird.

If we do want a specific guard, would it also be simpler to just check against the appropriate dispatch keys vs. introducing another field?
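
For reference, the difference is observable directly (a sketch using torch._C._dispatch_keys, which is an internal API, so the exact key sets may vary by build):

    import torch

    with torch.inference_mode():
        x_inf = torch.randn(2, 2)
    x = torch.randn(2, 2)

    # Inference tensors lack the ADInplaceOrView and Autograd keys, which is
    # exactly the difference the existing dispatch-key-set guard checks.
    print(torch._C._dispatch_keys(x_inf))  # e.g. DispatchKeySet(CPU, BackendSelect)
    print(torch._C._dispatch_keys(x))      # e.g. DispatchKeySet(CPU, BackendSelect, ADInplaceOrView, AutogradCPU)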

@ezyang (Contributor) commented Sep 24, 2024

Oh, I missed your comment edit. Your last log suggests that we already guard on inference-mode-ness. So in fact you can get rid of all the new guard code; your test case should still pass.

@ezyang ezyang added the triaged label (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module) Sep 24, 2024
@SalmanMohammadi (Contributor Author) commented:

Updating, thank you for your patience @ezyang.

eager_result = fn(x_inference)

cnts = torch._dynamo.testing.CompileCounter()
fn = torch._dynamo.optimize(cnts, nopython=True)(fn)
Contributor commented on the snippet above:

You can write this test more clearly. The most important thing is to distinguish fn from opt_fn. The second is to just directly assertEqual(fn(x_inference), opt_fn(x_inference)), and so forth.
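
For example (a sketch; x_inference and the enclosing unittest-style test class are assumed context):

    cnts = torch._dynamo.testing.CompileCounter()
    opt_fn = torch._dynamo.optimize(cnts, nopython=True)(fn)

    # Keep fn as the eager reference and opt_fn as the compiled version,
    # then compare them directly.
    self.assertEqual(fn(x_inference), opt_fn(x_inference))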

@ezyang (Contributor) commented Sep 25, 2024

Thanks, just nits on the test

@ezyang (Contributor) left a review comment:

Thanks a lot!

@ezyang (Contributor) commented Sep 26, 2024

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk label (Trigger trunk jobs on your pull request) Sep 26, 2024
@pytorchmergebot (Collaborator) commented:

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@pytorchmergebot (Collaborator) commented:

Merge failed

Reason: 1 mandatory check(s) failed. The first few are:

Dig deeper by viewing the failures on hud

Details for Dev Infra team: raised by workflow job.

Failing merge rule: Core Maintainers

@SalmanMohammadi (Contributor Author) commented:

another?
@pytorchbot merge

@pytorchmergebot (Collaborator) commented:

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


Labels: ciflow/trunk, Merged, module: dynamo, open source, release notes: dynamo, topic: bug fixes, triaged
Successfully merging this pull request may close these issues:

torch._dynamo.exc.Unsupported: torch.* op returned non-Tensor bool call_method is_inference