
Conversation

pjs102793 (Contributor)

This PR replaces `if not tokenizer:` with `if tokenizer is None:` to avoid an unnecessary call to the tokenizer's `__len__` method, improving efficiency.
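Schematically, the patch is the one-line guard swap described above (shown as a bare diff; the surrounding context in wtpsplit's source is omitted):

```diff
- if not tokenizer:
+ if tokenizer is None:
```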

```python
>>> from wtpsplit import SaT

# PyTorch GPU
>>> sat_sm = SaT("sat-3l-sm")
>>> sat_sm.half().to("cuda")  # optional: half precision on CUDA
>>> text = "this is a test this is another test"
>>> %timeit list(sat_sm.split(text))
# 52.7 ms ± 2.16 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

# PR patch, PyTorch GPU
>>> %timeit list(sat_sm.split(text))
# 2.45 ms ± 24.1 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

# ONNX CPU
>>> model_ort = SaT("sat-3l-sm", ort_providers=["CUDAExecutionProvider", "CPUExecutionProvider"])
>>> %timeit list(model_ort.split(text))
# 59.3 ms ± 1.83 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

# PR patch, ONNX CPU
>>> %timeit list(model_ort.split(text))
# 14.7 ms ± 699 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
```

When `if not tokenizer:` is executed, Python's truthiness test falls back to the class's `__len__` special method (since no `__bool__` is defined), incurring a fixed overhead of approximately 40 ms per call.

To avoid this overhead, we check for the tokenizer's presence with a direct `is None` comparison instead, eliminating this bottleneck.
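The fallback is easy to reproduce in isolation. Below is a minimal, self-contained sketch of the mechanism; `SlowLenTokenizer` is a hypothetical stand-in, not wtpsplit's actual tokenizer class:

```python
import time

class SlowLenTokenizer:
    """Toy stand-in for a tokenizer whose __len__ is expensive
    (e.g. it has to materialize a large vocabulary)."""

    def __len__(self):
        time.sleep(0.04)  # simulate the ~40 ms fixed cost observed in the PR
        return 250_000

tok = SlowLenTokenizer()

# Truthiness check: with no __bool__ defined, Python falls back to __len__,
# so this branch pays the full ~40 ms even though tok is clearly "present".
start = time.perf_counter()
if not tok:
    pass
print(f"if not tok:     {time.perf_counter() - start:.4f} s")

# Identity check: compares object identity only; __len__ is never called.
start = time.perf_counter()
if tok is None:
    pass
print(f"if tok is None: {time.perf_counter() - start:.6f} s")
```

Because `is None` compares object identity, it runs in constant time no matter how expensive `__len__` is.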

markus583 (Collaborator)

What a nice find, thanks a lot for this! I will merge it.
