dhiaEddineRhaiem
Contributor

@dhiaEddineRhaiem dhiaEddineRhaiem commented May 23, 2025

This PR:

  1. Adds MuP multipliers in the forward pass of the slow path, torch_forward.
  2. Fixes the repeat pattern in Mamba heads (repeat -> repeat_interleave); see Fix Mamba2 Grouped SSD Support in the torch_forward Path #37533.
    @ArthurZucker
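
For context on item 2, here is a minimal sketch of how the two repeat patterns differ. It uses plain Python lists rather than tensors so it runs without dependencies. The helpers `tile_groups` and `interleave_groups` are hypothetical stand-ins that only mimic the element ordering of `torch.Tensor.repeat` and `torch.repeat_interleave` along one dimension. The group and head counts are illustrative, not Falcon-H1's, and the sketch assumes heads that share a group are laid out contiguously.

```python
def tile_groups(groups, repeats):
    """Mimic the ordering of Tensor.repeat along one dim: copy the whole sequence."""
    return list(groups) * repeats


def interleave_groups(groups, repeats):
    """Mimic the ordering of repeat_interleave: repeat each element in place."""
    return [g for g in groups for _ in range(repeats)]


# Illustrative sizes only (assumption, not taken from the model config).
n_groups, num_heads = 2, 4
heads_per_group = num_heads // n_groups
groups = ["g0", "g1"]

# Assumed layout: head h belongs to group h // heads_per_group.
expected = [groups[h // heads_per_group] for h in range(num_heads)]

tiled = tile_groups(groups, heads_per_group)
interleaved = interleave_groups(groups, heads_per_group)

print("expected   :", expected)
print("tiled      :", tiled)        # heads 1 and 2 get the wrong group
print("interleaved:", interleaved)  # matches the assumed layout
```

Under that layout assumption, only the interleaved ordering assigns every head its own group's parameters, and the two orderings only coincide when there is a single group.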

@dhiaEddineRhaiem dhiaEddineRhaiem requested a review from vasqu May 24, 2025 14:37
Contributor

@vasqu vasqu left a comment

LGTM! Checking in with the slow tests :)

@vasqu
Contributor

vasqu commented May 26, 2025

run-slow: falcon_h1

cc @ydshieh anything I should do differently?

@ydshieh
Collaborator

ydshieh commented May 26, 2025

@vasqu It was triggered, but it failed at a step (reply to comment), so the tests were not run.

https://github.com/huggingface/transformers/actions/runs/15250152202/job/42884998108

I'm not sure why, but I will re-run it and see how it goes.

Contributor

This comment contains run-slow, running the specified jobs:

models: ['models/falcon_h1']
quantizations: [] ...

@ydshieh
Collaborator

ydshieh commented May 26, 2025

@younesbelkada
Contributor

younesbelkada commented May 26, 2025

The slow tests failed, but I'm not sure this PR caused the failure, since we only changed the slow path, which is not executed on GPUs. Perhaps they have been failing from the beginning?

@younesbelkada
Contributor

We're not able to extract the expected text from the logs, and we don't have access to T4 GPUs. Is there a way to extract the output from the logs?

@ydshieh
Collaborator

ydshieh commented May 26, 2025

I can update the expected value this afternoon

@dhiaEddineRhaiem
Contributor Author

I pushed a fix for EXPECTED_TEXT in the integration test.
Could you please rerun the slow tests?

@vasqu
Contributor

vasqu commented May 26, 2025

run-slow: falcon_h1

Contributor

This comment contains run-slow, running the specified jobs:

models: ['models/falcon_h1']
quantizations: [] ...

@ydshieh
Collaborator

ydshieh commented May 26, 2025

remote: Permission to younesbelkada/transformers.git denied to ydshieh.

@younesbelkada you don't love @ydshieh anymore ? You owe me instructblip ....

@younesbelkada
Contributor

ahaha sorry let me give you access now ! yes I owe you that one ... which took so long to fix together

@ydshieh
Collaborator

ydshieh commented May 26, 2025

or younesbelkada#7

@ydshieh
Collaborator

ydshieh commented May 26, 2025

The updated value works for T4. There is no need to trigger run-slow again.

Thank you for the effort 🙏

Collaborator

@ArthurZucker ArthurZucker left a comment

🤗

@younesbelkada
Contributor

Thank you very much @ydshieh and HF team for your continuous support ! 🚀

@ydshieh
Collaborator

ydshieh commented May 26, 2025

Wait, don't merge yet.

I will check again and merge once it's OK.

@dhiaEddineRhaiem
Contributor Author

Thanks, everyone, for your help!

@ydshieh
Collaborator

ydshieh commented May 26, 2025

younesbelkada#8

😰

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
@ydshieh ydshieh enabled auto-merge (squash) May 26, 2025 13:24
@ydshieh ydshieh disabled auto-merge May 26, 2025 13:30
@ydshieh ydshieh merged commit 7a9b071 into huggingface:main May 26, 2025
12 checks passed
redmoe-moutain pushed a commit to redmoe-moutain/transformers that referenced this pull request Jun 10, 2025
* Create push-important-models.yml

* feat: add falcon-h1

* fixup

* address comment

* fix

* fix copies

* fix copies

* fix

* fix

* fix

* fix

* fix copies

* fix

* fix copies

* fix test import to at least trigget the cis

* yups

* update

* fix make fix copies

* fix inits?

* fix style

* skip annoying test

* add integration test for Falcon H1

* fix copies

* fix

* fix typo

* make style

* fix slow path generations

* clean debug traces

* debug

* remove debug traces final confirmation

* clean debug traces final

* fix format and lineup

* make style

* debug

* Update src/transformers/models/falcon_h1/modular_falcon_h1.py

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* adress comments

* fix fix-copies

* fix integration test

* Merge pull request huggingface#7 from ydshieh/fix-slow-path

update

* another update (huggingface#8)

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

---------

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Younes Belkada <younesbelkada@gmail.com>
Co-authored-by: younesbelkada <younes.belkada@tii.ae>
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>