Update libmathdx, extend tile_cholesky_solve to 2D rhs, and improve trsm based tile operations #773
…from libmathdx for tile_lower_solve and tile_upper_solve Signed-off-by: Roland Schwan <roland.schwan@mikrounix.com>
This fix comes from @daedalus5, who has also been working on the libmathdx 0.2.1 evaluation in Warp. In warp/build_dll.py, we need to change line 396:
OLD: linkopts.append(f'nvJitLink_static.lib /LIBPATH:"{args.libmathdx_path}/lib" mathdx_static.lib')
NEW: linkopts.append(f'nvJitLink_static.lib /LIBPATH:"{args.libmathdx_path}/lib/x64" mathdx_static.lib')
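A minimal sketch of how a build script could pick the library directory so that both layouts (`lib` and `lib/x64`) work. The helper names `mathdx_lib_dir` and `mathdx_link_option` and the fallback behaviour are assumptions for illustration; they are not taken from warp/build_dll.py.

```python
import os


def mathdx_lib_dir(libmathdx_path):
    """Return <path>/lib/x64 if that directory exists, otherwise <path>/lib.

    Hypothetical helper: the fallback rule is an assumption, not Warp's logic.
    """
    x64_dir = os.path.join(libmathdx_path, "lib", "x64")
    if os.path.isdir(x64_dir):
        return x64_dir
    return os.path.join(libmathdx_path, "lib")


def mathdx_link_option(libmathdx_path):
    """Build the MSVC linker option string using the detected directory."""
    lib_dir = mathdx_lib_dir(libmathdx_path)
    return f'nvJitLink_static.lib /LIBPATH:"{lib_dir}" mathdx_static.lib'
```

With a helper like this, the line in question would append `mathdx_link_option(args.libmathdx_path)` instead of hard-coding either path.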
While going through the code again, I figured out what caused #768: the layout information was not passed along correctly. Since the layout information is only used in the mathdx calls and the bug is directly related, I took the liberty of fixing it in this PR (1561b44). I modified the tests accordingly to check layout propagation.
Thanks so much for this, @RSchwan. Phenomenal effort. A few small notes: Could you please add this decorator to
and this decorator to
in test_tile_mathdx.py? We need these for our CI pipeline.
Done. Let me know if you need other changes.
It looks like we need this: over
Apologies, these should both reference version 12060, not 12600. So
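To see why the two numbers differ, here is a small sketch that decodes an integer version, assuming it follows the CUDA Toolkit convention used by the `CUDA_VERSION` macro (major * 1000 + minor * 10). The comment above does not say which version number is meant, so that encoding is an assumption, and the function names are hypothetical.

```python
def decode_cuda_style_version(version):
    """Split an integer such as 12060 into (major, minor).

    Assumes the CUDA_VERSION-style encoding: major * 1000 + minor * 10.
    """
    major = version // 1000
    minor = (version % 1000) // 10
    return major, minor


def encode_cuda_style_version(major, minor):
    """Inverse of decode_cuda_style_version under the same assumption."""
    return major * 1000 + minor * 10
```

Under this encoding, 12060 decodes to 12.6, whereas 12600 would decode to 12.60, so a check against 12600 would not match a 12.6 toolkit.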
Just missing this.
Hey @RSchwan, could you also please clean up the commit history with git rebase/squash so it's down to a single commit? We need to merge this into an internal repo first and it's easier if the commits of this pull request are added as-is to the main branch. Rebasing these changes onto the latest
Update libmathdx, extend tile_cholesky_solve to 2D rhs, and improve trsm based tile operations (NVIDIAGH-773) See merge request omniverse/warp!1360
Description
This PR updates the libmathdx library from version 0.1.2 to 0.2.1. The main reason is to add support for trsm operations, which were introduced in version 0.2.1. Additionally, tile_cholesky_solve has been extended to allow for 2D right-hand sides.

The major change from libmathdx 0.1.2 to 0.2.1 is in the C API: the scalings alpha and beta in the gemm kernel are now passed as pointers. Additionally, I changed the CUDA implementations of tile_lower_solve and tile_upper_solve to use the trsm kernels from cusolverdx. In a personal project that relies heavily on tile_lower_solve, I see an overall 2x speed-up with the trsm kernel from cusolverdx.

The tile_cholesky_solve operation has been extended to allow for 2D right-hand sides, mirroring the API of tile_lower_solve and tile_upper_solve. An appropriate test has been added.

The full test suite is passing on my machine (RTX 3080).
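For readers unfamiliar with the operations involved, the following pure-Python reference sketches what a Cholesky solve with a 2D right-hand side computes: a lower triangular solve followed by an upper triangular solve, applied to every column of the right-hand side. It is an illustration of the math only, not the Warp or libmathdx implementation, and the function names are hypothetical stand-ins.

```python
import math


def cholesky(A):
    """Return lower-triangular L with A = L L^T (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


def lower_solve(L, B):
    """Solve L Z = B by forward substitution; B has shape n x m (2D rhs)."""
    n, m = len(B), len(B[0])
    Z = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for c in range(m):
            s = sum(L[i][k] * Z[k][c] for k in range(i))
            Z[i][c] = (B[i][c] - s) / L[i][i]
    return Z


def upper_solve(U, Z):
    """Solve U X = Z by back substitution; Z has shape n x m (2D rhs)."""
    n, m = len(Z), len(Z[0])
    X = [[0.0] * m for _ in range(n)]
    for i in reversed(range(n)):
        for c in range(m):
            s = sum(U[i][k] * X[k][c] for k in range(i + 1, n))
            X[i][c] = (Z[i][c] - s) / U[i][i]
    return X


def cholesky_solve(L, B):
    """Solve (L L^T) X = B: a lower solve followed by an upper solve."""
    Lt = [list(col) for col in zip(*L)]  # transpose of L
    return upper_solve(Lt, lower_solve(L, B))


# Usage: two right-hand sides solved at once as the columns of B.
A = [[4.0, 2.0], [2.0, 3.0]]
B = [[2.0, 4.0], [1.0, 5.0]]
X = cholesky_solve(cholesky(A), B)
```

Each column of `X` solves the system for the corresponding column of `B`, which is the behaviour a 2D right-hand side generalizes from the 1D case.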