Improve distances_simd.cpp for aarch64 #2392

wx257osn2 (Contributor) commented Jul 20, 2022

  • use vaddvq_f32 instead of vpaddq_f32 and vdups_laneq_f32 in fvec_L2sqr, fvec_inner_product, and fvec_norm_L2sqr
  • implement fvec_L1 and fvec_Linf for ARM SIMD (NEON)
    • This causes a performance regression, so I've dropped it.
  • implement fvec_madd and fvec_madd_and_argmin for ARM SIMD (NEON)
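
For readers unfamiliar with the intrinsics named above, here is a minimal sketch of the two ideas. It is not the code from this PR: `l2sqr_sketch` and `madd_sketch` are hypothetical names, and the multiply-add semantics (`c[i] = a[i] + bf * b[i]`) are an assumption about what `fvec_madd` computes. On non-aarch64 targets the NEON part is compiled out and only the scalar loop runs.

```cpp
#include <cstddef>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

// Hypothetical helper: squared L2 distance between x and y over d floats.
float l2sqr_sketch(const float* x, const float* y, std::size_t d) {
    float res = 0.0f;
    std::size_t i = 0;
#if defined(__aarch64__)
    float32x4_t acc = vdupq_n_f32(0.0f);
    for (; i + 4 <= d; i += 4) {
        float32x4_t diff = vsubq_f32(vld1q_f32(x + i), vld1q_f32(y + i));
        acc = vfmaq_f32(acc, diff, diff); // acc += diff * diff
    }
    // Older reduction style: two pairwise adds, then read lane 0:
    //   float32x4_t s = vpaddq_f32(acc, acc);
    //   s = vpaddq_f32(s, s);
    //   res = vdups_laneq_f32(s, 0);
    // Single across-vector add, as the first bullet describes:
    res = vaddvq_f32(acc);
#endif
    // Scalar tail (and the whole loop on non-aarch64 targets).
    for (; i < d; ++i) {
        float diff = x[i] - y[i];
        res += diff * diff;
    }
    return res;
}

// Hypothetical helper: c[i] = a[i] + bf * b[i] for i in [0, n).
// Assumed semantics, stated here for illustration only.
void madd_sketch(std::size_t n, const float* a, float bf,
                 const float* b, float* c) {
    std::size_t i = 0;
#if defined(__aarch64__)
    const float32x4_t vbf = vdupq_n_f32(bf);
    for (; i + 4 <= n; i += 4) {
        // vfmaq_f32(a, b, c) computes a + b * c per lane.
        float32x4_t r = vfmaq_f32(vld1q_f32(a + i), vld1q_f32(b + i), vbf);
        vst1q_f32(c + i, r);
    }
#endif
    for (; i < n; ++i) {
        c[i] = a[i] + bf * b[i];
    }
}
```

Both functions handle lengths that are not a multiple of four through the scalar tail loop, so they can be called with any `d` or `n`.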

wx257osn2 (Contributor, Author)

CondaHTTPError: HTTP 403 FORBIDDEN for url <https://conda.anaconda.org/pytorch/noarch/pytorch-mutex-1.0-cpu.tar.bz2>

😞

wx257osn2 force-pushed the improve-distances_simd_cpp-for-aarch64 branch from 37e8840 to c2eeef2 on July 24, 2022 21:47
wx257osn2 force-pushed the improve-distances_simd_cpp-for-aarch64 branch from c2eeef2 to 48c2848 on July 24, 2022 23:13
facebook-github-bot (Contributor)

@mdouze has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

wx257osn2 deleted the improve-distances_simd_cpp-for-aarch64 branch on August 31, 2022 13:27
BZO95 added a commit to BZO95/faiss that referenced this pull request Apr 10, 2025
Summary:
- use `vaddvq_f32` instead of `vpaddq_f32` and `vdups_laneq_f32` in `fvec_L2sqr`, `fvec_inner_product`, and `fvec_norm_L2sqr`
- ~~implement `fvec_L1` and `fvec_Linf` for ARM SIMD (NEON)~~
    - This causes a performance regression, so I've dropped it.
- implement `fvec_madd` and `fvec_madd_and_argmin` for ARM SIMD (NEON)

Pull Request resolved: facebookresearch/faiss#2392

Reviewed By: patricklabatut

Differential Revision: D38198174

Pulled By: mdouze

fbshipit-source-id: 3488a0cf2db1ded458b3bf73f4bc9665413e3351
aalekhpatel07 pushed a commit to aalekhpatel07/faiss that referenced this pull request Apr 10, 2025
Summary:
- use `vaddvq_f32` instead of `vpaddq_f32` and `vdups_laneq_f32` in `fvec_L2sqr`, `fvec_inner_product`, and `fvec_norm_L2sqr`
- ~~implement `fvec_L1` and `fvec_Linf` for ARM SIMD (NEON)~~
    - This causes a performance regression, so I've dropped it.
- implement `fvec_madd` and `fvec_madd_and_argmin` for ARM SIMD (NEON)

Pull Request resolved: facebookresearch#2392

Reviewed By: patricklabatut

Differential Revision: D38198174

Pulled By: mdouze

fbshipit-source-id: 3488a0cf2db1ded458b3bf73f4bc9665413e3351