@pjbgf pjbgf commented Jul 8, 2025

This implementation does not rely on avo, which unfortunately doesn't support arm64 yet.
For the sha1cd_native benchmarks on arm64, execution time drops by roughly 13-22% and throughput rises by roughly 15-29%; the sha1cd_generic and sha1cd_cgo results are essentially unchanged:

goos: linux
goarch: arm64
pkg: github.com/pjbgf/sha1cd/test
                                   │     /tmp/before     │                 /tmp/after                  │
                                   │       sec/op        │       sec/op         vs base                │
CalculateDvMask/generic-2            0.0000006500n ± 54%   0.0000005000n ± 20%  -23.08% (p=0.009 n=10)
CalculateDvMask/native-2             0.0000008000n ± 50%   0.0000007000n ± 57%        ~ (p=0.420 n=10)
CalculateDvMask/cgo-2                 0.000001600n ± 12%    0.000001850n ± 19%        ~ (p=0.090 n=10)
Hash8Bytes/sha1-2                           105.2n ±  1%          104.9n ±  0%        ~ (p=0.235 n=10)
Hash8Bytes/sha1cd_native-2                  713.1n ±  1%          603.7n ±  1%  -15.34% (p=0.000 n=10)
Hash8Bytes/sha1cd_generic-2                 703.1n ±  3%          711.6n ±  1%        ~ (p=0.063 n=10)
Hash8Bytes/sha1cd_cgo-2                     1.603µ ±  2%          1.620µ ±  1%   +1.06% (p=0.027 n=10)
Hash320Bytes/sha1-2                         318.6n ±  0%          318.8n ±  0%        ~ (p=0.466 n=10)
Hash320Bytes/sha1cd_native-2                3.152µ ±  1%          2.538µ ±  1%  -19.48% (p=0.000 n=10)
Hash320Bytes/sha1cd_generic-2               3.178µ ±  1%          3.145µ ±  2%        ~ (p=0.118 n=10)
Hash320Bytes/sha1cd_cgo-2                   2.876µ ±  2%          2.867µ ±  1%        ~ (p=0.631 n=10)
Hash1K/sha1-2                               788.0n ±  0%          788.5n ±  2%        ~ (p=0.738 n=10)
Hash1K/sha1cd_native-2                      8.443µ ±  1%          6.663µ ±  1%  -21.08% (p=0.000 n=10)
Hash1K/sha1cd_generic-2                     8.475µ ±  1%          8.448µ ±  3%        ~ (p=1.000 n=10)
Hash1K/sha1cd_cgo-2                         5.669µ ±  2%          5.641µ ±  0%        ~ (p=0.516 n=10)
Hash8K/sha1-2                               5.723µ ±  1%          5.717µ ±  1%        ~ (p=0.305 n=10)
Hash8K/sha1cd_native-2                      63.37µ ±  1%          49.23µ ±  2%  -22.31% (p=0.000 n=10)
Hash8K/sha1cd_generic-2                     63.24µ ±  0%          63.31µ ±  1%        ~ (p=0.481 n=10)
Hash8K/sha1cd_cgo-2                         34.27µ ±  2%          33.92µ ±  3%        ~ (p=0.631 n=10)
HashWithCollision/sha1cd_native-2           9.061µ ±  3%          7.851µ ±  1%  -13.36% (p=0.000 n=10)
HashWithCollision/sha1cd_generic-2          9.072µ ±  1%          9.053µ ±  1%        ~ (p=0.971 n=10)
HashWithCollision/sha1cd_cgo-2              6.266µ ±  0%          6.235µ ±  1%   -0.49% (p=0.035 n=10)
geomean                                     185.8n                175.3n         -5.67%


                                   │ /tmp/before  │              /tmp/after               │
                                   │     B/s      │      B/s       vs base                │
Hash8Bytes/sha1-2                    72.50Mi ± 1%    72.71Mi ± 0%        ~ (p=0.271 n=10)
Hash8Bytes/sha1cd_native-2           10.70Mi ± 1%    12.64Mi ± 1%  +18.09% (p=0.000 n=10)
Hash8Bytes/sha1cd_generic-2          10.85Mi ± 3%    10.72Mi ± 1%        ~ (p=0.055 n=10)
Hash8Bytes/sha1cd_cgo-2              4.764Mi ± 2%    4.711Mi ± 1%   -1.10% (p=0.027 n=10)
Hash320Bytes/sha1-2                  958.0Mi ± 0%    957.4Mi ± 0%        ~ (p=0.493 n=10)
Hash320Bytes/sha1cd_native-2         96.82Mi ± 1%   120.23Mi ± 1%  +24.18% (p=0.000 n=10)
Hash320Bytes/sha1cd_generic-2        96.04Mi ± 1%    97.04Mi ± 2%        ~ (p=0.123 n=10)
Hash320Bytes/sha1cd_cgo-2            106.1Mi ± 2%    106.4Mi ± 1%        ~ (p=0.631 n=10)
Hash1K/sha1-2                        1.210Gi ± 0%    1.209Gi ± 2%        ~ (p=0.631 n=10)
Hash1K/sha1cd_native-2               115.7Mi ± 1%    146.6Mi ± 1%  +26.72% (p=0.000 n=10)
Hash1K/sha1cd_generic-2              115.2Mi ± 1%    115.6Mi ± 3%        ~ (p=1.000 n=10)
Hash1K/sha1cd_cgo-2                  172.3Mi ± 2%    173.1Mi ± 0%        ~ (p=0.516 n=10)
Hash8K/sha1-2                        1.333Gi ± 1%    1.335Gi ± 1%        ~ (p=0.280 n=10)
Hash8K/sha1cd_native-2               123.3Mi ± 1%    158.7Mi ± 2%  +28.72% (p=0.000 n=10)
Hash8K/sha1cd_generic-2              123.5Mi ± 0%    123.4Mi ± 1%        ~ (p=0.469 n=10)
Hash8K/sha1cd_cgo-2                  227.9Mi ± 2%    230.3Mi ± 3%        ~ (p=0.631 n=10)
HashWithCollision/sha1cd_native-2    67.36Mi ± 3%    77.75Mi ± 1%  +15.43% (p=0.000 n=10)
HashWithCollision/sha1cd_generic-2   67.29Mi ± 1%    67.42Mi ± 1%        ~ (p=0.971 n=10)
HashWithCollision/sha1cd_cgo-2       97.41Mi ± 0%    97.90Mi ± 1%   +0.50% (p=0.034 n=10)
geomean                              104.4Mi         110.2Mi        +5.59%
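
The sec/op and B/s tables report different percentages for the same benchmark (for example -22.31% versus +28.72% for Hash8K/sha1cd_native). Both describe the same speed-up: with a fixed number of bytes per operation, throughput is inversely proportional to time per operation. The following is a minimal sketch of that conversion, not part of this change; the function name throughputChangePct is hypothetical and the input values are copied from the sec/op table above.

```go
package main

import "fmt"

// throughputChangePct converts a relative change in time per operation
// (in percent, negative meaning faster) into the matching relative change
// in throughput, assuming the bytes processed per operation stay constant.
// Hypothetical helper for illustration only.
func throughputChangePct(timeChangePct float64) float64 {
	ratio := 1 + timeChangePct/100 // new time divided by old time
	return (1/ratio - 1) * 100
}

func main() {
	// sec/op deltas of the sha1cd_native rows, copied from the table above.
	rows := []struct {
		name          string
		timeChangePct float64
	}{
		{"Hash8Bytes", -15.34},
		{"Hash320Bytes", -19.48},
		{"Hash1K", -21.08},
		{"Hash8K", -22.31},
		{"HashWithCollision", -13.36},
	}

	for _, r := range rows {
		fmt.Printf("%-18s sec/op %+.2f%% -> B/s approx. %+.2f%%\n",
			r.name, r.timeChangePct, throughputChangePct(r.timeChangePct))
	}
}
```

The converted values land within a few hundredths of a percentage point of the B/s column; the remaining differences come from rounding in the printed figures.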

pjbgf added 4 commits July 8, 2025 09:35
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
@pjbgf pjbgf merged commit b3dc5e7 into main Jul 8, 2025
10 checks passed
@pjbgf pjbgf deleted the arm64 branch July 8, 2025 09:02