Skip to content

Conversation

bboreham
Copy link
Member

@bboreham bboreham commented Jun 12, 2023

Instead of unpacking every individual string, we skip to the point where there is a difference.

Added a benchmark to illustrate.

[benchmark results updated 2023-06-15T1513]
Before/after this PR with -tags stringlabels:

goos: linux
goarch: amd64
pkg: github.com/prometheus/prometheus/model/labels
cpu: Intel(R) Core(TM) i5-9400F CPU @ 2.90GHz
                                 │  before.txt  │             after.txt              │
                                 │    sec/op    │   sec/op     vs base               │
Labels_Compare/equal-4              71.13n ± 6%   13.40n ± 2%  -81.16% (p=0.002 n=6)
Labels_Compare/not_equal-4          69.54n ± 2%   33.32n ± 2%  -52.09% (p=0.002 n=6)
Labels_Compare/different_sizes-4   39.050n ± 2%   9.840n ± 4%  -74.80% (p=0.002 n=6)
Labels_Compare/lots-4              300.85n ± 9%   46.59n ± 4%  -84.51% (p=0.002 n=6)
geomean                             87.31n        21.27n       -75.64%

After this PR, without/with -tags stringlabels:

goos: linux
goarch: amd64
pkg: github.com/prometheus/prometheus/model/labels
cpu: Intel(R) Core(TM) i5-9400F CPU @ 2.90GHz
                                 │  slice.txt   │             after.txt              │
                                 │    sec/op    │   sec/op     vs base               │
Labels_Compare/equal-4              23.25n ± 3%   13.40n ± 2%  -42.35% (p=0.002 n=6)
Labels_Compare/not_equal-4          25.02n ± 7%   33.32n ± 2%  +33.20% (p=0.002 n=6)
Labels_Compare/different_sizes-4   14.365n ± 6%   9.840n ± 4%  -31.50% (p=0.002 n=6)
Labels_Compare/lots-4               92.77n ± 3%   46.59n ± 4%  -49.77% (p=0.002 n=6)
geomean                             29.67n        21.27n       -28.31%

bboreham added 2 commits June 13, 2023 08:38
Extend tests too.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Instead of unpacking every individual string, we skip to the point
where there is a difference.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
@bboreham bboreham force-pushed the faster-labels-compare branch from 6db3be1 to 9ae580d Compare June 13, 2023 08:39
if firstCharDifferent == len(shorter) {
if len(shorter) == len(longer) {
return 0
} else if firstCharDifferent == len(a.data) {
Copy link
Contributor

@colega colega Jun 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit, I think this is more readable:

Suggested change
} else if firstCharDifferent == len(a.data) {
} else if len(a.data) == len(shorter) {

As this way the code reads as: if a is shorter...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read what I wrote as "if the first different character is at the end of a.data".

I think yours requires to be thinking about longer and shorter, which I'm not at this point.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand from if firstCharDifferent == len(shorter) { that firstCharDifferent actually lost its value, as it's not first char different now, but a signal saying: _ all common characters are equal, pending to check whether one string is longer than the other one._

Which now makes me think that it would be better to just write:

	if firstCharDifferent == len(shorter) {
		return compareLengths(a.data, b.data)
	}
	// ...
}
// ...

func compareLengths(a, b []string) int {
	switch {
	case len(a) == len(b):
		return 0
	case len(a) < len(b):
		return -1
	case len(b) > len(a):
		return +1
	}
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both options are fine to me, but I agree with @colega naming of firstCharDifferent is indeed misleading (premature - it might be just all equal characters. Perhaps keeping i and not introducing new var might help here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

func compareLengths(a, b string) int {
    return len(a) - len(b)
}

Copy link
Contributor

@colega colega Jun 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh great, so maybe then just:

if i == len(shorter) {
    return len(a.data) - len(b.data)
}
firstCharDifferent := i

Comment on lines 436 to 442
i := 0
_ = longer[len(shorter)-1] // Get compiler to do bounds-check on longer just once here.
for ; i < len(shorter); i++ {
if shorter[i] != longer[i] {
break
}
}
Copy link
Contributor

@colega colega Jun 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since decodeSize is very cheap, we can keep decoding sizes as we go, like:

Suggested change
i := 0
_ = longer[len(shorter)-1] // Get compiler to do bounds-check on longer just once here.
for ; i < len(shorter); i++ {
if shorter[i] != longer[i] {
break
}
}
i := 0
lastLblStart, nextLblStart := 0, 0
_ = longer[len(shorter)-1] // Get compiler to do bounds-check on longer just once here.
for ; i < len(shorter); i++ {
if shorter[i] != longer[i] {
break
}
if i >= nextLblStart {
size, nextI := decodeSize(shorter, i)
nextLblStart = nextI + size
lastLblStart = i
}
}

So then later we can just do i = lastLblStart.

This gives pretty good performance impact:

goos: linux
goarch: amd64
pkg: github.com/prometheus/prometheus/model/labels
cpu: 11th Gen Intel(R) Core(TM) i7-11700K @ 3.60GHz
                                  │ bryanagain  │              tracknoloop               │
                                  │   sec/op    │   sec/op     vs base                   │
Labels_Compare/equal-16             25.59n ± 0%   26.48n ± 3%   +3.46% (p=0.000 n=30+10)
Labels_Compare/not_equal-16         30.25n ± 1%   24.07n ± 3%  -20.43% (p=0.000 n=30+10)
Labels_Compare/different_sizes-16   16.46n ± 1%   12.38n ± 8%  -24.84% (p=0.000 n=30+10)
Labels_Compare/lots-16              47.77n ± 1%   42.97n ± 4%  -10.06% (p=0.000 n=30+10)
geomean                             27.94n        24.13n       -13.63%

I've ran this benchmark multiple times and didn't get a consistent result (I was getting the equals benchmark faster in my version, which doesn't make sense), however I also think that if we do this, we can remove the last for loop (as we already know which labels are different, no need to search for them).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to search for the first difference much faster, e.g. 8 bytes at a time using dwords, or even longer strides using SIMD. I made various attempts at this, e.g. 06f8bc7. I think it's faster than your version, although I don't really like the implementation in github.com/grailbio/base/simd.FirstUnequal8Unsafe.

Putting an extra if-statement inside the tight loop does not seem attractive.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're going unsafe, then why just not cast the 8 bytes to uint64?

That seems to give pretty good results:

goos: linux
goarch: amd64
pkg: github.com/prometheus/prometheus/model/labels
cpu: 11th Gen Intel(R) Core(TM) i7-11700K @ 3.60GHz
                                  │   original   │            unsafe_uint64            │
                                  │    sec/op    │   sec/op     vs base                │
Labels_Compare/equal-16             24.635n ± 0%   5.832n ± 2%  -76.33% (p=0.000 n=10)
Labels_Compare/not_equal-16          30.33n ± 1%   14.37n ± 1%  -52.63% (p=0.000 n=10)
Labels_Compare/different_sizes-16   16.125n ± 1%   5.353n ± 2%  -66.80% (p=0.000 n=10)
Labels_Compare/lots-16               47.39n ± 0%   27.50n ± 2%  -41.97% (p=0.000 n=10)
geomean                              27.49n        10.54n       -61.66%

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I included your latest code, and updated my benchmark results.
Now mostly going faster than the slice version.

bboreham added 3 commits June 13, 2023 13:55
Try each case both ways round.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good 💪🏽 Thanks!

if firstCharDifferent == len(shorter) {
if len(shorter) == len(longer) {
return 0
} else if firstCharDifferent == len(a.data) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both options are fine to me, but I agree with @colega naming of firstCharDifferent is indeed misleading (premature - it might be just all equal characters. Perhaps keeping i and not introducing new var might help here?

bboreham and others added 2 commits June 15, 2023 15:00
Compare even faster by casting the strings to uint64.

Now the bounds-check optimisation doesn't make any difference, so we
don't need to protect against zero-length strings on entry.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com>
Created an issue for changing `Compare` to `Less` -
#12455

Changing to non-alphabetic sorting does not seem feasible, since
previously-stored blocks are in alphabetic order.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Copy link
Contributor

@colega colega left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

@bboreham
Copy link
Member Author

@bwplotka a couple of things changed since you last approved so I'll let you take another look before merging.

Copy link
Member

@SuperQ SuperQ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@bboreham bboreham merged commit 87d08ab into main Jun 20, 2023
@bboreham bboreham deleted the faster-labels-compare branch June 20, 2023 19:58
bboreham added a commit to bboreham/prometheus that referenced this pull request Jul 24, 2023
…heus#12451)

Instead of unpacking every individual string, we skip to the point
where there is a difference, going 8 bytes at a time where possible.

Add benchmark for Compare; extend tests too.

---------

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants