
Conversation

fungiboletus
Contributor

Hi,

This pull/merge request fixes a bug in bstream. Sometimes, it appends two bytes when only one is necessary.

This is a somewhat risky pull request in critical historical code. I'm submitting it as an improvement suggestion, and I would understand if the maintainers decide against it. I have also only tested it against my own use case, and I don't know all the potential implications.

Still, I think the change fixes a minor bug and may bring a small performance improvement.

For context, I'm writing a Prometheus chunk parser in Rust using nom, and an assertion occasionally failed because some of my test XOR chunks generated by Prometheus had a whole byte of padding instead of the 0 to 7 bits documented.

I have checked my parser many times, but I believe Prometheus occasionally stores an extra, useless byte of padding in XOR chunks. I haven't gathered statistics, but it looks to be in the magnitude of 1% of the chunks.

It's not a big problem beyond the wasted byte, because the Prometheus parser never reads the padding bits.
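As a side note on what "0 to 7 bits" means: the expected padding is just the distance from the last written bit to the next byte boundary. A quick sketch of that arithmetic (plain illustration, not Prometheus code):

```go
package main

import "fmt"

func main() {
	// Padding needed to reach the next byte boundary after writing
	// a given number of bits: always between 0 and 7 bits.
	for _, bitsWritten := range []int{16, 17, 23, 24} {
		padding := (8 - bitsWritten%8) % 8
		fmt.Printf("%d bits written -> %d padding bits\n", bitsWritten, padding)
	}
}
```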

I think the cause is that when bstream/writeByte is called and 0 bits of space are left, the function adds two bytes of space instead of one. Most of the time, the second byte will be used later, so it's not wasted. But if this happens at the end of the chunk, it is wasted.

From my point of view, a simple fix is to shortcut the bit operations and append the byte directly to the stream. It's simpler and slightly faster, since appending a byte that happens to be aligned skips the bit operations entirely. It seems to work on my machine, but I would suggest heavy testing.
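To make the shortcut concrete, here is a stripped-down sketch of a bstream-style writer (field names mirror the upstream go-tsz code, but this is an illustration under my assumptions, not the actual Prometheus implementation; the `writeByteOld`/`writeByteNew` names are mine, and `writeBit`/`writeBits` are omitted):

```go
package main

import "fmt"

// bstream is a simplified bit-stream writer: stream holds the data,
// count is the number of free bits left in the last byte.
type bstream struct {
	stream []byte
	count  uint8
}

// writeByteOld mirrors the pre-fix behavior: when no free bits remain,
// it appends a fresh byte, fills it completely, and then unconditionally
// appends a second byte for the spill-over bits -- which are all zero in
// the aligned case, so that byte is wasted if the chunk ends here.
func (b *bstream) writeByteOld(byt byte) {
	if b.count == 0 {
		b.stream = append(b.stream, 0)
		b.count = 8
	}
	i := len(b.stream) - 1
	b.stream[i] |= byt >> (8 - b.count) // high bits complete the last byte
	b.stream = append(b.stream, 0)
	b.stream[i+1] = byt << b.count // low bits spill over (zero when aligned)
}

// writeByteNew shortcuts the aligned case: append the byte as-is.
func (b *bstream) writeByteNew(byt byte) {
	if b.count == 0 {
		b.stream = append(b.stream, byt)
		return
	}
	i := len(b.stream) - 1
	b.stream[i] |= byt >> (8 - b.count)
	b.stream = append(b.stream, 0)
	b.stream[i+1] = byt << b.count
}

func main() {
	old, fixed := &bstream{}, &bstream{}
	old.writeByteOld(0xAB)
	fixed.writeByteNew(0xAB)
	// The old path appends two bytes for one byte of payload.
	fmt.Printf("old: %d bytes, fixed: %d bytes\n", len(old.stream), len(fixed.stream))
	// prints: old: 2 bytes, fixed: 1 bytes
}
```

In real usage the spill-over byte is usually consumed by later writes, so the waste only shows up when the stream ends exactly on a byte boundary.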

An alternative is to update chunks.md to state that XOR chunks can have 0 to 8 padding bits. That goes against the effort of saving as much space as possible in the chunks, but it is historical code.

The change doesn't seem to impact the parser. Old chunks can still be read by an updated Prometheus, and new chunks can be read by older Prometheus versions, since the removed byte is in a part of the chunk data that is never read. Existing CRC32C checks still pass, but encoding a new chunk with the same data will produce a chunk that is one byte shorter and therefore has a different CRC32C.

Here is some code to create an XOR Chunk featuring 8 bits of padding before the change:

package main

import (
	"log"

	"github.com/prometheus/prometheus/tsdb/chunkenc"
)

func main() {

	chunk := chunkenc.NewXORChunk()
	app, err := chunk.Appender()
	if err != nil {
		log.Fatalf("Failed to create appender: %v", err)
	}

	// Tried to create easy-to-read synthetic data for a while,
	// but finally went with a real Prometheus chunk from my laptop.
	data := []struct {
		timestamp int64
		value     float64
	}{
		{1723554010810, 3.352926e+06},
		{1723554025810, 3.361034e+06},
		{1723554040813, 3.366238e+06},
		{1723554055813, 3.372543e+06},
		{1723554070810, 3.382218e+06},
		{1723554085810, 3.386243e+06},
		{1723554100813, 3.390625e+06},
		{1723554115815, 3.397079e+06},
		{1723554130812, 3.403027e+06},
		{1723554145815, 3.408209e+06},
		{1723554160810, 3.412838e+06},
		{1723554175810, 3.419406e+06},
		{1723554190810, 3.428828e+06},
		{1723554205813, 3.432756e+06},
		{1723554220814, 3.43818e+06},
		{1723554235810, 3.443567e+06},
		{1723554250815, 3.449113e+06},
		{1723554265813, 3.45423e+06},
		{1723554280815, 3.460286e+06},
		{1723554295815, 3.466954e+06},
		{1723554310810, 3.476994e+06},
		{1723554325814, 3.48092e+06},
		{1723554340810, 3.486993e+06},
		{1723554355810, 3.491518e+06},
		{1723554370810, 3.497818e+06},
		{1723554385810, 3.503245e+06},
		{1723554400815, 3.50839e+06},
		{1723554415813, 3.514719e+06},
		{1723554430813, 3.522938e+06},
		{1723554445810, 3.529361e+06},
		{1723554460810, 3.53427e+06},
		{1723554475810, 3.540406e+06},
		{1723554490815, 3.546393e+06},
		{1723554505815, 3.552433e+06},
		{1723554520815, 3.558446e+06},
		{1723554535810, 3.564795e+06},
		{1723554550810, 3.572833e+06},
		{1723554565813, 3.576793e+06},
		{1723554580815, 3.582382e+06},
		{1723554595810, 3.587263e+06},
		{1723554610815, 3.593223e+06},
		{1723554625810, 3.598246e+06},
		{1723554640815, 3.604505e+06},
		{1723554655810, 3.609956e+06},
		{1723554670815, 3.619533e+06},
		{1723554685810, 3.623309e+06},
		{1723554700826, 3.628362e+06},
		{1723554715810, 3.632816e+06},
		{1723554730813, 3.639926e+06},
		{1723554745813, 3.645495e+06},
		{1723554760815, 3.651517e+06},
		{1723554775814, 3.656546e+06},
		{1723554790810, 3.666253e+06},
		{1723554805810, 3.670205e+06},
		{1723554820810, 3.675055e+06},
		{1723554835815, 3.680532e+06},
		{1723554850810, 3.685707e+06},
		{1723554865814, 3.692034e+06},
		{1723554880815, 3.69737e+06},
		{1723554895810, 3.703294e+06},
		{1723554910815, 3.713391e+06},
		{1723554925810, 3.717385e+06},
		{1723554940815, 3.72191e+06},
		{1723554955815, 3.728171e+06},
		{1723554970810, 3.732978e+06},
		{1723554985814, 3.739219e+06},
		{1723555000810, 3.744222e+06},
		{1723555015815, 3.750557e+06},
		{1723555030816, 3.759927e+06},
		{1723555045814, 3.764075e+06},
		{1723555060815, 3.769623e+06},
		{1723555075810, 3.774632e+06},
		{1723555090815, 3.780971e+06},
		{1723555105813, 3.785869e+06},
		{1723555120810, 3.79131e+06},
		{1723555135816, 3.797813e+06},
		{1723555150813, 3.806619e+06},
		{1723555165810, 3.810334e+06},
		{1723555180813, 3.816185e+06},
		{1723555195816, 3.821011e+06},
		{1723555210810, 3.825368e+06},
		{1723555225813, 3.830105e+06},
		{1723555240815, 3.835296e+06},
		{1723555255810, 3.839372e+06},
		{1723555270815, 3.848867e+06},
		{1723555285810, 3.851704e+06},
		{1723555300813, 3.856602e+06},
		{1723555315815, 3.861115e+06},
		{1723555555810, 139117},
		{1723555570810, 151189},
		{1723555585810, 157878},
		{1723555600810, 166868},
		{1723555615810, 173098},
		{1723555630810, 180218},
		{1723555645810, 187629},
		{1723555660810, 194337},
		{1723555675810, 199912},
		{1723555690810, 206331},
		{1723555705810, 214146},
		{1723555720810, 220622},
		{1723555735810, 227359},
		{1723555750810, 233948},
		{1723555765810, 240125},
		{1723555780810, 249135},
		{1723555795812, 255340},
	}

	for _, d := range data {
		app.Append(d.timestamp, d.value)
	}

	bytes := chunk.Bytes()

	// Copy without the last byte (use the commented line instead to keep it).
	bytesCopy := make([]byte, len(bytes)-1)
	// bytesCopy := make([]byte, len(bytes))
	copy(bytesCopy, bytes)

	chunkReader := chunkenc.NewXORChunk()
	chunkReader.Reset(bytesCopy)

	// Read the chunk back.
	it := chunkReader.Iterator(nil)
	for _, d := range data {
		if it.Next() == chunkenc.ValNone {
			log.Fatalf("Missing value")
		}
		t, v := it.At()
		// Assert the value is the same as the one we appended.
		if t != d.timestamp || v != d.value {
			log.Fatalf("Value mismatch at %v: %v != %v", d.timestamp, v, d.value)
		}
	}
}

Signed-off-by: Antoine Pultier <antoine.pultier@sintef.no>
@fungiboletus
Contributor Author

Looking at the TestStreamReadEndpoint test data, here is a much smaller chunk with a perhaps unnecessary byte:

package main

import (
	"fmt"

	"github.com/prometheus/prometheus/tsdb/chunkenc"
)

func main() {
	byteArrays := [][]byte{
		[]byte("\000\001\200\364\356\006@\307p\000\000\000\000\000\000"),
		[]byte("\000\001\200\364\356\006@\307p\000\000\000\000\000"), // new version
	}

	for _, bytes := range byteArrays {
		chunkReader := chunkenc.NewXORChunk()
		chunkReader.Reset(bytes)

		it := chunkReader.Iterator(nil)
		for it.Next() != chunkenc.ValNone {
			t, v := it.At()
			fmt.Printf("Value at %v: %v\n", t, v)
		}

		fmt.Println()
	}
}

@beorn7
Copy link
Member

beorn7 commented Sep 10, 2024

Thanks for the find. The proposed change looks plausible to me. However, this code is essentially vendored from https://github.com/dgryski/go-tsz/blob/03b7d791f4fe79270d114e1ea6929332dbf028b7/bstream.go#L59 . So it might be interesting to discuss this with @dgryski . If this change makes sense and there is no catch that we are missing here, then the upstream library should probably be changed, too.

@jesusvazquez @roidelapluie @codesome @bwplotka WDYT about this?

@dgryski

dgryski commented Sep 10, 2024

I implemented the original paper in a weekend for fun, more or less. I never ended up using the package for anything, although Prometheus and Grafana's Metrictank did pick it up (and they both vendored it in and hacked it up for their own purposes).

Contributor

@aknuds1 aknuds1 left a comment


Looks plausible to me as well, but I'm not super familiar with bstream.

Member

@bboreham bboreham left a comment


LGTM, thanks!

@aknuds1
Contributor

aknuds1 commented Sep 13, 2024

Should we add a changelog entry?

@bboreham
Member

Changelog entries are for user-visible changes.

Member

@beorn7 beorn7 left a comment


Thanks for all your thoughts, and most importantly to @fungiboletus for the find and fix.
Let's ship it!

@beorn7 beorn7 merged commit d90d097 into prometheus:main Sep 17, 2024
26 checks passed
@fungiboletus
Contributor Author

Thanks everyone! It's great to see this merged. I feel better about the many hours I worked on this.

@bboreham
Member

While reading the code to understand this PR, I found a couple more small performance improvements: #14932.
