@mhilton mhilton commented Aug 12, 2025

It is very common for Flux to use string arrays that contain a single repeated value. This is often the case when processing InfluxDB tag types. Add support for run-end encoded string arrays to optimize for this case. Such arrays are quick to access because they consist of a single run, yet they use less memory than a full string array, or even a dictionary-encoded array, which still needs a full-length array to hold the dictionary indexes.
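To illustrate the idea, here is a minimal sketch of a run-end encoded string array written against the Go standard library only. The type `runEndStrings` and its methods are hypothetical names chosen for illustration; they are not the types added by this PR, and the real implementation may differ. Each entry in `runEnds` is assumed to be the exclusive logical index at which a run ends, so a column holding one repeated tag value needs a single run end and a single string regardless of its logical length.

```go
package main

import (
	"fmt"
	"sort"
)

// runEndStrings is a hypothetical, simplified run-end encoded string array.
// runEnds[k] is the exclusive logical end index of run k, and values[k] is
// the string repeated throughout that run.
type runEndStrings struct {
	runEnds []int32
	values  []string
}

// Len returns the logical length, which equals the last run end.
func (a runEndStrings) Len() int {
	if len(a.runEnds) == 0 {
		return 0
	}
	return int(a.runEnds[len(a.runEnds)-1])
}

// Value returns the string at logical index i by binary-searching for the
// first run whose end is greater than i.
func (a runEndStrings) Value(i int) string {
	if i < 0 || i >= a.Len() {
		panic("index out of range")
	}
	k := sort.Search(len(a.runEnds), func(j int) bool {
		return int(a.runEnds[j]) > i
	})
	return a.values[k]
}

func main() {
	// One run: the tag value "cpu0" repeated for 1000 logical rows.
	tag := runEndStrings{runEnds: []int32{1000}, values: []string{"cpu0"}}
	fmt.Println(tag.Len(), tag.Value(0), tag.Value(999)) // 1000 cpu0 cpu0

	// Several runs: "a" x3, "b" x2, "c" x1.
	multi := runEndStrings{runEnds: []int32{3, 5, 6}, values: []string{"a", "b", "c"}}
	fmt.Println(multi.Value(2), multi.Value(3), multi.Value(5)) // a b c
}
```

With a single run the binary search finishes after one comparison, which is why the single-repeated-value case stays cheap to access while storing only one run end and one string.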

Checklist

Dear Author 👋, the following checks should be completed (or explicitly dismissed) before merging.

  • ✏️ Write a PR description, regardless of triviality, to include the value of this PR
  • 🔗 Reference related issues
  • 🏃 Test cases are included to exercise the new code
  • 🧪 If new packages are being introduced to stdlib, link to Working Group discussion notes and ensure it lands under experimental/
  • 📖 If language features are changing, ensure docs/Spec.md has been updated

Dear Reviewer(s) 👋, you are responsible (among others) for ensuring the completeness and quality of the above before approval.

@mhilton mhilton requested a review from a team as a code owner August 12, 2025 10:39

@appletreeisyellow appletreeisyellow left a comment

Nice refactor! It would be helpful to link the run-end encoded layout doc somewhere in the comment for future reference.

Comment on lines +195 to +196
runEnds = array.NewInt32Data(data.Children()[0])
values = array.NewBinaryData(data.Children()[1])

The run-end encoded layout doc that you linked in the original issue was helpful in understanding the data structure here. Thank you for referencing!
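For readers without the layout doc at hand, the two children read in the snippet above can be pictured with a small sketch: child 0 holds the run ends and child 1 holds one value per run. The helper `encodeRuns` below is a hypothetical illustration using plain Go slices; it is not part of this PR and does not use the Arrow library.

```go
package main

import "fmt"

// encodeRuns collapses consecutive equal strings into runs. It returns the
// exclusive logical end index of each run and the value of each run, which
// mirrors the two-child shape (run ends, values) of a run-end encoded array.
func encodeRuns(in []string) (runEnds []int32, values []string) {
	for i, s := range in {
		if i > 0 && s == in[i-1] {
			// Same value as the previous element: extend the current run.
			runEnds[len(runEnds)-1] = int32(i + 1)
			continue
		}
		// New value: start a run that (so far) ends right after index i.
		runEnds = append(runEnds, int32(i+1))
		values = append(values, s)
	}
	return runEnds, values
}

func main() {
	runEnds, values := encodeRuns([]string{"a", "a", "a", "b", "b", "c"})
	fmt.Println(runEnds, values) // [3 5 6] [a b c]
}
```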

func (a *String) valuesIndex(i int) (int, bool) {
	if a.indices != nil {
		if a.indices.IsNull(i) {
			return 0, false
Contributor Author

You are looking at the dictionary encoding handling here.

Contributor

😅👍

@mhilton mhilton merged commit 507e7ef into master Aug 13, 2025
7 checks passed