Streams: Reduce memory usage for streams with only write positions #94
There's an unbounded Dictionary that maintains, for each stream that's ever been processed:
a) the backlog of events that have yet to be processed. Definitely necessary state.
b) the write position. Provides performance benefits as it enables:
i) deduplication of repeated events coming from the input source (the Ingester can throw away the events before they even get to the scheduler)
ii) proactive discarding, because a projection knows the index of the first event in the stream that will be useful to it. This applies, for example, when replicating to a warm standby that already has data, and also when catching up or replaying events in arbitrary projection scenarios. (The underlying mechanism is the same as in the previous case: the handler can yield a response that indicates the lowest position from which we are interested in processing events.)
However, because the Dictionary is unbounded and ever-growing, it risks eventually exhausting memory.
This PR optimizes the in-memory representation to delay (not prevent) this consequence.
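To make the role of the write position concrete, here is a minimal sketch in Python. It is not the project's implementation. The names `StreamState`, `StreamStates`, `ingest` and `mark_written` are hypothetical and chosen only for illustration. It assumes events carry a zero-based index within their stream, and that the write position is the lowest index still of interest.

```python
from dataclasses import dataclass, field


@dataclass
class StreamState:
    # Hypothetical per-stream entry: lowest index still of interest, plus pending events.
    write_pos: int = 0
    backlog: list = field(default_factory=list)  # (index, payload) pairs


class StreamStates:
    """Illustrative stand-in for the unbounded per-stream Dictionary."""

    def __init__(self):
        self._states = {}  # stream name -> StreamState; never pruned in this sketch

    def ingest(self, stream, index, payload):
        """Queue an event; return False if it was discarded up front."""
        state = self._states.setdefault(stream, StreamState())
        if index < state.write_pos:
            return False  # already handled, or below the position of interest
        if any(i == index for i, _ in state.backlog):
            return False  # repeated delivery of an event that is still pending
        state.backlog.append((index, payload))
        return True

    def mark_written(self, stream, next_index):
        """Record the handler's response: the lowest index still wanted."""
        state = self._states.setdefault(stream, StreamState())
        state.write_pos = max(state.write_pos, next_index)
        # Drop pending events that fall below the new write position.
        state.backlog = [(i, p) for i, p in state.backlog if i >= state.write_pos]

    def backlog(self, stream):
        return list(self._states[stream].backlog)

    def write_pos(self, stream):
        return self._states[stream].write_pos

    def tracked_streams(self):
        return len(self._states)
```

In this sketch, once a stream's backlog has drained, its entry still remains in the dictionary, holding only a write position. Entries of that kind are the case the PR title refers to, which is why a more compact per-entry representation delays memory exhaustion without preventing it.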
The existing memory layout was:
This PR changes from an Object / type to a struct / Struct with the following layout: