Fix data race when writing GeoParquet #13962
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
When writing GeoParquet, we stored the column-level auxiliary geoparquet statistics/metadata directly in the
GeometryColumnWriter
, which is not thread safe, causing issues when multiple row-groups are written concurrently.This PR moves the column-level geoparquet metadata from the
ColumnWriter
into the associatedColumnWriterState
, and also introduces a write-lock on the file-level geoparquet metadata struct so that the threads don't race each other when flushing the stats after finishing a row group.Additionally I've changed the
GeometryColumnWriter
into aWKBColumnWriter
that directly subclasses theStringColumnWriter
instead of trying to template the base class as that is significantly messier to do when we also need to wrap the associated state. This makes the code a bit cleaner, and makes sense given that the WKB geometry encoding is the only one we support right now anyway. I've also realized that the implementation for the other encodings is going to end up different enough that there is probably no point in trying to generalize all the geoparquet specific encoding logic into a common base class anyway.Closes: #13914