VectorChord v0.5.0

Two big upgrades, focused and pragmatic.

1) Experimental DiskANN (with RaBitQ) — `vchordg` (preview)

A new disk-backed graph index that keeps memory low while giving you a DiskANN-style option inside VectorChord.

When it shines: it can be faster than IVF+RaBitQ (vchordrq) on some embeddings (e.g., OpenAI/Cohere), but not always.
Caveats: index builds are slow, and insert/delete performance is weaker than with IVF. Results are dataset-dependent, so benchmark before switching.

Try it:

CREATE INDEX ON items USING vchordg (embedding vector_l2_ops)
WITH (options = $$
  m = 64
  ef_construction = 128
$$);

SET vchordg.ef_search = 128;

Memory knob: bits = 1 roughly halves index memory compared with the default bits = 2, which gives better recall and QPS.
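As a rough sketch of the memory knob above, the index from the earlier example could be rebuilt with 1-bit quantization. This assumes `bits` is set in the same options block as `m` and `ef_construction`; the values are illustrative, not tuned recommendations.

```sql
-- Sketch only: trade some recall/QPS for a smaller index.
-- Assumption: `bits` lives next to `m` and `ef_construction` in the options block.
CREATE INDEX ON items USING vchordg (embedding vector_l2_ops)
WITH (options = $$
  bits = 1
  m = 64
  ef_construction = 128
$$);

-- Query-time search width, as in the example above.
SET vchordg.ef_search = 128;
```

Benchmark both settings on your own data before committing to one.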

We’re shipping this so that VectorChord can be a one-stop vector search toolbox. Feel free to share any thoughts or questions about it!

2) Recall measurement for IVF+RaBitQ — vchordrq_evaluate_query_recall

Approximate ≠ exact. Now you can quantify how close your results are with vchordrq_evaluate_query_recall. It accepts a query that returns row identifiers (e.g., ctid) and returns a recall score.

SET vchordrq.probes = '100';
SET vchordrq.epsilon = 1.0;

SELECT vchordrq_evaluate_query_recall(query => $$
  SELECT ctid FROM items
  ORDER BY embedding <-> '[3,1,2]'
  LIMIT 10
$$);  -- add ", exact_search => true" for table-scan ground truth

Note: recall evaluation targets vchordrq in 0.5 (not vchordg yet).
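For reference, here is the same call with the `exact_search => true` argument mentioned in the comment above written out in full. The table `items` and the query vector are the same placeholders used earlier.

```sql
SET vchordrq.probes = '100';
SET vchordrq.epsilon = 1.0;

-- Compare the index result against a table-scan ground truth.
SELECT vchordrq_evaluate_query_recall(
  query => $$
    SELECT ctid FROM items
    ORDER BY embedding <-> '[3,1,2]'
    LIMIT 10
  $$,
  exact_search => true
);
```

A table scan can be slow on large tables, so it is probably best run on a small sample of representative queries.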

Other fixes

This release fixes several performance regressions, so users should see better performance after upgrading.

Talk to us

Thanks for building with us. If you have any questions or thoughts, open an issue, join our Discord, or start a Discussion. Your notes guide what we fix first. If VectorChord helped you, drop us a ⭐ on GitHub and hit Watch → Releases.

VectorChord 0.4.3 Release Notes

use mimalloc on aarch64-linux
fix compilation with gcc on x86_64
fix compilation with clang on Windows
prompt the user to rebuild the index after the upgrade

Full Changelog: 0.4.2...0.4.3

VectorChord 0.4.2 Release Notes

fix compilation on aarch64 macos
add support for pgxnclient: you can install VectorChord with pgxnclient install vchord==0.4.2 now

Full Changelog: 0.4.1...0.4.2

VectorChord 0.4.1 Release Notes

Fix a potential precision issue when the vector dimension is a power of two: 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384, or 32768.

Full Changelog: 0.4.0...0.4.1

VectorChord 0.4 Release Notes

Major Improvements

Streaming I/O & Page Prefetch
- Complete rewrite of page layout to enable pipelined computation with streaming I/O.
- On PostgreSQL 17, uses fadvise to prefetch buffers into the OS page cache, eliminating per-buffer read waits and fully leveraging disk throughput.
- In upcoming PostgreSQL 18, direct support for io_uring will further streamline asynchronous I/O.
- Benchmarks: 2–3× lower latency on cold queries (no buffer or page cache), translating to significantly improved tail latency in production.

Prefilter Acceleration
- Introduces true prefilter support for vector + filter queries.
- The previous postfilter approach ranked the full result set and then applied filters, which is inefficient when the filter is highly selective (e.g., only 1% of rows pass).
- Applies SQL filters before full precision vector distance computations, avoiding unnecessary work.
- Benchmarks: Up to 3× faster end-to-end search on highly selective filters without any additional tuning.
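A vector + filter query of the kind described above might look like the following sketch. The `category` column is hypothetical. The `vchordrq.prefilter` setting is an assumption based on my reading of the VectorChord documentation and is not stated in these notes, so verify it before relying on it.

```sql
-- Assumption: prefiltering is toggled by this setting; check the docs for your version.
SET vchordrq.prefilter = on;

-- Hypothetical schema: items(id, category, embedding).
-- The WHERE clause is evaluated before full-precision distance computation.
SELECT id
FROM items
WHERE category = 'books'
ORDER BY embedding <-> '[3,1,2]'
LIMIT 10;
```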

Other Improvements

Optimized Residual Quantization
- Collaboration with RaBitQ author Jianyang: refactored the distance term $|\langle o, q - c \rangle|$ into $\langle o, q \rangle - \langle o, c \rangle$, so the query vector is quantized only once.
- Result: ~20% QPS improvement over 0.3.
- Recommendation: Enable residual quantization for L2 workloads.
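The recommendation above could be applied at index build time roughly as follows. This is a sketch: the `residual_quantization` option name and the `[build.internal]` section follow my understanding of the vchordrq index options, and `lists = [1000]` is an illustrative value rather than a tuned one.

```sql
-- Sketch: enable residual quantization for an L2 index.
-- Option names are assumptions to verify against the docs for your version.
CREATE INDEX ON items USING vchordrq (embedding vector_l2_ops)
WITH (options = $$
  residual_quantization = true
  [build.internal]
  lists = [1000]
$$);
```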

Fast Walsh-Hadamard Transform for Rotation
- Collaboration with RaBitQ author Jianyang: replaced manual vchordrq.prewarm_dim GUC with an on-the-fly Fast Walsh-Hadamard Transform.
- Removes the need to configure a prewarmed dimension list and yields marginal speed gains during setup.

Thank you for using VectorChord! As always, we welcome feedback and contributions on GitHub.

Full Changelog: 0.3.0...0.4.0

@usamoi

VectorChord 0.3.0 Release Notes

Features

Native support for the maxsim operator and efficient indexing inspired by the XTR-WARP project. This makes it possible to build ColBERT- or ColPali-style multi-vector retrieval applications seamlessly within PostgreSQL.
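A minimal multi-vector setup might look like the sketch below. The `@#` operator and the `vector_maxsim_ops` operator class are written from my recollection of the VectorChord documentation and are not named in these notes, so treat them as assumptions. The table `docs` and its 3-dimensional vectors are hypothetical.

```sql
-- Hypothetical table: each row stores an array of token-level embeddings.
CREATE TABLE docs (id bigserial PRIMARY KEY, embeddings vector(3)[]);

-- Assumed operator class for maxsim indexing.
CREATE INDEX ON docs USING vchordrq (embeddings vector_maxsim_ops);

-- Rank documents by maxsim against a multi-vector query.
SELECT id
FROM docs
ORDER BY embeddings @# ARRAY['[3,1,2]'::vector, '[2,2,2]'::vector]
LIMIT 5;
```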

Improvements

More KMeans parameters can be configured
Better progress report for internal KMeans build

What's Changed

feat: check quals to skip rerank if rerank_in_table is enabled by @usamoi in #206
feat: maxsim operator and indexing on maxsim operator by @usamoi in #197
chore: update dependencies by @usamoi in #216
feat: use mimalloc by @usamoi in #217
feat: emit INFO when performing kmeans by @usamoi in #219
feat: encode kmeans progress to phase name by @usamoi in #220
readme: move docs to docs repo by @usamoi in #222
chore: change default lists and probes to empty by @usamoi in #223
feat: allows skipping rerank by @usamoi in #225
delete cnpg image by @xieydd in #224
refactor: remove allows_skipping_rerank by @usamoi in #226
fix: apply epsilon for default search by @usamoi in #228
chore: remove patch of crate half by @usamoi in #231
fix: release env vars by @cutecutecat in #232
chore: update 0.3.0 schema script by @usamoi in #230

Full Changelog: 0.2.2...0.3.0

@xieydd

VectorChord 0.2.2 Release Notes

What's Changed

Support linux/arm64 for vchord-cnpg docker image by @xieydd in #184
refactor: split quantized data vectors to two tapes by @usamoi in #196
chore: add ghcr image by @cutecutecat in #200
chore: update readme and scripts with url by @cutecutecat in #198
fix: lint by ruff by @cutecutecat in #201
fix: wrong operators for halfvec by @cutecutecat in #202
doc: add new options by @usamoi in #207
fix: do not leak memory in heap fetch and fix reading tuples in prewarm by @usamoi in #205
fix: recycle pages in maintainance by @usamoi in #204
chore: update v0.2.2 schema script by @usamoi in #210
fix: correct logic of marking free pages by @usamoi in #211
fix: remove heapify by @usamoi in #213
fix error in cnpg amd64 Dockerfile by @xieydd in #214
ci: fix release docker build by @usamoi in #215

Full Changelog: 0.2.1...0.2.2

@usamoi

VectorChord 0.2.1 Release Notes

Major Improvement

External centroid index builds are now about 30% faster. Building an index for 100M vectors now takes about 30 hours on an i4i.xlarge with only 4 vCPUs.

What's Changed

refactor: move algorithm to a crate by @usamoi in #172
feat: pinning index in memory when building, second try by @usamoi in #181
fix: use linked list of vectors to skip realloc by @usamoi in #182
feat: use select algorithm to replace heap, if k in top-k is expected to be small by @usamoi in #183
ci: install pg13 in docker image by @usamoi in #186
ci: use less docker by @usamoi in #187
feat: rerank by fetching vectors in heap table by @usamoi in #189
ci: enable CI for pg13 by @usamoi in #185
chore: update dependencies by @usamoi in #190
refactor: remove meaningless target feature requirements by @usamoi in #192
fix: test simd operations in emulator by @usamoi in #193
feat: neon impl of u8::reduce_sum_of_x by @usamoi in #194
chore: update 0.2.1 schema (upgrade) script by @usamoi in #195

Full Changelog: 0.2.0...0.2.1

@gaocegege

VectorChord 0.2 Release Notes

We are thrilled to announce the release of VectorChord 0.2, advancing vector search capabilities within PostgreSQL.

🚀 New Features

Optimized Storage Layout

Long Cross-Page Vector Support: Redesigned internal storage allows vectors to span multiple 8KB PostgreSQL pages, enabling support for vectors with more than 2000 dimensions, up to 16000 dimensions.
Enhanced Storage Efficiency: Achieves higher storage density by minimizing wasted space, reducing index size by up to 50% compared to version 0.1.

Additional Data Types

Float16 Support: Introduces Float16 data type, allowing users to halve the storage space required with a slight decrease in recall. Note that Float16 does not reduce the size of quantized vectors, maintaining 1 bit per dimension for original vector representation.
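As a sketch of Float16 indexing, assuming the half-precision type is pgvector's `halfvec` and that the matching operator class is `halfvec_l2_ops` (neither name is spelled out in these notes), a table and index could be declared as follows. The table name and dimension are hypothetical.

```sql
-- Hypothetical table storing half-precision embeddings.
CREATE TABLE items_fp16 (id bigserial PRIMARY KEY, embedding halfvec(3));

-- Assumed operator class for L2 distance on halfvec.
CREATE INDEX ON items_fp16 USING vchordrq (embedding halfvec_l2_ops);

SELECT id
FROM items_fp16
ORDER BY embedding <-> '[3,1,2]'
LIMIT 10;
```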

Architecture Enhancements

ARM Architecture Support: Rewritten distance calculations and Fast Scan implementations using the Scalable Vector Extension (SVE) instruction set for optimal performance on ARM-based systems.
AWS Graviton4 Compatibility: Leverage the latest i8g platform based on Graviton4 processors for improved performance at the same cost as i4i models.

⚡ Performance Improvements

Reduced Index Size: Up to 50% reduction in index size compared to version 0.1.

🔧 Getting Started

Comprehensive getting started guides will be available soon.

📝 Summary

VectorChord 0.2 introduces support for high-dimensional vectors, Float16 data type, ARM architecture optimizations, and a more compact storage layout. These enhancements collectively improve storage efficiency and query performance, providing a superior vector retrieval experience within PostgreSQL.

What's Changed

fix: Unlock the conversation in CLA bot by @gaocegege in #86
fix: CI env SEMVER by @kemingy in #87
fix: format by @cutecutecat in #90
feat: max_scan_tuples by @usamoi in #94
test: fix push down tests by @cutecutecat in #95
fix: set max dimension to 1600 in readme by @usamoi in #97
fix: set max dimension to 1600 by @usamoi in #98
docs: Update README by @VoVAllen in #103
feat: support up to 60000 dimensions by @usamoi in #100
chore: directory structure by @usamoi in #104
feat: vchordrqfscan by @usamoi in #105
chore: add CI to build the pgrx image by @kemingy in #107
chore: fix the pgrx ci by @kemingy in #108
fix: optimize insertions in building when lists = 1 by @usamoi in #106
chore: build multiarch docker images by @kemingy in #112
docs(readme): fix markdown style for docker run by @kemingy in #113
feat: improve internal build by @usamoi in #115
add enterprise image build step to ci by @xieydd in #114
chore(README): Add some benchmark data by @gaocegege in #126
fix: use stable toolchain by default by @usamoi in #128
feat: scalar8 & indexing on halfvec by @usamoi in #131
feat: Use dual license (AGPLv3 and ELv2) by @gaocegege in #130
chore: update base by @usamoi in #137
fix: remove sudo in dockerfile by @usamoi in #138
fix: ci by @usamoi in #139
fix: preprocess for halfvec by @usamoi in #140
docs: update README for clarity and new features by @VoVAllen in #142
fix: set an implicit root in external build if parents are not set by @usamoi in #147
chore: type check and external test by @cutecutecat in #149
chore: update dependencies by @usamoi in #151
fix: update docker image in ci by @usamoi in #152
Update enterprise dockerfile by @xieydd in #148
feat: build multiarch pgrx image by @kemingy in #153
chore: impl dereference traits for page guards by @usamoi in #156
chore: fix the psql & release CI target, update readme by @kemingy in #158
chore: add postgres sqllogicaltest for arm by @kemingy in #159
chore: update link in readme by @VoVAllen in #160
refactor: move pgvecto.rs base to this repo by @usamoi in #161
fix: pick feat back to vchordrqfscan by @usamoi in #162
chore: fix discord and x badge by @kemingy in #170
feat: unify vchordrq and vchordrqfscan by @usamoi in #167
fix: respect aliasing rule by not reading past of reference by @usamoi in #169
fix: correct output of prewarm by @usamoi in #173
fix: add magic number and version number to meta tuple by @usamoi in #174
Release 0.2.0 by @cutecutecat in #177
chore: install zip in tensorchord-pgrx by @usamoi in #178
fix: release package name, version and licenses by @usamoi in #179

New Contributors

@xieydd made their first contribution in #114

Full Changelog: 0.1.0...0.2.0

@gaocegege

VectorChord 0.1.1-alpha.1 Release Notes

Highlights

Support fp16 vectors
Support vectors longer than 2000 dimensions

What's Changed

fix: Unlock the conversation in CLA bot by @gaocegege in #86
fix: CI env SEMVER by @kemingy in #87
fix: format by @cutecutecat in #90
feat: max_scan_tuples by @usamoi in #94
test: fix push down tests by @cutecutecat in #95
fix: set max dimension to 1600 in readme by @usamoi in #97
fix: set max dimension to 1600 by @usamoi in #98
docs: Update README by @VoVAllen in #103
feat: support up to 60000 dimensions by @usamoi in #100
chore: directory structure by @usamoi in #104
feat: vchordrqfscan by @usamoi in #105
chore: add CI to build the pgrx image by @kemingy in #107
chore: fix the pgrx ci by @kemingy in #108
fix: optimize insertions in building when lists = 1 by @usamoi in #106
chore: build multiarch docker images by @kemingy in #112
docs(readme): fix markdown style for docker run by @kemingy in #113
feat: improve internal build by @usamoi in #115
add enterprise image build step to ci by @xieydd in #114
chore(README): Add some benchmark data by @gaocegege in #126
fix: use stable toolchain by default by @usamoi in #128
feat: scalar8 & indexing on halfvec by @usamoi in #131
feat: Use dual license (AGPLv3 and ELv2) by @gaocegege in #130
chore: update base by @usamoi in #137
fix: remove sudo in dockerfile by @usamoi in #138
fix: ci by @usamoi in #139
fix: preprocess for halfvec by @usamoi in #140

New Contributors

@xieydd made their first contribution in #114

Full Changelog: 0.1.0...0.1.1-alpha.1
