Though it's worth noting that the license is AGPL. So if the idea is for this to take over for pgvecto.rs, it's an important data point for those building SaaS products.
It will make pgvector the only permissively licensed option, given it has the same license as Postgres.
If the data distribution shifts, the optimal solution would be to rebuild the index. We believe that HNSW also experiences challenges with data distribution to some extent. However, without rebuilding, our observations suggest that users are more likely to experience slightly longer query times rather than a significant loss in recall.
[1] https://github.com/google-research/google-research/tree/mast...
[2] Also available on something like AlloyDB on GCP: https://cloud.google.com/alloydb/docs/ai/store-index-query-v...
[3] https://ann-benchmarks.com/glove-100-angular_10_angular.html
Disclaimer: Working for Google, but nowhere close to Databases.
Additionally, I couldn’t find any performance benchmarks for ScaNN integrated with PostgreSQL, particularly in comparison to pgvector or standalone ScaNN. The publicly available metrics focus exclusively on query-only indexing outside of the database.
On our side, we’ve implemented the fastscan kernel for bit-vector scanning, which is considered one of ScaNN’s key advantages.
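For intuition, the core operation fastscan accelerates is distance computation over quantized bit vectors; the real kernel uses SIMD in-register lookup tables, but a plain NumPy sketch of the brute-force bit-vector (Hamming) scan it speeds up looks like this (function and variable names are my own, not from any codebase):

```python
import numpy as np

def hamming_scan(query: np.ndarray, db: np.ndarray) -> np.ndarray:
    """Brute-force Hamming-distance scan over packed bit vectors.

    query: (n_bytes,) uint8 -- one binary-quantized vector
    db:    (n_vectors, n_bytes) uint8 -- the whole collection
    Returns the Hamming distance from query to every row of db.
    """
    xor = np.bitwise_xor(db, query)                # bits that differ
    return np.unpackbits(xor, axis=1).sum(axis=1)  # popcount per row
```

A production kernel replaces the unpack-and-sum with per-byte popcount lookups done in SIMD registers, but the distances it returns are the same.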
Really appreciate it and it makes perfect sense.
Here is a link to the cost calculator. Note that the calculator includes the cost of ingestion, but the article only mentions storage costs, not ingestion costs: https://www.datastax.com/pricing/vector-search?cloudProvider...
Disclaimer: I work on vectorsearch/AstraDB at DataStax.
1. Uses half-vecs, so you cut everything down by half with no recall loss.
2. Uses token pooling with hierarchical clustering at a factor of 3, so you further cut things down by 2/3rds with <1% loss.
3. Everything is on Postgres and pgvector, so you can do all the usual Postgres stuff and shrink the searched corpus with document-metadata filtering.
4. We have a 5000+ page corpus in production with <3 seconds latency.
5. We benchmark against the Vidore leaderboard and are very near SOTA.
You can read about half-vecs here: https://jkatz05.com/post/postgres/pgvector-scalar-binary-qua...
Hierarchical token pooling: https://www.answer.ai/posts/colbert-pooling.html
And how we implemented them here: https://blog.colivara.com/
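As a rough illustration of the token-pooling step (my own sketch based on the answer.ai post, not ColiVara's actual code; I'm assuming Ward-linkage clustering and mean-pooling within each cluster):

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def pool_tokens(embs: np.ndarray, pool_factor: int = 3) -> np.ndarray:
    """Shrink a document's token embeddings ~pool_factor-fold by
    hierarchically clustering similar tokens and mean-pooling each cluster."""
    n = embs.shape[0]
    if n <= pool_factor:
        return embs
    n_clusters = max(1, n // pool_factor)
    labels = fcluster(linkage(embs, method="ward"),
                      t=n_clusters, criterion="maxclust")
    pooled = np.stack([embs[labels == c].mean(axis=0)
                       for c in np.unique(labels)])
    # Re-normalize so late-interaction (MaxSim) scores stay on the cosine scale.
    return pooled / np.linalg.norm(pooled, axis=1, keepdims=True)
```

With a pool factor of 3 you keep roughly a third of the vectors per document, which is where the 2/3rds storage cut above comes from.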
It is a small upgrade, but one nonetheless. The complexity and cost of multi-vectors *might* not make this worth it; it really depends on how accuracy-critical the task is.
For example, one of our customers does this over FDA monographs, which are roughly 95%+ text and 5% tables. For them the misses were extremely painful, even though there weren't that many in text-based pipelines. So the migration made sense for them.
I am OK with it being less efficient, since the dev UX will be amazing. Vespa ops (even in their cloud) are a complete nightmare compared to Postgres.
It’s also worth noting that Elasticsearch has implemented RaBitQ support for HNSW, so it's difficult to compare without running actual benchmarks. However, Elasticsearch typically requires at least double, if not triple, the memory size of the vector dataset to maintain system stability. In contrast, PostgreSQL can achieve a stable system with far fewer resources: for example, 32GB of memory is sufficient to manage 100 million vectors efficiently.
From my perspective, it would be faster at query time than Elasticsearch due to the extensive optimizations, and much, much faster for updates (inserts and deletes) due to using IVF instead of HNSW.
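A back-of-envelope check on that 32GB claim (the dimensionality is my assumption; the comment doesn't state it):

```python
# 100M vectors, assuming 768 dimensions (not stated in the comment).
n, dims = 100_000_000, 768

full_gb = n * dims * 4 / 1e9  # float32 vectors: ~307 GB, far above 32 GB
bits_gb = n * dims / 8 / 1e9  # 1-bit quantization (RaBitQ-style): ~9.6 GB
```

So a 32GB budget is only plausible if the in-memory representation is the quantized bit vectors plus index structures, with full-precision vectors staying on disk.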
For an example of how you can communicate with domain experts while still giving everyone else some form of clue as to what the hell you’re talking about, check out the link to the product that this thing claims to be a successor to:
That starts off by telling us what it is and what it does.
You can easily store 1B vectors in Zilliz Serverless and the cost is incredibly cheap:
$0.318/hour ($228.96/month)
Which means you can store 20M 768-dimensional vectors for $228 per month.
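The quoted price works out exactly under a 720-hour month, and the raw vector footprint is easy to estimate too (float32 width is my assumption; the rate and dimensions are from the comment):

```python
hourly = 0.318
monthly = hourly * 720  # 228.96 -- matches the quoted $228.96/month

# Raw float32 footprint of 20M 768-dim vectors, before any index overhead.
raw_gb = 20_000_000 * 768 * 4 / 1e9  # 61.44 GB
```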
The title here, the presentation on the page itself. Everything screams "landing page". I had to go back on a desktop browser to see the word "blog" in the URL bar, and mentally shift those graphics and little islands of text around until I could view it from that lens. If it's really just a sub-product of the main product that they're talking about, then yeah, it makes more sense in that context.
But my answer to your question would still be "Yes". Absolutely. If you're a product, the job of your blog is to convince people coming off the street that they need your thing, even if they didn't realize it yet.
Step one of that process is to not bounce them back to the street without any idea what they're looking at.