Using Vectorize to build an unreasonably good search engine in 160 lines of code
71 points
3 days ago
| 8 comments
| blog.partykit.io
simonw
3 hours ago
[-]
I was super-excited about vector search and embeddings in 2024 but my enthusiasm has faded somewhat in 2025 for a few reasons:

- LLMs with a grep or full-text search tool turn out to be great at fuzzy search already - they throw a bunch of OR conditions together and run further searches if they don't find what they want

- ChatGPT web search and Claude Code code search are my favorite AI-assisted search tools and neither bother with vectors

- Building and maintaining a large vector search index is a pain. The vectors are usually pretty big and you need to keep them in memory to get truly great performance. FTS and grep are way less hassle.

- Vector matches are weird. You get back the top twenty results; those might be super relevant or total garbage, and it's on you to do a second pass to figure out whether they're actually useful.

I expected to spend much of 2025 building vector search engines, but ended up not finding them as valuable as I had thought.
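The "bunch of OR conditions" trick from the first bullet can be sketched in a few lines. This is a toy stand-in for what an LLM-driven grep/FTS tool effectively does (the documents and the scoring-by-term-overlap heuristic are made up for illustration; a real setup would use grep or an FTS engine):

```python
# Toy sketch of OR-style fuzzy search: tokenize the query, keep any
# document containing at least one term, rank by distinct terms matched.
def or_search(query, docs):
    terms = set(query.lower().split())
    scored = []
    for doc_id, text in docs.items():
        hits = terms & set(text.lower().split())
        if hits:
            scored.append((len(hits), doc_id))
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored]

docs = {
    "a": "vector search with embeddings",
    "b": "full text search with sqlite fts5",
    "c": "notes on gardening",
}
print(or_search("full text search embeddings", docs))  # b matches 3 terms, a matches 2
```

The LLM's second trick (running further searches if the first pass misses) is just calling this again with reworded terms.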

reply
markerz
1 hour ago
[-]
The problem with LLMs using full-text search is that they're very slow compared to a vector search query. I will admit the results are impressive, but often that's because I kick off an agent query and step away for 5 minutes.

On the other hand, generating and regenerating embeddings for all your documents can be time-consuming and costly, depending on how often you need to reindex.
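The reindexing cost can be cut down by only re-embedding documents whose content actually changed. A minimal sketch, where embed() is a placeholder for the real (expensive) embedding call:

```python
import hashlib

def embed(text):
    # Stand-in for a real embedding call -- this is the expensive part
    # you want to skip for unchanged documents.
    return [float(len(text))]

def reindex(docs, cache):
    """docs: {doc_id: text}; cache: {doc_id: (content_hash, vector)},
    mutated in place. Returns the ids that actually got re-embedded."""
    changed = []
    for doc_id, text in docs.items():
        h = hashlib.sha256(text.encode()).hexdigest()
        if cache.get(doc_id, (None, None))[0] != h:
            cache[doc_id] = (h, embed(text))
            changed.append(doc_id)
    return changed
```

On the second run only edited documents hit the embedding call, so steady-state reindexing cost tracks churn rather than corpus size.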

reply
____tom____
37 minutes ago
[-]
You didn't build a search engine in 160 lines of code. You built a client for a search engine in 160 lines of code. The vector database is providing the search.
reply
mips_avatar
5 hours ago
[-]
There are a lot of previously intractable problems that are getting solved with these new embedding models. I've been building a geocoder for the past few months, and it's been remarkable how close to Google Places I can get with just slightly enriched OpenStreetMap data plus embedding vectors.
reply
occupant
5 hours ago
[-]
That sounds really interesting. If you’re open to it, I’d be curious what the high-level architecture looks like (what gets embedded, how you rank results)?
reply
robrenaud
1 hour ago
[-]
What are you embedding? Are you doing a geo-restricted area (a small universe)?
reply
isaachh
3 hours ago
[-]
I'd love to hear more about this.
reply
RomanPushkin
2 hours ago
[-]
You might be getting a good _recall_ rate, since Vectorize search is ANN, but the _precision_ can be low because the reranker piece is missing. So I would slightly improve it by adding 10 more lines of code and introducing a reranker after the search (while slightly increasing topK). Query expansion at the beginning can also be added to improve recall.
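The suggested fix is roughly: over-fetch from the ANN index (larger topK for recall), then rerank the candidates for precision. A sketch with a stand-in scorer -- score() here is toy term overlap, where a real setup would call a cross-encoder or a rerank API:

```python
def score(query, text):
    # Toy relevance score: fraction of query terms present in the text.
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / max(len(q), 1)

def search_with_rerank(query, ann_search, top_k=50, top_n=5):
    # Recall phase: pull a generous candidate set from the ANN index.
    candidates = ann_search(query, top_k)  # list of (doc_id, text)
    # Precision phase: rerank candidates and keep only the best few.
    ranked = sorted(candidates, key=lambda c: score(query, c[1]), reverse=True)
    return ranked[:top_n]
```

ann_search is whatever your vector store exposes; the point is that the second pass is what decides whether the "top twenty" are actually useful.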
reply
repeekad
3 hours ago
[-]
What about re-ranking? In my limited experience, adding fast+cheap re-ranking with something like Cohere to the query results took an okay vector-based search and made the top 1-5 results much stronger.
reply
vjerancrnjak
3 hours ago
[-]
Query expansion works better.
reply
sa-code
2 hours ago
[-]
Query expansion and re-ranking can and often do coexist.

Roughly, first there is the query analysis/manipulation phase, where you might have NER, spell check, query expansion/relaxation, etc.

Then there is the selection phase, where you retrieve all items that are relevant. Sometimes people will bring in results from both text and vector based indices. Perhaps an additional layer to group results.

Then finally you have the reranking layer, using a cross-encoder model which might even have some personalisation in the mix.

Also, with vector search you might not necessarily need query expansion, since semantic similarity already does loose association. But every domain is unique and there's only one way to find out.
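The three phases can be wired together in a few lines. Everything here (the synonym table, both search functions, the reranker) is a made-up stand-in for real components, just to show the shape of the pipeline:

```python
# Hypothetical synonym table standing in for real query expansion.
SYNONYMS = {"car": ["automobile"], "fast": ["quick"]}

def expand(query):
    # Query analysis phase: add synonyms to the raw query terms.
    terms = query.lower().split()
    extra = [s for t in terms for s in SYNONYMS.get(t, [])]
    return terms + extra

def pipeline(query, text_search, vector_search, rerank, top_n=3):
    terms = expand(query)
    # Selection phase: merge results from both indices, dedupe by doc id.
    candidates = {}
    for doc_id, text in text_search(terms) + vector_search(query):
        candidates[doc_id] = text
    # Reranking phase: a cross-encoder would go here.
    ranked = rerank(query, list(candidates.items()))
    return ranked[:top_n]
```

Each argument is pluggable, which is handy when you're testing whether your domain actually needs the expansion step or not.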

reply
repeekad
3 hours ago
[-]
Query expansion happens before the retrieval query; reranking is applied after the ranked results are returned. Both are important.
reply
yuzhun
2 hours ago
[-]
While embeddings are generally not required in the context of code, I am interested in how they perform in the legal and regulatory domain, where documents are substantially longer. Specifically, how do embeddings compare with approaches such as ripgrep in terms of effectiveness?
reply
sa-code
2 hours ago
[-]
Models like bge are small, and quantized versions will fit in a browser or on a tiny machine. Not sure why everyone reaches for an API as their first choice.
reply
Supermancho
6 hours ago
[-]
The site has a neat feature where you can see other people's cursors, marked with what look like regional notations, scrolling through the content.
reply
wqaatwt
17 minutes ago
[-]
Seems fantastic for analytics. I wonder how many news sites do that
reply
wormpilled
4 hours ago
[-]
It's amazing! Got so distracted, gotta switch to reader mode haha. Never seen anything like that.
reply
fnord77
4 hours ago
[-]
that got annoying fast
reply