FilterHN

Ask HN: How would you architect a RAG system for 10M+ documents today?

10 points | 8 hours ago | 2 comments

I'm tasked with building a private AI assistant for a corpus of 10 million text documents (living in PostgreSQL). The goal is semantic search and chat, with a requirement for regular incremental updates.

I'm trying to decide between:

Bleeding edge: Implementing something like LightRAG or GraphRAG.

Proven stack: Standard Hybrid Search (Weaviate/Elastic + Reranking) orchestrated by tools like Dify.

For those who have built RAG at this scale:

What is your preferred stack for 2025?

Is the complexity of Graph/LightRAG worth it over standard chunking/retrieval for this volume?

How do you handle maintenance and updates efficiently?

Looking for architectural advice and war stories.
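For context on the "proven stack" option: hybrid search needs some way to merge the keyword ranking and the vector ranking before reranking. One common choice is reciprocal rank fusion (RRF), sketched below. This is only an illustration under assumptions, not how Weaviate, Elastic, or Dify necessarily do it internally; the function name, the document IDs, and the constant k=60 are placeholders picked for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of doc IDs into one list.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by doc ID for deterministic output.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from a keyword (BM25) query and a vector query.
keyword_hits = ["doc3", "doc1", "doc7"]
vector_hits = ["doc1", "doc9", "doc3"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

The fused list would then go to a reranker, which only has to score a few dozen candidates instead of the whole corpus.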

parentheses

6 hours ago

If it's under 100M vectors of dimension 1024, the raw vectors fit in roughly 100 GB of memory at one byte per dimension (e.g. int8-quantized; float32 would need about four times that). So keeping everything in memory may be the easiest way to go. This ignores a lot of "database problems", though. If the docs are changing constantly, or you have other scalability concerns, you may be better off using a "proper" vector DB. There have been HN posts indicating that the choice of vector DB matters, so do your research there.
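The back-of-the-envelope arithmetic can be checked with a few lines. This is a rough sketch that counts raw vector bytes only; index structures (HNSW graphs, metadata, IDs) add overhead that is not modeled here. The function name and the one-vector-per-document assumption are mine; chunking would multiply the vector count.

```python
def index_size_gb(num_vectors: int, dims: int, bytes_per_dim: int) -> float:
    """Raw vector storage in GB (10**9 bytes), excluding any index overhead."""
    return num_vectors * dims * bytes_per_dim / 1e9


# Assumes one 1024-dimensional vector per document.
for label, n in [("10M vectors", 10_000_000), ("100M vectors", 100_000_000)]:
    for dtype, width in [("float32", 4), ("float16", 2), ("int8", 1)]:
        size = index_size_gb(n, 1024, width)
        print(f"{label}, 1024 dims, {dtype}: {size:.1f} GB")
```

At float32, 10M vectors come to about 41 GB and 100M to about 410 GB, so the ~100 GB figure for 100M vectors only holds with one-byte-per-dimension quantization.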

walpurginacht

3 hours ago

Do you have an evaluation in place that shows you need the complex stuff? If not, I'd start simple with a proven stack and collect usage data to determine what's next.
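A minimal sketch of what "an evaluation in place" could look like, assuming you have a small hand-labeled set of queries with known relevant document IDs. The metric here is recall@k; the function names, query IDs, and document IDs are made up for illustration.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant doc IDs that appear in the top-k retrieved."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant)


def mean_recall_at_k(results, labels, k):
    """Average recall@k over all labeled queries.

    results: query -> ranked list of retrieved doc IDs
    labels:  query -> set of relevant doc IDs
    """
    per_query = [recall_at_k(results.get(q, []), labels[q], k) for q in labels]
    return sum(per_query) / len(per_query)


# Hypothetical labeled queries and the output of some retrieval pipeline.
labels = {"q1": {"a", "b"}, "q2": {"c"}}
results = {"q1": ["a", "x", "b"], "q2": ["y", "z", "c"]}

print(mean_recall_at_k(results, labels, k=2))
print(mean_recall_at_k(results, labels, k=3))
```

Running the same labeled set against the simple pipeline and a graph-based one gives a number to compare before committing to the extra complexity.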