Show HN: Self-hosted RAG with MCP support for OpenClaw
4 points
2 months ago
| 1 comment
| github.com
I've been using OpenClaw to control my home server via WhatsApp, but it couldn't access my documents. Instead of uploading my private contracts to OpenAI, I built ClawRAG – a self-hosted RAG engine that connects to OpenClaw via MCP (Model Context Protocol). Now I can ask "What did the contract say about liability?" and get cited answers, not hallucinations.

Most RAG systems are either too complex for a solo dev's home setup or they rely on cloud-hosted vector stores. I needed something that runs in a single Docker container, understands messy PDFs (tables!), and integrates natively as a "tool" for agents rather than just another REST endpoint.

## Technical Deep Dive

### Why MCP instead of REST?

I chose the Model Context Protocol (MCP) because it provides structured schemas that LLMs understand natively. The MCP server exposes `query_knowledge` as a tool, allowing the agent to decide exactly when to pull from the knowledge base and when to rely on its built-in memory. This prevents "tool drift" and ensures type-safe responses.
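A minimal sketch of what such a tool declaration could look like on the MCP server side. The field names (`collection`, `top_k`) and defaults here are illustrative assumptions, not ClawRAG's actual schema:

```python
# Hypothetical MCP tool declaration for a knowledge-base search tool.
# Field names and defaults are illustrative, not ClawRAG's real schema.
QUERY_KNOWLEDGE_TOOL = {
    "name": "query_knowledge",
    "description": "Search the private knowledge base and return cited passages.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "Natural-language question to answer from the documents",
            },
            "collection": {
                "type": "string",
                "description": "Name of the document collection to search",
            },
            "top_k": {"type": "integer", "default": 5},
        },
        "required": ["query"],
    },
}
```

Because the schema is explicit and typed, the agent can decide per turn whether to call the tool at all, and the host can validate arguments before they ever reach the retrieval layer.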

### The Stack

- *Parsing*: Docling 2.13.0 (the first parser I've found that doesn't choke on nested tables in legacy PDFs).
- *Storage*: ChromaDB (lightweight and file-based; no Postgres/pgvector overhead for a personal knowledge base).
- *Search*: Hybrid (vector similarity + BM25 keyword search), fused with Reciprocal Rank Fusion (RRF) for better retrieval of specific legal jargon.
- *Footprint*: Optimized to run under 2GB RAM (excluding the local LLM).
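For anyone unfamiliar with RRF: it merges two ranked result lists using only rank positions, so the incompatible score scales of vector similarity and BM25 never need to be normalized. A self-contained sketch (not ClawRAG's actual code):

```python
def rrf_fuse(vector_ranked, bm25_ranked, k=60):
    """Fuse two best-first ranked lists of chunk IDs with
    Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank).

    k=60 is the constant from the original RRF paper; it damps the
    dominance of the very top ranks.
    """
    scores = {}
    for ranking in (vector_ranked, bm25_ranked):
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)

# "c2" ranks highly in both lists, so it should come out on top even
# though neither retriever put it first everywhere.
vector_hits = ["c1", "c2", "c3"]
bm25_hits = ["c2", "c4", "c1"]
print(rrf_fuse(vector_hits, bm25_hits))
```

This is why hybrid search helps with legal jargon: BM25 nails exact terms like "indemnification" that embeddings sometimes blur, while the vector side catches paraphrases.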

### The tricky part: Citation Preservation

Getting citations to work reliably over a WhatsApp round-trip was the biggest challenge. I had to ensure chunk IDs and source metadata survive the transformation from ChromaDB → LlamaIndex → LLM → OpenClaw → WhatsApp without getting "summarized away" or sanitized by the LLM's output formatting.
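One common way to make citations survive that pipeline is to never route them through the LLM at all: the model only sees and returns plain text, and the citation footer is assembled afterwards from the retrieved chunks' metadata. A sketch under that assumption (the `source`/`page`/`chunk_id` field names are illustrative, not ClawRAG's actual metadata layout):

```python
def format_cited_answer(answer_text, source_chunks):
    """Append a citation footer built from chunk metadata instead of
    trusting the LLM to echo citations back -- so they cannot be
    "summarized away" by the model's output formatting.

    source_chunks: list of dicts with 'source', 'page', 'chunk_id' keys
    (illustrative field names).
    """
    citations, seen = [], set()
    for chunk in source_chunks:
        key = (chunk["source"], chunk["page"])
        if key not in seen:  # de-duplicate chunks from the same page
            seen.add(key)
            citations.append(f"[{len(seen)}] {chunk['source']}, p. {chunk['page']}")
    return answer_text + "\n\nSources:\n" + "\n".join(citations)

chunks = [
    {"source": "lease.pdf", "page": 4, "chunk_id": "c17"},
    {"source": "lease.pdf", "page": 4, "chunk_id": "c18"},
]
print(format_cited_answer("Snow removal is the landlord's duty.", chunks))
```

The remaining risk is downstream sanitization (OpenClaw/WhatsApp stripping formatting), which is why the footer here is plain text rather than markdown links.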

## Use Case

Last week my landlord claimed I had signed a clause about garden/snow maintenance. I pulled out my phone and wrote to my OpenClaw bot: "Search my lease for gardening obligations". It found the relevant paragraph in 3 seconds, cited the page and section, and provided the exact quote. Argument closed.

## Quick Start

The repo includes a `docker-compose.yml` that spins up everything, including the vector store:

```bash
# 1. Start ClawRAG
docker compose up -d

# 2. Add your documents
curl -X POST http://localhost:8080/api/v1/rag/documents/upload \
  -F "files=@my_lease.pdf" \
  -F "collection_name=personal"

# 3. Connect to your agent
openclaw mcp add --transport stdio clawrag npx -y @clawrag/mcp-server
```

## Community & Feedback

Code is MIT licensed. I'd love feedback on the MCP implementation – specifically whether you see better ways to handle tool schemas for multi-collection search.

*Ask me anything about the architecture or how I handled the citation logic!*

---

### Hidden Technical Details

- *Privacy*: Zero external data leaks. Everything stays on your own metal.
- *LLM Agnostic*: Tested with Ollama (Llama 3.2) and Claude 3.5 via API.
- *Context Management*: Explicit context window limiting to prevent GPU crashes on 8GB VRAM cards.
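The context-limiting idea can be sketched as a simple token budget applied to the ranked chunks before prompt assembly. This is my own illustration, not ClawRAG's implementation, and it uses a rough chars/4 token estimate instead of a real tokenizer:

```python
def trim_to_budget(chunks, max_tokens=2048, est_chars_per_token=4):
    """Greedily keep top-ranked chunks until a rough token budget is
    exhausted, so the assembled prompt cannot blow past the context
    window (and, with a local LLM, past 8GB of VRAM).

    Uses a chars-per-token heuristic -- a deliberate simplification;
    a real implementation would count tokens with the model's tokenizer.
    """
    kept, used = [], 0
    for text in chunks:
        cost = len(text) // est_chars_per_token + 1
        if used + cost > max_tokens:
            break  # drop the lowest-ranked chunks first
        kept.append(text)
        used += cost
    return kept
```

Trimming after retrieval (rather than retrieving fewer chunks) keeps recall high while capping the prompt size deterministically.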

rizzo94
2 months ago
[-]
This is a brilliant use of the Model Context Protocol (MCP). Using query_knowledge as a tool rather than a generic REST endpoint is definitely the right move for reducing hallucinations in legal/contractual contexts. The citation preservation over WhatsApp is a particularly nice touch—that's usually where these workflows fall apart.

My only concern with the self-hosted Docker + Docling + ChromaDB stack is the 'maintenance tax.' It’s great for a solo dev, but for a production-grade personal assistant that needs to stay 'always-on' without me babying the container, I've been looking at PAIO (Personal AI Operator).

They seem to be aiming for this exact 'Private RAG' sweet spot but as a managed, one-click service. Their BYOK architecture is what sold me; it keeps the security risk low because it’s using your own keys, but you get that fortress-level privacy that’s hard to replicate in a home-server setup without a lot of manual hardening.

Are you planning to add support for other 'operators' like PAIO, or is the goal to keep ClawRAG strictly as a standalone self-hosted primitive?

reply
2dogsanerd
2 months ago
[-]
Thanks! The Community Edition is intentionally a 'primitive' – single container, zero config. However, I disagree on the 'maintenance tax'. ClawRAG isn't a weekend project; it's the retrieval engine extracted directly from our Enterprise RAG Core V4 system. It keeps the same connection pooling and health checks, so it's built to be 'always-on' without babysitting. The full V4 system adds governance (Solomon Consensus / multi-lane validation), not basic stability. You don't need the Enterprise layer just to keep the lights on.

Re: PAIO – if they implement an MCP Client, ClawRAG can serve them. But I'd argue: if you already run a host, adding a container gives you provable privacy vs. 'trust us' managed services. I prefer owning the keys AND the lock ;-)

reply