Harbor · A RAG search across 12 years of legal filings
We built a private retrieval system across 1.8 million legal documents with citation-grade answers, deployed inside Harbor's own tenant. It reaches 92% precision on the firm's internal eval set.
01/ The challenge
Harbor's associates were spending 11 hours a week on prior-art research, often missing relevant cases. The firm had tried two off-the-shelf legal AI tools and rejected both for hallucinations.
02/ The approach
We built a hybrid retrieval pipeline (BM25 plus dense vector search) over their full corpus, used an LLM with a custom prompt to re-rank the candidates, and wired citations through to source snippets. Every answer is grounded in an actual document: if there is no source, there is no answer.
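To make the shape of that pipeline concrete, here is a minimal sketch of two of its pieces: merging a BM25 ranking with a dense ranking, and refusing to return an answer whose citations do not resolve to a source snippet. It is an illustration under assumptions, not the production code. Reciprocal rank fusion is one common way to combine the two result lists and is assumed here; the function names, document IDs, and snippet text are hypothetical, and the LLM re-ranking step is left out.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Assumption: RRF stands in for whatever fusion the real pipeline uses.
    Each document scores 1 / (k + rank) per list it appears in.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


def grounded_answer(answer_text, cited_ids, snippets):
    """Return the answer with its source snippets, or None if it is ungrounded.

    An answer is rejected when it cites nothing, or when any cited ID
    has no snippet in the retrieved set.
    """
    if not cited_ids or any(doc_id not in snippets for doc_id in cited_ids):
        return None
    return {
        "answer": answer_text,
        "citations": [(doc_id, snippets[doc_id]) for doc_id in cited_ids],
    }


# Hypothetical result lists from the two retrievers.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
dense_hits = ["doc_b", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion([bm25_hits, dense_hits])

# Hypothetical snippet store keyed by document ID.
snippets = {"doc_b": "Excerpt from filing B.", "doc_a": "Excerpt from filing A."}
ok = grounded_answer("Summary of the holding.", ["doc_b"], snippets)
rejected = grounded_answer("Unsupported claim.", ["doc_z"], snippets)
```

In this sketch, documents that both retrievers rank highly rise to the top of the fused list, and `rejected` is `None` because `doc_z` has no snippet behind it, which is the "no source, no answer" rule expressed as a hard check rather than a prompt instruction.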
03/ The outcome
- Precision (firm eval): 92%
- Research time: -40%
- Documents indexed: 1.8M
Associates report 40% time savings on prior-art tasks. Internal eval shows 92% precision on the firm's standard test set, vs ~70% from off-the-shelf options. Zero hallucinated citations in production.
“They understood that 'no hallucinated citations' wasn't a nice-to-have for us, it was the whole product. The eval rigor showed it.”