Breaking | March 19, 2026

Reason-ModernColBERT scores 87.59% on BrowseComp-Plus

LightOn’s late-interaction retriever, paired with GPT-5, reached 87.59% accuracy on BrowseComp-Plus while using fewer search calls than larger baselines. The result suggests deep-research quality may now hinge more on retrieval architecture than on swapping in ever-larger LLMs.

Search, Reranking, Benchmarks, Deep Research

4 min read

TL;DR

LightOn’s benchmark thread says its 150M-parameter Reason-ModernColBERT, paired with GPT-5, reached 87.59% accuracy on BrowseComp-Plus, a 7.59-point gain over the previous best.
The same thread reports gains in recall and calibration as well: 83.52% recall versus 80.29%, and a calibration error of 7.46 versus 7.92, all with fewer search calls.
Practitioner reaction centered on parameter efficiency: a follow-up thread notes that the strongest prior retriever baseline was Qwen3-8B, about 54 times larger than the 150M ColBERT model.
The result also points to a workflow change: according to the scaffold details, adding a simple get_document(id) fetch tool improved performance over the official scaffold, which exposes only the top-5 snippets. That suggests retrieval quality is driving deep-research performance more than bigger generators alone.
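
To make the snippet-only versus fetch-tool distinction concrete, here is a minimal Python sketch of the two tools an agent might be given. Everything in it is an assumption for illustration: the `ToyCorpus` class, the `Document` type, the `snippet_chars` parameter, and the naive term-overlap ranking are hypothetical stand-ins, not LightOn's scaffold, the BrowseComp-Plus harness, or Reason-ModernColBERT's actual scoring. Only the `get_document(id)` tool name and the top-5 snippet limit come from the thread.

```python
from dataclasses import dataclass


@dataclass
class Document:
    doc_id: str
    text: str


class ToyCorpus:
    """Hypothetical in-memory corpus standing in for a real retriever index."""

    def __init__(self, docs):
        self._docs = {d.doc_id: d for d in docs}

    def search(self, query, k=5, snippet_chars=80):
        """Return up to k (doc_id, snippet) pairs ranked by naive term overlap.

        A real system would rank with a retriever; overlap is a placeholder.
        """
        terms = set(query.lower().split())
        scored = []
        for doc in self._docs.values():
            overlap = len(terms & set(doc.text.lower().split()))
            if overlap:
                scored.append((overlap, doc.doc_id))
        # Highest overlap first; ties broken by doc_id for determinism.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [
            (doc_id, self._docs[doc_id].text[:snippet_chars])
            for _, doc_id in scored[:k]
        ]

    def get_document(self, doc_id):
        """Return the full text for one id, or None if the id is unknown."""
        doc = self._docs.get(doc_id)
        return doc.text if doc else None


# Usage: the agent first sees truncated snippets, then fetches one document in full.
corpus = ToyCorpus([
    Document("d1", "ColBERT scores queries against documents token by token. "
                   + "Details follow. " * 20),
    Document("d2", "Unrelated text about cooking."),
])
hits = corpus.search("colbert token scores", snippet_chars=40)
top_id, top_snippet = hits[0]
full_text = corpus.get_document(top_id)
print(top_id, len(top_snippet), len(full_text))
```

In a snippet-only scaffold the generator would have to answer from `top_snippet` alone. With the fetch tool it can pull `full_text` for any id the search surfaced, which is one plausible reason a strong retriever plus a cheap fetch step could need fewer search calls.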