Search results can be relevant and still bad.
If the top five results all say the same thing, the agent gets less context than it appears to have.
Maximal Marginal Relevance (MMR) is a reranking technique designed to address that.
What MMR Does
MMR balances two goals:
- relevance to the query
- diversity from results already selected
The goal is not to return random variety.
The goal is to avoid redundant context.
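As a rough illustration of that balance, here is a minimal sketch of greedy MMR selection. It assumes documents and the query are already embedded as plain lists of floats and uses cosine similarity for both relevance and redundancy. The function names (`cosine`, `mmr_select`) and the default `lam` value are illustrative choices, not part of any particular library. Each step picks the candidate with the highest `lam * relevance - (1 - lam) * redundancy`, where redundancy is the similarity to the closest result already selected.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def mmr_select(query_vec, doc_vecs, k, lam=0.7):
    """Greedy MMR: return indices of up to k documents, in pick order.

    lam near 1.0 favors relevance; lam near 0.0 favors diversity.
    """
    relevance = [cosine(query_vec, d) for d in doc_vecs]
    selected = []
    remaining = list(range(len(doc_vecs)))
    while remaining and len(selected) < k:
        def score(i):
            # Redundancy: similarity to the closest already-selected document.
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance[i] - (1 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy vectors (made up): docs 0 and 1 are near-duplicates, doc 2 is different.
query = [1.0, 0.5]
docs = [[1.0, 0.0], [0.99, 0.05], [0.0, 1.0]]
print(mmr_select(query, docs, k=2, lam=1.0))  # relevance only
print(mmr_select(query, docs, k=2, lam=0.5))  # relevance and diversity
```

With `lam=1.0` the sketch reduces to plain relevance ranking, so both near-duplicates are returned. Lowering `lam` penalizes the second near-duplicate and lets the distinct document in.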
Why Agents Need Diversity
AI agents often need to compare sources:
- official docs
- release notes
- GitHub issues
- community reports
- examples
Five near-duplicate snippets from the same source family can crowd out the one result that documents an important edge case.
MMR helps the retrieval layer return broader evidence.
Example
For a query like:
FastMCP streamable HTTP auth error
A pure relevance ranking might return five similar docs pages.
An MMR-aware result set might include:
- official auth docs
- transport docs
- relevant GitHub issue
- migration note
- package reference
That is more useful for debugging.
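The difference can be sketched with toy data. Everything in the block below is hypothetical: the titles, the source-family labels, and the relevance scores are invented for illustration, and `family_similarity` is a crude stand-in for real embedding similarity (it treats two results as redundant only when they share a source family).

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    title: str        # invented title, for illustration only
    family: str       # hypothetical source-family label
    relevance: float  # invented relevance score in [0, 1]

def family_similarity(a, b):
    # Crude proxy for similarity: same source family counts as fully redundant.
    return 1.0 if a.family == b.family else 0.0

def rerank(candidates, k, lam=0.6):
    """Greedy MMR-style reranking over the toy candidates."""
    selected = []
    pool = list(candidates)
    while pool and len(selected) < k:
        def score(c):
            redundancy = max(
                (family_similarity(c, s) for s in selected), default=0.0
            )
            return lam * c.relevance - (1 - lam) * redundancy
        best = max(pool, key=score)
        selected.append(best)
        pool.remove(best)
    return selected

candidates = [
    Candidate("Auth overview", "auth docs", 0.95),
    Candidate("Auth quickstart", "auth docs", 0.93),
    Candidate("Auth token reference", "auth docs", 0.91),
    Candidate("Streamable HTTP transport", "transport docs", 0.80),
    Candidate("Issue: auth error over streamable HTTP", "github issue", 0.78),
    Candidate("Migration note", "migration note", 0.70),
]

by_relevance = [c.family for c in rerank(candidates, k=4, lam=1.0)]
with_mmr = [c.family for c in rerank(candidates, k=4, lam=0.6)]
print(by_relevance)
print(with_mmr)
```

In this toy setup, relevance-only ranking spends three of its four slots on auth docs, while the MMR-style pass returns one result from each of four source families.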
The Agent Loop
The Practical Takeaway
Agents do not need more results.
They need better coverage.
MMR helps keep duplicates out of the context window, which gives the agent more distinct evidence to reason over and wastes fewer tokens.
Sources
- Original MMR paper: Carbonell and Goldstein, "The Use of MMR, Diversity-Based Reranking for Reordering Documents and Producing Summaries" (SIGIR 1998)
- Ninelayer blog: How to Reduce AI Agent Token Usage
