Blog·June 23, 2026

Maximal Marginal Relevance (MMR) for AI Agents: Getting Diverse, Non-Redundant Results

retrieval · AI agents · search architecture · MMR

Search results can be relevant and still bad.

If the top five results all say the same thing, the agent gets less context than it appears to have.

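Maximal Marginal Relevance, or MMR, addresses that by reranking results so each new pick is relevant to the query but not a repeat of what is already selected.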

What MMR Does

MMR balances two goals:

  • relevance to the query
  • diversity from results already selected

The goal is not to return random variety.

The goal is to avoid redundant context.
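
For readers who want the exact criterion, the original MMR paper listed under Sources states it roughly as follows. The notation here is the common presentation and may differ typographically from the paper: Q is the query, R is the candidate set returned by retrieval, S is the set of results already selected, Sim_1 and Sim_2 are similarity functions (they can be the same one), and λ is a tradeoff weight between 0 and 1.

```latex
\mathrm{MMR} = \arg\max_{D_i \in R \setminus S}
  \left[ \lambda \, \mathrm{Sim}_1(D_i, Q)
  - (1 - \lambda) \max_{D_j \in S} \mathrm{Sim}_2(D_i, D_j) \right]
```

With λ = 1 this reduces to plain relevance ranking. Lower values penalize a candidate more heavily for resembling something already picked.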

Why Agents Need Diversity

AI agents often need to compare sources:

  • official docs
  • release notes
  • GitHub issues
  • community reports
  • examples

Five near-duplicate snippets from the same source family can hide important edge cases.

MMR can help the retrieval layer include broader evidence.

Example

For a query like:

FastMCP streamable HTTP auth error

A pure relevance ranking might return five similar docs pages.

An MMR-aware result set might include:

  • official auth docs
  • transport docs
  • relevant GitHub issue
  • migration note
  • package reference

That is more useful for debugging.
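
To make that concrete, here is a minimal sketch of greedy MMR selection in Python. It is an illustration under stated assumptions, not a reference implementation: the function name `mmr_select`, the labels, and every relevance and similarity number are made up for this example. In a real system the scores would come from your retriever and an embedding similarity. Each step picks the candidate with the highest `lam * relevance - (1 - lam) * redundancy`, where redundancy is the similarity to the closest result already selected.

```python
from typing import List, Sequence


def mmr_select(
    relevance: Sequence[float],
    similarity: Sequence[Sequence[float]],
    k: int,
    lam: float = 0.5,
) -> List[int]:
    """Greedily pick up to k candidate indices by MMR score.

    relevance[i]     -- query-to-candidate score (higher is better)
    similarity[i][j] -- candidate-to-candidate similarity
    lam              -- 1.0 means pure relevance; lower values favor diversity
    """
    remaining = list(range(len(relevance)))
    selected: List[int] = []

    def score(i: int) -> float:
        # Redundancy is the similarity to the closest already-selected result.
        redundancy = max((similarity[i][j] for j in selected), default=0.0)
        return lam * relevance[i] - (1.0 - lam) * redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)  # ties go to the earlier candidate
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical candidates for the query above. All numbers are invented.
labels = [
    "auth docs page",
    "auth docs page (mirror)",
    "auth docs FAQ",
    "GitHub issue",
    "migration note",
]
relevance = [0.92, 0.91, 0.90, 0.80, 0.75]
similarity = [
    [1.00, 0.95, 0.90, 0.40, 0.45],
    [0.95, 1.00, 0.90, 0.40, 0.45],
    [0.90, 0.90, 1.00, 0.40, 0.45],
    [0.40, 0.40, 0.40, 1.00, 0.30],
    [0.45, 0.45, 0.45, 0.30, 1.00],
]

by_relevance = mmr_select(relevance, similarity, k=3, lam=1.0)
by_mmr = mmr_select(relevance, similarity, k=3, lam=0.5)

print([labels[i] for i in by_relevance])
print([labels[i] for i in by_mmr])
```

With these toy numbers, `lam=1.0` selects the three near-duplicate docs entries, while `lam=0.5` keeps the top docs page and then adds the GitHub issue and the migration note. The right value of `lam` depends on the corpus, so treat 0.5 as a starting point to tune rather than a recommendation.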

The Agent Loop

The Practical Takeaway

Agents do not need more results.

They need better coverage.

MMR helps keep duplicates from filling the context window, which gives the agent more distinct evidence to reason over and wastes fewer tokens.

Sources

  1. Original MMR paper: Carbonell and Goldstein (1998), The Use of MMR, Diversity-Based Reranking for Reordering Documents and Producing Summaries
  2. Ninelayer blog: How to Reduce AI Agent Token Usage