StableTfl Similarity #16243

Open
txwei wants to merge 8 commits into apache:main from txwei:stableTfl

@txwei txwei commented Jun 11, 2026

Description

Adds StableTflSimilarity, a new similarity that estimates term rarity from term length and document length instead of corpus-level statistics (document frequency, total document count, average document length). I presented this model at Haystack 2026 in Charlottesville (slides).

It keeps the same overall shape as BM25: a sum over query terms of a term-frequency saturation factor times a per-term rarity weight. The difference is that it swaps IDF for a synthetic, corpus-independent term rarity (TR) weight, computed as the probability that a given term appears in a document at least once, based only on the term length and the document length.
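To make the shape concrete, here is a rough Python sketch of "saturation times rarity, summed over query terms". It is not the code in this PR. The random-text model in `at_least_once_probability`, the 26-letter alphabet, the `-log(p)` mapping from probability to weight, the `tf / (tf + k1)` saturation, and every function name are assumptions picked only for illustration.

```python
import math


def at_least_once_probability(term_len, doc_len, alphabet_size=26):
    """Toy model (assumption, not the PR's formula): chance that one specific
    term of `term_len` characters shows up at least once among `doc_len`
    independent, uniformly random positions."""
    p_single = alphabet_size ** -term_len
    # 1 - (1 - p)^n, written with log1p/expm1 so tiny p does not round to 0.
    return -math.expm1(doc_len * math.log1p(-p_single))


def rarity_weight(term_len, doc_len):
    """Hypothetical mapping from probability to weight: rarer means larger."""
    return -math.log(at_least_once_probability(term_len, doc_len))


def tf_saturation(tf, k1=1.2):
    """BM25-style saturation without length normalization (assumption)."""
    return tf / (tf + k1)


def score(query_terms, doc_term_freqs, doc_len):
    """Sum over query terms of saturation(tf) * rarity(term, doc)."""
    total = 0.0
    for term in query_terms:
        tf = doc_term_freqs.get(term, 0)
        if tf == 0:
            continue  # a term missing from the doc contributes nothing
        total += tf_saturation(tf) * rarity_weight(len(term), doc_len)
    return total


doc = {"lucene": 2, "api": 1}
example_score = score(["lucene", "api"], doc, doc_len=300)
```

Every input to `score` comes from the query and the document itself, so the result does not change when other documents are added, deleted, or merged. Under this toy model, longer terms and shorter documents both yield a larger rarity weight.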

Why?

BM25 scores depend on per-field corpus stats that can differ across search nodes and shift over time as segments merge. When users paginate by relevance score, the drifting scores can produce duplicate or skipped results across pages. Since StableTfl doesn't rely on any of those stats, the score for a given query and document is deterministic. If score and ranking consistency matters more than a few percentage points of recall, this is worth exploring.
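A small sketch of the drift, assuming the common BM25 form `idf * tf / (tf + k1 * (1 - b + b * dl / avgdl))` with `idf = ln(1 + (N - n + 0.5) / (n + 0.5))`. The node statistics below are made-up numbers for illustration, not measurements from any index.

```python
import math


def bm25_idf(doc_count, doc_freq):
    """BM25 inverse document frequency: ln(1 + (N - n + 0.5) / (n + 0.5))."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


def bm25_term_score(tf, doc_len, avg_doc_len, doc_count, doc_freq,
                    k1=1.2, b=0.75):
    """Single-term BM25 score, omitting the constant (k1 + 1) factor."""
    norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    return bm25_idf(doc_count, doc_freq) * tf / (tf + norm)


# The same document (tf=2, length=100) scored against two sets of corpus
# stats, e.g. two nodes, or one node before and after a segment merge.
# All numbers here are hypothetical.
node_a = bm25_term_score(2, 100, avg_doc_len=120.0,
                         doc_count=10_000, doc_freq=50)
node_b = bm25_term_score(2, 100, avg_doc_len=95.0,
                         doc_count=12_000, doc_freq=90)

print(f"node A: {node_a:.3f}  node B: {node_b:.3f}")
```

The document and query are identical in both calls, and only the corpus stats differ, yet the scores do not match. A page boundary that falls between those two values is where a result can be duplicated or skipped.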

Recall and NDCG performance

Benchmark results across 21 BEIR datasets:

| Model | NDCG@10 | Recall@10 |
| --- | --- | --- |
| BM25 | 0.346 | 0.388 |
| StableTfl | 0.315 | 0.350 |
| Gap | 0.031 | 0.038 |

Since StableTfl uses less information than BM25, it scores slightly lower on both NDCG@10 and Recall@10.

@github-actions github-actions Bot added this to the 11.0.0 milestone Jun 11, 2026
@txwei txwei marked this pull request as ready for review June 11, 2026 19:14