With @james_blackwoodsewell_58, I compared BM25 text search scores across MongoDB Atlas (Lucene), Elasticsearch (Lucene), and ParadeDB (Tantivy). All three return the same ordering, but MongoDB Atlas consistently shows a score that is lower by a factor of 2.2.
This was an opportunity to look at the score details, which expose the full calculation behind each score.
Test case
I built the same test case as in my previous blog post:
db.articles.drop();
db.articles.deleteMany({});
db.articles.insertMany([
{ description : "π π π" }, // short, 1 π
{ description : "π π π" }, // short, 1 π
{ description : "π π π π" }, // larger, 2 π
{ description : "π π π π π" }, // larger, 1 π
{ description : "π π π π΄ π« π π π°" }, // large, 1 π
{ description : "π π π π π π" }, // large, 6 π
{ description : "π π" }, // very short, 1 π
{ description : "π π π΄ π« π π π° π" }, // large, 1 π
{ description : "π π π π π" }, // shorter, 2 π
]);
db.articles.createSearchIndex("default",
{ mappings: { dynamic: true } }
);
Score with details
I ran the same query, adding scoreDetails: true to the $search stage and scoreDetails: { $meta: "searchScoreDetails" } to the $project stage:
db.articles.aggregate([
{
$search: {
text: { query: ["π", "π"], path: "description" },
index: "default",
scoreDetails: true
}
},
{ $project: {
_id: 0, description: 1,
score: { $meta: "searchScore" },
scoreDetails: { $meta: "searchScoreDetails" } } },
{ $sort: { score: -1 } } ,
{ $limit: 1 }
])
Here is the result:
mdb> db.articles.aggregate([
... {
... $search: {
... text: { query: ["π", "π"], path: "description" },
... index: "default",
... scoreDetails: true
... }
... },
... { $project: { _id: 0, description: 1, score: { $meta: "searchScore" }, scoreDetails: { $meta: "searchScoreDetails" } } },
... { $sort: { score: -1 } } ,
... { $limit: 1 }
... ])
[
{
description: 'π π π',
score: 1.0242118835449219,
scoreDetails: {
value: 1.0242118835449219,
description: 'sum of:',
details: [
{
value: 1.0242118835449219,
description: '$type:string/description:π [BM25Similarity], result of:',
details: [
{
value: 1.0242118835449219,
description: 'score(freq=1.0), computed as boost * idf * tf from:',
details: [
{
value: 1.8971199989318848,
description: 'idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:',
details: [
{
value: 1,
description: 'n, number of documents containing term',
details: []
},
{
value: 9,
description: 'N, total number of documents with field',
details: []
}
]
},
{
value: 0.5398772954940796,
description: 'tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:',
details: [
{
value: 1,
description: 'freq, occurrences of term within document',
details: []
},
{
value: 1.2000000476837158,
description: 'k1, term saturation parameter',
details: []
},
{
value: 0.75,
description: 'b, length normalization parameter',
details: []
},
{
value: 3,
description: 'dl, length of field',
details: []
},
{
value: 4.888888835906982,
description: 'avgdl, average length of field',
details: []
}
]
}
]
}
]
}
]
}
}
]
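The nested output can be easier to scan once it is flattened into one indented line per node. The following is a minimal sketch in plain JavaScript: the helper name formatScoreDetails is hypothetical (it is not a mongosh or Atlas Search function), and it only assumes the value / description / details shape visible in the output above.

```javascript
// Hypothetical helper, not a mongosh or Atlas Search API:
// renders each node of a scoreDetails tree as one indented line.
function formatScoreDetails(node, depth = 0) {
  const line = `${"  ".repeat(depth)}${node.value} = ${node.description}`;
  // Recurse into child nodes, indenting them one level deeper.
  const children = (node.details || []).map((d) => formatScoreDetails(d, depth + 1));
  return [line, ...children].join("\n");
}

// Hand-built sample with the same shape as the idf node in the output above.
const sample = {
  value: 1.8971199989318848,
  description: "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
  details: [
    { value: 1, description: "n, number of documents containing term", details: [] },
    { value: 9, description: "N, total number of documents with field", details: [] }
  ]
};

console.log(formatScoreDetails(sample));
```

In mongosh, the same helper could be applied to the scoreDetails field of each document returned by the aggregation.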
Everything needed is there. Here is the scoring breakdown for "π π π", which produced a score of 1.0242118835449219.
IDF calculation (inverse document frequency)
Search result:
- Number of documents containing the term: n = 1
- Total number of documents with this field: N = 9
idf = log(1 + (N - n + 0.5) / (n + 0.5))
    = log(1 + (9 - 1 + 0.5) / (1 + 0.5))
    = log(6.666666666666667)
    ≈ 1.8971199989318848
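The idf arithmetic can be checked in a few lines of JavaScript. This is a minimal sketch: the function name bm25Idf is hypothetical (not a MongoDB or Lucene API), and it computes in double precision, so the last digits can differ slightly from the value reported by scoreDetails.

```javascript
// Hypothetical helper, not a MongoDB or Lucene API: BM25 inverse document frequency.
// n = number of documents containing the term, N = number of documents with the field.
function bm25Idf(n, N) {
  return Math.log(1 + (N - n + 0.5) / (n + 0.5)); // Math.log is the natural logarithm
}

console.log(bm25Idf(1, 9)); // close to the 1.8971199989318848 reported by scoreDetails
console.log(bm25Idf(9, 9)); // a term present in every document gets a much smaller idf
```

The second call illustrates why rare terms dominate the score: the more documents contain a term, the less it contributes.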
TF calculation (term frequency)
Parameters are the Lucene defaults:
- Term saturation parameter: k1 = 1.2000000476837158
- Length normalization parameter: b = 0.75
Document field statistics:
- Length of the field in this document: dl = 3
- Average length of the field: avgdl = 44 / 9 ≈ 4.888888835906982
- Occurrences of the term in this document: freq = 1
tf = freq / (freq + k1 * (1 - b + b * dl / avgdl))
   = 1 / (1 + 1.2000000476837158 × (0.25 + 0.75 × (3 / 4.888888835906982)))
   ≈ 0.5398772954940796
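The tf step can be reproduced the same way. This is a sketch under stated assumptions: bm25Tf is a hypothetical name, not a MongoDB or Lucene API, and the use of Math.fround assumes that the odd-looking 1.2000000476837158 is simply 1.2 stored as a 32-bit float.

```javascript
// Hypothetical helper, not a MongoDB or Lucene API: BM25 term-frequency component
// in the form shown by scoreDetails (no (k1 + 1) factor in the numerator).
function bm25Tf(freq, dl, avgdl, k1 = Math.fround(1.2), b = 0.75) {
  return freq / (freq + k1 * (1 - b + b * dl / avgdl));
}

// Math.fround rounds to single precision (assumption: this is where the
// long decimal for k1 in the scoreDetails output comes from).
console.log(Math.fround(1.2));

const avgdl = 44 / 9; // 44 terms across the 9 documents of the test case
console.log(bm25Tf(1, 3, avgdl)); // close to 0.5398772954940796
console.log(bm25Tf(1, 8, avgdl)); // same freq in a longer field gives a lower tf
```

The last call shows the length normalization at work: one occurrence in an 8-term field counts for less than one occurrence in a 3-term field.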
Final score
Parameter:
- Boost: 1.0
score = boost × idf × tf
      = 1.0 × 1.8971199989318848 × 0.5398772954940796
      ≈ 1.0242118835449219
This confirms that Atlas Search uses the same scoring formula as Lucene: https://github.com/apache/lucene/blob/releases/lucene/10.3.2/lucene/core/src/java/org/apache/lucene/search/similarities/BM25Similarity.java#L183
What about Elasticsearch and Tantivy?
Eight years ago, Lucene removed the constant (k1 + 1) factor from the numerator of the BM25 term-frequency formula in LUCENE-8563. For k1 = 1.2, this change divides every score by 2.2 from that version onward, without changing the ranking. Tantivy and Elasticsearch apparently still use the old formula, while Atlas Search uses the updated one, which explains the observed difference in scores.
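To see where the factor of 2.2 comes from, both variants can be computed side by side for the top document of the test case. This is an illustrative sketch only: tfCurrent and tfLegacy are hypothetical names, not APIs of Lucene, Tantivy, or Elasticsearch, and the computation uses double precision, so the last digits differ from the engines' output.

```javascript
// Hypothetical helpers for illustration; none of these are engine APIs.
const K1 = 1.2;
const B = 0.75;

// Length-normalized denominator shared by both variants.
function denominator(freq, dl, avgdl) {
  return freq + K1 * (1 - B + B * dl / avgdl);
}

// Current Lucene form, as shown in the scoreDetails output.
function tfCurrent(freq, dl, avgdl) {
  return freq / denominator(freq, dl, avgdl);
}

// Older form with the constant (k1 + 1) factor in the numerator.
function tfLegacy(freq, dl, avgdl) {
  return (freq * (K1 + 1)) / denominator(freq, dl, avgdl);
}

// Statistics of the top document: n = 1, N = 9, freq = 1, dl = 3, avgdl = 44 / 9.
const idf = Math.log(1 + (9 - 1 + 0.5) / (1 + 0.5));
const avgdl = 44 / 9;

const current = idf * tfCurrent(1, 3, avgdl);
const legacy = idf * tfLegacy(1, 3, avgdl);

console.log(current);          // close to the Atlas Search score of 1.0242118835449219
console.log(legacy);           // the same document scored with the older formula
console.log(legacy / current); // k1 + 1, about 2.2, whatever the document
```

Because the ratio is the constant k1 + 1 for every document, the two formulas always produce the same ordering.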
Conclusion
MongoDB Atlas Search indexes are built on Lucene and use its parameters and scoring formulas. When you compare Atlas Search with other text search engines that use the older Lucene scoring formula, you may see scores that differ by a factor of roughly 2.2. However, this has no practical impact because scores are only used to order results, so the relative ranking of documents remains the same.
Text search scores can seem magical, but they are deterministic and based on open-source formulas. In MongoDB, you can include the score details option in a text search query to inspect all the parameters and formulas behind the score.