I have the following code in a search query, and when I adjust the boost value of the 'wildcard' section, it doesn't change any result scores. Our search objective is that if the search phrase is in the title, it is most important. Otherwise, search the body as well. If that still doesn't turn up anything, try fuzzy matches. Prioritize recency with a time decay function: information from the last 30 days is far more valuable than information from 9 months ago.
Any thoughts?
{
"from": 0,
"size": 10,
"query": {
"function_score": {
"query": {
"bool": {
"should": [
{
"wildcard": {
"title": {
"value": "*blackwell*",
"boost": 1000
}
}
},
{
"multi_match": {
"fields": [
"title",
"body"
],
"query": "blackwell",
"type": "phrase",
"boost": 100
}
},
{
"multi_match": {
"fields": [
"title",
"body"
],
"query": "blackwell",
"operator": "or",
"boost": 10
}
},
{
"multi_match": {
"fields": [
"title",
"body"
],
"query": "blackwell",
"fuzziness": "AUTO",
"operator": "and",
"boost": 1
}
}
],
"minimum_should_match": 1
}
},
"functions": [
{
"exp": {
"modifiedAt": {
"offset": "30d",
"scale": "360d",
"decay": 0.01
}
}
}
],
"boost_mode": "multiply"
}
}
}
Here is the output from _explain:
{
"_index": "534e9cac-96c9-47af-9321-67860f3b33ba_record_chunks",
"_id": "836bba2c-8ba6-491e-a1e2-da7e91de78f8_0011",
"matched": true,
"explanation": {
"value": 1254.3658,
"description": "function score, product of:",
"details": [
{
"value": 1254.3658,
"description": "sum of:",
"details": [
{
"value": 1254.3658,
"description": "weight(body:blackwel in 257) [PerFieldSimilarity], result of:",
"details": [
{
"value": 1254.3658,
"description": "score(freq=2.0), computed as boost * idf * tf from:",
"details": [
{
"value": 244.20001,
"description": "boost",
"details": []
},
{
"value": 7.447975,
"description": "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
"details": [
{
"value": 544,
"description": "n, number of documents containing term",
"details": []
},
{
"value": 934570,
"description": "N, total number of documents with field",
"details": []
}
]
},
{
"value": 0.68966836,
"description": "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
"details": [
{
"value": 2,
"description": "freq, occurrences of term within document",
"details": []
},
{
"value": 1.2,
"description": "k1, term saturation parameter",
"details": []
},
{
"value": 0.75,
"description": "b, length normalization parameter",
"details": []
},
{
"value": 56,
"description": "dl, length of field (approximate)",
"details": []
},
{
"value": 84.00777,
"description": "avgdl, average length of field",
"details": []
}
]
}
]
}
]
}
]
},
{
"value": 1,
"description": "min of:",
"details": [
{
"value": 1,
"description": "Function for field modifiedAt:",
"details": [
{
"value": 1,
"description": "exp(- MIN[Math.max(Math.abs(1.731852097E12(=doc value) - 1.732904021506E12(=origin))) - 2.592E9(=offset), 0)] * 1.4805716904539902E-10)",
"details": []
}
]
},
{
"value": 3.4028235e38,
"description": "maxBoost",
"details": []
}
]
}
]
}
}
asked Nov 23, 2024 at 1:47 by JackBurton; edited Nov 30, 2024 at 10:01 by Paulo
1 Answer
Tldr;
It should work, see my demo.
Maybe your wildcard query is never a match, and thus does not participate in the scoring. But I am shooting in the dark without the mapping and sample documents.
Could you perhaps give a minimal reproducible example?
To investigate
Have you thought about using the explain API?
GET 79216980/_explain/<doc _id>
{
"query": {
"wildcard": {
"data.keyword": {
"value": "*pot*",
"boost": 2
}
}
}
}
It can give you an explanation of why a document matched.
{
"_index": "79216980",
"_id": "S1jPY5MB7I4Yfde2NMmi",
"matched": true,
"explanation": {
"value": 2,
"description": "data.keyword:*pot*^2.0",
"details": []
}
}
Analysis of the explain (edit)
From the explain I can tell you that:
- The title field never matches; only the body field does.
- The wildcard query did not match either.
The score is computed solely from the match on the body. This would explain why the score does not change when you adjust the wildcard boost.
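As a sanity check, the numbers in the question's _explain output can be recomputed by hand. The sketch below is plain Python with no Elasticsearch client; every value is copied from the explain output, and the formulas are the ones printed in its description fields. The last step is an assumption on my part: it supposes the reported boost folds in a (k1 + 1) factor. If that holds, the remaining boost is 111, which is what the three multi_match boosts (100 + 10 + 1) add up to, with nothing contributed by the wildcard's 1000.

```python
import math

# Values copied from the _explain output in the question.
boost = 244.20001
n, N = 544, 934570             # docs containing the term / docs with the field
freq, k1, b = 2.0, 1.2, 0.75
dl, avgdl = 56.0, 84.00777     # field length / average field length

# BM25 pieces, using the formulas printed in the explain descriptions.
idf = math.log(1 + (N - n + 0.5) / (n + 0.5))
tf = freq / (freq + k1 * (1 - b + b * dl / avgdl))
body_score = boost * idf * tf

# Decay function from the explain: exp(-max(|doc - origin| - offset, 0) * lambda)
doc_ms, origin_ms = 1.731852097e12, 1.732904021506e12
offset_ms = 2.592e9            # 30d in milliseconds
lam = 1.4805716904539902e-10
decay = math.exp(-max(abs(doc_ms - origin_ms) - offset_ms, 0) * lam)

# boost_mode is "multiply", so the final score is query score * decay.
final_score = body_score * decay

# ASSUMPTION: the reported boost includes a (k1 + 1) factor.
combined_clause_boost = boost / (k1 + 1)

print(idf, tf, body_score, decay, final_score, combined_clause_boost)
```

The document is about 12 days old, which is inside the 30d offset, so the decay factor is exactly 1 and the final score equals the body match score.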
Could it be that title is of type keyword, meaning it is case sensitive?
I would suggest you test your query one clause at a time, to see which ones match.
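One way to do that without editing the query by hand is to split the should array into standalone bodies and send each one to _explain separately. This is only a sketch: split_should_clauses is a hypothetical helper, the query is an abbreviated copy of the one in the question (two of the four clauses), and nothing here talks to a cluster.

```python
import copy
import json


def split_should_clauses(search_body):
    """Return one standalone {"query": clause} body per `should` clause."""
    bool_query = search_body["query"]["function_score"]["query"]["bool"]
    return [{"query": copy.deepcopy(clause)} for clause in bool_query["should"]]


# Abbreviated copy of the query from the question.
search_body = {
    "query": {
        "function_score": {
            "query": {
                "bool": {
                    "should": [
                        {"wildcard": {"title": {"value": "*blackwell*", "boost": 1000}}},
                        {
                            "multi_match": {
                                "fields": ["title", "body"],
                                "query": "blackwell",
                                "type": "phrase",
                                "boost": 100,
                            }
                        },
                    ],
                    "minimum_should_match": 1,
                }
            }
        }
    }
}

bodies = split_should_clauses(search_body)
for body in bodies:
    # Each body can be sent on its own to GET <index>/_explain/<doc _id>.
    print(json.dumps(body))
```

A clause that comes back with "matched": false for a document you expected to hit is the one to look at.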
You could also set case_insensitive: true on the wildcard clause:
{
"wildcard": {
"title": {
"value": "*blackwell*",
"boost": 1000,
"case_insensitive": true
}
}
}
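To make the two suspects concrete, here is a rough illustration in plain Python. It uses fnmatchcase as a stand-in for a term-level wildcard match against a single indexed term; that is not how Lucene implements it, only an approximation of the * semantics. The title value is hypothetical. The stemmed case rests on an assumption: the explain output shows the term body:blackwel, so the body analyzer appears to stem, and if title is a text field analyzed the same way, the pattern *blackwell* cannot match the indexed term whatever the casing.

```python
from fnmatch import fnmatchcase


def wildcard_matches(term, pattern, case_insensitive=False):
    """Rough stand-in for a term-level wildcard match against ONE indexed term."""
    if case_insensitive:
        term, pattern = term.lower(), pattern.lower()
    return fnmatchcase(term, pattern)


pattern = "*blackwell*"

# keyword field: the whole title is a single term with its original casing.
keyword_term = "Blackwell Annual Report"  # hypothetical title
case_sensitive_hit = wildcard_matches(keyword_term, pattern)
case_insensitive_hit = wildcard_matches(keyword_term, pattern, case_insensitive=True)

# Analyzed text field (ASSUMPTION: same stemming as body): the term is "blackwel".
stemmed_term = "blackwel"
stemmed_hit = wildcard_matches(stemmed_term, pattern, case_insensitive=True)

print(case_sensitive_hit, case_insensitive_hit, stemmed_hit)
```

If the title turns out to be a stemmed text field, case_insensitive alone would not help; a shorter pattern or a match on a keyword sub-field would be worth trying.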
Demo
POST _bulk
{"index":{"_index":"79216980"}}
{"data": "I love potatoes"}
{"index":{"_index":"79216980"}}
{"data": "I love potage"}
GET 79216980/_search
{
"query": {
"wildcard": {
"data.keyword": {
"value": "*pot*",
"boost": 2
}
}
}
}
Gives:
{
...
"hits": {
...
"max_score": 2,
"hits": [
{
"_index": "79216980",
"_id": "S1jPY5MB7I4Yfde2NMmi",
"_score": 2,
"_source": {
"data": "I love potatoes"
}
},
{
"_index": "79216980",
"_id": "TFjPY5MB7I4Yfde2NMmi",
"_score": 2,
"_source": {
"data": "I love potage"
}
}
]
}
}