I have an index named books in which reviews is an object field that can hold an array of reviews.
When retrieving data, in one particular case I only want the review with the maximum rating.
"books" :{
"reviews": {
"properties": {
"rating": {
"type": "float"
},
"comments": {
"type": "string"
}
}
},
"author" : {
"type" : "string"
}
}
A book can have many reviews, each with its own rating. For this use case I want each result to contain only the review with the maximum rating, and I need a search query that produces that kind of result.
POST books/_search
{
"size": 51,
"sort": [
{
"reviews.rating": {
"order": "asc",
"mode" : "min"
}
}
],
"fields": [
"reviews","author"]
}
With script_fields one can build dynamic fields but not objects; otherwise I could have built a dynamic reviews object with one field for the rating and another for the comment.
script_fields can be used to build both dynamic fields and objects:
curl -XDELETE localhost:9200/test-idx
curl -XPUT localhost:9200/test-idx -d '{
"mappings": {
"books" :{
"reviews": {
"properties": {
"rating": {
"type": "float"
},
"comments": {
"type": "string"
}
}
},
"author" : {
"type" : "string"
}
}
}
}'
curl -XPOST "localhost:9200/test-idx/books?refresh=true" -d '{
"reviews": [{
"rating": 5.5,
"comments": "So-so"
}, {
"rating": 9.8,
"comments": "Awesome"
}, {
"rating": 1.2,
"comments": "Awful"
}],
"author": "Roversial, Cont"
}'
curl "localhost:9200/test-idx/books/_search?pretty" -d '{
"fields": ["author"],
"script_fields": {
"highest_review": {
"script": "max_rating = 0.0; max_review = null; for(review : _source[\"reviews\"]) { if (review.rating > max_rating) { max_review = review; max_rating = review.rating;}} max_review"
}
}
}'
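The selection logic in that script can be checked outside Elasticsearch. The following is a minimal Python sketch, assuming the document source is available as a plain dict; the function name highest_review is only illustrative and mirrors the script field name used above.

```python
def highest_review(source):
    """Return the review dict with the largest rating, or None if there are no reviews."""
    reviews = source.get("reviews") or []
    # A single object may appear where an array is expected, so normalise to a list.
    if isinstance(reviews, dict):
        reviews = [reviews]
    if not reviews:
        return None
    return max(reviews, key=lambda review: review["rating"])


doc = {
    "reviews": [
        {"rating": 5.5, "comments": "So-so"},
        {"rating": 9.8, "comments": "Awesome"},
        {"rating": 1.2, "comments": "Awful"},
    ],
    "author": "Roversial, Cont",
}
best = highest_review(doc)
print(best)
```

Unlike the script, this version does not start the comparison from 0.0, so it also returns a review when every rating is zero or negative.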
I have a products catalogue where every product is indexed as follows (queried from http://localhost:9200/products/_doc/1) as sample:
{
"_index": "products_20201202145032789",
"_type": "_doc",
"_id": "1",
"_version": 1,
"_seq_no": 0,
"_primary_term": 1,
"found": true,
"_source": {
"title": "Roncato Eglo",
"description": "Amazing LED light made of wood and description continues.",
"price": 3990,
"manufacturer": "Eglo",
"category": [
"Lights",
"Indoor lights"
],
"options": [
{
"title": "Mount type",
"value": "E27"
},
{
"title": "Number of bulbs",
"value": "4"
},
{
"title": "Batteries included",
"value": "true"
},
{
"title": "Light temperature",
"value": "warm"
},
{
"title": "Material",
"value": "wood"
},
{
"title": "Voltage",
"value": "230"
}
]
}
}
Every option has a different set of values, so there are many Mount type values, Light temperature values, Material values, and so on.
How can I create an aggregation (filter) where I can let customers choose between various Mount Type options:
[ ] E27
[X] E14
[X] GU10
...
Or let them choose from different Material options displayed as checkboxes:
[X] Wood
[ ] Metal
[ ] Glass
...
I can handle the frontend once the buckets are created. Creating the separate buckets for these options is what I am struggling with.
I have successfully created, displayed, and used aggregations for Category, Manufacturer and other basic fields. These product options are stored in has_many_through relationships in the database. I am using Rails with the searchkick gem, which allows me to send raw queries to Elasticsearch.
The prerequisite for such an aggregation is that the options field is mapped as nested.
Sample index mapping:
PUT test
{
"mappings": {
"properties": {
"title": {
"type": "keyword"
},
"options": {
"type": "nested",
"properties": {
"title": {
"type": "keyword"
},
"value": {
"type": "keyword"
}
}
}
}
}
}
Sample docs:
PUT test/_doc/1
{
"title": "Roncato Eglo",
"options": [
{
"title": "Mount type",
"value": "E27"
},
{
"title": "Material",
"value": "wood"
}
]
}
PUT test/_doc/2
{
"title": "Eglo",
"options": [
{
"title": "Mount type",
"value": "E27"
},
{
"title": "Material",
"value": "metal"
}
]
}
Assumption: within a given document, a title appears only once under options. For example, there can be only one nested document under options with the title Material.
Query for aggregation:
GET test/_search
{
"size": 0,
"aggs": {
"OPTION": {
"nested": {
"path": "options"
},
"aggs": {
"TITLE": {
"terms": {
"field": "options.title",
"size": 10
},
"aggs": {
"VALUES": {
"terms": {
"field": "options.value",
"size": 10
}
}
}
}
}
}
}
}
Response:
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"OPTION" : {
"doc_count" : 4,
"TITLE" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "Material",
"doc_count" : 2,
"VALUES" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "metal",
"doc_count" : 1
},
{
"key" : "wood",
"doc_count" : 1
}
]
}
},
{
"key" : "Mount type",
"doc_count" : 2,
"VALUES" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "E27",
"doc_count" : 2
}
]
}
}
]
}
}
}
}
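To get from that response to the checkbox groups, the nested buckets can be flattened into one list of values per option title. This is a minimal Python sketch, assuming the response has already been parsed into a dict and uses the aggregation names OPTION, TITLE and VALUES from the query above; the function name facets_from_response is hypothetical.

```python
def facets_from_response(response):
    """Map each option title to a list of (value, doc_count) pairs."""
    title_buckets = response["aggregations"]["OPTION"]["TITLE"]["buckets"]
    facets = {}
    for title_bucket in title_buckets:
        facets[title_bucket["key"]] = [
            (value_bucket["key"], value_bucket["doc_count"])
            for value_bucket in title_bucket["VALUES"]["buckets"]
        ]
    return facets


# Trimmed copy of the response shown above (only the keys the function reads).
response = {
    "aggregations": {
        "OPTION": {
            "doc_count": 4,
            "TITLE": {
                "buckets": [
                    {"key": "Material", "doc_count": 2,
                     "VALUES": {"buckets": [
                         {"key": "metal", "doc_count": 1},
                         {"key": "wood", "doc_count": 1}]}},
                    {"key": "Mount type", "doc_count": 2,
                     "VALUES": {"buckets": [
                         {"key": "E27", "doc_count": 2}]}},
                ]
            },
        }
    }
}
facets = facets_from_response(response)
for title, values in facets.items():
    print(title, values)
```

Each key of the resulting dict becomes one checkbox group, and each (value, doc_count) pair becomes one checkbox with its count.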
I have an Elasticsearch index and am using the following query:
{
"_source": [
"title",
"content"
],
"size": 15,
"from": 0,
"query": {
"bool": {
"must": {
"multi_match": {
"query": "{{query}}",
"fields": [
"title",
"content"
],
"operator": "or"
}
},
"should": [
{
"multi_match": {
"query": "{{query}}",
"fields": [
"title.standard^16",
"content.standard^2"
],
"operator": "and"
}
},
{
"match_phrase": {
"content.standard": {
"query": "{{query}}",
"_name": "Phrase on title",
"boost": 1000
}
}
}
]
}
},
"highlight": {
"fields": {
"content": {}
},
"fragment_size": 100
}
}
Here is the mapping I set:
{
"settings": {
"index": {
"analysis": {
"analyzer": {
"my_analyzer": {
"tokenizer": "standard",
"filter": [
"lowercase",
"my_metaphone"
]
}
},
"filter": {
"my_metaphone": {
"type": "phonetic",
"encoder": "metaphone",
"replace": true
}
}
}
}
},
"mappings": {
"properties": {
"title": {
"type": "text",
"term_vector": "with_positions_offsets",
"analyzer": "my_analyzer",
"fields": {
"standard": {
"type": "text"
},
"stemmer": {
"type": "text",
"analyzer": "english"
}
}
},
"content": {
"type": "text",
"term_vector": "with_positions_offsets",
"analyzer": "my_analyzer",
"fields": {
"standard": {
"type": "text"
},
"stemmer": {
"type": "text",
"analyzer": "english"
}
}
}
}
}
}
Here is my logic with the query:
1) It will give the highest precedence to a phrase if it appears.
2) If not it will use the standard analyzer (that is the text, as is) and give it the highest precedence.
3) If all else doesn't match up, it will use the phonetic analyzer to get the results, that is the least precedence.
But obviously there is a fault in this, as it seems to give higher precedence to the phonetic analyzer than to the standard analyzer or the phrase match. For example, if I search for "Person of Indian Origin", the top results highlight "Pursuant" and "pursuing", and very few results contain "person of Indian origin", although I know a large number of them exist. How do I solve this?
I am very new to Elasticsearch. Using ES 5.1.1, I am trying to create the following simple index:
curl -XPOST "http://localhost:9200/user" -d'
{
"mappings": {
"post": {
"properties": {
"first_name": {
"type": "string"
},
"last_name": {
"type": "string"
},
"birth_date": {
"type": "date"
}
}
}
}
}'
But I am getting an HTTP 400 with the following error message:
No handler found for uri [/user] and method [POST]
Since ES 5, you must use PUT instead of POST when creating new indices:
curl -XPUT "http://localhost:9200/user" -d'
{
"mappings": {
"post": {
"properties": {
"first_name": {
"type": "string"
},
"last_name": {
"type": "string"
},
"birth_date": {
"type": "date"
}
}
}
}
}'
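The same request can be prepared from code. Below is a minimal Python sketch using only the standard library; it builds the request without sending it, so no running node is needed. The helper name build_create_index_request is hypothetical, and the Content-Type header is an assumption added here (the curl command above does not set it explicitly).

```python
import json
import urllib.request


def build_create_index_request(base_url, index, body):
    """Build (but do not send) the HTTP request that creates an index."""
    return urllib.request.Request(
        url=f"{base_url}/{index}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",  # index creation is a PUT on the index name, not a POST
    )


mappings = {
    "mappings": {
        "post": {
            "properties": {
                "first_name": {"type": "string"},
                "last_name": {"type": "string"},
                "birth_date": {"type": "date"},
            }
        }
    }
}
request = build_create_index_request("http://localhost:9200", "user", mappings)
print(request.get_method(), request.full_url)
# To actually send it against a running node: urllib.request.urlopen(request)
```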
I am a novice with Elasticsearch, and while writing a script_score I am getting a parse exception saying 'expected field name but got [START_ARRAY]'.
Here is the mapping:
PUT toadb
{
"mappings":{
"keywords":{
"properties":{
"Name":{"type":"string","analyzer": "simple"},
"Type":{"type":"string","index": "not_analyzed"},
"Id":{"type":"string","index": "not_analyzed"},
"Boosting Field":{"type" : "integer", "store" : "yes"}
}
},
"businesses":{
"properties": {
"Name":{"type":"string","analyzer": "simple"},
"Type":{"type":"string","index": "not_analyzed"},
"Id":{"type":"string","index": "not_analyzed"},
"Business_seq":{"type":"string","index": "not_analyzed"},
"Status":{"type":"string","index": "not_analyzed"},
"System_rating":{"type" : "integer", "store" : "yes"},
"System_rating_weight":{"type" : "integer", "store" : "yes"},
"Position":{ "type":"geo_point","lat_lon": true},
"Display Pic":{"type": "string","index": "not_analyzed"},
"Boosting Field":{"type" : "integer", "store" : "yes"}
}
}
}
}
Here is the query I am trying to execute:
GET /toadb/_search
{
"query":{
"function_score" : {
"query" : {
"multi_match" : {
"query": "Restaurant",
"fields": [ "Name"],"fuzziness":1
}},
"script_score":
{
"script":"if(doc['Status'] && doc['Status']=='A'){ _score+ (doc['Boosting Field'].value);}"
}
},
"size":10
}
}
Please provide sample examples, if any (I have already referred to the Elasticsearch documentation).
It looks like you have mistakenly placed the size option in your query. In your example, you have added it as a field next to the function_score query. Instead, it belongs as a sibling to the root query object.
Try this:
GET /toadb/_search
{
"query": {
"function_score": {
"query": {
"multi_match": {
"query": "Restaurant",
"fields": [
"Name"
],
"fuzziness": 1
}
},
"script_score": {
"script": "if(doc['Status'] && doc['Status']=='A'){ _score+ (doc['Boosting Field'].value);}"
}
}
},
"size": 10
}
Have a look at the documentation for the request body search.
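If the request body is assembled in code, one way to avoid this kind of misplacement is to keep the query clause and the paging options in separate arguments. This is a small Python sketch of that idea; build_search_body is a hypothetical helper, not part of any client library, and the script is copied unchanged from the query above.

```python
def build_search_body(query, size=10):
    """Wrap a query clause in a request body, keeping size at the top level."""
    return {"query": query, "size": size}


function_score = {
    "function_score": {
        "query": {
            "multi_match": {
                "query": "Restaurant",
                "fields": ["Name"],
                "fuzziness": 1,
            }
        },
        "script_score": {
            "script": "if(doc['Status'] && doc['Status']=='A'){ _score+ (doc['Boosting Field'].value);}"
        },
    }
}
body = build_search_body(function_score, size=10)
print(sorted(body))
```

Because size is never written into the query dict, it always ends up as a sibling of query in the final body.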
I have a scenario in which there are two multi_match searches within the same query. The trouble is that when I build the JSON for it in Ruby, a JSON object with non-unique keys doesn't seem possible, so only one of them appears.
Here is my query:
{
"fields": ["id", "title",
"address.city", "address.state", "address.country", "address.state_code", "address.country_code", "proxy_titles", "location"],
"size":2,
"query":{
"filtered":{
"filter": {
"range": {
"custom_score": {
"gte": 100
}
}
},
"query":{
"bool": {
"must": {
"multi_match":{
"query": "term 1",
"type": "cross_fields",
"fields": ["title^2", "proxy_titles^2","description"]
}
},
"must": {
"multi_match": {
"query": "us",
"fields": ["address.city", "address.country", "address.state",
"address.zone", "address.country_code", "address.state_code", "address.zone_code"]
}
}
}
}
}
},
"sort": {
"_score": { "order": "desc" },
"variation": {"order": "asc"},
"updated_at": { "order": "desc" }
}
}
I have only recently started using Elasticsearch, so it would be very helpful if you could also suggest a better query to accomplish the same thing.
You have the syntax wrong. For multiple "must" values in a "bool", they need to be in an array. The documentation is not always terribly helpful, unfortunately (the bool query page shows this for "should" but not "must").
Try this:
{
"fields": ["id","title","address.city","address.state","address.country","address.state_code","address.country_code","proxy_titles","location"],
"size": 2,
"query": {
"filtered": {
"filter": {
"range": {
"custom_score": {
"gte": 100
}
}
},
"query": {
"bool": {
"must": [
{
"multi_match": {
"query": "term 1",
"type": "cross_fields",
"fields": ["title^2","proxy_titles^2","description"]
}
},
{
"multi_match": {
"query": "us",
"fields": ["address.city","address.country","address.state","address.zone","address.country_code","address.state_code","address.zone_code"]
}
}
]
}
}
}
},
"sort": {
"_score": {
"order": "desc"
},
"variation": {
"order": "asc"
},
"updated_at": {
"order": "desc"
}
}
}
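The duplicate-key problem from the question and the array form can both be seen in a few lines of code. This is a Python sketch rather than Ruby, on the assumption that the same reasoning carries over to a Ruby hash, which also holds one value per key; bool_must is a hypothetical helper name.

```python
import json


def bool_must(*clauses):
    """Combine any number of query clauses under one bool query, using a must array."""
    return {"bool": {"must": list(clauses)}}


# Repeating a key inside one JSON object loses data: Python's parser keeps the last value.
collapsed = json.loads('{"must": "first", "must": "second"}')

title_clause = {
    "multi_match": {
        "query": "term 1",
        "type": "cross_fields",
        "fields": ["title^2", "proxy_titles^2", "description"],
    }
}
address_clause = {
    "multi_match": {
        "query": "us",
        "fields": ["address.city", "address.country", "address.state"],
    }
}
query = bool_must(title_clause, address_clause)
print(json.dumps(query, indent=2))
```

With the clauses held in a list, the serialized JSON contains a single must key whose value is an array, so neither multi_match is dropped.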