Elasticsearch date decay function, Rails

I'm doing a simple query across multiple fields and trying to apply a decay function based on how many days old a given document is. The following query is my attempt:
{
  query: {
    function_score: {
      query: {
        multi_match: {
          query: query,
          fields: ['name', 'location']
        },
        functions: [{
          gauss: {
            created_at: {
              origin: 'now',
              scale: '1d',
              offset: '2d',
              decay: 0.5
            }
          }
        }]
      }
    }
  }
}
With the following mapping:
mappings dynamic: 'false' do
  indexes :name, analyzer: 'english'
  indexes :location, analyzer: 'english'
  indexes :created_at, type: 'date'
end
This gives the following error:
[400] {"error":{"root_cause":[{"type":"query_parsing_exception","reason":"No query registered for [gauss]","index":"people","line":1,"col":143}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query_fetch","grouped":true,"failed_shards":[{"shard":0,"index":"jobs","node":"abcdefgZq1PMsd882foA","reason":{"type":"query_parsing_exception","reason":"No query registered for [gauss]","index":"people","line":1,"col":143}}]},"status":400}

The functions array needs to go one level higher, directly inside function_score rather than inside its query, like this:
{
  query: {
    function_score: {
      functions: [{
        gauss: {
          created_at: {
            origin: 'now',
            scale: '1d',
            offset: '2d',
            decay: 0.5
          }
        }
      }],
      query: {
        multi_match: {
          query: query,
          fields: ['name', 'location']
        }
      }
    }
  }
}
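If you then want to control how the decayed value combines with the relevance score, function_score also accepts score_mode and boost_mode. A minimal sketch (the values shown here are just the defaults):
{
  query: {
    function_score: {
      functions: [{
        gauss: {
          created_at: { origin: 'now', scale: '1d', offset: '2d', decay: 0.5 }
        }
      }],
      query: {
        multi_match: { query: query, fields: ['name', 'location'] }
      },
      score_mode: 'multiply', # how the results of multiple functions combine
      boost_mode: 'multiply'  # how the function result combines with the query score
    }
  }
}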

Related

Elasticsearch field value factor with multi matching?

I'm using this to search my outfits:
def self.search(query, purchasedlow, purchasedhigh)
  __elasticsearch__.search(
    {
      query: {
        function_score: {
          query: {
            bool: {
              filter: [
                {
                  multi_match: {
                    query: query,
                    fields: ['name', 'description', 'material']
                  }
                },
                {
                  range: {
                    purchased: { lte: purchasedhigh.to_i, gte: purchasedlow.to_i }
                  }
                }
              ]
            }
          }
        }
      }
    }
  )
end
But I don't know how to add this code:
field_value_factor: {
  field: "likes",
  factor: "100"
}
I know it's supposed to go with the function_score, so that the calculated score is multiplied by the number of likes to produce the final score, but when I put the code after function_score I get the following error:
[400] {"error":{"root_cause":[{"type":"parsing_exception","reason":"[function_score] malformed query, expected [END_OBJECT] but found [FIELD_NAME]","line":1,"col":232}],"type":"parsing_exception","reason":"[function_score] malformed query, expected [END_OBJECT] but found [FIELD_NAME]","line":1,"col":232},"status":400}
Where do I need to put the field value factor so that it works correctly?
I used field_value_factor in one of my queries like this:
@products = Product.search(
  query: {
    function_score: {
      query: {
        bool: {
          must: {
            multi_match: {
              fields: ['brand^5', '_all'],
              query: "#{query}",
              fuzziness: "AUTO"
            }
          },
          filter: {
            bool: {
              must: [
                { term: { "brand": "NordicTrack" } },
                { term: { "product_type": "Treadmill" } }
              ]
            }
          }
        }
      },
      field_value_factor: {
        field: "popularity",
        modifier: "log1p"
      }
    }
  }
)
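Applied to the outfits search above, the same placement would look roughly like this (a sketch, untested; note that a bool query containing only filter clauses contributes no score, so boost_mode: 'replace' is used here to let the likes function supply the final score on its own):
# Sketch only: field_value_factor sits inside function_score, as a sibling
# of the inner query. Field names are taken from the question.
__elasticsearch__.search(
  {
    query: {
      function_score: {
        query: {
          bool: {
            filter: [
              { multi_match: { query: query, fields: ['name', 'description', 'material'] } },
              { range: { purchased: { lte: purchasedhigh.to_i, gte: purchasedlow.to_i } } }
            ]
          }
        },
        field_value_factor: {
          field: "likes",
          factor: 100
        },
        boost_mode: 'replace' # filter-only queries score 0, so use the function score alone
      }
    }
  }
)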

elasticsearch 5.X + searchkick(rails) configuration

I'm in the process of upgrading my Elasticsearch instance from v1.7 to 5.3 and I'm running into some errors when starting my reindex. From what I can tell, most models index just fine, but there are a couple that used more advanced settings that don't seem to be working. Here's an example of one of my models (brand.rb):
searchkick word_start: [:name],
           merge_mappings: true,
           mappings: searchkick_mappings,
           settings: searchkick_settings

def search_data
  attributes.merge(
    geography: self.geography ? self.geography.name : "",
    geography_breadcrumb: self.geography ? self.geography.breadcrumb : "",
    producer_name: self.producer.name
  )
end
The searchkick_mappings and searchkick_settings are defined in another file that is included in my model. Here's the code:
def searchkick_mappings
  {
    brand: {
      properties: {
        name: {
          type: 'text',
          analyzer: 'standard',
          fields: {
            autocomplete: {
              type: 'text',
              analyzer: 'autocomplete'
            },
            folded: {
              type: 'text',
              analyzer: 'folded'
            },
            delimited: {
              type: 'text',
              analyzer: 'delimited'
            }
          }
        }
      }
    }
  }
end
def searchkick_settings
  {
    analysis: {
      filter: {
        autocomplete_filter: {
          type: 'edge_ngram',
          min_gram: 1,
          max_gram: 20
        },
        delimiter_filter: {
          type: 'word_delimiter',
          preserve_original: true
        }
      },
      analyzer: {
        folded: {
          type: 'custom',
          tokenizer: 'standard',
          filter: ['standard', 'lowercase', 'asciifolding']
        },
        delimited: {
          type: 'custom',
          tokenizer: 'whitespace',
          filter: ['lowercase', 'delimiter_filter']
        },
        autocomplete: {
          type: 'custom',
          tokenizer: 'standard',
          filter: ['standard', 'lowercase', 'asciifolding', 'autocomplete_filter']
        }
      }
    }
  }
end
The only change I made from the working v1.7 setup is that I had to change the 'type' field from "string" to "text", as they removed the string type in favor of text and keyword types, where text is analyzed and keyword is not. The error I'm receiving when I run bundle exec rake searchkick:reindex:all says there is an unknown parameter 'ignore_above'. From reading the documentation, that parameter is only for keyword fields, not text, but I am not adding it in my custom mappings, so I don't see why it would be there.
Let me know if you need to see more code/need to know more. I'll gladly edit OP/comment whatever is helpful.
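For what it's worth, in Elasticsearch 5.x the ignore_above parameter is only valid on keyword fields, so the error most likely comes from a generated or merged mapping rather than the custom one above; one plausible source is searchkick's own generated mapping being merged with yours via merge_mappings: true. A hypothetical snippet (not the asker's actual generated mapping) showing where the parameter is and isn't legal:
{
  brand: {
    properties: {
      name: {
        type: 'text',        # analyzed; ignore_above is NOT allowed here
        analyzer: 'standard',
        fields: {
          keyword: {
            type: 'keyword', # not analyzed; ignore_above is allowed
            ignore_above: 256
          }
        }
      }
    }
  }
}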

Elasticsearch different behaviour on test server

My elasticsearch is currently giving different results on different environments even though I'm doing the same search.
It works fine in development on my localhost, but it doesn't work on my test server (it doesn't return the expected records; yes, I do have the database seeded).
As far as I understand, this should check whether there is a hit on any of the three matches and, if there is, return all the hits.
I'm running Windows 10, just using rails s.
The server is running Ubuntu 16, using nginx and unicorn.
Here's my mapping: (note: I'm not completely sure whether the analyzer does anything but it shouldn't matter)
settings index: { number_of_shards: 1 } do
  mappings dynamic: 'true' do
    indexes :reportdate, type: 'date'
    indexes :client do
      indexes :id
      indexes :name, analyzer: 'dutch'
    end
    indexes :animal do
      indexes :id
      indexes :species, analyzer: 'dutch'
      indexes :other_species, analyzer: 'dutch'
      indexes :chip_code
    end
    indexes :locations do
      indexes :id
      indexes :street, analyzer: 'dutch'
      indexes :city, analyzer: 'dutch'
      indexes :postalcode
    end
  end
end
Here's my search:
__elasticsearch__.search({
  sort: [
    { reportdate: { order: "desc" } },
    "_score"
  ],
  query: {
    bool: {
      should: [
        { multi_match: {
          query: query,
          type: "phrase_prefix",
          fields: ["other_species", "name"]
        }},
        { prefix: {
          chip_code: query
        }},
        { match_phrase: {
          "_all": {
            query: query,
            fuzziness: "AUTO"
          }
        }}
      ]
    }
  }
})
EDIT #1: Note: I'm fairly new to Ruby on Rails; I started about two weeks ago, doing maintenance work on an old project, and a search function was also requested.
Turns out the problem was that I was using foreign tables (well, kinda) and nested mappings (probably the latter).
Here's the updated code that works on both production and locally:
__elasticsearch__.search({
  sort: [
    { reportdate: { order: "desc" } },
    "_score"
  ],
  query: {
    bool: {
      should: [
        { multi_match: {
          query: query,
          type: "phrase_prefix",
          fields: ["animal.other_species", "client.name"]
        }},
        { prefix: {
          "animal.chip_code": query
        }},
        { match_phrase: {
          "_all": {
            query: query,
            fuzziness: "AUTO"
          }
        }}
      ]
    }
  }
})
Not sure why the animal and client parent paths don't need to be prepended for it to work locally while they are needed on my testing server, but written this way it works on both.
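A plausible explanation (a guess from the symptoms, not verified against both machines): Elasticsearch versions before 2.0 resolved a leaf field of an object by its short name, so other_species alone could find animal.other_species, while 2.0 and later require the full dotted path. If the two environments run different Elasticsearch versions, that would produce exactly this difference in behaviour.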

empty or nil field filter not working for elastic search

I am trying to create an ES query where three things must be true: a boolean field must be true, a search term must match a field, and a third field must be null. Here's my query in YAML (I'm using mongoid-elasticsearch):
:query:
  :filtered:
    :filter:
      :bool:
        :must:
          - :term:
              :qualified: true
          - :term:
              :name: mukluk
          - :missing:
              :field: :location_ids
              :existence: true
              :null_value: true
In this query "name" is the name of a field.
This results in zero hits.
If I remove the last key (missing), the query works fine. Here's the actual record in JSON (this is the Elasticsearch response returned when the missing filter is removed, so you can see that location_ids is in fact nil):
{
  "took" => 1,
  "timed_out" => false,
  "_shards" => {
    "total" => 5,
    "successful" => 5,
    "failed" => 0
  },
  "hits" => {
    "total" => 1,
    "max_score" => 1.0,
    "hits" => [
      {
        "_index" => "items",
        "_type" => "item",
        "_id" => "5480b53c73d5495ce600062b",
        "_score" => 1.0,
        "_source" => {
          "name" => "FitFlop mukluk (black)",
          "description" => "FitFlop mukluk (black)",
          "qualified" => true,
          "retailer_id" => "5470a7f273d549817c1882de",
          "location_ids" => nil
        }
      }
    ]
  }
}
See how location_ids is nil? How do I get this to be part of the query?
By the way, here's the body JSON of the actual query that gets submitted:
{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "term": {
                "qualified": true
              }
            },
            {
              "term": {
                "name": "mukluk"
              }
            }
          ]
        }
      }
    }
  },
  "size": 200,
  "from": 0
}
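For reference, the missing filter from the YAML would serialize like this inside the must array (Elasticsearch 1.x syntax); note that the body above contains only the two term clauses:
{
  "missing": {
    "field": "location_ids",
    "existence": true,
    "null_value": true
  }
}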
UPDATE:
Here's the output of the failed query:
{
  "took" => 1,
  "timed_out" => false,
  "_shards" => {
    "total" => 5,
    "successful" => 5,
    "failed" => 0
  },
  "hits" => {
    "total" => 0,
    "max_score" => nil,
    "hits" => []
  }
}
And here's the mapping:
index_mappings: {
  name: {
    type: 'multi_field',
    fields: {
      name: {
        type: 'string',
        analyzer: 'snowball'
      },
      description: {
        type: 'string',
        analyzer: 'snowball'
      },
      qualfied: {
        type: 'boolean',
        index: :not_analyzed
      },
      location_ids: {
        type: 'string',
        index: :not_analyzed
      }
    }
  },
  tags: {
    type: 'string',
    include_in_all: false
  },
  wrapper: :load
}
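Two things stand out in this mapping, whether or not they are the cause: qualfied is misspelled, and description, qualfied, and location_ids are declared inside name's multi_field fields block rather than as top-level properties, so location_ids is probably not mapped the way the missing filter expects.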

Why does this elasticsearch/tire code not match partial words?

I'm trying to use Elasticsearch and Tire to index some data. I want to be able to search it on partial matches, not just full words. When running a query on the example model below, it will only match words in the "notes" field that are full word matches. I can't figure out why.
class Thingy
  include Tire::Model::Search
  include Tire::Model::Callbacks

  # has some attributes

  tire do
    settings analysis: {
      filter: {
        ngram_filter: {
          type: 'nGram',
          min_gram: 2,
          max_gram: 12
        }
      },
      analyzer: {
        index_ngram_analyzer: {
          type: 'custom',
          tokenizer: 'standard',
          filter: ['lowercase']
        },
        search_ngram_analyzer: {
          type: 'custom',
          tokenizer: 'standard',
          filter: ['lowercase', 'ngram_filter']
        }
      }
    } do
      mapping do
        indexes :notes, type: "string", boost: 10, index_analyzer: "index_ngram_analyzer", search_analyzer: "search_ngram_analyzer"
      end
    end
  end

  def to_indexed_json
    {
      id: self.id,
      account_id: self.account_id,
      created_at: self.created_at,
      test: self.test,
      notes: some_method_that_returns_string
    }.to_json
  end
end
The query looks like this:
@things = Thing.search page: params[:page], per_page: 50 do
  query {
    boolean {
      must { string "account_id:#{account_id}" }
      must_not { string "test:true" }
      must { string "#{query}" }
    }
  }
  sort {
    by :id, 'desc'
  }
  size 50
  highlight notes: { number_of_fragments: 0 }, options: { tag: '<span class="match">' }
end
I've also tried this but it never returns results (and ideally I'd like the search to apply to all fields, not just notes):
must { match :notes, "#{query}" } # tried with `type: :phrase` as well
What am I doing wrong?
You almost got there! :) The problem is that you've swapped the role of index_analyzer and search_analyzer, in fact.
Let me explain briefly how it works:
You want to break document words into these ngram "chunks" during indexing, so when you are indexing a word like Martian, it gets broken into: ['ma', 'mar', 'mart', ..., 'ar', 'art', 'arti', ...]. You can try it with the Analyze API: http://localhost:9200/thingies/_analyze?text=Martian&analyzer=index_ngram_analyzer.
When people are searching, they are already using these partial ngrams, so to speak, since they search for "mar" or "mart" etc. So you don't break their phrases further with the ngram tokenizer.
That's why you (correctly) separate index_analyzer and search_analyzer in your mapping, so Elasticsearch knows how to analyze the notes attribute during indexing, and how to analyze any search phrase against this attribute.
In other words, do this:
analyzer: {
  index_ngram_analyzer: {
    type: 'custom',
    tokenizer: 'standard',
    filter: ['lowercase', 'ngram_filter']
  },
  search_ngram_analyzer: {
    type: 'custom',
    tokenizer: 'standard',
    filter: ['lowercase']
  }
}
Full, working Ruby code is below. Also, I highly recommend migrating to the new elasticsearch-model Rubygem, which contains all the important features of Tire and is actively developed.
require 'tire'

Tire.index('thingies').delete

class Thingy
  include Tire::Model::Persistence

  tire do
    settings analysis: {
      filter: {
        ngram_filter: {
          type: 'nGram',
          min_gram: 2,
          max_gram: 12
        }
      },
      analyzer: {
        index_ngram_analyzer: {
          type: 'custom',
          tokenizer: 'standard',
          filter: ['lowercase', 'ngram_filter']
        },
        search_ngram_analyzer: {
          type: 'custom',
          tokenizer: 'standard',
          filter: ['lowercase']
        }
      }
    } do
      mapping do
        indexes :notes, type: "string", index_analyzer: "index_ngram_analyzer", search_analyzer: "search_ngram_analyzer"
      end
    end
  end

  property :notes
end

Thingy.create id: 1, notes: 'Martial Partial Martian'
Thingy.create id: 2, notes: 'Venetian Completion Heresion'

Thingy.index.refresh

# Find 'art' in 'martial'
#
# Equivalent to: http://localhost:9200/thingies/_search?q=notes:art
#
results = Thingy.search do
  query do
    match :notes, 'art'
  end
end

p results.map(&:notes)

# Find 'net' in 'venetian'
#
# Equivalent to: http://localhost:9200/thingies/_search?q=notes:net
#
results = Thingy.search do
  query do
    match :notes, 'net'
  end
end

p results.map(&:notes)
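Both searches should return the document they target: 'art' and 'net' are among the 2-12 character ngrams produced from 'Martial' and 'Venetian' at index time, and the search analyzer leaves the query terms whole, so they match those ngrams exactly.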
The problem for me was that I was using the string query instead of the match query. The search should have been written like this:
@things = Thing.search page: params[:page], per_page: 50 do
  query {
    match [:prop_1, :prop_2, :notes], query
  }
  sort {
    by :id, 'desc'
  }
  filter :term, account_id: account_id
  filter :term, test: false
  size 50
  highlight notes: { number_of_fragments: 0 }, options: { tag: '<span class="match">' }
end
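For context on why the switch matters: Tire's string query maps to Elasticsearch's query_string query, which parses Lucene syntax and, without a field prefix, searches the _all field, where the ngram mapping on notes never applies. The match query targets concrete fields and runs each field's search_analyzer, so the partial-matching setup actually takes effect.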
