Find model with part of title using ElasticSearch / Rails - ruby-on-rails

There is the following Post model:
class Post < ActiveRecord::Base
include Elasticsearch::Model
include Elasticsearch::Model::Callbacks
def self.search query
__elasticsearch__.search(
{
query: {
multi_match: {
query: query,
fields: ['title']
}
},
filter: {
and: [
{ term: { deleted: false } },
{ term: { enabled: true } }
]
}
}
)
end
settings index: { number_of_shards: 1 } do
mappings dynamic: 'false' do
indexes :title, analyzer: 'english'
end
end
end
Post.import
I have one Post with the title 'Amsterdam'. When I execute Post.search('Amsterdam') I get one record, as expected. But if I execute Post.search('Amster') I get no records. What am I doing wrong, and how can I fix it? Thanks!
OS - OS X, ElasticSearch I installed using Homebrew

You will have to use an nGram tokenizer to get partial text search. A very good example of how to do this can be found here. That said, I would be very careful with nGram, as it can often turn up unrelated results.
For example, the substring "mon" is contained in "monkey", "money", and "monday", all of which are unrelated, so a search for "mon" would match all three.
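To make the trade-off concrete, here is a minimal plain-Ruby sketch of what n-gram expansion does to a word. It does not call Elasticsearch; the `ngrams` and `matches` helpers, the 2..12 gram sizes, and the lowercase step are assumptions chosen for illustration.

```ruby
# Return every substring of `word` whose length is between min and max.
# This roughly mimics what an nGram tokenizer emits for a single word.
def ngrams(word, min: 2, max: 12)
  chars = word.downcase.chars
  grams = []
  (min..max).each do |size|
    chars.each_cons(size) { |slice| grams << slice.join }
  end
  grams.uniq
end

# Which of the indexed words would a partial search term hit?
def matches(term, words)
  words.select { |w| ngrams(w).include?(term.downcase) }
end

words = %w[monkey money monday Amsterdam]

p matches("mon", words)     # all three unrelated "mon" words
p matches("Amster", words)  # only Amsterdam
```

The partial term "Amster" now finds "Amsterdam", but "mon" pulls in three unrelated words, which is the noise described above.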
Alternatively (what I would do):
You could try making it a fuzzy search. The maximum edit distance with fuzzy search is only two, so it still doesn't return anything for your example ('Amster' is three edits away from 'Amsterdam'), but it tends to return relevant results.
The example I found: How to use Fuzzy Search
# Perform a fuzzy search!
POST /fuzzy_products/product/_search
{
"query": {
"match": {
"name": {
"query": "Vacuummm",
"fuzziness": 2,
"prefix_length": 1
}
}
}
}
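The edit-distance limit can be checked by hand. The sketch below is a plain Levenshtein distance in Ruby, written for illustration only: Elasticsearch also counts a transposition as a single edit by default, which this version ignores, and the `levenshtein` helper is not part of any gem.

```ruby
# Classic dynamic-programming Levenshtein distance between two strings.
def levenshtein(a, b)
  prev = (0..b.length).to_a            # distances from "" to each prefix of b
  a.each_char.with_index(1) do |ca, i|
    curr = [i]                         # distance from a[0, i] to ""
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [prev[j] + 1,            # deletion
               curr[j - 1] + 1,        # insertion
               prev[j - 1] + cost].min # substitution (or match)
    end
    prev = curr
  end
  prev.last
end

puts levenshtein("vacuummm", "vacuum")  # within a fuzziness of 2
puts levenshtein("amster", "amsterdam") # beyond a fuzziness of 2
```

"Vacuummm" is two edits from "vacuum", so the fuzzy query above can match it, while "Amster" is three edits from "Amsterdam" and falls outside the limit.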

Related

ElasticSearch: A query that allows nil parameters

So I have the module below in an Elasticsearch concern for my model in Rails.
This is working, but how do I make each of the bool query clauses (must, must_not, filter) accept nil or empty parameters?
For example, if I pass an empty query_string it should return all the documents, and if I pass an empty size parameter it should return all sizes.
module ClassMethods
def home_page_search(query_string, size, start_date, end_date)
search({
query: {
bool: {
must: [
{
multi_match: {
query: query_string,
fields: [:brand, :name, :notes, :size_notes]
}
}
],
must_not: [
range: {
unavailable_dates: { gte: start_date, lte: end_date }
}
],
filter: [
{ term: { size: size } }
]
}
}
})
end
end
I solved a similar problem by building the query on an as-needed basis, so I only included a clause if there was a search term for it. The query I sent to Elasticsearch only included the terms that were actually set by the user. For example:
if size.present?
query[:query][:bool][:filter] = { term: { size: size } }
end
(assuming the correct representation of the query, etc.)
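A fuller sketch of that idea, applied to the method from the question, might look like the following. It only builds the request body as a plain hash; the `supplied?` helper, the `home_page_query` name, and the `match_all` fallback are assumptions for illustration, not part of the original code.

```ruby
# Treat nil and empty or whitespace-only values as "not supplied".
def supplied?(value)
  !value.nil? && !value.to_s.strip.empty?
end

# Build the search body, adding each bool clause only when its input is set.
def home_page_query(query_string, size, start_date, end_date)
  bool = {}

  if supplied?(query_string)
    bool[:must] = [
      { multi_match: { query: query_string,
                       fields: [:brand, :name, :notes, :size_notes] } }
    ]
  end

  if supplied?(start_date) && supplied?(end_date)
    bool[:must_not] = [
      { range: { unavailable_dates: { gte: start_date, lte: end_date } } }
    ]
  end

  bool[:filter] = [{ term: { size: size } }] if supplied?(size)

  # With no clauses at all, fall back to match_all so every document is returned.
  { query: bool.empty? ? { match_all: {} } : { bool: bool } }
end

p home_page_query(nil, "", nil, nil)
p home_page_query("", "M", nil, nil)
```

Inside the concern, the resulting hash would be passed to `search` in place of the literal hash used in the question.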

Elasticsearch different behaviour on test server

My elasticsearch is currently giving different results on different environments even though I'm doing the same search.
It works fine in development on my localhost, but not on my test server: it doesn't return the expected records (and yes, the database is seeded).
As far as I understand, this should check whether any of the three clauses produces a hit and, if so, return all the hits.
I'm running Windows 10, just using rails s.
The server is running Ubuntu 16, using nginx and unicorn.
Here's my mapping: (note: I'm not completely sure whether the analyzer does anything but it shouldn't matter)
settings index: { number_of_shards: 1 } do
mappings dynamic: 'true' do
indexes :reportdate, type: 'date'
indexes :client do
indexes :id
indexes :name, analyzer: 'dutch'
end
indexes :animal do
indexes :id
indexes :species, analyzer: 'dutch'
indexes :other_species, analyzer: 'dutch'
indexes :chip_code
end
indexes :locations do
indexes :id
indexes :street, analyzer: 'dutch'
indexes :city, analyzer: 'dutch'
indexes :postalcode
end
end
end
Here's my search:
__elasticsearch__.search({
sort: [
{ reportdate: { order: "desc" }},
"_score"
],
query: {
bool: {
should: [
{ multi_match: {
query: query,
type: "phrase_prefix",
fields: [ "other_species", "name"]
}},
{ prefix: {
chip_code: query
}},
{ match_phrase: {
"_all": {
query: query,
fuzziness: "AUTO"
}
}}
]
}
}
})
EDIT #1: Note: I'm fairly new to ruby on rails, started about 2 weeks ago, doing maintenance work on an old project and they also requested a search function.
Turns out that the problem was that I was using foreign tables (well, kinda) and nested mapping (probably this).
Here's the updated code that works on both production and locally:
__elasticsearch__.search({
sort: [
{ reportdate: { order: "desc" }},
"_score"
],
query: {
bool: {
should: [
{ multi_match: {
query: query,
type: "phrase_prefix",
fields: [ "animal.other_species", "client.name"]
}},
{ prefix: {
"animal.chip_code": query
}},
{ match_phrase: {
"_all": {
query: query,
fuzziness: "AUTO"
}
}}
]
}
}
})
I'm not sure why it works locally without the animal and client parents prepended while my test server needs them. Written this way, it works on both.
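Fields declared inside an object mapping are addressed by their full dotted path, which is what the updated query uses. The plain-Ruby sketch below shows how the nested document from the mapping corresponds to those paths; the `dotted_paths` helper and the sample values are invented for illustration, and it does not explain why the short names happened to work locally.

```ruby
# Flatten a nested document into "parent.child" field paths.
def dotted_paths(doc, prefix = nil)
  doc.each_with_object({}) do |(key, value), out|
    path = [prefix, key].compact.join(".")
    if value.is_a?(Hash)
      out.merge!(dotted_paths(value, path))
    else
      out[path] = value
    end
  end
end

# Hypothetical sample document shaped like the mapping above.
report = {
  reportdate: "2017-03-01",
  client: { id: 1, name: "Jansen" },
  animal: { id: 7, other_species: "fret", chip_code: "528140000123456" }
}

puts dotted_paths(report).keys
```

The printed keys include `client.name`, `animal.other_species`, and `animal.chip_code`, the names used in the working query.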

How to use bool to limit Elasticsearch query results?

In my Rails app I have 2 models: User(id) & Document(id,user_id, document_title,document)
def self.search(query)
__elasticsearch__.search(
{
query: {
multi_match: {
query: query,
fields: ['document_title^10', 'document']
}
},
}
)
end
I'm using the search query above, which works great for returning results across the entire table. The problem is that the results are not limited to the current_user. I'm trying to update the search method so it only returns results for the current_user. Per the docs, I'm doing:
def self.search(query, user_id)
__elasticsearch__.search(
{
bool: {
filter: ["user_id", user_id]
},
query: {
multi_match: {
query: query,
fields: ['document_title^10', 'document']
}
},
}
)
end
However, that is erroring with:
[400] {"error":{"root_cause":[{"type":"search_parse_exception","reason":"failed to parse search source. unknown search element [bool]","line":1,"col":2}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"documents","node":"52GAD0HbT4OlekjesTZY_A","reason":{"type":"search_parse_exception","reason":"failed to parse search source. unknown search element [bool]","line":1,"col":2}}]},"status":400}
I'm not sure which docs you are looking at, but that query isn't right: bool is a query type, so it has to go inside query, with the multi_match query in its must clause and the user_id restriction as a term filter.
{
query: {
bool: {
must: [{
multi_match: {...}
}],
filter: [{
term: {user_id: user_id}
}]
}
}
}
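Filled in with the fields from the question, the body could be built as in this sketch. The `document_search_body` method name is an assumption for illustration; it only returns a hash and does not talk to Elasticsearch.

```ruby
# Build a search body that scores on the text fields and
# restricts results to one user's documents with a term filter.
def document_search_body(query, user_id)
  {
    query: {
      bool: {
        must: [
          { multi_match: { query: query,
                           fields: ["document_title^10", "document"] } }
        ],
        filter: [
          { term: { user_id: user_id } }
        ]
      }
    }
  }
end

p document_search_body("invoice", 42)
```

In the model, `self.search(query, user_id)` would then pass this hash to `__elasticsearch__.search`. The filter clause does not affect scoring, so relevance still comes only from the multi_match.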

Why does this elasticsearch/tire code not match partial words?

I'm trying to use Elasticsearch and Tire to index some data. I want to be able to search it on partial matches, not just full words. When running a query on the example model below, it will only match words in the "notes" field that are full word matches. I can't figure out why.
class Thingy
include Tire::Model::Search
include Tire::Model::Callbacks
# has some attributes
tire do
settings analysis: {
filter: {
ngram_filter: {
type: 'nGram',
min_gram: 2,
max_gram: 12
}
},
analyzer: {
index_ngram_analyzer: {
type: 'custom',
tokenizer: 'standard',
filter: ['lowercase']
},
search_ngram_analyzer: {
type: 'custom',
tokenizer: 'standard',
filter: ['lowercase', 'ngram_filter']
}
}
} do
mapping do
indexes :notes, :type => "string", boost: 10, index_analyzer: "index_ngram_analyzer", search_analyzer: "search_ngram_analyzer"
end
end
end
def to_indexed_json
{
id: self.id,
account_id: self.account_id,
created_at: self.created_at,
test: self.test,
notes: some_method_that_returns_string
}.to_json
end
end
The query looks like this:
@things = Thing.search page: params[:page], per_page: 50 do
query {
boolean {
must { string "account_id:#{account_id}" }
must_not { string "test:true" }
must { string "#{query}" }
}
}
sort {
by :id, 'desc'
}
size 50
highlight notes: {number_of_fragments: 0}, options: {tag: '<span class="match">'}
end
I've also tried this but it never returns results (and ideally I'd like the search to apply to all fields, not just notes):
must { match :notes, "#{query}" } # tried with `type: :phrase` as well
What am I doing wrong?
You almost got there! :) The problem is that you've swapped the roles of index_analyzer and search_analyzer.
Let me explain briefly how it works:
You want to break document words into these ngram "chunks" during indexing, so when you index a word like Martian, it gets broken into: ['ma', 'mar', 'mart', ..., 'ar', 'art', 'arti', ...]. You can try it with the Analyze API: http://localhost:9200/thingies/_analyze?text=Martian&analyzer=index_ngram_analyzer.
When people are searching, they are already using these partial ngrams, so to speak, since they search for "mar" or "mart" etc. So you don't break their phrases further with the ngram tokenizer.
That's why you (correctly) separate index_analyzer and search_analyzer in your mapping, so Elasticsearch knows how to analyze the notes attribute during indexing, and how to analyze any search phrase against this attribute.
In other words, do this:
analyzer: {
index_ngram_analyzer: {
type: 'custom',
tokenizer: 'standard',
filter: ['lowercase', 'ngram_filter']
},
search_ngram_analyzer: {
type: 'custom',
tokenizer: 'standard',
filter: ['lowercase']
}
}
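The effect of swapping the two analyzers can be simulated without Elasticsearch. The sketch below is a simplified plain-Ruby model, not Tire's or Elasticsearch's implementation: `plain_tokens` stands in for the standard tokenizer plus lowercase filter, `ngram_tokens` for the same chain with `ngram_filter` added, and `match?` for a match query that succeeds when any token is shared.

```ruby
# Lowercase word tokens: stand-in for standard tokenizer + lowercase filter.
def plain_tokens(text)
  text.downcase.scan(/\w+/)
end

# The same tokens expanded into n-grams of 2..12 characters.
def ngram_tokens(text, min: 2, max: 12)
  plain_tokens(text).flat_map do |token|
    (min..max).flat_map { |n| token.chars.each_cons(n).map(&:join) }
  end.uniq
end

# A document matches when any search token equals any indexed token.
def match?(index_tokens, search_tokens)
  !(index_tokens & search_tokens).empty?
end

notes = "Martial Partial Martian"

# Swapped roles (the question): plain tokens indexed, n-grams at search time.
swapped = match?(plain_tokens(notes), ngram_tokens("art"))

# Corrected roles (the answer): n-grams indexed, plain tokens at search time.
corrected = match?(ngram_tokens(notes), plain_tokens("art"))

puts "swapped:   #{swapped}"
puts "corrected: #{corrected}"
```

With the roles swapped, only a search containing a complete indexed word can match, which is the full-word-only behavior described in the question.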
Full, working Ruby code is below. Also, I highly recommend migrating to the new elasticsearch-model Rubygem, which contains all the important features of Tire and is actively developed.
require 'tire'
Tire.index('thingies').delete
class Thingy
include Tire::Model::Persistence
tire do
settings analysis: {
filter: {
ngram_filter: {
type: 'nGram',
min_gram: 2,
max_gram: 12
}
},
analyzer: {
index_ngram_analyzer: {
type: 'custom',
tokenizer: 'standard',
filter: ['lowercase', 'ngram_filter']
},
search_ngram_analyzer: {
type: 'custom',
tokenizer: 'standard',
filter: ['lowercase']
}
}
} do
mapping do
indexes :notes, type: "string", index_analyzer: "index_ngram_analyzer", search_analyzer: "search_ngram_analyzer"
end
end
end
property :notes
end
Thingy.create id: 1, notes: 'Martial Partial Martian'
Thingy.create id: 2, notes: 'Venetian Completion Heresion'
Thingy.index.refresh
# Find 'art' in 'martial'
#
# Equivalent to: http://localhost:9200/thingies/_search?q=notes:art
#
results = Thingy.search do
query do
match :notes, 'art'
end
end
p results.map(&:notes)
# Find 'net' in 'venetian'
#
# Equivalent to: http://localhost:9200/thingies/_search?q=notes:net
#
results = Thingy.search do
query do
match :notes, 'net'
end
end
p results.map(&:notes)
The problem for me was that I was using the string query instead of the match query. The search should have been written like this:
@things = Thing.search page: params[:page], per_page: 50 do
query {
match [:prop_1, :prop_2, :notes], query
}
sort {
by :id, 'desc'
}
filter :term, account_id: account_id
filter :term, test: false
size 50
highlight notes: {number_of_fragments: 0}, options: {tag: '<span class="match">'}
end

Sorting search results with mongoid-elasticsearch gem

I am trying to implement sorting into my search. I am searching with use of the gem mongoid-elasticsearch. This is my model configuration:
class ActivityLog
include Mongoid::Document
include Mongoid::Timestamps
include Mongoid::Elasticsearch
elasticsearch!(
{
wrapper: :load,
sort: [
{ created_at: "desc" }
]
}
)
field :type, type: String
end
This configuration does not raise any errors, but it does not seem to have any effect either: search results come back in no particular order.
I think I am implementing the configuration in accordance with the documentation:
Check mongoid-elasticsearch documentation here
Check elasticsearch documentation here
My search query, by the way, is:
ActivityLog.es.search(params[:query], page: params[:page]).results.paginate(per_page: 5, page: params[:page])
You should remove the Array wrapper:
elasticsearch!(
{
wrapper: :load,
sort: { created_at: "desc" }
}
)
The same way you would do it in Elasticsearch itself:
ActivityLog.es.search({
body: {
query: {
query_string: {
query: params[:search]
}
},
sort: [
{created_at: {order: "asc"}}
]
}
})
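If the sort direction needs to vary, the request body can be built by a small helper, as in this sketch. The `activity_log_search_body` name and its `direction:` keyword are assumptions for illustration; the hash has the same shape as the one in the answer above.

```ruby
# Build the search options with an explicit sort on created_at.
def activity_log_search_body(text, direction: "desc")
  unless %w[asc desc].include?(direction)
    raise ArgumentError, "direction must be 'asc' or 'desc'"
  end

  {
    body: {
      query: { query_string: { query: text } },
      sort: [{ created_at: { order: direction } }]
    }
  }
end

p activity_log_search_body("login")
p activity_log_search_body("login", direction: "asc")
```

The result would be passed to `ActivityLog.es.search` in place of the literal hash.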
