Ruby on Rails: ElasticSearch / Tire dynamic search on multiple indices

I've done a bunch of searching and I haven't been able to get an answer to this question - hopefully this isn't a repeat (apologies if it is)...
Preface: I'm using Rails & Tire to search with ElasticSearch.
I have an object, Place, with attributes "name", "city", "state", and "zip". They are indexed as follows:
indexes :name, :type => 'multi_field', :fields => {
  :name         => { :type => 'string', :analyzer => 'snowball' },
  :"name.exact" => { :type => 'string', :index => :not_analyzed }
}
indexes :city
indexes :state
indexes :zip
There are three conditions for searching: 1. Name only, 2. (City, State OR Zip), 3. Name AND (City, State OR Zip).
My code for the "query" block is:
if city_state.present?
  boolean do
    must { string "name:#{name}*" } if name.present?
    must { string "city:#{city_state}*" }
    must { string "state:#{city_state}*" }
  end
elsif query_parameters["zip"].present?
  boolean do
    must { string "name:#{name}*" } if name.present?
    must { string "zip:#{query_parameters["zip"]}*" }
  end
else
  string "name:#{name}*"
end
The aforementioned search conditions #1 and #2 work as expected against multiple tests. However, condition 3 does not - it seems to only pay attention to the "name" field. I'm assuming it has something to do with using the "city_state" variable to search on both "city" and "state"... But I'm doing this because a user can enter either "Chicago" or "Illinois" in the City, State / Zip text box and the search should still work, using either the geographic center of Chicago or the geographic center of Illinois, respectively.
Anything obvious I'm doing wrong?

However, condition 3 does not - it seems to only pay attention to the "name" field
Errr, isn't
string "name:#{name}*"
telling it to do exactly that?
or did you mean to just do
string "#{name}"

Related

Elasticsearch, tire and autocomplete

Good day. I have elasticsearch in my rails app using tire.
I have many names in my db, and I want to search for them with something like search_query: "alex ivan", where the output should be ["Alexander Ivanov", "Alex Ivanenko"] etc. (real names from the db).
I tried to make it work following this article, but it's not searching. So I've made a quick hack:
params[:search_query] = params[:search_query].split(" ").map { |a| a << "*" }.join(" ")
Is this a good approach, or can I do it with analyzers, etc.?
Here's what I did with analyzers to search on business names when I used ElasticSearch. Place this inside your mapping block and modify the index appropriately -- I think this will give you what you want:
indexes :name, :type => 'multi_field', :fields => {
  :name         => { :type => 'string', :analyzer => 'standard' },
  :"name.exact" => { :type => 'string', :index => :not_analyzed }
}
Then inside your search and query blocks, something like:
search do
  query do
    # either a must match for exact match
    boolean(:minimum_number_should_match => 1) do
      must { string "name:#{<variable>}" }
    end
    # or a broader match
    string "name:#{<variable>}*"
  end
end
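For the multi-word case in the question ("alex ivan" matching "Alexander Ivanov"), the broader match could presumably be built from the raw input instead of the quick hack above (a sketch; term is assumed to hold the user's search string):
# turn "alex ivan" into "name:alex* AND name:ivan*" so each word
# has to match as a prefix against the analyzed name field
prefix_query = term.split.map { |word| "name:#{word}*" }.join(" AND ")

search do
  query { string prefix_query }
end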

How to make fields on my model not searchable, but still available in the _source?

I am using the tire gem for ElasticSearch in Rails.
OK, so I have been battling with this the whole day and this is how far I have got. I would like to make some fields on my model not searchable, but they should still be available in the _source so I can use them for sorting the search results.
My mappings:
mapping do
  indexes :created_at, :type => 'date', :index => :not_analyzed
  indexes :vote_score, :type => 'integer', :index => :not_analyzed
  indexes :title
  indexes :description
  indexes :tags
  indexes :answers do
    indexes :description
  end
end
My to_indexed_json method:
def to_indexed_json
  {
    vote_score: vote_score,
    created_at: created_at,
    title: title,
    description: description,
    tags: tags,
    answers: answers.map { |answer| answer.description }
  }.to_json
end
My Search query:
def self.search(term = '', order_by, page: 1)
  tire.search(page: page, per_page: PAGE_SIZE, load: true) do
    query { term.present? ? string(term) : all }
    sort {
      by case order_by
         when LAST_POSTED then { created_at: 'desc' }
         else { vote_score: 'desc', created_at: 'desc' }
         end
    }
  end
end
The only issue I am battling with now is how to make the vote_score and created_at fields not searchable, but still be able to use them for sorting when searching.
I tried indexes :created_at, :type => 'date', :index => :no but that did not work.
If I understand you, you are not specifying a field when you send your search query to elasticsearch. This means it will be executed against the _all field. This is a "special" field that makes elasticsearch a little easier to get started with. By default all fields are indexed twice, once in their own field, and once in the _all field. (You can even have different mappings/analyzers applied to these two indexings.)
I think setting the field's mapping to "include_in_all": "false" should work for you (remove the "index": "no" part). Now the field will be tokenized (and you can search with it) under its field name, but when directing a search at the _all field it won't affect results (as none of its tokens are stored in the _all field).
Have a read of the ES docs on mappings; scroll down to the parameters for each type.
Good luck!
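In Tire, that suggestion would presumably look something like the following in the question's mapping block (a sketch, not tested; :include_in_all is passed straight through to the ElasticSearch mapping):
mapping do
  # excluded from _all, so an unqualified query string ignores them,
  # but they are still indexed and usable for sorting
  indexes :created_at, :type => 'date',    :include_in_all => false
  indexes :vote_score, :type => 'integer', :include_in_all => false
  indexes :title
  indexes :description
  indexes :tags
  indexes :answers do
    indexes :description
  end
end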
I ended up going with the approach of only matching on the fields I want and that worked. This matches on multiple fields.
tire.search(page: page, per_page: PAGE_SIZE, load: true) do
  query { term.present? ? match([:title, :description, :tags, :answers], term) : all }
  sort {
    by case order_by
       when LAST_POSTED then { created_at: 'desc' }
       else { vote_score: 'desc', created_at: 'desc' }
       end
  }
end

not_analyzed is not working as expected

Mapping:
include Tire::Model::Search

mapping do
  indexes :name, :boost => 10
  indexes :account_id
  indexes :company_name
  indexes :email, :index => :not_analyzed
end

def to_indexed_json
  to_json(:only => [:name, :account_id, :email, :company_name])
end
From the above mapping it can be seen that the email field is set to not_analyzed (not broken into tokens). I have a user with the email vamsikrishna@gmail.com.
Now when I search for vamsikrishna, the result is showing the user... I guess it is using the default analyzer. Why?
But it should only be shown when the complete email is specified, I guess (vamsikrishna@gmail.com). Why is the :not_analyzed not considered in this case? Please help.
I need only the email field to be set as not_analyzed; the other fields should use the standard analyzer (which is the default).
You are searching using the _all field. That means you are using the analyzer specified for _all, not for email. Because of this, the analyzer specified for email doesn't affect your search.
There are a couple of ways to solve this issue. First, you can modify the analyzer for the _all field to treat emails differently. For example, you can switch to the uax_url_email tokenizer, which works like the standard tokenizer but doesn't split emails into tokens.
curl -XPUT 'http://localhost:9200/test-idx' -d '{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "default": {
            "type": "custom",
            "tokenizer": "uax_url_email",
            "filter": ["standard", "lowercase", "stop"]
          }
        }
      }
    }
  }
}
'
The second way is to exclude the email field from _all and have your query search against both fields at the same time.
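With Tire, that second option might look roughly like this (a sketch only; User is a placeholder for the actual model, and the field names are taken from the question):
mapping do
  indexes :name, :boost => 10
  indexes :account_id
  indexes :company_name
  # keep the raw email out of _all; it stays searchable via its own field
  indexes :email, :index => :not_analyzed, :include_in_all => false
end

# query name and email explicitly instead of relying on _all;
# the exact, not_analyzed email still matches only as a whole value
User.tire.search do
  query { string "vamsikrishna", :fields => ["name", "email"] }
end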
Try :analyzer => 'keyword' instead of :index => :not_analyzed.
The keyword analyzer indexes the string as a single token, so it will be searchable only as a whole.
Don't forget to reindex!
Ref - http://www.elasticsearch.org/guide/reference/index-modules/analysis/keyword-analyzer.html
And still, if you are getting results when searching for vamsikrishna, check whether you have other searchable fields with the same value (e.g. name / company).
You're right, you should search for the whole field content in order to have a match on it if the specific field is not analyzed.
There are two options:
The mapping hasn't been submitted correctly. You can check your current mapping through the get mapping API: 'localhost:9200/_mapping' will give you the mapping of all your indexes. Not a tire expert, but shouldn't you provide not_analyzed as a string, 'not_analyzed', instead of :not_analyzed?
If you see that your mapping is there, that means you are searching on some other field that matches. Are you specifying the name of the field in your query?

Elastic Search nested

I'm using ElasticSearch through the tire gem.
Given this structure to index my resource model:
mapping do
  indexes :_id
  indexes :version, analyzer: 'snowball', boost: 100
  indexes :resource_files do
    indexes :_id
    indexes :name, analyzer: 'snowball', boost: 100
    indexes :resource_file_category do
      indexes :_id
      indexes :name, analyzer: 'snowball', boost: 100
    end
  end
end
How can I retrieve all the resources that have resource_files with a given resource_file_category id?
I've looked in the ElasticSearch docs and I think I could use the has_child filter:
http://www.elasticsearch.org/guide/reference/query-dsl/has-child-filter.html
I've tried this:
filter :has_child, :type => 'resource_files', :query => {:filter => {:has_child => {:type => 'resource_file_category', :query => {:filter => {:term => {'_id' => params[:resource_file_category_id]}}}}}}
but I'm not sure if it is possible/valid to make a "nested has_child filter", or if there is a better/simpler way to do this... any advice is welcome ;)
I'm afraid I don't know what your mapping definition means. It'd be easier to read if you just posted the output of:
curl -XGET 'http://127.0.0.1:9200/YOUR_INDEX/_mapping?pretty=1'
But you probably want something like this:
curl -XGET 'http://127.0.0.1:9200/YOUR_INDEX/YOUR_TYPE/_search?pretty=1' -d '
{
  "query": {
    "term": {
      "resource_files.resource_file_category._id": "YOUR VALUE"
    }
  }
}
'
Note: The _id fields should probably be mapped as {"index": "not_analyzed"} so that they don't get analyzed, but instead store the exact value. Otherwise if you do a term query for 'FOO BAR' the doc won't be found, because the actual terms that are stored are: ['foo','bar']
Note: The has_child query is used to search for parent docs that have child docs (i.e. docs which specify a parent type and ID) matching certain search criteria.
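Translated back into the question's Tire DSL, the suggested query could presumably be written as follows (a sketch; as noted above, the _id would need to be not_analyzed for an exact term match):
tire.search do
  query do
    # exact match on the nested category id
    term 'resource_files.resource_file_category._id', params[:resource_file_category_id]
  end
end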
The dot operator can be used to access nested data.
You can try something like this:
curl -XGET 'http://localhost:port/INDEX/TYPE/_search?pretty=1' -d '
{
  "query": {
    "match": {
      "resource_files.resource_file_category.name": "VALUE"
    }
  }
}'
If resource_file_category is not_analyzed, the value is not tokenized and is stored as a single value, hence giving you an exact match.
You can also use the elasticsearch-head plugin for data validation and as a query-building reference.
https://www.elastic.co/guide/en/elasticsearch/reference/1.4/modules-plugins.html or
https://mobz.github.io/elasticsearch-head/

ElasticSearch and tire/Rails: use two fields for a single facet

Using Elasticsearch with Rails 3 and tire gem.
I have got facets to work on a couple of fields, but I now have a special requirement and am not sure it is possible.
I have two fields on my model Project that both store the same kind of values: Country1 and Country2.
The user is allowed to store up to two countries for a project. The drop down menus on both are the same. Neither field is required.
What I would like is a single facet that 'merges' the values from Country1 and Country2 and would handle clicking on those facets intelligently (i.e. would find it whether it was in 1 or 2)
Here's my model so far: (note Country1/2 can be multiple words)
class Project < ActiveRecord::Base
  mapping do
    indexes :id
    indexes :title, :boost => 100
    indexes :subtitle
    indexes :country1, :type => 'string', :index => 'not_analyzed'
    indexes :country2, :type => 'string', :index => 'not_analyzed'
  end

  def self.search(params)
    tire.search(load: true, page: params[:page], per_page: 10) do
      query do
        boolean do
          must { string params[:query], default_operator: "AND" } if params[:query].present?
          must { term :country1, params[:country] } if params[:country].present?
        end
      end
      sort { by :display_type, "desc" }
      facet "country" do
        terms :country1
      end
    end
  end
end
Any tips greatly appreciated!
This commit https://github.com/karmi/tire/commit/730813f in Tire brings support for aggregating over multiple fields in the "terms" facet.
The interface is:
Tire.search('articles-test') do
  query { string 'foo' }

  # Pass fields as an array, not a string
  facet('multi') { terms ['bar', 'baz'] }
end
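Applied to the question's model, the facet would then presumably become (assuming a Tire version that includes the commit above):
facet('country') do
  # aggregate both country fields into a single facet
  terms ['country1', 'country2']
end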
According to the elasticsearch docs for the terms facet, http://www.elasticsearch.org/guide/reference/api/search/facets/terms-facet.html, this should be possible:
Multi Fields:
The term facet can be executed against more than one field, returning
the aggregation result across those fields. For example:
{
  "query": {
    "match_all": {}
  },
  "facets": {
    "tag": {
      "terms": {
        "fields": ["tag1", "tag2"],
        "size": 10
      }
    }
  }
}
Did you try providing an array of fields to the terms facet, like terms [:country1, :country2]?
This seems to work but I need to test it more: facet('country') { terms fields: [:country1, :country2]}
