Elasticsearch Term Filter Slow - ruby-on-rails

We're running a 2-node Elasticsearch cluster that currently holds 2 indexes (750k docs and 11.1 million docs), and it's performing beautifully.
We're now trying to add a new index with 35.4 million docs, and search performance is slow: a term filter takes about 2 seconds to return.
Mapping:
tire do
  mapping _routing: { required: true, path: :order_id } do
    indexes :id, type: 'string', index: :not_analyzed
    indexes :order_id, type: 'string', index: :not_analyzed
    [:first_name, :last_name, :company_name, :title, :email, :city, :state_region_province, :postal_code].each do |attribute|
      indexes attribute, type: 'string', analyzer: 'keyword'
    end
    indexes :metadata, type: 'string'
    indexes :clicks, type: 'integer', index: :not_analyzed, include_in_all: false
    indexes :view_count, type: 'integer', index: :not_analyzed, include_in_all: false
    indexes :sender, type: 'boolean', index: :not_analyzed, include_in_all: false
    indexes :bounced, type: 'boolean', index: :not_analyzed, include_in_all: false
    indexes :unsubscribed, type: 'boolean', index: :not_analyzed, include_in_all: false
  end
end
Searching:
Model.tire.search(load: true, page: page, per_page: per_page, routing: order_id) do |search|
  search.query do
    match :metadata, query, type: 'phrase_prefix', max_expansions: 10
  end if query.present?
  search.filter :term, order_id: order_id
  search.filter :term, sender: false
end
The search I'm doing is just specifying an order_id to filter on; it takes about 2 seconds to return results. How do I speed this up?
Edit:
I'm now indexing the user_id and using it as the routing path. I've created a new index with 30 shards to test overallocation.
Edit 2:
With 30 shards, the index is more performant but still takes over a second to return data on the first query. I'm not sure how to speed this up more or what I'm doing wrong.

What happens if you switch the order_id field from not_analyzed to the keyword analyzer? From:
indexes :order_id, type: 'string', index: :not_analyzed
to:
indexes :order_id, type: 'string', analyzer: 'keyword'
The docs say:
An analyzer of type keyword that “tokenizes” an entire stream as a single token. This is useful for data like zip codes, ids and so on.
It seems like that'd apply to an order_id.

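If you are not using facets with your query, I would suggest converting your query into a filtered query and moving the term filters from the top level into the filter part of the filtered query. Top-level filters are applied after the query has run (they exist so that facets can still see the unfiltered results), whereas filters inside a filtered query narrow the document set before anything is scored. See also Performance of elastic queries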
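As a rough illustration of that suggestion, here is a minimal sketch of the request body a filtered query would send, written as a plain Ruby Hash so it does not depend on Tire. The method name filtered_order_search and its parameters are hypothetical; the field names come from the question. It assumes an Elasticsearch version that still supports the filtered query (it was deprecated in 2.0 and removed in 5.0, where a bool query with a filter clause plays the same role).

```ruby
require 'json'

# Hypothetical helper: builds a "filtered" query body as a plain Hash.
# Both term filters live inside the filtered query instead of at the top level.
def filtered_order_search(order_id, text = nil)
  # Only score documents when there is text to match; otherwise match everything.
  inner_query =
    if text.nil? || text.strip.empty?
      { match_all: {} }
    else
      { match: { metadata: { query: text, type: 'phrase_prefix', max_expansions: 10 } } }
    end

  {
    query: {
      filtered: {
        query: inner_query,
        filter: {
          bool: {
            must: [
              { term: { order_id: order_id } },
              { term: { sender: false } }
            ]
          }
        }
      }
    }
  }
end

puts JSON.pretty_generate(filtered_order_search('1234', 'acme'))
```

The resulting Hash could be serialized and sent as the body of a _search request, with the same routing value the question already passes.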

Related

Preventing and excluding certain attributes from being indexed in Elasticsearch using Rails

I am trying to index a particular model using Ruby on Rails and Elasticsearch, but even with index: false the email still shows up in the index. I do not want the email to be indexed; what else can I do to keep the email attribute out of the index?
mappings dynamic: false do
  indexes :author, type: 'text'
  indexes :title, type: 'text'
  indexes :email, index: false
end
I then produce the index and import records using:
Book.__elasticsearch__.create_index!
followed by Book.import force: true

Get all records without children even if they exist in Elasticsearch

I am using the elasticsearch-rails gem to run Elasticsearch queries.
I have mappings like this
mapping do
  indexes :id, type: 'keyword'
  indexes :name, type: 'text'
  indexes :slug, type: 'keyword'
  indexes :variants, type: 'nested' do
    indexes :id, type: 'keyword'
    indexes :sku, type: 'keyword'
    indexes :main, type: 'boolean'
    indexes :price, type: 'float'
  end
end
def as_indexed_json(options = {})
  as_json(only: ['id', 'name', 'slug'],
          include: {
            variants: {
              only: ['id', 'sku', 'main', 'price']
            }
          })
end
What I am trying to do is get all "parent" elements, but without their "variants". That is, I want to get all products without variants, even if they have some.
I want this because when a product has a large number of variants (e.g. a product with 2.5k variants), Elasticsearch returns the whole "collection" (the product and all of its variants), and listing 20 products with 2k variants each is going to take forever.
I am retrieving all products with a simple Product.__elasticsearch__.search('*') query.
Regards.
To avoid transferring the full source of every document, the Elasticsearch option to use is "_source": https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-source-field.html
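A minimal sketch of what source filtering could look like for the products case, again as a plain Ruby Hash. The helper name products_without_variants is hypothetical; the field name variants comes from the mapping in the question. The assumption here is that the filtering is done per search request through the _source key of the request body, rather than by changing the mapping.

```ruby
require 'json'

# Hypothetical helper: a search body that matches every product but leaves
# the nested "variants" array out of the _source returned with each hit.
def products_without_variants(size = 20)
  {
    size: size,
    query: { match_all: {} },
    _source: { excludes: ['variants'] }
  }
end

puts JSON.generate(products_without_variants)
```

With elasticsearch-model, a Hash like this can be passed to the model's search method in place of the '*' query string, for example Product.search(products_without_variants). The variants stay indexed and searchable; they are only omitted from the response.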

Best way to search for human names with elasticsearch rails?

I have the following setup in my Rails 4 application via the elasticsearch-rails gem. If I search for a technical term such as "appellate" in the description field (which may be repeated more than once), I get back search results. But if I search for a name, whether first, last, or first and last, I seem to get nothing. Am I supposed to do something for string type fields? (keeping in mind that description is a text field in Rails and first and last names are string fields in Rails) Keep in mind that a first name such as "John" only appears in the first name field (last name is analogous to this), so I wonder if that is part of the issue?
class Trial < ActiveRecord::Base
  include ActiveModel::Serialization
  include Elasticsearch::Model
  include Elasticsearch::Model::Callbacks
  index_name [Rails.env, model_name.collection.gsub(/\//, '-')].join('_')
  settings index: { number_of_shards: 2 } do
    mappings dynamic: 'false' do
      indexes :first_name, analyzer: 'english', index_options: 'offsets', copy_to: 'full_name'
      indexes :last_name, analyzer: 'english', index_options: 'offsets', copy_to: 'full_name'
      indexes :email, analyzer: 'english', index_options: 'offsets'
      indexes :description, analyzer: 'english', index_options: 'offsets'
      indexes :full_name, analyzer: 'english', index_options: 'offsets'
    end
  end
  def self.search_trials(search_terms)
    response = ClinicalTrial.search(
      size: 20,
      query: {
        multi_match: {
          "query" => search_terms,
          "type" => "cross_fields",
          "fields" => "_all"
        }
      }
    )
    response.records
  end
end

How to perform ElasticSearch query on records from only a certain user (Rails Tire gem)

I have a Mongoid model which I perform ElasticSearch search queries on. Very standard:
class ActivityLog
  include Mongoid::Document
  include Mongoid::Timestamps
  include Tire::Model::Search
  include Tire::Model::Callbacks
  field :extra, type: Hash
  belongs_to :user, :inverse_of => :activity_logs
  # The user argument is not used yet; this is the part the question asks about.
  def self.search(params, user)
    tire.search(load: true, page: params[:page], per_page: 5) do
      query { string params[:query], default_operator: "AND" } if params[:query].present?
      sort { by :created_at, "desc" }
    end
  end
end
I am having a hard time understanding the documentation on how to do more advanced things, and I'm currently stuck on how to restrict the search to ActivityLog objects that belong to the given user only.
Could anyone show me how to work the user._id match requirement into the search query?
Working with ElasticSearch involves two parts: mapping and search.
So you want to search by a field of an associated table (the users table). There are many ways to do that, but this is the one I often use in my projects:
Inside the ActivityLog model
mapping do
  indexes :id, type: 'integer'
  indexes :name, type: 'string', analyzer: 'snowball', boost: 5
  indexes :description, type: 'string', analyzer: 'snowball'
  indexes :description_latin, as: 'description.sanitize', type: 'string', analyzer: 'snowball'
  indexes :user_id, as: 'user.id', type: 'string', index: :not_analyzed
  indexes :user_name, as: 'user.name', type: 'string'
  indexes :created_at, type: 'date'
  indexes :slug, index: :not_analyzed
  indexes :publish, type: 'boolean', index: :not_analyzed
end
Note the user_id and user_name fields: the mapping definition above maps user.id to user_id and user.name to user_name.
So now, in the search method, you can write something like:
filter :terms, user_id: params[:search][:user_id]
Hope this helps.
One correction to the above: Elasticsearch has since dropped the string type. Use text for analyzed fields and keyword for the ones mapped as not_analyzed.
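To make the filter step more concrete, here is a minimal sketch of the request body such a search could produce, as a plain Ruby Hash independent of Tire. The helper names user_filter and activity_log_search are hypothetical. It assumes user_id is indexed as an untokenized string as in the mapping above, and it keeps the filter at the top level the way the answer's snippet does (newer Elasticsearch versions call that key post_filter).

```ruby
require 'json'

# Hypothetical helper: `term` takes a single value, `terms` takes an array.
# Ids are converted to strings because user_id is mapped as a string field.
def user_filter(user_ids)
  ids = Array(user_ids).map(&:to_s)
  ids.length == 1 ? { term: { user_id: ids.first } } : { terms: { user_id: ids } }
end

# Hypothetical helper: the full body, mirroring the question's query and sort.
def activity_log_search(user_ids, text = nil)
  body = { filter: user_filter(user_ids), sort: [{ created_at: 'desc' }] }
  # Add the query_string part only when the user actually typed something.
  body[:query] = { query_string: { query: text, default_operator: 'AND' } } if text && !text.empty?
  body
end

puts JSON.pretty_generate(activity_log_search('51f7c2', 'login'))
```

In the question's model, the single-user case corresponds to passing user.id.to_s as the value of the user_id filter.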

Sorting by number of nested elements with ElasticSearch and Tire

Given the following mapping on a model called Post, is it possible to build a query that returns posts ordered by the number of votes cast within a specific time range (eg. past 7 days)?
mapping do
  indexes :id, type: 'integer'
  indexes :user_id, type: 'integer'
  indexes :name, boost: 10
  indexes :body # analyzer: 'snowball'
  indexes :created_at, type: 'date'
  indexes :vote_count, type: 'integer'
  indexes :topic_ids
  indexes :topics do
    indexes :id, type: 'integer'
    indexes :name, type: 'string'
  end
  indexes :votes do
    indexes :user_id, type: 'integer'
    indexes :created_at, type: 'date'
  end
end
I'm using Tire in Rails 3.2.13. I have a Post model and a Vote model. A Post has many votes, and a Vote belongs to Post.
Yes, you can either sort on the vote_count attribute, or use the custom_score query for fine-grained relevancy computation.
For the former, something like:
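require 'tire'
s = Tire.search do
  query do
    filtered do
      query { all }
      # Date math needs the "now" anchor; this restricts by the post's own created_at.
      filter :range, 'created_at' => { from: 'now-7d' }
    end
  end
  sort do
    # Sort order defaults to ascending, so ask for the most-voted posts first.
    by :vote_count, 'desc'
  end
end
# Print the equivalent curl command for inspection.
puts s.to_curl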
