ElasticSearch sorting - ruby-on-rails

ElasticSearch Version: 1.3.2
I am trying to sort a simple collection, but no matter what I try the sort just seems to be ignored...
{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "bool": {
          "must": [
            {
              "terms": {
                "status": [
                  "active",
                  "featured"
                ]
              }
            }
          ]
        }
      }
    }
  },
  "sort": [
    {
      "price_cents": {
        "order": "asc"
      }
    }
  ]
}
I've noticed in my mapping I have auto_boost = true
{
  "items" : {
    "mappings" : {
      "item" : {
        "dynamic" : "false",
        "_all" : {
          "auto_boost" : true
        },
        "properties" : {
          "price_cents" : {
            "type" : "integer"
          },
          "status" : {
            "type" : "string"
          },
          "title" : {
            "type" : "string",
            "boost" : 10.0,
            "analyzer" : "snowball"
          }
        }
      }
    }
  }
}
This attribute has been added automatically by the https://github.com/elasticsearch/elasticsearch-rails gem, which I use:
mappings :dynamic => false do
  indexes :title, :analyzer => 'snowball', :boost => 10.0
  indexes :status
  indexes :price_cents, :type => :integer, :index => 'not_analyzed'
end
I wonder, is "auto_boost": true the reason the sort is ignored? I can't find the correct way to turn it off and check...
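For what it's worth, the gem's mappings call accepts top-level options alongside the block, so one way to try switching it off might look like the sketch below (untested, and the index would most likely have to be re-created for the change to take effect):

# Sketch only: pass the _all options together with :dynamic and let the gem
# merge them into the type mapping; verify the resulting mapping afterwards.
mappings :dynamic => false, :_all => { :auto_boost => false } do
  indexes :title, :analyzer => 'snowball', :boost => 10.0
  indexes :status
  indexes :price_cents, :type => :integer, :index => 'not_analyzed'
end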

I found the issue. It is kind of related to the current bug with .records (https://github.com/elasticsearch/elasticsearch-rails/issues/206); because it does not paginate correctly, I temporarily used this construction:
@paginates_items = Item.search(params).page(params[:page]).per(60).results
item_ids = @paginates_items.collect(&:id)
@items = Item.where(:id => item_ids)
but I forgot that where(:id => item_ids) completely breaks the order...
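Since where(:id => item_ids) returns rows in whatever order the database chooses, the ordering from Elasticsearch has to be re-applied by hand. A minimal sketch:

# Load the records, then put them back into the order Elasticsearch returned.
# Ids coming back from Elasticsearch results are strings, hence the to_i.
items_by_id = Item.where(:id => item_ids).index_by(&:id)
@items = item_ids.map { |id| items_by_id[id.to_i] }.compact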

Related

logstash change type format

I have a RoR application with an admin dashboard where an admin can observe the locations of his employees. I use ELK to gather employee information containing latitude and longitude, which is sent to my map as they move. My problem: I have a template, and Logstash creates a daily index based on that template, but recently I found that every field in my index has its type changed to text when the index is created.
this is my json that logstash reads:
{"driver_id": 31,"driver_email": "ankith.ravindran@mailinator.com","location": {"latitude": "-35.2824767","longitude": "149.1326453"},"created_at": "2021-06-29 14:28:47", "required_matches": 1, "type": "location"}
this is my logstash.conf file:
input {
  file {
    path => ["/usr/share/logstash/MPD_LOCATION/*",
             "/usr/share/logstash/MPD_LOCATION/*/*",
             "/usr/share/logstash/MPD_LOCATION/*/*/*",
             "/usr/share/logstash/MPD_LOCATION/*/*/*/*",
             "/usr/share/logstash/MPD_LOCATION/*/*/*/*/*"]
    start_position => "beginning"
    type => "json"
    sincedb_path => "/dev/null"
  }
}
filter {
  mutate {
    gsub => ["message","/}+({)/", "}::{"]
  }
  mutate {
    gsub => ["message","/}+( )/", "}::"]
  }
  split {
    field => "message"
    terminator => "::"
  }
  json { source => "message" }
  mutate {
    add_field => { "uuid" => "D%{driver_id}T%{created_at}" }
    rename => {
      "[location][latitude]" => "[location][lat]"
      "[location][longitude]" => "[location][lon]"
    }
    convert => {
      "[location][lat]" => "float"
      "[location][lon]" => "float"
    }
  }
}
output {
  if ([type] == "location") {
    elasticsearch {
      hosts => "http://elasticsearch:9200"
      index => "live_locations_%{+YYYY_MM_dd}"
      # manage_template => true
      template => "/usr/share/logstash/Template/live_locations.json"
      template_name => "live_locations"
      # template_overwrite => true
      document_id => "%{uuid}"
    }
  } else if ([type] == "app_info") {
    elasticsearch {
      hosts => "http://elasticsearch:9200"
      index => "app_info_%{+YYYY_MM_dd}"
      document_id => "%{uuid}"
    }
  }
  stdout { codec => rubydebug }
}
this is my template file:
{
  "settings": {
    "index": {
      "number_of_shards": 5,
      "number_of_replicas": 1
    }
  },
  "mappings": {
    "properties": {
      "driver_id": { "type": "integer" },
      "email": { "type": "text" },
      "location": { "type": "geo_point" },
      "app-platform": { "type": "text" },
      "app-version": { "type": "text" },
      "created_at": { "type": "date", "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis" },
      "required_matches": { "type": "integer" }
    }
  }
}
For example, I defined the type of created_at as date, but when the index is created the field comes back as text and I can't understand what happened; the location field comes back as float, so I can't use my index as geo_point. I should add that I use ELK version 7.13, running on Docker.
Update: I have two types of JSON; one of them just returns the location of the employee, and the other just returns the app_version and app_platform that the employee used.
Update 2: I changed my input from Logstash to Filebeat, but I still have the same problem.
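One thing worth double-checking here (an assumption, not something stated in the post): Logstash's template => option uploads the file through the legacy _template API, and in Elasticsearch 7.x such a template must contain an index_patterns list. The file above has none, so it may never be installed, and dynamic mapping would then pick text/float for the fields. A sketch of the template with a pattern matching the daily indices (remaining field mappings as in the original file):

{
  "index_patterns": ["live_locations_*"],
  "settings": {
    "index": {
      "number_of_shards": 5,
      "number_of_replicas": 1
    }
  },
  "mappings": {
    "properties": {
      "location": { "type": "geo_point" },
      "created_at": { "type": "date", "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis" }
    }
  }
}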

Failed to parse date field [0] with format [MMM, YY] with elastic search 5.0

I am trying to get the date parsed into a string with a month and two-digit year format, like "JAN, 92". My query is below:
{
  "size" => 0,
  "query" => {
    "bool" => {
      "must" => [
        {
          "term" => {
            "checkin_progress_for" => {
              "value" => "Goal"
            }
          }
        },
        {
          "term" => {
            "goal_owner_id" => {
              "value" => "#{current_user.access_key}"
            }
          }
        }
      ]
    }
  },
  "aggregations" => {
    "chekins_over_time" => {
      "range" => {
        "field" => "checkin_at",
        "format" => "MMM, YY",
        "ranges" => [
          {
            "from" => "now-6M",
            "to" => "now"
          }
        ]
      },
      "aggs" => {
        "checkins_monthly" => {
          "date_histogram" => {
            "field" => "checkin_at",
            "format" => "MMM, YY",
            "interval" => "month",
            "min_doc_count" => 0,
            "missing" => 0,
            "extended_bounds" => {
              "min" => "now-6M",
              "max" => "now"
            }
          }
        }
      }
    }
  }
}
It throws the following error:
elasticsearch.transport.RemoteTransportException: [captia-america][127.0.0.1:9300][indices:data/read/search[phase/query]]
Caused by: elasticsearch.ElasticsearchParseException: failed to parse date field [0] with format [MMM, YY]
If I remove the "MMM, YY" format and use the normal date format, it works.
What could be the solution to rectify this? Help appreciated.
Your checkins_monthly aggregation is a bit wrong. The missing parameter should be a date in the same format, to be used when the field has no value; 0 is not actually a date.
For example:
"aggs": {
  "checkins_monthly": {
    "date_histogram": {
      "field": "checkin_at",
      "format": "MMM, YY",
      "interval": "month",
      "min_doc_count": 0,
      "missing": "Jan, 17",
      "extended_bounds": {
        "min": "now-6M",
        "max": "now"
      }
    }
  }
}
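For reference, in the Ruby hash style of the original query the corrected part would look roughly like this (only the missing value changes; "Jan, 17" is just a placeholder in the same format):

"aggs" => {
  "checkins_monthly" => {
    "date_histogram" => {
      "field" => "checkin_at",
      "format" => "MMM, YY",
      "interval" => "month",
      "min_doc_count" => 0,
      "missing" => "Jan, 17", # a date in the same "MMM, YY" format instead of 0
      "extended_bounds" => {
        "min" => "now-6M",
        "max" => "now"
      }
    }
  }
}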

Exclude nil values from ElasticSearch Aggregation

I was using this query to retrieve the most significant values:
keywords = Answer.search(
  :size => 5,
  :query => {
    :match => {
      :question_id => 32481
    }
  },
  :aggregations => {
    :keywords => {
      :significant_terms => {
        :field => 'text'
      }
    }
  }
)
The field is :text, but it has nil values, so the answer is always:
2.1.2 :135 > keywords.map(&:text)
=> [nil, nil, nil, nil, nil]
I tried to add a filter, as the documentation suggests, but it gives me a parse error:
keywords = Answer.search(
  :size => 5,
  :query => {
    :match => {
      :question_id => 32481
    },
    :filtered => {
      :filter => {
        :exists => { :field => 'text' }
      }
    }
  },
  :aggregations => {
    :keywords => {
      :significant_terms => {
        :field => 'text'
      }
    }
  }
)
I've tried many combinations, with no success. How can I get only the valid text answers?
I believe your ES query should translate to something like this:
"size": 5,
"query": {
  "filtered": {
    "query": { "match": { "question_id" : 32481 } },
    "filter": {
      "exists": {
        "field": "text"
      }
    }
  }
},
"aggs": {
  "keywords": {
    "significant_terms": {
      "field": "text"
    }
  }
}
meaning your "question_id" "match" should be enclosed in the "filtered" element.
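In the Ruby hash style used in the question, that would look roughly like this (a sketch assuming the same Answer.search interface; only the query part changes):

keywords = Answer.search(
  :size => 5,
  :query => {
    :filtered => {
      # the match moves inside filtered's query, next to the exists filter
      :query  => { :match => { :question_id => 32481 } },
      :filter => { :exists => { :field => 'text' } }
    }
  },
  :aggregations => {
    :keywords => {
      :significant_terms => { :field => 'text' }
    }
  }
)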

Elasticsearch and Rails: Using ngram to search for part of a word

I am trying to use the Elasticsearch gem in my project. As I understand it, there is no need for the Tire gem anymore, or am I wrong?
In my project I have a search (obviously), which currently applies to one model. Now I am trying to avoid wildcards, since they don't scale well, but I can't seem to get the ngram analyzers to work properly. If I search for whole words the search still works, but not for parts of them.
class Pictures < ActiveRecord::Base
  include Elasticsearch::Model
  include Elasticsearch::Model::Callbacks

  settings :analysis => {
    :analyzer => {
      :my_index_analyzer => {
        :tokenizer => "keyword",
        :filter => ["lowercase", "substring"]
      },
      :my_search_analyzer => {
        :tokenizer => "keyword",
        :filter => ["lowercase", "substring"]
      }
    },
    :filter => {
      :substring => {
        :type => "nGram",
        :min_gram => 2,
        :max_gram => 50
      }
    }
  } do
    mapping do
      indexes :title,
        :properties => {
          :type => "string",
          :index_analyzer => 'my_index_analyzer',
          :search_analyzer => "my_search_analyzer"
        }
    end
  end
end
Maybe somebody can give me a hint into the right direction.
I have given up on defining the schema in the model class. In fact, it does not make much sense either.
So here is what I have done: a schema/mapping definition in the db/ folder and a rake task to build it.
https://gist.github.com/geordee/9313f4867d61ce340a08
In the model
def as_indexed_json(options={})
  self.as_json(only: [:id, :name, :description, :price])
end
I'm using an index for suggestions based on edgeNGram (like nGram, but always starting at the left side of the word) with these settings:
{
  "en_suggestions": {
    "settings": {
      "index": {
        "analysis": {
          "filter": {
            "tpNGramFilter": {
              "min_gram": "4",
              "type": "edgeNGram",
              "max_gram": "50"
            }
          },
          "analyzer": {
            "tpNGramAnalyzer": {
              "type": "custom",
              "filter": [
                "tpNGramFilter"
              ],
              "tokenizer": "lowercase"
            }
          }
        }
      }
    }
  }
}
and this mapping:
{
  "en_suggestions": {
    "mappings": {
      "suggest": {
        "properties": {
          "proposal": {
            "type": "string",
            "analyzer": "tpNGramAnalyzer"
          }
        }
      }
    }
  }
}
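To illustrate how such a suggestion index could be queried (a sketch, not part of the original answer; the index and field names are taken from the settings and mapping above, and 'pict' stands in for whatever partial input the user has typed):

# Query the edgeNGram-analyzed field through the low-level client
# bundled with elasticsearch-model.
client = Elasticsearch::Model.client
response = client.search(
  index: 'en_suggestions',
  body: {
    query: { match: { proposal: 'pict' } }
  }
)
# Collect the suggested values from the hits.
response['hits']['hits'].map { |hit| hit['_source']['proposal'] }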

mongodb/rails "exception: can't find special index: 2d for:"

I have a Rails app where I have some problems with indexes. I search locations by name.
First I thought it was a problem with addresses.coords, but I am not sure about it.
The relevant parts of the search controller:
@practices = Practice.published
@practices = @practices.where(:"addresses.country" => params[:country].upcase) if params[:country].present?
if params[:location].present? && latlng = get_coordinates
  @practices = @practices.near_sphere(:"addresses.coords" => latlng).max_distance(:"addresses.coords" => get_distance)
end
# now find doctors based on resulting practices
@doctors = Doctor.published.in("organization_relations.practice_id" => @practices.distinct(:_id))
The complete crash log:
Moped::Errors::OperationFailure (The operation: #<Moped::Protocol::Command
  @length=255
  @request_id=646
  @response_to=0
  @op_code=2004
  @flags=[]
  @full_collection_name="um-contacts.$cmd"
  @skip=0
  @limit=-1
  @selector={:distinct=>"practices", :key=>"_id", :query=>{"deleted_at"=>nil, "published_at"=>{"$lte"=>2012-11-05 15:17:14 UTC}, "addresses.country"=>"DE", "addresses.coords"=>{"$nearSphere"=>[13.4060912, 52.519171], "$maxDistance"=>0.01569612305760477}}}
  @fields=nil>
failed with error 13038: "exception: can't find special index: 2d for: { deleted_at: null, published_at: { $lte: new Date(1352128634313) }, addresses.country: \"DE\", addresses.coords: { $nearSphere: [ 13.4060912, 52.519171 ], $maxDistance: 0.01569612305760477 } }"
See https://github.com/mongodb/mongo/blob/master/docs/errors.md
for details about this error.):
app/controllers/search_controller.rb:16:in `index'
That's the result of the indexes; I'm not sure how to query them for the addresses, which are embedded via embeds_many.
> db.practices.getIndexes()
[
  {
    "v" : 1,
    "key" : {
      "_id" : 1
    },
    "ns" : "um-contacts.practices",
    "name" : "_id_"
  }
]
Help would be really appreciated!
Edit: Looks like the indexes for addresses.coords aren't created:
db.system.indexes.find()
{ "v" : 1, "key" : { "_id" : 1 }, "ns" : "um-contacts.users", "name" : "_id_" }
{ "v" : 1, "key" : { "_id" : 1 }, "ns" : "um-contacts.doctors", "name" : "_id_" }
{ "v" : 1, "key" : { "_id" : 1 }, "ns" : "um-contacts.collaborations", "name" : "_i
{ "v" : 1, "key" : { "_id" : 1 }, "ns" : "um-contacts.practices", "name" : "_id_" }
but should be created within the practice class:
class Practice
  ...
  embeds_many :addresses, cascade_callbacks: true, as: :addressable
  ...
  field :name, type: String
  field :kind, type: String
  field :slug, type: String
  index({"addresses.coords" => '2d'}, { min: -180, max: 180, background: true })
  index({name: 1})
  index({slug: 1}, { unique: true })
  ...
Anyone have an idea why it's failing?
Try to re-create your indexes. For Mongoid:
rake db:mongoid:create_indexes
