Grouping results and summarizing a field in one query with Mongoid? - ruby-on-rails

I'm trying to run a fairly advanced query with Mongoid that gets metrics for a date range, groups them by day, and sums the values for each day. It should also tell me how many entries there are for each day.
I highly doubt this can be done with the active record part of Mongoid, but I don't know how to execute queries on the mongo driver directly.
My model:
class Metric
  include Mongoid::Document
  field :id_session, :type => Integer
  field :metric, :type => Integer
  field :value, :type => Integer
  field :date, :type => Date
  field :time, :type => Time
  validates_numericality_of :value
  validates_presence_of :id_session, :metric, :value
  before_create :set_date
  # Stamp the record with the current time and date before it is first saved.
  def set_date
    self.time = Time.now
    self.date = Date.today # Date has no .now method; Date.today gives the current date
  end
end
I've been able to get the results grouped by date simply by using Metric.distinct(:date), but I don't know how to do a sum and count of those results as I can't use that method on the results.
Any ideas? I prefer to stay within the Mongoid active record methods but if anyone knows how I can execute queries directly on the MongoDB driver that would help too.
Thanks!

Managed to get it working
result = Metric.collection.group(
  ['date'], nil, { :count => 0, :value => 0 },
  "function(x, y) { y.count++; y.value += x.value }"
)
Credits go to the answer on this page
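To make the reduce step concrete, here is a minimal plain-Ruby sketch of what that group call computes. It works on an in-memory array rather than on MongoDB, and the names Row and summarize_by_day are made up for illustration; they are not part of Mongoid or the driver.

```ruby
require 'date'

# Hypothetical stand-in for a Metric document; only the fields the reduce uses.
Row = Struct.new(:date, :value)

# Mirrors the reduce function: one accumulator per date, starting at
# { count: 0, value: 0 }, then incremented and summed for every row.
def summarize_by_day(rows)
  rows.each_with_object({}) do |row, acc|
    bucket = (acc[row.date] ||= { count: 0, value: 0 })
    bucket[:count] += 1
    bucket[:value] += row.value
  end
end

rows = [
  Row.new(Date.new(2012, 5, 16), 10),
  Row.new(Date.new(2012, 5, 16), 5),
  Row.new(Date.new(2012, 5, 17), 7)
]
summary = summarize_by_day(rows)
summary.each do |date, totals|
  puts "#{date}: #{totals[:count]} entries, sum #{totals[:value]}"
end
```

Each key of the resulting hash plays the role of one grouped document returned by the database.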

Related

How to make fields on my model not searchable but they should still be available in the _source?

I am using the tire gem for ElasticSearch in Rails.
Ok so I have been battling with this the whole day and this is how far I have got. I would like to make fields on my model not searchable but they should still be available in the _source so I can use them for sorting on the search result.
My mappings:
mapping do
  indexes :created_at, :type => 'date', :index => :not_analyzed
  indexes :vote_score, :type => 'integer', :index => :not_analyzed
  indexes :title
  indexes :description
  indexes :tags
  indexes :answers do
    indexes :description
  end
end
My to_indexed_json method:
def to_indexed_json
  {
    vote_score: vote_score,
    created_at: created_at,
    title: title,
    description: description,
    tags: tags,
    answers: answers.map { |answer| answer.description }
  }.to_json
end
My Search query:
def self.search(term='', order_by, page: 1)
  tire.search(page: page, per_page: PAGE_SIZE, load: true) do
    query { term.present? ? string(term) : all }
    sort {
      by case order_by
         when LAST_POSTED then {created_at: 'desc'}
         else {vote_score: 'desc', created_at: 'desc'}
         end
    }
  end
end
The only issue I am still battling with is how to make the vote_score and created_at fields not searchable while still being able to use them for sorting when I'm searching.
I tried indexes :created_at, :type => 'date', :index => :no but that did not work.
If I understand you correctly, you are not specifying a field when you send your search query to Elasticsearch. This means it will be executed against the _all field. This is a "special" field that makes it easier to get started with Elasticsearch quickly. By default all fields are indexed twice: once in their own field, and once in the _all field. (You can even have different mappings/analyzers applied to these two indexings.)
I think setting the field's mapping to "include_in_all": "false" should work for you (remove the "index": "no" part). The field will still be indexed (and you can search on it) under its own field name, but a search directed at the _all field won't be affected by it, as none of its tokens are stored in the _all field.
Have a read of the Elasticsearch docs on mappings and scroll down to the parameters for each type.
Good luck!
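As a rough sketch of the suggestion above, this is what the raw mapping sent to Elasticsearch might look like once include_in_all is switched off for the two sort-only fields. It assumes an Elasticsearch version that still has the _all field; the type name question and the helper build_properties are made up for illustration, and with Tire the same option would presumably be passed to indexes.

```ruby
require 'json'

# Fields that should stay sortable but must not contribute to the _all field.
SORT_ONLY = { created_at: 'date', vote_score: 'integer' }.freeze
# Fields that keep the default behaviour and remain searchable through _all.
SEARCHABLE = %i[title description tags].freeze

# Hypothetical helper that assembles the "properties" section of the mapping.
def build_properties
  props = {}
  SORT_ONLY.each { |name, type| props[name] = { type: type, include_in_all: false } }
  SEARCHABLE.each { |name| props[name] = { type: 'string' } }
  props
end

# "question" is an assumed type name, not taken from the original post.
mapping = { question: { properties: build_properties } }
puts JSON.pretty_generate(mapping)
```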
I ended up going with the approach of only matching on the fields I want and that worked. This matches on multiple fields.
tire.search(page: page, per_page: PAGE_SIZE, load: true) do
  query { term.present? ? (match [:title, :description, :tags, :answers], term) : all }
  sort {
    by case order_by
       when LAST_POSTED then {created_at: 'desc'}
       else {vote_score: 'desc', created_at: 'desc'}
       end
  }
end

Optimize query mongodb/mongoid

class Job
  field :occupation, :type => String
  field :experience, :type => String
end
In my api file:
get :searches do
  Cv.search({query: "*#{params[:q]}*"}).map { |cv| {id: cv.id, text: cv.occupation} }
end
This generates a JSON response:
[{"id":"513dbb61a61654a845000005","text":"industrial engineer"},{"id":"513a11d4a6165411b2000008","text":"javascript engineer"}]
I'm using MongoDB as the database and Mongoid as the ORM/ODM.
This works fine with 10, 100 or 1,000 results, but my question is whether it is possible to optimize the API query for large collections of 1,000,000 or 2,000,000 results.
Depending on your query, if you want to query on occupation you need to create an index on occupation
class Job
  include Mongoid::Document
  field :occupation, :type => String
  field :experience, :type => String
  index({ occupation: 1 })
end
MongoDB automatically creates an index on _id, so if you always include that in your query it will remain fast, unless your database indexes start exceeding available RAM.
Job.where(:occupation => "javascript engineer") will use the index

ElasticSearch filter to match a single date

I've been working with Elasticsearch for some time now and I've hit a roadblock where I have to search for events that match a particular start date (start_at). I've mapped my fields as
mapping do
  indexes :name, :type => 'string', :analyzer => 'snowball'
  indexes :description, :type => 'string', :analyzer => 'snowball'
  indexes :start_at, :type => 'date'
  indexes :end_at, :type => 'date'
  indexes :tag_list, :type => 'string', :analyzer => 'snowball'
  indexes :lat_lon, :type => 'geo_point'
  indexes :user_details, :type => 'string'
end
def to_indexed_json
  to_hash.merge({
    :user_details => (user ? user.to_index : nil),
    # map (not each) so the value is the indexed form of every artist
    :artist_details => (artists ? artists.map { |artist| artist.to_index } : nil),
    :primary_genre => (genre ? genre.name : nil),
    :lat_lon => [lat, lng].join(',')
  }).to_json
end
So when I run
Tire.search('events') do
  # ignore search query keywords
  filter range: {start_at: {gte: Date.today, lt: Date.tomorrow}}
end
it returns nothing, but it works fine with a single-bound range, that is:
Tire.search('events') do
  # ignore search query keywords
  filter range: {start_at: {gte: Date.today}}
end
I mapped start_at and end_at as dates so that Elasticsearch would not perform term matches on them, but something like this is not the answer either:
Tire.search('events') do
  query do
    string "start_at: #{Date.today}"
  end
end
Since this performs a string match, it returns all records: the tokenizer breaks the date into 2012, 05 and 16, and since 2012 and 16 may match in many places, everything matches.
I know I'm missing something very basic. I would appreciate any help on this.
Update
Event.find_all_by_start_at(Date.tomorrow + 1.day).size
Event Load (0.7ms) SELECT `events`.* FROM `events` WHERE `events`.`start_at` = '2012-05-19'
=> 1
So I have events for that day. Now when I run it with elastic search
ruby-1.9.2-p180 :024 > Tire.search('events') do
ruby-1.9.2-p180 :025 > filter :range, :start_at => {gte: Date.tomorrow + 1.days, lt: Date.tomorrow + 2.days}
ruby-1.9.2-p180 :026?> end
ruby-1.9.2-p180 :029 > x.to_curl
=> "curl -X GET \"http://localhost:9200/events/_search?pretty=true\" -d '{\"filter\":{\"range\":{\"start_at\":{\"gte\":\"2012-05-19\",\"lt\":\"2012-05-20\"}}}}'"
{"events":{"event":{"properties":{"allow_comments":{"type":"boolean"},"artist_details":{"type":"string"},"artist_id":{"type":"long"},"city":{"type":"string"},"comments_count":{"type":"long"},"confirm":{"type":"boolean"},"created_at":{"type":"date","format":"dateOptionalTime"},"description":{"type":"string","analyzer":"snowball"},"end_at":{"type":"string"},"event_attendees_count":{"type":"long"},"event_content_type":{"type":"string"},"event_file_name":{"type":"string"},"event_file_size":{"type":"long"},"genre_id":{"type":"long"},"hits":{"type":"long"},"id":{"type":"long"},"interview":{"type":"boolean"},"lat":{"type":"double"},"lat_lon":{"type":"geo_point"},"lng":{"type":"double"},"location":{"type":"string"},"name":{"type":"string","analyzer":"snowball"},"online_tix":{"type":"boolean"},"primary_genre":{"type":"string"},"private":{"type":"boolean"},"start_at":{"type":"string"},"state":{"type":"string"},"tag_list":{"type":"string","analyzer":"snowball"},"updated_at":{"type":"date","format":"dateOptionalTime"},"user_details":{"type":"string"},"user_id":{"type":"long"},"venue_id":{"type":"long"},"zip":{"type":"string"}}}}}
Elasticsearch tries to be flexible in handling mappings. At the same time, it has to deal with the limitations of the underlying search engine, Lucene. As a result, when an existing mapping contradicts an updated mapping, the new mapping is ignored. Another feature of Elasticsearch that probably played a role in this issue is automatic mapping creation based on the data. So, if you
1. created a new index,
2. indexed a record whose start_at field held a string containing a date in a format that Elasticsearch didn't recognize, and
3. updated the mapping, assigning type "date" to the start_at field,
you ended up with a mapping where the field start_at has type "string". The only way around it is to delete the index and specify the mapping before adding the first record.
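One way to confirm this diagnosis is to inspect the mapping that Elasticsearch actually holds. The sketch below parses a trimmed copy of the mapping output from the question and lists the fields that were expected to be dates but are not; the helper name non_date_fields is made up for illustration.

```ruby
require 'json'

# Trimmed copy of the mapping output shown in the question.
mapping_json = '{"events":{"event":{"properties":{' \
  '"created_at":{"type":"date","format":"dateOptionalTime"},' \
  '"end_at":{"type":"string"},' \
  '"start_at":{"type":"string"}}}}}'

# Hypothetical helper: returns the expected date fields whose mapped type
# is anything other than "date" (including fields missing from the mapping).
def non_date_fields(mapping, index, type, expected)
  props = mapping.fetch(index).fetch(type).fetch('properties')
  expected.reject { |field| props.dig(field, 'type') == 'date' }
end

mapping = JSON.parse(mapping_json)
bad = non_date_fields(mapping, 'events', 'event', %w[start_at end_at created_at])
puts "Fields not mapped as date: #{bad.join(', ')}"
```

Here start_at and end_at are reported, which matches the explanation above: the index kept the automatically created string mapping.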
It does not seem that you need a search query here; a filter is enough. Try something like this:
filter(:range, date: {
  to: params[:date],
  from: params[:date]
}) if params[:date].present?
where params[:date] should match the format:
>> Time.now.strftime('%F')
=> "2014-03-10"
The value can come from anywhere: it can be hard-coded or passed in as a parameter.
The fields :start_at and :end_at should be mapped as :type => 'date' (just as you have now); there is no need to change them to string or anything similar.
This approach works when the field is mapped as a date, and it should also be fine for datetime values, as Tire/Elasticsearch doesn't seem to distinguish between those two field types.
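For reference, here is a small plain-Ruby sketch of the request body for a single-day filter, written as a half-open range (from the day itself up to, but not including, the next day) like the one in the question. The helper name single_day_filter is made up; Tire would normally build this hash for you.

```ruby
require 'date'
require 'json'

# Hypothetical helper: builds the filter body for "field falls on this day".
# Dates are formatted with %F (YYYY-MM-DD), the format mentioned above.
def single_day_filter(field, day)
  {
    filter: {
      range: {
        field => { gte: day.strftime('%F'), lt: (day + 1).strftime('%F') }
      }
    }
  }
end

body = single_day_filter(:start_at, Date.new(2012, 5, 19))
puts body.to_json
```

The generated JSON has the same shape as the to_curl output in the question, so if it still returns nothing, the field mapping is the more likely culprit than the filter itself.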
Bonus: you can find a nice Rails Elasticsearch/Tire production setup example here:
https://gist.github.com/psyxoz/4326881

Rails + Sunspot: Multiple model search, but only certain fields on one of the models?

In my Rails app I'm using Sunspot to index a few different models. I then have a global search form which returns mixed results. This is working fine:
Sunspot.search(Person, EmailAddress, PhoneNumber, PhysicalAddress) do
  fulltext params[:q]
  paginate :per_page => 10
end
I would like to add an additional model, say Project, to this search. The Project model has quite a bit that is indexed:
class Project < ActiveRecord::Base
  searchable do
    string :category do
      category.name.downcase if category = self.category
    end
    string :client do
      client.name.downcase if client = self.client
    end
    string :status
    text :tracking_number
    text :description
    integer :category_id, :references => Category
    integer :client_id, :references => Client
    integer :tag_ids, :multiple => true, :references => Tag
    time :created_at, :trie => true
    time :updated_at, :trie => true
    time :received_at, :trie => true
    time :completed_at, :trie => true
  end
end
How can I modify my original Sunspot.search call to add searching for Project records by just the tracking_number field and not the description field?
Sunspot.search(Person, EmailAddress, PhoneNumber, PhysicalAddress, Project) do
  fulltext params[:q] do
    fields(:tracking_number, :other_fields_outside_your_project_model)
  end
  paginate :per_page => 10
end
This will do full text search on tracking_number field and any other fields you specify, particularly in your Person, EmailAddress, PhoneNumber, and PhysicalAddress models.
I think you have to define your tracking_number as a text field and not a string field. Full-text search only works on text fields.
Did you try this:
text :tracking_number
And your Sunspot search would look like:
Sunspot.search(Person, EmailAddress, PhoneNumber, PhysicalAddress, Project) do
  fulltext params[:q]
  paginate :per_page => 10
end
See you
Did you try something like:
Sunspot.search(Post) do
  keywords 'great pizza', :fields => [:title, :body]
end
Alternatively, you can make one request per model and then concatenate the results into a single list. I don't think you can do it in one search.

Querying embedded objects in Mongoid/rails 3 ("Lower than", Min operators and sorting)

I am using rails 3 with mongoid.
I have a collection of Stocks with an embedded collection of Prices :
class Stock
  include Mongoid::Document
  field :name, :type => String
  field :code, :type => Integer
  embeds_many :prices
end
class Price
  include Mongoid::Document
  field :date, :type => DateTime
  field :value, :type => Float
  embedded_in :stock, :inverse_of => :prices
end
I would like to get the stocks whose minimum price since a given date is lower than a given price p, and then be able to sort the prices for each stock.
But it looks like MongoDB does not allow this, because the following will not work:
@stocks = Stock.Where(:prices.value.lt => p)
Also, it seems that MongoDB cannot sort embedded objects.
So, is there an alternative way to accomplish this task?
Maybe I should put everything in one collection so that I could easily run the following query:
@stocks = Stock.Where(:prices.lt => p)
But I really want to get results grouped by stock name after my query (for example, distinct stocks with an array of ordered prices). I have heard about map/reduce with the group function, but I am not sure how to use it correctly with Mongoid.
http://www.mongodb.org/display/DOCS/Aggregation
The equivalent in SQL would be something like this:
SELECT name, code, min(price) from Stock WHERE price<p GROUP BY name, code
Thanks for your help.
MongoDB / Mongoid do allow you to do this. Your example will work; the syntax is just incorrect.
@stocks = Stock.Where(:prices.value.lt => p) # does not work
@stocks = Stock.where('prices.value' => {'$lt' => p}) # this should work
And it's still chainable, so you can order by name as well:
@stocks = Stock.where('prices.value' => {'$lt' => p}).asc(:name)
Hope this helps.
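Note that a condition on 'prices.value' alone matches a stock as soon as any embedded price is below p, whatever its date. If the "since a given date" part of the question matters, one possible approach (an assumption, not something the answer above proposes) is MongoDB's $elemMatch operator, which requires a single embedded price to satisfy both conditions. The helper name min_price_selector is made up; the hash it returns could presumably be passed to Stock.where.

```ruby
# Hypothetical helper: builds a raw MongoDB selector hash.
# $elemMatch requires one embedded price to satisfy both conditions at once.
def min_price_selector(since, limit)
  {
    'prices' => {
      '$elemMatch' => {
        'date'  => { '$gte' => since },
        'value' => { '$lt' => limit }
      }
    }
  }
end

selector = min_price_selector(Time.utc(2012, 1, 1), 10.0)
puts selector.inspect
```

If any price since the date is below the limit, then the minimum price since that date is below it too, so this selector expresses the condition from the question.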
I've had a similar problem... here's what I suggest:
scope :price_min, lambda { |price_min| price_min.nil? ? {} : where("price.value" => { '$lte' => price_min.to_f }) }
Place this scope in the parent model. This will enable you to make queries like:
Stock.price_min(1000).count
Note that my scope only works when you actually insert some data there. This is very handy if you're building complex queries with Mongoid.
Good luck!
Very best,
Ruy
MongoDB does allow querying of embedded documents, http://www.mongodb.org/display/DOCS/Advanced+Queries#AdvancedQueries-ValueinanEmbeddedObject
What you're missing is a scope on the Price model, something like this:
scope :greater_than, lambda {|value| { :where => {:value.gt => value} } }
This will let you pass in any value you want and return a Mongoid collection of prices with the value greater than what you passed in. It'll be an unsorted collection, so you'll have to sort it in Ruby.
prices.sort {|a,b| a.value <=> b.value}.each {|price| puts price.value}
Mongoid does have a map_reduce method to which you pass two string variables containing the Javascript functions to execute map/reduce, and this would probably be the best way of doing what you need, but the code above will work for now.
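To illustrate the filter-then-sort logic in isolation, here is a plain-Ruby sketch that works on in-memory structs rather than on Mongoid documents. StockRow, PriceRow and stocks_with_min_below are made-up names for this example only.

```ruby
require 'date'

# Hypothetical in-memory stand-ins for the Stock and Price documents.
PriceRow = Struct.new(:date, :value)
StockRow = Struct.new(:name, :code, :prices)

# Returns [name, sorted price values] for every stock whose minimum price
# on or after `since` is below `limit`.
def stocks_with_min_below(stocks, since, limit)
  stocks.each_with_object([]) do |stock, result|
    recent = stock.prices.select { |price| price.date >= since }
    next if recent.empty?
    next unless recent.map(&:value).min < limit
    result << [stock.name, recent.sort_by(&:value).map(&:value)]
  end
end

stocks = [
  StockRow.new('ACME', 1, [
    PriceRow.new(Date.new(2012, 1, 1), 12.0),
    PriceRow.new(Date.new(2012, 3, 1), 8.5),
    PriceRow.new(Date.new(2012, 4, 1), 9.0)
  ]),
  StockRow.new('Globex', 2, [
    PriceRow.new(Date.new(2012, 3, 1), 20.0),
    PriceRow.new(Date.new(2012, 4, 1), 15.0)
  ])
]

matches = stocks_with_min_below(stocks, Date.new(2012, 2, 1), 10.0)
matches.each { |name, values| puts "#{name}: #{values.join(', ')}" }
```

This does the work in Ruby after loading the documents, so it is only reasonable for small result sets; a map/reduce job would push the same logic to the database.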

Resources