Mongoid Group By or MongoDB group by in Rails - ruby-on-rails

I have a MongoDB collection that holds statistical data with the following fields:
course_id
status, which is a string, either played or completed
timestamp information from Mongoid's timestamping feature
So my class is as follows:
class Statistic
  include Mongoid::Document
  include Mongoid::Timestamps
  include Mongoid::Paranoia
  field :course_id, type: Integer
  field :status, type: String # currently either "played" or "completed"
end
I want to get a daily count of the total number of plays for a course. For example, 8/1/12 had 2 plays, 8/2/12 had 6 plays, and so on. I would therefore be using the created_at timestamp field, together with course_id and status. The issue is that I don't see a group-by method in Mongoid. I believe MongoDB has one now, but I'm unsure how that would be done in Rails 3.
I could run through the collection using each and hack together a hash in Rails that increments counts, but what if the course has 1 million views? Retrieving and iterating over a million records could be messy. Is there a clean way to do this?

As mentioned in the comments, you can use map/reduce for this purpose. You could define the following method in your model (see http://mongoid.org/en/mongoid/docs/querying.html#map_reduce):
def self.today
  map = %Q{
    function() {
      emit(this.course_id, {count: 1})
    }
  }
  reduce = %Q{
    function(key, values) {
      var result = {count: 0};
      values.forEach(function(value) {
        result.count += value.count;
      });
      return result;
    }
  }
  self.where(:created_at.gt => Date.today, status: "played").
    map_reduce(map, reduce).out(inline: true)
end
which would produce the following result:
[{"_id"=>1.0, "value"=>{"count"=>2.0}}, {"_id"=>2.0, "value"=>{"count"=>1.0}}]
where _id is the course_id and count is the number of plays.
There is also a dedicated group method in MongoDB, but I am not sure how to get to the bare MongoDB collection in Mongoid 3. I have not had a chance to dive into the code that much yet.
You may wonder why I emit the document {count: 1}, since I could have emitted an empty document or anything else and then added 1 to result.count for every value. The reason is that reduce is not called when only one emit has been done for a particular key (in my example, one course has been played only once), so it is better to emit documents in the same format as the result.
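The point about reduce being skipped for single-emit keys can be seen without a database. The following is a minimal sketch in plain Ruby that only imitates that part of the map/reduce contract; simulate_map_reduce and the sample data are hypothetical and are not part of Mongoid or MongoDB.

```ruby
# Plain-Ruby imitation of the map/reduce contract, for illustration only.
# simulate_map_reduce is a hypothetical helper; it runs no database code.
def simulate_map_reduce(documents, map, reduce)
  emitted = Hash.new { |hash, key| hash[key] = [] }
  documents.each do |doc|
    key, value = map.call(doc)
    emitted[key] << value
  end

  emitted.map do |key, values|
    # As described above, reduce is skipped when a key has a single emitted value.
    value = values.size == 1 ? values.first : reduce.call(key, values)
    { "_id" => key, "value" => value }
  end
end

plays = [{ course_id: 1 }, { course_id: 1 }, { course_id: 2 }]

# Emitting in the same shape as the reduce result keeps every row consistent.
map    = ->(doc) { [doc[:course_id], { count: 1 }] }
reduce = ->(_key, values) { { count: values.sum { |v| v[:count] } } }
p simulate_map_reduce(plays, map, reduce)

# Emitting an empty document leaves course 2 without any count at all,
# because its single value is returned untouched.
naive_map    = ->(doc) { [doc[:course_id], {}] }
naive_reduce = ->(_key, values) { { count: values.size } }
p simulate_map_reduce(plays, naive_map, naive_reduce)
```

In the second call, the row for the course played once comes back with an empty value, which is the inconsistency the {count: 1} emit avoids.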

Using Mongoid
# Groups by the exact created_at value (one bucket per distinct timestamp)
stages = [{
  "$group" => {
    "_id" => { "date_column_name" => "$created_at" },
    "plays_count" => { "$sum" => 1 }
  }
}]
@array_of_objects = ModelName.collection.aggregate(stages, {:allow_disk_use => true})
OR
# Groups by calendar day (year, month, day of month)
stages = [{
  "$group" => {
    "_id" => {
      "year" => { "$year" => "$created_at" },
      "month" => { "$month" => "$created_at" },
      "day" => { "$dayOfMonth" => "$created_at" }
    },
    "plays_count" => { "$sum" => 1 }
  }
}]
@array_of_objects = ModelName.collection.aggregate(stages, {:allow_disk_use => true})
See the links below for more on grouping with Mongoid:
https://taimoorchangaizpucitian.wordpress.com/2016/01/08/mongoid-group-by-query/
https://docs.mongodb.org/v3.0/reference/operator/aggregation/group/
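To answer the original question (daily plays for one course), the grouping stage would normally be preceded by a $match stage. Below is a minimal sketch of such a pipeline; daily_plays_pipeline is a hypothetical helper name, and the deleted_at condition assumes that Mongoid::Paranoia marks soft-deleted documents by setting that field, since a raw collection aggregate does not apply Mongoid scopes.

```ruby
# Hypothetical helper: builds an aggregation pipeline that counts plays per day
# for one course. Only standard operators are used: $match, $group, $sort.
def daily_plays_pipeline(course_id, status: "played")
  [
    # Keep only the requested course and status. The deleted_at condition is an
    # assumption about how soft-deleted documents are marked.
    { "$match" => { "course_id" => course_id, "status" => status, "deleted_at" => nil } },
    # One bucket per calendar day; the date operators work in UTC by default.
    { "$group" => {
      "_id" => {
        "year"  => { "$year" => "$created_at" },
        "month" => { "$month" => "$created_at" },
        "day"   => { "$dayOfMonth" => "$created_at" }
      },
      "plays_count" => { "$sum" => 1 }
    } },
    # Oldest day first.
    { "$sort" => { "_id.year" => 1, "_id.month" => 1, "_id.day" => 1 } }
  ]
end
```

It could then be passed to the collection in the same way as above, for example Statistic.collection.aggregate(daily_plays_pipeline(42)).to_a, where 42 is a made-up course id.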

Related

How to query more fields on mongo DB aggregation query?

I would like to know how to add an extra field to the response of collection.aggregate.
The query below groups activities by user_id, and I would like to know how to also include the user_name in the response.
DB Models
class Activity
  include Mongoid::Document
  field :hd_race_id, type: Float
  field :metric_elapsed_time, type: Float
  field :metric_distance, type: Float
  field :user_name, type: String
  belongs_to :user
  ...
class User
  include Mongoid::Document
  field :user_name, type: String
  has_many :activities
  ...
Query
Activity.collection.aggregate([
  {
    "$group" => {
      "_id" => "$user_id",
      "distance" => { "$sum" => "$metric_distance" },
      "time" => { "$sum" => "$metric_elapsed_time" },
    },
  },
  { "$sort" => { "distance" => -1 } },
])
Thank you in advance
Use the operator $first (aggregation accumulator) inside the $group stage.
For example:
"user_name": {"$first": "$user_name"}
or for the programming language you are using (not sure what it is), try something like:
"user_name" => {"$first" => "$user_name"},
For an example, see the "Group & Total" chapter in my Practical MongoDB Aggregations book
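Put together with the query from the question, the suggestion might look like the sketch below. The helper name distance_leaderboard_pipeline is hypothetical; the field names come from the Activity model shown in the question, which stores user_name on each activity.

```ruby
# Hypothetical helper returning the question's pipeline plus a $first accumulator.
def distance_leaderboard_pipeline
  [
    { "$group" => {
      "_id"       => "$user_id",
      # $first takes user_name from the first activity seen in each group.
      "user_name" => { "$first" => "$user_name" },
      "distance"  => { "$sum" => "$metric_distance" },
      "time"      => { "$sum" => "$metric_elapsed_time" }
    } },
    { "$sort" => { "distance" => -1 } }
  ]
end
```

Since every activity of a user is assumed to carry the same user_name, it does not matter which document in the group $first happens to pick. The pipeline would be run as Activity.collection.aggregate(distance_leaderboard_pipeline).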

Mongoid - two query conditions to both be met in an embedded doc?

I have two models
class User
  include Mongoid::Document
  include Mongoid::Timestamps
  field :username
  embeds_many :user_tags
end
class UserTag
  include Mongoid::Document
  field :name
  field :like_count, :type => Integer, :default => 0
  embedded_in :user
end
I want to query all the users that have the user_tag named "nyc" and where the user_tag "nyc" has a like_count >= 10. I've tried the following:
users = User.where('user_tags.name' => "nyc").and('user_tags.like_count' => {'$gte' => 10 })
Logically this does what it's supposed to do, but not what I need it to do. It returns users that have the user_tag "nyc" and have any user_tag with a like_count >= 10. I need users that have the user_tag "nyc" and where the user_tag "nyc"'s like_count is >= 10.
How do I do that? I'm running mongoid 4.0.2.
Actually, your query does not express what you are trying to achieve. It translates to the following MongoDB query:
db.users.find({'user_tags.name': 'nyc', 'user_tags.like_count': {$gte: 10}})
With dot notation, each condition is checked against the array independently, so MongoDB matches any user that has some tag named 'nyc' and some tag, possibly a different one, with like_count >= 10. Mongoid returns the same data as MongoDB.
What you need instead is the following MongoDB query:
db.users.find({ user_tags: {
  $elemMatch: {
    name: 'nyc',
    like_count: { $gte: 10 }
  }
}})
With Mongoid you can write:
User.where(user_tags: {
  '$elemMatch' => {
    name: 'nyc',
    like_count: { '$gte' => 10 }
  }
}).count
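The difference between the two query shapes can be illustrated in plain Ruby, without a database. This is only a sketch of the matching semantics over in-memory hashes; dot_notation_match?, elem_match? and the sample users are hypothetical.

```ruby
# Plain-Ruby stand-ins for the two query semantics; no database involved.

# Dot notation: each condition may be satisfied by a different tag.
def dot_notation_match?(user, name, min_likes)
  tags = user[:user_tags]
  tags.any? { |t| t[:name] == name } && tags.any? { |t| t[:like_count] >= min_likes }
end

# $elemMatch: a single tag must satisfy every condition.
def elem_match?(user, name, min_likes)
  user[:user_tags].any? { |t| t[:name] == name && t[:like_count] >= min_likes }
end

mixed = { username: "alice",
          user_tags: [{ name: "nyc", like_count: 2 }, { name: "sf", like_count: 50 }] }
exact = { username: "bob",
          user_tags: [{ name: "nyc", like_count: 12 }] }

puts dot_notation_match?(mixed, "nyc", 10) # true: the "sf" tag supplies the like_count
puts elem_match?(mixed, "nyc", 10)         # false: the "nyc" tag has only 2 likes
puts elem_match?(exact, "nyc", 10)         # true
```

The first user is the kind of false positive described in the question: it has an "nyc" tag and a well-liked tag, but they are not the same tag.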
Maybe you should write something like this:
users = User.where('user_tags.name' => "nyc", 'user_tags.like_count' => {'$gte' => 10 })
Mongoid will try to find documents that satisfy both conditions.
You can try this
users = User.where('user_tags.name' => "nyc").where('user_tags.like_count' => {'$gte' => 10 }).all
or
users = User.where('user_tags.name' => "nyc", 'user_tags.like_count' => {'$gte' => 10 }).all

Rails update multiple records find based on other id

Using Rails 3.2. As shown in the documentation for the update method, update finds records by id:
update(id, attributes)
# id - This should be the id or an array of ids to be updated.
# Updates multiple records
people = { 1 => { "first_name" => "David" }, 2 => { "first_name" => "Jeremy" } }
Person.update(people.keys, people.values)
What if I want to update an array found based on other columns? For example:
people = { 'cook' => { "first_name" => "David" }, 'server' => { "first_name" => "Jeremy" } }
Find people with role = cook, then update first_name = David; find people with role = server, then update first_name = Jeremy.
I want it to be done in 1 query if possible, and not by SQL. Thanks.
You can achieve this with #update_all, called once per role. This runs one UPDATE statement per role rather than a single query:
people = { 'cook' => { "first_name" => "David" }, 'server' => { "first_name" => "Jeremy" } }
people.each do |role, attributes|
  Person.where(role: role).update_all(attributes)
end
In that case I would write my own SQL statement. It depends on which database backend you are using.
http://www.postgresql.org/docs/9.1/static/plpgsql-control-structures.html
https://dev.mysql.com/doc/refman/5.0/en/case.html
The update method doesn't execute 1 SQL query when passed an array of ids and values. If you view the source code for update, you will see it loops through the array and executes 2 queries for each record (a find and an update query) to then return an array of updated objects.
If you're happy accepting that you will need to make 2 queries per row, then you can use the following code for finding people by role.
people = { 'cook' => { "first_name" => "David" }, 'server' => { "first_name" => "Jeremy" } }
updated_people = people.map do |role, attributes|
  object = Person.where(role: role).first!
  # update_attributes is the Rails 3.2 call; the public update(attributes) arrived in Rails 4
  object.update_attributes(attributes)
  object
end
Note: This code only updates the first record it finds per role because I've assumed there will only be one cook with the first name 'David'.
If you want to only use 1 SQL statement, you should look at doing it in SQL like devanand suggested.
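As a rough illustration of the single-statement route, the sketch below builds one UPDATE with a CASE expression from the role-to-name mapping. Everything here is an assumption for illustration: the helper names, the people table, and the role and first_name columns. quote_literal is a simplified stand-in; real code should rely on the database adapter's own quoting (for example ActiveRecord::Base.connection.quote).

```ruby
# Simplified stand-in for adapter quoting: wraps in single quotes and doubles
# any embedded single quote. Prefer the adapter's quoting in real code.
def quote_literal(value)
  "'" + value.to_s.gsub("'", "''") + "'"
end

# Hypothetical helper: builds one UPDATE ... CASE statement from a
# role => first_name map. Table and column names are assumptions.
def case_update_sql(table, names_by_role)
  branches = names_by_role.map do |role, name|
    "WHEN #{quote_literal(role)} THEN #{quote_literal(name)}"
  end
  roles = names_by_role.keys.map { |role| quote_literal(role) }.join(", ")

  "UPDATE #{table} SET first_name = CASE role #{branches.join(' ')} END " \
    "WHERE role IN (#{roles})"
end

puts case_update_sql("people", { "cook" => "David", "server" => "Jeremy" })
```

The WHERE clause limits the update to the listed roles, so rows with other roles keep their first_name. The resulting string could be run with Person.connection.execute, at the cost of bypassing validations and callbacks.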

How to filter search by attribute only if it exists using ElasticSearch and Tire?

Right now I wrote:
Tire.search INDEX_NAME do
  query do
    filtered do
      query { string term }
      filter :or, { missing: { field: :app_id } },
                  { terms: { app_id: app_ids } }
    end
  end
end.results.to_a
Well, returning items that either have no app_id or have one that matches your terms sounds like a job for an or filter. I'd try:
filter :or, [
  {:not => {:exists => {:field => :app_id}}},
  {:terms => {:app_id => app_ids}}
]

dynamic loop in a hash Ruby on Rails

I'm trying to create a dynamic loop within the hash @data below and can't really seem to figure it out. I'm creating an annotated timeline with annotatedtimeline-for-rails, which uses the Google API, from here: https://github.com/mcommons/annotatedtimeline-for-rails.
The array within the hash @data has to be dynamic, i.e. the day number has to be generated by a loop, and the product names and numbers are dynamic as well. I'll try to give an example in the loop below.
@numdeployed is a number and comes from a table in the db
i should be generated by the loop
@data{
begin loop
i.day.ago.to_date => { :foo=>@numdeployed, :bar=>@numdeployed, :barbaz=>@numdeployed, :foobar=>@numdeployed },
end loop
}
The original data hash looks like this:
@data = {
  1.day.ago.to_date => { :foo=>10, :bar=>40, :barbaz=>10, :foobar=>40 },
  2.day.ago.to_date => { :foo=>10, :bar=>40, :barbaz=>10, :foobar=>40 },
  3.day.ago.to_date => { :foo=>10, :bar=>40, :barbaz=>10, :foobar=>40 },
  4.day.ago.to_date => { :foo=>10, :bar=>40, :barbaz=>10, :foobar=>40 },
  5.day.ago.to_date => { :foo=>10, :bar=>40, :barbaz=>10, :foobar=>40 }
}
hope someone can help. Thanks
Are you looking for something like this?
@data = Hash[
  n.times.map do |i|
    [ (i + 1).day.ago.to_date, { :foo => 10, :bar => 40, :barbaz => 10, :foobar => 40 } ]
  end
]
The n is however many pairs you want in your @data.
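A variation on the same idea that takes the per-product numbers as input, instead of hard-coding them, might look like the sketch below. It uses only the standard library, so today - i stands in for i.day.ago.to_date (roughly equivalent, ignoring time zone handling); build_timeline and deployed_counts are hypothetical names, with deployed_counts standing in for the values read from the database.

```ruby
require "date"

# Hypothetical helper: builds { date => counts } for the previous `days` days.
# deployed_counts stands in for the numbers loaded from the database.
def build_timeline(days, deployed_counts, today = Date.today)
  (1..days).each_with_object({}) do |i, data|
    # today - i plays the role of i.day.ago.to_date from the question
    data[today - i] = deployed_counts.dup
  end
end

counts = { :foo => 10, :bar => 40, :barbaz => 10, :foobar => 40 }
data = build_timeline(5, counts)
data.each { |date, values| puts "#{date}: #{values.inspect}" }
```

Each day gets its own copy of the counts hash, so changing one day's numbers later does not affect the others. In a controller the result would be assigned to @data.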
