What is the correct way to chain map_reduce calls in Mongoid? - ruby-on-rails

I have an Element model that belongs to User. I am trying to calculate the following hash: how many users have element count of 1, 2, 3, etc. The approach I take is to first generate a hash of {user -> num elements}, then I sort-of invert it using a second map-reduce.
Here's what I have so far:
Element.map_reduce(%Q{
emit(this.user_id, 1);
}, %Q{
function(key, values) {
return Array.sum(values);
}
}).out(inline: true).map_reduce(%Q{
if (this.value > 1) {
emit(this.value, this._id);
}
}, %Q{
function(element_count, user_ids) {
return user_ids.length;
}
}).out(inline: true)
This gives me an "undefined method `map_reduce'" error. I couldn't find the answer in the docs. Any help would be great.

I calculated the hash using aggregate instead of map_reduce: first grouping by user, and then grouping again by element count:
Element.collection.aggregate([
{
"$group" => {
"_id" => "$user_id", "elements_count" => {"$sum" => 1}
}
},
{
"$group" => {
"_id" => "$elements_count", "users_count" => {"$sum" => 1}
}
},
{ "$project" => {
"_id" => 0,
"users_count" => '$users_count',
"elements_count" => '$_id',
}
}
])
This returns the following array:
[
{"users_count"=>3, "elements_count"=>2},
{"users_count"=>4, "elements_count"=>3},
...
]
If needed, the result can also be sorted by appending a $sort stage to the pipeline.
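To make the two grouping stages easier to follow, here is a minimal in-memory sketch of the same logic in plain Ruby. It is only an illustration, not a replacement for the aggregation: the `elements` array is hypothetical sample data standing in for Element documents, and the final `sort` plays the role a $sort stage would play in the pipeline.

```ruby
# Hypothetical sample data standing in for Element documents.
elements = [
  { user_id: "u1" }, { user_id: "u1" },
  { user_id: "u2" }, { user_id: "u2" },
  { user_id: "u3" }
]

# Stage 1 (first $group): user_id -> number of elements owned by that user.
elements_per_user = elements.each_with_object(Hash.new(0)) do |element, counts|
  counts[element[:user_id]] += 1
end

# Stage 2 (second $group): elements_count -> number of users with that count.
users_per_count = elements_per_user.values.each_with_object(Hash.new(0)) do |count, counts|
  counts[count] += 1
end

# $project plus a sort by elements_count, giving the same shape as the answer's output.
result = users_per_count.sort.map do |elements_count, users_count|
  { "users_count" => users_count, "elements_count" => elements_count }
end
```

With this sample data, two users own two elements each and one user owns a single element, so `result` holds one entry per distinct element count.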

Related

Mongoid aggregation with conditions

I'm using the Mongoid 6.1.0 aggregation framework in my Rails 5 project. I need to add a $match stage to the pipeline only if the value of a search field (text or select field) is not empty. Otherwise, the stage should be skipped so that the results are not filtered. Something like:
@messages = Message.collection.aggregate([
{ '$match' => {'month': {'$gte' => @fr_mnth, '$lte' => @to_mnth}}},
{ '$group' => {'_id': '$mmsi'} },
{ '$lookup' => {'from': 'ships', 'localField': "_id", 'foreignField': "mmsi", as: "ship"}},
{ '$match' => {"ship.n2": params[:n2] if !params[:n2].blank? }}
]).allow_disk_use(true)
or to be more clear:
if not params[:n2].blank? { '$match' => {"ship.n2": params[:n2] }}
The problem is that if !params[:n2].blank? cannot be written inside the pipeline definition. Is there an alternative solution?
I don't know Ruby, but I think I understand your problem.
Pseudo-code
# DON'T DO SO! SEE UPDATE BELOW
if your_condition is true:
filter = { field: 'some value' }
else:
filter = { # always true condition
$or: [
{ field: { $exists: true } },
{ field: { $exists: false } }
]
}
Message.collection.aggregate([
# ...
{
"$match": filter
}
])
UPDATE:
As Aboozar Rajabi noted, we can simply add the $match stage to the pipeline only when the condition is true:
pipeline = [
# stages
];
if condition is true:
pipeline.push({
$match: {
# filter
}
});
Here is the first pseudo-code snippet above (from Kan A's answer) translated to Ruby for the Mongoid aggregation framework, since the syntax can be a bit confusing and there are few Mongoid aggregation examples online:
if not params[:field].blank?
filter = { "db_field_name": params[:field] }
else
filter = {
'$or' => [
{ "db_field_name" => { '$exists' => true } },
{ "db_field_name" => { '$exists' => false } }
]
}
end
I hope this helps others who find this page later. Together with the code in the question, it also serves as an example of using the MongoDB aggregation framework in a Rails or Ruby project.
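The approach from the UPDATE (append the $match stage only when the parameter is present) can be sketched in Ruby roughly as follows. The helper name `build_message_pipeline` and its arguments are hypothetical; the field names are taken from the question. The blank check is written in plain Ruby as a stand-in for ActiveSupport's `blank?`, so the snippet runs outside Rails.

```ruby
# Hypothetical helper: builds the pipeline from the question and appends the
# extra $match stage only when a non-empty n2 value is given.
def build_message_pipeline(from_month, to_month, n2 = nil)
  pipeline = [
    { '$match'  => { 'month' => { '$gte' => from_month, '$lte' => to_month } } },
    { '$group'  => { '_id' => '$mmsi' } },
    { '$lookup' => { 'from' => 'ships', 'localField' => '_id',
                     'foreignField' => 'mmsi', 'as' => 'ship' } }
  ]

  # Plain-Ruby stand-in for `!params[:n2].blank?` (assumption: nil, empty, or
  # whitespace-only values mean "no filter").
  unless n2.nil? || n2.to_s.strip.empty?
    pipeline << { '$match' => { 'ship.n2' => n2 } }
  end

  pipeline
end

# Example values are made up for illustration.
with_filter    = build_message_pipeline(1, 6, "CARGO")
without_filter = build_message_pipeline(1, 6, "")
```

In a controller, the resulting array would then be passed to Message.collection.aggregate as in the question. This avoids the always-true $or filter entirely, because the stage is simply absent when there is nothing to filter on.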

How to convert hash with keys representing nesting into a nested hash

I need to convert the following hash:
{
"item[0][size]" => "12",
"item[0][count]" => "1"
}
to this:
{
"item": {
"0": {
"size": "12",
"count": "1"
}
}
}
Could you please advise on how to achieve that most gracefully? Maybe I can reuse one of ActionPack's utility methods for parsing parameter strings?
You can reuse the Rack library method Rack::Utils.parse_nested_query:
require "rack"
# Named parse_nested rather than p, so that Kernel#p is not shadowed.
def parse_nested(query)
  Rack::Utils.parse_nested_query(query)
end
parse_nested('item[0][size]=12') # => {"item"=>{"0"=>{"size"=>"12"}}}
Found here.
After some research I found a way to parse nested query keys using http://apidock.com/rails/Rack/Utils/parse_nested_query:
Rack::Utils.parse_nested_query('item[0][size]')
=> {
"item" => {
"0" => {
"size" => nil
}
}
}
So it's now possible to do:
items_string = item_hash.to_a.map { |row| row.join('=') }.join('&')
result = Rack::Utils.parse_nested_query(items_string)
=> {
"item" => {
"0" => {
"size" => "12",
"count" => "1"
}
}
}
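If pulling in Rack only for this is undesirable, the same nesting can be sketched in plain Ruby. This is a minimal illustration rather than a substitute for Rack's parser: the helper name `nest_bracketed_keys` is hypothetical, and it assumes every key has the simple `name[part][part]` shape from the question (no `[]` array syntax, no URL escaping).

```ruby
# Hypothetical helper: turns keys like "item[0][size]" into nested hashes.
def nest_bracketed_keys(flat)
  flat.each_with_object({}) do |(key, value), result|
    # "item[0][size]" -> ["item", "0", "size"]
    path = key.scan(/[^\[\]]+/)
    last = path.pop

    # Walk (and create) the intermediate hashes, then set the leaf value.
    node = path.inject(result) { |hash, part| hash[part] ||= {} }
    node[last] = value
  end
end

flat = {
  "item[0][size]"  => "12",
  "item[0][count]" => "1"
}
nested = nest_bracketed_keys(flat)
```

For the hash in the question, `nested` has the same structure as the Rack result shown above, with string keys at every level.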

Aggregation on mongoid has-many relationship

I want to run an aggregation on a model that has a has-many relationship with another model.
My document is A, which has a has-many relationship with B.
order_settlements = A.collection.aggregate(
{
"$match" =>
{
}
},
"$group" => {
'_id' => {
},
'B' => { "$first" => "$B"},
}
)
I am not able to get anything in B, it always returns empty [].
I tried using eager loading:
order_settlements = A.includes(:B).collection.aggregate(
{
"$match" =>
{
}
},
"$group" => {
'_id' => {
},
'B' => { "$first" => "$B"},
}
)
This didn't work either. Is it possible to do aggregation with has-many relationship? Please let me know how I can do it.

Elasticsearch doesn't apply the NOT filter

I've been knocking my head against a wall with Elasticsearch today, trying to fix a failing test case.
I am using Rails 3.2.14, Ruby 1.9.3, the Tire gem and ElasticSearch 0.90.2
The objective is to have the query return matching results EXCLUDING the item where
vid == "ABC123xyz"
The Ruby code in the Video model looks like this:
def related_videos(count)
Video.search load: true do
size(count)
filter :term, :category_id => self.category_id
filter :term, :live => true
filter :term, :public => true
filter :not, {:term => {:vid => self.vid}}
query do
boolean do
should { text(:_all, self.title, boost: 2) }
should { text(:_all, self.description) }
should { terms(:tags, self.tag_list, minimum_match: 1) }
end
end
end
end
The resulting search query generated by Tire looks like this:
{
"query":{
"bool":{
"should":[
{
"text":{
"_all":{
"query":"Top Gun","boost":2
}
}
},
{
"text":{
"_all":{
"query":"The macho students of an elite US Flying school for advanced fighter pilots compete to be best in the class, and one romances the teacher."
}
}
},
{
"terms":{
"tags":["top-gun","80s"],
"minimum_match":1
}
}
]
}
},
"filter":{
"and":[
{
"term":{
"category_id":1
}
},
{
"term":{
"live":true
}
},
{
"term":{
"public":true
}
},
{
"not":{
"term":{
"vid":"ABC123xyz"
}
}
}
]
},
"size":10
}
The resulting JSON from ElasticSearch:
{
"took": 7,
"timed_out": false,
"_shards":{
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total":1,
"max_score":0.2667512,
"hits":[
{
"_index":"test_videos",
"_type":"video",
"_id":"8",
"_score":0.2667512,
"_source":{
"vid":"ABC123xyz",
"title":"Top Gun",
"description":"The macho students of an elite US Flying school for advanced fighter pilots compete to be best in the class, and one romances the teacher.",
"tags":["top-gun","80s"],
"category_id":1,
"live":true,
"public":true,
"featured":false,
"last_video_view_count":0,
"boost_factor":0.583013698630137,
"created_at":"2013-08-28T14:24:47Z"
}
}
]
}
}
Could somebody help? The Elasticsearch docs are sparse on this topic and I'm running out of ideas.
Thanks
Using a top-level filter the way you are doesn't filter the results of your query - it just filters results out of things like facet counts. There's a fuller description in the elasticsearch documentation for filter.
You need to do a filtered query which is slightly different and filters the results of your query clauses:
Video.search load: true do
query do
filtered do
boolean do
should { text(:_all, self.title, boost: 2) }
should { text(:_all, self.description) }
should { terms(:tags, self.tag_list, minimum_match: 1) }
end
filter :term, :category_id => self.category_id
filter :term, :live => true
filter :term, :public => true
filter :not, {:term => {:vid => self.vid}}
end
end
end

mongo-ruby-driver will not create a new document on upsert when there is a custom _id

I want to upsert a document with the mongo-ruby-driver using something like the following:
id = "#{params[:id]}:#{Time.now.strftime("%y%m%d")}"
# db.collection('metrics').insert({'_id' => id})
db.collection('metrics').update(
{ '_id' => id },
{ '$inc' => { "hits" => 1 } },
{ 'upsert' => true }
)
Right now this will only update existing documents, and not create one if it doesn't already exist. The only way it will perform both actions is if I uncomment the insert() command above it.
If I use the mongo console and try and do this upsert directly (without the need for the insert() ) it works as expected.
You should pass the option key as a symbol (:upsert) instead of a string ('upsert'). This code works:
db.collection('metrics').update(
{ '_id' => id },
{ '$inc' => { "hits" => 1 } },
{ :upsert => true }
)
In fact, you can use symbols most everywhere. This also works:
db.collection(:metrics).update(
{ :_id => id },
{ :$inc => { :hits => 1 } },
{ :upsert => true }
)
