Mongoid aggregation with conditions - ruby-on-rails

I'm using the aggregation framework with Mongoid 6.1.0 in my Rails 5 project. I need to add a $match stage to the pipeline only if the value of a search field (text or select field) is not empty. Otherwise, the stage should be skipped so that the results are not filtered. Something like:
@messages = Message.collection.aggregate([
{ '$match' => {'month': {'$gte' => @fr_mnth, '$lte' => @to_mnth}}},
{ '$group' => {'_id': '$mmsi'} },
{ '$lookup' => {'from': 'ships', 'localField': "_id", 'foreignField': "mmsi", as: "ship"}},
{ '$match' => {"ship.n2": params[:n2] if !params[:n2].blank? }}
]).allow_disk_use(true)
or to be more clear:
if not params[:n2].blank? { '$match' => {"ship.n2": params[:n2] }}
The problem is that if !params[:n2].blank? cannot be placed inside the pipeline definition. Is there an alternative solution?

I don't know Ruby, but I think I understand your problem.
Pseudo-code
# DON'T DO SO! SEE UPDATE BELOW
if your_condition is true:
filter = { field: 'some value' }
else:
filter = { # always true condition
$or: [
{ field: { $exists: true } },
{ field: { $exists: false } }
]
}
Message.collection.aggregate([
# ...
{
"$match": filter
}
])
UPDATE:
As Aboozar Rajabi noted, if the condition is true we can simply add the $match stage to the pipeline:
pipeline = [
# stages
];
if condition is true:
pipeline.push({
$match: {
# filter
}
});

Here is the pseudo-code above (Kan A's answer) translated to Ruby for the Mongoid aggregation framework, since the syntax can be a bit confusing and there are few Mongoid aggregation examples online:
if not params[:field].blank?
filter = { "db_field_name": params[:field] }
else
filter = {
'$or' => [
{ "db_field_name" => { '$exists' => true } },
{ "db_field_name" => { '$exists' => false } }
]
}
end
I hope this helps others who find this page later. Together with the code in the question, it also serves as an example of using the MongoDB aggregation framework in a Rails or Ruby project.
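The update in the first answer (append the $match stage only when there is something to filter on) can also be written in Ruby. This is a minimal sketch, not tested against a live database: build_pipeline is a hypothetical helper name, and its arguments stand in for @fr_mnth, @to_mnth and params[:n2] from the question.

```ruby
# Hypothetical helper: returns the pipeline as a plain array of hashes.
# from_month / to_month stand in for @fr_mnth / @to_mnth, n2 for params[:n2].
def build_pipeline(from_month, to_month, n2 = nil)
  pipeline = [
    { '$match'  => { 'month' => { '$gte' => from_month, '$lte' => to_month } } },
    { '$group'  => { '_id' => '$mmsi' } },
    { '$lookup' => { 'from' => 'ships', 'localField' => '_id',
                     'foreignField' => 'mmsi', 'as' => 'ship' } }
  ]

  # Append the optional stage only when a non-blank value was supplied.
  # In Rails, `n2.present?` expresses the same check.
  pipeline << { '$match' => { 'ship.n2' => n2 } } unless n2.to_s.strip.empty?
  pipeline
end

# Usage inside a controller (needs the Message model, so it is commented out):
# @messages = Message.collection
#                    .aggregate(build_pipeline(@fr_mnth, @to_mnth, params[:n2]))
#                    .allow_disk_use(true)
```

Because the pipeline is just an array, the condition lives in ordinary Ruby code and no always-true filter is needed.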

Related

Aggregation on mongoid has-many relationship

I want to run an aggregation on a model that has a has-many relationship with another model.
My document is A, which has a has-many relationship with B.
order_settlements = A.collection.aggregate(
{
"$match" =>
{
}
},
"$group" => {
'_id' => {
},
'B' => { "$first" => "$B"},
}
)
I am not able to get anything in B; it always comes back as an empty [].
I tried using eager loading:
order_settlements = A.includes(:B).collection.aggregate(
{
"$match" =>
{
}
},
"$group" => {
'_id' => {
},
'B' => { "$first" => "$B"},
}
)
This didn't work either. Is it possible to do aggregation with has-many relationship? Please let me know how I can do it.

Multiple elasticsearch filters

I am using Tire for rails to integrate elasticsearch. This bit is quite confusing and I want to make sure I'm doing this right.
Is this how I apply multiple filters? I'm basically trying to check 'mixtape_id IS NULL AND artist_id IS NOT NULL'
def self.search(query)
tire.search() do
query { string query }
filter :exists, { field: 'artist_id' }
filter :not, { exists: { field: 'mixtape_id' } }
end
end
Here is my second attempt, which still doesn't appear to work:
def self.search(query)
tire.search(load: true) do
query { string query }
filter :and, [
{ exists: { field: 'artist_id' } },
{ not: { exists: { field: 'mixtape_id' } } }
]
end
end
Thanks
I mostly had it working the whole time; I just forgot to force reindexing each time. Here is some cleaned-up code that takes advantage of the missing filter:
def self.search(query)
tire.search load: { include: { artist: :attachments } } do
query { string query }
filter :and, [
{ exists: { field: 'artist_id' } },
{ missing: { field: 'mixtape_id' } }
]
end
end

Elasticsearch doesn't apply the NOT filter

I've been knocking my head against a wall with Elasticsearch today, trying to fix a failing test case.
I am using Rails 3.2.14, Ruby 1.9.3, the Tire gem and ElasticSearch 0.90.2
The objective is to have the query return matching results EXCLUDING the item where
vid == "ABC123xyz"
The Ruby code in the Video model looks like this:
def related_videos(count)
Video.search load: true do
size(count)
filter :term, :category_id => self.category_id
filter :term, :live => true
filter :term, :public => true
filter :not, {:term => {:vid => self.vid}}
query do
boolean do
should { text(:_all, self.title, boost: 2) }
should { text(:_all, self.description) }
should { terms(:tags, self.tag_list, minimum_match: 1) }
end
end
end
end
The resulting search query generated by Tire looks like this:
{
"query":{
"bool":{
"should":[
{
"text":{
"_all":{
"query":"Top Gun","boost":2
}
}
},
{
"text":{
"_all":{
"query":"The macho students of an elite US Flying school for advanced fighter pilots compete to be best in the class, and one romances the teacher."
}
}
},
{
"terms":{
"tags":["top-gun","80s"],
"minimum_match":1
}
}
]
}
},
"filter":{
"and":[
{
"term":{
"category_id":1
}
},
{
"term":{
"live":true
}
},
{
"term":{
"public":true
}
},
{
"not":{
"term":{
"vid":"ABC123xyz"
}
}
}
]
},
"size":10
}
The resulting JSON from ElasticSearch:
{
"took": 7,
"timed_out": false,
"_shards":{
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total":1,
"max_score":0.2667512,
"hits":[
{
"_index":"test_videos",
"_type":"video",
"_id":"8",
"_score":0.2667512,
"_source":{
"vid":"ABC123xyz",
"title":"Top Gun",
"description":"The macho students of an elite US Flying school for advanced fighter pilots compete to be best in the class, and one romances the teacher.",
"tags":["top-gun","80s"],
"category_id":1,
"live":true,
"public":true,
"featured":false,
"last_video_view_count":0,
"boost_factor":0.583013698630137,
"created_at":"2013-08-28T14:24:47Z"
}
}
]
}
}
Could somebody help? The docs for Elasticsearch are sparse around this topic and I'm running out of ideas.
Thanks
Using a top-level filter the way you are doesn't filter the results of your query - it just filters results out of things like facet counts. There's a fuller description in the elasticsearch documentation for filter.
You need to do a filtered query which is slightly different and filters the results of your query clauses:
Video.search load: true do
query do
filtered do
# inside filtered, the query clauses go in their own query block, next to the filters
query do
boolean do
should { text(:_all, self.title, boost: 2) }
should { text(:_all, self.description) }
should { terms(:tags, self.tag_list, minimum_match: 1) }
end
end
filter :term, :category_id => self.category_id
filter :term, :live => true
filter :term, :public => true
filter :not, {:term => {:vid => self.vid}}
end
end
end
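For comparison with the request body posted in the question, here is a rough sketch of what a filtered query body looks like in the Elasticsearch 0.90 query DSL, written as a plain Ruby hash. related_videos_body is a hypothetical helper, not part of Tire, and only one of the should clauses is included to keep it short.

```ruby
require 'json'

# Hypothetical helper: builds the request body by hand instead of through Tire.
# Field names and values are taken from the question's generated query.
def related_videos_body(title:, category_id:, vid:, size: 10)
  {
    query: {
      filtered: {
        # The scoring clauses live under filtered.query ...
        query: {
          bool: { should: [{ text: { _all: { query: title, boost: 2 } } }] }
        },
        # ... and the filters sit next to them under filtered.filter,
        # rather than in a top-level "filter" key.
        filter: {
          and: [
            { term: { category_id: category_id } },
            { term: { live: true } },
            { term: { public: true } },
            { not: { term: { vid: vid } } }
          ]
        }
      }
    },
    size: size
  }
end

body = related_videos_body(title: 'Top Gun', category_id: 1, vid: 'ABC123xyz')
puts JSON.pretty_generate(body)
```

The difference from the body in the question is only where the "and" filter is nested: inside query.filtered instead of beside query.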

What is the correct way to chain map_reduce calls in Mongoid?

I have an Element model that belongs to User. I am trying to calculate the following hash: how many users have element count of 1, 2, 3, etc. The approach I take is to first generate a hash of {user -> num elements}, then I sort-of invert it using a second map-reduce.
Here's what I have so far:
Element.map_reduce(%Q{
emit(this.user_id, 1);
}, %Q{
function(key, values) {
return Array.sum(values);
}
}).out(inline: true).map_reduce(%Q{
if (this.value > 1) {
emit(this.value, this._id);
}
}, %Q{
function(element_count, user_ids) {
return user_ids.length;
}
}).out(inline: true)
This gives me an "undefined method `map_reduce'" error. I couldn't find the answer in the docs. Any help would be great.
I calculated the hash using aggregate instead of map-reduce, first grouping by user and then grouping again by element count:
Element.collection.aggregate([
{
"$group" => {
"_id" => "$user_id", "elements_count" => {"$sum" => 1}
}
},
{
"$group" => {
"_id" => "$elements_count", "users_count" => {"$sum" => 1}
}
},
{ "$project" => {
"_id" => 0,
"users_count" => '$users_count',
"elements_count" => '$_id',
}
}
])
This returns the following array:
[
{"users_count"=>3, "elements_count"=>2},
{"users_count"=>4, "elements_count"=>3},
...
]
If needed, the result can also be sorted by adding a $sort stage.
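To check what the two $group stages compute, the same logic can be replayed in plain Ruby on a handful of records. This is only an illustration: the sample data is made up, and each hash stands in for one Element document.

```ruby
# Made-up sample data: one hash per Element document.
elements = [
  { user_id: 'u1' }, { user_id: 'u1' },
  { user_id: 'u2' }, { user_id: 'u2' },
  { user_id: 'u3' }
]

# First $group: number of elements per user.
elements_per_user = elements.each_with_object(Hash.new(0)) do |element, counts|
  counts[element[:user_id]] += 1
end

# Second $group: number of users per element count.
users_per_count = elements_per_user.values.each_with_object(Hash.new(0)) do |count, counts|
  counts[count] += 1
end

# $project: reshape into the form the pipeline returns.
result = users_per_count.map do |elements_count, users_count|
  { 'users_count' => users_count, 'elements_count' => elements_count }
end
```

With this sample, two users own two elements each and one user owns a single element, so result holds one entry per distinct element count.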

mongo-ruby-driver will not create a new document on upsert when there is a custom _id

I want to upsert a document with the mongo-ruby-driver using something like the following:
id = "#{params[:id]}:#{Time.now.strftime("%y%m%d")}"
# db.collection('metrics').insert({'_id' => id})
db.collection('metrics').update(
{ '_id' => id },
{ '$inc' => { "hits" => 1 } },
{ 'upsert' => true }
)
Right now this will only update existing documents, and not create one if it doesn't already exist. The only way it will perform both actions is if I uncomment the insert() command above it.
If I use the mongo console and try and do this upsert directly (without the need for the insert() ) it works as expected.
You should use a symbol instead of a string for the option key (:upsert rather than 'upsert'). This code works:
db.collection('metrics').update(
{ '_id' => id },
{ '$inc' => { "hits" => 1 } },
{ :upsert => true }
)
In fact, you can use symbols almost everywhere. This also works:
db.collection(:metrics).update(
{ :_id => id },
{ :$inc => { :hits => 1 } },
{ :upsert => true }
)
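The reason a string key gets ignored can be shown without a database. The sketch below assumes, as the answer implies, that the driver reads the option with a symbol key; upsert_requested? is a hypothetical stand-in, not a driver method.

```ruby
# Hypothetical stand-in for a method that reads its options by symbol key.
def upsert_requested?(opts = {})
  opts[:upsert] == true
end

with_string_key = upsert_requested?('upsert' => true)  # string key is never looked up
with_symbol_key = upsert_requested?(:upsert => true)   # symbol key is found

puts "string key: #{with_string_key}, symbol key: #{with_symbol_key}"
```

In Ruby, 'upsert' and :upsert are different hash keys, so an option passed under the string key is silently treated as absent by any code that looks up the symbol.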
