MongoDB - Mongoid map reduce basic operation - ruby-on-rails

I have just started with MongoDB and mongoid.
The biggest problem I'm having is understanding the map/reduce functionality to be able to do some very basic grouping and such.
Let's say I have a model like this:
class Person
  include Mongoid::Document
  field :age, type: Integer
  field :name
  field :sdate
end
That model would produce objects like these:
#<Person _id: 9xzy0, age: 22, name: "Lucas", sdate: "2013-10-07">
#<Person _id: 9xzy2, age: 32, name: "Paul", sdate: "2013-10-07">
#<Person _id: 9xzy3, age: 23, name: "Tom", sdate: "2013-10-08">
#<Person _id: 9xzy4, age: 11, name: "Joe", sdate: "2013-10-08">
Could someone show how to use mongoid map reduce to get a collection of those objects grouped by the sdate field? And to get the sum of ages of those that share the same sdate field?
I'm aware of this: http://mongoid.org/en/mongoid/docs/querying.html#map_reduce
But it would help to see that applied to a real example. Where does that code go (in the model, I guess)? Is a scope needed, etc.?
I can make a simple search with Mongoid, get the array, and manually construct anything I need, but I guess map reduce is the way to go here. And I imagine the JS functions mentioned on the Mongoid page are fed to the DB, which runs those operations internally. Coming from Active Record, these new concepts are a bit strange.
I'm on Rails 4.0, Ruby 1.9.3, Mongoid 4.0.0, and MongoDB 2.4.6 on Heroku (MongoLab), though locally I have 2.0, which I should update.
Thanks.

Here are the examples from http://mongoid.org/en/mongoid/docs/querying.html#map_reduce, adapted to your situation, with comments added to explain.
map = %Q{
  function() {
    emit(this.sdate, { age: this.age, name: this.name });
    // here "this" is the record that map
    // is going to be executed on
  }
}
reduce = %Q{
  function(key, values) {
    // this will be executed for every group that
    // has the same sdate value
    var result = { avg_of_ages: 0 };
    var sum = 0;      // sum of all ages
    var totalnum = 0; // total number of people
    values.forEach(function(value) {
      sum += value.age;
      totalnum += 1;
    });
    result.avg_of_ages = sum / totalnum; // finding the average
    return result;
  }
}
results = Person.map_reduce(map, reduce).out(inline: 1)
# each result is a hash like { "_id" => sdate, "value" => { "avg_of_ages" => ... } }
first_average = results.first["value"]["avg_of_ages"]
results.each do |result|
  # do whatever you want with result
end
Though I would suggest you use aggregation rather than map reduce for such a simple operation. The way to do this is as follows:
results = Person.collection.aggregate([
  { "$group" => { "_id" => { "sdate" => "$sdate" },
                  "avg_of_ages" => { "$avg" => "$age" } } }
])
and the result will be almost identical to the map reduce version, and you will have written a lot less code.
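Since the question also asks for the sum of ages per sdate, here is a sketch of the same aggregation with $sum instead of $avg (untested, using the sample documents from the question):
sums = Person.collection.aggregate([
  { "$group" => { "_id" => { "sdate" => "$sdate" },
                  "sum_of_ages" => { "$sum" => "$age" } } }
])
# with the four sample documents above this should return something like:
# [{ "_id" => { "sdate" => "2013-10-07" }, "sum_of_ages" => 54 },
#  { "_id" => { "sdate" => "2013-10-08" }, "sum_of_ages" => 34 }]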

Related

How to split an array of objects into subarrays depending on the field value in Rails and Mongodb

I want to get several arrays of objects, grouped by the month (and year) of one of their property values.
I have class Request like this:
class Request
include Mongoid::Document
include MongoidDocument::Updated
field :name, type: String
field :start_date, type: DateTime
#...
end
And I want the result to be an array of hashes, where each element looks like:
{month: m_value, year: y_value, requests: requests_with_m_value_as_month_and_y_value_as_year_in_start_date_field}
Can someone help me with this?
You can use Aggregation Pipeline to get the data in the right shape back from MongoDB:
db.requests.aggregate([
  {
    $group: {
      _id: {
        year: { $year: "$start_date" },
        month: { $month: "$start_date" }
      },
      requests: { $push: "$$ROOT" }
    }
  },
  {
    $project: {
      _id: 0,
      year: "$_id.year",
      month: "$_id.month",
      requests: "$requests"
    }
  }
])
Obviously, this is using just the REPL, and you will have to translate it to the DSL provided by Mongoid. Based on what I could find, it should be possible to just get the underlying collection and call aggregate on it:
Request.collection.aggregate([...])
Now you just need to take the query and convert it into something that Mongoid will accept. I think you just need to add a bunch of quotes around the object keys, but I don't have the environment set up to try that myself; a sketch of that translation follows.
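For reference, a possible (untested) Mongoid translation of the same pipeline, with the keys quoted as described:
results = Request.collection.aggregate([
  {
    "$group" => {
      "_id" => {
        "year"  => { "$year"  => "$start_date" },
        "month" => { "$month" => "$start_date" }
      },
      "requests" => { "$push" => "$$ROOT" }  # note: $$ROOT requires MongoDB 2.6+
    }
  },
  {
    "$project" => {
      "_id"      => 0,
      "year"     => "$_id.year",
      "month"    => "$_id.month",
      "requests" => "$requests"
    }
  }
])
# each element should come back as a hash like
# { "year" => 2013, "month" => 10, "requests" => [...] }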

Exporting Multiple HTML tables to excel such that each HTML table is in a new column

I'm trying to use alasql to export a set of HTML tables into an excel document.
The documentation has code that looks similar to this:
var data1 = alasql('SELECT * FROM HTML("#dev-table",{headers:false})');
var data2 = alasql('SELECT * FROM HTML("#dev2-table",{headers:false})');
var data3 = alasql('SELECT * FROM HTML("#dev3-table",{headers:false})');
//var data4 = alasql('SELECT * FROM HTML("#dev2-table",{headers:true})');
var data = data1.concat(data2, data3);
alasql('SELECT * INTO XLS("data.xls",{headers:false}) FROM ?', [data]);
The problem is that this code concatenates the data1 and data2 arrays so that all of the data is printed in the same column. This is not the result I desire. I want "data1" to go into column "A" and data2 to go into column "B".
I've looked through the documentation and am unsure how to get the desired result. I'm aware of the existence of "options" that include fields for specifying columns based on the data itself, but none of those examples are what I want. If this is not possible using alasql, I'm willing to use a different library or framework for this.
Based on this JSFiddle, http://jsfiddle.net/95j0txwx/7/
$scope.items = [{
  name: "John Smith",
  email: "j.smith#example.com",
  dob: "1985-10-10"
}, {
  name: "Jane Smith",
  email: "jane.smith#example.com",
  dob: "1988-12-22"
},
...
I would guess that your data is not formatted correctly to be inserted.
EDIT: JSFiddle is from documentation. https://github.com/agershun/alasql/wiki/XLSX

Filter Active Record by relation but not the relation itself

I would like to filter a model with a has_many relation by a value of its relation.
This is easily achievable by using Foo.includes(:bars).where(bars: { foobar: 2 }).
If we had an object like this in our database:
{
id: 1,
bars: [
{ id: 4, foobar: 1 },
{ id: 5, foobar: 2 }
]
}
This query would only return this:
{
id: 1,
bars: [
{ id: 5, foobar: 2 }
]
}
I would like to still filter my records by this relation, but receive all records in the relation.
I was able to achieve this by using something like Foo.includes(:bars).joins('LEFT JOIN bars AS filtered_bars ON filtered_bars.foo_id = foos.id').where(filtered_bars: { foobar: 2 }), but this does not look very pretty.
Is there a better way to do this?
This is the magic of includes in action: since you are referencing bars in where, it decides to use a LEFT JOIN and will not preload anything else. You have to explicitly join and preload the association here:
Foo.joins(:bars).where(bars: {foobar: 2}).uniq.preload(:bars)
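To illustrate with the example document above (a sketch; uniq removes the duplicate foo rows produced by the join, and preload then loads each foo's complete bars collection in a separate query):
foos = Foo.joins(:bars).where(bars: { foobar: 2 }).uniq.preload(:bars)
foos.first.bars.map(&:id)  # => [4, 5], i.e. all bars, not just the matching one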
Some time ago I made a talk on those magic methods, you can find it here: https://skillsmatter.com/skillscasts/6731-activerecord-vs-n-1

Breeze.js OData - Complex Type fails --> Cannot call method '_createInstanceCore' of null

We are currently developing a small HTML/JavaScript application with breeze.js (version 1.3.4). We configured it to use the OData protocol to query the entities.
With a simple entity it just works fine. If we query a complex entity (a contact entity with two complex type properties, for phone numbers and addresses), we receive the following error:
"TypeError: Cannot call method '_createInstanceCore' of null
at ctor.startTracking (<ServerAddress>/scripts/breeze.debug.js:14086:49)
at Array.forEach (native)
at ctor.startTracking (<ServerAddress>1/scripts/breeze.debug.js:14069:12)
at new ctor <ServerAddress>/scripts/breeze.debug.js:2952:52)
at proto._createEntityCore (<ServerAddress>1/scripts/breeze.debug.js:6478:9)
at mergeEntity <ServerAddress>/scripts/breeze.debug.js:12458:39)
at processMeta (<ServerAddress>/scripts/breeze.debug.js:12381:24)
at visitAndMerge (<ServerAddress>/scripts/breeze.debug.js:12361:16)
at <ServerAddress>/scripts/breeze.debug.js:12316:33
at Array.map (native)
From previous event:
at executeQueryCore (<ServerAddress>/scripts/breeze.debug.js:12290:77)
at proto.executeQuery (<ServerAddress>/scripts/breeze.debug.js:11243:23)
at DataContext.executeCachedQuery (<ServerAddress>/App/services/datacontext.js:138:33)
at DataContext.getContactsBySearchParams (<ServerAddress>/App/services/datacontext.js:111:25)
at Search.searchCmd.ko.asyncCommand.execute (<ServerAddress>/App/viewmodels/search.js:34:38)
at Search.ko.asyncCommand.self.execute (<ServerAddress>/scripts/knockout.command.js:57:29)
at HTMLButtonElement.ko.bindingHandlers.event.init (<ServerAddress>/scripts/knockout-2.2.1.debug.js:2318:66)"
While debugging the code, we see that the dataType field of the complex property instance is null:
val = prop.dataType._createInstanceCore(entity, prop.name);
We can also see that the complexTypeName has a strange value formatting like:
<ComplexTypeName>):#<NameSpace>
Another thing we noticed concerning the strange complex type name: the entity's property is a collection of complex types (a contact may have multiple addresses). The check on line 14085 always returns isScalar = true, but a complex array should be created instead.
Is there a problem with the OData Metadata for complex types? How could we solve this issue?
Thank you in advance for your answer.
Cheers,
Marc
Breeze currently does support both scalar complex types and arrays of complex types.
But there is a bug with using EntityManager.createEntity to create an entity and its complex type values in a single pass. This will be fixed in the next release in about a week.
So for now the following does NOT work. (Assume 'location' in the examples below is a complex property of type 'Location', itself with several other properties.)
var supplier = em.createEntity("Supplier",
{ companyName: "XXX", location: { city: "LA" } }
);
but the following will (assuming you are using the breeze Angular/backingStore impl - the knockout code would look a bit different):
var supplier = em.createEntity("Supplier", { companyName: "XXX" });
supplier.location.city = "San Francisco";
supplier.location.postalCode = "91333";
or the following
var supplier = em.createEntity("Supplier", { companyName: "XXX" });
var locationType = em.metadataStore.getEntityType("Location");
supplier.location = locationType.createInstance(
{ city: "Boston", postalCode: "12345" }
);
I am seeing the same problem with breeze 1.4.5.
My metadata looks like:
{ "shortName":"Phone",
...
"dataProperties":[ {"name":"phoneNumber",
"complexTypeName":"PhoneNumber#mynamespace",
"isScalar":true }]
...
},
{"shortName":"PhoneNumber",
"namespace":"mynamespace",
"isComplexType":true,
"dataProperties":[ ... ]
}
My client code makes a call:
var newPhone = manager.createEntity('Phone', {phoneNumber:{num: "234-2342"}});
(there are more properties in the PhoneNumber complex type, but you get the picture).
The breeze code (same call stack as the original poster's) tries to dereference the dataType field, which is not defined, and throws an exception:
if (prop.isDataProperty) {
  if (prop.isComplexProperty) {
    if (prop.isScalar) {
      val = prop.dataType._createInstanceCore(entity, prop);
    } else {
      val = breeze.makeComplexArray([], entity, prop);
    }
I went through the Zza sample's schema and found no examples of complex data properties. The Northwind schema included with the samples bundle does, but I'm not sure how to get it to work with my schema.

Test mongodb map, reduce function in rails

When I use Mongoid map reduce in Rails, I pass the map and reduce functions as strings, for example:
def group
  Order.collection.map_reduce(map, reduce, :out => 'output_collection')
end

def map
  "function()
  {
    var key = {game_id: this.game_id, year: this.date.getFullYear()};
    emit(key, {count: 1, score: this.score});
  }"
end
def reduce
  "function(key, values)
  {
    var result = {sum: 0, total_score: 0, average: 0};
    values.forEach(function(value)
    {
      result.sum += value.count;
      result.total_score += value.score;
    });
    // strictly, averages belong in a finalize function, since reduce may run more than once per key
    result.average = result.total_score / result.sum;
    return result;
  }"
end
Is it possible to test the map and reduce functions in Rails?
This is a simple example, but my project's functions are more complicated, and I feel they are hard to maintain.
Thanks for any advice.
For a test of a function that takes place in the database, your testing methodology is:
Create a base set of data
Run the function
Compare the results to your expected
So, in this case, you'll want to make sure the sum/average/total_score match what you expect.
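A minimal RSpec sketch of that methodology (assuming the Order model and the group/map/reduce methods from the question, with group exposed as a class method, and that the output collection is reachable through Order.collection.database; adjust to your driver and schema):
describe "Order map/reduce" do
  it "sums counts and scores per game and year" do
    # 1. create a base set of data
    Order.create!(game_id: 1, date: Time.utc(2013, 10, 7), score: 10)
    Order.create!(game_id: 1, date: Time.utc(2013, 10, 8), score: 5)

    # 2. run the function (here assumed to be exposed as Order.group)
    Order.group

    # 3. compare the results to what you expect
    result = Order.collection.database['output_collection'].find.first
    expect(result['value']['sum']).to eq(2)
    expect(result['value']['total_score']).to eq(15)
  end
end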
