How to safely modify an in-memory Mongoid model? - ruby-on-rails

I'm using the Mongoid gem on a Rails project and I'm puzzled by how to modify a model in memory without ever saving it, so that I don't modify the db. I'm trying to modify an attribute on a model loaded in memory, but it doesn't work, as shown below:
mymodel = MyModel.where('some criteria')
mymodel.first.some_attribute = 0
mymodel.first.some_attribute == 0 # => false
So I guess Mongoid reloads from the db each time we call first. Even looping over each entry and setting an attribute has no effect: if I loop again, all the attributes I set are back to their original values. Is there a way to commit the transaction and force mymodel to stay loaded in memory? It's hard for me to use the proper terminology, so I hope you get what I'm saying.

Calling first runs a query, so this is two distinct queries:
M.first
M.first
and two hits to the database that produce two completely different model instances. Similarly, calling M.each { ... } (or some other iteration method) twice will hit the database twice and produce two sets of completely distinct model instances. You could have a look at what #object_id says to verify this.
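For instance, a quick console check makes the distinct instances visible (a minimal sketch, with M standing in for any Mongoid model):
a = M.first
b = M.first
a.object_id == b.object_id # => false, two separate in-memory objects for the same document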
If you want to load the objects and do things to them then be explicit about it:
m = M.first
m.attr = 0
# Now m.attr == 0 will be true and you can call m.save to update the database
and for iterating, you can call #to_a to execute the query and pull a bunch of model instances from the database into local memory:
ms = M.some_query.to_a
ms.each { ... }
ms.each { ... } # iterates over the same model instances as the first ms.each
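Applied to the original problem, a minimal sketch (reusing MyModel and some_attribute from the question) would be:
mymodels = MyModel.where('some criteria').to_a # one query; results now held in memory
mymodels.first.some_attribute = 0
mymodels.first.some_attribute == 0 # => true; same instance, and nothing was written to the db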

Related

rails activerecord return array of results by comparing 2 tables from 2 different databases

So I am working on this existing rails application where I am accessing 2 tables from 2 different databases.
scope :comp_ids_in, lambda {|comp_ids| where(:comp_id => comp_ids)}
company_info = CompanyInfo.comp_ids_in(my_array_of_ids)
The above company_info returns an ActiveRecord::Relation of CompanyInfo objects.
Now I want to compare the above company_info objects with another table in a different database, and return all the found results in one object.
My existing attempt in my controller returns only 1 result at a time:
company_info.each do |info|
  # RemoteInfo is an ActiveRecord class which accesses records from a different database
  remote_info = RemoteInfo.where(username: current_user.username, property_code: info.org_id, chain_code: info.site_id)
end
I want all the results stored in the remote_info object, so that I can loop through it and get any information that was returned.
I would appreciate it if someone could suggest an efficient approach.
If you are not worried about the order, it would be more efficient to just make one query to get all the remote info:
remote_info = RemoteInfo.where(
  username: current_user.username,
  property_code: company_info.collect(&:org_id),
  chain_code: company_info.collect(&:site_id)
)
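The result is a single relation that you can loop over, as the question asks; a minimal usage sketch (attribute names taken from the question):
remote_info.each do |info|
  # info.property_code, info.chain_code, etc. are available on each matching row
end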

Postgres/Rails => is object guaranteed to be persisted if I do .save?

If I have a model and save it like this:
model = Website.new
model.attr = 1
model.id = 1
model.save #assume no errors in saving
then retrieve it like this:
model2 = Website.find(1)
Will model2 always be returned? Ignoring errors saving to the database.
Is there a possible scenario where the data is not yet committed to the database, and as a result the find results in no records found? Do I need to delay the find to guarantee the row is returned?
Assuming no database errors, and assuming you haven't overridden save on Website, the only race condition you'd have is if you try to access the object (via find or otherwise) in the milliseconds before the record is created in the database.
So, to directly answer your question - yes, it's possible - but given a single database (e.g. no read-only slaves or anything like that), it's highly, highly unlikely.
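In other words, within a single database and a single process the pattern from the question works as written; a minimal sketch reusing the question's model and attribute:
model = Website.new
model.attr = 1
model.id = 1
if model.save               # save returns true once the row has been written
  model2 = Website.find(1)  # the row exists by this point, so find succeeds
end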

How to find existence of association with certain attributes with optimal SQL?

Suppose you have two ActiveRecord models - Problem and ProblemSet.
You have a @problem_set object, and you want to check if it has a problem with a certain title attribute.
You could say:
@problem_set.problems.where(:title => "Adding Numbers").any?
which returns true, by running the optimal SQL:
SELECT COUNT(*) FROM "problems" INNER JOIN "problem_sets_problems" ON "problems"."id" = "problem_sets_problems"."problem_id" WHERE "problem_sets_problems"."problem_set_id" = 1 AND "problems"."title" = 'Adding Numbers'
However, if @problem_set was only in memory, i.e. you got @problem_set by:
@problem_set = ProblemSet.new
@problem_set.problems << Problem.new(:title => "Adding Numbers")
Then you will not be able to find the problem (i.e. it returns false!). This is because the following SQL is run:
SELECT COUNT(*) FROM "problems" INNER JOIN "problem_sets_problems" ON "problems"."id" = "problem_sets_problems"."problem_id" WHERE "problem_sets_problems"."problem_set_id" IS NULL AND "problems"."title" = 'Adding Numbers'
A possible way to perform the check correctly for both persistent objects and in-memory objects is:
@problem_set.problems.map(&:title).include? "Adding Numbers"
Which correctly returns true in both cases. However, in the case of a persistent object, it runs the non-optimal SQL (which retrieves all problems):
SELECT "problems".* FROM "problems" INNER JOIN "problem_sets_problems" ON "problems"."id" = "problem_sets_problems"."problem_id" WHERE "problem_sets_problems"."problem_set_id" = 1
Question: Is there a way to use the same code to check for both persistent objects and in-memory objects, while running optimal SQL code?
Note that a solution which checks for object persistence is permitted (but I don't see how to check the dirtiness of the collection). However, it should still work if a persistent object is modified (i.e. the association collection becomes dirty, and therefore the result from an SQL query would be out-of-date).
OK, I finally worked it out.
Browsing through obscure Rails methods, I found the persisted? method. You can use @problem_set.persisted? to check whether the object is persistent or in-memory only.
So the answer is:
if @problem_set.persisted?
  @problem_set.problems.where(:title => "Adding Numbers").any?
else
  @problem_set.problems.map(&:title).include? "Adding Numbers"
end
The remaining question is, what about persistent objects where the association collection is dirty? Well, by experimentation, I found out that it doesn't really happen. When you add an object to the collection, for example, one of the following:
@problem_set.problems << Problem.new(:title => "hello")
@problem_set.problems.push Problem.new(:title => "hello")
ActiveRecord immediately saves the data. Similarly, it immediately destroys the row from the associations table when you say:
@problem_set.problems.delete(@problem_set.problems[2])
That means, although there is no such method as @problem_set.problems_changed?, if there were, the current implementation would result in problems_changed? always returning false.
In effect, the collection<<, collection.push, and collection.delete methods auto-save (i.e. they call save automatically).
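To keep that branch out of calling code, you could wrap it in a small model method; a minimal sketch (the has_problem_titled? name is made up for illustration, it is not from the original answer):
# In ProblemSet
def has_problem_titled?(title)
  if persisted?
    # persisted: let the database do the filtering with the optimal COUNT query
    problems.where(:title => title).any?
  else
    # in-memory only: nothing is in the db yet, so check the loaded collection
    problems.map(&:title).include?(title)
  end
end
# Usage: @problem_set.has_problem_titled?("Adding Numbers")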

Grails: how to structure transactions when I want to continue validating even after the transaction has already failed

My users are uploading a csv or xls or whatever and each line is going to be an instance of a domain object I save. If any of the lines fail I want the whole thing rolled back, but I also want to return errors for any lines that will fail later. Let's make an example:
Domain class:
class MyDomainClass {
    String fieldOne
    BigDecimal fieldTwo
}
Input:
ThisLineWorks,4.4
ThisLineFails,BecauseOfThis
How would I also get an error, for this line as well considering the last one would have rolled back the transaction already?
Fantasy Output:
OK|ThisLineWorks,4.4
field 2 isn't a number|ThisLineFails,BecauseOfThis
field 2 isn't a number|How would I also get an error, for this line as well considering the last one would have rolled back the transaction already?
You can validate the objects without having to save them: ( http://grails.org/doc/2.0.x/guide/validation.html#validatingConstraints). So in a service you can create all of the objects, then validate all of the objects, then save all of the objects. Something similar to:
def serviceMethod(data) {
    def listOfObjects = createObjectsFromData(data)
    listOfObjects*.validate()
    def anErrorOccurred = listOfObjects.find { it.hasErrors() } != null
    if (anErrorOccurred) {
        return listOfObjects
    }
    listOfObjects*.save(validate: false) // you could use validate: false or leave it out; since we've already validated, there's no need to re-validate
}
This way you can collect all of your errors and not have to worry about rolling back the transaction. The problem with this setup is that you'll be creating N objects and holding onto all of them. If your file is longer than 100k rows (a slightly educated guess on where you'll start to suffer), this might cause some performance issues. If you don't like the above method, you could handle the transaction manually:
( http://grails.org/doc/2.0.x/ref/Domain%20Classes/withTransaction.html)
def serviceMethod(data) {
    MyDomainClass.withTransaction { status ->
        def listOfObjects = []
        data.each {
            def domainObject = createObjectFromData(it)
            domainObject.save()
            listOfObjects << domainObject // keep the object itself, since save() returns null when validation fails
        }
        def anErrorOccurred = listOfObjects.find { it.hasErrors() } != null
        if (anErrorOccurred) {
            status.setRollbackOnly() // rolls back everything surrounded by the withTransaction {} block
        }
    }
}
You're still holding onto all of the objects here (since you want to retrieve ALL errors that occur). One way I can think of to avoid holding onto all of the objects would be to create them one at a time and validate them one by one, adding errors to a list when applicable; but then you'd have to recreate all of the objects once they all pass validation, which doesn't seem very efficient either.
Here is what I am thinking:
1. Set a flag that signals ALL CLEAR, and commit the transaction manually at the end if all is clear.
or
2. Commit each line in a separate transaction, capturing the errors of failed lines and skipping over failures.

Rails: difference in the objects created by the .find(:id) and .where() methods

What is the difference in the objects created with these 2 methods:
tec = Technique.find(6)
tec2 = Technique.where(:korean => 'Jok Sul')
The data returned for each is exactly the same, yet the first object will respond perfectly to an inherited method like update_attributes while the second object will give an error of method not found.
When I do tec.class and tec2.class one is an ActiveRecord::Relation and the other doesn't give me a class at all, it just prints out the content of the object.
Maybe when you use the .where method you get an array, even if there is only one match and therefore you always have to issue the .each method to get at the contents? But that makes it hard to deal with when you want to update records, etc.
Can someone clarify this for me? Specifically, how to deal with matches found through the .where method.
Thanks.
Try:
tec2 = Technique.where(:korean => 'Jok Sul').first
Good question.
tec_scope = Technique.where(:korean => 'Jok Sul') # create an internal query
Remember, here only the query is created; it is not executed. You can programmatically build on top of this query if you wish. The scope (or query, if you like) will be executed in one of 2 ways: "implicit" or "explicit". The implicit way of running the query happens, for example, in the console, which invokes a method on the scope that automatically runs the query for you. This won't happen in your controllers unless you run it explicitly, e.g.:
tec_scope.all # returns array
tec_scope.first # returns one element
Scopes are just adding where clauses/predicates to your query. It's query building and delaying the execution till it is needed.
However,
tec_objects = Technique.find(6) # explicitly runs a query and returns one object (in this case)
This will explicitly run the query there and then. It is a question of the timing of execution of the query.
The difference is subtle but very important.
This hasn't got anything to do with whether you get one result or an array.
Technique.find([4,5]) # will return an array
Technique.find(4) # will return one object
Technique.where(:some_key => "some value").all # will return an array
Technique.where(:id => 5).first # will return one object
The difference is in the timing of the execution of the query. Don't let the console fool you into believing there is no difference; the console is implicitly firing the query for you :)
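A quick way to see this laziness in action (a minimal sketch; the extra id clause is made up purely for illustration):
scope = Technique.where(:korean => 'Jok Sul') # no query has run yet
scope = scope.where(:id => [4, 5, 6])         # still no query, just a bigger relation
scope.to_sql                                  # shows the SQL that would run, without running it
scope.to_a                                    # here the query actually executes and returns an array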
The find(6) returns a single object, because you're specifying the object ID in the database, which is guaranteed to be unique by convention.
The where call returns a collection, which may be only 1 item long, but it still returns a collection, not a single object.
You can reveal this difference. Using your example code, if you call tec.class vs. tec2.class, I think you'll find that they aren't the same class of object, as you already noticed.
That is, the methods available on a collection of objects are different from the methods available on an instance of that object.
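As for dealing with matches found through .where, you can either take the first match or iterate over all of them; a minimal sketch (the new value 'Jok Sool' is made up for illustration):
# update a single match
tec2 = Technique.where(:korean => 'Jok Sul').first
tec2.update_attributes(:korean => 'Jok Sool') if tec2
# or update every match
Technique.where(:korean => 'Jok Sul').each do |t|
  t.update_attributes(:korean => 'Jok Sool')
end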
