Concurrency and Mongoid - ruby-on-rails

I'm currently trying my hand at developing a simple web based game using rails and Mongoid. I've ran into some concurrency issues that i'm not sure how to solve.
The issue is i'm not sure how to atomically do a check and take an action based upon it in Mongoid.
Here is a sample of the relevant parts of the controller code to give you an idea of what i'm trying to do:
battle = current_user.battle
battle.submitted = true
battle.save
if Battle.where(opponent: current_user._id, submitted: true, resolving: false).any?
battle.update_attribute(:resolving, true)
#Resolve turn
A battle is between two users, but i only want one of the threads to run the #Resolve turn. Now unless i'm completely off both threads could check the condition one after another, but before setting resolving to true, therefore both end up running the '#Resolve turn' code.
I would much appreciate any ideas on how to solve this issue.
I am however getting an increasing feeling that doing user synchronization in this way is fairly impractical and that there's a better way altogether. So suggestions for other techniques that could accomplish the same thing would be greatly appreciated!

Sounds like you want the mongo findAndModify command which allows you to atomically retrieve and update a row.
Unfortunately mongoid doesn't appear to expose this part of the mongo api, so it looks like you'll have to drop down to the driver level for this one bit:
battle = Battle.collection.find_and_modify(query: {oppenent: current_user._id, ...},
update: {'$set' => {resolving: true})
By default the returned object does not include the modification made, but you can turn this on if you want (pass {:new => true})
The value returned is a raw hash, if my memory is correct you can do Battle.instantiate(doc) to get a Battle object back.

Related

Mongoid identity_map and memory usage, memory leaks

When I executing query
Mymodel.all.each do |model|
# ..do something
end
It uses allot of memory and amount of used memory increases at all the time and at the and it crashes. I found out that to fix it I need to disable identity_map but when I adding to my mongoid.yml file identity_map_enabled: false I am getting error
Invalid configuration option: identity_map_enabled.
Summary:
A invalid configuration option was provided in your mongoid.yml, or a typo is potentially present. The valid configuration options are: :include_root_in_json, :include_type_for_serialization, :preload_models, :raise_not_found_error, :scope_overwrite_exception, :duplicate_fields_exception, :use_activesupport_time_zone, :use_utc.
Resolution:
Remove the invalid option or fix the typo. If you were expecting the option to be there, please consult the following page with repect to Mongoid's configuration:
I am using Rails 4 and Mongoid 4, Mymodel.all.count => 3202400
How can I fix it or maybe some one know other way to reduce amount of memory used during executing query .all.each ..?
Thank you very much for the help!!!!
I started with something just like you by doing loop through millions of record and the memory just keep increasing.
Original code:
#portal.listings.each do |listing|
listing.do_something
end
I've gone through many forum answers and I tried them out.
1st attempt: I try to use the combination of WeakRef and GC.start but no luck, I fail.
2nd attempt: Adding listing = nil to the first attempt, and still fail.
Success Attempt:
#start_date = 10.years.ago
#end_date = 1.day.ago
while #start_date < #end_date
#portal.listings.where(created_at: #start_date..#start_date.next_month).each do |listing|
listing.do_something
end
#start_date = #start_date.next_month
end
Conclusion
All the memory allocated for the record will never be released during
the query request. Therefore, trying with small number of record every
request does the job, and memory is in good condition since it will be
released after each request.
Your problem isn't the identity map, I don't think Mongoid4 even has an identity map built in, hence the configuration error when you try to turn it off. Your problem is that you're using all. When you do this:
Mymodel.all.each
Mongoid will attempt to instantiate every single document in the db.mymodels collection as a Mymodel instance before it starts iterating. You say that you have about 3.2 million documents in the collection, that means that Mongoid will try to create 3.2 million model instances before it tries to iterate. Presumably you don't have enough memory to handle that many objects.
Your Mymodel.all.count works fine because that just sends a simple count call into the database and returns a number, it won't instantiate any models at all.
The solution is to not use all (and preferably forget that it exists). Depending on what "do something" does, you could:
Page through all the models so that you're only working with a reasonable number of them at a time.
Push the logic into the database using mapReduce or the aggregation framework.
Whenever you're working with real data (i.e. something other than a trivially small database), you should push as much work as possible into the database because databases are built to manage and manipulate big piles of data.

Updating a lot of records frequently

I have a Rails 3 app that has several hundred records in a mySQL-DB that need to be updated multiple times each hour. The actual updating is done through delayed_job which is triggered in controller-logic (checking if enough time has passed since the last update, only then sth. happens).
Each update is slow, it can take up to a second in some cases (although it averages at 3 - 5 updates/sec.).
Code looks like this:
class Thing < ActiveRecord::Base
...
def self.scheduled_update
Thing.all.each do |t|
...
t.some_property = new_value
t.save
end
end
end
I've observed that the execution stalls after 300 - 400 records and then the delayed job just seems to hang and times out eventually (entries in delayed_job.log). After a while the next one starts, also fails, and so forth, so not all records get updated.
What is the proper way to do this?
How does Rails handle database-connections when used like that? Could it be some timeout issue that is not detected/handled properly?
There must be a default way to do this, but couldn't find anything so far..
Any help is appreciated.
Another options is update_all.
Rails is a bad choice for mass data records. See if you can create a sql stored procedure or some other way that would avoid active record.
Use object.save_with_validation(false) if you are ok with skipping validations altogether.
When finding records, use :select => 'a,b,c,other_fields' to limit the fields you want ('a', 'b', 'c' and 'other' in this example).
Use :include for eager loading when you are initially selecting and joining across multiple tables.
So I solved my problem.
There was some issue with the rails-version I was using (3.0.3), the Timeout was caused by some bug I suspect. Updating to a later version of the 3.0.x branch solved it and everything runs perfectly now.

Is this a race condition issue in Rails 3?

Basically I have this User model which has certain attributes say 'health' and another Battle model which records all the fight between Users. Users can fight with one another and some probability will determine who wins. Both will lose health after a fight.
So in the Battle controller, 'CREATE' action I did,
#battle = Battle.attempt current_user.id, opponent.id
In the Battle model,
def self.attempt current_user.id, opponent_id
battle = Battle.new({:user_id => current_user.id, :opponent_id => opponent_id})
# all the math calculation here
...
# Update Health
...
battle.User.health = new_health
battle.User.save
battle.save
return battle
end
Back to the Battle controller, I did ...
new_user_health = current_user.health
to get the new health value after the Battle. However the value I got is the old health value (the health value before the Battle).
Has anyone face this kind of problem before ???
UPDATE
I just add
current_user.reload
before the line
new_user_health = current_user.health
and that works. Problem solved. Thanks!
It appears that you are getting current_user, then updating battle.user and then expecting current_user to automatically have the updated values. This type of thing is possible using Rails' Identity Map but there are some caveats that you'll want to read up on first.
The problem is that even though the two objects are backed by the same data in the database, you have two objects in memory. To refresh the information, you can call current_user.reload.
As a side note, this wouldn't be classified a race condition because you aren't using more than one process to modify/read the data. In this example, you are reading the data, then updating the data on a different object in memory. A race condition could happen if you were using two threads to access the same information at the same time.
Also, you should use battle.user, not battle.User like Wayne mentioned in the comments.

Rails - given an array of Users - how to get a output of just emails?

I have the following:
#users = User.all
User has several fields including email.
What I would like to be able to do is get a list of all the #users emails.
I tried:
#users.email.all but that errors w undefined
Ideas? Thanks
(by popular demand, posting as a real answer)
What I don't like about fl00r's solution is that it instantiates a new User object per record in the DB; which just doesn't scale. It's great for a table with just 10 emails in it, but once you start getting into the thousands you're going to run into problems, mostly with the memory consumption of Ruby.
One can get around this little problem by using connection.select_values on a model, and a little bit of ARel goodness:
User.connection.select_values(User.select("email").to_sql)
This will give you the straight strings of the email addresses from the database. No faffing about with user objects and will scale better than a straight User.select("email") query, but I wouldn't say it's the "best scale". There's probably better ways to do this that I am not aware of yet.
The point is: a String object will use way less memory than a User object and so you can have more of them. It's also a quicker query and doesn't go the long way about it (running the query, then mapping the values). Oh, and map would also take longer too.
If you're using Rails 2.3...
Then you'll have to construct the SQL manually, I'm sorry to say.
User.connection.select_values("SELECT email FROM users")
Just provides another example of the helpers that Rails 3 provides.
I still find the connection.select_values to be a valid way to go about this, but I recently found a default AR method that's built into Rails that will do this for you: pluck.
In your example, all that you would need to do is run:
User.pluck(:email)
The select_values approach can be faster on extremely large datasets, but that's because it doesn't typecast the returned values. E.g., boolean values will be returned how they are stored in the database (as 1's and 0's) and not as true | false.
The pluck method works with ARel, so you can daisy chain things:
User.order('created_at desc').limit(5).pluck(:email)
User.select(:email).map(&:email)
Just use:
User.select("email")
While I visit SO frequently, I only registered today. Unfortunately that means that I don't have enough of a reputation to leave comments on other people's answers.
Piggybacking on Ryan's answer above, you can extend ActiveRecord::Base to create a method that will allow you to use this throughout your code in a cleaner way.
Create a file in config/initializers (e.g., config/initializers/active_record.rb):
class ActiveRecord::Base
def self.selected_to_array
connection.select_values(self.scoped)
end
end
You can then chain this method at the end of your ARel declarations:
User.select('email').selected_to_array
User.select('email').where('id > ?', 5).limit(4).selected_to_array
Use this to get an array of all the e-mails:
#users.collect { |user| user.email }
# => ["test#example.com", "test2#example.com", ...]
Or a shorthand version:
#users.collect(&:email)
You should avoid using User.all.map(&:email) as it will create a lot of ActiveRecord objects which consume large amounts of memory, a good chunk of which will not be collected by Ruby's garbage collector. It's also CPU intensive.
If you simply want to collect only a few attributes from your database without sacrificing performance, high memory usage and cpu cycles, consider using Valium.
https://github.com/ernie/valium
Here's an example for getting all the emails from all the users in your database.
User.all[:email]
Or only for users that subscribed or whatever.
User.where(:subscribed => true)[:email].each do |email|
puts "Do something with #{email}"
end
Using User.all.map(&:email) is considered bad practice for the reasons mentioned above.

ActiveRecord touch(:column) always also updates :updated_at

I've got pretty much the same problem as this guy: http://www.ruby-forum.com/topic/197440.
I'm trying to touch a column (:touched_at) without having it auto-update :updated_at, but watching the SQL queries, it always updates both to the current time.
I thought it might be something to do with the particular model I was using it on, so I tried a couple different ones with the same result.
Does anyone know what might be causing it to always set :updated_at when touching a different column? touch uses write_attribute internally, so it shouldn't be doing this.
Edit:
Some clarification... the Rails 2.3.5 docs for touch state that "If an attribute name is passed, that attribute is used for the touch instead of the updated_at/on attributes." But mine isn't acting that way. Perhaps it's a case of the docs having drifted away from the actual state of the code?
You pretty much want to write custom SQL:
def touch!
self.class.update_all({:touched_at => Time.now.utc}, {:id => self.id})
end
This will generate this SQL:
UPDATE posts SET touched_at = '2010-01-01 00:00:00.0000' WHERE id = 1
which is what you're after. If you call #save, this will end up calling #create_with_timestamps or #update_with_timestamps, and these are what update the updated_on/updated_at/created_on/created_at columns.
By the way, the source for #touch says it all.

Resources