I have a Cycle model with two fields: duration (string) and completed (boolean). When a user creates a cycle, they enter the duration (let's say 30 minutes) and the cycle is set to not complete (boolean 0). How do I update that database entry after the cycle duration (30 minutes) has elapsed, to mark the cycle as complete (boolean 1)? Is there a way to handle this with Ruby/Rails code, or do I have to execute a JavaScript function?
The goal is to be able to find and display all completed cycles using Cycle.all(:conditions..) against the SQL database. I wrote a "complete?" method in the Cycle model that compares the age of the cycle to its duration, but this is useless for SQL find methods.
What's the best way to tackle this? Thanks!
Define a rake task that runs something like…
desc "Expire old cycles"
task :cron => :environment do
expired = Cycle.all :conditions => ["expiration < ?", DateTime.now]
expired.each { |c| c.expire! }
end
Where c#expire! is a method that'll mark it as expired in the database. Then set up rake cron to run every N minutes via a cronjob.
If you're comfortable doing this in SQL, you can optimize this by writing a query to do UPDATE cycles SET complete = 1 WHERE expiration < NOW();.
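For completeness, here's a minimal sketch of what expire! might look like, plus the ActiveRecord (Rails 2/3 style) equivalent of that SQL. The expiration column is this answer's assumption; the question's schema calls the boolean completed:

class Cycle < ActiveRecord::Base
  def expire!
    # flip the boolean without running validations
    update_attribute(:completed, true)
  end
end

# set-based equivalent of the raw SQL above:
Cycle.update_all("completed = 1", ["expiration < ?", Time.now])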
You can add another field, let's say Expired_time, which holds the time at which the cycle becomes complete. For example:
# Here is the example record:
Duration   Created_at   Expired_time
30 mins    Time         Time + 30 mins
Now simply compare the current time (now) with Expired_time to check whether the cycle is complete or not.
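A rough sketch of this approach, assuming the duration string parses to a number of minutes (the callback and the completed class method are illustrative):

class Cycle < ActiveRecord::Base
  before_create { |c| c.expired_time = Time.now + c.duration.to_i.minutes }

  # completed cycles become a plain SQL condition:
  def self.completed
    all(:conditions => ["expired_time <= ?", Time.now])
  end
end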
I need to generate invoices once per month, according to the EST/EDT timezone (the clients are throughout the country, but in this industry billing happens in the same timezone).
I'm creating a GenerateInvoicesJob, but I'm having trouble reasoning about a 100% reliable way to generate invoices so that there isn't any possibility of duplicates or confusion with regard to:
Generate invoices only once per month
Let the job run every day
Make the job idempotent
Then there's the final point, which is the hard one for me: how do I ensure that there are no bugs with EST/DST and one hour slipping through?
Here is my clock.rb:
every(1.day, 'GenerateInvoicesJob', tz: 'America/New_York', at: '04:00') do
  Delayed::Job.enqueue GenerateInvoicesJob.new, queue: 'high'
end
And here's the top of my job:
Unit.where(enabled: true)
    .joins(:user)
    .where('last_invoice_generated_at <= ?', Time.now.utc.end_of_month)
    .each do |unit|
  ActiveRecord::Base.transaction do
    unit.update_attributes(
      last_invoice_generated_at: Time.now.utc
    )
    invoice = Invoice.create!(
      ...
    )
    line_item = LineItem.create!(
      ...
    )
  end
end
I realize the direct conditional logic might be wrong, so that's not entirely my question... my main addition to that question is: what's the best way overall to do this so I can make sure that all times in EST are 100% accounted for, including weird off-by-one-hour bugs, etc.? This job is super important, so I'm looking for a way to make it as close to perfect as possible.
On top of that, I'm not sure whether I should store UTC in the database... normally I know you're always supposed to store UTC, but I know UTC doesn't have DST, so I'm afraid that if I store it like that, the job could run at the wrong time and the invoices wouldn't be generated properly.
I would do something like this in the worker:
# `beginning_of_month` because we want to load units that haven't
# been billed this month
units_to_bill = Unit.where(enabled: true)
                    .where('last_invoice_generated_at < ?', Time.current.beginning_of_month)

# `find_each` because it needs less memory
units_to_bill.find_each do |unit|
  # Begin a transaction to ensure all or nothing is updated
  Unit.transaction do
    # reload the unit, because it might have been updated by another
    # task in the meantime
    unit.reload
    # lock the current unit for updates
    unit.lock!
    # check if the condition is still true
    if unit.last_invoice_generated_at < Time.current.beginning_of_month
      # generate invoices
      # invoice = Invoice.create!(
      # line_item = LineItem.create!(
      # last step: update the unit
      unit.update_attributes(
        last_invoice_generated_at: Time.current
      )
    end
  end
end
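To address the DST worry specifically: you can keep storing UTC in the database and still pin the month boundary to the billing timezone by computing it explicitly in that zone. A minimal sketch (the constant name is an assumption):

BILLING_TZ = ActiveSupport::TimeZone['America/New_York']

# beginning of the current month *in the billing zone*, so a DST shift
# can never move the boundary by an hour relative to the invoicing rules
start_of_month = BILLING_TZ.now.beginning_of_month

# ActiveRecord converts this value to UTC when it builds the SQL query,
# so last_invoice_generated_at can stay stored as UTC
units_to_bill = Unit.where(enabled: true)
                    .where('last_invoice_generated_at < ?', start_of_month)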
I have an interesting dilemma. I have an application where, when a new record (in this case a user) is created, it is published (specific information from it, anyway) to another application.
Now I could use User.last and get the latest and greatest. Publishing happens as soon as the record is saved, and it only takes a second. So assume I have 500 users signing up at once.
That's 500 new records published to the second app. For each of those I need to say:
If this user is new, do x with it, else ignore it.
I am using the whenever gem to create a cron job on the second app that checks every 5 seconds for new records. In that time 5 new records could come through, so I need to update the above statement to say:
If the record is 5 seconds or younger do x with it, else ignore it.
Can I do the following:
Users.all.each do |u|
  *if u is 5 seconds or less old*
    do something here
end
I am not sure what the if statement would be, would it be u.created_at <= 5.seconds ??
User.where('created_at >= ?', 5.seconds.ago) will find records that are no more than five seconds old. But it sounds like you might be better off with an API that would push create events to your second app.
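For instance, here's a rough sketch of the push approach with an after_create callback (the endpoint URL and payload fields are assumptions):

require 'net/http'

class User < ActiveRecord::Base
  after_create :publish_to_second_app

  private

  # push the new record to the second app instead of having it poll
  def publish_to_second_app
    Net::HTTP.post_form(URI('http://second-app.example.com/users'),
                        'id' => id.to_s, 'email' => email)
  end
end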
You could. It would look like this:
gauge = Time.now - 5
User.all.each do |u|
  if u.created_at >= gauge
    # do something here
  end
end
The issue is that it takes time to run the loop. Even if you ran a more efficient query, you're still relying on timestamps for a high level of precision.
User.where('created_at >= ?', Time.now - 5)
If there's a delay in the system, say the computer takes 0.5 seconds to get to your query, you'll miss any users created in that 0.5-second gap. Better to add a published column to users, mark each record as published once it's processed, and then check for unpublished records.
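A minimal migration sketch for that flag (the column name and nullable default are assumptions):

class AddPublishedToUsers < ActiveRecord::Migration
  def change
    add_column :users, :published, :boolean
    add_index :users, :published
  end
end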
users = User.where(published: nil)
users.each { |u| u.publish }
Or, as another post mentioned, look for an API that will push create events.
I noticed that Rails can have concurrency issues with multiple servers and would like to force my model to always lock. Is this possible in Rails, similar to the way unique constraints force data integrity? Or does it just require careful programming?
Terminal One
irb(main):033:0* Vote.transaction do
irb(main):034:1* v = Vote.lock.first
irb(main):035:1> v.vote += 1
irb(main):036:1> sleep 60
irb(main):037:1> v.save
irb(main):038:1> end
Terminal Two, while sleeping
irb(main):240:0* Vote.transaction do
irb(main):241:1* v = Vote.first
irb(main):242:1> v.vote += 1
irb(main):243:1> v.save
irb(main):244:1> end
DB Start
select * from votes where id = 1;
id | vote | created_at | updated_at
----+------+----------------------------+----------------------------
1 | 0 | 2013-09-30 02:29:28.740377 | 2013-12-28 20:42:58.875973
After execution
Terminal One
irb(main):040:0> v.vote
=> 1
Terminal Two
irb(main):245:0> v.vote
=> 1
DB End
select * from votes where id = 1;
id | vote | created_at | updated_at
----+------+----------------------------+----------------------------
1 | 1 | 2013-09-30 02:29:28.740377 | 2013-12-28 20:44:10.276601
Other Example
http://rhnh.net/2010/06/30/acts-as-list-will-break-in-production
You are correct that transactions by themselves don't protect against many common concurrency scenarios, incrementing a counter being one of them. There isn't a general way to force a lock; you have to ensure you use it everywhere necessary in your code.
For the simple counter incrementing scenario there are two mechanisms that will work well:
Row Locking
Row locking will work as long as you do it everywhere in your code where it matters. Knowing where it matters may take some experience to develop an instinct for :/. If, as in your code above, you have two places where a resource needs concurrency protection and you only lock in one of them, you will have concurrency issues.
You want to use the with_lock form; this does a transaction and a row-level lock (table locks are obviously going to scale much more poorly than row locks, although for tables with few rows there is no difference, as PostgreSQL (not sure about MySQL) will use a table lock anyway). It looks like this:
v = Vote.first
v.with_lock do
  v.vote += 1
  sleep 10
  v.save
end
The with_lock creates a transaction, locks the row the object represents, and reloads the object's attributes, all in one step, minimizing the opportunity for bugs in your code. However, this does not necessarily help you with concurrency issues involving the interaction of multiple objects. It can work if a) all possible interactions depend on one object, and you always lock that object, and b) the other objects each only interact with one instance of that object, e.g. locking a user row and doing stuff with objects which all belong_to (possibly indirectly) that user object.
Serializable Transactions
The other possibility is to use serializable transactions. Since 9.1, PostgreSQL has "real" serializable transactions. These can perform much better than locking rows (though that is unlikely to matter in the simple counter-incrementing use case).
The best way to understand what serializable transactions give you is this: if you take all the possible orderings of all the (isolation: :serializable) transactions in your app, what happens when your app is running is guaranteed to always correspond with one of those orderings. With ordinary transactions this is not guaranteed to be true.
However, what you have to do in exchange is to take care of what happens when a transaction fails because the database is unable to guarantee that it was serializable. In the case of the counter increment, all we need to do is retry:
begin
  Vote.transaction(isolation: :serializable) do
    v = Vote.first
    v.vote += 1
    sleep 10 # this is to simulate concurrency
    v.save
  end
rescue ActiveRecord::StatementInvalid => e
  sleep rand / 100 # this is NECESSARY in scalable real-world code,
                   # although the amount of sleep is something you can tune.
  retry
end
Note the random sleep before the retry. This is necessary because failed serializable transactions have a non-trivial cost, so if we don't sleep, multiple processes contending for the same resource can swamp the db. In a heavily concurrent app you may need to gradually increase the sleep with each retry. The random is VERY important to avoid harmonic deadlocks: if all the processes sleep the same amount of time, they can get into a rhythm with each other, where they all sleep while the system is idle, then all try for the lock at the same time, and the system deadlocks, causing all but one to sleep again.
When the transaction that needs to be serializable involves interaction with a source of concurrency other than the database, you may still have to use row-level locks to accomplish what you need. An example of this would be when a state machine transition determines what state to transition to based on a query to something other than the db, like a third-party API. In this case you need to lock the row representing the object with the state machine while the third party API is queried. You cannot nest transactions inside serializable transactions, so you would have to use object.lock! instead of with_lock.
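A hedged sketch of that pattern (StateMachineThing and ThirdPartyApi are hypothetical names):

StateMachineThing.transaction do
  thing = StateMachineThing.find(id)
  # lock! takes a row-level lock (SELECT ... FOR UPDATE) and reloads the
  # record, without opening a nested transaction the way with_lock would
  thing.lock!

  # the external call happens while the row is locked, so no other
  # process can run this state transition concurrently
  next_state = ThirdPartyApi.fetch_state(thing)

  thing.update_attributes!(:state => next_state)
end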
Another thing to be aware of is that any objects fetched outside the transaction(isolation: :serializable) should have reload called on them before use inside the transaction.
ActiveRecord always wraps save operations in a transaction.
For your simple case it might be best to just use a SQL update instead of performing logic in Ruby and then saving. Here is an example which adds a model method to do this:
class Vote < ActiveRecord::Base
  def vote!
    # issues UPDATE votes SET vote = vote + 1 WHERE id = ... in one statement
    self.class.update_all("vote = vote + 1", {:id => id})
  end
end
This method avoids the need for locking in your example. If you need more general database locking, see David's suggestion.
You can do the following in your model, like so:
class Vote < ActiveRecord::Base
  validate :handle_conflict, on: :update

  attr_accessible :original_updated_at
  attr_writer :original_updated_at

  def original_updated_at
    @original_updated_at || updated_at
  end

  def handle_conflict
    # If we want to use this across multiple models
    # then extract this to a module
    if @conflict || updated_at.to_f > original_updated_at.to_f
      @conflict = true
      @original_updated_at = nil
      # If two updates are made at the same time, a validation error
      # is displayed along with the fields that changed
      errors.add :base, 'This record changed while you were editing'
      changes.each do |attribute, values|
        errors.add attribute, "was #{values.first}"
      end
    end
  end
end
The original_updated_at is a virtual attribute that is set from the form. handle_conflict runs when the record is updated, and checks whether the updated_at attribute in the database is later than the hidden one (defined on your page). By the way, you should define the following in your app/views/votes/_form.html.erb:
<%= f.hidden_field :original_updated_at %>
If there is a conflict, a validation error is raised.
And if you are using Rails 4, you won't have attr_accessible and will need to add :original_updated_at to your vote_params method in your controller.
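For example, a Rails 4 strong-parameters sketch (the permitted attribute names are assumptions):

def vote_params
  params.require(:vote).permit(:vote, :original_updated_at)
end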
Hopefully this sheds some light.
For a simple +1:
Vote.increment_counter :vote, Vote.first.id
Because vote was used for both the table name and the field, here is how the two are used:
TableName.increment_counter :field_name, id_of_the_row
Foobar.find(1).votes_count returns 0.
In rails console, I am doing:
10.times { Resque.enqueue(AddCountToFoobar, 1) }
My Resque worker:
class AddCountToFoobar
  @queue = :low

  def self.perform(id)
    foobar = Foobar.find(id)
    foobar.update_attributes(:votes_count => foobar.votes_count + 1)
  end
end
I would expect Foobar.find(1).votes_count to be 10, but instead it returns 4. If I run 10.times { Resque.enqueue(AddCountToFoobar, 1) } again, the same thing happens: it only increments votes_count by 4, sometimes 5.
Can anyone explain this?
This is a classic race condition scenario. Imagine that only 2 workers exist and that they each run one of your vote-incrementing jobs. Imagine the following sequence:
Worker 1: load foobar (vote count == 1)
Worker 2: load foobar (vote count == 1, in a separate Ruby object)
Worker 1: increment vote count (now == 2) and save
Worker 2: increment its copy of foobar (vote count now == 2) and save, overwriting what worker 1 did
Although 2 workers ran 1 update job each, the count only increased by 1, because they were both operating on their own copy of foobar that wasn't aware of the change the other worker was making.
To solve this, you could either do an in-place update, i.e.
UPDATE foos SET count = count + 1
or use one of the 2 forms of locking ActiveRecord supports (pessimistic locking & optimistic locking).
The former works because the database ensures that you don't have concurrent updates on the same row at the same time.
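A minimal sketch of the in-place form from within the worker, assuming the counter column is foobars.votes_count:

class AddCountToFoobar
  @queue = :low

  def self.perform(id)
    # issues a single atomic statement:
    #   UPDATE foobars SET votes_count = votes_count + 1 WHERE id = ?
    Foobar.update_counters(id, :votes_count => 1)
  end
end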
Looks like ActiveRecord is not thread-safe in Resque (or rather Redis, I guess). Here's a nice explanation.
As Frederick says, you're observing a race condition. You need to serialize access to the critical section, from the time you read the value through to when you update it.
I'd try to use pessimistic locking:
http://api.rubyonrails.org/classes/ActiveRecord/Transactions/ClassMethods.html
http://api.rubyonrails.org/classes/ActiveRecord/Locking/Pessimistic.html
foobar = Foobar.find(id)
foobar.with_lock do
  foobar.update_attributes(:votes_count => foobar.votes_count + 1)
end
I have run into a problem which I believe is ActiveRecord's fault. I am parsing an XML file which contains jobs. This XML file contains nodes which indicate walltime in the time format 00:00:00. I also have a model which will accept these jobs. However, when the time is larger than an actual 24-hour time, ActiveRecord inserts it as NULL. Examples below:
INSERT INTO `jobs` (`jobid`, `walltime`) VALUES('71413', 'NULL')
INSERT INTO `jobs` (`jobid`, `walltime`) VALUES('71413', '15:24:10')
Any ideas? Thank you!
The standard SQL time and datetime data types aren't intended to store a duration. Probably in agreement with those standards, ActiveRecord's time-attribute assignment logic uses the time-parsing rules of Ruby's native Time class and rejects anything that isn't a valid time of day.
The way to store durations, as you intend, is either:
Store the duration as an integer (e.g. "number of seconds"), or
Store two (date)times, a start and an end, and use date arithmetic on them.
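For example, the second option might look like this on a model. Since end is a reserved word in Ruby, the sketch below uses starts_at/ends_at column names (an assumption):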
class Thing < ActiveRecord::Base
  # ...

  def duration
    ends_at - starts_at
  end

  def duration=(length)
    self.starts_at = Time.now
    self.ends_at = starts_at + length
  end

  # ...
end
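And a sketch of the first option, storing the duration as an integer number of seconds (duration_seconds is an assumed column name):

class Thing < ActiveRecord::Base
  # expose the stored integer as an ActiveSupport::Duration
  def duration
    duration_seconds.seconds
  end

  def duration=(length)
    self.duration_seconds = length.to_i
  end
end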