Running delayed jobs in loop [closed] - ruby-on-rails

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 7 years ago.
I am trying to put a function in delay. The delay will start in a loop:
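class Test
  def check_function1
    # a, b, t1 and t2 are assumed to be defined elsewhere; the loop body
    # must eventually change a or b, or the loop never terminates.
    while a < b do
      # Each call enqueues one job with its own serialized copy of the arguments.
      delay.check_function2(t1, t2)
    end
  end

  def check_function2(t1, t2)
    # SAVING DATA
  end
end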
If the loop runs twice, will the two jobs mix up the data? Is there a way to prevent that? Also, can we sleep after each iteration of the loop?

Delayed Job runs a polling daemon in the background that polls the jobs table at regular intervals for new jobs.
If you only run one DJ worker, then all delayed tasks will be processed in sequence and the potential of a race condition isn't there.
If you run multiple workers then you need to take into account race conditions and guard against "mixing up data" (I assume this is what you meant with "mix up the data").
So yes, running multiple delayed jobs has the potential to "mix data" IF you run multiple DJ workers.
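To make the race concrete, here is a minimal simulation, not Delayed Job code: threads stand in for DJ worker processes and an in-memory `Account` object (a hypothetical name) stands in for the database row the job saves. The `Mutex` plays the role that a database-level lock, such as ActiveRecord's `with_lock`, would play across real worker processes.

```ruby
# Simulation only: threads stand in for Delayed Job workers, and an in-memory
# object stands in for the record that each job updates.
class Account
  attr_reader :balance

  def initialize
    @balance = 0
    @lock = Mutex.new
  end

  # Read-modify-write guarded by a lock so concurrent jobs cannot interleave.
  def deposit(amount)
    @lock.synchronize do
      current = @balance
      sleep(0.0001) # widen the window in which an unguarded race would occur
      @balance = current + amount
    end
  end
end

# Starts several "workers" that each run the same job a number of times.
def run_workers(account, worker_count:, jobs_per_worker:)
  threads = Array.new(worker_count) do
    Thread.new { jobs_per_worker.times { account.deposit(1) } }
  end
  threads.each(&:join)
  account.balance
end

account = Account.new
puts run_workers(account, worker_count: 4, jobs_per_worker: 25)
```

Without the `synchronize` block, two workers could read the same balance and one update would be lost; with it, every deposit is counted.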
See here for a more detailed explanation:
http://ternarylabs.com/2012/04/16/handle-job-queue-workers-concurrency-in-rails/
Sleeping when launching jobs has practically no effect on what happens to the job queue. Generally you would not want to sleep as it doesn't really do anything other than delay the user interface of your Rails app.

Related

Ruby Concurrency in cron job needed [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I am developing a Rails 4.0 system in which an API must be called simultaneously and continuously.
In the system, each user has 3 scripts that run in the background. The scripts fetch the user's information from the DB, call the API repeatedly, and process transactions. Currently I am using cron jobs (via the whenever gem) to run the scripts in the background for each individual user.
So my problem is that when the system has 1,000 users, I will need to run 3,000 cron jobs.
I think this system will have problems. Can anyone help me solve this problem?
At this point you have a system that performs some tasks periodically, and the amount of work your system has to handle (let's say, per hour) is less than the amount of work it could handle.
However, the amount of work increases with the number of users in your system so, as you have already guessed, there will be a moment when the situation will be critical. Your system will not be able to handle all the tasks it has to do.
One way to solve this problem is adding more machines to your system, that is, if you are currently using a single machine to run all your tasks, consider adding another one and split the job. You can split the job between the machines in a number of ways, but I would use a consumer-producer approach.
You will need a queue manager to which your producer periodically sends a batch of tasks (you can still use the whenever gem for that). A number of consumers then work through the tasks one by one until none are left. A single consumer makes no sense, two would be OK for now, and you can increase the number later.
The manager I like the most is Sidekiq but you can find some others that might match your needs better.
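A minimal sketch of the producer-consumer idea, using only Ruby's standard `Queue` and threads rather than Sidekiq. The method names (`enqueue_batch`, `drain`, `process_task`) are hypothetical, and `process_task` is a placeholder for the real API call each script makes.

```ruby
# Hypothetical stand-in for one script run; a real task would call the API.
def process_task(task)
  "user #{task[:user_id]} / script #{task[:script]} done"
end

# Producer: called once per scheduled tick, it enqueues three tasks per user.
def enqueue_batch(queue, user_ids)
  user_ids.each do |user_id|
    (1..3).each { |script| queue << { user_id: user_id, script: script } }
  end
end

# Consumers: a fixed pool of threads drains the queue one task at a time.
def drain(queue, consumer_count:)
  queue.close # no more tasks will arrive; pop returns nil once the queue is empty
  results = Queue.new
  consumers = Array.new(consumer_count) do
    Thread.new do
      while (task = queue.pop)
        results << process_task(task)
      end
    end
  end
  consumers.each(&:join)
  Array.new(results.size) { results.pop }
end

queue = Queue.new
enqueue_batch(queue, [1, 2, 3])
puts drain(queue, consumer_count: 2).length
```

The point of the design is that the number of scheduled entries stays constant (one producer tick), while throughput is adjusted by changing `consumer_count` or, in a real deployment, the number of worker processes or machines.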

Rails 4.2/Sidekiq -- how refactoring job code affects already scheduled jobs

We are using Rails 4.2 and Sidekiq for processing jobs. Our application schedules jobs to be performed at some point in the future for our users, and thus we have probably thousands of currently scheduled jobs awaiting execution.
I am doing some substantial refactoring of the code underlying these jobs, changing the parameters and whatnot. My question is: when I deploy my new code, will the currently pending jobs -- which were scheduled using the old code -- be affected by my new code when they run?
I assume the answer is no, and that scheduled jobs include the code they are to process. But I'd feel a lot better with some confirmation. My googling did not reveal an answer.
Consider jobs stored in Redis to be exactly like data in a database. If you want to change them, you need to have a proper migration.
So the answer to your question is yes. The scheduled jobs will use the code that is deployed when they run, not when they were scheduled.
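One way to "migrate" is to let the new code accept both argument shapes until the old jobs have drained. The sketch below is a plain Ruby class standing in for a worker, with no Sidekiq API involved; `ReminderJob`, its parameters, and the default channel are all hypothetical. It assumes the old code enqueued positional arguments and the new code enqueues a single hash, which arrives with string keys because job arguments are serialized as JSON.

```ruby
# Plain Ruby stand-in for a background worker; names and fields are illustrative.
class ReminderJob
  # Old code enqueued:  perform(user_id, message)
  # New code enqueues:  perform({ "user_id" => ..., "message" => ..., "channel" => ... })
  def perform(*args)
    params = normalize(args)
    "#{params['channel']}: user #{params['user_id']} <- #{params['message']}"
  end

  private

  # Converts either argument shape into the new hash form.
  def normalize(args)
    if args.length == 1 && args.first.is_a?(Hash)
      { "channel" => "email" }.merge(args.first)
    else
      user_id, message = args
      { "user_id" => user_id, "message" => message, "channel" => "email" }
    end
  end
end

job = ReminderJob.new
puts job.perform(42, "renew soon")
puts job.perform({ "user_id" => 42, "message" => "renew soon", "channel" => "sms" })
```

Once no jobs with the old shape remain in the queue, the compatibility branch can be removed.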

Importing data that may take 10-15 minutes to process, what are my options in Rails? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
I have a Rails application that displays thousands of products.
The products are loaded from product feeds, so the source may be a large XML file or web service API calls.
I want to be able to re-use my models in my existing rails application in my import process.
What are my options in importing data into my Rails application?
I could use Sidekiq to fire off rake tasks, but I am not sure whether Sidekiq is suitable for tasks that take 10+ minutes to run. Most use cases that I have seen are for sending emails and other similar light tasks.
I could create maybe a stand-alone ruby script, but not sure how I could re-use my Rails models if I go this route.
Update
My total product count is around 30-50K items.
Sidekiq would be a great option for this, as others have mentioned. Ten-plus minutes isn't unreasonable, as long as you understand that restarting your Sidekiq process mid-run will stop that job as well.
The concern I have is if you are importing 50K items and you have a failure near the beginning you'll never get to the last ones. I would suggest looking at your import routine and seeing if you can break it up into smaller components. Something like this:
1. Start the Sidekiq import job.
2. The first thing the job does is reschedule itself N hours later.
3. Fetch data from the API/XML.
4. For each record in that result, schedule an "import this specific data" job with the data as an argument.
5. Done.
The key is the second to last step. By doing it this way your primary job has a much better chance of succeeding as all it is doing is reading API/XML and scheduling 50K more jobs. Each of those can run individually and if a single one fails it won't affect the others.
The other thing to remember is that, unless you configure it not to, Sidekiq will rerun failed jobs. So make sure that the "import specific data" job can be run multiple times and still do the right thing.
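The fan-out and the "safe to rerun" requirement can be sketched in plain Ruby. This is a simulation under stated assumptions, not Sidekiq code: an array plays the job queue, a hash plays the products table, `ProductImport` and its methods are hypothetical names, and records are assumed to carry a unique `sku`. The self-rescheduling step is left out.

```ruby
# In-memory stand-ins: @queue plays the job queue, @store the products table.
class ProductImport
  attr_reader :queue, :store

  def initialize
    @queue = []
    @store = {}
  end

  # Parent job: does nothing but schedule one small job per feed record.
  def schedule_all(feed)
    feed.each { |record| @queue << record }
  end

  # Child job: an upsert keyed on SKU, so a retry overwrites instead of duplicating.
  def import_one(record)
    raise ArgumentError, "missing sku" unless record["sku"]

    @store[record["sku"]] = { "name" => record["name"], "price" => record["price"] }
  end

  # Runs every queued job; a failure in one record does not stop the others.
  def work_off
    failed = []
    until @queue.empty?
      record = @queue.shift
      begin
        import_one(record)
      rescue ArgumentError
        failed << record
      end
    end
    failed
  end
end

feed = [
  { "sku" => "A1", "name" => "Lamp", "price" => 20 },
  { "name" => "Broken row" },
  { "sku" => "B2", "name" => "Desk", "price" => 150 }
]
import = ProductImport.new
import.schedule_all(feed)
failed = import.work_off
puts import.store.keys.inspect
puts failed.length
```

Running `schedule_all` and `work_off` a second time with the same feed leaves the store unchanged, which is the property a retried job needs.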
I have a very similar setup that has worked well for me for two years.

System for monitoring cron jobs and automated tasks? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 8 years ago.
I have several cron-jobs and background tasks on a variety of servers. These tasks can fail for any number of reasons:
- lack of disk space
- processing strange, unreadable file types
- logical errors/bugs in the programs
- invalid cron entry
- invalid JSON received
- network connectivity failure
- DB locks
- system library update breaks program
Why they failed to run is important, but the most important thing is knowing they failed to run.
Is there a uniform way to monitor multiple jobs, and be alerted if they fail to run at their scheduled time, for any reason? I'm using Ubuntu, the scripts are primarily in Ruby.
Note:
I'm specifically looking for a framework or system that works across multiple servers, has alerting via email or text built in, and can survive limited disk space. So the solution presented in "How can I setup a system to tell me if a cron job is NOT running fine?" doesn't seem applicable.
It's still under active development, but I would encourage you to take a look at https://github.com/jamesrwhite/minicron. I believe it meets all the requirements you specified and more!
Disclaimer: I'm the developer working on it.
Cronitor (https://cronitor.io) was a tool I built exactly for this purpose. It basically boils down to being a tracking beacon that uses http requests as the pings (similar to pushmon).
However, one of the needs that I had (and that pushmon and similar tools couldn't offer) was getting alerts if cron jobs started taking too long to run (or conversely if they started finishing too quickly). Cronitor solves this by allowing you to optionally trigger a begin event and an end event in order to keep track of duration.
Duration tracking was a must have for me because I had a cronjob that was scheduled every hour, but over time started taking over an hour to run. That was a disaster ;)
Will http://www.pushmon.com fill your needs? It's built primarily to let you know if a cron job or scheduled task has failed to run. You can use it from any of your servers, and it has email and text alerts. The idea is that you "ping" PushMon when your job has run successfully, and PushMon will alert you if it doesn't receive the ping.
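The ping idea, plus the begin/end duration tracking mentioned above, can be illustrated with a small in-memory model. This is not the API of any of these services; `HeartbeatMonitor` and its parameters are hypothetical, and times are passed in as plain seconds so the behaviour is easy to follow. Real services receive the pings over HTTP from the job itself.

```ruby
# Hypothetical in-memory model of a ping-based ("dead man's switch") monitor.
class HeartbeatMonitor
  def initialize(expected_every:, max_duration:)
    @expected_every = expected_every # seconds allowed between successful runs
    @max_duration = max_duration     # seconds a single run may take
    @started_at = nil
    @last_success_at = nil
    @last_duration = nil
  end

  # The job pings this when it starts.
  def begin_run(now)
    @started_at = now
  end

  # The job pings this only after it finishes successfully.
  def end_run(now)
    @last_duration = now - @started_at if @started_at
    @last_success_at = now
    @started_at = nil
  end

  # Returns the alerts that apply at the given moment.
  def alerts(now)
    problems = []
    overdue = @last_success_at.nil? || now - @last_success_at > @expected_every
    problems << :missed_run if overdue
    problems << :too_slow if @last_duration && @last_duration > @max_duration
    problems
  end
end

monitor = HeartbeatMonitor.new(expected_every: 3600, max_duration: 600)
monitor.begin_run(0)
monitor.end_run(900)   # the run took 15 minutes
p monitor.alerts(1000) # => [:too_slow]
p monitor.alerts(4501) # => [:missed_run, :too_slow]
```

Because the alert is raised by the absence of a ping, it fires regardless of why the job failed, including an invalid cron entry or a full disk on the job's server.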
Although it may not satisfy all your needs:
https://github.com/javan/whenever

Looking for suggestions on a background gem [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 8 years ago.
I plan on running our web app on Heroku. I am looking for a gem to handle a few background jobs, e.g. sending emails and calling a few methods that submit files to an encoding service via an API.
A few that have come to mind so far are resque and delayed_job. I hear good things about resque, and it also seems to be the more popular gem in its category. Ryan Bates has done an excellent screencast on delayed_job. However, I hear delayed_job has had a few problems, e.g. it is reportedly not very stable in some areas.
Heroku offers Redis-to-Go. They have a free plan which offers 5 MB. If I go with resque, is this 5 MB plan enough to handle background jobs? I don't want to end up spending more just for background jobs.
Just concerned that if I went with resque, I would need another db just to run background jobs. If I was using Redis for something else, then perhaps it would be worth it. Is it worth having another db just to handle background jobs?
Should I consider alternative gems? If so which ones?
Both delayed_job and resque work fairly well. resque should scale better as the volume of background requests increases.
resque's use of redis should be limited to the task request. Large data objects that are needed by the background tasks should be stored somewhere other than the background worker queue. For example, the files being sent to a background worker to be encoded should be stored in AWS S3 or some other persistent store, not the redis queue used by resque.
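As a rough illustration of keeping the queue entry small, the sketch below builds a job payload that carries only a pointer to the file. The `encoding_job_payload` helper, the `EncodeVideo` class name, and the bucket and key values are hypothetical; the `class`/`args` JSON shape is only meant to resemble the kind of entry a Redis-backed queue stores.

```ruby
require "json"

# Hypothetical payload builder: the queue entry holds a reference to the file
# (bucket + key), never the file contents themselves.
def encoding_job_payload(bucket:, key:)
  JSON.generate({ "class" => "EncodeVideo", "args" => [{ "bucket" => bucket, "key" => key }] })
end

payload = encoding_job_payload(bucket: "uploads", key: "videos/123.mov")
puts payload
puts payload.bytesize
```

A payload like this stays well under a kilobyte regardless of how large the file is, so thousands of pending jobs fit comfortably within a few megabytes of Redis.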
When using delayed_job or resque, you will need to run background workers which cost money. You might want to look at an autoscaling solution for dynamically starting and stopping background workers as needed.
See http://s831.us/h3pKE6 as an example.
We've used delayed_job very intensively, sending hundreds of concurrent emails, and it has worked flawlessly. Yes, it'll cost $36/mo for the worker. But a single worker gets a lot of jobs done: several fairly complex emails (lots of database lookups) sent per second.

Resources