Service vs Scheduled Task intervals - windows-services

If you have a recurring task that runs once per day, you use a Scheduled Task.
If you have a recurring task that runs every 10 seconds, you use a Service.
At what point do you switch between the two? Is there official guidance on this somewhere?

I'm not sure the interval is the main issue here.
Here are a few things to consider:
How much state does this task need in memory? Do you load data from a file or a DB?
Does the system that needs this task to run also need to communicate with the task
other than when it's running?
Do you need more control over the process lifecycle while the task is up?
You can see where I'm going with this: a service is a resident entity, and a scheduled task isn't.

I think it depends on whether your program is made for only one task or for more. If it just does one "stupid" thing (like running a stored procedure in a database every 20 seconds), I would consider a scheduled task; but if it does more than that and has some dependencies (for example on the time of day it runs, or on file operations), I would consider a service.
I would also consider a service if the interval between runs varies. Say your program runs a single stored procedure in a database, and the next run depends on whether it made "real" changes to the DB: if it did, the next run is in 5 seconds; if not, the next run is in 20 seconds. That is a perfect example for a service.
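A minimal sketch of that variable-interval loop, written in Python only for illustration. The names `next_delay`, `run_service` and `do_work` are hypothetical, `do_work` is assumed to return True when it made real changes, and `max_runs` exists only so the sketch terminates; a real service would loop until it is asked to stop.

```python
import time
from typing import Callable, List

BUSY_DELAY_SECONDS = 5    # next run when the last run changed something
IDLE_DELAY_SECONDS = 20   # next run when the last run changed nothing


def next_delay(made_changes: bool) -> int:
    """Pick the wait before the next run from the last run's outcome."""
    return BUSY_DELAY_SECONDS if made_changes else IDLE_DELAY_SECONDS


def run_service(do_work: Callable[[], bool],
                max_runs: int,
                sleep: Callable[[float], None] = time.sleep) -> List[int]:
    """Call do_work repeatedly, waiting 5 s or 20 s between runs.

    Returns the list of delays that were used, which makes the
    scheduling decision easy to inspect.
    """
    delays = []
    for _ in range(max_runs):
        delay = next_delay(do_work())
        delays.append(delay)
        sleep(delay)
    return delays
```

A fixed-interval scheduled task cannot express this directly, because the delay is decided by state the process only has after each run.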

Related

Quartz.Net is Sometimes Running Overlapping Tasks

I am using Quartz.Net 3.0.7 to manage a scheduler. In my test environment I have two instances of the scheduler running. I have a test process that runs for exactly 2 hours before ending. Quartz is configured to start the process every 10 seconds and I am using the DisallowConcurrentExecution attribute to prevent multiple instances of the task from running at the same time. 80% of the time this is working as expected. Quartz will start up the process and prevent any other instances of the task from starting until after the initial one has completed. If I stop one of the two services hosting Quartz, then the other instance picks up the task at the next 10-second mark.
However, after keeping these two Quartz services running for 48 uninterrupted hours, I have discovered a couple of times where things went horribly wrong. At times host B will start up the task, even though the task is still in the middle of its 2 hour execution on host A. At one point I even found the process had started up 3 times on host B, all within a 10 minute period. So, for a two hour period, the one task had three instances running simultaneously. After all three finished, Quartz went back to the expected schedule of only having one instance running at a time.
If these overlapping tasks were happening 100% of the time, I would think there is something wrong on my end, but since it seems to happen only about 20% of the time, I am thinking it must be something in the Quartz implementation. Is this by design or is it a bug? If there is an event I can capture from Quartz.Net to tell me that another instance of a task has started up, I can listen for that and stop the existing task from running. I just need to make sure that DisallowConcurrentExecution is getting obeyed and prevent a task from running multiple instances concurrently. Thanks.
Edit:
I added logic that uses context.Scheduler.GetCurrentlyExecutingJobs to look for any jobs that have the same JobDetail.Key but a different FireInstanceId when my task starts up. If I find another currently executing job, I will prevent this instance from doing anything. I am finding that in the duplicate concurrent scenario, Quartz is reporting that there are no other jobs currently executing with the same JobDetail.Key. Should that be possible? Under what case would Quartz.Net start an IJob, lose track of it as an executing job after a few minutes, but allow it to continue executing without cancelling the CancellationToken?
Edit2:
I found an instance in my logs where Quartz started a task as expected. Then, one minute later, Quartz tried to start up 9 additional instances, each with a different FireInstanceId. My custom code blocked the 9 additional instances, because it can see that the original instance was still going, by calling GetCurrentlyExecutingJobs to get a list of running jobs. I double checked and the ConcurrentExecutionDisallowed flag is true on all of the tasks at runtime, so I would expect that Quartz would prevent the duplicate instances. This sounds like a bug. Am I expected to handle this manually or should I expect Quartz to get this right?
Edit3:
I am definitely looking at two different problems. In both cases Quartz.Net is launching my IJob instance with a new FireInstanceId while there is already another FireInstanceId running for the same JobKey. In one scenario I can see that both FireInstanceIds are active by calling GetCurrentlyExecutingJobs. In the second scenario calling GetCurrentlyExecutingJobs shows that the first FireInstanceId is no longer running, even though I can see from my logs that the original instance is still running. Both of these scenarios result in multiple instances of my IJob running at the same time, which is not acceptable. It is easy enough to tackle the first scenario by calling GetCurrentlyExecutingJobs when my IJob starts, but the second scenario is harder. I will have to poll GetCurrentlyExecutingJobs on an interval and stop the task if its FireInstanceId has disappeared from the active list. Has anyone else really not noticed this behavior?
I found that if I set this option, I no longer get overlapping job executions. I still wish that Quartz would cancel the job’s cancellation token, though, if it lost track of the executing job.
QuartzProperties.Add("quartz.jobStore.clusterCheckinInterval", "60000");
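The startup check described in the first edit of the question can be reduced to a small piece of logic. The sketch below is in Python rather than C# and is not Quartz.Net API: `Execution` and `should_run` are hypothetical names, and the `executing` list stands in for whatever GetCurrentlyExecutingJobs returns. It only covers the first scenario, where the scheduler still reports the original run.

```python
from typing import Iterable, NamedTuple


class Execution(NamedTuple):
    job_key: str            # identifies the job definition
    fire_instance_id: str   # identifies one particular run of that job


def should_run(me: Execution, executing: Iterable[Execution]) -> bool:
    """Return False if another run of the same job is already executing.

    A run counts as "another" when it has the same job key but a
    different fire instance id than the run asking the question.
    """
    return not any(
        other.job_key == me.job_key
        and other.fire_instance_id != me.fire_instance_id
        for other in executing
    )
```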

Grails non time based queuing

I need to process files which get uploaded and it can take as little as 1 second or as much as 10 minutes. Currently my solution is to make a Quartz job with a timer of 30 seconds and then process an arbitrary job whenever it hits. There are several problems with this.
One: if the job will take less than a few seconds it is wasteful to make things wait 30 seconds for the job queue.
Two: if there is only one long job in the queue it could feasibly try to do it twice.
What I want is a timeless queue. When things are added they are started immediately if there is a free worker. Is there a solution for this? I was looking at jesque, but I couldn't tell if it can do this.
What you are looking for is a basic message queue. There are lots of options out there, but my favorite for Grails is RabbitMQ. The Grails plugin for it is quite good and it performs well in my experience.
In general, message queues allow you to have N producers ("things creating jobs") adding work messages to a queue and then M consumers pulling jobs off of the queue and processing them. When a worker completes its job, it simply asks the queue for the next job to process, and if there is none, it just waits for the queue to give it something to do. The queue also keeps track of success / failure of message processing (you can control this) so that you don't give the same message to more than one worker.
This has the advantage of not relying on polling (so you can start processing as soon as things come in) and it's also much more scalable. You can scale both your producers and consumers up or down as needed, decoupling the inputs from the outputs so that you can take a traffic spike and then work your way through it as you have the resources (workers) available.
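The producer/consumer shape described above can be sketched in a single process with Python's standard library. This is only a stand-in for a real broker such as RabbitMQ and does not show the RabbitMQ or Grails plugin API; `run_workers` and `handle` are hypothetical names, and `handle` is whatever processes one job.

```python
import queue
import threading


def run_workers(jobs, handle, worker_count=3):
    """Process jobs with a pool of workers that block on a shared queue."""
    work = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            job = work.get()          # blocks until a job is available
            if job is None:           # sentinel: no more work for this worker
                work.task_done()
                return
            try:
                outcome = handle(job)
            except Exception as exc:  # record the failure, keep the worker alive
                outcome = exc
            with results_lock:
                results.append(outcome)
            work.task_done()

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for job in jobs:                  # the "producer" side
        work.put(job)
    for _ in threads:                 # one sentinel per worker
        work.put(None)
    work.join()                       # wait until every item is marked done
    for thread in threads:
        thread.join()
    return results
```

Because each worker blocks in `work.get()`, a job starts as soon as a worker is free, and each job is handed to exactly one worker.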
To solve problem one just make the job check for new uploaded files every 5 seconds (or 3 seconds, or 1 second). If the check for uploaded files is quick then there is no reason you can't run it often.
For problem two you just need to record when you start processing a file to ensure it doesn't get picked up twice. You could create a table in the database, or store the information in memory somewhere.
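A minimal sketch of the in-memory variant, assuming all workers live in one process. `ClaimRegistry`, `try_claim` and `release` are hypothetical names. With several processes or servers the same idea would need shared storage instead, for example a database table with a unique constraint on the file name.

```python
import threading


class ClaimRegistry:
    """Remembers which files are being processed so none is picked up twice."""

    def __init__(self):
        self._lock = threading.Lock()
        self._claimed = {}            # file name -> worker that claimed it

    def try_claim(self, file_name, worker):
        """Claim file_name for worker; return False if it is already claimed."""
        with self._lock:
            if file_name in self._claimed:
                return False
            self._claimed[file_name] = worker
            return True

    def release(self, file_name):
        """Forget a claim, for example after processing failed and should be retried."""
        with self._lock:
            self._claimed.pop(file_name, None)
```

The lock makes the check and the write a single step, so two workers that poll at the same moment cannot both win the same file.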

Running Jobs when DB is free on Ruby on Rails Heroku

I have a Ruby on Rails app that runs on Heroku. I need to run things like import/export tasks on our DB, and they lock up the whole system because they are so heavy on the DB. Is there a way to tell the system to only run these tasks when the database is not being used at that second?
There is no built-in way to schedule a job like this. There are a few things you can do, though.
Schedule the jobs to run during the least busy hours of the day. That will depend on your business, customer base and so on, but hopefully there is a window that is more suitable than others.
You could write your batch job to run for a longer time, doing small units of work. Between each unit of work, sleep for a few seconds, or take a look at the current load average and decide what to do based on that. This should lower the impact of the batch jobs.
Have the website update a "lock" somewhere, either in the database or in memcached or something similar. If your normal website usage updates the database, you could look at the existing updated_at. Then only do batch work when there hasn't been any activity for a while. This doesn't guarantee that a new user won't pop in at the same time your batch job runs, of course, but could be a way to find a window where the site is less used.
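The second option above (small units of work with pauses in between) might look like the sketch below. It is written in Python for illustration, not Ruby; `run_throttled`, `process_batch` and `get_load` are hypothetical names. The default load check uses `os.getloadavg`, which exists only on Unix-like systems and measures the machine running the job, not the database, so `get_load` is injectable in case a different measure of "busy" fits better.

```python
import os
import time


def current_load():
    """1-minute load average of this machine (Unix-like systems only)."""
    return os.getloadavg()[0]


def run_throttled(items, process_batch, batch_size=100, pause_seconds=2.0,
                  max_load=1.0, get_load=current_load, sleep=time.sleep):
    """Process items in small batches, pausing between them.

    Before each batch, wait while the load is above max_load.
    Returns the number of items processed.
    """
    processed = 0
    for start in range(0, len(items), batch_size):
        while get_load() > max_load:   # back off while the system is busy
            sleep(pause_seconds)
        batch = items[start:start + batch_size]
        process_batch(batch)
        processed += len(batch)
        sleep(pause_seconds)           # give other work room between batches
    return processed
```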
Have you looked into using Background Jobs / Workers on Heroku? It's also worth reading about Heroku's Delayed Job queuing system.

Can I start and stop delayed_job workers from within my Rails app?

I've got an app that could benefit from delayed_job and some background processing. The thing is, I don't really need/want delayed_job workers running all the time.
The app runs in a shared hosting environment and in multiple locations (for different users). Plus, the app doesn't get a large amount of usage.
Is there a way to start and stop processing jobs (either with the script or rake task) from my app only after certain actions/events?
You could call out to system:
system "cd #{Rails.root} && rake delayed_job:start RAILS_ENV=production"
You could just change delayed_job to check less often too. Instead of the 5 second default, set it to 15 minutes or something.
Yes, you can, but I'm not sure what the benefit will be. You say you don't want workers running all the time - what are your concerns? Memory usage? Database connections?
To keep the impact of delayed_job low on your system, I'd run only one worker, and configure it to sleep most of the time.
Delayed::Worker::sleep_delay = 60 * 5 # in your initializer.rb
A single worker will only wake up and check the db for new jobs every 5 minutes. Running this way keeps you from 'customizing' too much.
But if you really want to start a Delayed::Worker programmatically, look in that class for work_off, and implement your own script/run_jobs_and_exit script. It should probably look much like script/delayed_job does - 3 lines.
I found this because I was looking for a way to run some background jobs without spending all the money to run them all the time when they weren't needed. Someone made a hack using Google App Engine to run the background jobs:
http://viatropos.com/blog/how-to-run-background-jobs-on-heroku-for-free/
It's a little outdated though. There is an interesting comment in the thread:
"When I need to send an e-mail, copy a file, etc I basically add it to the queue. At the end of every request it checks if there is anything in the queue. If so then it uses the Heroku API to set the worker to 1. At the end of a worker getting a task done it checks to see if there is anything left in the queue. If not then it sets the workers back to 0. The end result is the background worker will just work for a few seconds here and there. I can do all the background processing that I need and the bill at the end of the month rarely ever reaches 1 hour total worth of work. Even if it does no problem, I'll pay $0.05 for background processing. :)"
If you go to stop a worker, you are given the PID. You can simply kill -9 PID if all else fails.

How reliable is windows task scheduler for scheduling code to run repeatedly?

I have a bit of code that needs to sit on a Windows Server 2003 machine and run every minute.
What is the recommended way of handling this? Is it OK to design it as a console service and just have the task scheduler hit it every minute? (Is that even possible?) Should I just suck it up and write it as a Windows service?
Since it needs to run every single minute, I would suggest writing a Windows Service. It is not very complicated, and if you never did this before, it would be great for you to learn how it is done.
Calling the scheduled task every minute is not something I would recommend.
I would say suck it up and write it as a Windows service. I've not found scheduled tasks to be very reliable and when it doesn't run, I have yet to find an easy way to find out why it hasn't.
Windows Scheduled Tasks have been fairly reliable for our purposes, and we favor them over Windows Services in almost all cases because they are easy to install and have advanced recovery features. The always-on nature of a Windows service can become a problem if part of the code locks up or gets stuck in a loop it shouldn't be in. We generally write our code in a fashion similar to this:
Init();
Run();
CleanUp();
Then, as part of the Scheduled Task, we put a time limit on how long the process can run and have it kill the process if it runs for longer. If we do have a piece of code that is having trouble, Scheduled Tasks will kill it and the process will start up again in the next minute.
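In the answer above the time limit is a setting on the scheduled task itself. For comparison, the same "kill it if it runs too long" idea can be sketched as a small wrapper process. This is a Python illustration under that assumption, not how Task Scheduler is configured, and `run_with_time_limit` is a hypothetical name.

```python
import subprocess
import sys


def run_with_time_limit(command, limit_seconds):
    """Run command and kill it if it exceeds limit_seconds.

    Returns the process exit code, or None if the process was killed
    because it ran past the limit.
    """
    try:
        # subprocess.run kills the child and waits for it when the timeout expires.
        completed = subprocess.run(command, timeout=limit_seconds)
        return completed.returncode
    except subprocess.TimeoutExpired:
        return None


if __name__ == "__main__":
    # Hypothetical job: a child Python process that finishes immediately.
    print(run_with_time_limit([sys.executable, "-c", "pass"], limit_seconds=50))
```

Either way, a hung run is cut off before the next scheduled start, so one bad run does not block the ones after it.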
If you need to have it run every minute, I would build it as a Windows service. I wouldn't use the scheduler for anything less than a daily task.
I would say that it depends on what it was doing, but in general I am always in favor of having the fewest layers. If you write it as a console service and use the task scheduler then you have two places to maintain going forward.
If you write it as a Windows service, then you have one fewer place to check in case something goes wrong.
While searching for help on scheduled services, I came across a very good article by Jon Galloway.
There are various disadvantages to using a Windows service for a scheduled task, and I agree with the article. I would suggest using the Task Scheduler, which is simple to implement. Please refer to the detailed information on implementing the task scheduler. I hope this helps in finalizing the implementation approach.
The only other point to consider is that if your job involves some kind of database interaction, consider looking into the integration/scheduling services provided by your database.
For example, creating an SSIS package for your SQL Server related service may seem a bit like overkill, but it can be integrated nicely with the environment and will have its own logging/error checking mechanisms already in place.
I agree, it is kind of a waste of effort to create even a console executable and schedule it to be run every minute. I would suggest exploring something like Quartz.Net. That way you can create a simple job and schedule it to run every minute.
