I'm connecting to Twitter's streaming API to get a stream of updates to my Rails app, adding them to the db, etc, etc.
What's the best way to do this on Heroku? Right now, I'm using the delayed_job gem - problem is that the job (connecting to the Twitter Streaming API) expires after hours.
Is there a way to make the job run forever, or a better way to do this?
Thanks
I wouldn't make a job "run forever" as that would mean loading the CPU forever too.
The way this is usually handled is by using a cron job which starts the specific script at specific intervals (every minute, every hour, every few days, etc.).
Almost every web host provides an easy interface to set up such cron jobs via their backend (e.g. cPanel).
In case you're running your own server, you probably already know how to configure such jobs. If you don't, you'll have to look up the setup guide for the operating system your server runs. There's always a way to run "jobs" at specific intervals (even on MS Windows servers, via the task scheduler).
For a more detailed description of what "cron" is, you might want to check the "cron" article on Wikipedia, which also provides some pretty good examples.
Related
This may be more of an App Engine question than a delayed_job question. But generally, how can I keep a long-lived process running to handle the scheduling of notifications and the sending of scheduled notifications on Google App Engine?
The maintainers of delayed_job https://github.com/collectiveidea/delayed_job include a script for production deploys, but this seems to stop after a few hours. I'm trying to figure out the best approach to ensure that the script stays running, and also that the script is able to access the logs for debugging purposes.
I believe that Google Pub/Sub is also a possibility, but I would ideally like to avoid setting up additional infrastructure for such a small project.
For running long processes that last for hours, App Engine is not the ideal solution, since requests are capped at 60 seconds (GAE Standard) and 60 minutes (GAE Flex).
The best option would be a Compute Engine based solution, since you would be able to keep the GCE VM up for long periods.
Once you have deployed a RESTful application on your GCE VM, you can use Cloud Scheduler to create a scheduled job with this command:
gcloud scheduler jobs create http JOB --schedule=SCHEDULE --uri=APP_PATH
You can find more about this solution in this article
If App Engine is required, take the maximum request times mentioned above into consideration. You can also take a look at Cloud Tasks, since it fits your requirement pretty well.
I want to send a batch of emails at a specific time, like cron.
I think the whenever gem (https://github.com/javan/whenever) is not a fit for the Cloud Foundry environment, because Cloud Foundry can't use crontab.
Please let me know what options are available to me.
There's a node.js app here that you could use to schedule a specific rake task.
I haven't worked with Cloud Foundry so I'm not sure if it'll serve your needs, but you can also try some of the batch job processing tools Rails has available: Delayed Job and Sidekiq. Those store data for recurring jobs either in your database (DJ) or in a separate Redis database (Sidekiq), and both need extra processes kept up and running, so review them carefully, along with the changes you'd need to make to your deployment process, before using either one. There's also Resque, and here's a tutorial on using it with Rails for scheduling tasks.
There are multiple solutions here, but the short answer is that whatever you end up doing needs to implement its own scheduler. This is because there is no cron service available to your application when it runs on CF, so there is nothing to trigger or schedule your actions. Any project or solution that depends on cron will not work when deployed to CF. Any project that implements its own scheduler should work fine.
Some specific things I've seen people do successfully:
Use a web service that sends HTTP requests to your app at predefined intervals. The requests trigger your action. It's the service's responsibility to let you define when to trigger and to send the HTTP requests. I'm intentionally avoiding mentioning any specific services, but you can find them by searching for "cron http service" or something like that.
Import a library that has cron-like functionality. I'm not familiar with Ruby, so I don't know the landscape there. @mlabarca has mentioned a couple that you might try out. Again, check that they implement the scheduling functionality themselves and do not depend on cron. I'm more familiar with Java, where you have Quartz and Spring, which has some scheduling functionality too.
Implement a "clock" process or scheduler. This would generally be a second app that you deploy on CF. It would be lightweight and probably not have a web interface. It could be as simple as: do something, sleep, and loop forever repeating those two steps. It really depends on your needs. You could even get fancy and implement something like the first option above, where you send some sort of request to your other apps to trigger the actual events.
There are probably other solutions as well; those are just some examples to get you started.
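As a rough illustration of the "clock" process idea (do something, sleep, repeat), here is a minimal sketch in plain Ruby. The method name `run_clock` and its parameters are hypothetical, made up for this example; `max_ticks` and `sleeper` exist only so the loop can be stopped and tested, and a real clock process would typically leave `max_ticks` unset and run until the platform stops it.

```ruby
# Minimal "clock" process sketch: run a task, sleep, repeat.
# `run_clock`, `max_ticks` and `sleeper` are hypothetical names for illustration.
def run_clock(interval_seconds:, max_ticks: nil, sleeper: ->(s) { sleep(s) })
  ticks = 0
  loop do
    begin
      yield ticks
    rescue StandardError => e
      # Log and keep looping so one failed run does not kill the clock.
      warn "tick #{ticks} failed: #{e.message}"
    end
    ticks += 1
    break if max_ticks && ticks >= max_ticks
    sleeper.call(interval_seconds)
  end
  ticks
end

# Usage (runs forever when max_ticks is omitted):
#   run_clock(interval_seconds: 900) { |n| puts "sending batch #{n}" }
```

The block passed to `run_clock` is where the actual work (or an HTTP request to another app) would go.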
It's probably also worth mentioning that the Cloud Controller v3 API will have first-class features to run tasks. In this case, the "task" is some job that runs in a finite amount of time and exits (like a batch job). This is opposed to the standard "app" that, when run on CF, should continue executing forever (i.e. if it exits, it's because of a crash). That said, I do not believe it will include a scheduler, so you'd still need something to trigger the task.
Background
I'm currently working on a small Rails 5 project that needs to access and process an external API. There is a ruby wrapper gem available for the API, so accessing the data is not a problem.
Problem description
There are two parts of the equation that I am currently missing, and hoping someone out there can help me with.
1: I need to call the API, via Rails, every 15 minutes. How can I realize this? I was looking towards Active Job for this, but my research kind of stalled after getting no useful results.
2: The external API has different domain models and a different domain-specific language than my application. How can I map the different models without changes in Active Record?
1: I need to call the API, via Rails, every 15 minutes. How can I realize this? I was looking towards Active Job for this, but my research kind of stalled after getting no useful results.
You can solve the first problem using recurring tasks. The main idea is to run a process that performs some operations every x minutes (or days, or whatever fits your problem).
There are several tools that you can use. One of them, cron, is built into Unix systems. You can read about it in the system's manual. You can easily manage it using the whenever gem. The main disadvantage is that you need access to the system's cron, which may be non-trivial on machines you don't fully control (for example Platform as a Service hosts such as Heroku).
You should also take a look at clockwork, which does not rely on the system's cron. It uses an approach where a separate process runs all the time and keeps an eye on the defined tasks.
In the second approach (having a separate process) you need to remember that time-consuming instructions may "lock" the process and postpone other tasks. In this case, you may want to use background processing such as sidekiq or delayed_job. The idea is to use one process for scheduling tasks at certain times and another process to handle those tasks as soon as they appear in the queue.
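To make the split between scheduling and processing concrete, here is a small sketch using only Ruby's built-in `Queue` and threads. It is not how clockwork, sidekiq or delayed_job are implemented; it only shows the shape of the idea, with one thread standing in for the scheduler process and another for the worker process. The method name `run_scheduler_and_worker` and the job labels are hypothetical.

```ruby
# Sketch: one thread schedules jobs, another processes them from a queue.
# In a real deployment these would be separate processes backed by
# a database or Redis; names here are hypothetical.
def run_scheduler_and_worker(job_count:, interval_seconds:, &work)
  queue = Queue.new
  results = []

  worker = Thread.new do
    # Queue#pop blocks until a job is available; :stop ends the worker.
    while (job = queue.pop) != :stop
      results << work.call(job)
    end
  end

  scheduler = Thread.new do
    job_count.times do |i|
      queue << "job-#{i}"      # enqueueing is cheap, so the scheduler never stalls
      sleep(interval_seconds)
    end
    queue << :stop
  end

  [scheduler, worker].each(&:join)
  results
end
```

Because the scheduler only enqueues, a slow job delays other jobs in the queue but not the schedule itself.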
2: The external API has different domain models and a different domain-specific language than my application. How can I map the different models without changes in Active Record?
You need to create a client that consumes the API and maps its responses into the models that you have in your application. This way, you don't need to make your model's schema dependent on the API's schema. Take a look at the resource_kit gem; it is a sample solution that uses this approach.
Hi hdauven,
Calling the API every 15 minutes will affect your server's performance, so do it in a background job using sidekiq. Use sidetiq as well; it will help you run the task automatically every 15 minutes.
You are accessing an API, so why are you worried about the different domain?
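A minimal sketch of such a mapping layer, written without resource_kit and using only the Ruby standard library, might look like this. The `Article` struct, the `ExternalArticleMapper` class and the API field names (`headline`, `byline`, `pubDate`) are all assumptions invented for the example; in a Rails app the target would be your own Active Record model.

```ruby
require "json"
require "date"

# App-side model (hypothetical); stands in for an Active Record model.
Article = Struct.new(:title, :author_name, :published_on, keyword_init: true)

# Translates one parsed API record (external vocabulary) into the app's model.
# The external field names below are assumptions for illustration.
class ExternalArticleMapper
  def self.call(payload)
    Article.new(
      title: payload.fetch("headline"),
      author_name: payload.dig("byline", "name"),
      published_on: Date.iso8601(payload.fetch("pubDate"))
    )
  end
end

# Usage with a raw JSON response body:
#   record  = JSON.parse(response_body)
#   article = ExternalArticleMapper.call(record)
```

Keeping the translation in one class means a change in the external API touches only the mapper, not the models.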
We have an app running on heroku that we need to put into maintenance mode for an hour during the night while Amazon performs maintenance on the RDS instance we are using.
Does anyone know of a good way to schedule this?
I could create a cron job on one of the machines in the office to do it, but that seems a bit inelegant.
I'd rather do it from within the application itself, perhaps using delayed_job (which we are already using), or from a scheduled task. However I can't find any documentation about interacting with the Heroku management system from within a running application. Am I just not searching for the right thing?
I have a ruby on rails app that uses Heroku. I have the need to run things like import/export tasks on our db that lock up the whole system since they are so heavy on the DB. Is there a way to tell the system to only run these tasks when the database is not being used at that second?
There is no built-in way to schedule a job like this. There are a few things you can do, though.
Schedule the jobs to run during the least busy hours of the day. That will depend on your business, customer base and so on, but hopefully there is a window that is more suitable than others.
You could write your batch job to run for a longer time, doing small units of work. Between each unit of work, sleep for a few seconds, or take a look at the current load average and decide what to do based on that. This should lower the impact of the batch jobs.
Have the website update a "lock" somewhere, either in the database or in memcached or similar. If your normal website usage updates the database, you could look at the existing updated_at. Then only do batch work when there hasn't been any activity for a while. This doesn't guarantee that a new user won't pop in at the same time your batch job runs, of course, but it could be a way to find a window where the site is less used.
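The second and third suggestions can be combined in a small sketch: process the data in small slices, pause between slices, and only start a slice once the site has been quiet for a while. The method name `run_batch` and all of its parameters are hypothetical. `last_activity_at` is a callable you would supply yourself, for example one that reads the newest `updated_at` from a busy table.

```ruby
# Sketch: throttled batch work that waits for a quiet period.
# All names are hypothetical; `now` and `sleeper` are injectable for testing.
def run_batch(items, batch_size:, pause_seconds:, idle_for:, last_activity_at:,
              now: -> { Time.now }, sleeper: ->(s) { sleep(s) })
  processed = []
  items.each_slice(batch_size) do |slice|
    # Hold off until there has been no activity for at least `idle_for` seconds.
    sleeper.call(pause_seconds) until now.call - last_activity_at.call >= idle_for
    slice.each { |item| processed << yield(item) }
    # Pause between slices so the database gets room to serve normal traffic.
    sleeper.call(pause_seconds)
  end
  processed
end
```

As the answer notes, this does not guarantee a user won't arrive mid-slice; small slices just limit how long any one lock is held.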
Have you looked into using Background Jobs / Workers on Heroku? It's also worth reading about Heroku's Delayed Job queuing system