How to use delayed_job with Rails on Google App Engine? - ruby-on-rails

This may be more of an App Engine question than a delayed_job question. But generally, how can I keep a long-lived process running to handle scheduling notifications and sending the scheduled notifications on Google App Engine?
The maintainers of delayed_job (https://github.com/collectiveidea/delayed_job) include a script for production deploys, but it seems to stop after a few hours. I'm trying to figure out the best approach to ensure that the script stays running, and also that its logs are accessible for debugging purposes.
I believe that Google Pub/Sub is also a possibility, but I would ideally like to avoid setting up additional infrastructure for such a small project.

For processes that run for hours, App Engine is not the ideal solution, since requests are capped at 60 s (GAE Standard) and 60 min (GAE Flex).
A Compute Engine based solution would be better, since you would be able to keep the GCE VM up for long periods.
Once you have deployed a RESTful application on your GCE VM, you can use Cloud Scheduler to create a scheduled job with this command:
gcloud scheduler jobs create http JOB --schedule=SCHEDULE --uri=APP_PATH
You can find more about this solution in this article
If App Engine is required, take the maximum request times mentioned above into consideration. Additionally, you can take a look at Cloud Tasks, since it fits your requirement pretty well.
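If the scheduled job calls an HTTP endpoint, each request only has to handle the notifications that are due at that moment, which keeps it well under the request time limits. Here is a minimal sketch in plain Ruby, assuming a hypothetical Notification record with send_at and sent_at fields. In a Rails app this would presumably be a model, and the method would be called from the controller action behind APP_PATH:

```ruby
# Hypothetical in-memory stand-in for a notifications table.
Notification = Struct.new(:id, :send_at, :sent_at)

# Deliver every unsent notification that is due at `now` and return the
# delivered ones. Meant to be called from a short HTTP handler that a
# scheduler hits periodically, so no process has to stay alive between runs.
def deliver_due(notifications, now: Time.now)
  due = notifications.select { |n| n.sent_at.nil? && n.send_at <= now }
  due.each do |n|
    # Real delivery (email, push, ...) would happen here.
    n.sent_at = now
  end
  due
end
```

Because already-sent notifications are skipped, calling the endpoint twice for the same minute does not deliver anything twice.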

Related

Should I use a Container/Service Fabric Guest Executable for a scheduled daily workload?

This is a more general question about which types of payloads to host in a Container. In our case we will use Service Fabric guest executables. For this post I will only use the word Container to refer to both. I do this because they have similar properties, and I think more people may understand a container than an SF Guest Exe.
WebAPIs/Services that need to scale are a good fit for containers, but this question is related to what we call a "Batch" job. This nomenclature comes out of the old .bat files, but in our case we are using a .NET Framework or Core .exe (console apps).
Currently Windows Task Scheduler kicks off the batch running under a service account on a VM. We want the processing to happen at a certain time of day or day of the week, and not before or after. There is not any real scaling here. There is one instance, which may or may not be multithreaded, and on average the jobs run for 2-15 minutes and then stop. Some run longer, some run shorter. I understand there are limitations to this approach, but this is the type of payload I'm discussing here.
As we modernize the Technology stack we are looking to use the Orchestrator as much as possible. As a technologist I've always tried to understand the different tools in our tool belts and not use a tool just because that's the one I used last, instead use the correct tool for the task.
We started out by not writing any more .NET console apps. Instead we put the business logic of these "batches" into Web APIs and had the task scheduler call the API when it needed to perform its action. If I put this into Service Fabric and host it, my concern is that the system resources are consumed for 23 hours and 45 minutes a day when they are not being used. That seems to be the opposite of what you would expect when using a container.
Now, if I could spin up a Service Fabric Guest Exe/Container on demand and then destroy the instance of the app after it finishes, that could fit the need. Then I could have the benefits of the orchestrator without the detriment of having it consume resources all the time. I would hope to retire the Batch Server (VM), as the hardware usage is not optimized, and instead add resources to the cluster.
UPDATE
Looking at Vaclav's Scalability Doco I think there might be a use case in here? https://learn.microsoft.com/en-us/azure/service-fabric/service-fabric-concepts-scalability He uses a "Workload Manager Service" combined with CreateServiceAsync, to spin up an instance of the service on demand. I guess I would deploy the app to the image store but not create an instance of the app until needed. Then I need to figure out how to end it, is it as simple as changing the infinite loop in Program.cs? The thing is it doesn't look like there is a Program.cs in a Guest Executable.
This looks like a way to run a package until completion, which was released as part of 7.1. But how do we start a second execution of the service? I want to execute based on a request coming in.
https://learn.microsoft.com/en-us/azure/service-fabric/run-to-completion
Thoughts?

Schedule Mail batch by Rails in Cloud Foundry

I want to send an email batch at a specific time, like cron.
I think the whenever gem (https://github.com/javan/whenever) is not a fit for the Cloud Foundry environment, because Cloud Foundry can't use crontab.
Please let me know what options are available to me.
There's a node.js app here that you could use to schedule a specific rake task.
I haven't worked with Cloud Foundry, so I'm not sure if it'll serve your needs, but you can also try some of the background job processing tools available for Rails: Delayed Job and Sidekiq. They store data for recurring jobs either in your database (DJ) or in a separate Redis database (Sidekiq), and both need extra processes kept up and running, so review them carefully, along with the changes you'd need in your deployment process, before using either one. There's also Resque, and here's a tutorial on using it with Rails for scheduling tasks.
There are multiple solutions here, but the short answer is that whatever you end up doing needs to implement its own scheduler. This is because there is no cron service available to your application when it runs on CF. This means there is nothing to trigger or schedule your actions. Any project or solution that depends on cron will not work when deployed to CF. Any project that implements its own scheduler should work fine.
Some specific things I've seen people do successfully:
Use a web service that sends HTTP requests to your app at predefined intervals. The requests trigger your action. It's the service's responsibility to let you define when to trigger and to send the HTTP requests. I'm intentionally avoiding mentioning any specific services, but you can find them by searching for "cron http service" or something like that.
Import a library that has cron-like functionality. I'm not familiar with Ruby, so I don't know the landscape there. @mlabarca has mentioned a couple that you might try out. Again, check that they implement the scheduling functionality themselves and do not depend on cron. I'm more familiar with Java, where you have Quartz and Spring, which has some scheduling functionality too.
Implement a "clock" process or scheduler. This would generally be a second app that you deploy on CF. It would be lightweight and probably not have a web interface. It could be as simple as: do something, sleep, and loop forever repeating those two steps. It really depends on your needs. You could even get fancy and implement something like the first option above, where you're sending some sort of request to your other apps to trigger the actual events.
There are probably other solutions as well, those are just some examples to get you started.
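The "clock" process option above can be sketched in a few lines of plain Ruby. This is only an illustration: run_clock and its parameters are hypothetical names, and the iterations and sleeper arguments exist only so the loop can be bounded and sped up in tests. A deployed clock process would leave them at their defaults and loop forever.

```ruby
# Run the given block every `interval` seconds and return how many times it ran.
# A failure in one run is logged to stderr and does not stop the clock.
def run_clock(interval:, iterations: nil, sleeper: ->(s) { sleep(s) })
  count = 0
  loop do
    break if iterations && count >= iterations

    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    begin
      yield
    rescue StandardError => e
      warn "clock task failed: #{e.class}: #{e.message}"
    end
    count += 1

    # Subtract the time the task took so runs stay roughly `interval` apart.
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    sleeper.call([interval - elapsed, 0].max)
  end
  count
end
```

Usage would look something like run_clock(interval: 60) { send_mail_batch }, where send_mail_batch stands in for whatever your app needs to trigger.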
It's probably also worth mentioning that the Cloud Controller v3 API will have first-class features to run tasks. In this case, the "task" is some job that runs in a finite amount of time and exits (like a batch job). This is opposed to the standard "app", which when run on CF should continue executing forever (i.e. if it exits, it's because of a crash). That said, I do not believe it will include a scheduler, so you'd still need something to trigger the task.

Interval based API access and processing different DSL

Background
I'm currently working on a small Rails 5 project that needs to access and process an external API. There is a ruby wrapper gem available for the API, so accessing the data is not a problem.
Problem description
There are two parts of the equation that I am currently missing, and hoping someone out there can help me with.
1: I need to call the API, via Rails, every 15 minutes. How can I realize this? I was looking towards Active Job for this, but my research kind of stalled after getting no useful results.
2: The external API has different domain models and a different domain-specific language than my application. How can I map the different models without changes in Active Record?
1: I need to call the API, via Rails, every 15 minutes. How can I realize this? I was looking towards Active Job for this, but my research kind of stalled after getting no useful results.
The first problem you can solve using recurring tasks. The main idea is to run a process that performs some operations every x minutes (or days, or whatever fits your problem).
There are several tools that you can use. One of them, cron, is built into Unix systems. You can read about it in the system's manual, and you can easily manage it using the whenever gem. The main disadvantage is that you need access to the system's cron, which may be non-trivial on non-bare machines (for example, Platform as a Service hosts such as Heroku).
You should also take a look at clockwork, which does not rely on the system's cron. It takes an approach where a separate process runs all the time and keeps an eye on the defined tasks.
With the second approach (having a separate process), you need to remember that time-consuming instructions may "lock" the process and postpone other tasks. In this case, you may want to use background processing such as sidekiq or delayed_job. The idea is to use one process for scheduling tasks at certain times and another process to handle those tasks as soon as they appear in the queue.
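To illustrate the split between scheduling and processing, here is a rough sketch using only Ruby's standard library. It is not how clockwork, sidekiq, or delayed_job are implemented. The names next_run_at and start_worker are made up for this example, and a thread with an in-memory Queue stands in for the separate worker process and its job queue.

```ruby
# Next multiple of `interval` seconds strictly after `now`
# (with the default, the next :00, :15, :30 or :45 in UTC).
def next_run_at(now, interval = 15 * 60)
  Time.at((now.to_i / interval + 1) * interval).utc
end

# Start a worker that runs callable jobs from `queue` until it receives :stop.
# The return value of each job is appended to `results`.
def start_worker(queue, results)
  Thread.new do
    loop do
      job = queue.pop # blocks until a job is available
      break if job == :stop

      results << job.call
    end
  end
end
```

The scheduling side then only has to sleep until next_run_at(Time.now) and push a job onto the queue, so a slow API call never delays the next tick.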
2: The external API has different domain models and a different domain-specific language than my application. How can I map the different models without changes in Active Record?
You need to create a client that will consume the API and map its responses into the models that you have in your application. This way, you don't need to make your models' schema dependent on the API's schema. Take a look at the resource_kit gem, which is a sample solution that uses this approach.
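A hand-rolled version of that mapping layer could look roughly like this. It does not use resource_kit, and everything named here is hypothetical: Article stands in for one of your own models, and "headline" and "ts" stand in for whatever fields the external API actually returns.

```ruby
# Stand-in for an application model; in Rails this would be an Active Record class.
Article = Struct.new(:title, :published_at, keyword_init: true)

# Translates the external API's vocabulary into the application's own.
# Only this class needs to change if the API renames or restructures a field.
class ArticleMapper
  def self.from_api(payload)
    Article.new(
      title: payload.fetch("headline"),               # API "headline" -> our title
      published_at: Time.at(payload.fetch("ts")).utc  # API epoch seconds -> Time
    )
  end
end
```

Using fetch instead of [] makes a missing field fail loudly with a KeyError rather than silently creating a half-empty record.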
Hi hdauven,
Processing the API every 15 minutes will affect your server's performance, so do it with sidekiq, which runs it as a background job, and use sidetiq, which will help you perform the task every 15 minutes automatically.
You are only accessing an API, so why are you worried about the different domain?

Keep delayed job running on Heroku

I'm connecting to Twitter's streaming API to get a stream of updates to my Rails app, adding them to the db, etc, etc.
What's the best way to do this on Heroku? Right now, I'm using the delayed_job gem - problem is that the job (connecting to the Twitter Streaming API) expires after hours.
Is there a way to make the job run forever, or a better way to do this?
Thanks
I wouldn't make a job "run forever" as that would mean loading the CPU forever too.
The way this is usually handled is by using a cron job which starts the specific script at specific intervals (every minute, every hour, every few days, etc.).
Almost every web host provides an easy interface to set up such cron jobs via their backend (e.g. cPanel).
In case you're running your own server, you probably already know how to configure such jobs. If you don't, you'll have to look up the setup guide for the operating system you're running on your server. There's always a way to run "jobs" at specific intervals (even on MS Windows servers, via scheduling).
And for a more detailed description and better insight into what "cron" is, you might want to check the "cron" article at Wikipedia, which also provides some pretty good examples.

Is there a way to schedule Heroku maintenance mode in advance?

We have an app running on Heroku that we need to put into maintenance mode for an hour during the night while Amazon performs maintenance on the RDS instance we are using.
Does anyone know of a good way to schedule this?
I could create a cron job on one of the machines in the office to do it, but that seems a bit inelegant.
I'd rather do it from within the application itself, perhaps using delayed_job (which we are already using), or from a scheduled task. However I can't find any documentation about interacting with the Heroku management system from within a running application. Am I just not searching for the right thing?

Resources