Does spring-cloud-dataflow provide support for scheduling applications defined as tasks? - spring-cloud-dataflow

I have been looking at using projects built using spring-cloud-task within spring-cloud-dataflow. Having looked at the example projects and the documentation, the indication seems to be that tasks are launched manually through the dashboard or the shell. Does spring-cloud-dataflow provide any way of scheduling task definitions so that they can run for example on a cron schedule? I.e. Can you create a spring-cloud-task app which itself has no knowledge of a schedule, but deploy it to the dataflow server and configure the scheduling there?
Among the posts and blogs I have looked at I noticed the following:
https://spring.io/blog/2016/01/27/introducing-spring-cloud-task
Some of the Q&A afterwards hints at this being a possibility, with the reference to triggers, but I think this was discussed before it was released.
Any advice would be greatly appreciated, many thanks.

There are few ways you could launch Tasks in Spring Cloud Data Flow. Following are the available options today.
Launch it using TriggerTask; with this you could either choose to launch it with fixedDelay or via a cron expression - example here.
Launch it via an event in streaming pipeline. Imagine a use-case where you would want to create a "thumbnail" as and when there's a new image (event) in s3-bucket or in a file-system directory; the "thumbnail" operation could be a task in this case - example here.
Lastly, in the upcoming releases, we will port over "scheduler" functionality from Spring XD to Spring Cloud Data Flow.

Yes, Spring Cloud Data Flow does provide a scheduling option. To enable it, you need to add below arguments while starting the server:
--spring.cloud.dataflow.features.schedules-enabled=true

Related

Spring Cloud Data Flow preconfigured tasks

Spring Cloud Data Flow is a great solution and currently I'm trying to find the possibility to preconfigure the tasks in order to trigger them manually.
The use-case very simple:
as a DevOps I should have ability to preconfigure the tasks, which includes creation of the execution graph and application and deploy parameters and save task with all parameters needed for execution.
as a User with role ROLE_DEPLOY I should have ability start, stop, restart and execution monitor the preconfigured tasks.
Preconfigured tasks is the task with all parameters needed for execution.
Is it possible to have such functionality?
Thank you.
You may want to review the continuous deployment section from the reference guide. There's built-in lifecycle semantics for orchestrating tasks in SCDF.
Assuming you have the Task applications already built and that the DevOps persona is familiar with how the applications work together, they can either interactively or programmatically build a composed-task definition and task/job parameters in SCDF.
The above step persists the task definition in the SCDF's database. Now, the same definition can be launched manually or via a cronjob schedule.
If nothing really changes to the definition, app version, or the task/parameters, yes, anyone with the ROLE_DEPLOY role can interact with it.
You may also find the CD for Tasks guide useful reference material. Perhaps repeat this locally (with desired security configurations) on your environment to get a hold of how it works.

How to use delayed_job with Rails on Google App Engine?

This may be more of an App Engine question than a delayed_job question. But generally, how can I keep a long-lived process running to handling the scheduling of notifications and the sending of scheduled notifications on Google App Engine?
The maintainers of active_job https://github.com/collectiveidea/delayed_job include a script for production deploys, but this seems to stop after a few hours. Trying to figure out the best approach to ensure that the script stays running, and also that the script is able to access the logs for debugging purposes.
I believe that Google Pub/Sub is also a possibility, but I would ideally like to avoid setting up additional infrastructure for such a small project.
For running long processes that last for hours, App Engine will not be the ideal solution, since the requests are cap to 60 s (GAE Standard) and 60 m (GAE Flex).
The best would be to use a Compute Engine based solution, since the you would be able to keep the GCE VM up for long periods.
Once you have deployed on your GCE VM a RESTful application you can use Cloud Scheduler to create an scheduled job with this command:
gcloud scheduler jobs create http JOB --schedule=SCHEDULE --uri=APP_PATH
You can find more about this solution in this article
If App Engine is required take into consideration the mentioned maximum request times. And additionally you can give a look to Cloud Tasks, since those fit pretty much into your requirement.

Schedule Mail batch by Rails in Cloud Foundry

I want to send email batch at specific time like CRON.
I think whenever gem (https://github.com/javan/whenever) is not to fit in Cloud Foundry Environment. Because Cloud Foundry can't use crontab.
Please inform me what options are available to me.
There's a node.js app here that you could use to schedule a specific rake task.
I haven't worked with cloudfare so I'm not sure if it'll serve your needs, but you can also try some of the batch job processing tools rails has available: Delayed job and sidekiq. Those store data for recurring jobs either on your database (DJ) or in a separate redis database (Sidekiq) and both need keeping extra processes up and running, so review them deeply and the changes you'd need for your deployment process before using each one. There's also resque, and here's a tutorial to use it with rails for scheduling tasks.
There are multiple solutions here, but the short answer is that whatever you end up doing needs to implement its own scheduler. This is because there is no cron service available to your application when it runs on CF. This means there is nothing to trigger or schedule your actions. Any project or solution that depends on cron will not work when deploying to CF. Any project that implements it's own scheduler should work fine.
Some specific things I've seen people do successfully:
Use a web service that sends HTTP requests to your app on predefined intervals. The requests trigger your action. It's the services responsibility to let you define when to trigger and to send the HTTP requests. I'm intentionally avoiding mentioning any specific services, but you can find them by searching for "cron http service" or something like that.
Importing a library that has cron like functionality. I'm not familiar with Ruby, so I don't know the landscape there. #mlabarca has mentioned a couple that you might try out. Again, look to see that they implement the scheduling functionality and do not depend on cron. I'm more familiar with Java where you have Quartz and Spring, which has some scheduling functionality too.
Implement a "clock" process or scheduler. This would generally be a second app that you deploy on CF. It would be lightweight and probably not have a web interface. It could be as simple as do something, sleep, loop for ever repeating those two steps. It really depends on your needs. You could even get fancy and implement something like the first option above where you're sending some sort of request to your other apps to trigger the actual events.
There are probably other solutions as well, those are just some examples to get you started.
Probably also worth mentioning that the Cloud Controller v3 API will have first class features to run tasks. In this case, the "task" is some job that runs in a finite amount of time and exits (like a batch job). This is opposed to the standard "app" that when run on CF should continue executing forever (i.e. if it exits, it's cause of a crash). That said, I do not believe it will include a scheduler so you'd still need something to trigger the task.

Interval based API access and processing different DSL

Background
I'm currently working on a small Rails 5 project that needs to access and process an external API. There is a ruby wrapper gem available for the API, so accessing the data is not a problem.
Problem description
There are two parts of the equation that I am currently missing, and hoping someone out there can help me with.
1: I need to call the API, via Rails, every 15 minutes. How can I realize this? I was looking towards Active Job for this, but my research kind of stalled after getting no useful results.
2: The external API has different domain models and a different domain-specific language than my application. How can I map the different models without changes in Active Record?
1: I need to call the API, via Rails, every 15 minutes. How can I realize this? I was looking towards Active Job for this, but my research kind of stalled after getting no useful results.
The first problem you can solve using recurring tasks. The main idea is to run the process that will perform some operations every x minutes (or days or whatever fits your problem.
There are several tools that you can use. One of them is built-in the unix system and it is cron. You can read about it in system's manual. You can easily manage it using whenever gem. The main disadvantage is that you need an access to the system's cron which may be non-trivial on non-bare machines (for example Platform as a Service hosts such as Heroku).
You should also take a look at clockwork which does not rely on the system's cron. It uses approach where you have a separate process running all time and it keeps an eye on defined tasks.
In the second approach (having a separate process) you need to remember that time-consuming instructions may "lock" the process and postpone another tasks. In this case, you may want to use background processing such as sidekiq or delayed_job. The idea is to use one process for scheduling tasks at certain time and another process to process those tasks as soon as they appear in the queue.
2: The external API has different domain models and a different domain-specific language than my application. How can I map the different models without changes in Active Record?
You need to create a client that will consume the API and map its responses into models that you have in your application. This way, you don't need to make your model's scheme dependent on the API scheme. Take a look at resource_kit gem - this is a sample solution that uses this approach.
HI hdauven,
processing the API every 15 minutes will affect your server performance,so done it by using sidekiq, it is a background job and use sidetiq it will help you to perform the task every 15 min automatically
You are accessing API, Then why are you worrying about different domain.

How to implement background processing for ASP.Net MVC website in a shared hosting environment?

I am developing my first web application using ASP.Net MVC, and I am in a situation where I would like a background service to process status notifications outside of the application, not unlike the reputation/badge system on stackoverflow.
What is the best way to handle something like this? Is it even possible in a shared-hosting environment like Godaddy, which I am using.
I don't need to communicate with the background worker directly, since I will be adding notification records to a database table with a column set to an "unprocessed" state. Then the worker will just scan the table on a regular schedule and processes what is ready.
Thanks for your advice.
Have you tried with quartz.net? I think it may fit your needs.
also take a look at this Simulate a Windows Service using ASP.NET to run scheduled jobs article.
it explains a nice way to schedule operations with no outer dependence.
The idea is to use Cache timeout to control the schedule. I've implemented it successfully on a project which required regular temp file cleaning. This cleaning is a bit heavy so we move this clean operation in a scheduled job (using the asp.net cache) to avoid having to deploy scheduled task or custom program.
To answer whether GoDaddy will support a seperate service you need to ask them.
However there are a number of creative ways that you can "get around" this issue on shared hosting.
Have a secure page that's purpose is to execute your background work. You could have scheduled task on a machine under your control that calls to this web page at set intervals.
Use a variation of the Background Worker Thread answer from #safi. Your background worker thread could check to see if another is already processing and stop, so that only one instance is running at a time.
If only one background task is enough for you then use the WebBackgrounder
And this is the article with detailed explanation.

Resources