Use case (Spring Cloud Task):
I have different tasks which are independent of each other. I want to package those tasks in one jar and trigger a task from the command line. Is that possible?
I also want to schedule them using crontab. Please advise.
I am not sure why you would need to have all those independent tasks inside the same jar, but as long as you have a way to invoke the appropriate task based on your criteria, you could use Spring Cloud Data Flow (SCDF) to set this up.
You register your single jar with all the independent tasks as a task application (let's say mytask).
You create a schedule that triggers this task with the command-line arguments that select the appropriate functionality inside the task jar.
(Note that the scheduler support is only available on Kubernetes and Cloud Foundry)
Depending on the context you might want to consider using composed tasks as well: https://docs.spring.io/spring-cloud-dataflow/docs/2.5.2.RELEASE/reference/htmlsingle/#spring-cloud-dataflow-composed-tasks
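The register/schedule steps above can be sketched in the SCDF shell. The Maven coordinates, definition name, argument name, and cron expression below are placeholders for illustration, not values from the question:

```shell
dataflow:> app register --name mytask --type task --uri maven://com.example:mytask:1.0.0
dataflow:> task create mytask-def --definition "mytask"
dataflow:> task launch mytask-def --arguments "--which.task=taskA"
dataflow:> task schedule create --definitionName mytask-def --name mytask-schedule --expression '0 0 * * *'
```

The `--arguments` value is what your jar would inspect to decide which of the independent tasks to run.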
In Docker we have used deploy: replicas: 3 for our microservice. We have some cron jobs, and the problem is that every cron job gets called 3 times, which is not what we want; we want each one to run only once. Sample of a cron in Nest.js:
@Cron(CronExpression.EVERY_5_MINUTES)
async runBiEventProcessor() {
  const calculationDate = new Date();
  Logger.log(`Bi Event Processor started at ${calculationDate}`);
}
How can I run this cron only once without changing the replicas to 1?
This is quite a generic problem whenever a cron or background job is part of an application that runs multiple instances concurrently.
There are multiple ways to deal with this kind of scenario. The following are some workarounds if you don't have a concrete solution:
Create a separate service only for the background processing and ensure only one instance is running at a time.
Expose the cron job as an API and trigger the API to start background processing. In this scenario, the load balancer will hand over the request to only one instance. This approach will ensure that only one instance will handle the job. You will still need an external entity to hit the API, which can be in-house or third-party.
Use repeatable jobs feature from Bull Queue or any other tool or library that provides similar features.
Bull will hand over the job to any active processor. That way, it ensures the job is processed only once by only one active processor.
Nest.js has a wrapper for this. Read more about Bull queue repeatable jobs here.
Implement a custom locking mechanism
It is not as difficult as it sounds. Many schedulers in other frameworks handle concurrency on similar principles.
If you are using an RDBMS, make use of transactions and locking. Create cron records in the database and acquire a lock as soon as the first cron instance enters and starts processing. Other concurrent jobs will either fail or time out because they cannot acquire the lock. You will still need to handle a few edge cases to make this approach robust.
If you are using MongoDB or any similar database that supports TTL (time-to-live) indexes and unique indexes, insert a document in which one field carries a unique constraint; a concurrent job will fail to insert a second document because of the database-level unique constraint. Also set a TTL index on the document so that it is deleted after a configured time.
These are workarounds if you don't have any other concrete options.
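As a minimal sketch of the unique-constraint idea above, the in-memory Set below stands in for a database collection with a unique index; in production the check-and-insert would be a single atomic insert that throws on a duplicate key (e.g. MongoDB's E11000). All names here are made up for illustration:

```typescript
// `uniqueIndex` stands in for a collection with a unique index on the job key.
const uniqueIndex = new Set<string>();

function tryAcquireLock(jobKey: string): boolean {
  if (uniqueIndex.has(jobKey)) return false; // another replica already inserted
  uniqueIndex.add(jobKey);
  return true;
}

function runBiEventProcessorOnce(replica: string, windowKey: string): string {
  if (!tryAcquireLock(windowKey)) {
    return `${replica}: skipped`;
  }
  return `${replica}: processing`;
}

// Three replicas wake up for the same 5-minute window; only one wins the lock.
const windowKey = "bi-event-processor:2024-01-01T00:05";
const results = ["replica-1", "replica-2", "replica-3"].map(
  (r) => runBiEventProcessorOnce(r, windowKey)
);
```

Exactly one of the three results is "processing"; the other replicas skip the window because the insert "fails" on the already-present key.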
There are quite a few options for solving this, but I would suggest creating a NestJS microservice (or a plain Node.js service) that runs only the cron job and writes its result to a shared store such as Redis.
Your microservice that runs the cron job does not expose anything; it only starts your cron job:
const app = await NestFactory.create(WorkerModule);
await app.init();
Your WorkerModule imports the scheduler and configures it there. You can write the result of the cron job to a shared database like Redis.
Now you can still use 3 replicas but avoid registering the cron jobs in all of them.
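One pragmatic way to avoid registering the cron jobs in every replica is to guard registration behind an environment flag that only the worker deployment sets. This dependency-free sketch only illustrates the guard; ENABLE_CRON and the function names are hypothetical:

```typescript
// Hypothetical guard: only the dedicated worker deployment sets ENABLE_CRON,
// so the web replicas boot without registering any cron jobs.
function shouldRegisterCrons(env: Record<string, string | undefined>): boolean {
  return env.ENABLE_CRON === "true";
}

const registeredBy: string[] = [];

// Stand-in for application bootstrap: register crons only when the flag is set.
function bootstrap(env: Record<string, string | undefined>, replica: string): void {
  if (shouldRegisterCrons(env)) {
    registeredBy.push(replica); // in a real app: wire up @nestjs/schedule here
  }
}

bootstrap({ ENABLE_CRON: "true" }, "worker");
bootstrap({}, "web-1");
bootstrap({}, "web-2");
bootstrap({}, "web-3");
```

In a real deployment the flag would come from the container environment, so the same image can serve as both web replica and worker.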
Spring Cloud Data Flow is a great solution, and I'm currently trying to find a way to preconfigure tasks so that they can be triggered manually.
The use case is very simple:
as a DevOps engineer, I should be able to preconfigure tasks, which includes creating the execution graph plus the application and deployment parameters, and saving the task with all the parameters needed for execution.
as a user with the role ROLE_DEPLOY, I should be able to start, stop, and restart the preconfigured tasks and monitor their execution.
A preconfigured task is a task with all the parameters needed for execution.
Is it possible to have such functionality?
Thank you.
You may want to review the continuous deployment section of the reference guide. There are built-in lifecycle semantics for orchestrating tasks in SCDF.
Assuming you have the Task applications already built and that the DevOps persona is familiar with how the applications work together, they can either interactively or programmatically build a composed-task definition and task/job parameters in SCDF.
The above step persists the task definition in the SCDF's database. Now, the same definition can be launched manually or via a cronjob schedule.
If nothing really changes to the definition, app version, or the task/parameters, yes, anyone with the ROLE_DEPLOY role can interact with it.
You may also find the CD for Tasks guide useful reference material. Perhaps repeat it locally (with the desired security configuration) in your environment to get a feel for how it works.
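For instance, the DevOps persona could persist a composed-task definition like the following in the SCDF shell (the task names and property are placeholders); the definition lands in SCDF's database and can then be launched manually or on a schedule by anyone with the right role:

```shell
dataflow:> task create my-composed-task --definition "taskA && taskB"
dataflow:> task launch my-composed-task --properties "app.taskA.some.prop=value"
```

Here `taskA && taskB` is the composed-task DSL for "run taskB after taskA completes successfully".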
I am using SCDF on Kubernetes. Every time I run a task, I have to enter the arguments and parameters. Is there a way to save them along with the task definition so that I don't need to enter them every time?
By design, the task definition doesn't include the arguments or batch job parameters (if your task is a batch job). This is because the task arguments and parameters are bound to vary and typically provided externally. But, if you want them to be part of the task application, then you can have them configured as task application properties instead of arguments.
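To illustrate the distinction (the property names below are hypothetical): values passed as launch arguments must be re-supplied on each launch, while the same values baked into the task application's own configuration travel with the app:

```properties
# application.properties inside the task application's jar --
# these apply to every launch without being re-entered
my.task.input-dir=/data/in
my.task.batch-size=100
```

Keep a value as a launch argument only if it genuinely varies per execution (e.g. a run date); otherwise move it into the application properties.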
Is it safe to run multiple instances of the quartz.net scheduler?
If so, how do I do it?
You can use quartz_jobs.xml to configure jobs. Create StatefulJobs and use job chaining to run jobs sequentially on a single-thread scheduler (pointing to a RAMJobStore); another scheduler pointing to a persistent data store can run simultaneously.
http://quartz-scheduler.org/documentation/faq#FAQ-chain
If you need to persist all jobs to a single database, you can use two schedulers with clustering, but you won't get to choose which job runs on which scheduler, so your jobs will run sequentially but may not run on the single-thread scheduler. Two schedulers can be run if having two sets of Quartz tables with different prefixes is not an issue.
http://quartz-scheduler.org/documentation/quartz-1.x/cookbook/MultipleSchedulers
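A minimal clustered configuration might look like the following sketch (the scheduler name, connection string, and table prefix are placeholders, and the exact data source provider name varies by Quartz.NET version):

```properties
quartz.scheduler.instanceName = MyClusteredScheduler
quartz.scheduler.instanceId = AUTO
quartz.jobStore.type = Quartz.Impl.AdoJobStore.JobStoreTX, Quartz
quartz.jobStore.clustered = true
quartz.jobStore.tablePrefix = QRTZ_
quartz.jobStore.dataSource = default
quartz.dataSource.default.connectionString = Server=...;Database=quartz;...
quartz.dataSource.default.provider = SqlServer
```

A second scheduler against the same database would use a different instanceName and a different tablePrefix (pointing at its own set of tables).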
I have one SQL Server database as the job store, two web applications that both can schedule jobs, and a Quartz.NET windows service to execute jobs.
I want the two web applications to just schedule jobs, while the windows service just to execute jobs.
Here comes the problem:
If I create IScheduler instances in the two web applications and in the windows service, they will all execute jobs at the same time, and there can be conflicts.
If I do not create IScheduler instances in the two web applications, how can I schedule jobs on the windows service from the web applications?
Is there a way to let an IScheduler just schedule jobs without executing them? (I can deploy the IJob assemblies to all three applications.)
You probably don't want to instantiate an IScheduler instance in the websites, precisely because creating a local instance also executes jobs.
I've implemented something similar to what you're looking to do.
First, make sure that in your service's configuration file (app.config) you configure the four keys quartz.scheduler.exporter.type, quartz.scheduler.exporter.port, quartz.scheduler.exporter.bindName, and quartz.scheduler.exporter.channelType.
Second, make sure that your web.config has the following four keys configured: quartz.scheduler.instanceName, quartz.scheduler.instanceId, quartz.scheduler.proxy, and quartz.scheduler.proxy.address.
Then when you create your StdSchedulerFactory() and use it to get a scheduler, you are not instantiating a new scheduler, but attaching to an existing scheduler. You can then do anything through the remote scheduler that you could do with a local one, but there is only a single instance that executes jobs.
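A hedged sketch of the two configuration halves, based on Quartz.NET's remoting settings; the port, bind name, and scheduler name are placeholders, and the instance name and address must match between server and clients:

```properties
# Windows service (server) -- exports the scheduler over remoting
quartz.scheduler.instanceName = ServerScheduler
quartz.scheduler.exporter.type = Quartz.Simpl.RemotingSchedulerExporter, Quartz
quartz.scheduler.exporter.port = 555
quartz.scheduler.exporter.bindName = QuartzScheduler
quartz.scheduler.exporter.channelType = tcp

# Web applications (clients) -- attach to the remote scheduler as a proxy
quartz.scheduler.instanceName = ServerScheduler
quartz.scheduler.instanceId = NON_CLUSTERED
quartz.scheduler.proxy = true
quartz.scheduler.proxy.address = tcp://localhost:555/QuartzScheduler
```

With the proxy keys in place, StdSchedulerFactory in the web applications returns a remote proxy, so scheduling calls go to the windows service and no jobs execute in the web processes.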
In your config file, set the key "quartz.threadPool.type" to "Quartz.Simpl.ZeroSizeThreadPool, Quartz". The Quartz scheduler thus created can schedule and modify jobs, triggers, and calendars, but cannot run jobs.