Run method periodically in manually created interval in Rails - ruby-on-rails

I have a form where a user can set a day of the week and a time period (for example, Monday from 02:00 to 03:00).
Then I want to run a specific method periodically (for example, every 30 seconds), but only during the given time interval: it should run in the background every Monday between 02:00 and 03:00, every 30 seconds.
Is it possible to do this when the time interval is created dynamically (by the user)?

You could define a cron job which runs a rake task (refer to this for syntactic sugar: https://github.com/javan/whenever).
Your rake task could run at a fixed interval (note that cron's finest granularity is one minute) and query the database for new jobs.
However, calling a rake task via cron boots the Rails app every time, so a more sophisticated approach would be to use Sidekiq (http://sidekiq.org/) in combination with sidekiq-scheduler (https://github.com/moove-it/sidekiq-scheduler).
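Whichever scheduler fires the periodic task, it still has to decide on each tick whether the user-defined window is currently open. A minimal plain-Ruby sketch of that check follows; WeeklyWindow and run_if_due are hypothetical names (not part of whenever, Sidekiq, or sidekiq-scheduler), and in a real app the weekday and times would presumably come from the record saved through the form.

```ruby
# Hypothetical value object for a user-defined weekly window.
# wday follows Time#wday (0 = Sunday, 1 = Monday, ...).
# start_minute and end_minute are minutes since midnight.
WeeklyWindow = Struct.new(:wday, :start_minute, :end_minute) do
  # True when `time` is on the configured weekday and inside
  # the half-open interval [start_minute, end_minute).
  def cover?(time)
    minute_of_day = time.hour * 60 + time.min
    time.wday == wday &&
      minute_of_day >= start_minute &&
      minute_of_day < end_minute
  end
end

# Guard that a task fired every 30 seconds could use (hypothetical helper).
def run_if_due(window, now)
  return :skipped unless window.cover?(now)

  # ... call the specific method here ...
  :ran
end

# Monday from 02:00 to 03:00
monday_night = WeeklyWindow.new(1, 2 * 60, 3 * 60)
puts run_if_due(monday_night, Time.now)
```

The window is half-open, so a tick at exactly 03:00 is skipped; whether the end should be inclusive is a design choice the question leaves open.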

Related

How to make a Job time safe? Ensuring it is run when it's supposed to using Date methods

We have CreditCard related rake tasks that are supposed to be run on the 1st every month, to remind our clients to update their payment method if it expired at some point during the previous month.
class SomeJob < ApplicationJob
  def perform
    ::CreditCard
      .where(expiration_date: Date.yesterday.all_month)
      .find_each do |credit_card|
        Email::CreditCard::SendExpiredReminderJob.perform_later(credit_card.id)
      end
  end
end
I'm concerned that this Job, as we currently have it, might not be time-zone safe because of the Date.yesterday.all_month we use to get last month's date range (remember the rake task is run on the 1st of every month).
For example, if for some reason the Job were to be run past midnight (on the 2nd), it would incorrectly notify clients with cards expiring this month when it should have notified those whose cards expired last month.
The safest bet would be to subtract more than 1 day from the date, but I'm not sure that's the cleanest way to go (and later on someone would not understand why we are subtracting 5, 7 days or whatever).
Are there safer ways to do it?
Date.current.prev_month.all_month
.current may be better than .today because it uses the application's time zone (Time.zone) when one is set, and falls back to Date.today otherwise.
.prev_month or .months_ago(1), see https://apidock.com/rails/v3.2.13/Date/prev_month
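To see why anchoring on the previous month is safer than anchoring on yesterday, here is a small sketch using only Ruby's standard Date class (no ActiveSupport, so all_month is spelled out by hand). The helper name previous_month_range is made up for illustration.

```ruby
require "date"

# Returns the full date range of the month before `today`,
# regardless of which day of the month `today` is.
def previous_month_range(today)
  first_of_this_month = Date.new(today.year, today.month, 1)
  first_of_prev_month = first_of_this_month << 1 # Date#<< goes back n months
  last_of_prev_month  = first_of_this_month - 1  # day before the 1st
  first_of_prev_month..last_of_prev_month
end

# Running on the 1st or a day late on the 2nd selects the same month.
puts previous_month_range(Date.new(2024, 3, 1))
puts previous_month_range(Date.new(2024, 3, 2))
```

Both calls cover February 2024, whereas a range derived from yesterday would switch to March as soon as the job runs on the 2nd.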

InfluxDB Continuous Query running on entire time series data

If my interpretation is correct, according to the documentation provided here (InfluxDB Downsampling), when we down-sample data using a Continuous Query running every 30 minutes, it runs only on the previous 30 minutes of data.
Relevant part of the document:
Use the CREATE CONTINUOUS QUERY statement to generate a CQ:
CREATE CONTINUOUS QUERY "cq_30m" ON "food_data" BEGIN
SELECT mean("website") AS "mean_website",mean("phone") AS "mean_phone"
INTO "a_year"."downsampled_orders"
FROM "orders"
GROUP BY time(30m)
END
That query creates a CQ called cq_30m in the database food_data.
cq_30m tells InfluxDB to calculate the 30-minute average of the two
fields website and phone in the measurement orders and in the DEFAULT
RP two_hours. It also tells InfluxDB to write those results to the
measurement downsampled_orders in the retention policy a_year with the
field keys mean_website and mean_phone. InfluxDB will run this query
every 30 minutes for the previous 30 minutes.
When I create a Continuous Query it actually runs on the entire dataset, and not on the previous 30 minutes. My question is, does this happen only the first time after which it runs on the previous 30 minutes of data instead of the entire dataset?
I understand that the query itself uses GROUP BY time(30m) which means it'll return all data grouped together but does this also hold true for the Continuous Query? If so, should I then include a filter to only process the last 30 minutes of data in the Continuous Query?
What you have described is expected functionality.
Schedule and coverage
Continuous queries operate on real-time data. They use the local server’s timestamp, the GROUP BY time() interval, and InfluxDB database’s preset time boundaries to determine when to execute and what time range to cover in the query.
CQs execute at the same interval as the cq_query’s GROUP BY time() interval, and they run at the start of the InfluxDB database’s preset time boundaries. If the GROUP BY time() interval is one hour, the CQ executes at the start of every hour.
When the CQ executes, it runs a single query for the time range between now() and now() minus the GROUP BY time() interval. If the GROUP BY time() interval is one hour and the current time is 17:00, the query’s time range is between 16:00 and 16:59.999999999.
So it should only process the last 30 minutes.
It's a good point about the first run.
I did manage to find a snippet from an old (v0.8) document:
Backfilling Data
In the event that the source time series already has data in it when you create a new downsampled continuous query, InfluxDB will go back in time and calculate the values for all intervals up to the present. The continuous query will then continue running in the background for all current and future intervals.
https://influxdbcom.readthedocs.io/en/latest/content/docs/v0.8/api/continuous_queries/#backfilling-data
That would explain the behaviour you have found.
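The time range described in the quoted "Schedule and coverage" passage can be worked through with a little arithmetic. The sketch below is plain Ruby, not InfluxDB code; cq_window is a hypothetical helper, and it assumes the preset time boundaries are aligned to multiples of the interval counted from the Unix epoch.

```ruby
# Returns [start, end_exclusive] of the range a CQ run would cover,
# for the most recent interval boundary at or before `now`.
# Assumption: boundaries are epoch-aligned multiples of the interval.
def cq_window(now, interval_seconds)
  boundary = Time.at((now.to_i / interval_seconds) * interval_seconds).utc
  [boundary - interval_seconds, boundary]
end

# One-hour interval at 17:00 covers 16:00 up to (not including) 17:00.
p cq_window(Time.utc(2024, 1, 1, 17, 0, 0), 60 * 60)

# Thirty-minute interval, as in cq_30m.
p cq_window(Time.utc(2024, 1, 1, 17, 30, 5), 30 * 60)
```

Under these assumptions each regular run only touches one interval, which matches the documented behaviour for runs after the first.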

How to schedule Sidekiq jobs according to a recurring schedule in database?

I've set up a system where Measurements are run based on MeasurementSettings. Each setting defines a schedule using ice_cube (and some other properties for measurements). I'm able to edit this schedule for each setting and then poll for the next occurrences with something like:
def next_occurrences
schedule.occurrences(Time.now + 1.day)
end
This gives me a set of timestamps when there should be a Measurement.
Now, I also have Sidekiq installed, and I can successfully run a measurement at a specific time using a MeasurementWorker. To do that, I just create an empty Measurement record, associate it with its settings, and then perform_async (or perform_at(...)) this worker:
class MeasurementWorker
include Sidekiq::Worker
sidekiq_options retry: 1, backtrace: true
def perform(measurement_id)
Measurement.find(measurement_id).run
end
end
What's missing is that I somehow need to create the empty measurements based on a setting's schedule. But how do I do this?
Say I have the following data:
MeasurementSetting with ID 1, with a daily schedule at 12:00 PM
MeasurementSetting with ID 2, with an hourly schedule at every full hour (12:00 AM, 01:00 AM, etc.)
Now I need to:
Create a Measurement that uses setting 1, every day at 12:00
Create a Measurement that uses setting 2, every full hour
Call the worker for these measurements
But how?
Should I check, say every minute, whether there is a MeasurementSetting that is defined to occur now, and then create the empty Measurement records, then run them with the specific setting using Sidekiq?
Or should I generate the empty Measurement records with their settings in advance, and schedule them this way?
What would be the simplest way to achieve this?
I would use cron every minute to look for MeasurementSettings that should run NOW. If yes, create a Sidekiq job to run immediately which populates the Measurement.
Here's how I successfully did it:
Update your model that should run on a scheduled basis with a field planned_start_time. This will hold the time when it was planned to start at.
Use the whenever Gem to run a class method every minute, e.g. Measurement.run_all_scheduled
In that method, go through each setting (i.e. where the schedule is), and check if it is occurring now:
setting = MeasurementSetting.find(1) # get some setting, choose whatever
schedule = setting.schedule
unless schedule.occurring_at?(Time.now)
  # skip this measurement, since it's not planned to run now
end
If it is occurring now, check whether we can run the measurement by looking in the database for a previous measurement with the same planned start time. So first, we have to get the planned start time for the current schedule: for example, when it's now 3:04 PM, the planned start time could have been 3:00 PM.
this_planned_start_time = schedule.previous_occurrence(Time.now).start_time
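Then we check whether the most recent measurement for this setting already has that planned start time.
last_measurement = Measurement.where(measurement_setting: setting).order(:planned_start_time).last
if last_measurement && last_measurement.planned_start_time == this_planned_start_time
  # skip this one, since it was already run
end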
If not, we can continue setting up a measurement.
measurement = Measurement.create measurement_setting: setting,
                                 planned_start_time: this_planned_start_time
Then run it:
measurement.delay.run
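The core of the steps above is deduplication: a task that ticks every minute must start each planned occurrence exactly once. Here is a minimal plain-Ruby sketch of that idea, with an in-memory Set standing in for the planned_start_time column and a fixed hourly rule standing in for the ice_cube schedule. planned_start_for and MeasurementLog are hypothetical names, not part of ice_cube or Sidekiq.

```ruby
require "set"

# Stand-in for an hourly schedule (like setting 2 in the question):
# returns the most recent full hour at or before `now`, in UTC.
def planned_start_for(now)
  Time.at(now.to_i - now.to_i % 3600).utc
end

# Remembers which (setting_id, planned_start_time) slots were started.
# In the real app this role is played by the measurements table.
class MeasurementLog
  def initialize
    @started = Set.new
  end

  # Returns true and records the slot the first time it is seen,
  # false on every later tick that falls into the same slot.
  def start_once(setting_id, planned_start_time)
    !@started.add?([setting_id, planned_start_time]).nil?
  end
end

log = MeasurementLog.new
[Time.utc(2024, 1, 1, 15, 4), Time.utc(2024, 1, 1, 15, 5)].each do |tick|
  slot = planned_start_for(tick)
  puts "#{tick}: #{log.start_once(2, slot) ? 'run' : 'skip'}"
end
```

With a database, a unique index on the setting and planned start time would give the same guarantee even if two ticks overlap.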

Rails ActiveRecord jobs queue with maximum execution frequency

I have an ActiveRecord model:
User(id,user_specific_attributes, last_check:datetime, check_priority:integer, today_api_calls:integer)
I make an API call for each User once a day. The API has some important limits:
it's accessible from 4am to 8pm
call frequency limit: 10 per minute = 6 seconds timeout
call count limit: 3000/day
I need to run get_some_data_from_api() for each User once a day (start at 4am). Execution order is defined by check_priority column.
In case of an error from get_some_data_from_api(), the job should be restarted after 6 seconds (API limit).
Is there any gem suitable for this case?
Gems like Sidekiq, Delayed Job, and Resque seem unsuitable: with them I would need to enqueue every job with a specific run time. Consider:
Adding new job with high priority (requeue all next jobs?)
Job execution can take more than 6 seconds
Restarting job in case of error (requeue all next jobs?)
Delayed Job would also work. It has options to run a job at a specific time and to reschedule it if it fails.
DJ runs as a separate instance of your app. You can assign priorities to jobs. The time of execution does not matter. It has built-in options to configure retry on failure.
To schedule jobs at a specific time and to reschedule them, I use self-perpetuating jobs: after the job is done, it reschedules itself. Something like
def self.run_me
  # ... do the work ...
  next_day = 1.day.from_now
  User.delay(run_at: next_day).run_me
end
You can handle certain errors in the same way. For example, if the API limit is exceeded, you may want to rescue the exception and reschedule for the next day instead of 6 seconds later.
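If jobs are enqueued with explicit run times, the limits from the question translate into simple arithmetic on those times. The following plain-Ruby sketch shows one way to compute them. QueuedUser and plan_calls are hypothetical names, and it is an assumption that a lower check_priority value means "run earlier"; the question does not say which direction the ordering goes.

```ruby
# Hypothetical stand-in for the User model from the question.
QueuedUser = Struct.new(:id, :check_priority)

MIN_GAP_SECONDS  = 6    # 10 calls per minute
DAILY_CALL_LIMIT = 3000 # calls per day

# Returns [[user_id, run_at], ...] in priority order, spaced
# MIN_GAP_SECONDS apart from window_start, dropping anything that
# would exceed the daily limit or land at/after window_end.
# Assumption: a lower check_priority value runs first.
def plan_calls(users, window_start, window_end)
  users.sort_by(&:check_priority)
       .first(DAILY_CALL_LIMIT)
       .each_with_index
       .map { |user, i| [user.id, window_start + i * MIN_GAP_SECONDS] }
       .take_while { |_id, run_at| run_at < window_end }
end

users = [QueuedUser.new(1, 5), QueuedUser.new(2, 1), QueuedUser.new(3, 3)]
plan  = plan_calls(users, Time.utc(2024, 1, 1, 4), Time.utc(2024, 1, 1, 20))
plan.each { |id, run_at| puts "user #{id} at #{run_at}" }
```

Each run_at value could then be passed as the run_at option when enqueuing. A fixed plan like this does not handle retries or jobs that take longer than 6 seconds, which is where the self-rescheduling approach fits better.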

Schedule / trigger action realtime and absolute in rails when a certain condition is true?

I'm looking for a way to trigger a certain action at an absolute time, in real time and without delay, in Ruby on Rails when a certain condition is TRUE.
A simplified example to illustrate this:
Table: times
id | time
1  | 12:00
2  | 12:05
3  | 13:00
Check every second whether the current time equals one of the times in the db table.
If TRUE, it should execute a piece of code (a function) immediately, with no delay.
I have looked into Resque and Delayed Job, but the problem is that they do not support absolute real-time execution: they just add jobs to a queue, which could cause delays in execution; the delay can be a second at most.
Does anyone have experience with the above case and could point me to the best practice for implementing it in Ruby on Rails?
I ended up writing a custom rake task, which is called with the delayed gem every few seconds.
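The check such a task performs on each call might look like the plain-Ruby sketch below. due_ids is a hypothetical helper, and the rows mirror the times table from the question as [id, "HH:MM"] pairs rather than real database records.

```ruby
# Returns the ids of all rows whose "HH:MM" time matches the
# hour and minute of `now`.
def due_ids(rows, now)
  current = now.strftime("%H:%M")
  rows.select { |_id, time| time == current }.map(&:first)
end

rows = [[1, "12:00"], [2, "12:05"], [3, "13:00"]]
p due_ids(rows, Time.now)
```

Because the task runs every few seconds, the same row matches several times within its minute, so the caller would also need to record which ids have already fired.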
