I've got an app that allows users to schedule tasks to run whenever they desire. (A todo list with recurring items.)
I need to somehow re-trigger these events to show up again each time their schedule comes up by updating an attribute on the object - it may also send a notification to the user.
My plan for this was to have a cron job that runs every minute/hour/short interval, and in that job, it would find all of the items with schedules that match the current time or should be updated since the last job, however, short of iterating through every item, I don't see a quick way of querying for those objects.
Using Ice Cube I can very easily and cleanly save schedules in my database, but I don't see a method of finding all events that match a particular point in time.
I know once I find the item I can run occurring_between? or occurring_at? to find if I should run it, but that requires pulling every single item into memory and manually checking it, which is not very scaleable.
Is there a way I'm missing, or are there other suggestions for accomplishing what I'm trying to do here? It's still pretty early, so I'm not attached to Ice Cube or any of the current implementations.
After some more thought- I'm not seeing any way to do this, so I've come up with a little hack that I'll do instead:
On the item object, I'll have 2 additional attributes. One will be the schedule which is fed to Ice Cube to generate all of the dates/times to recur at. The next will be next_occurrence, which I'll set on create and each time the item is renewed.
Then in the worker, I'll query for all items that have a next_occurrence in the past and process them, resetting the next_occurrence to be the next time the schedule is to occur.
I'll leave this answer unmarked for a bit in case anybody has a better solution.
Related
I have the following feature on a web app:
For the picture above, I save the time specified on a Schedule model for some work to be performed. At the moment this works perfectly for "Simple" selections meaning that if a user selects Hourly, Weekly, Daily, or Monthly only, then I have a fleet of cron jobs that will start Hourly, Weekly, Daily, or Monthly. So just to be clear, I have 4 cron jobs already placed in my crontab that will run at the specified time and pick the corresponding schedule models.
The problem that I'm facing is the Custom feature. I'm stumped as to how I can effectively create a background task to run on custom input for a user. Here is an example scenario. Let's say I have 3 users who each select a custom time to perform some work.
User 1 selects a custom task to be done on January, February, and March only.
User 2 selects a custom task to be done on Mondays at 1, 3, 6, and 9 pm.
User 3 selects a scan to be done on Friday through Saturday every month only.
And the customization goes on and on.......
What would be an effective and feasible way of implementing this kind of behavior? For the moment, I'm stumped and haven't come up with a way to even approach this based on the requirement. From what I have read, I believe a delayed job will help me based on this stack overflow question custom job based on input but with so much variability, how would this be possible. Some help and guidance would be appreciated. Thank you.
I'd probably go with a mechanism that enqueues the job once for the next occurrence when the configuration is saved. Then, after it has been run, calculate the next occurrence and so on. Don't forget to first remove the already-scheduled run when making changes to the configuration. This can be tricky with some ActiveJob adapters which is why, in case you haven't made a decision towards a specific one yet, I'd like to recommend que. But you can do it in Sidekiq, too.
To facilitate calculation of the next occurrence, you could use, for example, the ice_cube gem.
Essentially each time a visitor reaches the application, the controller performs a database query to check what are the most relevant items to show.
Although the items shown vary with time, they are not personally selected for each user.
This means that instead of being calculated each time a visitor comes, it would be better to be system performing a single query every like 10 minutes and store it, to apply on each visit.
What is the best way to apply this idea? I was thinking on cronjobs and maybe store on redis but IDK, some help is appreciated!
There are a number of ways to do this. One way that I've used in the past with success is to have a table in your database that represents the most relevant items and then have a cron job that updates that table.
Fragment caching like #wesley6j recommended isn't a bad way to go either and you can combine the 2 techniques as well if you want.
If you want more detailed suggestions, you can provide some more details about what you are trying to achieve.
This has been quite a stumbling block. Warning: the following is not a question, rather explanation of what I came up with. My question is — do you have a better way to do this? Is there some common technique for this that I'm not familiar with? Seems like this is a trivial problem.
So you have Task model. You can create tasks, complete them, destroy them. Then you have recurring tasks. It's just like regular task, but it has a recurrence rule attached to it. However, tasks can recur indefinitely — you can go a year ahead in the schedule, and you should see the task show up.
So when a user creates a recurring task, you don't want to build thousands of tasks for hundred years into the future, and save them to database, right? So I started thinking — how do you create them?
One way would be to create them as you view your schedule. So, when the user is moving a month ahead, any recurring tasks will be created. Of course that means that you can't simply work with database records of tasks any longer. Every SELECT operation on tasks you ever do has to be in the context of a particular date range, in order to trigger recurring tasks in that date range to persist. This is a maintenance and performance burden, but doable.
Alright, but how about the original task? Every recurrent task gets associated with the recurrence rule that created it, and every recurrence rule needs to know the original task that started the recurrence. The latter is important, because you need to clone the original task into new dates as the user browses their schedule. I guess doable too.
But what happens if the original task is updated? It means that now as we browse the schedule, we will be creating recurring tasks cloned off of the modified task. That's undesirable. All the implicitly persisted recurring tasks should show up the way the original task looked like when recurrence was added. So we need to store a copy of the original task separately, and clone from that, in order for recurrence to work.
However, when the user navigates the tasks in the schedule, how do we know if at a particular point a new recurrence task needs to be created? We ask recurrence rule: "hey, should I persist a task for this day?" and it says yes or no. If there is already a task for this recurrence for this day, we don't create one. All nice, except a user shall also be able to simply delete one of the recurring tasks that has been automatically persisted. In that case following our logic, the system will re-create the task that has been deleted. Not good. So it means we need to keep storing the task, but mark it as deleted task for this recurrence. Meh.
As I said in the beginning, I want to know if somebody else tackled this problem and can provide architectural advice here. Does it have to be this messy? Is there anything more elegant I'm missing?
Update: Since this question is hard to answer perfectly, I will approve the most helpful insight into design/architecture, which has the best helpfulness/trade-offs ratio for this type of problem. It does not have to encompass all the details.
I know this is an old question but I'm just starting to look into this for my own application and I found this paper by Martin Fowler illuminating: Recurring Events for Calendars
The main takeaway for me was using what he calls "temporal expressions" to figure out if a booking falls on a certain date range instead of trying to insert an infinite number of events (or in your case tasks) into the database.
Practically, for your use case, this might mean that you store the Task with a "temporal expression" property called schedule. The ice_cube recurrence gem has the ability to serialize itself into an active record property like so:
class Task < ActiveRecord::Base
include IceCube
serialize :schedule, Hash
def schedule=(new_schedule)
write_attribute(:schedule, new_schedule.to_hash)
end
def schedule
Schedule.from_hash(read_attribute(:schedule))
end
end
Ice cube seems really flexible and even allows you to specify exceptions to the recurrence rules. (Say you want to delete just one occurrence of the task, but not all of them.)
The problem is that you can't really query the database for a task that falls in a specific range of dates, because you've only stored the rule for making tasks, not the tasks themselves. For my case, I'm thinking about adding a property like "next_recurrence_date" which will be used to do some basic sorting/filtering. You could even use that to throw a task on a queue to have something done on the next recurring date. (Like check if that date has passed and then regenerate it. You could even store an "archived" version of the task once its next recurring date passes.)
This fixes your issue with "what if the task is updated" since tasks aren't ever persisted until they're in the past.
Anyway, I hope that is helpful to someone trying to think this through for their own app.
Having done a calendar-like component for an internal social networking app, here's my approach to that problem.
Tiny bit of background: I needed to book boardrooms for meetings for the entire company. Every boardroom needed to be booked either as a one-off or on a recurring basis. As you've found out, it's the recurrence rules that kill you. The additional twist to my problem was that there could be conflicts, i.e. two people could try to book the same boardroom for the same date and time.
I split my models into Boardroom (obviously) and Event (which is the booking associated to a User). I think there was a join model, as well, but it's been a while. When a User would try to book a boardroom, this is the process taken:
Attempt to book on the first available date (done through the calendar UI by the user similar to how Google Calendar creates events)
If it's a one-off, you're done
If it's a recurring event, try to immediately book the next 6 events based on the rule given (weekly, bi-weekly, monthly); If it fails, due to conflict, book the ones you can, e-mail the conflicts to the user
Book for the next year or up to the date the recurrence is ending in a background job; Follow the conflict resolution rule from #3
When resolving the conflicts, the user had the option of either resolving them on a case-by-case basis or moving the remaining bookings to the new, available date and time.
If the user updated the original booking (e.g changed the time and date), he/she had the option of updating only the that one or every following recurrence. If the latter was selected, steps 3 and 4 are re-invoked after the deletion of existing events.
If this sounds a lot like Google Calendar, then you've fully understood my approach, :)
Hope this helps.
I personally think that (in python which I know well), and ruby (which I know less well, but it's a dynamic language, and so I think the concepts map 1:1), you should be using generators. How's that for a minimalistic answer? Now, when you generate your UI, you pass in a reference to the generator, and it generates the objects you need, as they are requested.
As an interface, it has next item, and previous item methods, and acts a bit like a cursor that can wade forward and backward through the various interations. It is in fact, a piece of code masquerading as an infinite series (array) without using infinite memory.
Why do you need to proliferate objects? What you really need are virtual data display controls (for the web or desktop) also known as "paging" I think, in web contexts, and you can think of your schedule as an infinite generated-on-demand spreadsheet, with no top row, and no bottom row. The only values you need to be able to calculate (calculate, not store) are the ones that appear right now, as visible to the user.
The situation:
The magazine accepts submissions. Once you submit, an editor will schedule your submission for review. Once it has been reviewed, you are no longer allowed to edit it.
So, I have submission in various states. "Draft", "queued", "reviewed", etc. Most of the switches into these various states are triggered by some action, e.g., a submission becomes queued when an editor schedules it. Easy peasey. However, the switch into the "reviewed" state is not triggered by any action, it just happens after a certain datetime has passed.
I have two thoughts on how to accomplish this:
Run a daily/hourly cron job to check up on all the queued submissions and switch them to reviewed if necessary. I dislike this because
I would prefer it to be hourly, so that I can edit my submission up to three hours before a meeting starts
Hourly cron jobs cost money on Heroku, and this application will either never make money or won't make money for months and months to come
Somehow construct a before_load ActiveRecord callback, that will perform some logic on submissions each time they are loaded. "Queued? No? Nevermind. Otherwise, switch it to 'Reviewed' if its meeting is less than three hours away."
I wanted to get people's input on the second idea.
Is that an atrociously smelly way to accomplish this?
If so, can you suggest an awesomer third way?
If 'no' to both of the above, can you give tips on how to perform such logic each time a record is loaded from the database? I would need to always perform some logic before doing a select from the submissions table (which is gearing up to be the most-queried table in the app...)
If there's no good way to accomplish Option Two (or, I hope!, Option Three), I will resort to Option One with a daily cron job. Being able to edit up to a day before a meeting will just have to suffice.
Maybe using after_find, although your performance will sort of suck, same goes if you do something as crazy as before_load, performance would suck, that said money might be more important than performance, if that is so, I would go with the after_find.
Like with browser games. User constructs building, and a timer is set for a specific date/time to finish the construction and spawn the building.
I imagined having something like a deamon, but how would that work? To me it seems that spinning + polling is not the way to go. I looked at async_observer, but is that a good fit for something like this?
If you only need the event to be visible to the owning player, then the model can report its updated status on demand and we're done, move along, there's nothing to see here.
If, on the other hand, it needs to be visible to anyone from the time of its scheduled creation, then the problem is a little more interesting.
I'd say you need two things. A queue into which you can put timed events (a database table would do nicely) and a background process, either running continuously or restarted frequently, that pulls events scheduled to occur since the last execution (or those that are imminent, I suppose) and actions them.
Looking at the list of options on the Rails wiki, it appears that there is no One True Solution yet. Let's hope that one of them fits the bill.
I just did exactly this thing for a PBBG I'm working on (Big Villain, you can see the work in progress at MadGamesLab.com). Anyway, I went with a commands table where user commands each generated exactly one entry and an events table with one or more entries per command (linking back to the command). A secondary daemon run using script/runner to get it started polls the event table periodically and runs events whose time has passed.
So far it seems to work quite well, unless I see some problem when I throw large number of users at it, I'm not planning to change it.
To a certian extent it depends on how much logic is on your front end, and how much is in your model. If you know how much time will elapse before something happens you can keep most of the logic on the front end.
I would use your model to determin the state of things, and on a paticular request you can check to see if it is built or not. I don't see why you would need a background worker for this.
I would use AJAX to start a timer (see Periodical Executor) for updating your UI. On the model side, just keep track of the created_at column for your building and only allow it to be used if its construction time has elapsed. That way you don't have to take a trip to your db every few seconds to see if your building is done.