Related
This has been quite a stumbling block. Warning: the following is not a question, rather explanation of what I came up with. My question is — do you have a better way to do this? Is there some common technique for this that I'm not familiar with? Seems like this is a trivial problem.
So you have Task model. You can create tasks, complete them, destroy them. Then you have recurring tasks. It's just like regular task, but it has a recurrence rule attached to it. However, tasks can recur indefinitely — you can go a year ahead in the schedule, and you should see the task show up.
So when a user creates a recurring task, you don't want to build thousands of tasks for hundred years into the future, and save them to database, right? So I started thinking — how do you create them?
One way would be to create them as you view your schedule. So, when the user is moving a month ahead, any recurring tasks will be created. Of course that means that you can't simply work with database records of tasks any longer. Every SELECT operation on tasks you ever do has to be in the context of a particular date range, in order to trigger recurring tasks in that date range to persist. This is a maintenance and performance burden, but doable.
Alright, but how about the original task? Every recurrent task gets associated with the recurrence rule that created it, and every recurrence rule needs to know the original task that started the recurrence. The latter is important, because you need to clone the original task into new dates as the user browses their schedule. I guess doable too.
But what happens if the original task is updated? It means that now as we browse the schedule, we will be creating recurring tasks cloned off of the modified task. That's undesirable. All the implicitly persisted recurring tasks should show up the way the original task looked like when recurrence was added. So we need to store a copy of the original task separately, and clone from that, in order for recurrence to work.
However, when the user navigates the tasks in the schedule, how do we know if at a particular point a new recurrence task needs to be created? We ask recurrence rule: "hey, should I persist a task for this day?" and it says yes or no. If there is already a task for this recurrence for this day, we don't create one. All nice, except a user shall also be able to simply delete one of the recurring tasks that has been automatically persisted. In that case following our logic, the system will re-create the task that has been deleted. Not good. So it means we need to keep storing the task, but mark it as deleted task for this recurrence. Meh.
As I said in the beginning, I want to know if somebody else tackled this problem and can provide architectural advice here. Does it have to be this messy? Is there anything more elegant I'm missing?
Update: Since this question is hard to answer perfectly, I will approve the most helpful insight into design/architecture, which has the best helpfulness/trade-offs ratio for this type of problem. It does not have to encompass all the details.
I know this is an old question but I'm just starting to look into this for my own application and I found this paper by Martin Fowler illuminating: Recurring Events for Calendars
The main takeaway for me was using what he calls "temporal expressions" to figure out if a booking falls on a certain date range instead of trying to insert an infinite number of events (or in your case tasks) into the database.
Practically, for your use case, this might mean that you store the Task with a "temporal expression" property called schedule. The ice_cube recurrence gem has the ability to serialize itself into an active record property like so:
class Task < ActiveRecord::Base
include IceCube
serialize :schedule, Hash
def schedule=(new_schedule)
write_attribute(:schedule, new_schedule.to_hash)
end
def schedule
Schedule.from_hash(read_attribute(:schedule))
end
end
Ice cube seems really flexible and even allows you to specify exceptions to the recurrence rules. (Say you want to delete just one occurrence of the task, but not all of them.)
The problem is that you can't really query the database for a task that falls in a specific range of dates, because you've only stored the rule for making tasks, not the tasks themselves. For my case, I'm thinking about adding a property like "next_recurrence_date" which will be used to do some basic sorting/filtering. You could even use that to throw a task on a queue to have something done on the next recurring date. (Like check if that date has passed and then regenerate it. You could even store an "archived" version of the task once its next recurring date passes.)
This fixes your issue with "what if the task is updated" since tasks aren't ever persisted until they're in the past.
Anyway, I hope that is helpful to someone trying to think this through for their own app.
Having done a calendar-like component for an internal social networking app, here's my approach to that problem.
Tiny bit of background: I needed to book boardrooms for meetings for the entire company. Every boardroom needed to be booked either as a one-off or on a recurring basis. As you've found out, it's the recurrence rules that kill you. The additional twist to my problem was that there could be conflicts, i.e. two people could try to book the same boardroom for the same date and time.
I split my models into Boardroom (obviously) and Event (which is the booking associated to a User). I think there was a join model, as well, but it's been a while. When a User would try to book a boardroom, this is the process taken:
Attempt to book on the first available date (done through the calendar UI by the user similar to how Google Calendar creates events)
If it's a one-off, you're done
If it's a recurring event, try to immediately book the next 6 events based on the rule given (weekly, bi-weekly, monthly); If it fails, due to conflict, book the ones you can, e-mail the conflicts to the user
Book for the next year or up to the date the recurrence is ending in a background job; Follow the conflict resolution rule from #3
When resolving the conflicts, the user had the option of either resolving them on a case-by-case basis or moving the remaining bookings to the new, available date and time.
If the user updated the original booking (e.g changed the time and date), he/she had the option of updating only the that one or every following recurrence. If the latter was selected, steps 3 and 4 are re-invoked after the deletion of existing events.
If this sounds a lot like Google Calendar, then you've fully understood my approach, :)
Hope this helps.
I personally think that (in python which I know well), and ruby (which I know less well, but it's a dynamic language, and so I think the concepts map 1:1), you should be using generators. How's that for a minimalistic answer? Now, when you generate your UI, you pass in a reference to the generator, and it generates the objects you need, as they are requested.
As an interface, it has next item, and previous item methods, and acts a bit like a cursor that can wade forward and backward through the various interations. It is in fact, a piece of code masquerading as an infinite series (array) without using infinite memory.
Why do you need to proliferate objects? What you really need are virtual data display controls (for the web or desktop) also known as "paging" I think, in web contexts, and you can think of your schedule as an infinite generated-on-demand spreadsheet, with no top row, and no bottom row. The only values you need to be able to calculate (calculate, not store) are the ones that appear right now, as visible to the user.
The situation:
The magazine accepts submissions. Once you submit, an editor will schedule your submission for review. Once it has been reviewed, you are no longer allowed to edit it.
So, I have submission in various states. "Draft", "queued", "reviewed", etc. Most of the switches into these various states are triggered by some action, e.g., a submission becomes queued when an editor schedules it. Easy peasey. However, the switch into the "reviewed" state is not triggered by any action, it just happens after a certain datetime has passed.
I have two thoughts on how to accomplish this:
Run a daily/hourly cron job to check up on all the queued submissions and switch them to reviewed if necessary. I dislike this because
I would prefer it to be hourly, so that I can edit my submission up to three hours before a meeting starts
Hourly cron jobs cost money on Heroku, and this application will either never make money or won't make money for months and months to come
Somehow construct a before_load ActiveRecord callback, that will perform some logic on submissions each time they are loaded. "Queued? No? Nevermind. Otherwise, switch it to 'Reviewed' if its meeting is less than three hours away."
I wanted to get people's input on the second idea.
Is that an atrociously smelly way to accomplish this?
If so, can you suggest an awesomer third way?
If 'no' to both of the above, can you give tips on how to perform such logic each time a record is loaded from the database? I would need to always perform some logic before doing a select from the submissions table (which is gearing up to be the most-queried table in the app...)
If there's no good way to accomplish Option Two (or, I hope!, Option Three), I will resort to Option One with a daily cron job. Being able to edit up to a day before a meeting will just have to suffice.
Maybe using after_find, although your performance will sort of suck, same goes if you do something as crazy as before_load, performance would suck, that said money might be more important than performance, if that is so, I would go with the after_find.
I'm looking for guidance on how to architect an elegant solution to what has become a bit of a thorny problem. Although I am using Ruby (and Rails) I think my problem is largely an architectural one, though my choice of language obviously has an impact in terms of suggestions involving libraries, etc., so the language remains relevant.
Anyway, in a nutshell: my application contains objects representing memberships, belonging to people who are members of fitness facilities. Memberships contain a series of recurring payments. Some memberships automatically renew at the end of their term, while others do not.
So for example, you may have a membership that is for an initial period of one year, and then renews month-to-month after that. In the application, creating a membership of this kind causes 12 recurring payments to be created. When the last month expires, so does the membership. A daily cron task is responsible for causing memberships to expire based on completed payments. If the membership is set to automatically renew, the same cron task will renew the membership.
You may also have memberships that have no initial term and simply run month-to-month or week-to-week. These work in a similar manner, minus the initial payment scheduling.
So far so good. What makes things complicated are the additional requirements:
administrators can "freeze" memberships (put them on hold), for specific durations, after which they automatically reactivate (e.g. to represent people who go away on vacation for a set period of time). I can choose to freeze a membership right now and have it reactivate later, or I can choose to schedule a freeze by setting the freeze date at some point in the future, as well as the reactivation date (note: there is always a reactivation date, which makes things a bit easier).
administrators can cancel memberships, either right now, or by setting a cancellation to occur in the future. (Future cancellations are not yet built.)
administrators can refund memberships, which is like a cancellation except any past payments are refunded.
What makes these difficult to deal with is the effect on recurring payments. When you freeze a membership, the recurring payments must "stretch out" around the freeze period, so that the period of time that represents the freeze is not paid for. This is both conceptually and programmatically difficult to handle. Payments, for example, may extend for different periods (i.e. each payment for someone who pays every other week pays for two weeks of a membership), and the date of cancellation may be anywhere within the period the payment covers.
For the freezes, I have taken the approach where the membership object contains some dates, namely "freeze_on" and "thaw_on" to handle the freeze period. However, the client now wants future cancellations as well, and I have noticed some bugs with the freezing functionality, which leads me to believe I need to reconsider my approach.
I am considering changing things so that future events can be scheduled but have no effect on the recurring payments portion of the application. The idea would be to queue up particular events. For example, a freeze in the future would be accomplished by queuing up a freeze event on a particular date, and a thaw event on a subsequent date (these two events would be connected into a single "scheduled freeze" from the user's perspective). A future cancellation would be handled similarly.
This approach has some benefits, for example, if you wanted to cancel a future cancellation (that's the kind of annoying, tricky stuff I'm talking about), you could simply remove the scheduled cancellation from the events queue.
However, I have the nagging feeling that I may simply be jumping from the frying pan into the fire. I'm wondering if anyone could provide me with some guidance on this issue. Are there design patterns or existing architectural principles for this sort of problem that I can examine?
An additional thing to note is that recurring payments for memberships with scheduled terms (i.e. not month-to-month automatically renewing) must exist as database records that can be edited (moved in time, price adjusted), so using temporal expressions (as Martin Fowler suggests) is not appropriate for this problem, so far as I know. I realize that my proposed solution of an events queue would not display to the user the changes that would happen to any existing recurring payments, but I think I can live with that.
not a scanlife bar code, it's a qr code
toronto, give us your creative people
Edit: To respond to the two great suggestions below (the comment boxes don't allow nearly this level of detail):
Kris Robison:
Yes, the freeze period can be an arbitrary length, although in practice I imagine it would be rare for it to be less than two weeks. But any solution should work regardless of the length of the period.
Yes, the renewal date changes - it is pushed forward by the length of the freeze. So if the freeze is two weeks long, it pushes the payment forward by two weeks. To make things especially tricky, in some businesses, the payments can only be withdrawn on specific dates - for example, some clubs only process payments on the 1st and 15th of each month. So when dates are pushed around, for these clubs, they have to "snap" to a particular date.
Can you explain in more detail why these rules affect event queuing but not management of subscription payments?
I'm interested in your amortization table concept. That's basically exactly what I have built already - a year-long membership with monthly payments creates 12, with weekly it created 52 - and each of these have an amount, tax, etc., associated with them, along with a state machine that governs "pending", "paid", "failed", and "refunded" states.
The part I am struggling with is how this table responds to events. Right now, if you set a freeze, it affects the table immediately by changing the dates of the payments. Set a freeze in the middle of the table, and it pushes payments forward. That sounds effective, but it's actually quite complex and hard to manage. How would your amortization table idea improve this situation?
Arsen7:
This sounds like the event queue I proposed originally. It seems obvious to me that you've worked with stuff like this before (I was impressed by your error check on the processing date, this is a great idea and one I intend to implement ASAP) so I'm hoping that you can explain your suggestion in a little more detail.
Specifically, I'm wondering how your concept would deal with the recurring payment situation I've described in my original question, and in the comment that I just left on Kris Robison's answer. If I have set up a schedule of recurring payments for a given purchase, and a freeze event is scheduled for right in the middle of the payments, would the schedule of payments remain unchanged until the date of the freeze became the current date, at which time the freeze would be instituted and the payments would move forward?
This strikes me as perhaps a great way to simplify my application, but I am wondering how users would perceive it. How would I indicate to them that the schedule of payments they were looking at when a freeze has been scheduled is no longer an accurate schedule, but will change once the freeze takes place?
Is it acceptable to apply a scheme used by banking, where you process all account operations once a day?
Every object may have a set of (future) operations, like freeze periods, and every day the object has to make a simple decision, like: "should I expire today or not?"
The good part is, that such daily processing is very simple to program. Also a strange renewal rules (in case you would want them) are simple to design: "is it Friday? is it the last one in this month? If yes, mark me as renewed, add some amount to required payment, or do anything".
That would be very costly (in terms of computing power) to calculate the status dynamically, every time the object is asked. If you store the current "account", you only need more complex calculations when you want to predict future state.
Consider it a pseudo-code:
def process(day)
raise "Already processed or missed a day" unless day == last_processed_day + 1
check_expiration day
check_frozen day
check_anything day
#...
self.last_processed_day = day
self.save!
end
RESPONSE:
Specifically, I'm wondering how your concept would deal with the recurring payment situation I've described in my original question, and in the comment that I just left on Kris Robison's answer. If I have set up a schedule of recurring payments for a given purchase, and a freeze event is scheduled for right in the middle of the payments, would the schedule of payments remain unchanged until the date of the freeze became the current date, at which time the freeze would be instituted and the payments would move forward?
This strikes me as perhaps a great way to simplify my application, but I am wondering how users would perceive it. How would I indicate to them that the schedule of payments they were looking at when a freeze has been scheduled is no longer an accurate schedule, but will change once the freeze takes place?
The "daily processing" scheme helps you with providing quick responses for questions which require complex calculations.
You have three "groups": current state (asked often), history (almost never changing, asked relatively rarely), and future.
Your daily processing procedure is not constrained to only update "current" state. If some event is scheduled for the processed day, the procedure probably needs to add "history" records.
If your users will often ask questions about future (and as you said, they will), the "processing" may also create a kind of cache for these questions. For example: find (calculate) the next payment date, and write it in a helper table (a schedule).
The main thing you need to do is to decide which questions will be asked by users, and whether you are able to calculate the responses on-the-fly or you need to have the answer prepared.
In a bank (it varies, of course) if you ask about your current balance, they may give you the answer which was true at the beginning of the day. "Better" banks will tell you that you had X$ at the morning, but now there are also Y$ waiting for accounting.
So, if you put a freeze record into the event queue, you may call a method, which will update the schedule at once. The same procedure (or a very similar) will or may be called in the daily processing routine.
Couple of questions come to mind:
Can the freeze be for time periods less than a month?
If so, does the renewal date change? Or how does the monthly payment get applied for the partial month?
Those affect how your event queuing system may work, but don't ultimately change the management of subscription payments.
One way to address subscription issues is to create an amortization table of subscription payments. If they are paying monthly for a year, twelve payments are queued up in the table. A weekly customer may have 52 payments in the table for the same year. Each time the renewal date comes up, after checking whether a freeze is in place, apply the next payment.
The amortization table keeps track of which payments have been made. If an account cancels, unpaid rows are refunded. If an account freezes according to your event queue, no payment is applied and the table remains static until the account is thawed.
Reply
It sounds like you have the concept of renewal date built into the amortization table. I use the amortization table more as a queue, and keep just the next renewal date with the subscription.
If the renewal date is inherent with the amortization table, then yes, it would be complicated to make changes as things go along. However, the renewal date should only affect the day on which you check to see if another payment gets applied.
If you are preserving partial payments while a subscription is on hold, and if a hold can be for an unspecified period of time, having the duration value on the amortization table lets you push a partial payment back in the queue in a "credit" state with a duration equal to the remaining time left in the payment. That way when the account is thawed, that partial credit is applied first and you calculate the next renewal date from the remaining duration.
Use some form of ordered list to preserve payment order at this point. It also comes in handy should someone ever want to insert a renewal period's worth of credit for customer service reasons.
How scalable is memcache in an enviornment where a cache is potentially getting expired every second. In fact, my question is not just about scalability of memcached but about situations where a model is continuously changing and the best way to scale that type of environment. One might say, why cache if the cache is getting expired every second.
Consider this in an hypothetical app, where people are posting marking posts as favorites and let's just consider that there are thousands of people constantly marking posts favorite and creating a favorite record as a result. With each insertion the post view needs to be updated to show the current stats about posts, how many people made it favorite, a user's favorite count etc etc.
We were thinking this could be cached to show only a snapshot taken every x many minutes..but is there a good way to make this more real time in rails?
Try such algo:
Update db records for post with +1 vote
Create cache value || update existing cache +1 for post fav count
Create cache value || update existing cache +1 for user fav count
Because cache has no transactions values may be not consistent with current DB values -
rare parallel read and write race condition may occur. But these inconsistencies will expire as soon cache entry invalidates and values are recalculated from DB.
In above scenario cache sweeping operation may be performed every minute or even every hour - depending how accurate stats should be. Remember that you don't lose anything - all accurate data is kept in DB.
Most important here is that users see realtime value change for vote.
Memcached read/write cost is definitely less than DB here because memcache is just multi-access hash store with no transactions.
I have a rails app that tracks membership cardholders, and needs to report on a cardholder's status. The status is defined - by business rule - as being either "in good standing," "in arrears," or "canceled," depending on whether the cardholder's most recent invoice has been paid.
Invoices are sent 30 days in advance, so a customer who has just been invoiced is still in good standing, one who is 20 days past the payment due date is in arrears, and a member who fails to pay his invoice more than 30 days after it is due would be canceled.
I'm looking for advice on whether it would be better to store the cardholder's current status as a field at the customer level (and deal with the potential update anomalies resulting from potential updates of invoice records without updating the corresponding cardholder's record), or whether it makes more sense to simply calculate the current cardholder status based on data in the database every time the status is requested (which could place a lot of load on the database and slow down the app).
Recommendations? Or other ideas I haven't thought of?
One important constraint: while it's unlikely that anyone will modify the database directly, there's always that possibility, so I need to try to put some safeguards in place to prevent the various database records from becoming out of sync with each other.
The storage of calculated data in your database is generally an optimisation. I would suggest that you calculate the value on every request and then monitor the performance of your application. If the fact that this data is not stored becomes an issue for you then is the time to refactor and store the value within the database.
Storing calculated values, particularly those that can affect multiple tables are generally a bad idea for the reasons that you have mentioned.
When/if you do refactor and store the value in the DB then you probably want a batch job that checks the value for data integrity on a regular basis.
The simplest approach would be to calculate the current cardholder status based on data in the database every time the status is requested. That way you have no duplication of data, and therefore no potential problems with the duplicates becoming out of step.
If, and only if, your measurements show that this calculation is causing a significant slowdown, then you can think about caching the value.
Recently I had similar decision to take and I decided to store status as a field in database. This is because I wanted to reduce sql queries and it looks simpler. I choose to do it that way because I will very often need to get this status and calculating it is (at least in my case) a bit complicated.
Possible problem with it is that it get out of sync, so I added some after_save and after_destroy to child model, to keep it synchronized. And of course if somebody would modify database in different way, it would make some problems.
You can write simple rake task that will check all statuses and, if needed, correct them. You can run it in cron so you don't have to worry about it.