How should I be storing expiring stats? - ruby-on-rails

Let's say I have 2 models in my app: User and Survey
I'm trying to plot the number of paid surveys over time. A paid survey is one that has been created by a user that has an active subscription. For simplicity, let's assume the User model has subscription_start_date and subscription_end_date.
So a survey becomes "paid" the moment it is created (provided the user has an active subscription) and loses its "paid" status when the subscription_end_date has passed. Essentially, the "paid survey" is really a state with a defined start and end date.
I can generate the data fine. What I'm curious about is what's the most recommended way of storing this kind of stats? What should that table look like basically.
Another thing I'm concerned about is whether there are any disadvantages of having a daily task that adds the data point for the past day.
For more context, this app is written in Rails and we're thinking of using this stat architecture for other models too.

If I am understanding you correctly, I do not think you need an additional model or daily task to generate data points. To generate your report you just need to come up with the right SQL/ActiveRecord query. When you aggregate the information, be careful not to introduce nested queries. For simplicity's sake we could pull all the information you need using:
surveys = Survey.all.includes(:user)
Based on your description, an instance of survey has a start date that is just created_at.to_date. And since Survey belongs_to :user, it's end date is user.subscription_end_date.
When plotting the information you may need to transform surveys into some data structure that groups the information by date. Alternatively you could probably achieve that with a more complex SQL statement.
You could of course introduce a new table that stores the data points by date to avoid a complex query or data aggregation via ruby. The downside of this is that you are storing redundant information and assume the burden of maintaining data integrity. That doesn't mean you shouldn't do it because there may be an upside in regards to performance and reporting convenience.
I would need more information about your project before saying exactly what I would do, but it sounds like you already have the information you need in your database and it's just a matter querying it properly.

Related

Logging data changes for synchronization

I am looking for solution of logging data changes for public API.
There is a need to tell client app which tables form db has changed and need to be synchronised since the app synchronised last time and also need to be for specific brand and country.
Current Solution:
Version table with class_names of models which is touched from every model on create, delete, touch and save action.
When we are touching version for specific model we also look at the reflected associations and touch them too.
Version model is scoped to brand and country
REST API is responding to a request that includes last_sync_at:timestamp, brand and country
Rails look at Version with given attributes and return class_names of models which were changed since lans_sync_at timestamp.
This solution works but the problem is performance and is also hard to maintenance.
UPDATE 1:
Maybe the simple question is.
What is the best practice how to find out and tell frontend apps when and what needs to be synchronized. In terms of whole concept.
Conditions:
Front end apps needs to download only their own content changes not whole dataset.
Does not invoked synchronization when application from different country or brand needs to be synchronized.
Thank you.
I think that the best solution would be to use redis (or some other key-value store) and save your information there. Writing to redis is much faster than any sql db. You can write some service class that would save the data like:
RegisterTableUpdate.set(table_name, country_id, brand_id, timestamp)
Such call would save given timestamp under key that could look like i.e. table-update-1-1-users, where first number is country id, second number is brand id, followed by table name (or you could use country and brand names if needed). If you would like to find out which tables have changed you would just need to find redis keys with query "table-update-1-1-*", iterate through them and check which are newer than timestamp sent through api.
It is worth to rmember that redis is not as reliable as sql databases. Its reliability depends on configuration so you might want to read redis guidelines and decide if you would like to go for it.
You can take advantage of the fact that ActiveModel automatically logs every time it updates a table row (the 'Updated at' column)
When checking what needs to be updated, select the objects you are interested in and compare their 'Updated at' with the timestamp from the client app
The advantage of this approach is that you don't need to keep an additional table that lists all the updates on models, which should speed things up for the API users and be easier to maintain.
The disadvantage is that you cannot see the changes in data over time, you only know that a change occurred and you can access the latest version. If you need to track changes in data over time efficiently, than I'm afraid you'll have to rework things from the top.
(read last part - this is what you are interested in)
I would recommend that you use the decorator design pattern for changing the client queries. So the client sends a query of what he wants and the server decides what to give him based on the client's last update.
so:
the client sends a query that includes the time it last synched
the server sees the query and takes into account the client's nature (device-country)
the server decorates (changes accordingly) the query to request from the DB only the relevant data, and if that is not possible:
after the data are returned from the database manager they are trimmed to be relevant to where they are going
returns to the client all the new stuff that the client cares about.
I assume that you have a time entered field on your DB entries.
In that case the "decoration" of the query (abstractly) would be just to add something like a "WHERE" clause in your query and state you want data entered after the last update.
Finally, if you want that to be done for many devices/locales/whatever implement a decorator for the query and the result of the query and serve them to your clients as they should be served. (Keep in mind that in contrast with a subclassing approach you will only have to implement one decorator for each device/locale/whatever - not for all combinations!
Hope this helped!

What are CourseCompletions and when are they created?

I see the entries in the API documentation for getting "CourseCompletion" objects. But do not see how these are entered in the Learning Environment. Can you explain what these objects are?
CourseCompletion records are essentially meta-data type notes that you can attach to a user/course-offering combination to make a record of a user having "completed" a course on such-and-such a date. The course completion record can also carry an expiry date for when the "completion" becomes out of date or no longer relevant. These features are not heavily used by D2L customers, and are not exposed through the Web UI.
I don't believe there is any automation within the back-end service around the creation or modification of these records (for example, there isn't an event in the system when a course completion record would get created: a client would need to manually create such a record when it wants one to exist).

Rails: Multiple databases, same schema

I'm in the middle of a fictional scenario project where I have allowed multiple users for a company to log in, create records, and so on, who all connect to the one database. They can all records absence records, attendance records, and so on.
What I want to do however, is use this same schema but expands this to allow several companies to have their own databases using the same schema. So each company will have their own data, but all companies use the same data model. In other words all company's can create absence records, but they each only have access to their own absence records that they created themselves.
How can I achieve this?
All I need is two or three files for this, I'm not going commercial with it in case you guys think I'm cutting corners at someone else's expense!
Something as simple as an if-else that decides which file to use would be very useful to me, so if such a line of code exists please let me know.
I think you are doing it wrong (unless you have a really good reason to have a database for each company), because it seems like you are repeating your data model over and over while introducing unnecessary complexity to your code.
Try to have all the companies in one DB/tables with having separated by the company_id.
Ex: data structure would be as follows
companies table
id
name
users table
id
user_name
company_id
However if you really want to connect to multiple databases, check this SO question.

Should votes be it's own model in RoR app?

Let's say you want to create a Digg.com-like site. Should the votes be its own separate model, or should the votes be a field in the table for the model of the object that is voted on?
It depends on how much information you want to store. If you just have a reference to something and a total score, then you don't need a model. If you have want to store who voted, how many up/down votes were received, timestamp when votes were received, and be able to rollback votes from unruly sources, then you'll need to keep each of those votes as their own model. Personally, I'd make each vote its own record, if I were designing such a system.
Considering Digg.com-like site requirement, I would say - own model. Greatly due to the need of so called "voting rings" detection - detecting groups of fake voters.
Other than that - I would go with fields. MySQL for example can update rows atomically (so they say, never tried it myself), what is supposed to be quite efficient. More information on the MySQL docs.
It depends if you want to keep related vote information or not. This has nothing to do with RoR but with database normalization.
If you want to keep extra information with the votes, like maybe the date it was recorded you should keep it in another table (and as such it will be another model). If not you can store it in the other objects table.

Store data in Ruby on Rails without Database

I have a few data values that I need to store on my rails app and wanted to know if there are any alternatives to creating a database table just to do this simple task.
Background: I'm writing some analytics and dashboard tools for my ruby on rails app and i'm hoping to speed up the dashboard by caching results that will never change. Right now I pull all users for the last 30 days, and re-arrange them so I can see the number of new users per day. It works great but takes quite a long time, in reality I should only need to calculate the most recent day and just store the rest of the array somewhere else.
Where is the best way to store this array?
Creating a database table seems a bit overkill, and I'm not sure that global variables are the correct answer. Is there a best practice for persisting data like this?
If anyone has done anything like this before let me know what you did and how it turned out.
Ruby has a built-in Hash-based key value store named PStore. This provides simple file based, transactional persistance.
PStore documentation
If you've got a database already, it's really not a big deal to create a separate table for tracking this sort of thing. When doing reporting, it's often to your advantage to create derivative summary tables exactly like what you're describing. You can update these as required using a simple SQL statement and there's no worry that your temporary store will somehow go away.
That being said, the type of report you're trying to generate is actually something that can be done in real-time except on extravagantly large data sets. The key is to have indexes that describe the exact grouping operation you're trying to do. For instance, if you're grouping by calendar date, you can create a "date" field and sync it to the "created_at" time as required. An index on this date field will make doing a GROUP BY created_date very quick:
SELECT created_date AS on_date, COUNT(id) AS new_users FROM users GROUP BY created_date
Using a lightweight database like sqlite shouldn't feel like an overkill. Alternatively, you can use key-store solutions like tokyo cabinet or even store the array in a flat file manually but I really don't see any overkill in using sqlite.

Resources