Storing "likes" of an item in a database? - ruby-on-rails

I'm running a Rails app on Postgres through Heroku.
I'd like to implement something similar to Facebook "likes" on my site for various items, such as user comments. What's the smartest way to store these in my database so that reads stay efficient and fast?
The obvious one is just to have a like join table between users and items, something like this:
user_id int
item_id int
item_type string
created_at datetime
However, when displaying items, this means that every time I pull an item I would have to join against the entire likes table, which could get very big.
The obvious response for this would be to store a counter in the items, for their ongoing like count. However, this won't work because who liked an item matters, both for display next to the item, and also to hide the like button for things a user has already liked.
My plan is to add to all likable items a text field in which I would store a serialized array. That way, every pull of an item would come with the complete list of who liked it. Is there a better way to do this, or is this the recommended approach?

Do you have reason to believe that your dataset is going to be so large that the join is going to be too expensive? Postgres, while not as fast as the fastest RDBMSs out there, is pretty fast these days. I used to run a website that got millions of pageviews a day, and required some pretty complicated queries to generate each page. By doing a bit of simple caching we were able to run it on very modest hardware.
You give up a lot of the benefits of using an RDBMS when you denormalize. I would only do so if I knew I had to. And if that were the case I would consider using something else, like a simple key/value database, for that data. But I think that's only likely to be the case for you if you have an awful lot of data.
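A minimal sketch of the join-table approach this answer favors, using a polymorphic Like model; the model and column names below are assumptions for illustration, not taken from the question:

class CreateLikes < ActiveRecord::Migration
  def change
    create_table :likes do |t|
      t.integer :user_id,       null: false
      t.integer :likeable_id,   null: false
      t.string  :likeable_type, null: false
      t.timestamps
    end
    # One like per user per item, plus a fast lookup when rendering an item
    add_index :likes, [:user_id, :likeable_id, :likeable_type], unique: true
    add_index :likes, [:likeable_id, :likeable_type]
  end
end

class Like < ActiveRecord::Base
  belongs_to :user
  belongs_to :likeable, polymorphic: true
end

class Comment < ActiveRecord::Base
  has_many :likes, as: :likeable
  has_many :likers, through: :likes, source: :user
end

# Rendering: eager-load the likers so a page needs one extra query, not one per item,
# then answer "has the current user liked this?" in memory.
comments = Comment.includes(:likers).limit(20)
comments.each do |comment|
  already_liked = comment.likers.include?(current_user)
end

With the composite indexes above, the join stays cheap as the likes table grows, which is the point about not denormalizing prematurely.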

Related

What is a good approach to allow users to subscribe to keywords

I have a rails application using Rails 4, PostgreSQL and hosted on Heroku.
The application revolves around the following models: User and Article.
A user can create articles. An article contains a title, description, location (latitude, longitude) and an image.
I would like to add a notification system that works as follows:
A user can set up a list of keywords that they wish to subscribe to.
The user gets a notification if an article containing one of their keywords is added (in the title, but perhaps in description in time).
What is the best approach to implement this in a scalable way?
In its simplest form, I could create a model called Keyword that stores what keywords a user wants to be notified for.
Then in the create action for article, check to see if the title (or description) contains any of the saved keywords.
This sounds good but will probably fall over once any reasonable number of users is added.
Obviously, a background task would do the trick but it still sounds wrong to do a basic string contains directly on the database.
Perhaps I could tokenize the title and description into an index and use a background process to handle the heavy lifting? I heard Postgres has some built in text search - could this work?
Could I use a Heroku add-on like Solr or Redis to handle all this or is it overkill? (Not having to pay for an add-on is an advantage).
Perhaps someone has a better implementation for the same functionality.
I know I can implement it quickly; I just want to be sure the implementation is up to scratch.
Thanks,
Brian
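A minimal sketch of the "simplest form" the question describes, with a Keyword model and a check in the article's create callback; the Keyword and Notification names and their columns are assumptions for illustration:

class Keyword < ActiveRecord::Base
  belongs_to :user   # assumed columns: user_id, word
end

class Article < ActiveRecord::Base
  after_create :notify_subscribers

  private

  # Naive scan: compares the lowercased title against every saved keyword.
  # Fine as a first pass, but as the question notes it should move into a
  # background job (and a smarter matching scheme) once there are many users.
  def notify_subscribers
    haystack = title.downcase
    Keyword.includes(:user).find_each do |keyword|
      next unless haystack.include?(keyword.word.downcase)
      # Delivery (email, in-app notification, etc.) is left out here
      Notification.create!(user: keyword.user, article: self)
    end
  end
end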
I have faced a similar problem. The slowest thing is to do a case-insensitive search. What I would suggest is the following approach: let TID be the id of the row in which you store the title; then create a table that has one row for every word in your title, in lowercase, with the corresponding TID. Then what you need is a join between that word table and the keywords of the given user. You can speed up this query with hash indexes.
In my case, none of the Postgres text functions was usable because they all have poor performance.
PS: we implemented a full-text search over about 60,000 documents, so your case might be a bit different.
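A rough sketch of that word-table approach in Rails terms, assuming User has_many :keywords; the TitleWord model, its columns, and the final query are assumptions layered on the answer's description:

class TitleWord < ActiveRecord::Base
  belongs_to :article   # assumed columns: article_id, word
end

class Article < ActiveRecord::Base
  has_many :title_words
  after_create :index_title_words

  # Store one lowercase row per distinct word in the title
  def index_title_words
    title.downcase.scan(/\w+/).uniq.each do |word|
      title_words.create!(word: word)
    end
  end
end

# After creating an article, find subscribers with a single join against their keywords
matching_users = User.joins(:keywords)
                     .where(keywords: { word: article.title_words.pluck(:word) })
                     .distinct

An index on keywords.word (hash or btree, since this is a pure equality lookup) keeps the join fast as both tables grow.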

What is the best way to store a user's Facebook friends list in my database?

Overview
I'm creating a Ruby on Rails website which uses Facebook to login.
For each user I have a database entry which stores their Facebook User ID along with other basic information.
I'm also using the Koala gem in order to retrieve a user's friend list from Facebook, but I'm unsure as to how I should store this data...
Option 1
I could store the user's friends as a serialized hash in the User table; then if I wanted to display a list of all the current user's friends, I could grab this hash and do something along the lines of SELECT * FROM users WHERE facebook_user_id IN (ids from the hash).
Each time the user logs in I could update this field to store the latest friends list.
Option 2
I could create a Friend table and store friendship information in here, where a User has many Friends. So there would be a row for each friendship, (User1 and User2 columns). Then to display a list of the current user's friends I could do something like SELECT User2 FROM Friends WHERE User1 = current_user
This seems like the better option to me, but...
It has the disadvantage that there would be many rows... If there were 100,000 users, each with 100 friends, that's now 10,000,000 rows in the Friends table.
It also means that each time the user logs in, I'd need to loop over their Facebook friend list returned by Koala and create a Friend record whenever someone on the list is in my User table and there isn't already a corresponding entry in the Friends table. This seems like it'd be slow if a user has 1000 Facebook friends.
I'd appreciate any guidance on how it would be best to achieve this.
Apologies for the badly worded question, I'll try and reword/organise it shortly.
Thanks for any help in advance.
If you need to store a lot of data, then you need to store a lot of data. If you are like most, you probably won't run into that problem before you have the cash to solve it. In other words, you are probably assuming you'll have more traffic and data than you'll actually get, at least in the short term. So I doubt this is an issue, even though it is a good sign that you are thinking about it now rather than later.
As I mentioned in my comment below, the easiest solution is to have a tie table with a row for each side of the friend relationship (a has_many :friends, through: :facebook_friend_relationships, class_name: 'FacebookUser' on FacebookUser, per the design mentioned below). But your question seemed to be about how to reduce the number of records, so that is what the remainder of the answer will address.
If you have to store in the DB and you know for sure that you will absolutely have every FB user on the planet hitting your site because it is so awesome, but they won't all hit at once, then if you are limited in storage, you may want to use a LRU algorithm (remove the least recently used records) possibly with timed expiration also. You could just have a cron job that does a query on the DB then deletes old/unused records to do this. Wouldn't be perfect, but it would be a simple solution.
You could also archive older data rather than throw it away. So, frequently used data could stay in the table of active users, and then you might offload older data to another table or even another database (and you might see the apartment and second_base gems for that). However, once you get to the size, you're probably looking at a number of other architectural solutions that have much less to do with ActiveRecord models/associations or schema design. Though it pays to plan ahead, I wouldn't worry about that excessively until you are sure that the application will get enough users to invest the time in that.
Even though ActiveRecord has some caching, you could just avoid the DB and cache friends in memory yourself in the beginning for speed, especially if you don't yet have many users, which you probably don't. If you think you'll run out of memory because of the high number of users, LRU might be a good option here also, and lru_redux looks interesting. Again, you might want to time the cache so it expires and re-fetches friends when it runs out. Even just storing the results in the user session may be adequate, i.e. in the controller action method, just do @friends ||= Something.find_friends(fb_user_id), and the latter is what most would do as a first shot while getting started.
If you use ActiveRecord, in your query in the controller (or via the :include option on the association in the model), consider eager loading with includes to avoid N+1 queries. That will speed things up.
For the schema design, maybe:
User - users table with email and authN info. Look at the Devise gem.
FacebookUser - info about the Facebook user.
FacebookFriendRelationship - a tie model with (id and) two columns, one for one FacebookUser id and one for the other.
By separating the authN info (User) from the FB data (FacebookUser and FacebookFriendRelationship), you make it easier to have other social media accounts, etc. each with information specific to those accounts in other tables.
The complexity comes in FacebookUser's relationship with friends if the goal is to minimize rows in the relationship table. To halve the number of rows, you'd have a single row per relationship, where the id of either FacebookUser could be in either foreign key column. Either the user has a friend or is a friend, so you could have two has_many :through associations on FacebookUser that each use a different foreign key in FacebookFriendRelationship. Or you could do HABTM without the tie model and use the foreign_key and association_foreign_key options in each association. Either way, you could add a method that combines the two associations (because they are arrays). Instead, you could use custom SQL in a single has_many, if you didn't mind losing the normal ActiveRecord way of removing associations. However, per your comments, I think you want to avoid this complexity, and I agree with you, unless you really must limit the number of relationship rows. And it isn't the number of tie-table rows that will eat the storage; it is going to be all of the user info you keep in the FacebookUser table.
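A hedged sketch of that schema, with the tie table and the two has_many :through associations combined by a helper method; the column names (facebook_user_id, friend_id) are assumptions, and the authN side (User, e.g. via Devise) is omitted:

class FacebookUser < ActiveRecord::Base
  belongs_to :user   # the User record holding email and authN info

  # This record can sit on either side of a relationship row
  has_many :friendships,         class_name: 'FacebookFriendRelationship',
                                 foreign_key: :facebook_user_id
  has_many :inverse_friendships, class_name: 'FacebookFriendRelationship',
                                 foreign_key: :friend_id

  has_many :friends,         through: :friendships,         source: :friend
  has_many :inverse_friends, through: :inverse_friendships, source: :facebook_user

  # One row per relationship, so reads combine both directions
  def all_friends
    friends + inverse_friends
  end
end

class FacebookFriendRelationship < ActiveRecord::Base
  belongs_to :facebook_user, class_name: 'FacebookUser'
  belongs_to :friend,        class_name: 'FacebookUser'
end

Eager loading with FacebookUser.includes(:friends, :inverse_friends) avoids the N+1 queries mentioned above; if the two-direction bookkeeping feels like too much, the simpler two-rows-per-friendship design from the start of the answer needs only one association.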

Rails: Multiple databases, same schema

I'm in the middle of a fictional scenario project where I have allowed multiple users for a company to log in, create records, and so on, all connecting to the one database. They can all record absence records, attendance records, and so on.
What I want to do, however, is use this same schema but expand it to allow several companies to have their own databases using the same schema. So each company would have its own data, but all companies would use the same data model. In other words, all companies can create absence records, but each one only has access to the absence records it created itself.
How can I achieve this?
All I need is two or three files for this, I'm not going commercial with it in case you guys think I'm cutting corners at someone else's expense!
Something as simple as an if-else that decides which file to use would be very useful to me, so if such a line of code exists please let me know.
I think you are doing it wrong (unless you have a really good reason to have a database for each company), because it seems like you are repeating your data model over and over while introducing unnecessary complexity to your code.
Try to have all the companies in one DB/set of tables, with records separated by company_id.
E.g., the data structure would be as follows:
companies table
id
name
users table
id
user_name
company_id
However, if you really want to connect to multiple databases, check this SO question.
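A minimal sketch of the single-database approach, scoping everything through the signed-in user's company; the model, controller, and helper names here (including current_user and absence_record_params) are assumptions for illustration:

class Company < ActiveRecord::Base
  has_many :users
  has_many :absence_records
end

class User < ActiveRecord::Base
  belongs_to :company
end

class AbsenceRecord < ActiveRecord::Base
  belongs_to :company
end

class AbsenceRecordsController < ApplicationController
  # Reads and writes always go through current_user.company, so one
  # company never sees or touches another company's records.
  def index
    @absence_records = current_user.company.absence_records
  end

  def create
    @absence_record = current_user.company.absence_records.build(absence_record_params)
    @absence_record.save
    # redirect/render handling omitted
  end
end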

Populating dropdownlists for mult-tenant applications

I am building an MVC 3 application that will be multi-tenant, which means it will use the same basic data structure, but provide different data depending on the domain name used to access it.
A problem I am trying to solve is this. How best do I populate a number of dropdown lists with selection choices based on the site being rendered. To add another wrinkle, I will need to localize the strings as well.
An obvious choice is to simply create a table with columns for website id and language id, plus field id and string value. This seems ok, but also seems to ignore possible mechanisms that are already in place for localization. I feel like i'm recreating the wheel here.
As an example, site 1 might have a dropdown list for Favorite Activities and have items that are geared toward musical interests. Site 2 might have the same dropdown, but with items geared toward sports interests.
So my question is, how would you go about solving this problem? Also, in a similar vein: if you have selection lists, say state codes, cities, etc., would you tend to create separate tables to populate this data (a states table, a cities table, etc.) or would you put all this information in a common table with an ID to indicate which dropdown it is to be used for? The former seems more normalized, but the latter seems more efficient (less code to write).
Thoughts about common lookup tables: this guy is definitely against them.
http://www.projectdmx.com/dbdesign/lookup.aspx
I have used it and believe that I have saved some time, or at least some keystrokes. Might be sorry later on.

Store data in Ruby on Rails without Database

I have a few data values that I need to store on my rails app and wanted to know if there are any alternatives to creating a database table just to do this simple task.
Background: I'm writing some analytics and dashboard tools for my Ruby on Rails app and I'm hoping to speed up the dashboard by caching results that will never change. Right now I pull all users for the last 30 days and re-arrange them so I can see the number of new users per day. It works great but takes quite a long time; in reality I should only need to calculate the most recent day and store the rest of the array somewhere else.
Where is the best place to store this array?
Creating a database table seems a bit overkill, and I'm not sure that global variables are the correct answer. Is there a best practice for persisting data like this?
If anyone has done anything like this before let me know what you did and how it turned out.
Ruby has a built-in Hash-based key/value store named PStore. This provides simple, file-based, transactional persistence.
PStore documentation
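A short sketch of using PStore for the cached per-day counts; the file path, key, and sample values are assumptions for illustration:

require 'pstore'

# PStore persists plain Ruby objects to a file, inside a transaction
store = PStore.new(Rails.root.join('tmp', 'daily_signups.pstore').to_s)

# Write: cache the counts for days that will never change
store.transaction do
  store[:new_users_per_day] = { '2014-01-01' => 12, '2014-01-02' => 9 }
end

# Read: passing true opens a read-only transaction
cached_counts = store.transaction(true) { store[:new_users_per_day] }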
If you've got a database already, it's really not a big deal to create a separate table for tracking this sort of thing. When doing reporting, it's often to your advantage to create derivative summary tables exactly like what you're describing. You can update these as required using a simple SQL statement and there's no worry that your temporary store will somehow go away.
That being said, the type of report you're trying to generate is actually something that can be done in real-time except on extravagantly large data sets. The key is to have indexes that describe the exact grouping operation you're trying to do. For instance, if you're grouping by calendar date, you can create a "date" field and sync it to the "created_at" time as required. An index on this date field will make doing a GROUP BY created_date very quick:
SELECT created_date AS on_date, COUNT(id) AS new_users FROM users GROUP BY created_date
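The ActiveRecord equivalent is roughly the following (grouping on a date expression here; grouping on a dedicated, indexed created_date column as suggested above would be faster still):

# Returns a hash like { date => count } for the last 30 days
new_users_per_day = User.where('created_at >= ?', 30.days.ago)
                        .group('DATE(created_at)')
                        .count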
Using a lightweight database like SQLite shouldn't feel like overkill. Alternatively, you can use key/value stores like Tokyo Cabinet, or even store the array in a flat file manually, but I really don't see any overkill in using SQLite.
