strategy for writing format of object to mysql table - ruby-on-rails

Rails newbie here so just want to be sure I'm doing this right. I have a couple of complex relationships and would like to cache those relationships locally in the mysql row. Think a feature like facebook likes. I have this information currently in a mysql table. I was going to put in a "liked_ids" column that is text and stored as json. And then have an accessor like:
def likes
  JSON.parse(liked_ids)
end
I have seen some people mention storing as YAML instead but there's a ton of json already being used.
And then when someone submits to the main table (say likes_users), we just add a callback that updates this json field.
Are there alternatives or does this seem like a reasonable idea?
thx

Why not use a table and separate model?
User likes Post
users_posts ---> user_id, post_id
The User class has_and_belongs_to_many :likes (with the class name overridden to Post).
I wouldn't store relationship data as a field when you are using a relational database.

You can use the serialize method to define a serialization column.
The serialization is done through YAML.
class User < ActiveRecord::Base
  serialize :liked_ids, Array
end
user = User.create(:liked_ids => [1, 2, 3])
User.find(user.id).liked_ids # => [1, 2, 3]
All that's left is to keep the column in sync via callbacks.
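As a rough illustration of what ends up in the column, here is a plain-Ruby sketch that round-trips an array through YAML, the format this answer says serialize uses. It only uses the standard library and does not touch ActiveRecord; the helper names dump_for_column and load_from_column are made up for the example.

```ruby
require 'yaml'

# Roughly what would be written to the text column for a given value.
# (Hypothetical helper, not part of Rails.)
def dump_for_column(value)
  YAML.dump(value)
end

# Roughly what would be handed back when the attribute is read.
# (Hypothetical helper, not part of Rails.)
def load_from_column(text)
  YAML.safe_load(text)
end

stored = dump_for_column([1, 2, 3])
puts stored
puts load_from_column(stored).inspect
```

The stored value is just text as far as MySQL is concerned, which is why the database cannot look inside it.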

Related

Rails Data Modelling

In my company, we are trying to cache some data that we are querying from an API. We are using Rails. Two of my models are 'Query' and 'Response'. I want to create a one-to-many relationship between Query and Response, wherein, one query can have many responses.
I thought this is the right way to do it.
Query = [query]
Response = [query_id, response_detail_1, response_detail_2]
Then, in the Models, I did the following Data Associations:
class Query < ActiveRecord::Base
  has_many :responses
end
class Response < ActiveRecord::Base
  belongs_to :query
end
So, canonically, whenever I want to find all the responses for a given query, I would do:
query = Query.where(:query => "given query").first
Response.where(:query_id => query.id)
But my boss made me use an Array column in the Query model, remove the Data Associations between the models and put the id of each response record in that array column in the Query model. So, now the Query model looks like
Query = [query_id, [response_id_1, response_id_2, response_id_3,...]]
I just want to know what are the merits and demerits of doing it both ways and which is the right way to do it.
If the relationship is really a one-to-many relationship, the "standard" approach is what you originally suggested, or using a junction table. You're losing out on referential integrity that you could get with a FK by using the array. Postgres almost had FK constraints on array columns, but from what I researched it looks like it's not currently in the roadmap:
http://blog.2ndquadrant.com/postgresql-9-3-development-array-element-foreign-keys/
You might get some performance advantages out of the array approach if you consider it like a denormalization/caching assist. See this answer for some info on that, but it still recommends using a junction table:
https://stackoverflow.com/a/17012344/4280232
This answer and the comments also offer some thoughts on the array performance vs the join performance:
https://stackoverflow.com/a/13840557/4280232
Another advantage of using the array is that arrays will preserve order, so if order is important you could get some benefits there:
https://stackoverflow.com/a/2489805/4280232
But even then, you could put the order directly on the responses table (assuming they're unique to each query) or you could put it on a join table.
So, in sum, you might get some performance advantages out of the array foreign keys, and they might help with ordering, but you won't be able to enforce FK constraints on them (as of the time of this writing). Unless there's a special situation going on here, it's probably better to stick with the "FK column on the child table" approach, as that is considerably more common.
Granted, that all applies mainly to SQL databases, which I notice now you didn't specify in your question. If you're using NoSQL there may be other conventions for this.
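To make the referential-integrity point a bit more concrete, here is a small plain-Ruby sketch. The hashes are hypothetical in-memory stand-ins for the two tables (no database is involved), and the helper names are made up for illustration.

```ruby
# Ids listed in the parent's array that no longer match any child row.
def dangling_ids(query, responses)
  query[:response_ids].reject { |id| responses.key?(id) }
end

# Ids of child rows whose query_id column points at the parent.
def ids_via_fk(query, responses)
  responses.select { |_id, row| row[:query_id] == query[:id] }.keys
end

# Hypothetical stand-ins for the responses table and one query row.
responses = {
  1 => { query_id: 10, detail: 'a' },
  2 => { query_id: 10, detail: 'b' }
}
query = { id: 10, response_ids: [1, 2] }

# Delete one response without touching the array on the parent.
responses.delete(2)

puts "dangling ids in the array: #{dangling_ids(query, responses).inspect}"
puts "ids found via query_id:    #{ids_via_fk(query, responses).inspect}"
```

With the array layout, the stale id stays on the parent until application code cleans it up; with the FK column, the lookup only ever sees rows that still exist.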

Ruby on Rails: Saving multiple values in a single database cell

How do I save multiple values in a single cell record in Ruby on Rails applications?
If I have a table named Exp with columns named: Education, Experience, and Skill, what is the best practice if I want users to store multiple values such as: education institutions or skills in a single row?
I'd like to have users use multiple text fields, but should go into same cell record.
For instance if user has multiple skills, those skills should be in one cell? Would this be best or would it be better if I created a new table for just skills?
Please advise,
Thanks
I would not recommend storing multiple values in the same database column. It would make querying very difficult. For example, if you wanted to look for all the users with a particular skill set, the query would be clumsy in terms of both readability and performance.
However, there are still certain cases where it makes sense:
When you want to allow for a variable list of data points
When you are not going to query the data based on one of the values in the list
ActiveRecord has built-in support for this. You can store Hash or Array in a database column.
Just mark the columns as Text
rails g model Exp experience:text education:text skill:text
Next, serialize the columns in your Model code
class Exp < ActiveRecord::Base
  # serialize takes one attribute per call (the second argument is the class)
  serialize :experience
  serialize :education
  serialize :skill
  # other model code
end
Now, you can just save the Hash or Array in the database field!
Exp.new(:skill => ['Cooking', 'Singing', 'Dancing'])
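To see why querying on one of the values gets clumsy, here is a plain-Ruby sketch of what a lookup by skill has to do when the column only holds serialized text. ROWS is a hypothetical stand-in for the table and ids_with_skill is a made-up helper; YAML is assumed as the storage format.

```ruby
require 'yaml'

# Hypothetical rows as they might sit in the table: skill holds YAML text.
ROWS = [
  { id: 1, skill: YAML.dump(%w[Cooking Singing]) },
  { id: 2, skill: YAML.dump(%w[Dancing]) },
  { id: 3, skill: YAML.dump(%w[Singing Dancing]) }
].freeze

# The database only sees text, so every row has to be loaded and parsed
# in application code before it can be filtered.
def ids_with_skill(rows, wanted)
  rows.select { |row| YAML.safe_load(row[:skill]).include?(wanted) }
      .map { |row| row[:id] }
end

puts ids_with_skill(ROWS, 'Singing').inspect
```

With a separate skills table the same lookup would be an ordinary indexed query instead of a full scan plus parsing.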
You can do it using a serialized (comma-separated) list in a single column, but it is a really bad idea. Read these answers for the reasoning:
Is storing a delimited list in a database column really that bad?
How to store a list in a column of a database table
I suggest changing your schema to have a one to many relationship between users and skills.
Rails 4 with PostgreSQL comes with hstore support out of the box. In Rails 3 you can use a gem to enable it.
It depends on what kind of functionality you want. If you want to bind the Exp model attributes to a form (for new and update operations) and put some validations on them, it is always better to keep them in a separate table. On the other hand, if these are just attributes that you only need to keep in the database, store them in a single column. There is a way to keep serialized objects like arrays and hashes in database columns. Make them an array or hash as per your need and save them like this:
http://api.rubyonrails.org/classes/ActiveRecord/AttributeMethods/Serialization/ClassMethods.html#method-i-serialize
Serialized attributes are automatically deserialized when they are read from the table and automatically serialized when saved.

Loading all the data but not from all the tables

I watched this rails cast http://railscasts.com/episodes/22-eager-loading but still I have some confusions about what is the best way of writing an efficient GET REST service for a scenario like this:
Let's say we have an Organization table and there are about twenty other tables related to it through belongs_to and has_many (so all those tables have an organization_id field).
Now I want to write GET and INDEX requests as a Rails REST service that, based on the organization id passed in the URL, reads those tables and fills the JSON, BUT NOT for ALL of those tables, only for a few of them. For example, let's say for the Patients, Orders and Visits tables, not all twenty.
So I still have trouble getting my head around how to write such a .find( :all ) sort of query.
Can someone show some example so I can understand how to do this sort of queries?
You can eager-load those associations in a single call:
@organization = Organization.includes(:patients, :orders, :visits).find(1)
Now when you do something like:
@organization.patients
the patients are already in memory, since they were fetched up front along with the organization. Without includes, @organization.patients would trigger another database query. This is why it's called "eager loading": you are loading the patients of the organization before you actually reference them (eagerly), because you know you will need that data later.
You can use includes anytime, whether using all or not. Personally I find it to be more explicit and clear when I chain the includes method onto the model, instead of including it as some sort of hash option (as in the Railscast episode).

Rails ActiveRecord - Uniqueness and Lookup on Array Attribute

Good morning,
I have a Rails model in which I’m currently serializing an array of information. Two things are important to me:
I want to be able to ensure that this is unique (i.e. can’t have two models with the same array)
I want to be able to search existing models for this hash (in a type of find_or_create_by method).
This model describes a “portfolio” – i.e. a group of stock or bonds. The array is the description of what securities are inside the portfolio, and in what weights. I also have a second model, which is a group of portfolios (lets call it a “Portcollection” to keep things simple). A collection has many portfolios, and a portfolio can be in many collections. In other words:
class Portfolio < ActiveRecord::Base
  serialize :weights
  has_and_belongs_to_many :portcollections
end
class Portcollection < ActiveRecord::Base
  has_and_belongs_to_many :portfolios
end
When I am generating a “portcollection” I need to build a bunch of portfolios, which I do programmatically (implementation not important). Building a portfolio is an expensive operation, so I’m trying to check for the existence of one first. I thought I could do this via find_or_create_by, but wasn’t having much luck. This is my current solution:
class Portcollection < ActiveRecord::Base
  before_save :build_portfolios
  def build_portfolios
    ……
    proposed_weights = ……
    yml = proposed_weights.to_yaml
    if port = Portfolio.find_by_weights(yml)
      self.portfolios << port
    else
      self.portfolios << Portfolio.create!(:weights => proposed_weights)
    end
    ……..
  end
end
This does work, but it is quite slow. I have a feeling this is because I’m converting stuff to YAML each time it runs when I try to check for an existing portfolio (this is running probably millions of times), and I’m searching for a string, as opposed to an integer. I do have an index on this column though.
Is there a better way to do this? A few thoughts had crossed my mind:
Calculate an MD5 hash of the “weights” array, and save to a database column. I’ll still have to calculate this hash each time I want to search for an array, but I have a gut feeling this would be easier for the database to index & search?
Work on moving from has_and_belongs_to_many to has_many :through, and store the array information as database columns. That way I could try to sort out a database query that could check for the uniqueness, without any YAML or serialization…
i.e. something like:
class Portfolio < ActiveRecord::Base
  has_many :portcollections, :through => :security_weights
end
class Portcollection < ActiveRecord::Base
  has_many :portfolios, :through => :security_weights
end
SECURITY_WEIGHTS
id  portfolio_id  portcollection_id  weight_of_GOOG  weight_of_APPLE  ……
1   14            15                 0.4             0.3
In case it is important, the “weights” array would look like this:
[ ['GOOG', 0.4] , ['AAPL', 0.3] , ['GE', 0.3] ]
Any help would be appreciated. Please keep in mind I'm quite an amateur - programming is just a hobby for me! Please excuse me if I'm doing anything really hacky or missing something obvious....
Thanks!
UPDATE 1
I've done some research into the Rails 3.2 "store" method, but that doesn't seem to be the answer either... It just stores objects as JSON, which gives me the same lack of searchability I have now.
I think storing a separate hash in its own column is the only way to do this efficiently. You are using serialization or a key/value store that is designed to not be easily searchable.
Just make sure you consider sorting your values before hashing them, otherwise you could have the same content but differing hashes.
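A minimal sketch of the sort-then-hash idea, using only Ruby's standard library. The method name weights_digest and the "ticker:weight" string layout are assumptions made for the example, not something from the question; any stable canonical form would do.

```ruby
require 'digest/md5'

# Digest of a weights array that does not depend on element order.
# Sorting by ticker first means the same holdings always produce the
# same canonical string, and therefore the same MD5 digest.
def weights_digest(weights)
  canonical = weights.sort_by { |ticker, _weight| ticker }
                     .map { |ticker, weight| "#{ticker}:#{weight}" }
                     .join('|')
  Digest::MD5.hexdigest(canonical)
end

a = [['GOOG', 0.4], ['AAPL', 0.3], ['GE', 0.3]]
b = [['GE', 0.3], ['GOOG', 0.4], ['AAPL', 0.3]]

# Same holdings in a different order give the same digest.
puts weights_digest(a) == weights_digest(b)
```

The digest could then live in its own indexed string column with a uniqueness constraint, so the lookup compares a short fixed-length value instead of a YAML blob. Note that float formatting matters here: 0.3 and 0.30000000000000004 would produce different digests.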

newbie: append serialized integers into database column and retrieve them back

How could I store integers (user ids ranging from 1 to 9999) serialized in a database column and retrieve them back?
In my User model I have an invites column:
User model
serialize :invites
invites = text field
Now I'm trying to do 2 things:
Append the user_id integer (from 1 to 9999) serialized in a column "invites"
Retrieve all the user ids back from the User.invites column (deserialize it?)
From the fine manual:
serialize(attr_name, class_name = Object)
If you have an attribute that needs to be saved to the database as an object, and retrieved as the same object, then specify the name of that attribute using this method and it will be handled automatically. The serialization is done through YAML. If class_name is specified, the serialized object must be of that class on retrieval or SerializationTypeMismatch will be raised.
So, if you want to store an array of integers as a serialized object, then:
class User < ActiveRecord::Base
  serialize :invites, Array
  #...
end
You'd want the invites column to be a text column in the database (not string!) to avoid running into size issues.
Then you can treat user.invites as a plain Array:
user.invites = [ 1, 2, 3 ]
user.invites.push(11)
That of course doesn't verify that the numbers are valid or that you don't have duplicates (but you could use a Set instead of an Array for that). It also won't prevent you from putting a string in there.
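If you do go this way, the checks mentioned above could be sketched in plain Ruby like this. The add_invite helper and the VALID_ID_RANGE constant are hypothetical names for the example; the 1 to 9999 range comes from the question. How a Set is stored through serialize is not shown here.

```ruby
require 'set'

# Range of acceptable user ids, taken from the question.
VALID_ID_RANGE = (1..9999)

# Adds an id to the collection, rejecting anything that is not an
# integer in the expected range. A Set silently ignores duplicates.
def add_invite(invites, user_id)
  unless user_id.is_a?(Integer) && VALID_ID_RANGE.cover?(user_id)
    raise ArgumentError, "invalid user id: #{user_id.inspect}"
  end
  invites << user_id
end

invites = Set.new([1, 2, 3])
add_invite(invites, 11)
add_invite(invites, 2) # duplicate, ignored by the Set
puts invites.to_a.inspect
```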
I don't recommend that you do this though. Serialization is almost always a mistake that will come back to bite you later. A serialized column is an opaque blob of data as far as the database is concerned: you can't update it in-place, you can't query it, all you can do is pull it out of the database and put it back. serialize uses YAML for serialization and that's an awful format if you need to work with your serialized data inside the database; you can also run into interesting encoding issues during upgrades.
You're better off setting up a traditional association table and a separate model (possibly using has_many ... :through =>) to handle this situation.
