Finding distinct values in array column - ruby-on-rails

I'm using PostgreSQL with ActiveRecord in Rails 4.
I have a customer model and one of the columns is called "tags" and it is an array column (["sports", "broadcasting"]).
How can I select all the distinct values from this column? I would like to avoid doing anything where I would have to instantiate AR objects due to the amount of customer records we have. So I don't want something like:
Customer.select(:tags).map(&:tags).flatten.uniq
which works but uses too much memory.
I need the values to provide suggestions when someone is adding a tag to a customer. Hopefully it will help prevent variations of words or misspellings.
Thanks in advance!

You can use pluck, which returns the column values without instantiating ActiveRecord objects:
Customer.pluck(:tags).flatten.uniq
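If even the plucked array of arrays is large, the flatten/uniq step can be done incrementally. The following is a minimal plain-Ruby sketch, not tested against a real database: tag_rows is a hypothetical stand-in for what Customer.pluck(:tags) returns, and distinct_tags is an illustrative helper, not a Rails method. In PostgreSQL the deduplication can also be pushed into the database with SELECT DISTINCT unnest(tags) FROM customers, which avoids shipping every row to Ruby at all.

```ruby
require "set"

# Hypothetical stand-in for the result of Customer.pluck(:tags):
# one array of tags per customer row (nil when the column is NULL).
tag_rows = [
  ["sports", "broadcasting"],
  ["sports"],
  nil,
  [],
  ["news", "broadcasting"]
]

# Collect each tag once, in first-seen order, without building
# the intermediate flattened array.
def distinct_tags(rows)
  seen = Set.new
  rows.each do |tags|
    Array(tags).each { |tag| seen << tag }
  end
  seen.to_a
end

suggestions = distinct_tags(tag_rows)
```

Array(tags) turns a nil row into an empty array, so customers without tags are skipped instead of contributing a nil suggestion.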

Related

Which one is faster between map, collect, select and pluck?

I have been using different methods to get specific fields from ActiveRecord, but which one is faster and preferred, and how do they differ from one another?
User.all.collect(&:name)
User.all.pluck(:name)
User.all.select(:name)
User.all.map(&:name)
Thanks for your help in advance.
Each of these methods suits a different use case:
Both select and pluck issue a SQL SELECT for only the specified columns (SELECT "users"."name" FROM "users"). Hence, if you haven't already fetched the users and aren't going to, these methods will be more performant than map/collect.
The difference between select and pluck:
Performance: the difference is negligible on a reasonable number of records.
Usage: select returns a list of models with only the specified column loaded, while pluck returns a list of the values of the specified column. Thus, again, the choice depends on the use case.
collect and map are actually aliases, so there's no difference between them. But to iterate over models they fetch the whole model (not just the specific column): they issue a SELECT "users".* FROM "users" request, convert the relation to an array and map over it.
This might be useful when the relation has already been fetched. If so, it won't make additional requests, which may end up more performant than using pluck or select. But, again, this must be measured for a specific use case.
pluck: retrieves just the names from users, puts them in an array as strings (in this case) and gives it to you.
select: retrieves all the users from the db with just the 'name' column and returns a relation.
collect/map (alias): retrieves all the users from the db with all columns, puts them in an array of User objects with all the fields, then transforms every object into just the name and gives this array of names to you.
I've listed these in order of performance, fastest first, in my experience.
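The difference in return shapes can be sketched without a database. This is only an illustration in plain Ruby: UserRecord and NameOnly are hypothetical structs standing in for ActiveRecord objects, and select_like and pluck_like merely mimic what select(:name) and pluck(:name) hand back; they are not the Rails implementations.

```ruby
# Hypothetical in-memory stand-in for User records (no database involved).
UserRecord = Struct.new(:id, :name, :email)

users = [
  UserRecord.new(1, "Ann", "ann@example.com"),
  UserRecord.new(2, "Bob", "bob@example.com")
]

# map and collect perform the same operation on an already-loaded collection.
names_via_map     = users.map(&:name)
names_via_collect = users.collect(&:name)

# select(:name) is roughly "objects carrying only that column";
# pluck(:name) is roughly "just the raw values".
NameOnly = Struct.new(:name)
select_like = users.map { |user| NameOnly.new(user.name) }
pluck_like  = users.map(&:name)
```

The select-like result still needs a further map to get at the strings, which is why pluck is usually the shorter path when only the values are wanted.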

How to remove some items from a relation?

I am loading data from two models, and once the data are loaded in the variables, then I need to remove those items from the first relation, that are not in the second one.
A sample:
users = User.all
articles = Articles.order('created_at DESC').limit(100)
I have these two variables filled with relational data. Now I need to remove from articles all items whose user_id value is not included in the users object, so that articles keeps only items whose user_id is in the variable users.
I tried it with a loop, but it was very slow. How do I do it effectively?
EDIT:
I know there's a way to avoid doing this by building a better query, but in my case I cannot do that (although I agree that in the example above it's possible). The thing is that I have data from the database loaded into two variables and I need to process it with Ruby. Is there a command for doing that?
Thank you
Assuming you have a belongs_to :user association on the Article model, you can filter on the foreign key:
articles.where(user_id: users.select(:id))
This would give you at most 100, but probably fewer. If you want to return 100 with the condition applied (I haven't tested, but the idea is the same: put the condition for users in the where clause):
Articles.where(user_id: User.select(:id)).order('created_at DESC').limit(100)
The best way to do this would probably be with a SQL join. Would this work?
Articles.joins(:user).order('created_at DESC').limit(100)
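If the records really are already loaded and have to be filtered in Ruby, as the edit to the question says, a slow loop is usually one that rescans users for every article. Here is a minimal sketch under that assumption; LoadedUser and LoadedArticle are hypothetical structs standing in for the loaded ActiveRecord objects.

```ruby
require "set"

# Hypothetical stand-ins for already-loaded records.
LoadedUser    = Struct.new(:id)
LoadedArticle = Struct.new(:id, :user_id)

users    = [LoadedUser.new(1), LoadedUser.new(3)]
articles = [
  LoadedArticle.new(10, 1),
  LoadedArticle.new(11, 2),
  LoadedArticle.new(12, 3),
  LoadedArticle.new(13, nil)
]

# Build the id lookup once; Set#include? is a hash lookup, so the filter
# does not rescan the users for every article.
user_ids = users.map(&:id).to_set

# Keep only the articles whose user_id belongs to one of the users.
kept = articles.select { |article| user_ids.include?(article.user_id) }
```

The result is a plain Array rather than a relation, so any further ordering or limiting would also have to happen in Ruby.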

Ruby on Rails: Saving multiple values in a single database cell

How do I save multiple values in a single cell record in Ruby on Rails applications?
If I have a table named Exp with columns named: Education, Experience, and Skill, what is the best practice if I want users to store multiple values such as: education institutions or skills in a single row?
I'd like to have users fill in multiple text fields, but the values should go into the same cell.
For instance if user has multiple skills, those skills should be in one cell? Would this be best or would it be better if I created a new table for just skills?
Please advise,
Thanks
I would not recommend storing multiple values in the same database column. It would make querying very difficult. For example, if you wanted to look for all the users with a particular skill set, the query would be clumsy in both readability and performance.
However, there are still certain cases where it makes sense:
When you want to allow for a variable list of data points
When you are not going to query the data based on one of the values in the list
ActiveRecord has built-in support for this. You can store a Hash or Array in a database column.
Just declare the columns as text:
rails g model Exp experience:text education:text skill:text
Next, serialize the columns in your Model code
class Exp < ActiveRecord::Base
  # serialize takes one attribute per call
  serialize :experience
  serialize :education
  serialize :skill
  # other model code
end
Now, you can just save the Hash or Array in the database field!
Exp.new(:skill => ['Cooking', 'Singing', 'Dancing'])
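As far as I know, serialize stores the value as YAML in the text column by default. What happens to the array on save and on read can be sketched with Ruby's standard YAML library alone; this illustrates the round trip and is not ActiveRecord's actual code path.

```ruby
require "yaml"

skills = ["Cooking", "Singing", "Dancing"]

# Roughly what ends up in the text column: a YAML string.
stored = YAML.dump(skills)

# Roughly what you get back when the attribute is read.
restored = YAML.load(stored)
```

Because the column holds one opaque string, the database cannot index or match individual skills, which is the querying limitation mentioned above.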
You can do it using a serialized (comma-separated) list in a single column, but it is a really bad idea. Read these answers for the reasoning:
Is storing a delimited list in a database column really that bad?
How to store a list in a column of a database table
I suggest changing your schema to have a one to many relationship between users and skills.
Rails 4 with PostgreSQL comes with hstore support out of the box. In Rails 3 you can use a gem to enable it.
It depends on what kind of functionality you want. If you want to bind the Exp model attributes to a form (for new and update operations) and put some validations on them, it is always better to keep them in a separate table. On the other hand, if these are just attributes that you only need to keep in the database, store them in a single column. There is a way to keep serialized objects like arrays and hashes in database columns. Make them an array or hash as needed and save them like this:
http://api.rubyonrails.org/classes/ActiveRecord/AttributeMethods/Serialization/ClassMethods.html#method-i-serialize
Serialized attributes are automatically deserialized when they are read from the table and automatically serialized when saved.

Rails 4 order by virtual attribute

I have a Product model which has name and description columns in the database.
I also have a Product.search_results_for(query), where query is a string like "Green Apple".
I need to return an ActiveRecord::Relation of the results ordered by which is the best hit. Currently, I'm setting a search_result_value to each product. search_result_value IS NOT a column in the database, and I don't want it to be.
So in essence, I have an ActiveRecord::Relation of Products that I need to order by search_result_value, an instance variable that isn't stored in the database, without converting the relation to an array. How can I do this?
Something like this:
Product.order(:search_result_value)
If you do not put the value in a column or express the logic in search_result_value in pure SQL, then you’ll have to load all Products into memory and then sort them in Ruby using sort_by:
Product.all.to_a.sort_by(&:search_result_value)
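Note that sort_by sorts in ascending order, so the best hit ends up last unless the score is negated or the result is reversed. Here is a minimal plain-Ruby sketch: SearchProduct is a hypothetical stand-in for the Product model, and the word-count scoring is an assumption made up for illustration, not the asker's actual search_result_value logic.

```ruby
# Hypothetical stand-in for Product; search_result_value is computed
# in Ruby and never stored.
SearchProduct = Struct.new(:name, :description) do
  # Toy scoring, assumed for illustration: count how many query words
  # appear in the name or description.
  def search_result_value(query)
    text = "#{name} #{description}".downcase
    query.downcase.split.count { |word| text.include?(word) }
  end
end

products = [
  SearchProduct.new("Red Apple", "A sweet red fruit"),
  SearchProduct.new("Green Apple", "A tart green fruit"),
  SearchProduct.new("Banana", "A yellow fruit")
]

query = "Green Apple"

# sort_by sorts ascending, so negate the score to put the best hit first.
ranked = products.sort_by { |product| -product.search_result_value(query) }
```

The result is an Array, not an ActiveRecord::Relation, so it cannot be chained with further query methods afterwards.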

How to create table in erlang mnesia with multiple unique columns?

Something like a unique column in SQL. Any suggestions?
Your question is quite open, so I tried to figure out what you want to do.
If you need to add a column which is not the primary key to store something like a unique ID, you can store an Erlang reference there (Ref = make_ref()), which is almost guaranteed to be unique (it cycles after around 2^82 calls). I don't know how it behaves across multiple nodes, but if that is a problem, you can tag the record with {node(), make_ref()}.
If you want records to be unique by the combination of several keys K1, K2, K3, you can use the tuple {K1,K2,K3} as the key of the table and use a set or ordered_set table, but it will be more complex to look things up in the table.
If it is something else, some additional information would help.
