Sorting elements of model depending upon habtm association - ruby-on-rails

I have two models named attachments and users, associated by has_and_belongs_to_many. Now I have to find all attachments, sorted so that attachments that have associated users are displayed first and those with no association come after. How can I do this?

One simple and relatively efficient way to do this would be to add a counter cache to your Attachment model. The counter cache would store and keep up-to-date the number of associations in a column on your attachments table, so you could do Attachment.order( 'user_attachments_count DESC' ).
Unfortunately HABTM does not support counter cache, so you would have to introduce a "middle-man" join model between the two others just to get access to the join table.
Another way (with poor performance, since every record is loaded and sorted in Ruby) is simply:
@attachments = Attachment.includes(:users)
@sorted = @attachments.sort_by { |a| a.users.size }.reverse
Well, if it doesn't fit, you can always start sweating over a SQL query...
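The in-memory approach can be sketched in plain Ruby, without a database. This is only an illustration under stated assumptions: the Attachment struct and its users array are hypothetical stand-ins for the ActiveRecord model and its already-loaded association. It uses partition instead of sort_by, which keeps the original order inside each group.

```ruby
# Hypothetical stand-in for the ActiveRecord model; `users` plays the role
# of the already-loaded HABTM association (illustration only).
Attachment = Struct.new(:name, :users)

attachments = [
  Attachment.new("a.pdf", []),
  Attachment.new("b.png", ["alice"]),
  Attachment.new("c.txt", []),
  Attachment.new("d.doc", ["bob", "carol"])
]

# partition returns [matching, non_matching] and preserves relative order
with_users, without_users = attachments.partition { |a| a.users.any? }
sorted = with_users + without_users

puts sorted.map(&:name).inspect
```

With real records the same two lines would work on the result of Attachment.includes(:users), at the cost of loading every attachment into memory.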

Related

Ruby On Rails - what's the best way to find an object with most (has-many) associations?

I have a variable (@cars) containing data from the database, and from it I need to generate an XLS document.
Because of some specifics of the XLS document, I need to know in advance the length of some associations (model Car has a has_many association for the model PreviousOwner) - particularly, I need to know how many previous owners each car had and I need to capture the highest number of previous owners of all cars.
One way of finding that is adding counter_cache to the Car model. Is there any other way to deal with this situation? I have the @cars variable and from there I need to find the car with the most previous owners.
One way of dealing with it is to join and select a count:
Car.left_joins(:previous_owners)
   .select('cars.*', 'COUNT(previous_owners.id) AS previous_owners_count')
   .group('cars.id')
   .order('previous_owners_count DESC')
Advantages when compared to a counter cache:
No additional update queries when inserting associated records.
More accurate if the count is critical and you have a lot of write activity.
Disadvantages:
Count is calculated for every query which is less efficient when reading.
It gets in the way of eager loading the records.
More code complexity vs a simple model callback.
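If the cars are already loaded, the maximum can also be found in Ruby. The sketch below uses plain structs as hypothetical stand-ins for the Car and PreviousOwner models, so it runs without a database. With real records, calling previous_owners.size on each car issues one query per car unless the association was preloaded (for example with includes).

```ruby
# Hypothetical stand-ins for the ActiveRecord models (illustration only).
PreviousOwner = Struct.new(:name)
Car = Struct.new(:model, :previous_owners)

cars = [
  Car.new("Civic", [PreviousOwner.new("Ann")]),
  Car.new("Golf", [PreviousOwner.new("Bob"), PreviousOwner.new("Cy"), PreviousOwner.new("Di")]),
  Car.new("Polo", [])
]

# max_by returns the element for which the block value is largest
most_owned = cars.max_by { |car| car.previous_owners.size }
max_owners = most_owned.previous_owners.size

puts "#{most_owned.model}: #{max_owners}"
```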

Rails Data Modelling

In my company, we are trying to cache some data that we are querying from an API. We are using Rails. Two of my models are 'Query' and 'Response'. I want to create a one-to-many relationship between Query and Response, wherein, one query can have many responses.
I thought this is the right way to do it.
Query = [query]
Response = [query_id, response_detail_1, response_detail_2]
Then, in the Models, I did the following Data Associations:
class Query < ActiveRecord::Base
  has_many :responses
end
class Response < ActiveRecord::Base
  belongs_to :query
end
So, canonically, whenever I want to find all the responses for a given query, I would do:
query = Query.find_by(:query => "given query")
Response.where(:query_id => query.id)
But my boss made me use an Array column in the Query model, remove the Data Associations between the models and put the id of each response record in that array column in the Query model. So, now the Query model looks like
Query = [query_id, [response_id_1, response_id_2, response_id_3,...]]
I just want to know what are the merits and demerits of doing it both ways and which is the right way to do it.
If the relationship is really a one-to-many relationship, the "standard" approach is what you originally suggested, or using a junction table. You're losing out on referential integrity that you could get with a FK by using the array. Postgres almost had FK constraints on array columns, but from what I researched it looks like it's not currently in the roadmap:
http://blog.2ndquadrant.com/postgresql-9-3-development-array-element-foreign-keys/
You might get some performance advantages out of the array approach if you consider it like a denormalization/caching assist. See this answer for some info on that, but it still recommends using a junction table:
https://stackoverflow.com/a/17012344/4280232
This answer and its comments also offer some thoughts on array performance versus join performance:
https://stackoverflow.com/a/13840557/4280232
Another advantage of using the array is that arrays will preserve order, so if order is important you could get some benefits there:
https://stackoverflow.com/a/2489805/4280232
But even then, you could put the order directly on the responses table (assuming they're unique to each query) or you could put it on a join table.
So, in sum, you might get some performance advantages out of the array foreign keys, and they might help with ordering, but you won't be able to enforce FK constraints on them (as of the time of this writing). Unless there's a special situation going on here, it's probably better to stick with the "FK column on the child table" approach, as that is considerably more common.
Granted, that all applies mainly to SQL databases, which I notice now you didn't specify in your question. If you're using NoSQL there may be other conventions for this.

How to remove some items from a relation?

I am loading data from two models, and once the data are loaded in the variables, then I need to remove those items from the first relation, that are not in the second one.
A sample:
users = User.all
articles = Article.order('created_at DESC').limit(100)
I have these two variables filled with relational data. Now I would need to remove from articles all items, where user_id value is not included in the users object. So in the articles would stay only items with user_id, that is in the variable users.
I tried it with a loop, but it was very slow. How do I do it effectively?
EDIT:
I know there's a way to avoid doing this by building a better query, but in my case, I cannot do that (although I agree that in the example above it's possible to do that). That thing is that I have in 2 variables loaded data from database and I would need to process them with Ruby. Is there a command for doing that?
Thank you
Assuming the Article model has a belongs_to :user association, you can filter the relation by the users you already have:
articles.where(user_id: users.map(&:id))
This returns at most 100 articles. The same idea written as a single query from the model (I haven't tested it, but the point is to put the condition on users in the where clause):
Article.where(user_id: users.map(&:id)).order('created_at DESC').limit(100)
The best way to do this would probably be with a SQL join. Would this work?
Article.joins(:user).order('created_at DESC').limit(100)
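For the case in the question's edit, where both collections are already loaded and must be processed in Ruby, a minimal sketch is to collect the user ids into a Set and filter with select. A loop that scans users for every article is quadratic; the Set lookup avoids that. The structs below are hypothetical stand-ins for the loaded records.

```ruby
require "set"

# Hypothetical stand-ins for already-loaded records (illustration only).
User = Struct.new(:id, :name)
Article = Struct.new(:id, :user_id, :title)

users = [User.new(1, "ann"), User.new(2, "bob")]
articles = [
  Article.new(10, 1, "first"),
  Article.new(11, 3, "orphan"),
  Article.new(12, 2, "second"),
  Article.new(13, nil, "no author")
]

# Build the id set once, then keep only articles whose user_id is in it
user_ids = users.map(&:id).to_set
kept = articles.select { |article| user_ids.include?(article.user_id) }

puts kept.map(&:title).inspect
```

The same two lines should apply to real records, since they only rely on id and user_id.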

Rails - Only pull in some HABTM associations on a case-by-case basis to avoid unnecessary joins

In Rails 4, I have a project in which I've set up three models with the following many-to-many relationships:
An Item
has_and_belongs_to_many categories
has_and_belongs_to_many tags
A Category
has_and_belongs_to_many items
A Tag
has_and_belongs_to_many items
And while it's easy to select an Item and automatically get all associated categories and tags, there are some situations in which I'd want to select items AND their associated categories, but NOT their tags. In these cases, I'd like to avoid doing extra database joins against the Tags table and ItemsTags join table. Can anyone help me with the correct find syntax to only join Items to categories? (Side note: I'm also planning on adding 10 additional many-to-many relationships between items and other models, but I'm just simplifying the scenario for this question. In the end, I'm trying to avoid doing a join with an excessive number of tables whenever I can.)
Thanks!
By default, Rails does not load associated records unless you request them.
Item.all only fetches records from the 'items' table.
If you later call item.categories, that is the point at which a query is performed to fetch the categories of that particular item. If you never call item.tags, the query against the 'tags' table is never executed and those records are never fetched. Bottom line: you can have as many associations as you need; as long as you don't explicitly call them, they won't be loaded.
Side note about performance: Rails offers several ways to join and include associated tables:
Item.includes(:categories) triggers a small, fixed number of queries (one for the items, then separate queries for the associated categories) instead of one query per item.
Item.includes(:categories).joins(:categories) triggers only 1 query joining the items and categories tables (but it may be slower than the separate queries).
So you have full control over what is loaded from the database. The same applies to scopes.

Loading all the data but not from all the tables

I watched this Railscast, http://railscasts.com/episodes/22-eager-loading, but I am still confused about the best way to write an efficient GET REST service for a scenario like this:
Let's say we have an Organization table and about twenty other tables related to it through belongs_to and has_many (so all those tables have an organization_id field).
Now I want to write GET and INDEX requests as a Rails REST service that, based on the organization id passed in the URL, reads those tables and fills the JSON, but NOT for ALL of those tables, only for a few of them: for example Patients, Orders and Visits, not all twenty.
So still I have trouble with getting my head around how to write such a
.find( :all )
sort of query ?
Can someone show some example so I can understand how to do this sort of queries?
You can eager load just those associations when you fetch the organization:
@organization = Organization.includes(:patients, :orders, :visits).find(1)
Now when you do something like:
@organization.patients
it will return the patients from memory, since they were already fetched along with the organization. Without includes, @organization.patients would trigger another database query. This is why it's called "eager loading": you are loading the patients of the organization before you actually reference them (eagerly), because you know you will need that data later.
You can use includes anytime, whether using all or not. Personally I find it to be more explicit and clear when I chain the includes method onto the model, instead of including it as some sort of hash option (as in the Railscast episode).

Resources