I have Project and Entry as models. A project can have many entries, and each entry belongs to exactly one project. Entries have dates.
One reporting requirement is to show Projects that have Entries for a particular month. I have been successful in using scopes to achieve this, i.e. Project.with_entries.on(param_the_month).
The issue is that I now want to display the entries for that month only, grouped by projects.
If I do projects.each do |p|, then query for the entries (p.entries), the returned entries are for all months, not just the month I specified.
While this is an obvious result, is there a way in Rails to simply return the entries for that month using my original chained scope?
Edit: I did misunderstand :)
Take 2: You can merge scopes across models. So if you can create a where-type scope on Entry to select entries from a given month, you can then try something like
Project.with_entries.on(param_the_month).merge(Entry.on(param_the_month))
I've called it on by analogy with your scope on Project - without seeing your data model I can't say how exactly to implement it.
Has-many associations also accept scopes, so you can do project.entries.your_scope to filter them. The downside is that this requires another database query for every project, which might be slow depending on the size of your database.
An alternative that does not require extra queries would be to fetch the entries already filtered, and then go upward to get the projects:
entries = Entry.my_conditions.includes(:project)
entries_by_project = entries.group_by(&:project)
Now you have a hash whose keys are the projects, and the values are only the entries of that project that pass your conditions.
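To make the shape of that hash concrete, here is a minimal plain-Ruby sketch. The Structs and the sample data are hypothetical stand-ins for the ActiveRecord models, and the array plays the role of the already-filtered Entry query; only group_by itself is taken from the answer.

```ruby
require "date"

# Hypothetical stand-ins for the ActiveRecord models.
Project = Struct.new(:name)
Entry   = Struct.new(:project, :date_on)

alpha = Project.new("Alpha")
beta  = Project.new("Beta")

# Pretend this is the result of the filtered Entry query for one month.
entries = [
  Entry.new(alpha, Date.new(2024, 3, 1)),
  Entry.new(beta,  Date.new(2024, 3, 5)),
  Entry.new(alpha, Date.new(2024, 3, 9)),
]

# Keys are projects, values are that project's entries from the filtered set.
entries_by_project = entries.group_by(&:project)

entries_by_project.each do |project, project_entries|
  puts "#{project.name}: #{project_entries.map { |e| e.date_on.to_s }.join(', ')}"
end
```

A project with no matching entries never appears as a key, which fits the "projects that have entries for that month" requirement.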
You can add includes to your scope. That way the entries are eager loaded when you use the project scope, instead of being queried again for each project.
scope :my_scope, -> { includes(:entries).where(active: true) }
I received some really good help in solving an issue in which I needed to get objects from objects in one query. That worked like a charm.
This is a follow up to that question. I chose to create a new question to this since the last one was answered according to my previous specification. Please do refer to that previous question for details.
Basically, I wanted to get all the objects of multiple objects in one single query. E.g. if a Product has several Categories, which in turn have several Products, I want to get all the Products in that relation. Put more simply (and erroneously):
all_products = @my_product.categories.products
This was solved with this query. It is this query I would (preferably) like to alter:
Product.includes(:categories).where(categories: { id: @my_product.categories.pluck(:id) } )
Now, I realized something I missed using this solution was that I only get a list of unique Products (which one would expect as well). I would however like to get a list with possible duplicates as well.
Basically, if a "Blue, Electric Car" is included in categories ("Blue", "Electric" and "Car") I would like to get three instances of that object returned, instead of one unique.
I guess this does not make Rails-sense but is there a way to alter the query above so that it does not serve me a list of unique objects in the returned list but rather the "complete" list?
The includes method of ActiveRecord chooses between two strategies to build the query: one simply runs two separate queries, and the other uses a LEFT OUTER JOIN.
In both cases the products will be distinct.
You have to do a right outer join manually:
Product.joins('RIGHT JOIN categories ON categories.product_id = products.id').where(categories: { id: @my_product.categories.pluck(:id) } )
Also add .preload(:categories) if you want to keep the eager loading of the categories.
Since you want duplicates, just change includes to joins (I tested this just now). joins inner-joins the two tables, giving you one record per Product and Category pair. includes eager loads the association, using an outer join when it needs one, so the retrieved records are unique per Product only.
Product.joins(:categories).where(categories: { id: @my_product.categories.pluck(:id) } )
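One way to picture why a join yields duplicates is to think of one result row per (category, product) pair. The plain-Ruby sketch below only illustrates that idea: the Structs and the sample data are hypothetical, and nothing here touches the database.

```ruby
# Hypothetical stand-ins for the models; each category lists its products.
Category = Struct.new(:name, :products)
Product  = Struct.new(:name)

car  = Product.new("Blue, Electric Car")
bike = Product.new("Blue Bike")

categories = [
  Category.new("Blue",     [car, bike]),
  Category.new("Electric", [car]),
  Category.new("Car",      [car]),
]

# One element per (category, product) pair, similar to the rows of a join.
with_duplicates = categories.flat_map(&:products)

# Collapsing the pairs gives the unique list the question started with.
unique = with_duplicates.uniq

puts with_duplicates.map(&:name).inspect
puts unique.map(&:name).inspect
```

The car belongs to three categories, so it appears three times in the first list and once in the second.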
I am loading data from two models, and once the data are loaded in the variables, then I need to remove those items from the first relation, that are not in the second one.
A sample:
users = User.all
articles = Article.order('created_at DESC').limit(100)
I have these two variables filled with relational data. Now I need to remove from articles every item whose user_id is not among the loaded users, so that articles keeps only the items whose user_id belongs to a user in the users variable.
I tried it with a loop, but it was very slow. How do I do it effectively?
EDIT:
I know there's a way to avoid doing this by building a better query, but in my case I cannot do that (although I agree that in the example above it would be possible). The thing is that I have data from the database loaded into two variables and I need to process them with Ruby. Is there a command for doing that?
Thank you
Assuming you have a belongs_to relation on the Article model:
articles.where.not(users: users)
This would give you at most 100, but probably fewer. If you want to return 100 with the condition (I haven't tested this, but the idea is the same: put the conditions for users in the where statement):
Article.includes(:users).where.not(users: true).order('created_at DESC').limit(100)
The best way to do this would probably be with a SQL join. Would this work?
Article.joins(:user).order('created_at DESC').limit(100)
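If both collections really are already loaded and have to be filtered in Ruby, as the question's edit says, one possible reason for a slow loop is scanning the users again for every article. A minimal sketch that avoids this builds a Set of user ids once and then filters in a single pass. The Structs and sample data are hypothetical stand-ins for the loaded records.

```ruby
require "set"

# Hypothetical stand-ins for records that are already in memory.
User    = Struct.new(:id)
Article = Struct.new(:id, :user_id)

users = [User.new(1), User.new(2)]
articles = [
  Article.new(10, 1),
  Article.new(11, 3),   # user 3 is not in users
  Article.new(12, 2),
  Article.new(13, nil), # no user at all
]

# Build the lookup once so each membership check is cheap.
user_ids = users.map(&:id).to_set

# Keep only the articles whose user_id belongs to one of the loaded users.
kept = articles.select { |article| user_ids.include?(article.user_id) }

puts kept.map(&:id).inspect
```

select returns a new array and leaves articles untouched. Note that filtering after limit(100) can leave fewer than 100 items.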
I have a model to which I need to create a default scope. I am unsure of the best way to write this scope but I will explain how it needs to work.
Basically I need to get all items of the model and if two items have the same "order" value then it should look to the "version" field (which will contain, 1, 2, 3 etc) and pick the one with the highest value.
Is there a way of achieving this with just a scope?
Try this code:
scope :group_by_order, -> { order('order ASC').group('order') }
default_scope, { (group_by_order.map{ |key,values| values.order('version DESC') }.map{|key, values| values - values[1..-1]}).values.flatten }
Explanation Code:
order by "order" field.
group by "order" field.
map on the result hash, and order each values by "version" field
map again on values, and remove from index "1" to the end.
get all values, and flatten them
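Read as plain Ruby over records that are already loaded, the idea in those steps (group by order, keep the highest version in each group) can be sketched as follows. Item and its sample data are hypothetical stand-ins, not the asker's model, and the result is a plain Array rather than a relation, so this is not a drop-in scope.

```ruby
# Hypothetical stand-in for the model, with the two fields from the question.
Item = Struct.new(:name, :order, :version)

items = [
  Item.new("a", 1, 1),
  Item.new("b", 1, 3),
  Item.new("c", 2, 1),
  Item.new("d", 1, 2),
  Item.new("e", 2, 2),
]

latest = items
  .group_by(&:order)                                # one group per order value
  .map { |_order, group| group.max_by(&:version) }  # keep the highest version
  .sort_by(&:order)                                 # present in order sequence

puts latest.map(&:name).inspect
```

Doing the same thing in SQL, so that it can be chained like other scopes, needs a subquery or a similar construct.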
A word of caution about using default scopes with order: when you perform updates on the collection, such as update_all, it will use the default scope to fetch the records, and what you think would be a quick operation will bring your database to its knees as it copies the rows to a temporary table before updating.
I would recommend just using a normal scope instead of a default scope.
Have a look at "Select the 3 most recent records where the values of one column are distinct" for how to construct the SQL query you want, and then put that into a find_by_sql statement as mentioned in "How to chain or combine scopes with subqueries or find_by_sql".
The ActiveRecord order method simply uses the SQL ORDER BY clause, which can take several arguments. Let's say you have some model with the attributes order and version; then the correct way to order the records as you describe is order(:order, :version). If you want this as the default scope, you would end up with:
default_scope { order(:order, :version) }
First, default_scopes are dangerous. They get used whenever you use the model, unless you specifically force 'unscoped'. IME, it is rare to need a scope on every usage of a model. Not impossible, but rare. And rarer yet when the scope involves such a big computation.
Instead of making a complex query, can you simplify the problem? Here's one approach:
In order to make the version field work, you probably have some code that is already comparing the order fields (otherwise you would not have unique rows with the two order fields the same, but the version field differing). So you can create a new field, that is higher in value than the last field that indicated the right entity to return. That is, in order to create a new unique version, you know that you last had a most-important-row. Take the most-important-rows' sort order, and increment by one. That's your new most-important-rows' sort order.
Now you can query for qualifying data with the highest sort order (order(sort_order: :desc).first).
Rather than focusing on the query, focus on whether you are storing the right data, data that can make the query you want easier to write. In this case, it appears that you're already doing an operation that would help identify a winning case. So use that code and the existing database operation to reduce future database operations.
In sql you can easily order on two things, which will first order on the first and then order on the second if the first thing is equal. So in your case that would be something like
select * from posts order by order_field_1, version desc
You cannot name a column order since it is a sql reserved word, and since you did not give the real column-name, I just named it order_field_1.
This is easily translated to rails:
Post.order(:order_field_1, version: :desc)
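To see what ordering on two keys does, here is an in-memory sketch of the same rule: ascending on the first field, then descending on version when the first field ties. The Post Struct and its data are hypothetical, and sort_by is only a stand-in for what the database does with ORDER BY.

```ruby
# Hypothetical stand-in for the Post model.
Post = Struct.new(:title, :order_field_1, :version)

posts = [
  Post.new("x", 2, 1),
  Post.new("y", 1, 1),
  Post.new("z", 1, 2),
]

# Arrays compare element by element, so the second key only matters on ties.
# Negating version turns the ascending sort into a descending one for that key.
sorted = posts.sort_by { |post| [post.order_field_1, -post.version] }

puts sorted.map(&:title).inspect
```

Here "z" comes before "y" because both share the same first key and "z" has the higher version.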
I would generally advise against using default_scope, since once set it is really hard to avoid (it is always prepended), but if you really need it and know the risks, it is really easy to apply as well:
class Post < ActiveRecord::Base
default_scope { order(:order_field_1, version: :desc) }
end
This is all actually documented very well in the rails guides.
I watched this Railscast, http://railscasts.com/episodes/22-eager-loading, but I am still confused about the best way to write an efficient GET REST service for a scenario like this:
Let's say we have an Organization table and there are like twenty other tables that there is a belongs_to and has_many relations between them. (so all those tables have a organization_id field).
Now I want to write GET and INDEX requests as a Rails REST service that, based on the organization id passed in the URL, reads those tables and fills the JSON, but NOT for ALL of those tables, only for a few of them: for example Patients, Orders and Visits, not all twenty.
So still I have trouble with getting my head around how to write such a
.find( :all )
sort of query ?
Can someone show some example so I can understand how to do this sort of queries?
You can eager load all of those associations in one call:
@organization = Organization.includes(:patients, :orders, :visits).find(1)
Now when you do something like:
@organization.patients
It will load the patients in-memory, since it already fetched them in the original query. Without includes, @organization.patients would trigger another database query. This is why it's called "eager loading", because you are loading the patients of the organization before you actually reference them (eagerly), because you know you will need that data later.
You can use includes anytime, whether using all or not. Personally I find it to be more explicit and clear when I chain the includes method onto the model, instead of including it as some sort of hash option (as in the Railscast episode).
Named scopes really made this problem easier but it is far from being solved. The common situation is to have logic redefined in both named scopes and model methods.
I'll try to demonstrate the edge case with a somewhat complex example. Let's say that we have a Message model that has many Recipients. Each recipient can mark the message as read for himself.
If you want to get the list of unread messages for given user, you would say something like this:
Message.unread_for(user)
That would use the named scope unread_for that would generate the sql which will return the unread messages for given user. This sql is probably going to join two tables together and filter messages by those recipients that haven't already read them.
On the other hand, when we are using the Message model in our code, we are using the following:
message.unread_by?(user)
This method is defined in the Message class, and even though it does basically the same thing, it has a different implementation.
For simpler projects, this is really not a big thing. Implementing the same simple logic in both sql and ruby in this case is not a problem.
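To make the duplication concrete, here is a sketch of what the Ruby side might look like. It assumes each recipient records a user and a read flag; those field names, the Structs and the sample data are hypothetical. The named scope would have to restate the same rule in SQL as a join plus a condition.

```ruby
# Hypothetical stand-ins for the models described above.
Recipient = Struct.new(:user, :read)

Message = Struct.new(:subject, :recipients) do
  # True when the user is a recipient of this message and has not read it yet.
  def unread_by?(user)
    recipients.any? { |recipient| recipient.user == user && !recipient.read }
  end
end

message = Message.new("Hello", [
  Recipient.new("ann", false),
  Recipient.new("bob", true),
])

puts message.unread_by?("ann")
puts message.unread_by?("bob")
```

Any change to this rule, for example ignoring archived recipients, would have to be made both here and in the scope's SQL.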
But when the application starts to get really complex, it becomes a problem. If we have a permission system that checks who is able to access what message based on dozens of criteria defined in dozens of tables, this gets very complex. Soon it comes to the point where you need to join 5 tables and write really complex SQL by hand in order to define the scope.
The only "clean" solution to the problem is to make the scopes use the actual ruby code. They would fetch ALL messages, and then filter them with ruby. However, this causes two major problems:
Performance
Pagination
Performance: we are creating a lot more queries to the database. I am not sure about the internals of a DBMS, but how much harder is it for the database to execute 5 queries, each on a single table, than 1 query that joins 5 tables at once?
Pagination: we want to keep fetching records until the specified number of records has been retrieved. We fetch them one by one and check whether each is accepted by the Ruby logic. Once 10 of them are accepted, the process stops.
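The "fetch until enough are accepted" idea maps naturally onto a lazy enumerator. The sketch below is only an illustration under stated assumptions: the Record Struct, the sample data and the readable check are hypothetical, and an in-memory array stands in for whatever would stream records from the database.

```ruby
# Hypothetical stand-in for records coming from the database.
Record = Struct.new(:id, :owner)

records = (1..1000).map { |i| Record.new(i, i.even? ? "ann" : "bob") }

checked = 0

# Stand-in for a permission check written in Ruby; it counts its own calls.
readable = lambda do |record|
  checked += 1
  record.owner == "ann"
end

# lazy makes select pull records one at a time, and first(10) stops the
# iteration as soon as ten records have been accepted.
page = records.lazy.select { |record| readable.call(record) }.first(10)

puts page.map(&:id).inspect
puts checked
```

Only as many records are checked as it takes to fill the page, but later pages have to skip the earlier accepted records again, so the cost grows with the page number.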
Curious to hear your thoughts on this. I have no experience with NoSQL databases; can they tackle the issue in a different way?
UPDATE:
I was only speaking hypothetically, but here is one real-life example. Let's say that we want to display all transactions on one page (both payments and expenses).
I have created a SQL UNION query to get them both, then I go through each record, check whether it can be :read by the current user, and finally paginate the result as an array.
def form_transaction_log
  sql1 = @project.payments
    .select("'Payment' AS record_type, id, created_at")
    .where('expense_id IS NULL')
    .to_sql
  sql2 = @project.expenses
    .select("'Expense' AS record_type, id, created_at")
    .to_sql
  result = ActiveRecord::Base.connection.execute %{
    (#{sql1} UNION #{sql2})
    ORDER BY created_at DESC
  }
  result = result.map do |record|
    klass = Object.const_get record["record_type"]
    klass.find record["id"]
  end.select do |record|
    can? :read, record
  end
  @transactions = Kaminari.paginate_array(result).page(params[:page]).per(7)
end
Both payments and expenses need to be displayed within same table, ordered by creation date and paginated.
Both payments and expenses have completely different :read permissions (defined in the ability class, CanCan gem). These permissions are quite complex and they require querying several other tables.
The "ideal" thing would be to write one HUGE SQL query that would return what I need. It would make pagination and everything else a lot easier. But that is going to duplicate the logic defined in my ability.rb class.
I'm aware that CanCan provides a way of defining the sql query for the ability, but the abilities are so complex, that they couldn't be defined in that way.
What I did is working, but I'm loading ALL transactions, and then checking which ones I could read. I consider it a big performance issue. Pagination here seems pointless because I'm already loading all records (it only saves bandwidth). An alternative is to write really complex SQL that is going to be hard to maintain.
Sounds like you should remove some duplication and perhaps use DB logic more. There's no reason that you can't share code between named scopes and other methods.
Can you post some problematic code for review?