After researching this for a good while, I'm finally writing it up.
I am running a Solr query to find all records that have specific object_ids, along with another field that marks each record as active or inactive. I then need two counts: the number of active records, and the total number of records (active and inactive) that belong to the specified object_ids.
Model.search() do
with(:object_id, params[:ids])
active_condition = with(:active, true)
facet(:object_id, exclude: active_condition)
end
This returns all the records. However, the requirement is to fetch counts only for records that belong to the specified object_ids, covering both active and inactive ones.
Is it possible to get these counts with this approach?
You need to facet on the active field instead:
Model.search() do
with(:object_id, params[:ids])
active_condition = with(:active, true)
facet(:active, exclude: active_condition)
end
This returns the active and inactive counts of the object_ids you scoped on.
Related
Assuming this simplified schema:
users has_many discount_codes
discount_codes has_many orders
I want to grab all users, and if they happen to have any orders, only include the orders that were created between two dates. But if they don't have orders, or have orders only outside of those two dates, still return the users and do not exclude any users ever.
What I'm doing now:
users = User.all.includes(discount_codes: :orders)
users = users.where("orders.created_at BETWEEN ? AND ?", date1, date2).
  or(users.where(orders: { id: nil }))
I believe my OR clause lets me retain users who have no orders whatsoever. However, if a user only has orders outside of date1 and date2, my query excludes that user.
For what it's worth, I want to use this orders where clause here specifically so I can avoid n + 1 issues later in determining orders per user.
Thanks in advance!
It doesn't make sense to try and control the orders that are loaded as part of the where clause for users. If you were to control that it'd have to be part of the includes (which I think means it'd have to be a part of the association).
Although technically it can combine them into a single query in some cases, ActiveRecord is going to do this as two queries.
The first query will be executed when you go to iterate over the users and will use that where clause to limit the users found.
It will then run a second query behind the scenes based on that includes statement. This will simply be a query to get all orders which are associated with the users that were found by the previous query. As such the only way to control the orders that are found through the user's where clause is to omit users from the result set.
If I were you I would create an instance method in User model for what you are looking for but instead of using where use a select block:
# `end` is a reserved word in Ruby, so the upper bound is named finish
def orders_in_timespan(start, finish)
  orders.select { |o| o.created_at.between?(start, finish) }
end
ActiveRecord caches the orders loaded by the includes on each user instance, so if you start off with an includes in your users query, I believe this will not result in n queries.
Something like:
render json: User.includes(:orders), methods: :orders_in_timespan
Of course, the easiest way to confirm the number of queries is to look at the logs. I believe this approach should have two queries regardless of the number of users being rendered (as likely does your code in the question).
Also, I'm not sure how familiar you are with SQL, but you can call .to_sql on things such as your users variable to see the SQL that would be generated, which might help shed some light on the discrepancies between what you're getting and what you're looking for.
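As a rough illustration of the in-memory filtering idea, here is a minimal plain-Ruby sketch. The User and Order structs are hypothetical stand-ins for the real models (no database is involved), and it assumes each order exposes a created_at timestamp:

```ruby
# Hypothetical stand-ins for the ActiveRecord models, for illustration only.
Order = Struct.new(:id, :created_at)

User = Struct.new(:name, :orders) do
  # Filters the already-loaded orders in memory, so no extra query is needed.
  def orders_in_timespan(start, finish)
    orders.select { |o| o.created_at.between?(start, finish) }
  end
end

day = 24 * 60 * 60
now = Time.at(1_700_000_000)

users = [
  User.new("alice", [Order.new(1, now - 2 * day), Order.new(2, now - 30 * day)]),
  User.new("bob",   [Order.new(3, now - 60 * day)]),
  User.new("carol", [])
]

# Every user is kept; only the orders are narrowed down to the last 7 days.
results = users.map { |u| [u.name, u.orders_in_timespan(now - 7 * day, now).map(&:id)] }.to_h
```

No user is dropped here: users whose orders all fall outside the range, or who have no orders at all, simply end up with an empty list.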
Option 1: Write a custom query in SQL (ugly).
Option 2: Create 2 separate queries like below...
@users = User.limit(10)
@orders = Order.joins(:discount_code)
  .where(created_at: 10.days.ago..1.day.ago, discount_codes: { user_id: @users.select(:id) })
  .group_by { |order| order.discount_code.user_id }
Now you can use it like this ...
@users.each do |user|
  # Users with no orders in the range have no key in the hash, so fall back to an empty array
  orders = @orders[user.id] || []
  puts user.name
  puts user.id
  puts orders.count
end
I hope this will solve your problem.
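The lookup step of the two-query approach can be sketched in plain Ruby as follows. OrderRow and the sample ids are made up for illustration; the point is only that a user with no matching orders has no key in the grouped hash, so the lookup needs a default:

```ruby
# Hypothetical row type standing in for the orders returned by the second query.
OrderRow = Struct.new(:id, :user_id)

orders = [OrderRow.new(1, 10), OrderRow.new(2, 10), OrderRow.new(3, 20)]

# Same shape as the result of group_by in the second query: { user_id => [orders] }
orders_by_user = orders.group_by(&:user_id)

# User 30 has no orders in the range, so it is missing from the hash.
user_ids = [10, 20, 30]

# fetch with a default keeps every user and reports zero instead of failing on nil.
counts = user_ids.map { |id| [id, orders_by_user.fetch(id, []).size] }.to_h
```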
You need to use joins instead of includes. Rails joins use inner joins and will reject all the records which don't have associations.
User.joins(discount_codes: :orders).where(orders: {created_at: [10.days.ago..1.day.ago]}).distinct
This will give you all distinct users who placed orders in a given period of time.
user = User.joins(:discount_codes).joins(:orders).where("orders.created_at BETWEEN ? AND ?", date1, date2) +
User.left_joins(:discount_codes).left_joins(:orders).group("users.id").having("count(orders.id) = 0")
I do the following so that I can group all LineItems by vendor_name and display each vendor_name along with its count:
line_items = LineItem.all
vendor_line_items = line_items.group(:vendor_name).select('COUNT(*) as count', 'vendor_name').order('count desc')
My issue is that I am only able to receive the following params: id: nil, vendor_name: "name_here"
Is there a way to accomplish the same thing but allow all params from the model to be passed?
You can't select the rest of the columns, since they can hold different values for each row inside a group (for example, if you have 2 LineItems in the same group, which ID do you expect to get?).
You could apply aggregate functions (like COUNT, MAX, MIN, etc.) to the other columns in the SELECT to tell the database which value you want for each column, I guess.
Personally, I would first get the groups ordered by count and then do more queries when needed to fetch the actual record for the groups.
counts = LineItem.group(:vendor_name).count
# counts should be something like: {"vendor_1" => X, "vendor_2" => Y, "vendor_3" => Z}
# order the vendors by their count, highest first
ordered_vendors = counts.keys.sort_by { |ven| -counts[ven] }
ordered_vendors.each do |vendor|
# do something with each vendor, fetch LineItems, etc
end
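For reference, the ordering step can be tried out in plain Ruby. The hash below is made-up sample data in the shape that group(:vendor_name).count returns (vendor name mapped to count), and the counts are sorted from highest to lowest as in the original ORDER BY count DESC:

```ruby
# Made-up sample data in the shape returned by LineItem.group(:vendor_name).count
counts = { "Vendor A" => 5, "Vendor B" => 12, "Vendor C" => 1 }

# Sort the [vendor, count] pairs by count, highest first.
ordered = counts.sort_by { |_vendor, count| -count }
ordered_vendors = ordered.map(&:first)

ordered.each do |vendor, count|
  puts "#{vendor}: #{count}"
end
```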
The reason you only see the count and the vendor name is that those are all you are selecting and grouping by. Suppose the database has 5 different rows for Vendor A, as shown below.
vendor_name | product_name
-----------------------------
Vendor A | test
Vendor A | test2
Vendor A | test3
Vendor A | test4
Vendor A | test5
...
When you run your query, SQL will not know what to display for product_name, as the GROUP BY will only show 1 row instead of 5. Have a read about it here.
To achieve this, you will need to either group by the other columns too or use a MIN/MAX select to pick a value to display. Here is an example:
vendor_line_items = LineItem.select('COUNT(*) AS count', 'vendor_name', 'MAX(product_name) AS product_name').group(:vendor_name).order('count DESC')
Now you can call the attributes method on each of those results, which will give you the following hash:
vendor_line_items.each do |x|
result = x.attributes
# Here result will be a hash.
# {"count" => 5, "vendor_name" => "Vendor A", "product_name" => "test5"}
end
(Not accepted answer unless a better way is received)
I did:
vendor_line_items = Vendor.joins(:line_items).group(:id).order('COUNT(line_items.id) DESC')
This gives me what I want by ordering the results by vendor.line_items.count and allowing me to get all of the associations to display any param I want.
I assume this way is much slower than what I was previously doing as it fetches all records and then on the front end goes through associations to get more records.
The original way I was doing this gives me what I want, minus one extra value that I want the SUM of. That value is a decimal attribute: in the same way I count the LineItems that share a vendor_name, I want to sum that attribute across the LineItems that share the same vendor_name.
Better Answer:
LineItem.select(:vendor_name, 'sum(line_item_revenue) as line_item_revenue', 'COUNT(*) as count').group(:vendor_name)
This seems to get me what I want with fewer queries (I believe); correct me if I am wrong about the queries.
I am quite confused about your code and your expectation. You are selecting the COUNT but the expected result is id instead of count?
If you want to group by vendor_name and show the count of group_by you can try
line_items.group(:vendor_name).count
I want to write this query using classic Active Record, instead of raw sql.
SELECT
user_id,
term,
SUM(views) AS Views,
SUM(clicks) AS Clicks
FROM reports
GROUP BY user_id, term
Tried this, but doesn't work.
Report.group([:user_id, :term]).sum([:views, :clicks])
I know it's possible to use .group_by{}, but it's not very efficient because the aggregation is done in Ruby rather than by the query.
DB: Postgresql
I think you can use this
@reports = Report.select("reports.user_id, reports.term,
  sum(reports.views) as total_views,
  sum(reports.clicks) as total_clicks").
  group("reports.user_id, reports.term")
Please note that the totals (the sum results) do exist even though they do not show up when you inspect a record; you can read them by calling them by name:
@reports.first # total_views and total_clicks do not show up in the Rails console, but
@reports.first.total_views # returns the calculated total
In a rails 4 app, in one model I have a column containing multiple ids as a string with comma separated values.
"123,4568,12"
I have a "search" engine that I use to retrieve the records with one or many values using the full text search of postgresql I can do something like this which is very useful:
records = MyModel.where("my_models.col_name @@ ?", ["12","234"])
This return all the records that have both 12 and 234 in the targeted column. The array comes from a form with a multiple select.
Now I'm trying to make a query that will find all the records that have either 12 or 234 in their string.
I was hoping to be able to do something like:
records = MyModel.where("my_models.col_name IN (?)", ["12","234"])
But it's not working.
Should I iterate through all the values in the array to build a query with multiple OR ? Is there something more appropriate to do this?
EDIT / TL;DR
@BoraMa's answer is a good way to achieve this.
To find all the records containing one or more ids referenced in the request use:
records = MyModel.where("my_models.col_name @@ to_tsquery(?)", ["12","234"].join('|'))
You need the to_tsquery(?) and the join with a single pipe | to do an OR-like query.
To find all the records containing exactly all the ids in the query use:
records = MyModel.where("my_models.col_name @@ ?", ["12","234"])
And of course replace ["12","234"] with something like params[:params_from_my_form]
Postgres documentation for full text search
If you already started using full-text search in Postgres in the first place, I'd try to leverage it again. I think you can use a full-text OR query, which can be constructed like this:
records = MyModel.where("my_models.col_name @@ to_tsquery(?)", ["12","234"].join(" | "))
This uses the | operator for ORing full-text queries in Postgres. I have not tested this, and you may need to wrap the column as to_tsvector(my_models.col_name) for this to work.
See the documentation for more info.
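Building the query string itself is plain Ruby and can be checked without a database. This is only a sketch: raw_ids is a hypothetical form input, and filtering to digits-only values is an extra precaution I am assuming is wanted, so that stray characters cannot produce an invalid tsquery:

```ruby
# Hypothetical form input; in a controller this would come from params.
raw_ids = ["12", "234", "oops; drop"]

# Keep only purely numeric values so stray characters cannot break the tsquery syntax.
ids = raw_ids.grep(/\A\d+\z/)

# "|" ORs the terms together, "&" ANDs them.
or_query  = ids.join(" | ")
and_query = ids.join(" & ")

puts or_query
puts and_query
```

Either string can then be passed as the bind value for to_tsquery(?), depending on whether any or all of the ids must match.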
Suppose your ids are:
a = "1,2,3,4"
You can simply use:
ModelName.find(a.split(","))
This will give you all the records of that model whose id is present in a.
I can think of a super simple solution: sort the ids in a before_save callback of MyModel, and then the query becomes easier:
class MyModel < ActiveRecord::Base
  before_save :sort_ids_in_col_name, if: :col_name_changed?
  private
  # Store the comma-separated ids in a consistent (string-sorted) order
  def sort_ids_in_col_name
    self.col_name = self.col_name.to_s.split(',').sort.join(',')
  end
end
Then the query will be easy:
ids = ["12","234"]
records = MyModel.where(col_name: ids.sort.join(','))
I'm having trouble figuring out how to loop over the results of a ThinkingSphinx search that has been set to group_by. I currently have the following:
search = Event.search(
{
group_by: 'category_id',
group_function: :attr
}
)
search.each_with_groupby_and_count do |event, group, count|
puts [event, group, count].join(' - ')
end
This, however, only returns one record per category. It seems like the group and count values are correct, but I only get the first Event of each category, which I would have expected to be all the events in the group. Is it possible to get an array of Hashes or similar? Furthermore, if this is possible, would the per_page option be per group?
I would expect each_with_groupby_and_count to iterate over something like this:
[
{group: 1, hits: [Event1, Event2], count: 2},
{group: 2, hits: [Event3], count: 1}
]
I'm afraid Sphinx's grouping functionality doesn't behave in that manner - it only returns one document (in this situation, one event) per group value.
It may be more appropriate to just sort by category_id instead, and track when it changes as you iterate over it (or use Enumerable#group_by to group all events by category_id) - keep in mind that Sphinx paginates results, so you may want to increase the default page size (with :per_page) depending on how you're using these results.
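The Enumerable#group_by route could look roughly like this. The Event struct and sample data are hypothetical stand-ins for one page of search results; in practice the events would come from a search sorted by category_id:

```ruby
# Hypothetical stand-in for the Event model; a real page of search results would be used instead.
Event = Struct.new(:id, :category_id)

events = [Event.new(1, 1), Event.new(2, 1), Event.new(3, 2)]

# Group the events in Ruby and build one hash per category.
grouped = events.group_by(&:category_id).map do |category_id, hits|
  { group: category_id, hits: hits, count: hits.size }
end

grouped.each do |g|
  puts "#{g[:group]} - #{g[:hits].map(&:id).join(', ')} - #{g[:count]}"
end
```

Note that the counts here only cover the events on the current page, which is why the page size matters with this approach.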