RoR - print count after group_by - ruby-on-rails

In my Ruby on Rails application, I want to print a count result after grouping my records in the database, and I get something like this:
{1=>6}
In my database there are 6 records, all with the same user_id.
Here is my code:
Whatever.group(:user_id).count(:user_id)
I just want to print 1. How can I do this? I tried distinct and uniq without any success...

If you just need to compact that down to a useful result:
Whatever.group(:user_id).count.keys.join(',')
This will handle the case where you have more than one user in the result set.
The count(:user_id) part is redundant unless you're counting based on other conditions. Just use count instead.
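As a minimal sketch of what can be read off that result, here is plain Ruby with a literal hash standing in for the return value of Whatever.group(:user_id).count (the literal is an assumption that mirrors the {1=>6} output in the question):

```ruby
# Stand-in for the hash returned by Whatever.group(:user_id).count
counts = { 1 => 6 }

user_ids     = counts.keys        # the distinct user_ids
group_total  = counts.size        # how many distinct user_ids there are
record_total = counts.values.sum  # total records across all groups

puts user_ids.join(',')
puts group_total
puts record_total
```

Which of the three is wanted depends on whether "1" in the question means the user_id itself or the number of groups.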

Here is an example
('a'..'c').group_by { |i| i * 2 } #=> {"aa"=>["a"], "bb"=>["b"], "cc"=>["c"]}
('a'..'c').group_by { |i| i * 2 }.keys #=> ["aa", "bb", "cc"]
('a'..'c').group_by { |i| i * 2 }.keys[0] #=> "aa"

Related

How to store data from an each where rows has the same id

I'm a newbie in Rails development, so I'm sorry if I can't express myself well.
I have a Rails each loop that does:
r.round_matches.each do |m|
m.round_matches_team.each do |mt|
sheet.add_row [m.round_id, mt.team_name]
end
end
Each :round_id appears twice among the round_matches.
The output is:
round_id: 2 team_name: TEST A
round_id: 2 team_name: TEST B
How can I group the rounds by id in the each loop and extract the team_name from round_match_teams for every matching round_id? I would like my output to be:
round_id: 2 team_name[1]: TEST A team_name[2]: TEST B
This should work
r.round_matches.each do |m|
team_names = m.round_matches_team.map.with_index do |team, index|
"team_name[#{index + 1}]: #{team.team_name}"
end.join(' ')
sheet.add_row ["round_id: #{m.round_id} #{team_names}"]
end
I would handle this a little differently: I would manipulate the data to be in a better format, and create the sheet from that data.
sheet_data = Hash.new { |hash, key| hash[key] = [] } # Hash.new([]) would share one array and never store a key
r.round_matches.each do |m|
m.round_matches_team.each do |mt|
sheet_data[m.round_id] << mt.team_name
end
end
sheet_data.each do |round_id, teams|
sheet.add_row [round_id, *teams]
end
Explained: I build a hash with the round_id as key and, as value, an array of the collected team names. Then, when adding the row, I use the splat operator (*) so that each team name gets a separate column.
You could even sort the team-names if this might make more sense before using the splat, or instead of using *teams, use something like teams.sort.join(", ") to combine all teams into one column (if wanted/preferred).
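A minimal sketch of the grouping step in plain Ruby, assuming hypothetical Struct stand-ins for the round_match and team records (the real models and the sheet object are not reproduced here). It uses the block form of Hash.new because Hash.new([]) would hand every missing key the same shared array and never store a key:

```ruby
# Hypothetical stand-ins for the ActiveRecord models in the question
Team  = Struct.new(:team_name)
Match = Struct.new(:round_id, :round_matches_team)

matches = [
  Match.new(2, [Team.new('TEST B')]),
  Match.new(2, [Team.new('TEST A')]),
  Match.new(3, [Team.new('TEST C')])
]

# Block form gives each round_id its own array
sheet_data = Hash.new { |hash, key| hash[key] = [] }
matches.each do |m|
  m.round_matches_team.each { |mt| sheet_data[m.round_id] << mt.team_name }
end

# One row per round: either one column per team, or all teams in one column
rows_split  = sheet_data.map { |round_id, teams| [round_id, *teams.sort] }
rows_joined = sheet_data.map { |round_id, teams| [round_id, teams.sort.join(', ')] }

p rows_split
p rows_joined
```

Each element of rows_split or rows_joined is what would be passed to sheet.add_row.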

Rails Optimize query and loop through large entity

I have a method that outputs the following hash format for charting.
# Monthly (Jan - Dec)
{
"john": [1,2,3,4,5,6,7,8,9,10,11,12],
"mike": [1,2,3,4,5,6,7,8,9,10,11,12],
"rick": [1,2,3,4,5,6,7,8,9,10,11,12]
}
# the indices represents the month
# e.g [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
# Index
# 0 = Jan
# 1 = Feb
# 2 = Mar
...
The following method loops through all the store invoices within a given year for each sales rep name and generates the above outcome:
def chart_data
hash = Hash.new {|h,k| h[k] = [] }
(1..12).each do |month|
date_range = "1/#{month}/#{date.year}".to_date.all_month
all_reps.each do |name|
hash[name] << store.bw_invoices.where(sales_rep_name: name,
purchase_date: date_range).sum(:subtotal).to_f
end
end
return hash
end
When I run this method it takes 4 to 5 seconds or more to execute. I really need to optimize this query. I came up with two solutions that I think would help, but I would love to get some of your expertise:
move it to background job
perform a SQL query to optimize(I need help with this if this is optimal)
Thank you so much for your time
Yes, you've found a problem that is very hard to solve efficiently without letting the database do the hard work.
Assuming your dataset is potentially too large to load a whole year of raw rows into Ruby objects, an approach that uses a single PostgreSQL query is probably the best option:
More SQL approach
def chart_data
result = Hash.new {|h,k| h[k] = [] }
total_lines = store.bw_invoices.select("sales_rep_name, to_char(purchase_date, 'mm') as month, sum(subtotal) as total")
.where(purchase_date: Date.today.all_year)
.group("sales_rep_name, to_char(purchase_date, 'mm')")
total_lines.each do |total_line|
result[total_line.sales_rep_name][total_line.month.to_i - 1] = total_line.total.to_f
end
result
end
Note that this solution will leave nil rather than 0 for months where a rep had no sales. And if their last month with sales was June then there will only be 6 items in the array.
We can avoid this either with more complex SQL left joining from a virtual table or by filling in the array gaps afterwards. However, depending on how you setup your charting this might make no practical difference anyway.
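Filling in the array gaps afterwards could look roughly like this. It is a sketch in plain Ruby, assuming a literal hash as a stand-in for what the SQL-based chart_data returns (the rep names and totals are made up for illustration):

```ruby
# Stand-in for the hash built by the SQL-based chart_data:
# sparse arrays with nil gaps and fewer than 12 entries
result = {
  'john' => [10.0, nil, 30.0],         # no sales in Feb, nothing after Mar
  'mike' => [nil, nil, nil, nil, 5.5]  # only sold in May
}

# Pad every rep to 12 entries and replace missing months with 0.0
filled = result.transform_values do |totals|
  Array.new(12) { |month_index| (totals[month_index] || 0).to_f }
end

p filled['john']
p filled['mike']
```

Hash#transform_values needs Ruby 2.4 or later; on older versions the same loop can be written with each_with_object.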
More ruby approach
def chart_data
result = Hash.new {|h,k| h[k] = [] }
(1..12).each do |month|
date_range = "1/#{month}/#{Date.today.year}".to_date.all_month
rows = store.bw_invoices.select("sales_rep_name, SUM(subtotal) as total")
.where(purchase_date: date_range)
.group(:sales_rep_name)
all_reps.each do |rep_name|
row = rows.detect { |x| x.sales_rep_name == rep_name }
result[rep_name] << (row ? row.total : 0).to_f
end
end
result
end
This is more similar to your approach but takes the querying outside of the inner loop so we do 12 queries instead of 12 * number of reps. The detect used may become a little slow but only if there are thousands of reps. In which case you could sort both all_reps and the query output and implement your own kind of merge join but at that point you're getting into complexity you might as well let the database handle again.
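One way to sidestep the detect scan without going as far as a merge join is to index each month's rows by rep name first. This is a swapped-in alternative rather than what the answer above does, sketched with a hypothetical Row struct in place of the query result:

```ruby
# Hypothetical stand-in for the rows returned by one monthly grouped query
Row = Struct.new(:sales_rep_name, :total)
rows = [Row.new('john', 120.5), Row.new('rick', 40)]
all_reps = %w[john mike rick]

# Build the lookup once per month, then read it once per rep
totals_by_rep = rows.to_h { |row| [row.sales_rep_name, row.total] }
month_totals  = all_reps.map { |rep_name| totals_by_rep.fetch(rep_name, 0).to_f }

p month_totals
```

Array#to_h with a block needs Ruby 2.6 or later. Reps with no row for the month fall back to 0 through fetch.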

Divide Query Into Two Random Groups

I'm working on an AB test and need to divide a population. How should I divide something like
User.where(:condition => true) randomly into two roughly equal groups?
I'm considering iterating through the whole array and pushing onto one of two other arrays based on a random value, but this is a large query and that sounds very slow.
e.g.
array.each do |object|
if rand(2) == 0
first_group << object
else
second_group << object
end
end
To get a random ordering right from the database you can do
# MySQL
User.order('RAND()')
# PostgreSQL
User.order('RANDOM()')
A nice one-liner to split an array into two halves:
left, right = a.each_slice( (a.size/2.0).round ).to_a
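Combining a shuffle with that one-liner might look like the following sketch. It assumes a plain array of integers as a stand-in for the loaded users, and the fixed seed is only there to make the example repeatable:

```ruby
# Stand-in for the loaded records; in the question this would be the User query result
users = (1..7).to_a

# Fixed seed only so the example is repeatable; drop it for a real split
shuffled = users.shuffle(random: Random.new(42))

# Slice size is half the length, rounded up, so an odd count gives groups of 4 and 3
left, right = shuffled.each_slice((shuffled.size / 2.0).round).to_a

p left
p right
```

Two edge cases are worth guarding: an empty array makes the slice size 0, which each_slice rejects with an ArgumentError, and a single-element array leaves right as nil.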
I would write a method that returns both groups:
def randomizer(sample_size)
  initial_arr = ["obj1", "obj2", "objn"]
  # randomly pick sample_size elements for the first group
  sampler = initial_arr.sample(sample_size)
  # whatever was not picked forms the second group
  sampled_data = initial_arr - sampler
  [sampler, sampled_data]
end
Here sample_size is the number of elements to put in the first group, for example 50 or 100 depending on the size of your data.
As a basic trial, I ran
first = [:foo, :bar, :hello, :world, :ruby].sample(3)
The output would be something like [:hello, :ruby, :bar].
The second group is then [:foo, :bar, :hello, :world, :ruby] - first, which for that sample is [:foo, :world]. Subtract the stored sample rather than calling sample(3) again, since a second call draws a different random set.
This way you avoid writing an explicit loop over the array and let Ruby's built-in array methods do the work.
for additional information you can check http://www.ruby-doc.org/core-2.1.1/Array.html#method-i-sample
This is available in ruby 1.9.1 as well.
You could perform basic Array operations on the query result:
results = User.where(:condition => true)
Start by using shuffle to put the array into a random order:
array = results.shuffle
Then slice the array in to roughly equal parts:
group1 = array.slice(0...array.length/2)
group2 = array.slice(array.length/2..-1)
If order is important, sort the groups back into the initial order:
group1.sort! {|a, b| results.index(a) <=> results.index(b) }
group2.sort! {|a, b| results.index(a) <=> results.index(b) }
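Calling results.index inside the sort block rescans the array on every comparison. A sketch of an alternative, assuming plain strings as stand-ins for the records, builds a position lookup once and sorts by it:

```ruby
# Stand-in for the query result in its original order
results = %w[ann bob cat dan eve fay]

# Record each element's original position once
position = results.each_with_index.to_h

array  = results.shuffle(random: Random.new(1))
half   = array.length / 2
group1 = array[0...half]
group2 = array[half..-1]

# Restore the original relative order inside each group
group1 = group1.sort_by { |item| position[item] }
group2 = group2.sort_by { |item| position[item] }

p group1
p group2
```

The fixed seed is only for repeatability and can be dropped.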

Set first group of group_by?

Good day. I am making a selection with group_by like so:
@p = Performance.includes(place: [:category]).
order("places.title ASC").
group_by{|p| p.place.category}
If I want a specific category to come first, what do I do?
EDIT 1
In the view I iterate over the results with @p.each do |p|
The return value of group_by is just a normal hash, so you can apply sort_by on it to place your desired category first:
group_by { |p| p.place.category }.sort_by { |k,v| (k=="category name") ? "" : k }
where category name is the name of the category you want to prioritize (the empty string makes it come first in the sort results; everything else is sorted alphabetically).
This will transform the hash into an array. If you want to keep the data in hash form, wrap the result in Hash[...]:
Hash[group_by { |p| p.place.category }.sort_by { |k,v| (k=="category name") ? "" : k }]
See also this article on sorting hashes: http://www.rubyinside.com/how-to/ruby-sort-hash
UPDATE:
A slightly less processor-intensive alternative to sorting:
grouped = group_by { |p| p.place.category }
Hash[*grouped.assoc("category name")].merge(grouped.except("category name"))
There might be a simpler way to do this, but basically this prepends the key and value for "category name" to the head of the hash.
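The same prepending idea can be sketched in plain Ruby without ActiveSupport. The hash literal and the category names below are made-up stand-ins for the group_by result:

```ruby
# Stand-in for the hash returned by group_by, with category names as keys
grouped = {
  'Bars'    => ['Blue Note'],
  'Cinemas' => ['Odeon'],
  'Theatre' => ['Globe']
}
first_key = 'Cinemas'

# Hashes keep insertion order, so build a new hash with the chosen key first
reordered = { first_key => grouped[first_key] }
            .merge(grouped.reject { |key, _| key == first_key })

p reordered.keys
```

This relies on Ruby hashes preserving insertion order, which holds from Ruby 1.9 onward.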
Although I think shioyama's answer might help you, I doubt you really need the sort process. As shioyama correctly states, the return value of group_by is a hash. So why don't you just access the value you want first by using its category as the hash key?

Clean way to find ActiveRecord objects by id in the order specified

I want to obtain an array of ActiveRecord objects given an array of ids.
I assumed that
Object.find([5,2,3])
Would return an array with object 5, object 2, then object 3 in that order, but instead I get an array ordered as object 2, object 3 and then object 5.
The ActiveRecord Base find method API mentions that you shouldn't expect it in the order provided (other documentation doesn't give this warning).
One potential solution was given in Find by array of ids in the same order?, but the order option doesn't seem to be valid for SQLite.
I can write some ruby code to sort the objects myself (either somewhat simple and poorly scaling or better scaling and more complex), but is there A Better Way?
It's not that MySQL and other DBs sort things on their own, it's that they don't sort them. When you call Model.find([5, 2, 3]), the SQL generated is something like:
SELECT * FROM models WHERE models.id IN (5, 2, 3)
This doesn't specify an order, just the set of records you want returned. It turns out that generally MySQL will return the database rows in 'id' order, but there's no guarantee of this.
The only way to get the database to return records in a guaranteed order is to add an order clause. If your records will always be returned in a particular order, then you can add a sort column to the db and do Model.find([5, 2, 3], :order => 'sort_column'). If this isn't the case, you'll have to do the sorting in code:
ids = [5, 2, 3]
records = Model.find(ids)
sorted_records = ids.collect {|id| records.detect {|x| x.id == id}}
Based on my previous comment to Jeroen van Dijk you can do this more efficiently and in two lines using each_with_object
result_hash = Model.find(ids).each_with_object({}) {|result,result_hash| result_hash[result.id] = result }
ids.map {|id| result_hash[id]}
For reference here is the benchmark i used
ids = [5,3,1,4,11,13,10]
results = Model.find(ids)
Benchmark.measure do
100000.times do
result_hash = results.each_with_object({}) {|result,result_hash| result_hash[result.id] = result }
ids.map {|id| result_hash[id]}
end
end.real
#=> 4.45757484436035 seconds
Now the other one
ids = [5,3,1,4,11,13,10]
results = Model.find(ids)
Benchmark.measure do
100000.times do
ids.collect {|id| results.detect {|result| result.id == id}}
end
end.real
# => 6.10875988006592
Update
You can do this in most databases using ORDER BY with a CASE statement. Here is a class method you could use.
def self.order_by_ids(ids)
order_by = ["case"]
ids.each_with_index.map do |id, index|
order_by << "WHEN id='#{id}' THEN #{index}"
end
order_by << "end"
order(order_by.join(" "))
end
# User.where(:id => [3,2,1]).order_by_ids([3,2,1]).map(&:id)
# #=> [3,2,1]
Apparently MySQL and other database management systems sort things on their own. I think you can bypass that by doing:
ids = [5,2,3]
@things = Object.find( ids, :order => "field(id,#{ids.join(',')})" )
A portable solution would be to use an SQL CASE statement in your ORDER BY. You can use pretty much any expression in an ORDER BY and a CASE can be used as an inlined lookup table. For example, the SQL you're after would look like this:
select ...
order by
case id
when 5 then 0
when 2 then 1
when 3 then 2
end
That's pretty easy to generate with a bit of Ruby:
ids = [5, 2, 3]
order = 'case id ' + (0...ids.length).map { |i| "when #{ids[i]} then #{i}" }.join(' ') + ' end'
The above assumes that you're working with numbers or some other safe values in ids; if that's not the case then you'd want to use connection.quote or one of the ActiveRecord SQL sanitizer methods to properly quote your ids.
Then use the order string as your ordering condition:
Object.find(ids, :order => order)
or in the modern world:
Object.where(:id => ids).order(order)
This is a bit verbose but it should work the same with any SQL database and it isn't that difficult to hide the ugliness.
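One way to hide the ugliness and guard the ids at the same time, assuming the ids are meant to be integers, is a small helper that coerces each id with Integer() before it reaches the SQL text. The method name order_clause_for is hypothetical, and this is a sketch rather than a substitute for the ActiveRecord sanitizer methods:

```ruby
# Build "CASE id WHEN 5 THEN 0 ... END" from a list of ids.
# Integer() raises ArgumentError for anything that is not a clean integer,
# so strings such as "1; DROP TABLE users" never reach the SQL text.
def order_clause_for(ids)
  whens = ids.each_with_index.map do |id, index|
    "WHEN #{Integer(id)} THEN #{index}"
  end
  "CASE id #{whens.join(' ')} END"
end

puts order_clause_for([5, 2, 3])
```

The resulting string would be passed to order in the same way as the hand-built one.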
As I answered here, I just released a gem (order_as_specified) that allows you to do native SQL ordering like this:
Object.where(id: [5, 2, 3]).order_as_specified(id: [5, 2, 3])
Just tested and it works in SQLite.
Justin Weiss wrote a blog article about this problem just two days ago.
It seems to be a good approach to tell the database about the preferred order and load all records sorted in that order directly from the database. Example from his blog article:
# in config/initializers/find_by_ordered_ids.rb
module FindByOrderedIdsActiveRecordExtension
extend ActiveSupport::Concern
module ClassMethods
def find_ordered(ids)
order_clause = "CASE id "
ids.each_with_index do |id, index|
order_clause << "WHEN #{id} THEN #{index} "
end
order_clause << "ELSE #{ids.length} END"
where(id: ids).order(order_clause)
end
end
end
ActiveRecord::Base.include(FindByOrderedIdsActiveRecordExtension)
That allows you to write:
Object.find_ordered([2, 1, 3]) # => [2, 1, 3]
Here's a performant (hash-lookup, not O(n) array search as in detect!) one-liner, as a method:
def find_ordered(model, ids)
model.find(ids).map{|o| [o.id, o]}.to_h.values_at(*ids)
end
# We get:
ids = [3, 3, 2, 1, 3]
Model.find(ids).map(&:id) == [1, 2, 3]
find_ordered(Model, ids).map(&:id) == ids
Another (probably more efficient) way to do it in Ruby:
ids = [5, 2, 3]
records_by_id = Model.find(ids).inject({}) do |result, record|
result[record.id] = record
result
end
sorted_records = ids.map {|id| records_by_id[id] }
Here's the simplest thing I could come up with:
ids = [200, 107, 247, 189]
results = ModelObject.find(ids).group_by(&:id)
sorted_results = ids.map {|id| results[id].first }
@things = [5,2,3].map{|id| Object.find(id)}
This is probably the easiest way, assuming you don't have too many objects to find, since it requires a trip to the database for each id.
