How to address nested array keys after grouping in Rails? - ruby-on-rails

I have a table called sales like this:
id customer_id product_id sales
1 4 190 100
2 4 190 150
3 4 191 200
4 5 192 300
5 6 200 400
What I'd like to do is get a total of how many of each product have been bought by a customer. Therefore I perform a simple group by:
Sales.all.group(:customer_id, :product_id).sum(:sales)
What this gives me is a hash with keys consisting of arrays with the combination of customer_id and product_id:
hash = {
[4, 190] => 250,
[4, 191] => 200,
[5, 192] => 300,
[6, 200] => 400
}
While this gives me the result, I'm now also looking to get the total sum of sales for each customer. I could of course write another query, but this feels redundant. Isn't there an easy way to find all the hash array keys that start with [4, *] or something?

To get this value for a single customer, you could do the following:
hash.select{|x| x[0] == 4}.values.sum
=> 450
But you could also transform the hash to a grouped one on the customer-ids.
hash.inject(Hash.new(0)) do |result, v|
# v will look like [[4, 190], 250]
customer_id = v[0][0]
result[customer_id] += v[1]
result
end
which you could also write as a one-liner
hash.inject(Hash.new(0)){|result, v| result[v[0][0]] += v[1]; result}
=> {4=>450, 5=>300, 6=>400}
Which can be further shortened when using each_with_object instead of inject (thanks to commenter Sebastian Palma)
hash.each_with_object(Hash.new(0)){|v,result| result[v[0][0]] += v[1]}
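For reference, here is a self-contained sketch of the same idea that can be pasted into irb, using the hash from the question. The names total_for_4 and totals are only illustrative, and Enumerable#sum assumes Ruby 2.4 or newer.

```ruby
# Grouped result, as returned by the query in the question.
hash = {
  [4, 190] => 250,
  [4, 191] => 200,
  [5, 192] => 300,
  [6, 200] => 400
}

# Total for a single customer: keep the pairs whose key starts with 4.
total_for_4 = hash.select { |(customer_id, _), _| customer_id == 4 }.values.sum

# Totals for every customer: destructure the [customer_id, product_id] key
# directly in the block parameters instead of indexing with v[0][0].
totals = hash.each_with_object(Hash.new(0)) do |((customer_id, _product_id), sales), result|
  result[customer_id] += sales
end

p total_for_4 # 450
p totals      # customer 4 => 450, customer 5 => 300, customer 6 => 400
```

Destructuring in the block parameters avoids the v[0][0] indexing and makes it clearer which part of the key is being used.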

You can try the following:
hash.group_by { |k,v| k.first }.transform_values { |v| v.map(&:last).sum }
# => {4=>450, 5=>300, 6=>400}

Try this:
hash.each_with_object(Hash.new(0)) do |(ids, value), result|
result[ids.first] += value
end
#=> { 4 => 450, 5 => 300, 6 => 400 }

You could group the resulting hash based on customer id. Then sum all the values in the group:
hash = {
[4, 190] => 250,
[4, 191] => 200,
[5, 192] => 300,
[6, 200] => 400
}
hash.group_by { |entry| entry.shift.first }
.transform_values { |sums| sums.flatten.sum }
#=> {4=>450, 5=>300, 6=>400}
group_by groups the key-value pairs based on the customer id. shift removes the key from each pair (leaving only the value), while first selects the customer id (since it is the first element of the key array). This leaves you with a hash { 4 => [[250], [200]], 5 => [[300]], 6 => [[400]] }.
You then flatten the groups and sum all the values to get the sum for each customer id.

Another approach would be to group by ID.
hash.
group_by { |(k, _), _| k }.
each_with_object(Hash.new{0}) do |(k, arr), acc|
acc[k] += arr.sum(&:last)
end
#⇒ {4=>450, 5=>300, 6=>400}

Related

How to do a single-line cumulative count for hash values in Ruby?

I've got the following data set:
{
Nov 2020=>1,
Dec 2020=>2,
Jan 2021=>3,
Feb 2021=>4,
Mar 2021=>5,
Apr 2021=>6
}
Using the following code:
cumulative_count = 0
count_data = {}
data_set.each { |k, v| count_data[k] = (cumulative_count += v) }
I'm producing the following set of data:
{
Nov 2020=>1,
Dec 2020=>3,
Jan 2021=>6,
Feb 2021=>10,
Mar 2021=>15,
Apr 2021=>21
}
Even though I've got the each as a single line, I feel like there's got to be some way to do the entire thing as a one-liner. I've tried using inject with no luck.
This would do the trick:
input.each_with_object([]) { |(key, value), arr| arr << [key, arr.empty? ? value : value + arr.last[1]] }.to_h
=> {"Nov 2020"=>1, "Dec 2020"=>3, "Jan 2021"=>6, "Feb 2021"=>10, "Mar 2021"=>15, "Apr 2021"=>21}
for input defined as:
input = {
'Nov 2020' => 1,
'Dec 2020' => 2,
'Jan 2021' => 3,
'Feb 2021' => 4,
'Mar 2021' => 5,
'Apr 2021' => 6
}
The idea is to inject an array (via each_with_object) to hold the processed data, which lets us easily read the value of the previous pair and therefore accumulate the running total. At the end, we transform this array into a hash so that we have the data structure we want.
Just to add a disclaimer: Ruby hashes preserve insertion order (since Ruby 1.9), so the one-liner above relies on the pairs having been inserted in chronological order. If that is not guaranteed, a full one-liner that first sorts the pairs by date (this needs require 'date') would be the following:
input.to_a.sort_by { |pair| Date.parse(pair[0]) }.each_with_object([]) { |pair, arr| arr << [pair[0], arr.empty? ? pair[1] : pair[1] + arr.last[1]] }.to_h
=> {"Nov 2020"=>1, "Dec 2020"=>3, "Jan 2021"=>6, "Feb 2021"=>10, "Mar 2021"=>15, "Apr 2021"=>21}
In this case, we apply the same idea, but first convert the original data into an array ordered by date.
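As a minimal sketch of the sort-then-accumulate idea, the snippet below uses a deliberately shuffled copy of the data. Note two assumptions that are not in the answer above: Date.strptime with the '%b %Y' format is swapped in for Date.parse so the expected key format is explicit, and the names unordered, running and cumulative are made up for the example.

```ruby
require 'date'

# Same data as above, deliberately inserted out of chronological order.
unordered = {
  'Jan 2021' => 3,
  'Nov 2020' => 1,
  'Apr 2021' => 6,
  'Dec 2020' => 2,
  'Mar 2021' => 5,
  'Feb 2021' => 4
}

running = 0
cumulative = unordered
  .sort_by { |label, _| Date.strptime(label, '%b %Y') } # chronological order
  .map { |label, count| [label, running += count] }     # running total per month
  .to_h

p cumulative
```

The resulting hash starts at 'Nov 2020' and ends at 'Apr 2021' with a final total of 21, regardless of the order in which the pairs were inserted.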
input = {
'Nov 2020' => 1,
'Dec 2020' => 2,
'Jan 2021' => 3,
'Feb 2021' => 4,
'Mar 2021' => 5,
'Apr 2021' => 6
}
If it must be on one physical line, and semicolons are allowed:
t = 0; input.each_with_object({}) { |(k, v), a| t += v; a[k] = t }
If it must be on one physical line, and semicolons are not allowed:
input.each_with_object({ t: 0, data: {}}) { |(k, v), a| (a[:t] += v) and (a[:data][k] = a[:t]) }[:data]
But in real practice, I think it's easier to read on multiple physical lines :)
t = 0
input.each_with_object({}) { |(k, v), a|
t += v
a[k] = t
}
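If a running-total variable is acceptable anyway, Hash#transform_values (Ruby 2.4+) gives a slightly shorter variant. This is only a sketch: it relies on transform_values visiting the pairs in insertion order, which is how MRI behaves, and the names total and cumulative are illustrative.

```ruby
input = {
  'Nov 2020' => 1,
  'Dec 2020' => 2,
  'Jan 2021' => 3,
  'Feb 2021' => 4,
  'Mar 2021' => 5,
  'Apr 2021' => 6
}

total = 0
# Each value is replaced by the running total so far; keys are untouched.
cumulative = input.transform_values { |v| total += v }

p cumulative
```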
TL;DR
This is what I ultimately ended up going with:
input.each_with_object({}) { |(k, v), h| h[k] = v + h.values.last.to_i }
Hats off to Marcos Parreiras (the accepted answer) for pointing me in the direction of each_with_object and the idea to pull the last value for accumulation instead of using += on a cumulative variable initialized as 0.
Details
I ended up with 3 potential solutions (listed below): my original code plus two options utilizing each_with_object, one relying on an array and the other on a hash.
Original
cumulative_count = 0
count_data = {}
input.each { |k, v| count_data[k] = (cumulative_count += v) }
Using array
input.each_with_object([]) { |(k, v), a| a << [k, v + a.last&.last.to_i] }.to_h
Using hash
input.each_with_object({}) { |(k, v), h| h[k] = v + h.values.last.to_i }
I settled on the option using the hash because I think it's the cleanest. However, it's worth noting that it's not the most performant. Based purely on performance, the original solution is hands-down the winner. Naturally, they're all extremely fast, so in order to really see the performance difference I had to run the options a very high number of times (displayed below). But since my actual solution will only be run once at a time in Production, I decided to go for succinctness over nanoseconds of performance. :-)
Performance
Each solution was run inside of 2_000_000.times { }.
Original
#<Benchmark::Tms:0x00007fde00fb72d8 @real=2.5452079999959096, @stime=0.09558999999999962, @total=2.5108440000000005, @utime=2.415254000000001>
Using array
#<Benchmark::Tms:0x00007fde0a1f58e8 @real=7.3623509999597445, @stime=0.08986500000000053, @total=7.250730000000002, @utime=7.160865000000001>
Using hash
#<Benchmark::Tms:0x00007f9e19ca7678 @real=5.903417999972589, @stime=0.057482000000000255, @total=5.830285999999999, @utime=5.772803999999999>
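The exact harness is not shown above, so the following is only a guess at how such numbers could be produced with Benchmark.measure from the standard library, which returns a Benchmark::Tms object. The iteration count n is reduced here to keep the sketch quick, and the timings will differ from machine to machine.

```ruby
require 'benchmark'

input = {
  'Nov 2020' => 1, 'Dec 2020' => 2, 'Jan 2021' => 3,
  'Feb 2021' => 4, 'Mar 2021' => 5, 'Apr 2021' => 6
}

n = 20_000 # far fewer than the 2_000_000 runs quoted above

original = Benchmark.measure do
  n.times do
    cumulative_count = 0
    count_data = {}
    input.each { |k, v| count_data[k] = (cumulative_count += v) }
  end
end

using_hash = Benchmark.measure do
  n.times do
    input.each_with_object({}) { |(k, v), h| h[k] = v + h.values.last.to_i }
  end
end

puts "original:   #{original.real.round(3)}s"
puts "using hash: #{using_hash.real.round(3)}s"
```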
input = {
'Nov 2020' => 1,
'Dec 2020' => 2,
'Jan 2021' => 3,
'Feb 2021' => 4,
'Mar 2021' => 5,
'Apr 2021' => 6
}
If, as in the example, the values begin at 1 and each after the first is 1 greater than the previous value (recall key/value insertion order is guaranteed in hashes), the value n is to be converted to 1 + 2 +...+ n, which, being the sum of an arithmetic series, equals the following.
input.transform_values { |v| (1+v)*v/2 }
#=> {"Nov 2020"=>1, "Dec 2020"=>3, "Jan 2021"=>6, "Feb 2021"=>10,
# "Mar 2021"=>15, "Apr 2021"=>21}
Note that this does not require Hash#transform_values to process key-value pairs in any particular order.
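A quick way to see when the closed form applies is to compare it with an explicit running total. The sketch below does that for consecutive values and for values with a gap; the helper names running_totals and closed_form and the sample hashes are made up for the illustration.

```ruby
# Explicit running total, using the each_with_object approach from above.
def running_totals(hash)
  hash.each_with_object({}) { |(k, v), h| h[k] = v + h.values.last.to_i }
end

# Arithmetic-series shortcut: only valid when the values are 1, 2, 3, ...
def closed_form(hash)
  hash.transform_values { |v| (1 + v) * v / 2 }
end

consecutive = { 'Nov 2020' => 1, 'Dec 2020' => 2, 'Jan 2021' => 3 }
gappy       = { 'Nov 2020' => 1, 'Dec 2020' => 3, 'Jan 2021' => 4 }

p running_totals(consecutive) == closed_form(consecutive) # true
p running_totals(gappy) == closed_form(gappy)             # false
```

So the shortcut is safe only when the data is known to follow the 1, 2, 3, ... pattern.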

What's a good way to create a string array in Ruby based on integer variables?

The integer variables are:
toonie = 2, loonie = 1, quarter = 1, dime = 0, nickel = 1, penny = 3
I want the final output to be
"2 toonies, 1 loonie, 1 quarter, 1 nickel, 3 pennies"
Is there a way to interpolate this all from Ruby code inside [] array brackets and then add .join(", ")?
Or will I have to declare an empty array first, and then write some Ruby code to add to the array if the integer variable is greater than 0?
I would do something like this:
coins = { toonie: 2, loonie: 1, quarter: 1, dime: 0, nickel: 1, penny: 3 }
coins.map { |k, v| pluralize(v, k.to_s) if v > 0 }.compact.join(', ')
#=> "2 toonies, 1 loonie, 1 quarter, 1 nickel, 3 pennies"
Note that pluralize is an ActionView::Helpers::TextHelper method, so it is only available in views and helpers. The keys are symbols, hence the to_s: the helper expects a string.
If you want to use this outside of views, you can use String#pluralize from ActiveSupport instead, which makes the solution slightly longer:
coins.map { |k, v| "#{v} #{v == 1 ? k : k.to_s.pluralize}" if v > 0 }.compact.join(', ')
#=> "2 toonies, 1 loonie, 1 quarter, 1 nickel, 3 pennies"
This can be done in Rails:
hash = {
"toonie" => 2,
"loonie" => 1,
"quarter" => 1,
"dime" => 0,
"nickel" => 1,
"penny" => 3
}
hash.to_a.reject { |ele| ele.last.zero? }.map { |ele| "#{ele.last} #{ele.last > 1 ? ele.first.pluralize : ele.first}" }.join(", ")
Basically what you do is convert the hash to an array, which will look like this:
[["toonie", 2], ["loonie", 1], ["quarter", 1], ["dime", 0], ["nickel", 1], ["penny", 3]]
Then you drop the entries whose count is zero and map each remaining element through the block, which takes the inner array, places the numeric value from the last entry in a string, and then adds the plural or singular name based on that numeric value. Finally, join merges it all together:
=> "2 toonies, 1 loonie, 1 quarter, 1 nickel, 3 pennies"
I'm not sure what exactly you're looking for, but I would start with a hash like:
coins = {"toonie" => 2, "loonie" => 1, "quarter" => 1, "dime" => 0, "nickel" => 1, "penny" => 3}
then you can use this to print the counts
def coin_counts(coins)
(coins.keys.select { |coin| coins[coin] > 0}.map {|coin| coins[coin].to_s + " " + coin}).join(", ")
end
If you would like appropriate pluralizing, you can do the following:
include ActionView::Helpers::TextHelper
def coin_counts(coins)
(coins.keys.select { |coin| coins[coin] > 0}.map {|coin| pluralize(coins[coin], coin)}).join(", ")
end
This is just for fun and should not be used in production, but you can achieve it like this:
def run
toonie = 2
loonie = 1
quarter = 1
dime = 0
nickel = 1
penny = 3
Kernel::local_variables.each_with_object([]) { |var, array|
next if eval(var.to_s).to_i.zero?
array << "#{eval(var.to_s)} #{var}"
}.join(', ')
end
run # returns "2 toonie, 1 loonie, 1 quarter, 1 nickel, 3 penny"
The above does not implement the pluralization requirement because it really depends if you will have irregular plural nouns or whatever.
I would go with a hash solution as described in the other answers
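Outside Rails, where neither pluralize helper is available, a plain-Ruby version is possible with a small hand-written lookup for irregular plurals. This is only a sketch: the coin_label helper and its irregular table are hypothetical, and the fallback of appending "s" is an assumption that happens to fit these coin names.

```ruby
# Hypothetical helper: returns "1 loonie", "2 toonies", "3 pennies", ...
# Irregular plurals come from the lookup; anything else just gets an "s".
def coin_label(count, name, irregular = { 'penny' => 'pennies' })
  return "#{count} #{name}" if count == 1

  "#{count} #{irregular.fetch(name, "#{name}s")}"
end

coins = {
  'toonie' => 2, 'loonie' => 1, 'quarter' => 1,
  'dime' => 0, 'nickel' => 1, 'penny' => 3
}

summary = coins
  .reject { |_, count| count.zero? }              # skip coins we have none of
  .map { |name, count| coin_label(count, name) }
  .join(', ')

puts summary # 2 toonies, 1 loonie, 1 quarter, 1 nickel, 3 pennies
```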

Hash and frequency of a value

I have the following hash in Ruby :
{
0 => {
:method=> "POST",
:path=> "/api/customer/191023",
:host=> "host.8",
:duration=> "1221"
},
1 => {
:method=> "GET",
:path=> "/api/customer/191023",
:host=> "host.8",
:duration=> "99"
},
2 => {
:method=> "POST",
:path=> "/api/customer/191023",
:host=> "host.10",
:duration=> "142"
},
3 => {
:method=> "POST",
:path=> "/api/customer/191023",
:host=> "host.8",
:duration=> "243"
},
4 => {
:method=> "POST",
:path=> "/api/customer/191023",
:host=> "host.10",
:duration=> "132"
}
}
I would like to do a simple search within these hashes to find the host with the highest frequency. For example, in the previous example, I should get host.8.
Thank you for your help,
M.
To find the host value with the highest frequency, do:
hs = hash.values.group_by { |h| h[:host] =~ /host\.(\d+)/ && $1.to_i || 0 }.to_a
hs.reduce([-1,0]) { |sum,v| v[1].size > sum[1] && [ v[0], v[1].size ] || sum }.first
Description: [-1,0] is the initial value passed to #reduce, where -1 stands for the host number (as in host.number) and 0 is the count for that number. Whenever reduce encounters a host number whose group is larger than the count in the passed sum, it replaces sum with the new [number, count] pair for the next iteration.
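On Ruby 2.7 or newer, Enumerable#tally does the counting directly, which may be easier to read. A minimal sketch, assuming the data from the question trimmed to the relevant keys (the names requests, counts, busiest_host and hits are illustrative):

```ruby
requests = {
  0 => { method: 'POST', host: 'host.8' },
  1 => { method: 'GET',  host: 'host.8' },
  2 => { method: 'POST', host: 'host.10' },
  3 => { method: 'POST', host: 'host.8' },
  4 => { method: 'POST', host: 'host.10' }
}

# Count how often each host occurs, then pick the most frequent one.
counts = requests.values.map { |request| request[:host] }.tally
busiest_host, hits = counts.max_by { |_, count| count }

puts "#{busiest_host} (#{hits} requests)" # host.8 (3 requests)
```

This returns the host string itself rather than the extracted number, so it also works when host names do not follow the host.number pattern.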
Here's one way to do that.
Code
def max_host(hash)
hash.each_with_object(Hash.new(0)) { |(_,v),h| h[v[:host]] += 1 }
.max_by { |_,v| v }
.first
end
Example
Let's take the simplified example below. Note that I've changed, for example, :host = \"host.10\" to :host = "host.10", as the former is not correct syntax. You could write the string as '\"host.10\"' (=> "\\\"host.10\\\""), but I assume you simply want "host.10". The code is the same for both.
hash = {
0 => {
:method=>"POST",
:host =>"host.8"
},
1 => {
:method=>"GET",
:host =>"host.10"
},
2 => {
:method=>"POST",
:host =>"host.10"
}
}
max_host(hash)
#=> "host.10"
Explanation
For the example hash above,
enum = hash.each_with_object(Hash.new(0))
#=> #<Enumerator: {
# 0=>{:method=>"POST", :host=>"host.8"},
# 1=>{:method=>"GET", :host=>"host.10"},
# 2=>{:method=>"POST", :host=>"host.10"}}:each_with_object({})>
The enumerator will invoke the method Hash#each to pass each element of the enumerator into the block. We can see what those elements are by converting the enumerator to an array:
enum.to_a
#=> [[[0, {:method=>"POST", :host=>"host.8"}], {}],
# [[1, {:method=>"GET", :host=>"host.10"}], {}],
# [[2, {:method=>"POST", :host=>"host.10"}], {}]]
The empty hash shown in the first element is the initial value of the hash created by
Hash.new(0)
This creates a hash h with a default value of zero. By doing it this way, if h does not have a key k, h[k] will return the default value (0), but (important!) this does not change the hash.
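The default-value behaviour can be checked in isolation. In this short sketch the name counts is only illustrative:

```ruby
counts = Hash.new(0)

missing = counts['host.8']            # reads the default value, 0
still_absent = !counts.key?('host.8') # true: reading did not add the key

counts['host.8'] += 1                 # counts['host.8'] = counts['host.8'] + 1
now_present = counts.key?('host.8')   # true: the assignment stored the key

p missing, still_absent, now_present, counts['host.8']
```

Only the assignment hidden inside += stores the key; the lookup alone never does.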
The first value passed into the block is
[[0, {:method=>"POST", :host=>"host.8"}], {}]
This is then decomposed (or "disambiguated") into individual objects that are assigned to three block variables:
k => 0
v => {:method=>"POST", :host=>"host.8"}
h => Hash.new(0)
We then execute:
h[v[:host]] += 1
which is
h["host.8"] += 1
which is shorthand for
h["host.8"] = h["host.8"] + 1
[Aside: you may have noticed that in the code I show the block variables as |(_,v),h|, whereas above I refer to them as |(k,v),h|. I could have used the latter, but since k is not referenced in the block, I've chosen to replace it with a "placeholder" _. This ensures k won't be referenced and also tells any readers that I'm not using what would be the first block variable.]
As h does not have a key "host.8", h["host.8"] to the right of = returns the default value:
h["host.8"] = 0 + 1
#=> 1
so now
h #=> {"host.8"=>1}
The second element passed into the block is
[[1, {:method=>"GET", :host=>"host.10"}], {"host.8"=>1}]
so the block variables become:
v => {:method=>"GET", :host=>"host.10"}
h => {"host.8"=>1}
Notice that the hash h has been updated. We execute
h[v[:host]] += 1
#=> h["host.10"] += 1
#=> h["host.10"] = h["host.10"] + 1
#=> h["host.10"] = 0 + 1
#=> 1
so now
h #=> {"host.8"=>1, "host.10"=>1}
Lastly, the block variables are assigned the values
v => {:method=>"POST", :host=>"host.10"}
h => {"host.8"=>1, "host.10"=>1}
so
h[v[:host]] += 1
#=> h["host.10"] += 1
#=> h["host.10"] = h["host.10"] + 1
#=> h["host.10"] = 1 + 1
#=> 2
h #=> {"host.8"=>1, "host.10"=>2}
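and this final value of h is returned by each_with_object. max_by { |_,v| v } then picks the key-value pair with the largest count, ["host.10", 2], and first extracts the host, "host.10", which is what the method returns.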

Creating a range from one column

I have a column called "Marks" which contains values like
Marks = [100,200,150,157,....]
I need to assign Grades to those marks using the following key
<25=0, <75=1, <125=2, <250=3, <500=4, >500=5
If Marks < 25, then Grade = 0, if marks < 75 then grade = 1.
I can sort the results and find the first record that matches using Ruby's find function. Is that the best method? Or is there a way to prepare a range from the key by adding Lower Limit and Upper Limit columns to the table and populating them from the key? Marks can have decimals too, e.g. 99.99.
Without using Rails, you could do it like this:
marks = [100, 200, 150, 157, 692, 12]
marks_to_grade = { 25=>0, 75=>1, 125=>2, 250=>3, 500=>4, Float::INFINITY=>5 }
Hash[marks.map { |m| [m, marks_to_grade.find { |k,_| m <= k }.last] }]
#=> {100=>2, 200=>3, 150=>3, 157=>3, 692=>5, 12=>0}
With Ruby 2.1+, you could write this:
marks.map { |m| [m, marks_to_grade.find { |k,_| m <= k }.last] }.to_h
Here's what's happening:
Enumerable#map (a.k.a collect) converts each mark m to an array [m, g], where g is the grade computed for that mark. For example, when map passes the first element of marks into its block, we have:
m = 100
a = marks_to_grade.find { |k,_| m <= k }
#=> marks_to_grade.find { |k,_| 100 <= k }
#=> [125, 2]
a.last
#=> 2
so the mark 100 is mapped to [100, 2]. (I've replaced the block variable for the value of the key-value pair with the placeholder _ to draw attention to the fact that the value is not being used in the calculation within the block. One could also use, say, _v as the placeholder.) The remaining marks are similarly mapped, resulting in:
b = marks.map { |m| [m, marks_to_grade.find { |k,_| m <= k }.last] }
#=> [[100, 2], [200, 3], [150, 3], [157, 3], [692, 5], [12, 0]]
Lastly
Hash[b]
#=> {100=>2, 200=>3, 150=>3, 157=>3, 692=>5, 12=>0}
or, for Ruby 2.1+
b.to_h
#=> {100=>2, 200=>3, 150=>3, 157=>3, 692=>5, 12=>0}
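If the boundaries should follow the question's key literally (strictly less than each limit), a variant of the same lookup could look like the sketch below. The helper grade_for and its bounds list are made up for the illustration, and treating exactly 500 as grade 5 is an assumption, since the key only defines <500 and >500.

```ruby
# Hypothetical helper: bounds are [exclusive upper limit, grade] pairs in
# ascending order; anything at or above the last limit gets grade 5.
def grade_for(mark, bounds = [[25, 0], [75, 1], [125, 2], [250, 3], [500, 4]])
  match = bounds.find { |upper, _| mark < upper }
  match ? match.last : 5
end

marks = [100, 200, 150, 157, 692, 12, 99.99, 25]
grades = marks.map { |m| [m, grade_for(m)] }.to_h

p grades
```

Here 99.99 falls below 125 and gets grade 2, while 25 is no longer below 25 and therefore gets grade 1.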
You can make use of update_all:
Student.where(:mark => 0...25).update_all(grade: 0)
Student.where(:mark => 25...75).update_all(grade: 1)
Student.where(:mark => 75...125).update_all(grade: 2)
Student.where(:mark => 125...250).update_all(grade: 3)
Student.where(:mark => 250...500).update_all(grade: 4)
Student.where("mark >= ?", 500).update_all(grade: 5)

Splitting/Slicing array in ruby

I've found these two questions, which are similar to the one I'm about to ask:
Split array up into n-groups of m size?
and
Need to split arrays to sub arrays of specified size in Ruby
This splits array into three arrays with each array having three elements :
a.each_slice(3) do |x,y,z|
p [x,y,z]
end
So if I do this (my array size is 1000) :
a.each_slice(200) do |a,b,c,d,e|
p "#{a} #{b} #{c} #{d} #{e}"
end
This should split my array into 5 arrays each having 200 members? But it doesn't?
What I actually need to do is to put 200 random elements into 5 arrays, am I on the right track here, how can I do this?
Enumerable#each_slice
If you provide a single argument to the block of each_slice(n), it will be filled with an array of at most n values. On the last iteration, if there are fewer than n values left, the array will hold whatever is left.
If you provide multiple arguments to the block of each_slice, they are filled with the values from each slice. If the slice size is greater than the number of arguments given, the extra values are ignored. If it is less than the number of arguments, the excess arguments will be nil.
a = (1..9).to_a
a.each_slice(3) {|b| puts b.inspect }
[1,2,3]
[4,5,6]
[7,8,9]
a.each_slice(4) {|b| puts b.inspect }
[1,2,3,4]
[5,6,7,8]
[9]
a.each_slice(3) {|b,c,d| puts (b + c + d)}
6 # 1 + 2 + 3
15 # 4 + 5 + 6
24 # 7 + 8 + 9
a.each_slice(3) {|b,c| puts (b + c)}
3 # 1 + 2, skips 3
9 # 4 + 5, skips 6
15 # 7 + 8, skips 9
a.each_slice(2) {|b,c| puts c.inspect}
2
4
6
8
nil
a.each_slice(3) {|b,c,d,e| puts e.inspect}
nil
nil
nil
irb(main):001:0> a= (1..10).to_a
=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
irb(main):002:0> a.sample(3)
=> [5, 10, 1]
irb(main):003:0> (1..3).map{ a.sample(3) }
=> [[6, 2, 5], [8, 7, 3], [4, 5, 7]]
irb(main):004:0>
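Note that separate sample calls can pick the same element more than once across groups (5 and 7 both appear twice above). If the five groups must not overlap, one option is to shuffle once and then slice. A minimal sketch, assuming a 1000-element array as in the question:

```ruby
a = (1..1000).to_a

# Shuffle once, then cut the shuffled array into consecutive chunks of 200.
# Every element ends up in exactly one of the five groups.
groups = a.shuffle.each_slice(200).to_a

p groups.size       # 5
p groups.map(&:size) # [200, 200, 200, 200, 200]
```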
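Actually, your block prints a string containing only the first five elements of each 200-element slice.
You can try something like this:
a1, a2, a3, a4, a5 = Array.new(5) { [] }
a.each_slice(5) do |v, w, x, y, z|
  # distribute the five elements of each slice over the five arrays
  a1 << v
  a2 << w
  a3 << x
  a4 << y
  a5 << z
end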
You will end up with five arrays containing 200 elements each.
I used the simplest possible syntax to make it clear, you can
make it much more condensed.
If you want to assign that result to 5 different arrays, you could use the splat operator, like this:
a,b,c,d,e = *(1..1000).each_slice(200).to_a

Resources