array of hashes sort by - ruby-on-rails

I have a big array of hashes:
array = [
{color: '5 absolute', ... },
{color: '5.0', ... },
{color: '5.1', ... },
{color: 'last', ... },
{color: '50', ... },
{color: '5 elite', ... },
{color: 'edge'}
]
I need the colors to be ordered like this:
5 absolute
5 elite
5.0
5.1
50
edge
last
The priority is:
first going spaces ' ',
then dots '.',
then digits '7',
then other 'string'
This is analogous to an SQL/ActiveRecord query, but I don't want such a complicated query running in the background. I just want this ordering logic. How can I do this with an AR query?

You could always just sort the array of hashes.
array.map{|h| h[:color]}.sort
=> ["5 absolute", "5 elite", "5.0", "5.1", "50", "edge", "last"]
The following first sorts by number and then by the string after the number.
array = [{color: '5 absolute'}, {color: '5.0'}, {color: '5.1'},
{color: 'last'}, {color: '50'}, {color: '5 elite'},
{color: 'edge'}, {color: '6 absolute'}, {color: '7'}]
array.map{|h| h[:color]}.sort_by do |s|
n = s.to_f
if n == 0 && s.match(/\d/).nil?
n = Float::INFINITY
end
[n, s.split(" ")[-1]]
end
=> ["5.0", "5 absolute", "5 elite", "5.1", "6 absolute", "7", "50", "edge", "last"]
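Both snippets above return only the color strings. If the goal is to keep the hashes themselves in sorted order, a minimal sketch is to pass the key to sort_by. The sample data is abbreviated, and the :id key is made up here purely to show that each hash travels with its color.

```ruby
# Sample data; the :id key is hypothetical and not part of the question.
array = [
  { color: '5.0', id: 1 },
  { color: 'last', id: 2 },
  { color: '5 absolute', id: 3 },
  { color: '50', id: 4 }
]

# Plain string comparison of the :color values; returns a new array of hashes
# and leaves the original array untouched.
sorted = array.sort_by { |h| h[:color] }

sorted.each { |h| puts h.inspect }
```

Use sort_by! instead if the array should be reordered in place.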

so like this?
h = [{:color=>"5 absolute"},
{:color=>"5.0"},
{:color=>"5.1"},
{:color=>"last"},
{:color=>"50"},
{:color=>"5 elite"},
{:color=>"edge"}]
h.map(&:values).flatten.sort
# => ["5 absolute", "5 elite", "5.0", "5.1", "50", "edge", "last"]
or all the other answers...

From your question it is very hard to tell what you want, especially since the order you ask for is exactly the one a normal sort would produce.
In any case, here is a way of creating a "custom sort" order the way you wanted. The difference between this and a regular sort is that this sort can make certain types of characters, or sets of characters, trump others.
array = [
{color: '5 absolute'},
{color: '5.0'},
{color: '50 hello'},
{color: 'edge'}
]
p array.sort_by{|x| x[:color]} #=> [{:color=>"5 absolute"}, {:color=>"5.0"}, {:color=>"50 hello"}, {:color=>"edge"}]
# '50 hello' is after '5.0' as . is smaller than 0.
Solving this problem is a bit tricky, here is how I would do it:
# Create a custom sort order using regexp:
# [spaces, dots, digits, words, line_endings]
order = [/\s+/,/\./,/\d+/,/\w+/,/$/]
# Create a union to use in a scan:
regex_union = Regexp.union(*order)
# Create a function that maps the capture in the scan to the index in the custom sort order:
custom_sort_order = ->x{
x[:color].scan(regex_union).map{|x| [order.index{|y|x=~y}, x]}.transpose
}
#Sort:
p array.sort_by{|x| custom_sort_order[x]}
# => [{:color=>"5 absolute"}, {:color=>"50 hello"}, {:color=>"5.0"}, {:color=>"edge"}]
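As a check, the same technique can be run against the seven colors from the question. This is only a sketch: the lambda name color_sort_key is made up here, and it has not been tried on tokens that mix letters and digits (such as "a1"), which the index lookup would classify as digits.

```ruby
# Priority classes, lowest index sorts first:
# [spaces, dots, digits, words, end of string]
order = [/\s+/, /\./, /\d+/, /\w+/, /$/]
tokens = Regexp.union(*order)

# Hypothetical helper: maps a color string to [class indexes, tokens],
# e.g. "5.0" becomes [[2, 1, 2, 4], ["5", ".", "0", ""]].
color_sort_key = lambda do |color|
  color.scan(tokens).map { |tok| [order.index { |re| tok =~ re }, tok] }.transpose
end

colors = ['5 absolute', '5.0', '5.1', 'last', '50', '5 elite', 'edge']
sorted = colors.sort_by(&color_sort_key)
p sorted
```

With this input the result is the order requested in the question: spaces first, then dots, then digits, then plain words.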

Related

How to do a single-line cumulative count for hash values in Ruby?

I've got the following data set:
{
Nov 2020=>1,
Dec 2020=>2,
Jan 2021=>3,
Feb 2021=>4,
Mar 2021=>5,
Apr 2021=>6
}
Using the following code:
cumulative_count = 0
count_data = {}
data_set.each { |k, v| count_data[k] = (cumulative_count += v) }
I'm producing the following set of data:
{
Nov 2020=>1,
Dec 2020=>3,
Jan 2021=>6,
Feb 2021=>10,
Mar 2021=>15,
Apr 2021=>21
}
Even though I've got the each as a single line, I feel like there's got to be some way to do the entire thing as a one-liner. I've tried using inject with no luck.
This would do the trick:
input.each_with_object([]) { |(key, value), arr| arr << [key, arr.empty? ? value : value + arr.last[1]] }.to_h
=> {"Nov 2020"=>1, "Dec 2020"=>3, "Jan 2021"=>6, "Feb 2021"=>10, "Mar 2021"=>15, "Apr 2021"=>21}
for input defined as:
input = {
'Nov 2020' => 1,
'Dec 2020' => 2,
'Jan 2021' => 3,
'Feb 2021' => 4,
'Mar 2021' => 5,
'Apr 2021' => 6
}
The idea is to inject an array (via each_with_object) that keeps the processed data and lets us easily read the value of the previous pair, which is what allows us to accumulate. At the end, we transform this array into a hash so that we have the data structure we want.
Just to add a disclaimer: Ruby hashes have preserved insertion order since Ruby 1.9, so the one-liner above gives the right result only if the pairs were inserted in chronological order. If that cannot be assumed, a full one-liner that first sorts by date (it needs require 'date') would be the following:
input.to_a.sort_by { |pair| Date.parse(pair[0]) }.each_with_object([]) { |pair, arr| arr << [pair[0], arr.empty? ? pair[1] : pair[1] + arr.last[1]] }.to_h
=> {"Nov 2020"=>1, "Dec 2020"=>3, "Jan 2021"=>6, "Feb 2021"=>10, "Mar 2021"=>15, "Apr 2021"=>21}
In this case, we apply the same idea, but first converting the original data into an ordered array by date.
input = {
'Nov 2020' => 1,
'Dec 2020' => 2,
'Jan 2021' => 3,
'Feb 2021' => 4,
'Mar 2021' => 5,
'Apr 2021' => 6
}
If it must be on one physical line, and semicolons are allowed:
t = 0; input.each_with_object({}) { |(k, v), a| t += v; a[k] = t }
If it must be on one physical line, and semicolons are not allowed:
input.each_with_object({ t: 0, data: {}}) { |(k, v), a| (a[:t] += v) and (a[:data][k] = a[:t]) }[:data]
But in real practice, I think it's easier to read on multiple physical lines :)
t = 0
input.each_with_object({}) { |(k, v), a|
t += v
a[k] = t
}
TL;DR
This is what I ultimately ended up going with:
input.each_with_object({}) { |(k, v), h| h[k] = v + h.values.last.to_i }
Hats off to Marcos Parreiras (the accepted answer) for pointing me in the direction of each_with_object and the idea to pull the last value for accumulation instead of using += on a cumulative variable initialized as 0.
Details
I ended up with 3 potential solutions (listed below): my original code plus two options using each_with_object, one relying on an array and the other on a hash.
Original
cumulative_count = 0
count_data = {}
input.each { |k, v| count_data[k] = (cumulative_count += v) }
Using array
input.each_with_object([]) { |(k, v), a| a << [k, v + a.last&.last.to_i] }.to_h
Using hash
input.each_with_object({}) { |(k, v), h| h[k] = v + h.values.last.to_i }
I settled on the option using the hash because I think it's the cleanest. However, it's worth noting that it's not the most performant. Based purely on performance, the original solution is hands-down the winner. Naturally, they're all extremely fast, so in order to really see the performance difference I had to run the options a very high number of times (displayed below). But since my actual solution will only be run once at a time in Production, I decided to go for succinctness over nanoseconds of performance. :-)
Performance
Each solution was run inside of 2_000_000.times { }.
Original
#<Benchmark::Tms:0x00007fde00fb72d8 @real=2.5452079999959096, @stime=0.09558999999999962, @total=2.5108440000000005, @utime=2.415254000000001>
Using array
#<Benchmark::Tms:0x00007fde0a1f58e8 @real=7.3623509999597445, @stime=0.08986500000000053, @total=7.250730000000002, @utime=7.160865000000001>
Using hash
#<Benchmark::Tms:0x00007f9e19ca7678 @real=5.903417999972589, @stime=0.057482000000000255, @total=5.830285999999999, @utime=5.772803999999999>
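The post does not show the harness that produced these timings. A rough sketch of how such a comparison could be set up with Ruby's standard Benchmark module follows; the iteration count n and the candidates hash are assumptions made for illustration, and n is far lower than the 2_000_000 quoted above so that it finishes quickly.

```ruby
require 'benchmark'

input = {
  'Nov 2020' => 1, 'Dec 2020' => 2, 'Jan 2021' => 3,
  'Feb 2021' => 4, 'Mar 2021' => 5, 'Apr 2021' => 6
}

# Assumed iteration count; raise it to get more stable numbers.
n = 20_000

# The three solutions, each wrapped in a lambda that returns the result hash.
candidates = {
  'original' => lambda {
    cumulative_count = 0
    count_data = {}
    input.each { |k, v| count_data[k] = (cumulative_count += v) }
    count_data
  },
  'array' => -> { input.each_with_object([]) { |(k, v), a| a << [k, v + a.last&.last.to_i] }.to_h },
  'hash' => -> { input.each_with_object({}) { |(k, v), h| h[k] = v + h.values.last.to_i } }
}

# Wall-clock seconds for n runs of each candidate.
timings = candidates.transform_values { |fn| Benchmark.realtime { n.times { fn.call } } }
timings.each { |name, secs| puts format('%-8s %.4fs', name, secs) }
```

Absolute numbers will differ by machine and Ruby version, so only the relative ordering is worth reading from a run like this.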
input = {
'Nov 2020' => 1,
'Dec 2020' => 2,
'Jan 2021' => 3,
'Feb 2021' => 4,
'Mar 2021' => 5,
'Apr 2021' => 6
}
If, as in the example, the values begin at 1 and each after the first is 1 greater than the previous value (recall key/value insertion order is guaranteed in hashes), the value n is to be converted to 1 + 2 +...+ n, which, being the sum of an arithmetic series, equals the following.
input.transform_values { |v| (1+v)*v/2 }
#=> {"Nov 2020"=>1, "Dec 2020"=>3, "Jan 2021"=>6, "Feb 2021"=>10,
# "Mar 2021"=>15, "Apr 2021"=>21}
Note that this does not require Hash#transform_values to process key-value pairs in any particular order.
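The closed form applies only when the values are exactly 1, 2, ..., n. For arbitrary values, a sketch that keeps a running total inside transform_values could look like this; the sample values are made up, and unlike the formula above it does rely on the pairs being visited in insertion order.

```ruby
# Made-up values that are not an arithmetic series.
input = { 'Nov 2020' => 4, 'Dec 2020' => 0, 'Jan 2021' => 7, 'Feb 2021' => 2 }

running = 0
# Each value is replaced by the total accumulated so far (keys are unchanged).
cumulative = input.transform_values { |v| running += v }
p cumulative
```

The original hash is left unmodified because transform_values returns a new hash.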

How to address nested array keys after grouping in Rails?

I have table called sales like this:
id customer_id product_id sales
1 4 190 100
2 4 190 150
3 4 191 200
4 5 192 300
5 6 200 400
What I'd like to do is get a total of how many of each product have been bought by a customer. Therefore I perform a simple group by:
Sales.all.group(:customer_id, :product_id).sum(:sales)
What this gives me is a hash with keys consisting of arrays with the combination of customer_id and product_id:
hash = {
[4, 190] => 250,
[4, 191] => 200,
[5, 192] => 300,
[6, 200] => 400
}
While this gives me the result, I'm now also looking to get the total sum of sales for each customer. I could of course write another query, but this feels redundant. Isn't there an easy way to find all the hash keys that start with [4, *] or something?
To get this value for a single customer, you could do the following:
hash.select{|x| x[0] == 4}.values.sum
=> 450
But you could also transform the hash to a grouped one on the customer-ids.
hash.inject(Hash.new(0)) do |result, v|
# v will look like [[4, 190], 250]
customer_id = v[0][0]
result[customer_id] += v[1]
result
end
which you could also write as a one-liner
hash.inject(Hash.new(0)){|result, v| result[v[0][0]] += v[1]; result}
=> {4=>450, 5=>300, 6=>400}
Which can be further shortened when using each_with_object instead of inject (thanks to commenter Sebastian Palma)
hash.each_with_object(Hash.new(0)){|v,result| result[v[0][0]] += v[1]}
You can try the following:
hash.group_by { |k,v| k.first }.transform_values { |v| v.map(&:last).sum }
# => {4=>450, 5=>300, 6=>400}
Try this:
hash.each_with_object(Hash.new(0)) do |(ids, value), result|
result[ids.first] += value
end
#=> { 4 => 450, 5 => 300, 6 => 400 }
You could group the resulting hash based on customer id. Then sum all the values in the group:
hash = {
[4, 190] => 250,
[4, 191] => 200,
[5, 192] => 300,
[6, 200] => 400
}
hash.group_by { |entry| entry.shift.first }
.transform_values { |sums| sums.flatten.sum }
#=> {4=>450, 5=>300, 6=>400}
group_by groups the key-value pairs based on the customer id. shift removes the key (leaving only the value), while first selects the customer id (since it is the first element in the key array). This leaves you with a hash { 4 => [[250], [200]], 5 => [[300]], 6 => [[400]] }.
You then flatten the groups and sum all the values to get the sum for each customer id.
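The same two steps can also be written without shift, so the pairs yielded by group_by are not modified. This is a sketch of that variant; the block parameter names are chosen here for readability.

```ruby
hash = {
  [4, 190] => 250,
  [4, 191] => 200,
  [5, 192] => 300,
  [6, 200] => 400
}

# Step 1: group the [key, value] pairs by the first element of the key.
grouped = hash.group_by { |(customer_id, _product_id), _sales| customer_id }

# Step 2: within each group, add up the value part of every pair.
totals = grouped.transform_values { |pairs| pairs.sum { |_ids, sales| sales } }
p totals
```

Here each group still holds the full [[customer_id, product_id], sales] pairs, which can be handy if the per-product breakdown is needed later.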
Another approach would be to group by ID.
hash.
group_by { |(k, _), _| k }.
each_with_object(Hash.new{0}) do |(k, arr), acc|
acc[k] += arr.sum(&:last)
end
#⇒ {4=>450, 5=>300, 6=>400}

ruby compare two arrays of hash, with certain keys [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 7 years ago.
There are two arrays of hashes and I want to remove the 'common' elements from the two arrays, based on certain keys. For example:
array1 = [{a: '1', b:'2', c:'3'}, {a: '4', b: '5', c:'6'}]
array2 = [{a: '1', b:'2', c:'10'}, {a: '3', b: '5', c:'6'}]
and the criteria keys are a and b. So when I compute something like
array1-array2 (I don't have to override '-' if there's a better approach)
I expect to get
[{a: '4', b: '5', c:'6'}]
since a and b are the comparison criteria. The second element is kept because the value for a differs between array1.last and array2.last.
As I understand, you are given two arrays of hashes and a set of keys. You want to reject all elements (hashes) of the first array whose values match the values of any element (hash) of the second array, for all specified keys. You can do that as follows.
Code
require 'set'
def reject_partial_dups(array1, array2, keys)
set2 = array2.each_with_object(Set.new) do |h,s|
s << h.values_at(*keys) if (keys-h.keys).empty?
end
array1.reject do |h|
(keys-h.keys).empty? && set2.include?(h.values_at(*keys))
end
end
The line:
(keys-h.keys).empty? && set2.include?(h.values_at(*keys))
can be simplified to:
set2.include?(h.values_at(*keys))
if none of the values of keys in the elements (hashes) of array1 are nil. I created a set (rather than an array) from array2 in order to speed the lookup of h.values_at(*keys) in that line.
Example
keys = [:a, :b]
array1 = [{a: '1', b:'2', c:'3'}, {a: '4', b: '5', c:'6'}, {a: 1, c: 4}]
array2 = [{a: '1', b:'2', c:'10'}, {a: '3', b: '5', c:'6'}]
reject_partial_dups(array1, array2, keys)
#=> [{:a=>"4", :b=>"5", :c=>"6"}, {:a=>1, :c=>4}]
Explanation
First create set2
e0 = array2.each_with_object(Set.new)
#=> #<Enumerator: [{:a=>"1", :b=>"2", :c=>"10"}, {:a=>"3", :b=>"5", :c=>"6"}]
# #:each_with_object(#<Set: {}>)>
Pass the first element of e0 and perform the block calculation.
h,s = e0.next
#=> [{:a=>"1", :b=>"2", :c=>"10"}, #<Set: {}>]
h #=> {:a=>"1", :b=>"2", :c=>"10"}
s #=> #<Set: {}>
(keys-h.keys).empty?
#=> ([:a,:b]-[:a,:b,:c]).empty? => [].empty? => true
so compute:
s << h.values_at(*keys)
#=> s << {:a=>"1", :b=>"2", :c=>"10"}.values_at(*[:a,:b])
#=> s << ["1","2"] => #<Set: {["1", "2"]}>
Pass the second (last) element of e0 to the block:
h,s = e0.next
#=> [{:a=>"3", :b=>"5", :c=>"6"}, #<Set: {["1", "2"]}>]
(keys-h.keys).empty?
#=> true
so compute:
s << h.values_at(*keys)
#=> #<Set: {["1", "2"], ["3", "5"]}>
set2
#=> #<Set: {["1", "2"], ["3", "5"]}>
Reject elements from array1
We now iterate through array1, rejecting elements for which the block evaluates to true.
e1 = array1.reject
#=> #<Enumerator: [{:a=>"1", :b=>"2", :c=>"3"},
# {:a=>"4", :b=>"5", :c=>"6"}, {:a=>1, :c=>4}]:reject>
The first element of e1 is passed to the block:
h = e1.next
#=> {:a=>"1", :b=>"2", :c=>"3"}
a = (keys-h.keys).empty?
#=> ([:a,:b]-[:a,:b,:c]).empty? => true
b = set2.include?(h.values_at(*keys))
#=> set2.include?(["1","2"]) => true
a && b
#=> true
so the first element of e1 is rejected. Next:
h = e1.next
#=> {:a=>"4", :b=>"5", :c=>"6"}
a = (keys-h.keys).empty?
#=> true
b = set2.include?(h.values_at(*keys))
#=> set2.include?(["4","5"]) => false
a && b
#=> false
so the second element of e1 is not rejected. Lastly:
h = e1.next
#=> {:a=>1, :c=>4}
a = (keys-h.keys).empty?
#=> ([:a,:b]-[:a,:c]).empty? => [:b].empty? => false
so the block returns false (meaning the last element of e1 is not rejected), as there is no need to compute:
b = set2.include?(h.values_at(*keys))
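As a compact variant of reject_partial_dups, the comparison can be done on sub-hashes taken with Hash#slice (available from Ruby 2.5). This is a sketch under that assumption; the method name reject_matching_on is made up here. Sub-hashes from array2 that lack one of the keys are discarded, so hashes with missing keys are never rejected, as in the method above.

```ruby
require 'set'

# Hypothetical helper: drops every hash in array1 whose values for all of
# `keys` equal those of some hash in array2.
def reject_matching_on(array1, array2, keys)
  seen = array2.map { |h| h.slice(*keys) }
               .select { |part| part.size == keys.size } # ignore hashes missing a key
               .to_set
  array1.reject { |h| seen.include?(h.slice(*keys)) }
end

keys = [:a, :b]
array1 = [{ a: '1', b: '2', c: '3' }, { a: '4', b: '5', c: '6' }, { a: 1, c: 4 }]
array2 = [{ a: '1', b: '2', c: '10' }, { a: '3', b: '5', c: '6' }]

result = reject_matching_on(array1, array2, keys)
p result
```

Hashes compare by content, so they can be stored in a Set and looked up just like the value arrays used earlier.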
So you really should try this out yourself because I am basically solving it for you.
The general approach would be:
For every item in array1
Check whether the corresponding item in array2 has any keys and values that match
If it does, delete them
You would probably end up with something like array1.each_with_index { |h, i| h.delete_if {|k,v| array2[i].has_key?(k) && array2[i][k] == v } }

Creating a range from one column

I have a column called "Marks" which contains values like
Marks = [100,200,150,157,....]
I need to assign Grades to those marks using the following key
<25=0, <75=1, <125=2, <250=3, <500=4, >500=5
If Marks < 25, then Grade = 0, if marks < 75 then grade = 1.
I can sort the results and find the first matching record using Ruby's find method. Is that the best approach? Or is there a way to prepare ranges from the key by adding Lower Limit and Upper Limit columns to the table and populating them? Marks can have decimals too, e.g. 99.99.
Without using Rails, you could do it like this:
marks = [100, 200, 150, 157, 692, 12]
marks_to_grade = { 25=>0, 75=>1, 125=>2, 250=>3, 500=>4, Float::INFINITY=>5 }
Hash[marks.map { |m| [m, marks_to_grade.find { |k,_| m <= k }.last] }]
#=> {100=>2, 200=>3, 150=>3, 157=>3, 692=>5, 12=>0}
With Ruby 2.1, you could write this:
marks.map { |m| [m, marks_to_grade.find { |k,_| m <= k }.last] }.to_h
Here's what's happening:
Enumerable#map (a.k.a collect) converts each mark m to an array [m, g], where g is the grade computed for that mark. For example, when map passes the first element of marks into its block, we have:
m = 100
a = marks_to_grade.find { |k,_| m <= k }
#=> marks_to_grade.find { |k,_| 100 <= k }
#=> [125, 2]
a.last
#=> 2
so the mark 100 is mapped to [100, 2]. (I've replaced the block variable for the value of the key-value pair with the placeholder _ to draw attention to the fact that the value is not being used in the calculation within the block. One could also use, say, _v as the placeholder.) The remaining marks are similarly mapped, resulting in:
b = marks.map { |m| [m, marks_to_grade.find { |k,_| m <= k }.last] }
#=> [[100, 2], [200, 3], [150, 3], [157, 3], [692, 5], [12, 0]]
Lastly
Hash[b]
#=> {100=>2, 200=>3, 150=>3, 157=>3, 692=>5, 12=>0}
or, for Ruby 2.1+
b.to_h
#=> {100=>2, 200=>3, 150=>3, 157=>3, 692=>5, 12=>0}
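The walkthrough above compares with <=, so a mark of exactly 25 gets grade 0. If the key is read strictly ("< 25" is grade 0), a small sketch with a sorted list of upper bounds could look like this. The name grade_for and the choice to put a mark of exactly 500 into grade 5 are assumptions, since the key does not say where 500 belongs.

```ruby
# Hypothetical helper: the grade is the position of the first upper bound
# that is strictly greater than the mark; anything past the last bound
# gets the top grade.
def grade_for(mark, bounds = [25, 75, 125, 250, 500])
  bounds.index { |bound| mark < bound } || bounds.size
end

marks = [100, 200, 150, 157, 692, 12, 99.99, 25]
grades = marks.map { |m| [m, grade_for(m)] }.to_h
p grades
```

Because only numeric comparison is used, decimal marks such as 99.99 need no special handling.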
You can make use of update_all:
Student.where(:mark => 0...25).update_all(grade: 0)
Student.where(:mark => 25...75).update_all(grade: 1)
Student.where(:mark => 75...125).update_all(grade: 2)
Student.where(:mark => 125...250).update_all(grade: 3)
Student.where(:mark => 250...500).update_all(grade: 4)
Student.where("mark >= ?", 500).update_all(grade: 5)

Number to English Word Conversion Rails

Does anybody know a method to convert numerals to English number words in Rails?
I found some Ruby scripts that convert numerals to the corresponding English words.
Instead of writing a script in Ruby, I suspect a built-in function is available.
E.g. 1 -> One, 2 -> Two.
Use the numbers_and_words gem, https://github.com/kslazarev/numbers_and_words
The humanize gem does exactly what you want:
require 'humanize'
23.humanize # => "twenty three"
0.42.humanize(decimals_as: :digits) # => "zero point four two"
No, you have to write a function yourself. The closest thing to what you want is number_to_human, but that does not convert 1 to One.
Here are some URLs that may be helpful:
http://codesnippets.joyent.com/posts/show/447
http://raveendran.wordpress.com/2009/05/29/ruby-convert-number-to-english-word/
http://deveiate.org/projects/Linguistics/
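For a sense of what writing it yourself involves, here is a minimal sketch limited to integers from 0 to 999. The method name number_to_words and the British-style "and" after the hundreds are choices made for this illustration, not part of Rails or of any gem mentioned on this page.

```ruby
# Hypothetical helper: spells out an integer between 0 and 999 in English.
def number_to_words(n)
  ones = %w[zero one two three four five six seven eight nine ten eleven twelve
            thirteen fourteen fifteen sixteen seventeen eighteen nineteen]
  tens = %w[_ _ twenty thirty forty fifty sixty seventy eighty ninety]
  raise ArgumentError, 'only 0..999 is handled' unless n.is_a?(Integer) && (0..999).cover?(n)

  # 0..19 have their own words.
  return ones[n] if n < 20

  # 20..99: tens word, plus a hyphenated ones word when needed.
  if n < 100
    t, o = n.divmod(10)
    return o.zero? ? tens[t] : "#{tens[t]}-#{ones[o]}"
  end

  # 100..999: hundreds, then recurse on the remainder.
  h, rest = n.divmod(100)
  rest.zero? ? "#{ones[h]} hundred" : "#{ones[h]} hundred and #{number_to_words(rest)}"
end

[1, 2, 42, 101].each { |n| puts "#{n} -> #{number_to_words(n).capitalize}" }
```

Larger numbers, negatives and decimals are exactly where the gems listed in the other answers earn their keep.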
You can also use the to_words gem.
This Gem converts integers into words.
e.g.
1.to_words # one ,
100.to_words # one hundred ,
101.to_words # one hundred and one
It also converts negative numbers.
How about this? Written for converting numbers to words in the Indian system, but can be easily modified.
def to_words(num)
numbers_to_name = {
10000000 => "crore",
100000 => "lakh",
1000 => "thousand",
100 => "hundred",
90 => "ninety",
80 => "eighty",
70 => "seventy",
60 => "sixty",
50 => "fifty",
40 => "forty",
30 => "thirty",
20 => "twenty",
19=>"nineteen",
18=>"eighteen",
17=>"seventeen",
16=>"sixteen",
15=>"fifteen",
14=>"fourteen",
13=>"thirteen",
12=>"twelve",
11 => "eleven",
10 => "ten",
9 => "nine",
8 => "eight",
7 => "seven",
6 => "six",
5 => "five",
4 => "four",
3 => "three",
2 => "two",
1 => "one"
}
log_floors_to_ten_powers = {
0 => 1,
1 => 10,
2 => 100,
3 => 1000,
4 => 1000,
5 => 100000,
6 => 100000,
7 => 10000000
}
num = num.to_i
return '' if num <= 0 or num >= 100000000
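# Count digits instead of using Math.log(num, 10).floor, which can come out
# one too low for exact powers of ten (e.g. 1000) due to floating-point error.
log_floor = num.to_s.length - 1
ten_power = log_floors_to_ten_powers[log_floor]
if num <= 20
numbers_to_name[num]
elsif log_floor == 1
rem = num % 10
# strip drops the trailing space left when rem is 0 (e.g. "thirty ")
[ numbers_to_name[num - rem], to_words(rem) ].join(' ').strip
else
[ to_words(num / ten_power), numbers_to_name[ten_power], to_words(num % ten_power) ].join(' ').strip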
end
end
You may also want to check gem 'rupees' - https://github.com/railsfactory-shiv/rupees to convert numbers to indian rupees (e.g. in Lakh, Crore, etc)

Resources