Remove special characters in an array in Ruby

I have an array as shown below:
[
[
"[\"\", \"Mrs. Brain Bauch\", \"Vernice Ledner\"]",
"[\"\", \"Robb Ratke\", \"Amaya Jakubowski\"]",
"[\"\", \"Lindsey Cremin III\", \"Harvey Fisher\"]",
"[\"\", \"Daniela Schneider\", \"Benny Schumm\"]"
]
]
How can I convert this into the array structure shown below:
[
[
["Mrs. Brain Bauch", "Vernice Ledner"],
["Robb Ratke", "Amaya Jakubowski"],
["Lindsey Cremin III", "Harvey Fisher"],
["Daniela Schneider", "Benny Schumm"]
]
]

require 'json'
input = [[
"[\"\", \"Mrs. Brain Bauch\", \"Vernice Ledner\"]",
"[\"\", \"Robb Ratke\", \"Amaya Jakubowski\"]",
"[\"\", \"Lindsey Cremin III\", \"Harvey Fisher\"]",
"[\"\", \"Daniela Schneider\", \"Benny Schumm\"]"
]]
[input.first.map { |l| JSON.parse l }.map { |a| a.reject &:empty? }]
#⇒ [[
#   ["Mrs. Brain Bauch", "Vernice Ledner"],
#  ["Robb Ratke", "Amaya Jakubowski"],
#  ["Lindsey Cremin III", "Harvey Fisher"],
#  ["Daniela Schneider", "Benny Schumm"]
#  ]]

If arr is your array:
r = /
(?<=\") # match `\"` in a positive lookbehind
[A-Z] # match a letter (either case, because of the i flag)
[a-z\.\s]+ # match a letter, period or whitespace one or more times
/ix # case-insensitive (i) and free-spacing (x) regex definition modes
[arr.first.map { |s| s.scan r }]
#=> [[["Mrs. Brain Bauch", "Vernice Ledner"],
# ["Robb Ratke", "Amaya Jakubowski"],
# ["Lindsey Cremin III", "Harvey Fisher"],
# ["Daniela Schneider", "Benny Schumm"]]]

Related

Ruby replace hash key using a Regex

I am parsing an Excel file using Creek. This is the first row (the header):
{"A"=>"Date", "B"=>"Portfolio", "C"=>"Currency"}
and all the other rows are:
[
{"A"=>2019-05-16 00:00:00 +0200, "B"=>"TEXT", "C"=>"INR"},
{"A"=>2019-05-20 00:00:00 +0200, "B"=>"TEXT2", "C"=>"EUR"}
]
My goal is to get the same array, but with each hash key replaced by the key of the mapping entry whose regex matches the corresponding header value.
For example, in the header, the keys match these REGEX:
mapping = {
date: /Date|Data|datum|Fecha/,
portfolio_name: /Portfolio|portafoglio|Portfolioname|cartera|portefeuille/,
currency: /Currency|Valuta|Währung|Divisa|Devise/
}
So I need all data rows to be replaced like this:
[
{"date"=>2019-05-16 00:00:00 +0200, "portfolio_name"=>"TEXT", "currency"=>"INR"},
{"date"=>2019-05-20 00:00:00 +0200, "portfolio_name"=>"TEXT2", "currency"=>"EUR"}
]
Detect column names in a separate step. The intermediate mapping will look like {"A"=>:date, "B"=>:portfolio_name, "C"=>:currency}, and then you can transform the data array.
This is pretty straightforward:
header_mapping = header.transform_values{|v|
mapping.find{|key,regex| v.match?(regex) }&.first || raise("Unknown header field #{v}")
}
rows.map{|row|
row.transform_keys{|k| header_mapping[k].to_s }
}
The code requires Ruby 2.5+ for the native Hash#transform_keys (Hash#transform_values arrived in 2.4), or ActiveSupport on older versions.
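For reference, the two steps above can be put together into a self-contained script. This is a minimal sketch, assuming header is the first row read from the sheet and rows holds the remaining rows; the sample data is copied from the question, and Time.parse only stands in for the time values the parser would return.

```ruby
require 'time'

mapping = {
  date: /Date|Data|datum|Fecha/,
  portfolio_name: /Portfolio|portafoglio|Portfolioname|cartera|portefeuille/,
  currency: /Currency|Valuta|Währung|Divisa|Devise/
}

# Sample data taken from the question (assumed shape of the parsed sheet)
header = { "A" => "Date", "B" => "Portfolio", "C" => "Currency" }
rows = [
  { "A" => Time.parse('2019-05-16 00:00:00 +0200'), "B" => "TEXT",  "C" => "INR" },
  { "A" => Time.parse('2019-05-20 00:00:00 +0200'), "B" => "TEXT2", "C" => "EUR" }
]

# Step 1: map each column letter to the mapping key whose regex matches the header text
header_mapping = header.transform_values do |v|
  mapping.find { |_key, regex| v.match?(regex) }&.first || raise("Unknown header field #{v}")
end

# Step 2: rename the keys of every data row using that intermediate mapping
mapped_rows = rows.map { |row| row.transform_keys { |k| header_mapping[k].to_s } }

p header_mapping
p mapped_rows.map(&:keys)
```

Detecting the columns once and reusing header_mapping means the regexes are only evaluated for the header row, not for every data row.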
TL;DR:
require 'time'
mappings = {
date: /Date|Data|datum|Fecha/,
portfolio_name: /Portfolio|portafoglio|Portfolioname|cartera|portefeuille/,
currency: /Currency|Valuta|Währung|Divisa|Devise/
}
rows = [
{"A"=>"Date", "B"=>"Portfolio", "C"=>"Currency"},
{"A"=>Time.parse('2019-05-16 00:00:00 +0200'), "B"=>"TEXT", "C"=>"INR"},
{"A"=>Time.parse('2019-05-20 00:00:00 +0200'), "B"=>"TEXT2", "C"=>"EUR"}
]
header_row = rows.first
mapped_header_row = header_row.inject({}) do |hash, (k, v)|
mapped_name = mappings.find do |mapped_name, regex|
v.match? regex
end&.first
# defaults to `v.to_sym` (Header Name), if not in mappings
# you can also raise an Exception here instead if not in mappings, depending on your expectations
hash[k] = mapped_name || v.to_sym
hash
end
mapped_rows = rows[1..-1].map do |row|
new_row = {}
row.each do |k, v|
new_row[mapped_header_row[k]] = v
end
new_row
end
puts mapped_rows
# => [
# {:date=>2019-05-16 00:00:00 +0200, :portfolio_name=>"TEXT", :currency=>"INR"},
# {:date=>2019-05-20 00:00:00 +0200, :portfolio_name=>"TEXT2", :currency=>"EUR"}
# ]
Given:
require 'time'
mappings = {
date: /Date|Data|datum|Fecha/,
portfolio_name: /Portfolio|portafoglio|Portfolioname|cartera|portefeuille/,
currency: /Currency|Valuta|Währung|Divisa|Devise/
}
rows = [
{"A"=>"Date", "B"=>"Portfolio", "C"=>"Currency"},
{"A"=>Time.parse('2019-05-16 00:00:00 +0200'), "B"=>"TEXT", "C"=>"INR"},
{"A"=>Time.parse('2019-05-20 00:00:00 +0200'), "B"=>"TEXT2", "C"=>"EUR"}
]
Steps:
We first extract the first row, to get the column names.
header_row = rows.first
puts header_row
# => {"A"=>"Date", "B"=>"Portfolio", "C"=>"Currency"}
We need to loop through each of the hash pairs (key, value) and find out whether the value matches any of the regexes in our mappings variable.
In short, this step needs to convert:
header_row = {"A"=>"Date", "B"=>"Portfolio", "C"=>"Currency"}
into
mapped_header_row = {"A"=>:date, "B"=>:portfolio_name, "C"=>:currency}
And so...
mapped_header_row = header_row.inject({}) do |hash, (k, v)|
mapped_name = mappings.find do |mapped_name, regex|
v.match? regex
end&.first
# defaults to `v.to_sym` (Header Name), if not in mappings
# you can also raise an Exception here instead if not in mappings, depending on your expectations
hash[k] = mapped_name || v.to_sym
hash
end
puts mapped_header_row
# => {"A"=>:date, "B"=>:portfolio_name, "C"=>:currency}
See inject
See find
Now that we have the mapped_header_row (the "mapped" names for each column), we can simply update the keys of every row from the 2nd to the last with the mapped names: the keys "A", "B", and "C" are replaced with :date, :portfolio_name, and :currency respectively.
# rows[1..-1] means the 2nd element in the array until the last element
mapped_rows = rows[1..-1].map do |row|
new_row = {}
row.each do |k, v|
new_row[mapped_header_row[k]] = v
end
new_row
end
puts mapped_rows
# => [
# {:date=>2019-05-16 00:00:00 +0200, :portfolio_name=>"TEXT", :currency=>"INR"},
# {:date=>2019-05-20 00:00:00 +0200, :portfolio_name=>"TEXT2", :currency=>"EUR"}
# ]
See map

Search and extract results from an array based on an entry keyword in ruby

I would like to search an array for a keyword and extract the matching entries into another array.
Here is an example of the data I'm working with:
Array = [{:id=>3, :keyword=>"happy", :Date=>"01/02/2016"},
{:id=>4, :keyword=>"happy", :Date=>"01/02/2016"} ... ]
For example, I want to take the first keyword, happy, search the same array for it, extract any entries with a similar keyword, and put them inside another array. Here is the end result I'm looking for:
Results = [{:keyword=>happy, :match =>{
{:id=>3, :keyword=>"happy", :Date=>"01/02/2016"}... }]
Here is the first part of the code :
def relationship(file)
data = open_data(file)
parsed = JSON.parse(data)
keywords = []
i = 0
parsed.each do |word|
keywords << { id: i += 1 , keyword: word['keyword'].downcase, Date: word['Date'] }
end
end
def search_keyword(keyword)
hash = [
{:id=>1, :keyword=>"happy", :Date=>"01/02/2015"},
{:id=>2, :keyword=>"sad", :Date=>"01/02/2016"},
{:id=>3, :keyword=>"fine", :Date=>"01/02/2017"},
{:id=>4, :keyword=>"happy", :Date=>"01/02/2018"}
]
keywords = []
hash.each do |key|
if key[:keyword] == keyword
keywords << key
end
end
keywords
#{:keyword=> keyword, :match=> keywords}
end
search_keyword('fine')
#search_keyword('sad')
You could group the match elements by key (:match) then get the result with a single hash lookup.
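One reading of that suggestion can be sketched as follows, assuming the records are hashes with a :keyword entry as in the question. The names records and by_keyword are made up for this example, and search_keyword here takes the prebuilt index as an extra argument.

```ruby
records = [
  { id: 1, keyword: "happy", Date: "01/02/2015" },
  { id: 2, keyword: "sad",   Date: "01/02/2016" },
  { id: 3, keyword: "fine",  Date: "01/02/2017" },
  { id: 4, keyword: "happy", Date: "01/02/2018" }
]

# Build the index once: keyword => array of records with exactly that keyword
by_keyword = records.group_by { |record| record[:keyword] }

# Each search is then a single hash lookup; unknown keywords give an empty match list
def search_keyword(index, keyword)
  { keyword: keyword, match: index.fetch(keyword, []) }
end

result = search_keyword(by_keyword, "happy")
p result
```

This only finds exact keyword matches. For substring matches, a scan over the array is still needed.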
Here's another idea that could help with your situation, using Enumerable and String#index:
The array to be searched:
array = [
{:id=>3, :keyword=>"happy", :Date=>"01/02/2016"},
{:id=>4, :keyword=>"happy", :Date=>"01/02/2016"},
{:id=>1, :keyword=>"happy", :Date=>"01/02/2015"},
{:id=>2, :keyword=>"sad", :Date=>"01/02/2016"},
{:id=>30, :keyword=>"fine", :Date=>"01/02/2017"},
{:id=>41, :keyword=>"happy", :Date=>"01/02/2018"}
]
The search method:
Store every element matching the term in the result array.
def search(term, array)
result = [{keyword: term, match: []}]
# collect every element whose keyword contains the term
array.each { |element| result.first[:match] << element if element[:keyword].index(term) }
result
end
Testing:
p search('sa', array)
# => [{:keyword=>"sa", :match=>[{:id=>2, :keyword=>"sad", :Date=>"01/02/2016"}]}]
Hope that gets you going!

Merge multidimensional array of hash based on hash key and value in ruby

I have a multidimensional array of hashes, and I want to merge the hashes in the first sub-array with the hashes in the other sub-arrays that have the same "id" value:
input = [
[ {"id"=>"1","name"=>"a"},
{"id"=>"2","name"=>"b"},
{"id"=>"3","name"=>"c"},
{"id"=>"4","name"=>"d"},
{"id"=>"5","name"=>"e"},
{"id"=>"6","name"=>"f"}
],
[ {"id"=>"3","hoby"=>"AA"},
{"id"=>"3","hoby"=>"BB"},
{"id"=>"1","hoby"=>"CC"},
{"id"=>"1","hoby"=>"DD"},
{"id"=>"4","hoby"=>"EE"}
],
[ {"id"=>"1","language"=>"A"},
{"id"=>"1","language"=>"B"},
{"id"=>"2","language"=>"B"},
{"id"=>"2","language"=>"C"},
{"id"=>"6","language"=>"D"}
]
]
I need output like this:
output = [
{"id"=>"1","name"=>"a","id"=>"1","hoby"=>"CC","id"=>"1","language"=>"A","id"=>"1","language"=>"B"},
{"id"=>"2","name"=>"b","id"=>"2","language"=>"B"},
{"id"=>"3","name"=>"c","id"=>"3","hoby"=>"AA","id"=>"3","hoby"=>"BB"},
{"id"=>"4","name"=>"d","id"=>"4","hoby"=>"EE"},
{"id"=>"5","name"=>"e"},
{"id"=>"6","name"=>"f","id"=>"6","language"=>"D"}
]
I have written this code for it:
len = input.length - 1
output = []
input[0].each do |value,index|
for i in 1..len
input[i].each do |j|
if value["id"] == j["id"]
output << value.merge(j)
end
end
end
end
But I am getting the wrong output array. There might be any number of sub-arrays in the multidimensional array.
Thanks,
First of all, it is impossible to have two elements in a hash with the same key. Assigning a new value to an existing key overrides the previous value.
Let's consider the example:
hash = {}
hash["id"] = 1
hash["id"] = 3
hash["id"] = 5
What output for hash["id"] would you expect? 1, 3, 5 or maybe [1, 3, 5]? Because of the way Hash works in Ruby, it will output 5, since that was the last assignment to that key.
Having said that, it is impossible to store multiple occurrences in your hash, but you can try processing it with something like:
input.flatten
.group_by { |h| h["id"] }
.map do |k, a|
a.each_with_object({}) { |in_h, out_h| out_h.merge!(in_h) }
end
which will result in an array of hashes like:
[{"id"=>"1", "name"=>"a", "hoby"=>"DD", "language"=>"B"},
{"id"=>"2", "name"=>"b", "language"=>"C"},
{"id"=>"3", "name"=>"c", "hoby"=>"BB"},
{"id"=>"4", "name"=>"d", "hoby"=>"EE"},
{"id"=>"5", "name"=>"e"},
{"id"=>"6", "name"=>"f", "language"=>"D"}]
Well, it is not the output you asked for, but at least it might point you in the right direction.
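If the repeated values need to survive the merge, one option is to collect them into arrays instead of overwriting them. This is only a sketch of one possible output shape, not what the question literally asked for (that shape cannot exist, as explained above). It uses the block form of Hash#merge, which is called for every key present in both hashes.

```ruby
input = [
  [ {"id"=>"1","name"=>"a"}, {"id"=>"2","name"=>"b"}, {"id"=>"3","name"=>"c"},
    {"id"=>"4","name"=>"d"}, {"id"=>"5","name"=>"e"}, {"id"=>"6","name"=>"f"} ],
  [ {"id"=>"3","hoby"=>"AA"}, {"id"=>"3","hoby"=>"BB"}, {"id"=>"1","hoby"=>"CC"},
    {"id"=>"1","hoby"=>"DD"}, {"id"=>"4","hoby"=>"EE"} ],
  [ {"id"=>"1","language"=>"A"}, {"id"=>"1","language"=>"B"}, {"id"=>"2","language"=>"B"},
    {"id"=>"2","language"=>"C"}, {"id"=>"6","language"=>"D"} ]
]

output = input.flatten.group_by { |h| h["id"] }.map do |_id, hashes|
  hashes.reduce do |merged, h|
    merged.merge(h) do |key, old, new|
      # the shared "id" stays a single value; any other repeated key collects its values in an array
      key == "id" ? old : Array(old) + [new]
    end
  end
end

output.each { |h| p h }
```

With this approach a key holds a plain string when it occurs once and an array when it occurs several times, so code consuming the result has to handle both cases.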
Hope that helps!
Maybe this can help you.
input = [
[
{"id"=>"1","name"=>"a"},
{"id"=>"2","name"=>"b"},
{"id"=>"3","name"=>"c"},
{"id"=>"4","name"=>"d"},
{"id"=>"5","name"=>"e"},
{"id"=>"6","name"=>"f"}
],
[
{"id"=>"3","hoby"=>"AA"},
{"id"=>"3","hoby"=>"BB"},
{"id"=>"1","hoby"=>"CC"},
{"id"=>"1","hoby"=>"DD"},
{"id"=>"4","hoby"=>"EE"}
],
[
{"id"=>"1","language"=>"A"},
{"id"=>"1","language"=>"B"},
{"id"=>"2","language"=>"B"},
{"id"=>"2","language"=>"C"},
{"id"=>"6","language"=>"D"}
]
]
This way you can build your grouped results:
output = {}
input.flatten.each do |h|
output[h["id"]] = {} unless output[h["id"]]
output[h["id"]].merge!(h)
end
output.values
# => [
# => {"id"=>"1", "name"=>"a", "hoby"=>"DD", "language"=>"B"},
# => {"id"=>"2", "name"=>"b", "language"=>"C"},
# => {"id"=>"3", "name"=>"c", "hoby"=>"BB"},
# => {"id"=>"4", "name"=>"d", "hoby"=>"EE"},
# => {"id"=>"5", "name"=>"e"},
# => {"id"=>"6", "name"=>"f", "language"=>"D"}
# => ]
But the better way is to use a Hash for the input. You can define the input as a hash with "id" as the key, so if you generate the data yourself, you don't have a problem grouping it.
Something like this:
{
"1" => {"name" => "a", "hoby" => "DD", "language" => "B"}
}

Convert this string to array of hashes

In Ruby or Rails, what's the cleanest way to turn this string:
"[{one:1, two:2, three:3, four:4},{five:5, six:6}]"
into an array of hashes like this:
[{one:1, two:2, three:3, four:4},{five:5, six:6}]
Here is a one-liner on two lines:
s.split(/}\s*,\s*{/).
map{|s| Hash[s.scan(/(\w+):(\d+)/).map{|t| proc{|k,v| [k.to_sym, v.to_i]}.call(*t)}]}
NB I was using split(":") to separate keys from values, but @Cary Swoveland's use of parens in the regex is cooler. He forgot the key and value conversions, however.
Or a bit shorter, but uses array indexing instead of the proc, which some may find unappealing:
s.split(/}\s*,\s*{/).
map{|s| Hash[s.scan(/(\w+):(\d+)/).map{|t| [t[0].to_sym, t[1].to_i]}]}
Result:
=> [{:one=>1, :two=>2, :three=>3, :four=>4}, {:five=>5, :six=>6}]
Explanation: Start from the end. The last map converts a list of [key, value] string pairs into a list of [:key, value] pairs. The scan extracts those string pairs from one string of comma-separated key:value entries. Finally, the initial split separates the brace-enclosed, comma-separated groups.
Try this:
"[{one:1, two:2, three:3, four:4},{five:5, six:6}]".
split(/\}[ ]*,[ ]*\{/).
map do |h_str|
Hash[
h_str.split(",").map do |kv|
kv.strip.
gsub(/[\[\]\{\}]/, '').
split(":")
end.map do |k, v|
[k.to_sym, v.to_i]
end
]
end
Not pretty, not optimized, but solves it. (It was fun to do, though :) )
a = "[{one:1, two:2, three:3, four:4},{five:5, six:6}]"
array = []
a.gsub(/\[|\]/, '').split(/{|}/).map{ |h| h if h.length > 0 && h != ','}.compact.each do |v|
hsh = {}
v.split(',').each do |kv|
arr = kv.split(':')
hsh.merge!({arr.first.split.join.to_sym => arr.last.to_i})
end
array << hsh
end
If you want me to explain it, just ask.
Another approach: your string looks like a YAML or JSON definition:
YAML
A slightly modified string works:
require 'yaml'
p YAML.load("[ { one: 1, two: 2, three: 3, four: 4}, { five: 5, six: 6 } ]")
#-> [{"one"=>1, "two"=>2, "three"=>3, "four"=>4}, {"five"=>5, "six"=>6}]
There are two problems:
The keys are strings, not symbols
You need some more spaces (one:1 is not recognized, you need one: 1).
For problem 1 there was already a question: Hash[a.map{|(k,v)| [k.to_sym,v]}]
For problem 2 you need a gsub(/:/, ': ') (I hope there are no other : in your string)
Full example:
require 'yaml'
input = "[{one:1, two:2, three:3, four:4},{five:5, six:6}]"
input.gsub!(/:/, ': ') #force correct YAML-syntax
p YAML.load(input).map{|a| Hash[a.map{|(k,v)| [k.to_sym,v]}]}
#-> [{:one=>1, :two=>2, :three=>3, :four=>4}, {:five=>5, :six=>6}]
JSON
With JSON you need additional ", but the symbolization is easier:
require 'json'
input = '[ { "one":1, "two": 2, "three": 3, "four": 4},{ "five": 5, "six": 6} ]'
p JSON.parse(input)
#-> [{"one"=>1, "two"=>2, "three"=>3, "four"=>4}, {"five"=>5, "six"=>6}]
p JSON.parse(input, :symbolize_names => true)
#-> [{:one=>1, :two=>2, :three=>3, :four=>4}, {:five=>5, :six=>6}]
Example with original string:
require 'json'
input = "[{one: 1, two: 2, three:3, four:4},{five:5, six:6}]"
input.gsub!(/([[:alpha:]]+):/, '"\1":')
p JSON.parse(input)
#-> [{"one"=>1, "two"=>2, "three"=>3, "four"=>4}, {"five"=>5, "six"=>6}]
p JSON.parse(input, :symbolize_names => true)
#-> [{:one=>1, :two=>2, :three=>3, :four=>4}, {:five=>5, :six=>6}]
You could do as below.
Edit: I originally prepared this answer in haste, while on the road, on a borrowed computer with an unfamiliar operating system (Windows). After @sawa pointed out mistakes, I set about fixing it, but became so frustrated with the mechanics of doing so that I gave up and deleted my answer. Now that I'm home again, I have made what I believe are the necessary corrections.
Code
def extract_hashes(str)
str.scan(/\[?{(.+?)\}\]?/)
.map { |arr| Hash[arr.first
.scan(/\s*([a-z]+)\s*:\s*(\d+)/)
.map { |f,s| [f.to_sym, s.to_i] }
]
}
end
Example
str = "[{one:1, two:2, three:3, four:4},{five:5, six:6}]"
extract_hashes(str)
#=> [{:one=>1, :two=>2, :three=>3, :four=>4}, {:five=>5, :six=>6}]
Explanation
For str in the example above,
a = str.scan(/\[?{(.+?)\}\]?/)
#=> [["one:1, two:2, three:3, four:4"], ["five:5, six:6"]]
Enumerable#map first passes the first element of a into the block and assigns it to the block variable:
arr #=> ["one:1, two:2, three:3, four:4"]
Then
b = arr.first
#=> "one:1, two:2, three:3, four:4"
c = b.scan(/\s*([a-z]+)\s*:\s*(\d+)/)
#=> [["one", "1"], ["two", "2"], ["three", "3"], ["four", "4"]]
d = c.map { |f,s| [f.to_sym, s.to_i] }
#=> [[:one, 1], [:two, 2], [:three, 3], [:four, 4]]
e = Hash[d]
#=> {:one=>1, :two=>2, :three=>3, :four=>4}
In Ruby 2.1+, Hash[d] can be replaced with d.to_h.
Thus, the first element of a is mapped to e.
Next, the outer map passes the second and last element of a into the block
arr #=> ["five:5, six:6"]
and we obtain:
Hash[arr.first
.scan(/\s*([a-z]+)\s*:\s*(\d+)/)
.map { |f,s| [f.to_sym, s.to_i] }
]
#=> {:five=>5, :six=>6}
which replaces a.last.
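On Ruby 2.6 or later, where to_h accepts a block, the same steps can be written more compactly. This is a sketch rather than the answer's original code: extract_hashes_compact is a name made up here, and the pattern assumes optional whitespace around the colon and lowercase keys with non-negative integer values.

```ruby
def extract_hashes_compact(str)
  # the outer scan yields one single-element array per {...} group
  str.scan(/\{(.+?)\}/).map do |(body)|
    # the inner scan yields [key, value] string pairs; to_h converts each pair while building the hash
    body.scan(/([a-z]+)\s*:\s*(\d+)/).to_h { |k, v| [k.to_sym, v.to_i] }
  end
end

result = extract_hashes_compact("[{one:1, two:2, three:3, four:4},{five:5, six:6}]")
p result
```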

Sort specific items of an array first

I have a ruby array that looks something like this:
my_array = ['mushroom', 'beef', 'fish', 'chicken', 'tofu', 'lamb']
I want to sort the array so that 'chicken' and 'beef' are the first two items, then the remaining items are sorted alphabetically. How would I go about doing this?
irb> my_array.sort_by { |e| [ e == 'chicken' ? 0 : e == 'beef' ? 1 : 2, e ] }
#=> ["chicken", "beef", "fish", "lamb", "mushroom", "tofu"]
This will create a sorting key for each element of the array, and then sort the array elements by their sorting keys. Since the sorting key is an array, it compares by position, so [0, 'chicken'] < [1, 'beef'] < [2, 'apple' ] < [2, 'banana'].
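The chained < above is shorthand: Ruby arrays do not define <, and sort_by orders its keys with <=>, which compares arrays element by element. A small sketch of that comparison, where 'apple' and 'banana' are the hypothetical elements from the sentence above:

```ruby
# <=> returns -1 when the left-hand array sorts first
first_elements_differ = [0, 'chicken'] <=> [1, 'beef']    # decided by 0 vs 1
tie_broken_by_name    = [2, 'apple']   <=> [2, 'banana']  # 2 == 2, so 'apple' vs 'banana' decides

# Sorting the keys themselves shows the resulting order
keys = [[2, 'banana'], [1, 'beef'], [2, 'apple'], [0, 'chicken']]
sorted_keys = keys.sort

p first_elements_differ, tie_broken_by_name, sorted_keys
```

Because the second element is only consulted when the first elements are equal, the promoted items keep their fixed order and everything else falls back to alphabetical order.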
If you don't know what elements you wanted sorted to the front until runtime, you can still use this trick:
irb> promotables = [ 'chicken', 'beef' ]
#=> [ 'chicken', 'beef' ]
irb> my_array.sort_by { |e| [ promotables.index(e) || promotables.size, e ] }
#=> ["chicken", "beef", "fish", "lamb", "mushroom", "tofu"]
irb> promotables = [ 'tofu', 'mushroom' ]
#=> [ 'tofu', 'mushroom' ]
irb> my_array.sort_by { |e| [ promotables.index(e) || promotables.size, e ] }
#=> [ "tofu", "mushroom", "beef", "chicken", "fish", "lamb"]
Mine's a lot more generic and more useful if you get your data only at runtime.
my_array = ['mushroom', 'beef', 'fish', 'chicken', 'tofu', 'lamb']
starters = ['chicken', 'beef']
starters + (my_array.sort - starters)
# => ["chicken", "beef", "fish", "lamb", "mushroom", "tofu"]
Could just do
firsts = ["chicken", "beef"]
[*firsts, *(my_array.sort - firsts)]
#=> ["chicken", "beef", "fish", "lamb", "mushroom", "tofu"]
