Convert this string to array of hashes - ruby-on-rails

In Ruby or Rails What's the cleanest way to turn this string:
"[{one:1, two:2, three:3, four:4},{five:5, six:6}]"
into an array of hashes like this:
[{one:1, two:2, three:3, four:4},{five:5, six:6}]

Here is a one-liner on two lines:
s.split(/}\s*,\s*{/).
map{|s| Hash[s.scan(/(\w+):(\d+)/).map{|t| proc{|k,v| [k.to_sym, v.to_i]}.call(*t)}]}
NB: I was using split(":") to separate keys from values, but @Cary Swoveland's use of parens (capture groups) in the regex is cooler. He forgot the key and value conversions, however.
Or a bit shorter, but uses array indexing instead of the proc, which some may find unappealing:
s.split(/}\s*,\s*{/).
map{|s| Hash[s.scan(/(\w+):(\d+)/).map{|t| [t[0].to_sym, t[1].to_i]}]}
Result:
=> [{:one=>1, :two=>2, :three=>3, :four=>4}, {:five=>5, :six=>6}]
Explanation: Start from the end. The last map takes a list of ["key", "value"] string pairs and returns a list of [:key, value] pairs. The scan turns one string of comma-separated key:value pairs into that list of string pairs. Finally, the initial split separates the brace-enclosed, comma-separated groups, one per hash.
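The three steps described above can be run one at a time to inspect the intermediate values. This is only a sketch: the variable names chunks, pairs and result are illustrative and not part of the original answer, the braces in the regex are escaped for safety, and Array#to_h (Ruby 2.1+) stands in for Hash[].

```ruby
s = "[{one:1, two:2, three:3, four:4},{five:5, six:6}]"

# Step 1: split on the "},{" boundary between the hashes.
# The first chunk keeps its leading "[{" and the last its trailing "}]";
# that is harmless because the scan below only picks out word:digits pairs.
chunks = s.split(/\}\s*,\s*\{/)

# Step 2: scan each chunk for [key, value] string captures.
pairs = chunks.map { |chunk| chunk.scan(/(\w+):(\d+)/) }

# Step 3: convert every pair and build one hash per chunk.
result = pairs.map { |kvs| kvs.map { |k, v| [k.to_sym, v.to_i] }.to_h }

p chunks
p pairs
p result
```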

Try this:
"[{one:1, two:2, three:3, four:4},{five:5, six:6}]".
split(/\}[ ]*,[ ]*\{/).
map do |h_str|
Hash[
h_str.split(",").map do |kv|
kv.strip.
gsub(/[\[\]\{\}]/, '').
split(":")
end.map do |k, v|
[k.to_sym, v.to_i]
end
]
end

Not pretty, not optimized, but solves it. (It was fun to do, though :) )
a = "[{one:1, two:2, three:3, four:4},{five:5, six:6}]"
array = []
a.gsub(/\[|\]/, '').split(/{|}/).map{ |h| h if h.length > 0 && h != ','}.compact.each do |v|
hsh = {}
v.split(',').each do |kv|
arr = kv.split(':')
hsh.merge!({arr.first.split.join.to_sym => arr.last.to_i})
end
array << hsh
end
If you want me to explain it, just ask.

Another approach: Your string looks like a YAML or JSON -definition:
YAML
A slightly modified string works:
require 'yaml'
p YAML.load("[ { one: 1, two: 2, three: 3, four: 4}, { five: 5, six: 6 } ]")
#-> [{"one"=>1, "two"=>2, "three"=>3, "four"=>4}, {"five"=>5, "six"=>6}]
There are two problems:
1. The keys are strings, not symbols.
2. You need some more spaces (one:1 is not recognized; you need one: 1).
For problem 2 you need a gsub(/:/, ': ') (I hope there are no other : in your string).
Problem 1 has already been answered in another question: Hash[a.map{|(k,v)| [k.to_sym,v]}]
Full example:
require 'yaml'
input = "[{one:1, two:2, three:3, four:4},{five:5, six:6}]"
input.gsub!(/:/, ': ') #force correct YAML-syntax
p YAML.load(input).map{|a| Hash[a.map{|(k,v)| [k.to_sym,v]}]}
#-> [{:one=>1, :two=>2, :three=>3, :four=>4}, {:five=>5, :six=>6}]
JSON
With JSON you need additional double quotes around the keys, but the symbolization is easier:
require 'json'
input = '[ { "one":1, "two": 2, "three": 3, "four": 4},{ "five": 5, "six": 6} ]'
p JSON.parse(input)
#-> [{"one"=>1, "two"=>2, "three"=>3, "four"=>4}, {"five"=>5, "six"=>6}]
p JSON.parse(input, :symbolize_names => true)
#-> [{:one=>1, :two=>2, :three=>3, :four=>4}, {:five=>5, :six=>6}]
Example with original string:
require 'json'
input = "[{one: 1, two: 2, three:3, four:4},{five:5, six:6}]"
input.gsub!(/([[:alpha:]]+):/, '"\1":')
p JSON.parse(input)
#-> [{"one"=>1, "two"=>2, "three"=>3, "four"=>4}, {"five"=>5, "six"=>6}]
p JSON.parse(input, :symbolize_names => true)
#-> [{:one=>1, :two=>2, :three=>3, :four=>4}, {:five=>5, :six=>6}]
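The quote-the-keys approach can be wrapped in a small helper. This is a sketch, not the answerer's code: parse_loose_hashes is a hypothetical name, it uses \w+ instead of [[:alpha:]]+ so that keys containing digits or underscores are quoted too, and it assumes that no value contains a word followed by a colon.

```ruby
require 'json'

# Hypothetical helper: quote the bare keys, then let JSON do the parsing.
# Returns nil when the rewritten string is still not valid JSON.
def parse_loose_hashes(str)
  quoted = str.gsub(/(\w+)\s*:/, '"\1":')
  JSON.parse(quoted, symbolize_names: true)
rescue JSON::ParserError
  nil
end

p parse_loose_hashes("[{one:1, two:2, three:3, four:4},{five:5, six:6}]")
p parse_loose_hashes("[{one:}]")
```

Returning nil on malformed input is just one possible design choice; re-raising the error may suit stricter callers better.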

You could do as below.
Edit: I originally prepared this answer in haste, while on the road, on a borrowed computer with an unfamiliar operating system (Windows). After @sawa pointed out mistakes, I set about fixing it, but became so frustrated with the mechanics of doing so that I gave up and deleted my answer. Now that I'm home again, I have made what I believe are the necessary corrections.
Code
def extract_hashes(str)
str.scan(/\[?{(.+?)\}\]?/)
.map { |arr| Hash[arr.first
.scan(/\s*([a-z]+)\s*:\s*(\d+)/)
.map { |f,s| [f.to_sym, s.to_i] }
]
}
end
Example
str = "[{one:1, two:2, three:3, four:4},{five:5, six:6}]"
extract_hashes(str)
#=> [{:one=>1, :two=>2, :three=>3, :four=>4}, {:five=>5, :six=>6}]
Explanation
For str in the example above,
a = str.scan(/\[?{(.+?)\}\]?/)
#=> [["one:1, two:2, three:3, four:4"], ["five:5, six:6"]]
Enumerable#map first passes the first element of a into the block and assigns it to the block variable:
arr #=> ["one:1, two:2, three:3, four:4"]
Then
b = arr.first
#=> "one:1, two:2, three:3, four:4"
c = b.scan(/\s*([a-z]+)\s*:\s*(\d+)/)
#=> [["one", "1"], ["two", "2"], ["three", "3"], ["four", "4"]]
d = c.map { |f,s| [f.to_sym, s.to_i] }
#=> [[:one, 1], [:two, 2], [:three, 3], [:four, 4]]
e = Hash[d]
#=> {:one=>1, :two=>2, :three=>3, :four=>4}
Since Ruby 2.1, Hash[d] can be replaced with d.to_h.
Thus, the first element of a is mapped to e.
Next, the outer map passes the second and last element of a into the block
arr #=> ["five:5, six:6"]
and we obtain:
Hash[arr.first
.scan(/\s*([a-z]+)\s*:\s*(\d+)/)
.map { |f,s| [f.to_sym, s.to_i] }
]
#=> {:five=>5, :six=>6}
which replaces a.last.

Related

How to build a hash of directory names as keys and file names as values in Ruby?

I have directories with files and I would like to build a hash of directory names as keys and file names as values. Example:
/app/foo/create.json
/app/foo/update.json
/app/bar/create.json
/app/bar/update.json
Output:
{
"foo" => {
"create.json" => {},
"update.json" => {}
},
"bar" => {
"create.json" => {},
"update.json" => {}
}
}
Currently I'm doing this:
OUTPUT ||= {}
Dir.glob(File.join('app', '**', '*.json')) do |file|
OUTPUT[File.basename(file)] = File.read(file)
end
But it's not working as expected, I'm not sure how to get the parent directory name.
Dir.glob('*/*.json', base: 'app').each_with_object(Hash.new {|g,k| g[k]={}}) do |fname,h|
h[File.dirname(fname)].update(File.basename(fname)=>{})
end
#=> {"foo"=>{"create.json"=>{}, "update.json"=>{}},
# "bar"=>{"update.json"=>{}, "create.json"=>{}}}
@Amadan explains the use of Dir::glob in his answer, and I have used it exactly as he does. I have employed the form of Hash::new that takes a block (here {|g,k| g[k]={}}), which is invoked when g[k] is executed and the hash g does not have a key k (see note 1 below). See also Hash#update (aka merge!), File::dirname and File::basename.
The steps are as follows.
a = Dir.glob('*/*.json', base: 'app')
#=> ["foo/create.json", "foo/update.json", "bar/update.json", "bar/create.json"]
enum = a.each_with_object(Hash.new {|g,k| g[k]={}})
#=> #<Enumerator: ["foo/create.json", "foo/update.json", "bar/update.json",
# "bar/create.json"]:each_with_object({})>
The first value is generated by the enumerator and passed to the block, and the block variables are assigned values by the process of array decomposition:
fname, h = enum.next
#=> ["foo/create.json", {}]
fname
#=> "foo/create.json"
h #=> {}
d = File.dirname(fname)
#=> "foo"
b = File.basename(fname)
#=> "create.json"
h[d].update(b=>{})
#=> {"create.json"=>{}}
See Enumerator#next. The next value is generated by enum and passed to the block, the block variables are assigned values and the block calculations are performed. (Notice that the hash being built, h, has been updated in the following.)
fname, h = enum.next
#=> ["foo/update.json", {"foo"=>{"create.json"=>{}}}]
fname
#=> "foo/update.json"
h #=> {"foo"=>{"create.json"=>{}}}
d = File.dirname(fname)
#=> "foo"
b = File.basename(fname)
#=> "update.json"
h[d].update(b=>{})
#=> {"create.json"=>{}, "update.json"=>{}}
Twice more.
fname, h = enum.next
#=> ["bar/update.json", {"foo"=>{"create.json"=>{}, "update.json"=>{}}}]
d = File.dirname(fname)
#=> "bar"
b = File.basename(fname)
#=> "update.json"
h[d].update(b=>{})
#=> {"update.json"=>{}}
fname, h = enum.next
#=> ["bar/create.json",
# {"foo"=>{"create.json"=>{}, "update.json"=>{}}, "bar"=>{"update.json"=>{}}}]
d = File.dirname(fname)
#=> "bar"
b = File.basename(fname)
#=> "create.json"
h[d].update(b=>{})
#=> {"update.json"=>{}, "create.json"=>{}}
h #=> {"foo"=>{"create.json"=>{}, "update.json"=>{}},
# "bar"=>{"update.json"=>{}, "create.json"=>{}}}
1. This is equivalent to defining the hash as follows: g = {}; g.default_proc = proc {|g,k| g[k]={}}. See Hash#default_proc=.
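The equivalence stated in the note can be checked directly. A minimal sketch, with the block parameters renamed to hash and key for readability:

```ruby
# Block form: the block runs only when a missing key is read with [].
g = Hash.new { |hash, key| hash[key] = {} }
g["foo"].update("create.json" => {})

# Equivalent form: start from a plain hash and attach the proc afterwards.
h = {}
h.default_proc = proc { |hash, key| hash[key] = {} }
h["foo"].update("create.json" => {})

p g == h          # both hashes now hold the same nested entry
p g.key?("bar")   # key? does not trigger the default block
```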
An alternative to regexp:
output =
Dir.glob('*/*.json', base: 'app').
group_by(&File::method(:dirname)).
transform_values { |files|
files.each_with_object({}) { |file, hash|
hash[File.basename(file)] = File.read(file)
}
}
Note the base: keyword argument to Dir.glob (or Pathname.glob, for that matter), which simplifies things as we don't need to strip the app prefix; also, for the purposes of OP's question there only needs to be one directory level, hence * instead of **.
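To try the base: approach without touching a real project, the directory layout from the question can be recreated in a temporary directory. This is a sketch under assumptions: tree_for is a hypothetical helper, the values are empty hashes as in the question's expected output (rather than file contents), and Dir.glob with base: needs Ruby 2.5 or later.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: map each sub-directory of base to a hash of its JSON files.
def tree_for(base)
  Dir.glob('*/*.json', base: base).sort.each_with_object({}) do |path, out|
    dir, file = File.split(path)      # "foo/create.json" -> ["foo", "create.json"]
    (out[dir] ||= {})[file] = {}
  end
end

result = Dir.mktmpdir do |root|
  app = File.join(root, 'app')
  %w[foo bar].each do |dir|
    FileUtils.mkdir_p(File.join(app, dir))
    %w[create.json update.json].each do |name|
      File.write(File.join(app, dir, name), '{}')
    end
  end
  tree_for(app)                       # the block's value is returned by mktmpdir
end

p result
```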

How can I count word frequency and append the results every time I run the script in Ruby

Given an array such as
["one", "two", "three", "three"]
I want to open a file and write
{"one" => 1, "two" => 1, "three" => 2}
Then, on the next run, with an array such as
["one", "two"]
I want to open the same file and look up each word: if it exists, add 1 to its count; otherwise create a new entry:
{"one" => 2, "two" => 2, "three" => 2}
This should do :
hash = ["one", "two", "three", "three"]
frequency_file = 'frequency.dat'
if File.exist?(frequency_file)
old_frequency = File.open(frequency_file, 'rb') {|f| Marshal.load(f.read)}
else
old_frequency = {}
end
old_frequency.default = 0
frequency = hash.group_by{|name| name}.map{|name, list| [name,list.count+old_frequency[name]]}.to_h
File.open(frequency_file,'wb'){|f| f.write(Marshal.dump(frequency))}
puts frequency.inspect
# first run  => {"one"=>1, "two"=>1, "three"=>2}
# second run => {"one"=>2, "two"=>2, "three"=>4}
If you prefer a human-readable file :
require 'yaml'
hash = ["one", "two", "three", "three"]
frequency_file = 'frequency.yml'
if File.exist?(frequency_file)
old_frequency = YAML.load_file(frequency_file)
else
old_frequency = {}
end
old_frequency.default = 0
frequency = hash.group_by{|name| name}.map{|name, list| [name,list.count+old_frequency[name]]}.to_h
File.open(frequency_file,'w'){|f| f.write frequency.to_yaml}
puts frequency.inspect
# first run  => {"one"=>1, "two"=>1, "three"=>2}
# second run => {"one"=>2, "two"=>2, "three"=>4}
Here are some variations that'd do it:
ary = %w[a b a c a b]
ary.group_by { |v| v }.map{ |k, v| [k, v.size] }.to_h # => {"a"=>3, "b"=>2, "c"=>1}
ary.each_with_object(Hash.new(0)) { |v, h| h[v] += 1} # => {"a"=>3, "b"=>2, "c"=>1}
ary.uniq.map { |v| [v, ary.count(v)] }.to_h # => {"a"=>3, "b"=>2, "c"=>1}
Since they're all about the same length it becomes important to know which is the fastest.
require 'fruity'
ary = %w[a b a c a b] * 1000
compare do
group_by { ary.group_by { |v| v }.map{ |k, v| [k, v.size] }.to_h }
each_with_object { ary.each_with_object(Hash.new(0)) { |v, h| h[v] += 1} }
uniq_map { ary.uniq.map { |v| [v, ary.count(v)] }.to_h }
end
# >> Running each test 4 times. Test will take about 1 second.
# >> group_by is faster than uniq_map by 30.000000000000004% ± 10.0%
# >> uniq_map is faster than each_with_object by 19.999999999999996% ± 10.0%
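On Ruby 2.7 or later, Enumerable#tally is one more variation. It was not part of the benchmark above, so no speed claim is made for it here. The sketch below also shows how counts from an earlier run could be combined with new ones using Hash#merge with a block; the variable names previous, words and updated are illustrative.

```ruby
ary = %w[a b a c a b]
counts = ary.tally                     # one count per distinct element

# Combining with counts from an earlier run (values taken from the question):
previous = { "one" => 1, "two" => 1, "three" => 2 }
words    = %w[one two]

# The block is called only for keys present in both hashes.
updated = previous.merge(words.tally) { |_word, old, new| old + new }

p counts
p updated
```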
How to persist the data and append to it is a separate question, and the answer depends on the size of the data you're checking and how fast you need the code to run. Databases can do this sort of check extremely fast, as they have code optimized to search and count unique occurrences of records. Even SQLite should have no problem doing this. Using an ORM like Sequel or ActiveRecord would make it painless to talk to the DB and to scale or port to a more capable database manager if needed.
Writing to a local file is OK if you only need to update occasionally, or you don't have a big list of words, and you don't need to share the information with other pieces of code or with another machine.
Reading a file to recover the hash and then incrementing it assumes words will never be deleted, only added. I've written a lot of document analysis code and that case hasn't occurred, so I'd recommend thinking about long-term use before settling on your particular path.
Could you put the string representation of a hash (the first line of the file) in a separate (e.g., JSON) file? If so, consider something like the following.
First let's create a JSON file for the hash and a second file, the words of which are to be counted.
require 'json'
FName = "lucy"
JSON_Fname = "hash_counter.json"
File.write(JSON_Fname, JSON.generate({"one" => 1, "two" => 1, "three" => 2}))
#=> 27
File.write(FName, '["one", "two", "three", "three"]')
#=> 32
First read the JSON file, parse the hash and give h a default value of zero (see note 1).
h = JSON.parse(File.read(JSON_Fname))
#=> {"one"=>1, "two"=>1, "three"=>2}
h.default = 0
(See Hash#default=). Then read the other file and update the hash.
File.read(FName).downcase.scan(/[[:alpha:]]+/).each { |w| h[w] += 1 }
h #=> {"one"=>2, "two"=>2, "three"=>4}
Lastly, write the hash h to the JSON file (as I did above; see note 2).
1 Ruby expands h[w] += 1 to h[w] = h[w] + 1 before parsing the expression. If h does not have a key w, Hash#[] returns the hash's default value, if it has one. Here h["cat"] #=> 0 since h has no key "cat" and the default has been set to zero. The expression therefore becomes h[w] = 0 + 1 #=> 1. Note that the method on the left of the equality is Hash#[]=, which is why the default value does not apply there.
2 To be safe, write the new JSON string to a temporary file, delete the JSON file, then rename the temporary file to the former JSON file name.
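The safe-write idea in note 2 might look like the sketch below. The helper names save_counts and load_counts and the ".tmp" suffix are assumptions made for illustration. It also deviates slightly from the note: it skips the explicit delete, because File.rename replaces an existing target on POSIX systems; behavior on other platforms should be checked.

```ruby
require 'json'
require 'tmpdir'

# Hypothetical helper: write to "<path>.tmp" first, then rename over the target,
# so a crash during the write never leaves a half-written JSON file behind.
def save_counts(path, counts)
  tmp_path = "#{path}.tmp"
  File.write(tmp_path, JSON.generate(counts))
  File.rename(tmp_path, path)
  counts
end

# Hypothetical helper: read the counts back, defaulting missing words to zero.
def load_counts(path)
  return Hash.new(0) unless File.exist?(path)
  JSON.parse(File.read(path)).tap { |h| h.default = 0 }
end

reloaded = Dir.mktmpdir do |dir|
  path = File.join(dir, "hash_counter.json")
  save_counts(path, { "one" => 1, "two" => 1, "three" => 2 })
  counts = load_counts(path)
  %w[one two].each { |w| counts[w] += 1 }
  save_counts(path, counts)
  load_counts(path)
end

p reloaded
```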

map array with condition

I have array like
strings = ["by_product[]=1", "by_product[]=2", "page=1", "per_page=10", "select[]=current", "select[]=requested", "select[]=original"]
which is array of params from request
Then there is code that generates hash from array
arrays = strings.map do |segment|
k,v = segment.split("=")
[k, v && CGI.unescape(v)]
end
Hash[arrays]
Current output -
"by_product[]": "2",
"page":"1",
"per_page":"10",
"select[]":"original"
Expected output -
"by_product[]":"1, 2",
"page":"1",
"per_page":"10",
"select[]":"current, requested, original"
The problem is that after the split there are several by_product[] entries, and the last one overrides the others, so instead of a hash with an array as the value for these params I'm getting only the last value. I'm not sure how to fix it. Any ideas, or at least an algorithm?
So try this:
hash = {}
arrays = strings.map do |segment|
k,v = segment.split("=")
hash[k]||=[]
hash[k] << v
end
output is
1.9.3-p547 :025 > hash
=> {"by_product[]"=>["1", "2"], "page"=>["1"], "per_page"=>["10"], "select[]"=>["current", "requested", "original"]}
or if you want just strings, start again from an empty hash and do
hash = {}
arrays = strings.map do |segment|
k,v = segment.split("=")
hash[k].nil? ? hash[k] = v : hash[k] << ", " + v
end
Don't reinvent the wheel, CGI and Rack can already handle query strings.
Assuming your strings array comes from a single query string:
query = "by_product[]=1&by_product[]=2&page=1&per_page=10&select[]=current&select[]=requested&select[]=original"
you can use CGI::parse: (all values as arrays)
require 'cgi'
CGI.parse(query)
#=> {"by_product[]"=>["1", "2"], "page"=>["1"], "per_page"=>["10"], "select[]"=>["current", "requested", "original"]}
or Rack::Utils.parse_query: (arrays where needed)
require 'rack'
Rack::Utils.parse_query(query)
# => {"by_product[]"=>["1", "2"], "page"=>"1", "per_page"=>"10", "select[]"=>["current", "requested", "original"]}
or Rack::Utils.parse_nested_query: (values without [] suffix)
require 'rack'
Rack::Utils.parse_nested_query(query)
# => {"by_product"=>["1", "2"], "page"=>"1", "per_page"=>"10", "select"=>["current", "requested", "original"]}
And if these are parameters for a Rails controller, you can just use params.
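If neither Rack nor Rails is available, the standard library's URI.decode_www_form can produce the comma-joined strings the question asks for. This is a sketch that assumes the strings array can simply be joined with "&" to rebuild the query; the variable name joined is illustrative.

```ruby
require 'uri'

strings = ["by_product[]=1", "by_product[]=2", "page=1", "per_page=10",
           "select[]=current", "select[]=requested", "select[]=original"]

# decode_www_form returns [key, value] pairs with percent-encoding already decoded.
pairs = URI.decode_www_form(strings.join("&"))

# Group the pairs by key, then join the values that share a key.
joined = pairs.group_by(&:first)
              .map { |key, kvs| [key, kvs.map(&:last).join(", ")] }
              .to_h

p joined
```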
this will also work :
strings.inject({}){ |hash, string|
key, value = string.split('=');
hash[key] = (hash[key]|| []) << value;
hash;
}
output :
{"by_product[]"=>["1", "2"], "page"=>["1"], "per_page"=>["10"], "select[]"=>["current", "requested", "original"]}
As simple as that
array.map { |record| record*3 if condition }
Here record*3 stands for whatever operation you want to apply while mapping. Note that elements for which the condition is false are mapped to nil.
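A small sketch of what map with a trailing if actually returns, and two ways to drop the nil entries. The sample numbers and the even? condition are made up for illustration; filter_map needs Ruby 2.7 or later.

```ruby
numbers = [1, 2, 3, 4]

# map keeps one slot per element, so skipped elements become nil.
with_nils = numbers.map { |n| n * 3 if n.even? }

# Either remove the nils afterwards...
compacted = numbers.map { |n| n * 3 if n.even? }.compact

# ...or map and filter in one pass.
filtered = numbers.filter_map { |n| n * 3 if n.even? }

p with_nils
p compacted
p filtered
```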

Ruby array of hash. group_by and modify in one line

I have an array of hashes, something like
[ {:type=>"Meat", :name=>"one"},
{:type=>"Meat", :name=>"two"},
{:type=>"Fruit", :name=>"four"} ]
and I want to convert it to this
{ "Meat" => ["one", "two"], "Fruit" => ["four"]}
I tried group_by but then i got this
{ "Meat" => [{:type=>"Meat", :name=>"one"}, {:type=>"Meat", :name=>"two"}],
"Fruit" => [{:type=>"Fruit", :name=>"four"}] }
and then I can't modify it to leave just the name and not the full hash. I need to do this in one line because it is for a grouped_options_for_select on a Rails form.
array.group_by{|h| h[:type]}.each{|_, v| v.replace(v.map{|h| h[:name]})}
# => {"Meat"=>["one", "two"], "Fruit"=>["four"]}
Following steenslag's suggestion:
array.group_by{|h| h[:type]}.each{|_, v| v.map!{|h| h[:name]}}
# => {"Meat"=>["one", "two"], "Fruit"=>["four"]}
In a single iteration over initial array:
array.inject(Hash.new([])) { |h, a| h[a[:type]] += [a[:name]]; h }
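The single-iteration version relies on += rather than <<, and the difference matters with Hash.new([]). The sketch below contrasts the two; the names ok, broken and safe are illustrative, and the block-based default in the last variant is a swapped-in alternative, not part of the answer above.

```ruby
array = [{ type: "Meat", name: "one" },
         { type: "Meat", name: "two" },
         { type: "Fruit", name: "four" }]

# += builds a new array and assigns it to the key, so the keys get stored.
ok = array.inject(Hash.new([])) { |h, a| h[a[:type]] += [a[:name]]; h }

# << mutates the single shared default array and never stores a key.
broken = array.inject(Hash.new([])) { |h, a| h[a[:type]] << a[:name]; h }

# A block-based default creates and stores a fresh array per key, so << is fine.
safe = array.each_with_object(Hash.new { |h, k| h[k] = [] }) do |a, h|
  h[a[:type]] << a[:name]
end

p ok
p broken          # empty: every name ended up in the shared default array
p broken.default
p safe
```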
Using Hash#transform_values (from ActiveSupport, and built into Ruby itself since 2.4):
array.group_by{ |h| h[:type] }.transform_values{ |hs| hs.map{ |h| h[:name] } }
#=> {"Meat"=>["one", "two"], "Fruit"=>["four"]}
array = [{:type=>"Meat", :name=>"one"}, {:type=>"Meat", :name=>"two"}, {:type=>"Fruit", :name=>"four"}]
array.inject({}) {|memo, value| (memo[value[:type]] ||= []) << value[:name]; memo}
I would do as below :
hsh =[{:type=>"Meat", :name=>"one"}, {:type=>"Meat", :name=>"two"}, {:type=>"Fruit", :name=>"four"}]
p Hash[hsh.group_by{|h| h[:type] }.map{|k,v| [k,v.map{|h|h[:name]}]}]
# >> {"Meat"=>["one", "two"], "Fruit"=>["four"]}
@ArupRakshit's answer, slightly modified (the function has been added for the sake of clarity in the final example):
def group(list, by, at)
list.group_by { |h| h[by] }.map { |k,v| [ k , v.map {|h| h[at]} ] }.to_h
end
sample =[
{:type=>"Meat", :name=>"one", :size=>"big" },
{:type=>"Meat", :name=>"two", :size=>"small" },
{:type=>"Fruit", :name=>"four", :size=>"small" }
]
group(sample, :type, :name) # => {"Meat"=>["one", "two"], "Fruit"=>["four"]}
group(sample, :size, :name) # => {"big"=>["one"], "small"=>["two", "four"]}
Please notice that, although it is not mentioned in the question, you may want to keep the original sample unchanged. Some answers take care of this, others do not.
After grouping (list.group_by {...}) the part that does the transformation (without modifying the original sample's values) is:
.map { |k,v| [ k , v.map {|h| h[at]} ] }.to_h
Some hints:
1. we iterate over the pairs of the Hash of groups (first map), where
2. for each iteration, we receive [group_key, array] and return an Array of [group_key, new_array] (outer block),
3. and finally to_h transforms the Array of Arrays into the Hash (this [[gk1,arr1],[gk2,arr2]...] into this { gk1 => arr1, gk2 => arr2, ...})
There is one step not explained in step (2) above: new_array is built by v.map {|h| h[at]}, which just extracts the value stored under the key at from each original Hash (h) in the array (so we move from an Array of Hashes to an Array of plain values).
Hope that helps others to understand the example.
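The same chain can be split into its intermediate values, which may make the hints easier to follow. A sketch with illustrative variable names (groups, pairs, result), using :type and :name in place of by and at:

```ruby
sample = [
  { type: "Meat",  name: "one",  size: "big" },
  { type: "Meat",  name: "two",  size: "small" },
  { type: "Fruit", name: "four", size: "small" }
]

# Grouping: a Hash from each :type to the Array of full Hashes of that type.
groups = sample.group_by { |h| h[:type] }

# Steps 1 and 2: one [group_key, new_array] pair per group.
pairs = groups.map { |k, v| [k, v.map { |h| h[:name] }] }

# Step 3: Array of pairs into a Hash.
result = pairs.to_h

p groups
p pairs
p result
p sample.first   # the original hashes are left untouched
```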

Ruby mixed array to nested hash

I have a Ruby array whose elements alternate between Strings and Hashes. For example-
["1234", Hash#1, "5678", Hash#2]
I would like to create a nested hash structure from this. So,
hash["1234"]["key in hash#1"] = value
hash["5678"]["key in hash#2"] = value
Does anyone have/know a nice way of doing this? Thank you.
Simply use
hsh = Hash[*arr] #suppose arr is the array you have
It will slice 2 at a time and convert into hash.
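A short sketch of the pairing behavior, with made-up hashes standing in for Hash#1 and Hash#2. It also shows each_slice(2).to_h (Ruby 2.1+) as an equivalent, and that an array with an odd number of elements raises an ArgumentError.

```ruby
arr = ["1234", { a: 1 }, "5678", { b: 2 }]

# The splat passes the elements as alternating key, value arguments.
splat = Hash[*arr]

# Equivalent without the splat: explicit pairs, then to_h.
sliced = arr.each_slice(2).to_h

p splat
p splat["1234"][:a]

# An odd number of elements cannot be paired up.
odd_error =
  begin
    Hash[*%w[a b c]]
    nil
  rescue ArgumentError => e
    e.class
  end
p odd_error
```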
I don't think there is a method on array to do this directly. The following code works and is quite easy to read.
hsh = {}
ary.each_slice(2) do |a, b|
hsh[a] = b
end
# Now `hsh` is as you want it to be
Guessing at what you want, since "key in hash#1" is not clear at all, nor have you defined what hash or value should be:
value = 42
h1 = {a:1}
h2 = {b:2}
a = ["1234",h1,"5678",h2]
a.each_slice(2).each{ |str,h| h[str] = value }
p h1, #=> {:a=>1, "1234"=>42}
h2 #=> {:b=>2, "5678"=>42}
Alternatively, perhaps you mean this:
h1 = {a:1}
h2 = {b:2}
a = ["1234",h1,"5678",h2]
hash = Hash[ a.each_slice(2).to_a ]
p hash #=> {"1234"=>{:a=>1}, "5678"=>{:b=>2}}
p hash["1234"][:a] #=> 1
let's guess, using facets just for fun:
require 'facets'
xs = ["1234", {:a => 1, :b => 2}, "5678", {:c => 3}]
xs.each_slice(2).mash.to_h
#=> {"1234"=>{:a=>1, :b=>2}, "5678"=>{:c=>3}}

Resources