Ruby split string - ruby-on-rails

String = "Mod1:10022932,10828075,5946410,13321905,5491120,5030731|Mod2:22704455,22991440,22991464,21984312,21777721,21777723,21889761,21939852,23091478,22339903,23091485,22099714,21998260,22364832,21939858,21944274,21944226,22800221,22704443,21777728,21777719,21678184,21998265,21834900,21984331,22704454,21998261,21944214,21862610,21836482|Mod3:10828075,13321905,5491120,5946410,5030731,15806212,4100566,4787137,2625339,2408317,2646868,19612047,2646862,11983534,8591489,19612048,10249319,14220471,15806209,13330887,15075124,17656842,3056657,5086273|Mod4:10828075,5946410,13321905,5030731,5491120,4787137,4100566,15806212,2625339,3542205,2408317,2646862,2646868|Mod5:10022932;0.2512,10828075;0.2093,5030731;0.1465,5946410;0.1465,4787137;0.1465,2625339;0.0143,5491120;0.0143,13321905;0.0143,3542205;0.0143,15806212;0.0119,4100566;0.0119,19612047;0.0100,2408317;0.0100"
How can I split it so that I get each title (Mod1, Mod2, ...) and the IDs that belong to each title?
This is what I've tried so far, but it drops everything after the first pipe, which I don't want.
mod_name = string.split(":")[0]
mod_ids = string.split(":")[1] # This gets me the IDs but also includes the |Mod*
ids = mod_ids.split("|").first.strip # Only returns IDs before the first "|"
Desired Output:
I need to save mod_name and mod_ids to their respective columns,
mod_name = #name ("Mod1...Mod2 etc) #string
mod_ids = #ids (All Ids after the ":" in Mod*:) #array

I think this does what you want:
ids = string.split("|").map {|part| [part.split(":")[0], part.split(":")[1].split(/,|;/)]}

There are a couple of ways to do this:
# This will split the string on "|" and ":" and will return:
# %w( Mod1 id1 Mod2 id2 Mod3 id3 ... )
ids = string.split(/[|:]/)
# This will first split on "|", and for each string, split it again on ":" and returns:
# [ %w(Mod1 id1), %w(Mod2 id2), %w(Mod3 id3), ... ]
ids = string.split("|").map { |str| str.split(":") }

If you want a Hash as a result for easy access via the titles, then you could do this:
str.split('|').inject({}){|h,x| k,v = x.split(':'); h[k] = v.split(','); h}
=> {
"Mod1"=>["10022932", "10828075", "5946410", "13321905", "5491120", "5030731"],
"Mod2"=>["22704455", "22991440", "22991464", "21984312", "21777721", "21777723", "21889761", "21939852", "23091478", "22339903", "23091485", "22099714", "21998260", "22364832", "21939858", "21944274", "21944226", "22800221", "22704443", "21777728", "21777719", "21678184", "21998265", "21834900", "21984331", "22704454", "21998261", "21944214", "21862610", "21836482"],
"Mod3"=>["10828075", "13321905", "5491120", "5946410", "5030731", "15806212", "4100566", "4787137", "2625339", "2408317", "2646868", "19612047", "2646862", "11983534", "8591489", "19612048", "10249319", "14220471", "15806209", "13330887", "15075124", "17656842", "3056657", "5086273"],
"Mod4"=>["10828075", "5946410", "13321905", "5030731", "5491120", "4787137", "4100566", "15806212", "2625339", "3542205", "2408317", "2646862", "2646868"],
"Mod5"=>["10022932;0.2512", "10828075;0.2093", "5030731;0.1465", "5946410;0.1465", "4787137;0.1465", "2625339;0.0143", "5491120;0.0143", "13321905;0.0143", "3542205;0.0143", "15806212;0.0119", "4100566;0.0119", "19612047;0.0100", "2408317;0.0100"]
}

Untested:
all_mods = {}
string.split("|").each do |fragment|
mod_fragments = fragment.split(":")
all_mods[mod_fragments[0]] = mod_fragments[1].split(",")
end

What I ended up using, thanks to @tillerjs's help:
data = string.split("|")
data.each do |mod|
module_name = mod.split(":")[0]
recommendations = mod.split(":")[1]
end
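The answers above can be combined into one small helper. This is only a sketch: the name `parse_mods` is hypothetical, and it assumes that the `;0.2512`-style suffixes in Mod5 are scores that should be dropped, which the question does not state.

```ruby
# Hypothetical helper: parses "Name:id,id|Name:id;score,..." into a Hash
# of name => array of ID strings.
def parse_mods(raw)
  raw.split("|").each_with_object({}) do |part, mods|
    # Limit of 2 keeps everything after the first ":" together.
    name, ids = part.split(":", 2)
    # Assumption: anything after ";" is a score and is not needed.
    mods[name] = ids.to_s.split(",").map { |entry| entry.split(";").first }
  end
end

sample = "Mod1:10022932,10828075|Mod5:10022932;0.2512,10828075;0.2093"
mods = parse_mods(sample)
mods.each { |name, ids| puts "#{name}: #{ids.inspect}" }
```

Each key/value pair can then be saved to the mod_name and mod_ids columns inside the `each` loop.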

Related

Replace until all occurrences are removed

I have the following strings:
",||||||||||||||"
",|||||a|||||,|"
I would like every occurrence of ",|" to be replaced with ",,", repeatedly, until none are left.
The output should be the following:
",,,,,,,,,,,,,,,"
",,,,,,a|||||,,"
When I run .gsub(',|', ',,') on the strings, I do not get the desired output:
",,|||||||||||||"
",,||||a|||||,,"
That's because gsub makes only a single pass over the string.
Is there a similar method that runs recursively?
Regular expression matches cannot overlap, and since replacement works on matches, a single gsub pass with that pattern cannot do this. Here are two workarounds:
str = ",|||||a|||||,|"
while str.gsub!(/,\|/, ',,'); end
str = ",|||||a|||||,|"
str.gsub!(/,(\|+)/) { "," * ($1.length + 1) }
smoke_weed_every_day = lambda do |piper|
commatosed = piper.gsub(',|', ',,')
commatosed == piper ? piper : smoke_weed_every_day.(commatosed)
end
smoke_weed_every_day.(",||||||||||||||") # => ",,,,,,,,,,,,,,,"
smoke_weed_every_day.(",|||||a|||||,|") # => ",,,,,,a|||||,,"
From an old library of mine. This method iterates until the block's output is equal to its input:
def loop_until_convergence(x)
x = yield(previous = x) until previous == x
x
end
puts loop_until_convergence(',||||||||||||||') { |s| s.gsub(',|', ',,') }
# ",,,,,,,,,,,,,,,"
puts loop_until_convergence(',|||||a|||||,|') { |s| s.gsub(',|', ',,') }
# ",,,,,,a|||||,,"
As a bonus, you can calculate a square root in very few iterations:
def root(n)
loop_until_convergence(1) { |x| 0.5 * (x + n / x) }
end
p root(2)
# 1.414213562373095
p root(3)
# 1.7320508075688772
As with @Amandan's second solution, there is no need to iterate until no further changes are made.
COMMA = ','
PIPE = '|'
def replace_pipes_after_comma(str)
run = false
str.gsub(/./) do |s|
case s
when PIPE
run ? COMMA : PIPE
when COMMA
run = true
COMMA
else
run = false
s
end
end
end
replace_pipes_after_comma ",||||||||||||||"
#=> ",,,,,,,,,,,,,,,"
replace_pipes_after_comma ",|||||a|||||,|"
#=> ",,,,,,a|||||,,"
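The same single-pass idea can probably be expressed more compactly with a lookbehind. The sketch below is not taken from any of the answers above; the method name `fill_pipes_after_comma` is hypothetical.

```ruby
# Hypothetical helper: replaces each run of pipes that directly follows a
# comma with the same number of commas, in a single gsub pass.
def fill_pipes_after_comma(str)
  # (?<=,) asserts a preceding comma without consuming it;
  # \|+ then matches the whole run of pipes after it.
  str.gsub(/(?<=,)\|+/) { |run| "," * run.length }
end

puts fill_pipes_after_comma(",||||||||||||||")
puts fill_pipes_after_comma(",|||||a|||||,|")
```

Because the whole run is matched at once, a pipe that only becomes "preceded by a comma" after an earlier replacement is already covered, so no loop is needed.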

constructing a new hash from the given values

I'm lost trying to achieve the following. I have tried all day; please help.
I HAVE
h = {
"kv1001"=> {
"impressions"=>{"b"=>0.245, "a"=>0.754},
"visitors" =>{"b"=>0.288, "a"=>0.711},
"ctr" =>{"b"=>0.003, "a"=>0.003},
"inScreen"=>{"b"=>3.95, "a"=>5.031}
},
"kv1002"=> {
"impressions"=>{"c"=>0.930, "d"=>0.035, "a"=>0.004, "b"=>0.019,"e"=>0.010},
"visitors"=>{"c"=>0.905, "d"=>0.048, "a"=>0.005, "b"=>0.026, "e"=>0.013},
"ctr"=>{"c"=>0.003, "d"=>0.006, "a"=>0.004, "b"=>0.003, "e"=>0.005},
"inScreen"=>{"c"=>4.731, "d"=>4.691, "a"=>5.533, "b"=>6.025, "e"=>5.546}
}
}
MY GOAL
{
"segment"=>"kv1001=a",
"impressions"=>"0.754",
"visitors"=>"0.711",
"inScreen"=>"5.031",
"ctr"=>"0.003"
}, {
"segment"=>"kv1001=b",
"impressions"=>"0.245",
"visitors"=>"0.288",
"inScreen"=>"3.95",
"ctr"=>"0.003"
}, {
"segment"=>"kv1002=a",
"impressions"=>"0.004"
#... etc
}
My goal is to create one hash per segment, such as 'kv1001=a', and copy that letter's values into it under the keys impressions, visitors, etc. The MY GOAL example above shows the format.
The segment name "kv1001=a" must be constructed from the hash itself: kv1001 is the outer key and a is a letter inside the nested hashes.
I have solved this now
data_final = []
h.each do |group,val|
a = Array.new(26){{}}
val.values.each_with_index do |v, i|
keys = val.keys
segment_count = v.keys.length
(0..segment_count-1).each do |n|
a0 = {"segment" => "#{group}=#{v.to_a[n][0]}", keys[i] => v.to_a[n][1]}
a[n].merge! a0
if a[n].count > 4
data_final << a[n]
end
end
end
end
Here's a simpler version
h.flat_map do |segment, attrs|
letters = attrs.values.flat_map(&:keys).uniq
# create a segment entry for each unique letter
letters.map do |letter|
seg = {"segment" => "#{segment}=#{letter}"}
seg.merge Hash[attrs.keys.map {|key| [key,attrs[key][letter]]}]
end
end
Output:
[{"segment"=>"kv1001=b",
"impressions"=>0.245,
"visitors"=>0.288,
"ctr"=>0.003,
"inScreen"=>3.95},
{"segment"=>"kv1001=a",
"impressions"=>0.754,
"visitors"=>0.711,
"ctr"=>0.003,
"inScreen"=>5.031},
{"segment"=>"kv1002=c",
"impressions"=>0.93,
"visitors"=>0.905,
"ctr"=>0.003,
"inScreen"=>4.731},
{"segment"=>"kv1002=d",
"impressions"=>0.035,
"visitors"=>0.048,
"ctr"=>0.006,
"inScreen"=>4.691},
{"segment"=>"kv1002=a",
"impressions"=>0.004,
"visitors"=>0.005,
"ctr"=>0.004,
"inScreen"=>5.533},
{"segment"=>"kv1002=b",
"impressions"=>0.019,
"visitors"=>0.026,
"ctr"=>0.003,
"inScreen"=>6.025},
{"segment"=>"kv1002=e",
"impressions"=>0.01,
"visitors"=>0.013,
"ctr"=>0.005,
"inScreen"=>5.546}]

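Note that the goal in the question shows the values as strings ("0.754"), while the output above keeps them as floats. If strings are really required, a variation along the following lines might work. It is a sketch only: `segment_rows` is a hypothetical name, and the sample hash is a trimmed-down version of the one in the question.

```ruby
# Hypothetical helper: builds one row per (group, letter) pair and
# converts every metric value to a String.
def segment_rows(data)
  data.flat_map do |group, metrics|
    # Every letter that appears under any metric of this group.
    letters = metrics.values.flat_map(&:keys).uniq
    letters.map do |letter|
      row = { "segment" => "#{group}=#{letter}" }
      metrics.each do |name, by_letter|
        # Skip metrics that have no value for this letter.
        row[name] = by_letter[letter].to_s if by_letter.key?(letter)
      end
      row
    end
  end
end

h = {
  "kv1001" => {
    "impressions" => { "b" => 0.245, "a" => 0.754 },
    "visitors"    => { "b" => 0.288, "a" => 0.711 }
  }
}
segment_rows(h).each { |row| p row }
```
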
Ruby way to group anagrams in string array

I implemented a function to group anagrams.
In a nutshell:
input: ['cars', 'for', 'potatoes', 'racs', 'four','scar', 'creams', 'scream']
output: [["cars", "racs", "scar"], ["four"], ["for"], ["potatoes"],["creams", "scream"]]
I would like to know if there is a better way to do this.
I think I used too many iteration constructs: until, select, and delete_if.
Is there any way to combine the select and delete_if statements? That is, can the selected items be deleted automatically?
Code:
def group_anagrams(words)
array = []
until words.empty?
word = words.first
array.push( words.select { |match| word.downcase.chars.sort.join.eql?(match.downcase.chars.sort.join ) } )
words.delete_if { |match| word.downcase.chars.sort.join.eql?(match.downcase.chars.sort.join ) }
end
array
end
Thanks in advance,
Like this:
a = ['cars', 'for', 'potatoes', 'racs', 'four','scar', 'creams', 'scream']
a.group_by { |element| element.downcase.chars.sort }.values
Output is:
[["cars", "racs", "scar"], ["for"], ["potatoes"], ["four"], ["creams", "scream"]]
If you want, you can of course turn this one-liner into a method.
You could use Enumerable#partition instead of the select and delete_if pair. It splits the array into two arrays according to the block: the entries for which the block returns true, and the rest.
def group_anagrams(words)
array = []
until words.empty?
word = words.first
delta, words = words.partition { |match| word.downcase.chars.sort.join.eql?(match.downcase.chars.sort.join) }
array << delta
end
array
end
(untested)
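For comparison, the group_by one-liner from the earlier answer can be wrapped in a method with the same name as the one in the question. This is a minimal sketch; unlike the delete_if version, it should leave the caller's array unchanged.

```ruby
# Groups words that are anagrams of each other, ignoring case.
# The sorted character list acts as the grouping key.
def group_anagrams(words)
  words.group_by { |word| word.downcase.chars.sort }.values
end

words = %w[cars for potatoes racs four scar creams scream]
groups = group_anagrams(words)
p groups
```

group_by keeps the groups in order of first appearance, so the result order follows the input order rather than the order shown in the question.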

How do I strip a URL from a string and place it in an array?

I'm working on building a small script that searches for the 5 most recent pictures tweeted by a service, isolates the URL and puts that URL into an array.
def grabTweets(linkArray) #brings in empty array
tweets = Twitter.search("[pic] "+" url.com/r/", :rpp => 2, :result_type => "recent").map do |status|
tweets = "#{status.text}" #class = string
url_regexp = /http:\/\/\w/ #isolates link
url = tweets.split.grep(url_regexp).to_s #chops off link, turns link to string from an array
#add link to url array
#print linkArray #prints []
linkArray.push(url)
print linkArray
end
end
x = []
timelineTweets = grabTweets(x)
The function is returning things like this: ["[\"http://t.co/6789\"]"]["[\"http://t.co/12345\"]"]
I'm trying to get it to return ["http://t.co/6789", "http://t.co/1245"] but it's not managing that.
Any help here would be appreciated. I'm not sure what I'm doing wrong.
The easiest way to grab URLs in Ruby is to use the URI::extract method. It's a pre-existing wheel that works:
require 'uri'
require 'open-uri'
body = URI.open('http://www.example.com').read # Kernel#open no longer opens URLs as of Ruby 3.0
urls = URI::extract(body)
puts urls
Which returns:
http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd
http://www.w3.org/1999/xhtml
http://www.icann.org/
mailto:iana@iana.org?subject=General%20website%20feedback
Once you have the array you can filter for what you want, or you can give it a list of schemes to extract.
To strip a URL out of a string and push it into a urls array, you can do:
urls = []
if mystring =~ /(http:\/\/[^\s]+)/
urls << $1
end
grep returns an array:
grep(pattern) → array
grep(pattern) {| obj | block } → array
Returns an array of every element in enum for which Pattern === element.
So your odd output is coming from the to_s call that follows your grep. You're probably looking for this:
linkArray += tweets.split.grep(url_regexp)
or if you only want the first URL:
url = tweets.split.grep(url_regexp).first
linkArray << url if(url)
You could also skip the split.grep and use scan:
# \S+ should be good enough for this sort of thing.
linkArray += tweets.scan(%r{https?://\S+})
# or
url = tweets.scan(%r{https?://\S+}).first
linkArray << url if(url)
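Putting the scan approach together, the link collection could be separated from the Twitter call entirely. The sketch below assumes the tweet texts are already available as plain strings; `extract_links` and the sample texts are hypothetical.

```ruby
# Hypothetical helper: collects every http/https URL found in an array
# of strings into one flat array.
def extract_links(texts)
  # scan returns an array per text; flat_map merges them into one array.
  texts.flat_map { |text| text.scan(%r{https?://\S+}) }
end

tweets = [
  "[pic] look http://t.co/6789",
  "no link here",
  "[pic] http://t.co/12345 and https://t.co/abc"
]
links = extract_links(tweets)
p links
```

Returning a new array instead of pushing into an argument avoids the nested-string output seen in the question.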

Split a string into an array of numbers

My string:
>> pp params[:value]
"07016,07023,07027,07033,07036,07060,07062,07063,07065,07066,07076,07081,07083,07088,07090,07092,07201,07202,07203,07204,07205,07206,07208,07901,07922,07974,08812,07061,07091,07207,07902"
How can this become an array of separate numbers, like:
["07016", "07023", "07033" ... ]
result = params[:value].split(/,/)
String#split is what you need
Try this:
arr = "07016,07023,07027".split(",")
Note that what you ask for is not an array of separate numbers, but an array of strings that look like numbers. As noted by others, you can get that with:
arr = params[:value].split(',')
# Alternatively, assuming integers only
arr = params[:value].scan(/\d+/)
If you actually wanted an array of numbers (Integers), you could do it like so:
arr = params[:value].split(',').map{ |s| s.to_i }
# Or, for Ruby 1.8.7+
arr = params[:value].split(',').map(&:to_i)
# Silly alternative
arr = []; params[:value].scan(/\d+/){ |s| arr << s.to_i }
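One caveat if the values are ZIP-code-like, which the leading zeros suggest but the question does not confirm: converting to Integer drops the leading zero. A short sketch with a shortened, made-up input string:

```ruby
value = "07016,07023,08812"

# Strings keep the leading zeros.
strings = value.split(",")

# String#to_i parses as base 10, so the leading zero is lost.
integers = strings.map(&:to_i)

# Zero-pad back to five digits if the original form is needed again.
restored = integers.map { |n| format("%05d", n) }

p strings
p integers
p restored
```

If the values are identifiers rather than quantities, keeping them as strings is probably the safer choice.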
