Regular expression in ruby and matching so many results - ruby-on-rails

I'm trying to write a simple regular expression that extracts the numbers (7 to 14 digits long) that follow a keyword made of the letter g and an id, something like the following:
(g)(\d{1,6})\s+(\d{7,14}\s*)+
Let's assume:
m = (/(g)(\d{1,6})\s+(\d{7,14}\s*)+/i.match("g12 327638474 83873478 2387327683 44 437643673476"))
The result is:
#<MatchData "g12 327638474 83873478 2387327683 " "g" "12" "2387327683 ">
What I need as a final result is to include 327638474, 83873478 and 2387327683, and to exclude 44.
For now I only get the last number, 2387327683, without the previous ones.
Any help is appreciated.
Cheers

Instead of a regex, you can use something like this:
s = "g12 327638474 83873478 2387327683 44 437643673476"
s.split[1..-1].select { |x| (7..14).include?(x.size) }.map(&:to_i)
# => [327638474, 83873478, 2387327683, 437643673476]
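For context on why the original pattern returns only 2387327683: a repeated capture group such as (\d{7,14}\s*)+ keeps only its last iteration. If a regex is still preferred, a minimal sketch using String#scan follows. It assumes that any standalone run of 7 to 14 digits in the string qualifies, which is the same assumption the split-based answer makes.

```ruby
s = "g12 327638474 83873478 2387327683 44 437643673476"

# The \b anchors require the whole digit run to be 7..14 digits long, so "44"
# and the short id in "g12" are skipped rather than partially matched.
numbers = s.scan(/\b\d{7,14}\b/).map(&:to_i)

p numbers
```

Unlike match, scan returns every non-overlapping match, so no repeated capture group is needed.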

Just as a FYI, here is a benchmark showing a bit faster way of accomplishing the selected answer:
require 'ap'
require 'benchmark'
n = 100_000
s = "g12 327638474 83873478 2387327683 44 437643673476"
ap s.split[1..-1].select { |x| (7..14).include? x.size }.map(&:to_i)
ap s.split[1..-1].select { |x| 7 <= x.size && x.size <= 14 }.map(&:to_i)
Benchmark.bm(11) do |b|
  b.report('include?') { n.times { s.split[1..-1].select { |x| (7..14).include? x.size }.map(&:to_i) } }
  b.report('conditional') { n.times { s.split[1..-1].select { |x| 7 <= x.size && x.size <= 14 }.map(&:to_i) } }
end
ruby ~/Desktop/test.rb
[
[0] 327638474,
[1] 83873478,
[2] 2387327683,
[3] 437643673476
]
[
[0] 327638474,
[1] 83873478,
[2] 2387327683,
[3] 437643673476
]
user system total real
include? 1.010000 0.000000 1.010000 ( 1.011725)
conditional 0.830000 0.000000 0.830000 ( 0.825746)
For speed I'll use the conditional test. It's a tiny bit more verbose, but is still easily read.


Opposite of Ruby's number_to_human

I'm looking to work with a dataset of strings that store money amounts in formats like these:
$217.3M
$1.6B
$34M
€1M
€2.8B
I looked at the money gem, but it doesn't look like it converts the "M", "B" and "k" suffixes back to numbers. I'm looking for a gem that does, so I can convert exchange rates and compare quantities. In other words, I need the opposite of the number_to_human method.
I would start with something like this:
MULTIPLIERS = { 'k' => 10**3, 'm' => 10**6, 'b' => 10**9 }
def human_to_number(human)
  number = human[/(\d+\.?)+/].to_f
  # try comes from ActiveSupport; in plain Ruby, &.downcase does the same job
  factor = human[/\w$/].try(:downcase)
  number * MULTIPLIERS.fetch(factor, 1)
end
human_to_number('$217.3M') #=> 217300000.0
human_to_number('$1.6B') #=> 1600000000.0
human_to_number('$34M') #=> 34000000.0
human_to_number('€1M') #=> 1000000.0
human_to_number('€2.8B') #=> 2800000000.0
human_to_number('1000') #=> 1000.0
human_to_number('10.88') #=> 10.88
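The same idea can be sketched without ActiveSupport, assuming Ruby 2.3 or later for the safe-navigation operator. The names parse_human_amount and SUFFIX_MULTIPLIERS are made up for this illustration. The suffix is matched as a single trailing letter, so plain numbers fall back to a multiplier of 1, and the currency symbol is simply ignored.

```ruby
SUFFIX_MULTIPLIERS = { 'k' => 10**3, 'm' => 10**6, 'b' => 10**9 }.freeze

def parse_human_amount(text)
  # First run of digits, optionally followed by a decimal part.
  number = text[/\d+(?:\.\d+)?/].to_f
  # Trailing letter, if any; nil when the string ends with a digit.
  suffix = text[/[a-z]\z/i]&.downcase
  number * SUFFIX_MULTIPLIERS.fetch(suffix, 1)
end

p parse_human_amount('$34M')
p parse_human_amount('$1.6B')
p parse_human_amount('1000')
```

Because the results are floats, comparisons between converted amounts are best done with a tolerance or after rounding.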
I decided not to be lazy and wrote my own function, in case anyone else wants it:
def text_to_money(text)
  text = text.to_s
  multiplier =
    if text =~ /k/i
      1_000
    elsif text =~ /m/i
      1_000_000
    elsif text =~ /b/i
      1_000_000_000
    else
      1
    end
  # Strip everything except digits and the decimal point, so that symbols
  # such as $ or € and thousands separators do not make to_f return 0.0
  num = text.gsub(/[^\d.]/, '').to_f
  # Returns [[first_character, amount]], as the original version did
  [[text[0], num * multiplier]]
end
Thanks for the help!

Ruby: Increasing Efficiency

I am dealing with a large quantity of data and I'm worried about the efficiency of my operations at scale. After benchmarking, the average time to execute this line of code is about 0.004 sec. Its goal is to find the difference between the two values at each array location. In a previous operation, 111.111 was loaded into the arrays at locations that contained invalid data. Due to some weird time-domain issues I couldn't just remove the values, so I needed a distinguishable placeholder; I could probably use nil here instead. Anyway, back to the explanation. This line of code checks that neither array has the 111.111 placeholder at the current location. If the values are valid I perform the mathematical operation; otherwise I want to delete the values (or at least exclude them from the new array I'm writing to). I accomplished this by placing a nil in that location and then compacting the array afterwards.
The time of 0.004sec for 4000 data points in each array isn't terrible but this line of code is executed 25M times. I'm hoping someone might be able to offer some insight into how I might optimize this line of code.
temp_row = row_1.zip(row_2).map do |x, y|
  x == 111.111 || y == 111.111 ? nil : (x - y).abs
end.compact
You are wasting CPU generating nil in the ternary statement, then using compact to remove them. Instead, use reject or select to find elements not containing 111.111 then map or something similar.
Instead of:
row_1 = [1, 111.111, 2]
row_2 = [2, 111.111, 4]
temp_row = row_1.zip(row_2).map do |x, y|
  x == 111.111 || y == 111.111 ? nil : (x - y).abs
end.compact
temp_row # => [1, 2]
I'd start with:
temp_row = row_1.zip(row_2)
  .reject{ |x,y| x == 111.111 || y == 111.111 }
  .map{ |x,y| (x - y).abs }
temp_row # => [1, 2]
Or:
temp_row = row_1.zip(row_2)
  .each_with_object([]) { |(x,y), ary|
    ary << (x - y).abs unless (x == 111.111 || y == 111.111)
  }
temp_row # => [1, 2]
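On Ruby 2.7 or later, Enumerable#filter_map folds the reject and map steps into a single pass. Here is a minimal sketch using the same sample rows; the PLACEHOLDER constant is a name introduced only for this illustration. Whether it is actually faster for your data is something to benchmark rather than assume.

```ruby
PLACEHOLDER = 111.111

row_1 = [1, 111.111, 2]
row_2 = [2, 111.111, 4]

# filter_map keeps only truthy block results. `next` yields nil for pairs
# containing the placeholder, so they are dropped. A difference of 0 is
# truthy in Ruby and is therefore kept.
temp_row = row_1.zip(row_2).filter_map do |x, y|
  next if x == PLACEHOLDER || y == PLACEHOLDER
  (x - y).abs
end

p temp_row # => [1, 2]
```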
Benchmarking with arrays of different sizes is a good way to see how these approaches compare:
require 'benchmark'
DECIMAL_SHIFT = 100
DATA_ARRAY = (1 .. 1000).to_a
ROW_1 = (DATA_ARRAY + [111.111]).shuffle
ROW_2 = (DATA_ARRAY.map{ |i| i * 2 } + [111.111]).shuffle
Benchmark.bm(16) do |b|
  b.report('ternary:') do
    DECIMAL_SHIFT.times do
      ROW_1.zip(ROW_2).map do |x, y|
        x == 111.111 || y == 111.111 ? nil : (x - y).abs
      end.compact
    end
  end
  b.report('reject:') do
    DECIMAL_SHIFT.times do
      ROW_1.zip(ROW_2).reject{ |x,y| x == 111.111 || y == 111.111 }.map{ |x,y| (x - y).abs }
    end
  end
  # The each_with_object timing below was recorded with `ary += [...]`, which
  # builds a new array per element and never updates the accumulator;
  # `<<` appends in place and should be re-measured.
  b.report('each_with_object:') do
    DECIMAL_SHIFT.times do
      ROW_1.zip(ROW_2)
        .each_with_object([]) { |(x,y), ary|
          ary << (x - y).abs unless (x == 111.111 || y == 111.111)
        }
    end
  end
end
# >> user system total real
# >> ternary: 0.240000 0.000000 0.240000 ( 0.244476)
# >> reject: 0.060000 0.000000 0.060000 ( 0.058842)
# >> each_with_object: 0.350000 0.000000 0.350000 ( 0.349363)
Adjust the size of DECIMAL_SHIFT and DATA_ARRAY and the placement of 111.111 and see what happens to get an idea of what expressions work best for your data size and structure and fine-tune the code as necessary.
You can try the parallel gem https://github.com/grosser/parallel and run it on multiple threads

Using Ruby convert numbers to words?

How do I convert numbers to words in Ruby?
I know there is a gem somewhere, but I'm trying to implement it without one. I just need numbers to words in English for integers. I found this, but it is very messy. If you have any idea how to implement a cleaner, easier-to-read solution, please share.
http://raveendran.wordpress.com/2009/05/29/ruby-convert-number-to-english-word/
Here is what I have been working on. I'm having some problems implementing the scales. The code is still a mess; I hope to make it more readable once it works properly.
class Numberswords
def in_words(n)
words_hash = {0=>"zero",1=>"one",2=>"two",3=>"three",4=>"four",5=>"five",6=>"six",7=>"seven",8=>"eight",9=>"nine",
10=>"ten",11=>"eleven",12=>"twelve",13=>"thirteen",14=>"fourteen",15=>"fifteen",16=>"sixteen",
17=>"seventeen", 18=>"eighteen",19=>"nineteen",
20=>"twenty",30=>"thirty",40=>"forty",50=>"fifty",60=>"sixty",70=>"seventy",80=>"eighty",90=>"ninety"}
scale = [000=>"",1000=>"thousand",1000000=>" million",1000000000=>" billion",1000000000000=>" trillion", 1000000000000000=>" quadrillion"]
if words_hash.has_key?(n)
words_hash[n]
#still working on this middle part. Anything above 999 will not work
elsif n>= 1000
print n.to_s.scan(/.{1,3}/) do |number|
print number
end
#print value = n.to_s.reverse.scan(/.{1,3}/).inject([]) { |first_part,second_part| first_part << (second_part == "000" ? "" : second_part.reverse.to_i.in_words) }
#(value.each_with_index.map { |first_part,second_part| first_part == "" ? "" : first_part + scale[second_part] }-[""]).reverse.join(" ")
elsif n <= 99
return [words_hash[n - n%10],words_hash[n%10]].join(" ")
else
words_hash.merge!({ 100=>"hundred" })
([(n%100 < 20 ? n%100 : n.to_s[2].to_i), n.to_s[1].to_i*10, 100, n.to_s[0].to_i]-[0]-[10])
.reverse.map { |num| words_hash[num] }.join(" ")
end
end
end
#test code
test = Numberswords.new
print test.in_words(200)
My take on this:
def in_words(int)
  # Ordered from largest to smallest so the first divisor that fits wins
  numbers_to_name = {
    1000000 => "million",
    1000 => "thousand",
    100 => "hundred",
    90 => "ninety",
    80 => "eighty",
    70 => "seventy",
    60 => "sixty",
    50 => "fifty",
    40 => "forty",
    30 => "thirty",
    20 => "twenty",
    19 => "nineteen",
    18 => "eighteen",
    17 => "seventeen",
    16 => "sixteen",
    15 => "fifteen",
    14 => "fourteen",
    13 => "thirteen",
    12 => "twelve",
    11 => "eleven",
    10 => "ten",
    9 => "nine",
    8 => "eight",
    7 => "seven",
    6 => "six",
    5 => "five",
    4 => "four",
    3 => "three",
    2 => "two",
    1 => "one"
  }
  str = ""
  numbers_to_name.each do |num, name|
    if int == 0
      return str
    elsif int.to_s.length == 1 && int/num > 0
      return str + "#{name}"
    elsif int < 100 && int/num > 0
      return str + "#{name}" if int%num == 0
      return str + "#{name} " + in_words(int%num)
    elsif int/num > 0
      # rstrip drops the trailing space left behind for round numbers such as 100
      return (str + in_words(int/num) + " #{name} " + in_words(int%num)).rstrip
    end
  end
end
puts in_words(4) == "four"
puts in_words(27) == "twenty seven"
puts in_words(102) == "one hundred two"
puts in_words(38_079) == "thirty eight thousand seventy nine"
puts in_words(82102713) == "eighty two million one hundred two thousand seven hundred thirteen"
Have you considered humanize?
https://github.com/radar/humanize
The simple answer: use the humanize gem and you will get the desired output.
Install it directly
gem install humanize
Or add it to your Gemfile
gem 'humanize'
And you can use it
require 'humanize'
1.humanize #=> 'one'
345.humanize #=> 'three hundred and forty-five'
1723323.humanize #=> 'one million, seven hundred and twenty-three thousand, three hundred and twenty-three'
If you are using this in Rails, you can use it directly.
NOTE: As sren mentioned in the comments, the humanize method provided by ActiveSupport is different from the one provided by the humanize gem.
You can also use the to_words gem.
This gem converts integers into words, e.g.
1.to_words # one
100.to_words # one hundred
101.to_words # one hundred and one
It also converts negative numbers.
I can see what you're looking for, and you may wish to check out this StackOverflow post: Number to English Word Conversion Rails
Here it is in summary:
No, you have to write a function yourself. The closest thing to what
you want is number_to_human, but that does not convert 1 to One.
Here are some URLs that may be helpful:
http://codesnippets.joyent.com/posts/show/447
http://raveendran.wordpress.com/2009/05/29/ruby-convert-number-to-english-word/
http://deveiate.org/projects/Linguistics/
I am not quite sure if this works for you. The method can be called like this:
n2w(33123) {|i| puts i unless i.to_s.empty?}
Here is the method. (I have not tested it fully. I think it works up to a million. The code is ugly, and there is a lot of room for refactoring.)
def n2w(n)
words_hash = {0=>"zero",1=>"one",2=>"two",3=>"three",4=>"four",5=>"five",6=>"six",7=>"seven",8=>"eight",9=>"nine",
10=>"ten",11=>"eleven",12=>"twelve",13=>"thirteen",14=>"fourteen",15=>"fifteen",16=>"sixteen",
17=>"seventeen", 18=>"eighteen",19=>"nineteen",
20=>"twenty",30=>"thirty",40=>"forty",50=>"fifty",60=>"sixty",70=>"seventy",80=>"eighty",90=>"ninety"}
scale = {3=>"hundred",4 =>"thousand",6=>"million",9=>"billion"}
if words_hash.has_key?n
yield words_hash[n]
else
ns = n.to_s.split(//)
while ns.size > 0
if ns.size == 2
yield("and")
yield words_hash[(ns.join.to_i) - (ns.join.to_i)%10]
ns.shift
end
if ns.size > 4
yield(words_hash[(ns[0,2].join.to_i) - (ns[0,2].join.to_i) % 10])
else
yield(words_hash[ns[0].to_i])
end
yield(scale[ns.size])
ns.shift
end
end
end
def subhundred(number)
  ones = %w{zero one two three four five six seven eight nine
            ten eleven twelve thirteen fourteen fifteen
            sixteen seventeen eighteen nineteen}
  tens = %w{zero ten twenty thirty forty fifty sixty seventy eighty ninety}
  remainder = number % 100
  return [ones[remainder]] if remainder < 20
  return [tens[remainder / 10]] if remainder % 10 == 0
  [tens[remainder / 10], ones[remainder % 10]]
end
def subthousand(number)
  hundreds = (number % 1000) / 100
  tens = number % 100
  s = []
  s = subhundred(hundreds) + ["hundred"] unless hundreds == 0
  s += ["and"] unless hundreds == 0 || tens == 0
  s += subhundred(tens) unless tens == 0
  # Return the array explicitly; ending on an `unless` modifier would
  # return nil for round hundreds such as 100
  s
end
def decimals(number)
  return [] unless number.to_s['.']
  # Drop trailing zeros from the fractional part, then spell out each digit
  digits = number.to_s.split('.')[1].split('').reverse
  digits = digits.drop_while { |d| d.to_i == 0 }.reverse
  digits = digits.map { |d| subhundred(d.to_i) }.flatten
  digits.empty? ? [] : ["and cents"] + digits
end
def words_from_numbers(number)
  steps = [""] + %w{thousand million billion trillion quadrillion quintillion sextillion}
  result = []
  n = number.to_i
  # Work through the number three digits at a time, from the lowest group up
  steps.each do |step|
    x = n % 1000
    unit = (step == "") ? [] : [step]
    result = subthousand(x) + unit + result unless x == 0
    n = n / 1000
  end
  result = ["zero"] if result.empty?
  result = result + decimals(number)
  result.join(' ').strip
end
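# Delegating wrapper, presumably defined in a different class from the
# helper methods above (which would live in ApplicationHelper)
def words_from_numbers(number)
  ApplicationHelper.words_from_numbers(number)
end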
It's been quite a while since the question was asked. Rails has something built in for this now.
https://api.rubyonrails.org/classes/ActionView/Helpers/NumberHelper.html
number_to_human(1234567) # => "1.23 Million"
number_to_human(1234567890) # => "1.23 Billion"
number_to_human(1234567890123) # => "1.23 Trillion"
number_to_human(1234567890123456) # => "1.23 Quadrillion"
number_to_human(1234567890123456789) # => "1230 Quadrillion"

Rails mapping array of hashes onto single hash

I have an array of hashes like so:
[{"testPARAM1"=>"testVAL1"}, {"testPARAM2"=>"testVAL2"}]
And I'm trying to map this onto single hash like this:
{"testPARAM2"=>"testVAL2", "testPARAM1"=>"testVAL1"}
I have achieved it using
par={}
mitem["params"].each { |h| h.each {|k,v| par[k]=v} }
But I was wondering if it's possible to do this in a more idiomatic way (preferably without using a local variable).
How can I do this?
You could compose Enumerable#reduce and Hash#merge to accomplish what you want.
input = [{"testPARAM1"=>"testVAL1"}, {"testPARAM2"=>"testVAL2"}]
input.reduce({}, :merge)
is {"testPARAM2"=>"testVAL2", "testPARAM1"=>"testVAL1"}
Reducing an array is sort of like sticking a method call between each of its elements.
For example, [1, 2, 3].reduce(0, :+) is like saying 0 + 1 + 2 + 3 and gives 6.
In our case we do something similar, but with the merge method, which merges two hashes.
[{:a => 1}, {:b => 2}, {:c => 3}].reduce({}, :merge)
is {}.merge({:a => 1}).merge({:b => 2}).merge({:c => 3})
is {:a => 1, :b => 2, :c => 3}
How about:
h = [{"testPARAM1"=>"testVAL1"}, {"testPARAM2"=>"testVAL2"}]
r = h.inject(:merge)
Every answer so far advises using Enumerable#reduce (or inject, which is an alias) with Hash#merge. Beware: while clean, concise and readable, this solution is hugely time consuming and has a large memory footprint on large arrays.
I have compiled different solutions and benchmarked them.
Some options
a = [{'a' => {'x' => 1}}, {'b' => {'x' => 2}}]
# to_h
a.to_h { |h| [h.keys.first, h.values.first] }
# each_with_object
a.each_with_object({}) { |x, h| h.store(x.keys.first, x.values.first) }
# each_with_object (nested)
a.each_with_object({}) { |x, h| x.each { |k, v| h.store(k, v) } }
# map.with_object
a.map.with_object({}) { |x, h| h.store(x.keys.first, x.values.first) }
# map.with_object (nested)
a.map.with_object({}) { |x, h| x.each { |k, v| h.store(k, v) } }
# reduce + merge
a.reduce(:merge) # takes far too much time on large arrays because Hash#merge creates a new hash on each iteration
# reduce + merge!
a.reduce(:merge!) # mutates the first hash in a, which is usually unexpected
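To make the mutation caveat concrete, here is a small sketch using the question's sample data. It shows the first element being modified by reduce(:merge!), and two alternatives that leave the inputs untouched: seeding reduce with a fresh hash, and (on Ruby 2.6 or later) passing all hashes to a single Hash#merge call. The variable names are illustrative, and splatting a very large array into one call may not be practical.

```ruby
input = [{ "testPARAM1" => "testVAL1" }, { "testPARAM2" => "testVAL2" }]

# Without an initial value, the first element is the accumulator,
# so merge! writes into it.
mutated = input.map(&:dup)
mutated.reduce(:merge!)
p mutated.first

# Seeding with a fresh hash means only that new hash is modified.
fresh = input.map(&:dup)
merged = fresh.reduce({}, :merge!)
p fresh.first
p merged

# Ruby 2.6+: Hash#merge accepts several hashes in one call.
merged_splat = {}.merge(*input)
p merged_splat
```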
Benchmark script
It's important to use bmbm and not bm, so that the differences are not due to the cost of memory allocation and garbage collection.
require 'benchmark'
a = (1..50_000).map { |x| { "a#{x}" => { 'x' => x } } }
Benchmark.bmbm do |x|
x.report('to_h:') { a.to_h { |h| [h.keys.first, h.values.first] } }
x.report('each_with_object:') { a.each_with_object({}) { |x, h| h.store(x.keys.first, x.values.first) } }
x.report('each_with_object (nested):') { a.each_with_object({}) { |x, h| x.each { |k, v| h.store(k, v) } } }
x.report('map.with_object:') { a.map.with_object({}) { |x, h| h.store(x.keys.first, x.values.first) } }
x.report('map.with_object (nested):') { a.map.with_object({}) { |x, h| x.each { |k, v| h.store(k, v) } } }
x.report('reduce + merge:') { a.reduce(:merge) }
x.report('reduce + merge!:') { a.reduce(:merge!) }
end
Note: I initially tested with a 1_000_000-item array, but since the cost of reduce + merge grows much faster than linearly (each iteration copies the accumulated hash), it took too long to finish.
Benchmark results
50k items array
Rehearsal --------------------------------------------------------------
to_h: 0.031464 0.004003 0.035467 ( 0.035644)
each_with_object: 0.018782 0.003025 0.021807 ( 0.021978)
each_with_object (nested): 0.018848 0.000000 0.018848 ( 0.018973)
map.with_object: 0.022634 0.000000 0.022634 ( 0.022777)
map.with_object (nested): 0.020958 0.000222 0.021180 ( 0.021325)
reduce + merge: 9.409533 0.222870 9.632403 ( 9.713789)
reduce + merge!: 0.008547 0.000000 0.008547 ( 0.008627)
----------------------------------------------------- total: 9.760886sec
user system total real
to_h: 0.019744 0.000000 0.019744 ( 0.019851)
each_with_object: 0.018324 0.000000 0.018324 ( 0.018395)
each_with_object (nested): 0.029053 0.000000 0.029053 ( 0.029251)
map.with_object: 0.021635 0.000000 0.021635 ( 0.021782)
map.with_object (nested): 0.028842 0.000005 0.028847 ( 0.029046)
reduce + merge: 17.331742 6.387505 23.719247 ( 23.925125)
reduce + merge!: 0.008255 0.000395 0.008650 ( 0.008681)
2M items array (excluding reduce + merge)
Rehearsal --------------------------------------------------------------
to_h: 2.036005 0.062571 2.098576 ( 2.116110)
each_with_object: 1.241308 0.023036 1.264344 ( 1.273338)
each_with_object (nested): 1.126841 0.039636 1.166477 ( 1.173382)
map.with_object: 2.208696 0.026286 2.234982 ( 2.252559)
map.with_object (nested): 1.238949 0.023128 1.262077 ( 1.270945)
reduce + merge!: 0.777382 0.013279 0.790661 ( 0.797180)
----------------------------------------------------- total: 8.817117sec
user system total real
to_h: 1.237030 0.000000 1.237030 ( 1.247476)
each_with_object: 1.361288 0.016369 1.377657 ( 1.388984)
each_with_object (nested): 1.765759 0.000000 1.765759 ( 1.776274)
map.with_object: 1.439949 0.029580 1.469529 ( 1.481832)
map.with_object (nested): 2.016688 0.019809 2.036497 ( 2.051029)
reduce + merge!: 0.788528 0.000000 0.788528 ( 0.794186)
Use #inject
hashes = [{"testPARAM1"=>"testVAL1"}, {"testPARAM2"=>"testVAL2"}]
merged = hashes.inject({}) { |aggregate, hash| aggregate.merge hash }
merged # => {"testPARAM1"=>"testVAL1", "testPARAM2"=>"testVAL2"}
Here you can use either inject or reduce from the Enumerable module; they are aliases of each other, so there is no performance benefit to either.
sample = [{"testPARAM1"=>"testVAL1"}, {"testPARAM2"=>"testVAL2"}]
result1 = sample.reduce(:merge)
# {"testPARAM1"=>"testVAL1", "testPARAM2"=>"testVAL2"}
result2 = sample.inject(:merge)
# {"testPARAM1"=>"testVAL1", "testPARAM2"=>"testVAL2"}

Array Merge (Union)

I have two arrays I need to merge, and using the union (|) operator is PAINFULLY slow. Are there any other ways to accomplish an array merge?
Also, the arrays are filled with objects, not strings.
An Example of the objects within the array
#<Article
id: 1,
xml_document_id: 1,
source: "<article><domain>events.waikato.ac</domain><excerpt...",
created_at: "2010-02-11 01:32:46",
updated_at: "2010-02-11 01:41:28"
>
Where source is a short piece of XML.
EDIT
Sorry! By 'merge' I mean I need to not insert duplicates.
A => [1, 2, 3, 4, 5]
B => [3, 4, 5, 6, 7]
A.magic_merge(B) #=> [1, 2, 3, 4, 5, 6, 7]
Bear in mind that the integers are actually Article objects, and the union operator appears to take forever.
Here's a script which benchmarks two merge techniques: using the pipe operator (a1 | a2), and using concatenate-and-uniq ((a1 + a2).uniq). Two additional benchmarks give the time of concatenate and uniq individually.
require 'benchmark'
a1 = []; a2 = []
[a1, a2].each do |a|
1000000.times { a << rand(999999) }
end
puts "Merge with pipe:"
puts Benchmark.measure { a1 | a2 }
puts "Merge with concat and uniq:"
puts Benchmark.measure { (a1 + a2).uniq }
puts "Concat only:"
puts Benchmark.measure { a1 + a2 }
puts "Uniq only:"
b = a1 + a2
puts Benchmark.measure { b.uniq }
On my machine (Ubuntu Karmic, Ruby 1.8.7), I get output like this:
Merge with pipe:
1.000000 0.030000 1.030000 ( 1.020562)
Merge with concat and uniq:
1.070000 0.000000 1.070000 ( 1.071448)
Concat only:
0.010000 0.000000 0.010000 ( 0.005888)
Uniq only:
0.980000 0.000000 0.980000 ( 0.981700)
This shows that the two techniques are very similar in speed, and that uniq is the larger component of the operation. This makes sense intuitively: uniq has to hash and compare every element, whereas concatenation is just a bulk copy.
So, if you really want to speed this up, you need to look at how hash and eql? are implemented for the objects in your arrays, since Array#| and Array#uniq rely on them to detect duplicates. I believe that most of the time is being spent hashing and comparing objects to ensure there are no duplicates in the final array.
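If the Article objects can be identified by their id, one option is to deduplicate on that key instead of on whole objects. A minimal sketch follows, using a hypothetical Struct as a stand-in for the real Article model; whether it helps in practice depends on how the real class implements hashing and equality.

```ruby
# Stand-in for the real model; only the fields needed for the example.
Article = Struct.new(:id, :source)

a = [Article.new(1, "<article>1</article>"), Article.new(2, "<article>2</article>")]
b = [Article.new(2, "<article>2</article>"), Article.new(3, "<article>3</article>")]

# uniq with a block compares the block's return value (here the integer id)
# and keeps the first element seen for each id.
merged = (a + b).uniq { |article| article.id }

p merged.map(&:id) # => [1, 2, 3]
```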
Do you need the items to be in a specific order within the arrays? If not, you may want to check whether using Sets makes it faster.
Update
Adding to another answerer's code:
require "set"
require "benchmark"
a1 = []; a2 = []
[a1, a2].each do |a|
1000000.times { a << rand(999999) }
end
s1, s2 = Set.new, Set.new
[s1, s2].each do |s|
1000000.times { s << rand(999999) }
end
puts "Merge with pipe:"
puts Benchmark.measure { a1 | a2 }
puts "Merge with concat and uniq:"
puts Benchmark.measure { (a1 + a2).uniq }
puts "Concat only:"
puts Benchmark.measure { a1 + a2 }
puts "Uniq only:"
b = a1 + a2
puts Benchmark.measure { b.uniq }
puts "Using sets"
puts Benchmark.measure {s1 + s2}
puts "Starting with arrays, but using sets"
puts Benchmark.measure {s3, s4 = [a1, a2].map{|a| Set.new(a)} ; (s3 + s4)}
gives (for ruby 1.8.7 (2008-08-11 patchlevel 72) [universal-darwin10.0])
Merge with pipe:
1.320000 0.040000 1.360000 ( 1.349563)
Merge with concat and uniq:
1.480000 0.030000 1.510000 ( 1.512295)
Concat only:
0.010000 0.000000 0.010000 ( 0.019812)
Uniq only:
1.460000 0.020000 1.480000 ( 1.486857)
Using sets
0.310000 0.010000 0.320000 ( 0.321982)
Starting with arrays, but using sets
2.340000 0.050000 2.390000 ( 2.384066)
This suggests that sets may or may not be faster, depending on your circumstances (lots of merges or not many merges).
Using the Array#concat method will likely be a lot faster, according to my initial benchmarks using Ruby 1.8.7 (note that concat only appends and does not remove duplicates):
require 'benchmark'
def reset_arrays!
  @array1 = []
  @array2 = []
  [@array1, @array2].each do |array|
    10000.times { array << ActiveSupport::SecureRandom.hex }
  end
end
reset_arrays! && puts(Benchmark.measure { @array1 | @array2 })
# => 0.030000 0.000000 0.030000 ( 0.026677)
reset_arrays! && puts(Benchmark.measure { @array1.concat(@array2) })
# => 0.000000 0.000000 0.000000 ( 0.000122)
Try this and see if this is any faster
a = [1,2,3,3,2]
b = [1,2,3,4,3,2,5,7]
(a+b).uniq
