Reading defined amount of numbers w/o array - ruby-on-rails

In my tasks I often have to read a predefined number of values from a line, and of course I know how they are separated. For example, let's say it's two coordinates separated by a space. Here is what I currently do to read this line:
line = gets.split(' ').map(&:to_i)
It works, but I either have to refer to line[0] and line[1] or add two more lines such as a = line[0].
Is there any way to read a fixed number of values and assign them to variables rather than to an array?
EDIT/TL;DR: In other words, I'm looking for Ruby's equivalent of scanf("%d %d", &x, &y);

Just do this:
x, y = gets.split(' ').map(&:to_i)
puts "x: #{x}"
puts "y: #{y}"
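A minimal sketch of the same idea that can be tried without typing any input. The helper name read_ints and the use of StringIO in place of standard input are assumptions made for illustration; they are not part of the answer above.

```ruby
require 'stringio'

# Hypothetical helper: reads one line from io and returns its
# whitespace-separated fields converted to integers.
def read_ints(io = $stdin)
  io.gets.split.map(&:to_i)
end

# StringIO stands in for the keyboard here.
input = StringIO.new("3 -4\n")
x, y = read_ints(input)
puts "x: #{x}"
puts "y: #{y}"
```

If the line holds more values than there are variables, the extra values are silently dropped; if it holds fewer, the remaining variables are nil.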

If you do not know how many values will be in the string you cannot assign each value to a local variable. That's because you cannot use parallel assignment (see my comment on @K's answer) and, since Ruby v1.9, you cannot define local variables dynamically (e.g., eval("x=1") doesn't work, because a local variable created inside eval is not visible to the surrounding code).
You can assign the values to instance variables dynamically, like so:
arr = gets.chomp.split(/\s+/)
or
arr = gets.split(/\s+/).map(&:to_i)
depending on your needs.
Suppose
arr = [1, 2, 3, 4]
Then:
arr.each_with_index { |x,i| instance_variable_set("@my_var#{i}", x) }
instance_variable_get(:@my_var0) #=> 1
instance_variable_get(:@my_var1) #=> 2
instance_variable_get(:@my_var2) #=> 3
instance_variable_get(:@my_var3) #=> 4
But why would you want to do this?
Edit:
You know that gets will return a string that contains two substrings representing integers, separated by one or more spaces, and you wish to assign those two integers to variables x and y.
I expect you don't mind if unnamed temporary arrays are created, you just don't want any named arrays, such as:
arr = gets.split.map(&:to_i)
x,y = arr
If, however, you want to do this without creating any arrays, this might work:
s = gets #=> "1 -2\n"
x = s[/(-?\d+)\s/,1].to_i #=> 1
y = s[/\s(-?\d+)/,1].to_i #=> -2
I say "might" because I don't know if Ruby is creating any arrays in the background for her own use.
Alternatively, you could use positive lookarounds and no capture groups:
x = s[/-?\d+(?=\s)/].to_i #=> 1
y = s[/(?<=\s)-?\d+/].to_i #=> -2

Related

Ruby/Rails dictionary app - 6 letter words finder that are built of two concatenated smaller words

I need to create functionality which is going to process the dictionary (dictionary.txt file). The goal is to find all six-letter words that are built of two concatenated smaller words e.g.:
con + vex => convex
tail + or => tailor
we + aver => weaver
Of course, there may be some words inside the file that are not 6 letters long, but these can be easily sifted out using a simple method:
def cleanup_file
file_data = File.read('dictionary.txt').split
file_data.reject! { |word| word.size < 6 }
end
But now comes the problem - how to find if the other strings in the array are made of two connected smaller words ?
[Edit]
Sample dictionary.txt file here
This is only a pseudocode solution, but you could:
Iterate every line of the dictionary and store the words in 6 different arrays by the length of each word.
Make sure that all words are downcased, there are no duplicates and all the values are sorted, so later you could properly use .bsearch in the arrays.
Iterate the length-6 array (for example convex) and look for a match of the first character of the current word in the length-1 array (c for the given example) and in the length-5 array (onvex). If there's a match, save the words.
Then keep looking in the length-2 and length-4 arrays for matches (co and nvex correspondingly) and save a match.
Finally, look up both parts of the string in the length-3 array (con and vex) and save any match.
Move on to the next 6-character string until you've finished.
Most likely there are better ways to solve this, like in the first iteration inserting each word in its corresponding array using .bsearch_index to sort and not inserting duplicates in the same iteration, but most of the workload is going to be in the 2nd iteration and binary searches work in O(log n) time, so I guess it should work quick enough.
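The steps above can be sketched roughly as follows. This sketch swaps the sorted per-length arrays and .bsearch for a single Set with hash lookups, which is a different technique from the one described; the method name two_word_compounds and the sample word list are assumptions for illustration.

```ruby
require 'set'

# Hypothetical helper: returns the six-letter words in words that can
# be split into two shorter words that are both present in words.
def two_word_compounds(words)
  unique  = words.map(&:downcase).uniq
  lexicon = unique.to_set
  unique.select do |w|
    # Try every split point: 1+5, 2+4, 3+3, 4+2, 5+1.
    w.length == 6 &&
      (1..5).any? { |i| lexicon.include?(w[0, i]) && lexicon.include?(w[i..-1]) }
  end
end

sample = %w[con vex convex tail or tailor we aver weaver animal]
p two_word_compounds(sample)   # animal has no valid split in this sample
```

A Set lookup takes roughly constant time on average, so the sorting step is not needed here; the binary-search version uses less memory per word but requires the arrays to stay sorted.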
Suppose the dictionary is as follows.
dictionary = [
"a", "abased", "act", "action", "animal", "ape", "apeman",
"art", "assertion", "bar", "barbed", "barhop", "based", "be",
"become", "bed", "come", "hop", "ion", "man"
]
I assume that, like most dictionaries, dictionary is sorted.
First compute the following hash.
by_len = dictionary.each_with_object({}) do |w,h|
len = w.length
(h[len] ||= []) << w if len < 7
end
#=> {1=>["a"],
# 6=>["abased", "action", "animal", "apeman", "barbed",
# "barhop", "become"],
# 3=>["act", "ape", "art", "bar", "bed", "hop", "ion", "man"],
# 5=>["based"],
# 2=>["be"],
# 4=>["come"]}
Each key is a word length (1-6) and each value is an array of words from dictionary whose length is the value of the key.
Next I will define a helper function that returns true or false depending on whether a given array of words (list) contains a given word (word).
def found?(list, word)
w = list.bsearch { |w| w >= word }
w && w == word
end
For example:
found?(by_len[3], "art")
#=> true
found?(by_len[3], "any")
#=> false
See Array#bsearch.
We now extract the words of interest:
by_len[6].select { |w| (1..5).any? { |i|
found?(by_len[i], w[0,i]) && found?(by_len[6-i], w[i..-1]) } }
#=> ["abased", "action", "apeman", "barbed", "barhop", "become"]

Casting a value in Lua table, not a link

I have simplified my code so you can have a better understanding:
x = {}
x["foo"]=1
a = {}
a[1]=x
x["foo"]=2
a[2]=x
print(a[1]["foo"])
print(a[2]["foo"])
The result is:
2
2
But I was expecting:
1
2
I understand that a[1] holds a reference to the table x, not a copy of it. So when I change a value inside that table, a[1] sees the new value.
How can I tell Lua that I want to assign the VALUE and not a reference to an address?
One more thing: if x is a "simple" variable, not a table, the value is copied:
y = {}
x = 1
a = {}
a[1] = x
x = 2
a[2] = x
print(a[1])
print(a[2])
returns
1
2
The Lua manual, last but one paragraph of §2.1, says:
Tables, functions, threads, and (full) userdata values are objects: variables do not actually contain these values, only references to them. Assignment, parameter passing, and function returns always manipulate references to such values; these operations do not imply any kind of copy.

Why does Lua's length (#) operator return unexpected values?

Lua has the # operator to compute the "length" of a table being used as an array.
I checked this operator and I am surprised.
This is the code that I ran under Lua 5.2.3:
t = {};
t[0] = 1;
t[1] = 2;
print(#t); -- 1 aha, Lua counts from one
t[2] = 3;
print(#t); -- 2 three values, but only two are counted
t[4] = 3;
print(#t); -- 4 but 3 is missing?
t[400] = 400;
t[401] = 401;
print(#t); -- still 4, now I am confused?
t2 = {10, 20, nil, 40}
print(#t2); -- 4 but the documentation says this is not a sequence?
Can someone explain the rules?
About tables in general
(oh, can't you just give me an array)
In Lua, a table is the single general-purpose data structure. Table keys can be of any type, such as number, string, or boolean. Only nil (and the number NaN) is not allowed as a key.
Whether tables can or can't contain nil values is a surprisingly difficult question which I tried to answer in depth here. Let's just assume that setting t[k] = nil should be observably the same as never setting k at all.
Table construction syntax (like t2 = {10, 20, nil, 40}) is a syntactic sugar for creating a table and then setting its values one by one (in this case: t2 = {}, t2[1] = 10, t2[2] = 20, t2[3] = nil, t2[4] = 40).
Tables as arrays
(oh, from this angle it really looks quite arrayish)
As tables are the only complex data structure in Lua, the language (for convenience) provides some ways for manipulating tables as if they were arrays.
Notably, this includes the length operator (#t) and many standard functions, like table.insert, table.remove, and more.
The behavior of the length operator (and, in consequence, the mentioned utility functions) is only defined for array-like tables with a particular set of keys, so-called sequences.
Quoting the Lua 5.2 Reference manual:
the length of a table t is only defined if the table is a sequence, that is, the set of its positive numeric keys is equal to {1..n} for some integer n
As a result, the behavior of calling #t on a table that is not a sequence at that time is undefined.
It means that any result could be expected, including 0, -1, or false, or an error being raised (unrealistic for the sake of backwards compatibility), or even Lua crashing (quite unrealistic).
Indirectly, this means that the behavior of utility functions that expect a sequence is undefined if called with a non-sequence.
Sequences and non-sequences
(it's really not obvious)
So far, we know that using the length operator on tables that are not sequences is a bad idea. That means we should either use it only in programs written in a way that guarantees those tables will always be sequences in practice, or, when we are handed a table and can assume nothing about its content, check dynamically that it is indeed a sequence.
Let's practice. Remember: positive numeric keys have to be in the form {1..n}, e.g. {1}, {1, 2, 3}, {1, 2, 3, 4, 5}, etc.
t = {}
t[1] = 123
t[2] = "bar"
t[3] = 456
Sequence. Easy.
t = {}
t[1] = 123
t[2] = "bar"
t[3] = 456
t[5] = false
Not a sequence. {1, 2, 3, 5} is missing 4.
t = {}
t[1] = 123
t[2] = "bar"
t[3] = 456
t[4] = nil
t[5] = false
Not a sequence. nil values aren't considered part of the table, so again we're missing 4.
t = {}
t[1] = 123
t[2] = "bar"
t[3.14] = 456
t[4] = nil
t[5] = false
Not a sequence. 3.14 is positive, but isn't an integer.
t = {}
t[0] = "foo"
t[1] = 123
t[2] = "bar"
Sequence. 0 isn't counted for the length and utility functions will ignore it, but this is a valid sequence. The definition only gives requirements about positive number keys.
t = {}
t[-1] = "foo"
t[1] = 123
t[2] = "bar"
Sequence. Similar.
t = {}
t[1] = 123
t["bar"] = "foo"
t[2] = "bar"
t[false] = 1
t[3] = 0
Sequence. We don't care about non-numeric keys.
Diving into the implementation
(if you really have to know)
But what happens in the C implementation of Lua when we call # on a non-sequence?
Background: Tables in Lua are internally divided into an array part and a hash part. That's an optimization. Lua tries to avoid allocating memory often, so it preallocates up to the next power of two. That's another optimization.
When the last item in the array part is nil, the result of # is the length of the shortest valid sequence found by binsearching the array part for the first nil-followed key.
When the last item in the array part is not nil AND the hash part is empty, the result of # is the physical length of the array part.
When the last item in the array part is not nil AND the hash part is NOT empty, the result of # is the length of the shortest valid sequence found by binsearching the hash part for the first nil-followed key (that is, a positive integer i such that t[i] ~= nil and t[i+1] == nil), assuming that the array part is full of non-nils(!).
So the result of # is almost always the (desired) length of the shortest valid sequence, unless the last element in the array part representing a non-sequence is non-nil. Then, the result is bigger than desired.
Why is that? It seems like yet another optimization (for power-of-two sized arrays). The complexity of # on such tables is O(1), while other variants are O(log(n)).
In Lua only specially formed tables are considered an array. They are not really an array such as what one might consider as an array in the C language. The items are still in a hash table. But the keys are numeric and contiguous from 1 to N. Lua arrays are unit offset, not zero offset.
The bottom line is that if you do not know if the table you have formed meets the Lua criteria for an array then you must count up the items in the table to know the length of the table. That is the only way. Here is a function to do it:
function table_count(T)
local count = 0
for _ in pairs(T) do count = count + 1 end
return count
end
If you populate a table only with the "insert" function, used in the manner of the following example, then you are guaranteed to build an "array" table.
s={}
table.insert(s, value) -- value is whatever you want to store
table.insert could be in a loop or called from other places in your code. The point is that if you put items in your table this way, it will be an array table and you can use the # operator to find out how many items it holds; otherwise you have to count the items.

Sorting an array in Ruby (Special Case)

I have an array in Ruby which has values as follows
xs = %w(2.0.0.1
2.0.0.6
2.0.1.10
2.0.1.5
2.0.0.8)
and so on. I want to sort the array such that the final result should be something like this :
ys = %w(2.0.0.1
2.0.0.6
2.0.0.8
2.0.1.5
2.0.1.10)
I have tried using the array.sort function, but it places "2.0.1.10" before "2.0.1.5". I am not sure why that happens
Using a Schwartzian transform (Enumerable#sort_by), and taking advantage of the lexicographical order defined by an array of integers (Array#<=>):
sorted_ips = ips.sort_by { |ip| ip.split(".").map(&:to_i) }
Can you please explain a bit more elaborately
You cannot compare strings containing numbers as if they were numbers: "2" > "1", yes, but "11" < "2" because strings are compared lexicographically, like words in a dictionary. Therefore, you must convert the ip into something that can be compared (an array of integers): ip.split(".").map(&:to_i). For example "1.2.10.3" is converted to [1, 2, 10, 3]. Let's call this transformation f.
You could now use Enumerable#sort: ips.sort { |ip1, ip2| f(ip1) <=> f(ip2) }, but always check whether the higher abstraction Enumerable#sort_by can be used instead. In this case: ips.sort_by { |ip| f(ip) }. You can read it as "take the ips and sort them by the order defined by the f mapping".
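As a runnable sketch of the explanation above, using the array from the question. The method name version_key is an assumption; it plays the role of the transformation f.

```ruby
# Hypothetical name for the transformation called f in the text.
def version_key(str)
  str.split(".").map(&:to_i)
end

xs = %w[2.0.0.1 2.0.0.6 2.0.1.10 2.0.1.5 2.0.0.8]

# Plain string comparison puts "2.0.1.10" before "2.0.1.5".
p xs.sort
#=> ["2.0.0.1", "2.0.0.6", "2.0.0.8", "2.0.1.10", "2.0.1.5"]

# Comparing arrays of integers gives the numeric order.
p xs.sort_by { |v| version_key(v) }
#=> ["2.0.0.1", "2.0.0.6", "2.0.0.8", "2.0.1.5", "2.0.1.10"]
```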
Split your data into chunks by splitting on '.'. There is no standard function to do it as such so you need to write a custom sort to perform this.
And the behaviour you said about 2.0.1.10 before 2.0.1.5 is expected because it is taking the data as strings and doing ASCII comparisons, leading to the result that you see.
arr1 = "2.0.0.1".split('.')
arr2 = "2.0.0.6".split('.')
Compare both arr1 and arr2 element by element, for all the data in your input.
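One possible way to spell out that element-by-element comparison is sketched below. The method name compare_versions and the rule that a shorter version sorts before a longer one with the same prefix are assumptions, not something the answer specifies. Note that the chunks are converted with to_i first, since split returns strings.

```ruby
# Hypothetical comparator: walks both versions field by field and
# returns -1, 0 or 1, in the same way as <=>.
def compare_versions(a, b)
  arr1 = a.split('.').map(&:to_i)
  arr2 = b.split('.').map(&:to_i)
  arr1.zip(arr2).each do |left, right|
    return 1 if right.nil?        # b ran out of fields first
    cmp = left <=> right
    return cmp unless cmp.zero?   # first differing field decides
  end
  arr1.length <=> arr2.length     # same prefix: shorter sorts first
end

xs = %w[2.0.0.1 2.0.0.6 2.0.1.10 2.0.1.5 2.0.0.8]
p xs.sort { |a, b| compare_versions(a, b) }
```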

Is it right to assign multiple variables like this a = b = c = d = 5?

a = b = c = d = 5
puts a   # prints 5
puts b   # prints 5
puts c   # prints 5
puts d   # prints 5
a = a + 1
puts a   # prints 6
puts b   # prints 5
I found there is no problem with the assigning of values like this. My question is should one assign like the one given above or like this?
a , b, c, d = 5, 5, 5, 5
The thing to be aware of here is that your case only works OK because numbers are immutable in Ruby. You don't want to do this with strings, arrays, hashes or pretty much anything else other than numbers, because it would create multiple references to the same object, which is almost certainly not what you want:
a = b = c = d = "test"
b << "x"
=> "testx"
a
=> "testx"
Whereas the parallel form is safe with all types:
a,b,c,d = "test","test","test","test"
=> ["test", "test", "test", "test"]
b << "x"
=> "testx"
a
=> "test"
There's nothing wrong with assigning it that way (a = b = c = d = 5). I personally prefer it over multiple assignment if all the variables need to have the same value.
Here's another way:
a, b, c, d = [5] * 4
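A caveat that seems worth noting for this form: array repetition repeats the same object, so with a mutable value it shares references just as chained assignment does. The sketch below contrasts it with the block form of Array.new; the helper names repeated and fresh are assumptions for illustration.

```ruby
# Hypothetical helpers for comparing the two ways of building the list.
def repeated(count)
  ["test".dup] * count            # one String, referenced count times
end

def fresh(count)
  Array.new(count) { "test".dup } # block runs once per slot: new String each time
end

a, b = repeated(2)
b << "x"
puts "a=#{a} same object? #{a.equal?(b)}"

c, d = fresh(2)
d << "x"
puts "c=#{c} same object? #{c.equal?(d)}"
```

With integers such as 5 the distinction does not matter, because they cannot be modified in place.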
If it feels good, do it.
The language allows it, as you discovered, and it behaves as you'd expect. I'd suggest that the only question you should ask yourself regards expressiveness: is the code telling you what its purpose is?
Personally, I don't particularly like using this construct for much other than initialisation to default values, preferably zero. Ideally the variables so initialised would all have a similar purpose as well, counters, for example. But if I had more than a couple of similarly-purposed variables I might very well consider declaring them to be a form of duplicate, to be refactored out into, for example, a Hash.
These two initializations express different meaning. The a = b = c = d = 5 means "all my variables should be initialized to the same value, and this value is 5". The other one, a, b, c, d = 5, 5, 5, 5, means "I have a list of variables, and corresponding list of init values".
Is your logic such that all the variables should always be the same? Then the first one is better. If not, the second one might be better. Another question: is your list of 4 variables comprehensive? is it likely that you will add or remove another variable to this group? If so, I'd suggest yet another variant instead:
a = 5
b = 5
c = 5
d = 5
I once got bitten by that one. It may save you a few keystrokes today but come back to bite you later. As @glenn mentioned, it creates multiple references to the same object.
Example: This applies to both ruby 1.8 and 1.9
> a = b = Array.new
=> []
> a.object_id == b.object_id
=> true
> a << 1
=> [1]
> b << 2
=> [1, 2]
I don't use ruby at all, so that might be an acceptable idiom, but a = b = c = d = 5 looks pretty ugly to me. a , b, c, d = 5, 5, 5, 5 looks much nicer, IMO.
