Iterating an Array by an Array (Another Ruby Word Count) - ruby-on-rails

Having some trouble figuring out the logic for a ruby word count. My goal is to pass in some text, and get the total count of a certain category of words as defined in an array. So if I gave the following variables, I'd want to find out the fraction of words mentioned that have anything to do with fruit:
content = "I went to the store today, and I bought apples, eggs, bananas,
yogurt, bacon, spices, milk, oranges, and a pineapple. I also had a fruit
smoothie and picked up some replacement Apple earbuds."
fruit = ["apple", "banana", "fruit", "kiwi", "orange", "pear", "pineapple", "watermelon"]
(I realize plural/singular is not consistent; just an example). Here's the code I've been trying:
content.strip
contentarray = content.downcase.split(/[^a-zA-Z]/)
contentarray.delete("")
total_wordcount = contentarray.size
IRB Test:
contentarray.grep("and")
=> ["and", "and", "and"]
contentarray.grep("and").count
=> 3
So then I try:
fruit.each do |i|
contentarray.grep(i).count
end
=> ["apple", "banana", "fruit", "kiwi", "orange", "pear", "pineapple", "watermelon"]
It just returns the array, no counts. I would add them all up after if it returned any numbers. The goal is to end up with:
fruitwordcount
=> 6 / 33
or
=> .1818181
I've tried searching and found a lot of methods that convert the content array to a hash and count occurrences, as many tutorials do, but that gives the count of every single word when I need the counts of only a subset. I can't seem to find a good way to search an array or string of words by an array of strings. I found a few articles saying to use a histogram from the Multiset gem, but that's still giving every word. Any help would be very much appreciated; please forgive my n00bery.

Array#each just iterates over the fruits, while you likely want to collect the values. map comes to the rescue:
result = fruit.map do |i|
[i, contentarray.grep(i).count]
end
If you need a hash of fruit ⇒ count, it's simple:
result = Hash[result]
Hope it helps.

The method you are looking for is map, not each: each executes the block for each element in the array, and then returns the original array. map creates a new array containing the values returned by the block.
fruit.map do |i|
contentarray.grep(i).count
end
=> [1, 0, 1, 0, 0, 0, 1, 0]
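To get from the per-fruit counts to the fraction the question asks for, the counts can be summed and divided by the total word count. This is a minimal sketch, assuming Ruby 2.4+ for Array#sum; the names words, counts, fruit_total and ratio are illustrative and not from the question. Because the matching is exact, the plural forms ("apples", "bananas", "oranges") are not counted here.

```ruby
content = "I went to the store today, and I bought apples, eggs, bananas, " \
          "yogurt, bacon, spices, milk, oranges, and a pineapple. I also had a fruit " \
          "smoothie and picked up some replacement Apple earbuds."
fruit = ["apple", "banana", "fruit", "kiwi", "orange", "pear", "pineapple", "watermelon"]

# scan pulls out runs of letters, so no empty strings need deleting afterwards
words = content.downcase.scan(/[a-z]+/)

# map returns a new array holding one count per fruit
counts = fruit.map { |f| words.count(f) }

fruit_total = counts.sum                  # exact matches only
ratio = fruit_total.to_f / words.size     # fraction of all words

puts "#{fruit_total} / #{words.size} = #{ratio.round(4)}"
```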

It's because the each method just iterates and executes the block. Use map or collect to execute the block and return an array.
result = fruit.map { |i| contentarray.grep(i).count }

Array#each returns the array itself, as per the Ruby docs.
You probably want to try some of the other methods instead. count and map look especially promising:
fruit.map do |f|
contentarray.count{|content| content == f}
end

To get just the fruits, filter your array with contentarray.keep_if{|x| fruit.include?(x) }, then turn it into a hash of counts the way the tutorials you found do.
Or just use inject on contentarray to build the hash:
contentarray.inject(Hash.new(0)) do |result, element|
if fruit.include?(element)
result[element] += 1
end
result
end
Hash.new(0) sets the default value to 0, so we can just add one.
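The question expects 6 / 33, which only works out if plurals such as "apples" count as fruit. One way to get there, sketched below, is to keep the Hash.new(0) idea but match each word against the fruit names by prefix. The prefix rule (start_with?) is an assumption made for this example, not something the question specifies, and the names words, fruit_counts and fraction are illustrative.

```ruby
content = "I went to the store today, and I bought apples, eggs, bananas, " \
          "yogurt, bacon, spices, milk, oranges, and a pineapple. I also had a fruit " \
          "smoothie and picked up some replacement Apple earbuds."
fruit = ["apple", "banana", "fruit", "kiwi", "orange", "pear", "pineapple", "watermelon"]

words = content.downcase.scan(/[a-z]+/)

# Build a fruit => count hash; Hash.new(0) makes missing keys start at 0.
fruit_counts = words.each_with_object(Hash.new(0)) do |word, counts|
  # Assumption: a word counts as a fruit if it starts with a fruit name,
  # so "apples" is counted under "apple".
  match = fruit.find { |f| word.start_with?(f) }
  counts[match] += 1 if match
end

fruit_total = fruit_counts.values.sum
fraction = Rational(fruit_total, words.size)

puts fruit_counts.inspect
puts "#{fruit_total} / #{words.size} = #{fraction.to_f.round(4)}"
```

Prefix matching is crude: it would also count a word like "pearl" under "pear", so a real stemmer or an explicit list of plurals may be a better fit for production text.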


Lua How to sort table from lowest value without key changes [duplicate]

I have a key => value table I'd like to sort in Lua. The keys are all integers, but aren't consecutive (and have meaning). Lua's only sort function appears to be table.sort, which treats tables as simple arrays, discarding the original keys and their association with particular items. Instead, I'd essentially like to be able to use PHP's asort() function.
What I have:
items = {
[1004] = "foo",
[1234] = "bar",
[3188] = "baz",
[7007] = "quux",
}
What I want after the sort operation:
items = {
[1234] = "bar",
[3188] = "baz",
[1004] = "foo",
[7007] = "quux",
}
Any ideas?
Edit: Based on answers, I'm going to assume that it's simply an odd quirk of the particular embedded Lua interpreter I'm working with, but in all of my tests, pairs() always returns table items in the order in which they were added to the table. (i.e. the two above declarations would iterate differently).
Unfortunately, because that isn't normal behavior, it looks like I can't get what I need; Lua doesn't have the necessary tools built-in (of course) and the embedded environment is too limited for me to work around it.
Still, thanks for your help, all!
You seem to misunderstand something. What you have here is an associative array. Associative arrays have no explicit order on them; it's only the internal representation (usually sorted) that orders them.
In short -- in Lua, both of the arrays you posted are the same.
What you would want instead, is such a representation:
items = {
{1004, "foo"},
{1234, "bar"},
{3188, "baz"},
{7007, "quux"},
}
While you can't get them by index now (they are indexed 1, 2, 3, 4, but you can create another index array), you can sort them using table.sort.
A sorting function would be then:
function compare(a,b)
return a[1] < b[1]
end
table.sort(items, compare)
As Kornel said, you're dealing with associative arrays, which have no guaranteed ordering.
If you want key ordering based on its associated value while also preserving associative array functionality, you can do something like this:
function getKeysSortedByValue(tbl, sortFunction)
local keys = {}
for key in pairs(tbl) do
table.insert(keys, key)
end
table.sort(keys, function(a, b)
return sortFunction(tbl[a], tbl[b])
end)
return keys
end
items = {
[1004] = "foo",
[1234] = "bar",
[3188] = "baz",
[7007] = "quux",
}
local sortedKeys = getKeysSortedByValue(items, function(a, b) return a < b end)
sortedKeys is {1234,3188,1004,7007}, and you can access your data like so:
for _, key in ipairs(sortedKeys) do
print(key, items[key])
end
result:
1234 bar
3188 baz
1004 foo
7007 quux
Hmm, I missed the part about not being able to control the iteration.
But in Lua there is usually a way.
http://lua-users.org/wiki/OrderedAssociativeTable
That's a start. Now you would need to replace the pairs() that the library uses. That could be as simple as pairs=my_pairs. You could then use the solution in the link above.
PHP arrays are different from Lua tables.
A PHP array may have an ordered list of key-value pairs.
A Lua table always contains an unordered set of key-value pairs.
A Lua table acts as an array when a programmer chooses to use integers 1, 2, 3, ... as keys. The language syntax and standard library functions, like table.sort, offer special support for tables with consecutive-integer keys.
So, if you want to emulate a PHP array, you'll have to represent it using a list of key-value pairs, which is really a table of tables, but it's more helpful to think of it as a list of key-value pairs. Pass a custom "less-than" function to table.sort and you'll be all set.
N.B. Lua allows you to mix consecutive-integer keys with any other kinds of keys in the same table—and the representation is efficient. I use this feature sometimes, usually to tag an array with a few pieces of metadata.
Coming to this a few months later, with the same query. The recommended answer seemed to pinpoint the gap between what was required and how this looks in Lua, but it didn't get me exactly what I was after, which was a hash sorted by key.
The first three functions on this page did, however: http://lua-users.org/wiki/SortedIteration
I did a brief bit of Lua coding a couple of years ago but I'm no longer fluent in it.
When faced with a similar problem, I copied my array to another array with keys and values reversed, then used sort on the new array.
I wasn't aware of a possibility to sort the array using the method Kornel Kisielewicz recommends.
The proposed compare function works, but only if the values in the first column are unique.
Here is a slightly enhanced compare function: if the values in the current column are equal, it moves on to the next column to decide.
Because {1234, "baam"} < {1234, "bar"} is true, the entry containing "baam" is sorted before the entry containing "bar".
local items = {
{1004, "foo"},
{1234, "bar"},
{1234, "baam"},
{3188, "baz"},
{7007, "quux"},
}
local function compare(a, b)
for inx = 1, #a do
-- Only move on to the next column when the current values are equal.
if a[inx] ~= b[inx] then
return a[inx] < b[inx]
end
end
-- Every column is equal, so a is not less than b.
return false
end
table.sort(items,compare)

How can I return the highest "valued" element -- per "name" -- in an Array?

I've read a lot of posts about finding the highest-valued objects in arrays using max and max_by, but my situation is another level deeper, and I can't find any references on how to do it.
I have an experimental Rails app in which I am attempting to convert a legacy .NET/SQL application. The (simplified) model looks like Overlay -> Calibration <- Parameter. In a single data set, I will have, say, 20K Calibrations, but about 3,000-4,000 of these are versioned duplicates by Parameter name, and I need only the highest-versioned Parameter by each name. Further complicating matters is that the version lives on the Overlay. (I know this seems crazy, but this models our reality.)
In pure SQL, we add the following to a query to create a virtual table:
n = ROW_NUMBER() OVER (PARTITION BY Parameters.Designation ORDER BY Overlays.Version DESC)
And then select the entries where n = 1.
I can order the array like this:
ordered_calibrations = mainline_calibrations.sort do |e, f|
[f.parameter.Designation, f.overlay.Version] <=> [e.parameter.Designation, e.overlay.Version] || 1
end
I get this kind of result:
C_SCR_trc_NH3SensCln_SCRT1_Thd 160
C_SCR_trc_NH3SensCln_SCRT1_Thd 87
C_SCR_trc_NH3Sen_DewPtHiThd_Tbl 310
C_SCR_trc_NH3Sen_DewPtHiThd_Tbl 160
C_SCR_trc_NH3Sen_DewPtHiThd_Tbl 87
So I'm wondering if there is a way, using Ruby's Enumerable built-in methods, to loop over the sorted array, and only return the highest-versioned elements per name. HUGE bonus points if I could feed an integer to this method's block, and only return the highest-versioned elements UP TO that version number ("160" would return just the second and fourth entries, above).
The alternative to this is that I could somehow implement the ROW_NUMBER() OVER in ActiveRecord, but that seems much more difficult to try. And, of course, I could write code to deal with this, but I'm quite certain it would be orders of magnitude slower than figuring out the right Enumerable function, if it exists.
(Also, to be clear, it's trivial to do .find_by_sql() and create the same result set as in the legacy application -- it's even fast -- but I'm trying to drag all the related objects along for the ride, which you really can't do with that method.)
I'm not convinced that doing this in the database isn't a better option, but since I'm unfamiliar with SQL Server I'll give you a Ruby answer.
I'm assuming that when you say "Parameter name" you're talking about the Parameters.Designation column, since that's the one in your examples.
One straightforward way you can do this is with Enumerable#slice_when, which is available in Ruby 2.2+. slice_when is good when you want to slice an array "between" values that are different in some way. For example:
[ { id: 1, name: "foo" }, { id: 2, name: "foo" }, { id: 3, name: "bar" } ]
.slice_when {|a,b| a[:name] != b[:name] }
# => [ [ { id: 1, name: "foo" }, { id: 2, name: "foo" } ],
# [ { id: 3, name: "bar" } ]
# ]
You've already sorted your collection, so to slice it you just need to do this:
calibrations_by_designation = ordered_calibrations.slice_when do |a, b|
a.parameter.Designation != b.parameter.Designation
end
Now calibrations_by_designation is an array of arrays, each of which is sorted from greatest Overlay.Version to least. The final step, then, is to get the first element in each of those arrays:
highest_version_calibrations = calibrations_by_designation.map(&:first)
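For comparison, here is a sketch that swaps slice_when for group_by plus max_by, which does not need the collection to be sorted first and also covers the "up to version N" bonus from the question. The Calibration struct with designation and version fields is a hypothetical stand-in for the ActiveRecord objects (where the values would come from parameter.Designation and overlay.Version), and highest_per_designation is a name invented for this example.

```ruby
# Hypothetical stand-in for the real Calibration records.
Calibration = Struct.new(:designation, :version)

calibrations = [
  Calibration.new("C_SCR_trc_NH3SensCln_SCRT1_Thd", 160),
  Calibration.new("C_SCR_trc_NH3SensCln_SCRT1_Thd", 87),
  Calibration.new("C_SCR_trc_NH3Sen_DewPtHiThd_Tbl", 310),
  Calibration.new("C_SCR_trc_NH3Sen_DewPtHiThd_Tbl", 160),
  Calibration.new("C_SCR_trc_NH3Sen_DewPtHiThd_Tbl", 87),
]

# Returns the highest-versioned calibration per designation.
# When max_version is given, versions above it are ignored first.
def highest_per_designation(calibrations, max_version = nil)
  candidates = calibrations
  candidates = candidates.select { |c| c.version <= max_version } if max_version
  candidates
    .group_by(&:designation)                      # designation => [calibrations]
    .map { |_name, group| group.max_by(&:version) }
end

highest_per_designation(calibrations).each { |c| puts "#{c.designation} #{c.version}" }
highest_per_designation(calibrations, 160).each { |c| puts "#{c.designation} #{c.version}" }
```

With the sample data, the uncapped call picks versions 160 and 310, and the call capped at 160 picks 160 for both designations, matching the second and fourth entries of the sorted listing in the question.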

Ruby Calling Join on an Array versus String

I have a script that creates an array, then adds items to the array depending on certain circumstances. In most cases, the array will end up with several values inside of it. Occasionally, the array will only hold one value inside of it.
After preparing this array, I usually call .join(",") to create a comma-separated string of all the array values:
tags.join(",")
It works fine when the array has multiple values, but when it only has one value it throws an error:
NoMethodError: undefined method 'join' for "Whatever the array value": String
Any idea why this is? What is the easiest way to resolve this? Do I need to do an if statement to check if the variable is an array or string? Seems a bit silly...let me know if I am missing something here.
If obj is your object, you can write
[*obj].join
For example
arr = ["Fa", "bu", "lo", "us!"]
[*arr].join #=> "Fabulous!"
str = "Whoa!"
[*str].join #=> "Whoa!"
This works because
[*arr] #=> ["Fa", "bu", "lo", "us!"] == arr
[*str] #=> ["Whoa!"]
Similarly,
[*[1,2,3]].join #=> "123"
[*7].join #=> "7"
You can use join on an array in the following way:
array = ["this","is","join","method","example"]
array.join(" ")
#=> "this is join method example"
array.join("_")
#=> "this_is_join_method_example"
In the case of a single element (say, 'Hello'), you should be calling join on an array, not the string itself; for example, ['Hello'].join(",") rather than 'Hello'.join(","). Of course, if there's only one element join doesn't actually do anything, so you could just use a conditional if to skip it... but that's kinda ugly. Most of the time, I'd use the construction Array(tags).join(","). If passed a single string, that'll wrap it in an array; if passed an array, it's a noop, returning the array as-is.
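The Array(tags) approach can be wrapped in a small helper so the caller never has to care whether it holds a string or an array. This is a minimal sketch; tag_list is a hypothetical name chosen for the example.

```ruby
# Kernel#Array wraps a single value in an array, returns an array unchanged,
# and turns nil into an empty array.
def tag_list(tags)
  Array(tags).join(",")
end

puts tag_list(["ruby", "rails"])  # an array with several values
puts tag_list("ruby")             # a bare string
puts tag_list(nil).inspect        # nothing at all
```

The nil case is a useful side effect: Array(nil) is an empty array, so the helper returns an empty string instead of raising NoMethodError.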

Order array depending on appearance in another

Let's say I have 2 arrays of users.
[user1, user2, user3]
[user3]
Based on the second array, I want to sort the first array so that occurrences in the second array appear first in the first array.
So the result of the first array would be:
[user3, user1, user2]
I realise the simple way would be to iterate over the first array and populate an empty array, ordering it if the second array contains it, then merging the rest. The below is pseudo code and untested, but gives an idea of what I was thinking as a simple solution
return_array = []
array1.each do |a|
if array2.include? a
return_array.push a
end
end
return_array.concat(array1 - return_array)
Is there any way to refine this? Built in rails or ruby methods for example.
You should use array intersection and array difference:
(a & b) + (a - b)
would give you what you're looking for. The parentheses matter, because + and - bind more tightly than & in Ruby.
The manual for intersection: http://ruby-doc.org/core-2.2.0/Array.html#method-i-26
The manual for difference: http://ruby-doc.org/core-2.2.0/Array.html#method-i-2D
You could just do this:
array2 + (array1 - array2)
my_array = ['user1', 'user2', 'user3']
my_array2 = ['user3']
my_array2 + (my_array.sort - my_array2)
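Another option, sketched here with plain strings standing in for user objects, is Enumerable#partition, which splits the first array into matching and non-matching elements in a single pass. The names users, priority, first and rest are illustrative.

```ruby
users    = ["user1", "user2", "user3"]
priority = ["user3"]

# partition returns two arrays: elements for which the block is true,
# then the remaining elements, each in their original order.
first, rest = users.partition { |u| priority.include?(u) }
ordered = first + rest

puts ordered.inspect
```

Unlike array2 + (array1 - array2), this keeps the prioritized users in the order they have in the first array and never adds a user that appears only in the second array.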

How to create object array in rails?

I need to know how to create an array of objects in Rails and how to add elements to it.
I'm new to Ruby on Rails and this could be a silly question, but I can't find an exact answer. Can anyone please share some expert ideas about this?
All you need is an array:
objArray = []
# or, if you want to be verbose
objArray = Array.new
To add elements, use push or <<:
objArray.push 17
# => [17]
objArray << 4
# => [17, 4]
You can use any object you like, it doesn't have to be of a particular type.
Since everything is an object in Ruby (including numbers and strings) any array you create is an object array that has no limits on the types of objects it can hold. There are no arrays of integers, or arrays of widgets in Ruby. Arrays are just arrays.
my_array = [24, :a_symbol, 'a string', Object.new, [1,2,3]]
As you can see, an array can contain anything, even another array.
Depending on the situation, I like this construct to initialize an array.
# Create a new array of 10 new objects
Array.new(10) { Object.new }
#=> [#<Object:0x007fd2709e9310>, #<Object:0x007fd2709e92e8>, #<Object:0x007fd2709e92c0>, #<Object:0x007fd2709e9298>, #<Object:0x007fd2709e9270>, #<Object:0x007fd2709e9248>, #<Object:0x007fd2709e9220>, #<Object:0x007fd2709e91f8>, #<Object:0x007fd2709e91d0>, #<Object:0x007fd2709e91a8>]
Also, if you need to create an array of words, the following construct lets you avoid the quotes:
array = %w[first second third]
or
array = %w(first second third)
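The same array operations apply to instances of your own classes. The sketch below uses a hypothetical Widget struct in place of a real model; in a Rails app the elements would typically be model instances instead.

```ruby
# Hypothetical class used only for illustration.
Widget = Struct.new(:name)

widgets = []
widgets << Widget.new("gear")                           # append one object
widgets.push(Widget.new("cog"), Widget.new("spring"))   # push accepts several

names = widgets.map(&:name)
puts names.inspect
puts widgets.size
```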
