Ruby min_by logic - ruby-on-rails

Please help me figure out the behaviour of min_by.
I have a "normas" table with two columns (raw_value, percentile).
After some calculations I get calculated_result, and my goal is to find the percentile for the raw_value closest to my calculated_result.
My approach is below:
raw = Norma.where(name: name).map(&:raw_value).min_by { |x| (x.to_f - value.to_f).abs }
It works, but I can't figure out all of the logic. Here's what I mean:
arr = [1,2,3,4,5]
arr.min_by {|x| (x - 3.5).abs}
=> 3
In this case we have two identical differences (0.5 to 3 as well as to 4),
so my question is: what is the rule for choosing the result when more than one minimum is found?
Have a productive day! :)

In case of equal values, the first minimum counts.
Try it with [5, 4, 3, 2, 1] and you'll see the result is now 4.
This is consistent with #index which returns the first index position that matches a value.
Think of it like this:
temp_arr = arr.map{ |x| (x-3.5).abs }
arr[temp_arr.index(temp_arr.min)]
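To see the tie-breaking rule in action, both orderings can be run side by side. This is a small sketch on plain integer arrays; the `[distance, -x]` key at the end is an illustrative addition, not part of the answer above, for cases where the larger of two equally close values should win regardless of array order.

```ruby
# Ties are resolved by position: the first element with the minimal key wins.
ascending  = [1, 2, 3, 4, 5]
descending = ascending.reverse

p ascending.min_by  { |x| (x - 3.5).abs }   # => 3
p descending.min_by { |x| (x - 3.5).abs }   # => 4

# Hypothetical tie-breaker: compare [distance, -x] pairs so that, among equally
# close values, the larger one is chosen no matter how the array is ordered.
p ascending.min_by  { |x| [(x - 3.5).abs, -x] }   # => 4
p descending.min_by { |x| [(x - 3.5).abs, -x] }   # => 4
```

Arrays compare element by element, so the second entry of the key only matters when the distances are equal.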

Related

ruby pattern for generating geometrically expanding objects

Given three arrays of unique ids, the goal is to create individual identifiers that join each member of the three arrays:
array_a = [1,2]
array_b = [43,44,47]
array_c = [3,15]
This implies 2 * 3 * 2 individual identifiers (separated by underscores for legibility purposes):
1_43_3, 1_43_15, 1_44_3, 1_44_15, 1_47_3, 1_47_15, 2_43_3, 2_43_15, 2_44_3, 2_44_15, 2_47_3, 2_47_15
Is there a Ruby method that creates such a set, i.e. that multiplies arrays of arrays?
Use the product method.
Input
array_a = [1,2]
array_b = [43,44,47]
array_c = [3,15]
Program
p array_a.product(array_b,array_c).map{|x|x.join("_")}
Output
["1_43_3", "1_43_15", "1_44_3", "1_44_15", "1_47_3", "1_47_15", "2_43_3", "2_43_15", "2_44_3", "2_44_15", "2_47_3", "2_47_15"]
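If the number of arrays is not known in advance, the same call can be built with a splat. This is only a sketch: the helper name `joined_ids` and its `separator` parameter are made up for illustration.

```ruby
# Hypothetical helper: join every combination of one element per array.
def joined_ids(arrays, separator = "_")
  first, *rest = arrays
  return [] if first.nil?            # no arrays given, nothing to combine

  # Array#product accepts any number of other arrays as arguments.
  first.product(*rest).map { |combo| combo.join(separator) }
end

ids = joined_ids([[1, 2], [43, 44, 47], [3, 15]])
p ids.size    # => 12
p ids.first   # => "1_43_3"
p ids.last    # => "2_47_15"
```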
Not to my knowledge, but it's fairly trivial to implement with a couple of loops:
array_a = [1,2]
array_b = [43,44,47]
array_c = [3,15]
combined = array_a.flat_map do |a|
  array_b.flat_map do |b|
    array_c.map do |c|
      [a, b, c].join("_")
    end
  end
end
Edit - although the solution using product from @Rajagopalan is very neat.
This is not an answer, just shedding light on the two valid answers provided.
Running a performance test in the following manner:
time = Benchmark.measure {
code_to_test
}
puts time
with four data sets:
the first with arrays of sizes 10x10x10,
the second with sizes 20x20x20, which is almost an order of magnitude greater than the former,
the third with sizes 30x30x30, and
a final one with sizes 40x40x40, almost another order of magnitude greater than the second.
The product method returns, for each data set:
size      user      system    total
10x10x10  0.002692  0.000340  0.003032
20x20x20  0.057010  0.003608  0.060618
30x30x30  0.078614  0.010978  0.089592
40x40x40  0.217555  0.015326  0.232881
while the nested flat_map version returns:
size      user      system    total
10x10x10  0.002562  0.000145  0.002707
20x20x20  0.077731  0.001857  0.079588
30x30x30  0.085422  0.001829  0.087251
40x40x40  0.263692  0.005506  0.269198
The two are rather indistinguishable, even at relatively high numbers.
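A comparison along these lines could be reproduced roughly as follows. This is a sketch, not the script used for the numbers above: the method names, the size `n` and the way the input arrays are generated are all assumptions, and timings will differ from machine to machine.

```ruby
require 'benchmark'

# Approach 1: Array#product, then join each combination.
def via_product(a, b, c)
  a.product(b, c).map { |combo| combo.join("_") }
end

# Approach 2: nested flat_map / map.
def via_flat_map(a, b, c)
  a.flat_map do |x|
    b.flat_map do |y|
      c.map { |z| [x, y, z].join("_") }
    end
  end
end

# Assumed setup: three arrays of n distinct integers each.
n = 20
array_a = (1..n).to_a
array_b = (101..(100 + n)).to_a
array_c = (201..(200 + n)).to_a

Benchmark.bm(10) do |bm|
  bm.report("product:")  { via_product(array_a, array_b, array_c) }
  bm.report("flat_map:") { via_flat_map(array_a, array_b, array_c) }
end

# Both approaches should build the same list in the same order.
puts via_product(array_a, array_b, array_c) == via_flat_map(array_a, array_b, array_c)
```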

Subtracting two queries from each other in rails

I'm new to rails, and even ruby, so I'm having some trouble figuring this out.
I want to find the difference between two queries. This particular query should return a single record, since I've set it up such that Recipe is missing one of the IDs from recipes.
Current code:
q = Recipe.all - Recipe.where(recipe_id: recipes)
Where recipes is an array of IDs.
From my limited understanding of the language, this would work if Recipe.all and Recipe.where both returned arrays.
I've spent some time searching the web, with nothing coming up to aid me.
Other things I've tried:
q = [Recipe.all] - [Recipe.where(recipe_id: recipes)]
q = Recipe.where.not(recipe_id: recipes) # Wouldn't work because the array is the one with the extra id
Though neither proved helpful.
Try this:
q = Recipe.where('recipe_id NOT IN (?)', recipes)
Turns out I was asking the wrong question.
Since the array of IDs is the one with extra elements, not the database query, I should have been comparing the difference of it to the query.
My answer is as follows:
q = recipes - Recipe.where(recipe_id: recipes).ids
Which returns the missing IDs.
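The array subtraction itself can be tried without a database. In the sketch below the two arrays are made-up stand-ins: `requested` plays the role of `recipes` and `existing` the role of the ids returned by the query. Note that `ids` returns primary-key values, so the line above assumes `recipe_id` is the primary key; if it is an ordinary column, `pluck(:recipe_id)` would be the call to use instead.

```ruby
# Hypothetical helper: which of the requested ids have no matching record?
def missing_ids(requested, existing)
  # Array#- keeps the elements of the left operand that are absent from the right.
  requested - existing
end

requested = [5, 6, 7, 8, 11, 12]   # stand-in for `recipes`
existing  = [12, 8, 11, 5, 6]      # stand-in for the ids found in the table

p missing_ids(requested, existing)   # => [7]
```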
If you are using Rails 4, you can use the not query method
q = Recipe.where.not(id: recipes)
This will generate the following query:
SELECT "recipes".* FROM "recipes" WHERE ("recipes"."id" NOT IN (12, 8, 11, 5, 6, 7))

Rails method for checking if a number in a range appears in an array

I have an array (or possibly a set) of integers (potentially non sequential but all unique) and a range of (sequential) numbers. If I wanted to check if any of the numbers in the range existed in the array, what would be the most efficient way of doing this?
array = [1, 2, 5, 6, 7]
range = 3..5
I could iterate over the range and check if the array include? each element but this seems wasteful and both the array and the range could easily be quite large.
Are there any methods I could use to do some sort of array.include_any?(range), or should I look for efficient search algorithms?
I would do
(array & range.to_a).present?
or
array.any? { |element| range.cover?(element) }
I would choose a version depending on the size of the range. If the range is small, then the first version is probably faster, because it creates the intersection once and doesn't need to check cover? for every single element in the array. If the range is huge (but the array is small), the second version might be faster, because a few comparisons might be faster than generating an array out of a huge range and building the intersection.
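The two variants can be compared in plain Ruby as sketched below. The method names are made up, and `any?` is used in place of `present?` because the latter comes from ActiveSupport. The third variant is an extra idea that is not part of the answer above: if the array is already sorted in ascending order, a binary search avoids scanning either collection.

```ruby
# Variant 1: materialise the range and intersect (fine for small ranges).
def overlap_by_intersection?(array, range)
  (array & range.to_a).any?
end

# Variant 2: test each element against the range bounds (fine for small arrays).
def overlap_by_cover?(array, range)
  array.any? { |element| range.cover?(element) }
end

# Variant 3 (assumption: sorted_array is sorted ascending): binary-search for the
# first element >= range.begin, then check that it still lies inside the range.
def overlap_by_bsearch?(sorted_array, range)
  candidate = sorted_array.bsearch { |x| x >= range.begin }
  !candidate.nil? && range.cover?(candidate)
end

array = [1, 2, 5, 6, 7]
p overlap_by_intersection?(array, 3..5)   # => true
p overlap_by_cover?(array, 3..5)          # => true
p overlap_by_bsearch?(array, 3..5)        # => true
p overlap_by_bsearch?(array, 3..4)        # => false
```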
([1, 2, 5, 6, 7] & (3..5).to_a).any?
# => true
Don't need no stinkin' &:
array.uniq.size + range.size > (array + range.to_a).uniq.size

groovy simple way to find hole in list?

I'm using the grails findAllBy() method to return a list of Position(s). Position has an integer field called location, which ranges from 1 to 15. I need to find the lowest location in the position list that is free.
For example, if there are positions at locations 1,2 and 4, then the algorithm should return 3. If locations 1 - 4 were filled, it would return 5.
Are there some simple Groovy list/map functions to get the right number?
Thanks
If your list of positions were (limited to a max of 5 for brevity):
def list = [ 1, 2, 4, 5 ]
And you know that you have a maximum of 5 of them, you can do:
(1..5).minus(list).min()
Which would give you 3
Just another option, because I originally thought he wanted to know the first unused slot in a list, say you had this:
def list = ['a', 'b', null, 'd', 'e', null, 'g']
You could easily find the first empty slot in the array by doing this:
def firstOpen = list.findIndexOf{ !it } // or it == null if you need to avoid groovy truth
Tim's way works, and is good for small ranges. If you've got the items sorted by location already, you can do it in O(n) by leveraging findResult
def firstMissing = 0
println list.findResult { (it.location != ++firstMissing) ? firstMissing : null }
prints 3.
If they're not sorted, you can either modify your db query to sort them, or add sort{it.location} in there.

Ruby on Rails method to calculate percentiles - can it be refactored?

I have written a method to calculate a given percentile for a set of numbers for use in an application I am building. Typically the user needs to know the 25th percentile of a given set of numbers and the 75th percentile.
My method is as follows:
def calculate_percentile(array,percentile)
  # return nil for an empty array
  return nil if array.empty?
  # sort the array
  array.sort!
  # get the array length
  arr_length = array.length
  # multiply the number of items in the array by the required percentile (e.g. 0.75 for 75th percentile)
  # round the result up to the next whole number
  # then subtract one to get the index of the array item we need to return
  arr_item = ((array.length * percentile).ceil)-1
  # return the matching number from the array
  return array[arr_item]
end
This looks to provide the results I was expecting but can anybody refactor this or offer an improved method to return specific percentiles for a set of numbers?
Some remarks:
If a particular index of an Array does not exist, [] will return nil, so your initial check for an empty Array is unnecessary.
You should not sort! the Array argument, because you are affecting the order of the items in the Array in the code that called your method. Use sort (without !) instead.
You don't actually use arr_length after assignment.
A return statement on the last line is unnecessary in Ruby.
There is no standard definition for the percentile function (there can be a lot of subtleties with rounding), so I'll just assume that how you implemented it is how you want it to behave. Therefore I can't really comment on the logic.
That said, the function that you wrote can be written much more tersely while still being readable.
def calculate_percentile(array, percentile)
  array.sort[(percentile * array.length).ceil - 1]
end
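A few sample calls show how the terse version behaves, including the empty-array case mentioned in the remarks. The data set is made up for illustration. The last call highlights an edge case worth knowing about: a percentile of 0 produces index -1, which Ruby reads as the last element.

```ruby
def calculate_percentile(array, percentile)
  array.sort[(percentile * array.length).ceil - 1]
end

data = [3, 1, 4, 1, 5, 9, 2, 6]     # sorted: [1, 1, 2, 3, 4, 5, 6, 9]

p calculate_percentile(data, 0.25)  # => 1
p calculate_percentile(data, 0.75)  # => 5
p calculate_percentile([], 0.75)    # => nil
p data                              # => [3, 1, 4, 1, 5, 9, 2, 6] (caller's order untouched)

# Edge case: index (0 * 8).ceil - 1 is -1, i.e. the largest element.
p calculate_percentile(data, 0)     # => 9
```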
Here's the same refactored into a one liner. You don't need an explicit return as the last line in Ruby. The return value of the last statement of the method is what's returned.
def calculate_percentile(array=[],percentile=0.0)
  # multiply items in the array by the required percentile
  # (e.g. 0.75 for 75th percentile)
  # round the result up to the next whole number
  # then subtract one to get the array item we need to return
  array ? array.sort[((array.length * percentile).ceil)-1] : nil
end
Not sure if it's worth it, but here is how I did it for the quartiles:
# Expects a sorted list; dividing by 2.0 avoids integer truncation
# when the two middle values differ (e.g. 3 and 4 give 3.5, not 3).
def median(list)
  (list[(list.size - 1) / 2] + list[list.size / 2]) / 2.0
end
numbers = [1, 2, 3, 4, 5, 6]
if numbers.size % 2 == 0
  puts median(numbers[0...(numbers.size / 2)])
  puts median(numbers)
  puts median(numbers[(numbers.size / 2)..-1])
else
  median_index = numbers.index(median(numbers))
  puts median(numbers[0..(median_index - 1)])
  puts median(numbers)
  puts median(numbers[(median_index + 1)..-1])
end
If you're calculating both quartiles, you might want to move the "sort" outside the function, so that it only needs to be done once. This also means you aren't modifying your caller's data (sort!), nor making a copy every time the function is called (sort).
I know, premature optimisation and all that. And it's a bit awkward for the function to say, "the array must be sorted before calling this function". So it's reasonable to leave it as it is.
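Moving the sort out of the function could look like the sketch below. The name `percentile_of_sorted` and the sample data are made up; the index arithmetic is the same as in the question, and the function simply trusts the caller to pass sorted input.

```ruby
# Hypothetical variant that expects already-sorted input and never sorts itself.
def percentile_of_sorted(sorted, percentile)
  sorted[(percentile * sorted.length).ceil - 1]
end

data   = [3, 1, 4, 1, 5, 9, 2, 6]
sorted = data.sort   # one sort, one copy; `data` itself is not modified

q1 = percentile_of_sorted(sorted, 0.25)
q3 = percentile_of_sorted(sorted, 0.75)
p [q1, q3]   # => [1, 5]
```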
But sorting already-sorted data is going to take considerably longer than the whole rest of the function put together(*). It also has higher algorithmic complexity: O(N) at best, when the function could be O(1) for the second quartile (although O(N log N) for the first one if the data is not already sorted, of course). So it's worth avoiding if performance might ever be an issue for this function.
There are slightly faster ways of finding the two quartiles than a full sort (look up "selection algorithms"). For instance if you're familiar with the way qsort uses pivots, observe that if you need to know the 25th and 75th items out of 100, and your pivot at some stage ends up in position 80, then there's absolutely no point recursing into the block above the pivot. You really don't care what order those elements are in, just that they're in the top quartile. But this will considerably increase the complexity of the code compared with just calling a library to sort for you. Unless you really need a minor performance boost, I think you're good as you are.
(*) Unless ruby arrays have a flag to remember they're already sorted and haven't been modified since. I don't know whether they do, but if so then using sort! a second time is of course free.
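For reference, the pivot-based selection idea mentioned above can be sketched as a simple quickselect. This is an illustration rather than a tuned implementation: it copies sub-arrays instead of partitioning in place, picks the middle element as the pivot, and the names `quickselect` and `percentile_by_selection` are made up. It also assumes 0 < percentile <= 1.

```ruby
# Returns the k-th smallest element (k is 0-based) without fully sorting.
def quickselect(items, k)
  raise ArgumentError, "k out of range" unless (0...items.length).cover?(k)

  pivot = items[items.length / 2]
  lower = items.select { |x| x < pivot }
  equal_count = items.count { |x| x == pivot }

  if k < lower.length
    # The wanted element is below the pivot; ignore everything else.
    quickselect(lower, k)
  elsif k < lower.length + equal_count
    pivot
  else
    # The wanted element is above the pivot; shift k past the skipped items.
    higher = items.select { |x| x > pivot }
    quickselect(higher, k - lower.length - equal_count)
  end
end

# Same index arithmetic as the question, assuming 0 < percentile <= 1.
def percentile_by_selection(array, percentile)
  return nil if array.empty?

  quickselect(array, (percentile * array.length).ceil - 1)
end

data = [3, 1, 4, 1, 5, 9, 2, 6]
p percentile_by_selection(data, 0.25)   # => 1
p percentile_by_selection(data, 0.75)   # => 5
```

Each recursive call discards the side of the pivot that cannot contain the wanted element, which is the saving described above.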
