Ruby: Average array of times - ruby-on-rails

I have the following method in my Array class:
class Array
def avg
if partial_include?(":")
avg_times
else
blank? and 0.0 or (sum.to_f/size).round(2)
end
end
def avg_times
avg_minutes = self.map do |x|
hour, minute = x.split(':')
total_minutes = hour.to_i * 60 + minute.to_i
end.inject(:+)/size
"#{avg_minutes/60}:#{avg_minutes%60}"
end
def partial_include?(search_term)
self.each do |e|
return true if e[search_term]
end
return false
end
end
This works great with arrays of regular numbers, but there could be instances where I have an array of times.
For example: [18:35, 19:07, 23:09]
Is there any way to figure out the average of an array of time objects?

So you need to define a function that can calculate the average of times formatted as strings. Convert each time to minutes, average the total minutes, and then convert the result back to a time.
I would do it something like this:
a = ['18:35', '19:07', '23:09']
def avg_of_times(array_of_time)
  size = array_of_time.size
  avg_minutes = array_of_time.map do |x|
    hour, minute = x.split(':')
    hour.to_i * 60 + minute.to_i
  end.inject(:+) / size
  # Zero-pad the minutes so that e.g. 20:05 is not printed as "20:5".
  format("%d:%02d", avg_minutes / 60, avg_minutes % 60)
end
p avg_of_times(a) # => "20:17"
Then, when you call your function, you check whether any/all items in your array are formatted as a time, perhaps using a regexp.
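A minimal sketch of that format check combined with the conversion, in plain Ruby. The names TIME_FORMAT, all_times? and average_time are illustrative and not from the original post, and the pattern assumes times written as "H:MM" or "HH:MM":

```ruby
# Hypothetical helper names; only core Ruby is used.
TIME_FORMAT = /\A\d{1,2}:\d{2}\z/

# True when every element is a string that looks like "H:MM" or "HH:MM".
def all_times?(values)
  values.all? { |v| v.is_a?(String) && v.match?(TIME_FORMAT) }
end

# Average of "HH:MM" strings, truncated to whole minutes.
# Returns nil for an empty array to avoid dividing by zero.
def average_time(times)
  return nil if times.empty?

  total_minutes = times.sum do |t|
    hours, minutes = t.split(':').map(&:to_i)
    hours * 60 + minutes
  end
  average = total_minutes / times.size
  format('%d:%02d', average / 60, average % 60)
end

times = ['18:35', '19:07', '23:09']
puts average_time(times) if all_times?(times) # prints 20:17
```

An avg method on Array could call all_times? first and fall back to the numeric average otherwise.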

Average the Hours and Minutes Separately
Here's a simple method that we're using:
def calculate_average_of_times( times )
  # Note: for large Arrays, each string should only be split once rather than twice.
  hours = times.collect{ |time| time.split( ":" ).first.to_i }
  minutes = times.collect{ |time| time.split( ":" ).second.to_i }
  average_hours = hours.sum / hours.size
  average_minutes = ( minutes.sum / minutes.size ).to_s.rjust( 2, '0' ) # Pad with leading zero if necessary.
  "#{ average_hours }:#{ average_minutes }"
end
And to show it working with your provided Array of 24-hour times, converted to Strings:
calculate_average_of_times( ["18:35", "19:07", "23:09"] )
#=> "20:17"
Thanks to @matt-privman for the help and inspiration on this.
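One caveat worth checking: averaging the hour and minute columns separately with integer division drops the fractional hours, so the result only agrees with the total-minutes approach when the hours happen to divide evenly (as they do for the sample above). A small comparison in plain Ruby, with illustrative method names and `split(':')[1]` standing in for ActiveSupport's `second`:

```ruby
# Averages hours and minutes independently (the approach described above).
def average_separately(times)
  hours   = times.map { |t| t.split(':')[0].to_i }
  minutes = times.map { |t| t.split(':')[1].to_i }
  format('%d:%02d', hours.sum / hours.size, minutes.sum / minutes.size)
end

# Converts everything to minutes first, then averages once.
def average_via_minutes(times)
  total = times.sum do |t|
    h, m = t.split(':').map(&:to_i)
    h * 60 + m
  end
  average = total / times.size
  format('%d:%02d', average / 60, average % 60)
end

sample = ['1:00', '2:00']
puts average_separately(sample)  # prints 1:00
puts average_via_minutes(sample) # prints 1:30
```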

Related

How do I convert a date into a time when parsing an .xls doc using Rails?

I'm using Rails 5. I want to parse an .xls (not to be confused with .xlsx doc) using the code below
book = Roo::Spreadsheet.open(file_location)
sheet = book.sheet(0)
text = sheet.to_csv
csv = CSV.parse(text)
arr_of_arrs = csv
text_content = ""
arr_of_arrs.each do |arr|
arr.map!{|v| v && v.to_f < 1 && v.to_f > 0 ? TimeFormattingHelper.time_as_str(v.to_f * 24 * 3600 * 1000) : v}
text_content = "#{text_content}\n#{arr.join("\t")}"
end
Here is the method I reference above
def time_as_str(time_in_ms)
regex = /^(0*:?)*0*/
Time.at(time_in_ms.to_f/1000).utc.strftime("%H:%M:%S.%1N").sub!(regex, '')
end
One area I'm having trouble is that a cell that appears in my .xls doc as
24:08:00
is processed as
1904-01-02T00:08:00+00:00
with the code above. How do I parse the value I see on the screen? That is, how do I convert the date value into a time value?
As an example from another Excel doc, the cell that appears as
24:02:00
is getting parsed by my code above as
1899-12-31T00:02:00+00:00
It seems your .xls is in the 1904 date system, and Roo is not able to distinguish between a Duration and a DateTime, so you'll need to subtract the base date 1904-01-01 from the cell value. Oddly enough, in the case of the 1900 date system you need to subtract the base date 1899-12-30, due to a bug in Lotus 1-2-3 that Microsoft replicated in Excel for compatibility.
Here is a method that converts the DateTime read from the spreadsheet into the duration according to the base date:
def duration_as_str(datetime, base_date)
total_seconds = DateTime.parse(datetime).to_i - base_date.to_i
hours = total_seconds / (60 * 60)
minutes = (total_seconds / 60) % 60
seconds = total_seconds % 60
"%d:%02d:%02d" % [hours, minutes, seconds]
end
Let's test it:
irb(main):019:0> duration_as_str("1904-01-02T00:08:00+00:00", DateTime.new(1904, 1, 1))
=> "24:08:00"
irb(main):020:0> duration_as_str("1899-12-31T00:02:00+00:00", DateTime.new(1899, 12, 30))
=> "24:02:00"
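The method above relies on `DateTime#to_i`, which comes from ActiveSupport. As a rough sketch for use outside Rails, the same subtraction can be done with only the standard `date` library; the name duration_from_base is hypothetical:

```ruby
require 'date'

# Subtracting two DateTime values yields a Rational number of days.
def duration_from_base(datetime_text, base_date)
  days = DateTime.parse(datetime_text) - base_date
  total_seconds = (days * 86_400).round
  hours, remainder = total_seconds.divmod(3600)
  minutes, seconds = remainder.divmod(60)
  format('%d:%02d:%02d', hours, minutes, seconds)
end

puts duration_from_base('1904-01-02T00:08:00+00:00', DateTime.new(1904, 1, 1))   # prints 24:08:00
puts duration_from_base('1899-12-31T00:02:00+00:00', DateTime.new(1899, 12, 30)) # prints 24:02:00
```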
You can use book.workbook.date_base.year to determine the spreadsheet's date system, and then just add another map inside your each loop:
book = Roo::Spreadsheet.open(file_location)
sheet = book.sheet(0)
text = sheet.to_csv
csv = CSV.parse(text)
base_date = book.workbook.date_base.year == 1904 ? DateTime.new(1904, 1, 1) : DateTime.new(1899, 12, 30)
arr_of_arrs = csv
text_content = ""
arr_of_arrs.each do |arr|
arr.map!{|v| v && v.to_f < 1 && v.to_f > 0 ? TimeFormattingHelper.time_as_str(v.to_f * 24 * 3600 * 1000) : v}
arr.map!{|v| v =~ /^(1904|1899)-/ ? duration_as_str(v, base_date) : v}
text_content = "#{text_content}\n#{arr.join("\t")}"
end
You could use something like the below and write a custom parser for that string.
duration = 0
"24:08:01".split(":").each_with_index do |value, i|
if i == 0
duration += value.to_i.hours
elsif i == 1
duration += value.to_i.minutes
else
duration += value.to_i.seconds
end
end
duration.value # => 86881 (duration in seconds)
This parser will assume a format of hours:minutes:seconds and return an instance of ActiveSupport::Duration. Then, duration.value will give you the number of seconds.
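For use outside Rails, the same parse can be sketched in plain Ruby without ActiveSupport::Duration. The method name duration_in_seconds is illustrative; it assumes the hours:minutes:seconds format described above and allows hours above 23:

```ruby
# Parses "H:MM:SS" (hours may exceed 23) into a number of seconds.
def duration_in_seconds(text)
  hours, minutes, seconds = text.split(':').map(&:to_i)
  hours * 3600 + minutes * 60 + seconds
end

puts duration_in_seconds('24:08:01') # prints 86881
```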
You need to read the internal value of the cell instead of the formatted value.
The formatted value is what gets written to the CSV when you use to_csv.
To read the internal value, you would have to use either the sheet object's excelx_value method or the row object's cell_value method.
These methods return the value as a float (in days). Here is an example using cell_value by iterating over rows, assuming there is no header and the value to be converted is in the first column.
This uses Roo 2.7.1 (similar methods exist in older versions):
book = Roo::Spreadsheet.open(file_location)
sheet = book.sheet(0)

# Defined before the loop so that it exists when the block runs.
def time_as_str(t)
  # Round to whole seconds first to absorb floating-point error.
  minutes, seconds = t.round.divmod(60)
  hours, minutes = minutes.divmod(60)
  "%02d:%02d:%02d" % [hours, minutes, seconds]
end

formatted_times = []
time_column_index = 0
sheet.each_row_streaming do |row|
  time_in_days = row[time_column_index].cell_value
  formatted_times << time_as_str(time_in_days.to_f * 24 * 3600)
end
# eg: time_in_days = 1.0169444444444444
# formatted_time = "24:24:24"
First, I will try rephrasing what you want to accomplish.
You want to “parse the value you see on the screen”, but I am not sure whether that is 24:08:00 or 1904-01-02T00:08:00+00:00. I assume it is the first.
You want to convert the date value into a time value. I am not sure you actually want the output var to be a Time, a Date, a DateTime, or simply a String. I assume it is ok for you to have it simply as a String, but this is a minor issue.
With this, I assume that what you in general see as HH:MM:SS in Excel, you want to get as “HH:MM:SS” in Rails, regardless of HH being > 23. As an example, 24:08:00 in Excel would turn into “24:08:00” in Rails.
The two seemingly discordant cases you report most likely stem from the two .xls files having different date systems.
To get the desired result you have two options:
Use to_csv, whose result is affected by the date system of the Excel file. In this case, you have to subtract the base_date, as done by Helder Pereira.
Directly get the numeric value from Excel, which is not affected by the date system. In this case, code is simpler, since you only need one conversion (function days2str below).
Code is (modulo minor adjustments)
def days2str(days)
  # Convert fractional days to whole seconds, then split into h:m:s.
  total_seconds = (days.to_f * 24 * 3600).round
  hours, remainder = total_seconds.divmod(3600)
  minutes, seconds = remainder.divmod(60)
  format("%d:%02d:%02d", hours, minutes, seconds)
end
def is_date(v)
# Define the checking function
end
require 'spreadsheet'
text_content = ""
Spreadsheet.open('MyTestSheet.xls') do |book|
  book.worksheet('Sheet1').each do |row|
    break if row[0].nil?
    puts row.join(',')
    values = row.map { |v| is_date(v) ? days2str(v) : v }
    text_content = "#{text_content}\n#{values.join("\t")}"
  end
end

For a given period, getting the smallest list of dates, using jokers

I use Elasticsearch where I have one index per day, and I want my Ruby on Rails application to query documents in a given period by specifying the smallest and most precise list of indices.
I can't find the code to get that list of indices. Let me explain it:
Consider a date formatted in YYYY-MM-DD.
You can use the joker (wildcard) * at the end of the date string. E.g. 2016-07-2* describes all the dates from 2016-07-20 to 2016-07-29.
Now, consider a period represented by a start date and an end date.
The code must return the smallest possible array of dates representing the period.
Let's use an example. For the following period:
start date: 2014-11-29
end date: 2016-10-13
The code must return an array containing the following strings:
2014-11-29
2014-11-30
2014-12-*
2015-*
2016-0*
2016-10-0*
2016-10-10
2016-10-11
2016-10-12
2016-10-13
It's better (though I'll still take unoptimized code over nothing) if:
The code returns the most precise list of dates (i.e. it doesn't return dates with a joker that describe a period starting before the start date or ending after the end date).
The code returns the smallest list possible (i.e. ["2016-09-*"] is better than ["2016-09-0*", "2016-09-1*", "2016-09-2*", "2016-09-30"]).
Any ideas?
Okay, after more thinking and the help of a coworker, I may have a solution. Probably not totally optimized, but still...
def get_indices_from_period(start_date_str, end_date_str)
  dates = {}
  dates_strings = []
  start_date = Date.parse(start_date_str)
  end_date = Date.parse(end_date_str)
  # Create a hash with, for each year and each month of the period: {:YYYY => {:MMMM => [DD1, DD2, DD3...]}}
  (start_date..end_date).each do |date|
    year, month, day = date.year, date.month, date.day
    dates[year] ||= {}
    dates[year][month] ||= []
    dates[year][month] << day
  end
  dates.each do |year, days_in_year|
    start_of_year = Date.new(year, 1, 1)
    max_number_of_days_in_year = (start_of_year.end_of_year - start_of_year).to_i + 1
    number_of_days_in_year = days_in_year.collect{|month, days_in_month| days_in_month}.flatten.size
    if max_number_of_days_in_year == number_of_days_in_year
      # Return index formatted as YYYY-* if full year
      dates_strings << "#{year}-*"
    else
      days_in_year.each do |month, days_in_month|
        formatted_month = format('%02d', month)
        if Time.days_in_month(month, year) == days_in_month.size
          # Return index formatted as YYYY-MM-* if full month
          dates_strings << "#{year}-#{formatted_month}-*"
        else
          decades_in_month = {}
          days_in_month.each do |day|
            decade = day / 10
            decades_in_month[decade] ||= []
            decades_in_month[decade] << day
          end
          decades_in_month.each do |decade, days_in_decade|
            if (decade == 0 && days_in_decade.size == 9) ||
               ((decade == 1 || decade == 2) && days_in_decade.size == 10)
              # Return index formatted as YYYY-MM-D* if full decade
              dates_strings << "#{year}-#{formatted_month}-#{decade}*"
            else
              # Return index formatted as YYYY-MM-DD
              dates_strings += days_in_decade.collect{|day| "#{year}-#{formatted_month}-#{format('%02d', day)}"}
            end
          end
        end
      end
    end
  end
  return dates_strings
end
Test call:
get_indices_from_period('2014-11-29', '2016-10-13')
=> ["2014-11-29", "2014-11-30", "2014-12-*", "2015-*", "2016-01-*", "2016-02-*", "2016-03-*", "2016-04-*", "2016-05-*", "2016-06-*", "2016-07-*", "2016-08-*", "2016-09-*", "2016-10-0*", "2016-10-10", "2016-10-11", "2016-10-12", "2016-10-13"]
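One way to sanity-check a list like this is to expand the patterns back into concrete dates and compare them with the period. The sketch below does that in plain Ruby using `File.fnmatch` for glob-style matching; treating it as equivalent to how Elasticsearch resolves index wildcards is an assumption, and the helper name dates_matching is hypothetical. It is run against the shorter list from the question:

```ruby
require 'date'

# Returns the dates inside `window` that match at least one pattern.
def dates_matching(patterns, window)
  window.select do |date|
    patterns.any? { |pattern| File.fnmatch(pattern, date.iso8601) }
  end
end

patterns = ['2014-11-29', '2014-11-30', '2014-12-*', '2015-*', '2016-0*',
            '2016-10-0*', '2016-10-10', '2016-10-11', '2016-10-12', '2016-10-13']
# A window wider than the period, so that overly broad patterns would be caught.
window = Date.new(2014, 1, 1)..Date.new(2017, 12, 31)
period = Date.new(2014, 11, 29)..Date.new(2016, 10, 13)

puts dates_matching(patterns, window) == period.to_a # prints true
```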

How to loop through arrays of different length in Ruby?

Let's say I have two relation arrays of a user's daily buys and sells.
How do I iterate through both of them using .each and still let the longer array run independently once the shorter one is exhausted? Below, I want to find the ratio of someone's daily buys to sells, but the ratio is always 1 because I'm iterating through the longer array once for each item of the shorter array.
users = User.all
ratios = Hash.new
users.each do |user|
if user.buys.count > 0 && user.sells.count > 0
ratios[user.name] = Hash.new
buy_array = []
sell_array = []
date = ""
daily_buy = user.buys.group_by(&:created_at)
daily_sell = user.sells.group_by(&:created_at)
daily_buy.each do |buy|
daily_sell.each do |sell|
if buy[0].to_date == sell[0].to_date
date = buy[0].to_date
buy_array << buy[1]
sell_array << sell[1]
end
end
end
ratio_hash[user.name][date] = (buy_array.length.round(2)/sell_array.length)
end
end
Thanks!
You could concat both arrays and get rid of duplicated elements by doing:
(a_array + b_array).uniq.each do |num|
# code goes here
end
Uniq method API
# Combine both collections into one plain array (hashes from group_by cannot be added with +).
buys_and_sells = user.buys.to_a + user.sells.to_a
totals = buys_and_sells.inject({ 'buys' => 0, 'sells' => 0 }) do |hsh, transaction|
  hsh['buys'] += 1 if transaction.is_a?(Buy)
  hsh['sells'] += 1 if transaction.is_a?(Sell)
  hsh
end
# to_f avoids integer division.
totals['buys'].to_f / totals['sells']
I think the above might do it...rather than collecting each thing in to separate arrays, concat them together, then run through each item in the combined array, increasing the count in the appropriate key of the hash returned by the inject.
In this case you can't walk both arrays with a single each; use an index-based for loop instead.
This code should give you a hint:
ar = [1,2,3,4,5]
br = [1,2,3]
array_l = [ar.length, br.length].max
for i in 0...array_l
  if ar[i] and br[i]
    puts ar[i].to_s + " " + br[i].to_s
  elsif ar[i]
    puts ar[i].to_s
  elsif br[i]
    puts br[i].to_s
  end
end
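For the daily ratio itself, another option is to count buys and sells per date and then walk the union of the dates, so that neither collection has to be the same length as the other. This is only a sketch with made-up sample dates in place of ActiveRecord relations; daily_ratios and count_by_date are hypothetical names, and returning nil for days without sells is an arbitrary choice:

```ruby
require 'date'

# Counts how many times each date occurs.
def count_by_date(dates)
  dates.each_with_object(Hash.new(0)) { |date, counts| counts[date] += 1 }
end

# Buys-to-sells ratio per day over every date that appears in either list.
def daily_ratios(buy_dates, sell_dates)
  buys  = count_by_date(buy_dates)
  sells = count_by_date(sell_dates)
  (buys.keys | sells.keys).sort.each_with_object({}) do |date, ratios|
    sold = sells.fetch(date, 0)
    # nil marks days without sells, where the ratio is undefined.
    ratios[date] = sold.zero? ? nil : (buys.fetch(date, 0).to_f / sold).round(2)
  end
end

# Hypothetical sample data: one entry per transaction.
buy_dates  = [Date.new(2020, 1, 1), Date.new(2020, 1, 1), Date.new(2020, 1, 2), Date.new(2020, 1, 3)]
sell_dates = [Date.new(2020, 1, 1), Date.new(2020, 1, 3), Date.new(2020, 1, 3)]

daily_ratios(buy_dates, sell_dates).each do |date, ratio|
  puts "#{date}: #{ratio.inspect}"
end
# prints:
# 2020-01-01: 2.0
# 2020-01-02: nil
# 2020-01-03: 0.5
```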

Donations over past 24 months with keys and sums

Having pulled donations from the past two years, I'm trying to derive the sum of those donations per month, storing the keys (each month) and the values (the sum of donations for each month) in an array of hashes. I would like the keys to be numbers 1 to 24 (1 being two years ago and 24 being this month) and if there are no donations for a given month, the value would be zero for that month. How would I do this as an array of hashes in Ruby/Rails?
This is my variable with the donations already in it.
donations = Gift.where(:date => (Date.today - 2.years)..Date.today)
The following gives you a hash with keys such as "2013/9":
monthly_donations = {}
date = Time.now
while date > 2.years.ago do
  range = date.beginning_of_month..date.end_of_month
  monthly_donations["#{date.year}/#{date.month}"] = Gift.where(:created_at => range).sum(:column)
  # Step back a whole month; stepping by 30 days can skip February.
  date -= 1.month
end
To select the records in that time span, this should be enough:
donations = Gift.where("date >= ?", 2.years.ago)
You can also do this:
donations = Gift.where("date >= :start_date AND date <= :end_date",
  {start_date: 2.years.ago, end_date: Time.now})
See also: 2.2.1 "Placeholder Conditions"
http://guides.rubyonrails.org/active_record_querying.html
To sum up a column across those records, you can then do this:
sum = Gift.where("created_at >= ?", 2.years.ago).sum(:column)
First, we need a function to find the difference in months from the current time.
def month_diff(date)
(Date.current.year * 12 + Date.current.month) - (date.year * 12 + date.month)
end
Then we iterate through @donations, assuming that :amount is used to store the value of each donation:
q = {}
@donations.each do |donation|
  date = month_diff(donation.date)
  if q[date].nil?
    q[date] = donation.amount
  else
    q[date] += donation.amount
  end
end
I found a good solution that covered all the bases--@user1185563's solution didn't bring in months without donations and @Tilo's called the database 24 times, but I very much appreciated the ideas! I'm sure this could be done more efficiently, but I created the hash with 24 elements (key: beginning of each month, value: 0) and then iterated through the donations and added their amounts to the hash in the appropriate position.
def monthly_hash
  monthly_hash = {}
  date = 2.years.ago
  while date < Time.now do
    monthly_hash["#{date.beginning_of_month}"] = 0
    date += 1.month
  end
  return monthly_hash
end
@monthly_hash = monthly_hash
@donations.each do |donation|
  @monthly_hash["#{donation.date.beginning_of_month}"] += donation.amount
end
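The question asked for keys numbered 1 to 24, with 24 being the current month. A rough plain-Ruby sketch of that numbering follows; the Donation struct is a hypothetical stand-in for a Gift record with date and amount fields, and key 1 is taken to mean 23 months before the current month:

```ruby
require 'date'

# Hypothetical stand-in for a Gift record.
Donation = Struct.new(:date, :amount)

# Returns {1 => total, ..., months => total}, where the highest key is the
# month containing `today`. Months without donations stay at zero.
def monthly_totals(donations, today, months = 24)
  totals = (1..months).map { |n| [n, 0] }.to_h
  donations.each do |donation|
    months_ago = (today.year * 12 + today.month) -
                 (donation.date.year * 12 + donation.date.month)
    key = months - months_ago
    # Donations outside the window have no key and are ignored.
    totals[key] += donation.amount if totals.key?(key)
  end
  totals
end

donations = [
  Donation.new(Date.new(2015, 9, 1), 10),
  Donation.new(Date.new(2015, 9, 20), 5),
  Donation.new(Date.new(2015, 8, 3), 7),
  Donation.new(Date.new(2013, 10, 1), 3)
]
totals = monthly_totals(donations, Date.new(2015, 9, 15))
puts totals[24] # prints 15
puts totals[23] # prints 7
puts totals[1]  # prints 3
```

This touches the database once (to load the donations) and fills empty months with zero.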

Ruby get arrays of dates based on given period

Is there a simple solution to this task?
I have first_date = "2011-02-02", last_date = "2013-01-20" and period = 90 (days).
I need to get arrays with two elements, for example:
[first_date, first_date + period] ... [some_date, last_date].
I could do it with some kind of loop, but maybe there is a nicer way to do this :D.
Date has a step method:
require 'date'
first_date = Date.parse("2011-02-02")
last_date = Date.parse("2013-01-20")
period = 90
p first_date.step(last_date-period, period).map{|d| [d, d+period]}
#or
p first_date.step(last_date, period).each_cons(2).to_a
require 'date'
first_date = Date.parse "2011-02-02"
last_date = Date.parse "2013-01-20"
period = 90
periods = []
current = first_date
last = current + period
while last < last_date do
  periods << [current, last]
  current = last
  last = current + period
end
# The remaining stretch always ends on last_date, whatever its length.
periods << [current, last_date]
p periods
I am assuming that the last period must end on last_date regardless of its length, as your question implies.
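Under that same assumption, the loop can also be sketched as a small method built on `Date#step`, clamping each end to last_date. The name date_periods is illustrative:

```ruby
require 'date'

# Splits first_date..last_date into consecutive [start, finish] pairs of
# `period` days; the final pair is cut short so that it ends on last_date.
def date_periods(first_date, last_date, period)
  pairs = first_date.step(last_date, period).map do |start|
    [start, [start + period, last_date].min]
  end
  # Drop the empty pair that appears when the range divides evenly.
  pairs.reject { |start, finish| start == finish }
end

periods = date_periods(Date.parse('2011-02-02'), Date.parse('2013-01-20'), 90)
puts periods.size            # prints 8
puts periods.first.join(' ') # prints 2011-02-02 2011-05-03
puts periods.last.join(' ')  # prints 2012-10-24 2013-01-20
```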

Resources