How to test specific dates that cannot be parsed - ruby-on-rails

I need to test a specific array of dates to ensure that they are in the correct format; however, I can't just use parse, because invalid dates get silently coerced into valid ones. For instance, if I have an incorrect date with a month of "13", it adds another year and sets the month to 1.
My code pulls in the dates from an SQL query:
table_birth_dates = self.class.connection.execute("SELECT birth_date FROM #{temp_table_name}").values.flatten
[
  [0] "1980-30-54",
  [1] "1980-30-54",
  [2] "2020-09-10",
  [3] "1890-10-30"
]
yr = 1900
year_test = table_birth_dates.select{|d| Date.parse(d).year < yr}
This now gives me an ArgumentError: invalid date.
I thought of using:
splitted_birth_date = table_birth_dates.first.split("-")
splitted_birth_date.first.to_i > 1900?
but if I try to loop through all of my dates, I'm not able to manipulate anything via splitting:
table_birth_dates.each do |birth_date|
  birth_date.split("-")
end
What can I do with this?

I need to test a specific array of dates to ensure that they are in
the correct format...
If you get an error, it means the date is invalid; you can rescue that error and then do whatever you want with that date to make it valid (or handle it some other way):
yr = 1900
table_birth_dates.each do |birth_date|
  begin
    if Date.parse(birth_date).year < yr
      # do something with the parsed date
    end
  rescue ArgumentError => e
    # do something with the invalid `birth_date`
  end
end

You could combine your select and split approaches together:
table_birth_dates.select { |d| d.split('-').first.to_i < 1900 }
#=> ["1890-10-30"]
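If you also want to separate strings whose parts don't form a real calendar date (like "1980-30-54") without relying on rescue, here is a minimal sketch using Ruby's Date.valid_date?; the variable names are just for illustration:
require 'date'

# Split each "YYYY-MM-DD" string and ask Date.valid_date? whether the
# year/month/day combination is a real calendar date.
invalid_dates, valid_dates = table_birth_dates.partition do |d|
  y, m, day = d.split('-').map(&:to_i)
  !Date.valid_date?(y, m, day)
end

invalid_dates
#=> ["1980-30-54", "1980-30-54"]
valid_dates.select { |d| Date.parse(d).year < 1900 }
#=> ["1890-10-30"]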

Related

Generate array of daily avg values from db table (Rails)

Context:
Trying to generate an array with one element for each created_at day in a db table. Each element is the average of the points (integer) column from records with that created_at day.
This will later be graphed to display the avg number of points on each day.
Result:
I've been successful in doing this, but it feels like an unnecessary amount of code to generate the desired result.
Code:
def daily_avg
  # get all data for current user
  records = current_user.rounds
  # make array of long dates
  long_date_array = records.pluck(:created_at)
  # create array to store short dates
  short_date_array = []
  # remove time of day
  long_date_array.each do |date|
    short_date_array << date.strftime('%Y%m%d')
  end
  # remove duplicate dates
  short_date_array.uniq!
  # array of avg by date
  array_of_avg_values = []
  # iterate through each day
  short_date_array.each do |date|
    temp_array = []
    # make array of records with this day
    records.each do |record|
      if date === record.created_at.strftime('%Y%m%d')
        temp_array << record.audio_points
      end
    end
    # calc avg by day and append to array_of_avg_values
    array_of_avg_values << temp_array.inject(0.0) { |sum, el| sum + el } / temp_array.size
  end
  render json: array_of_avg_values
end
Question:
I think this is a common extraction problem needing to be solved by lots of applications, so I'm wondering if there's a known repeatable pattern for solving something like this?
Or a more optimal way to solve this?
(I'm barely a junior developer so any advice you can share would be appreciated!)
Yes, that's a lot of unnecessary stuff when you can just go down to SQL to do it (I'm assuming you have a class called Round in your app):
class Round
  DAILY_AVERAGE_SELECT = "SELECT
    DATE(rounds.created_at) AS day_date,
    AVG(rounds.audio_points) AS audio_points
    FROM rounds
    WHERE rounds.user_id = ?
    GROUP BY DATE(rounds.created_at)
  "

  def self.daily_average(user_id)
    connection.select_all(sanitize_sql_array([DAILY_AVERAGE_SELECT, user_id]), "daily-average")
  end
end
Doing this straight in the database will be faster (and also means less code) than doing it in Ruby as you're doing now.
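For context, a hypothetical controller action using that class method (select_all returns an ActiveRecord::Result, which .to_a turns into an array of hashes):
def daily_avg
  # One row per day: { "day_date" => ..., "audio_points" => ... }
  render json: Round.daily_average(current_user.id).to_a
end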
I'd advise you to do something like this:
grouped =
  records.order(:created_at).group_by do |r|
    r.created_at.strftime('%Y%m%d')
  end
First you let SQL fetch the records roughly in the shape you want, then you group the resulting records by their created_at field converted to just a date string.
points =
  grouped.map do |(date, values)|
    [ date, values.map(&:audio_points).reduce(0.0, :+) / values.size ]
  end.to_h
# => { "19700101" => 155.0, ... }
Then you map the grouped hash into pairs of date and average audio_points, and convert it back into a hash.
You can also use the group and calculation methods built into ActiveRecord: http://guides.rubyonrails.org/active_record_querying.html#group
http://guides.rubyonrails.org/active_record_querying.html#calculations
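For example, a minimal sketch of that built-in approach, assuming the rounds table and audio_points column from the question (DATE() is plain SQL, so the key format depends on your adapter):
# Group by calendar day in the database and let it compute the averages;
# returns a hash like { "2015-07-29" => 155.0, ... }.
current_user.rounds.group("DATE(created_at)").average(:audio_points)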

Ruby/Rails how to iterate months over a DateTime range?

I am trying to build a graph from data in a Rails table: The amount of sold products per time-fragment.
Because the graph should be able to show the last hour (in 1-minute steps), the last day (in 1-hour steps), the last week (in 1-day steps), the last month (in 1-day steps), etc., I am trying to reduce the code duplication by iterating over a range of DateTime objects:
# To prevent code-duplication, iterate over different time ranges.
times = {
  :hour  => { newer_than: 1.hour.ago,  timestep: :minute },
  :day   => { newer_than: 1.day.ago,   timestep: :hour },
  :week  => { newer_than: 1.week.ago,  timestep: :day },
  :month => { newer_than: 1.month.ago, timestep: :day }
}
products = Product.all
# Create symbols `:beginning_of_minute`, `:beginning_of_hour`, etc. These are used to group products and timestamps by.
times.each do |name, t|
  t[:beginning_of] = ("beginning_of_" << t[:timestep].to_s).to_sym
end
graphs = times.map do |name, t|
  graphpoints = {}
  seconds_in_a_day = 1.day.to_f
  step_ratio = 1.send(t[:timestep]).to_f / seconds_in_a_day
  time_enum = 1.send(t[:timestep]).ago.to_datetime.step(DateTime.now, step_ratio)
  time_enum.each do |time_moment|
    graphpoints[time_moment.send(t[:beginning_of]).to_datetime] = []
  end
  # Load all products that are visible in this graph size
  visible_products = products.select { |p| p.created_at >= t[:newer_than] }
  # Group them per graph point
  grouped_products = visible_products.group_by { |item| item.created_at.send(t[:beginning_of]).to_datetime }
  graphpoints.merge!(grouped_products)
  {
    points: graphpoints,
    labels: graphpoints.keys
  }
end
This code works great for all time intervals that have a constant size (hour, day, week). For months, however, it uses a step_ratio of 30 days: 1.month / 1.day == 30. Obviously, the number of days in a month is not constant. In my script, this means a month might be 'skipped' and therefore be missing from the graph.
How can this problem be solved? How to iterate over months while keeping the different amount of days in the months in mind?
If you have to select months over a gigantic range, just build a Range between two Dates and keep the first day of each month:
(1.year.ago.to_date..DateTime.now.to_date).select { |date| date.day == 1 }.each do |date|
  p date
end
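An equivalent sketch that avoids walking every day: Date#>> advances by whole calendar months, so the varying month lengths are handled for you (assumes ActiveSupport for 1.year.ago and beginning_of_month):
month = 1.year.ago.to_date.beginning_of_month
while month <= Date.today
  p month
  month = month >> 1  # first day of the next calendar month
end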
Use groupdate gem. For example (modified example from the docs):
visible_products = Product.where("created_at > ?", 1.week.ago).group_by_day(:created_at).count
# {
#   2013-07-29 00:00:00 UTC => 50,
#   2013-07-30 00:00:00 UTC => 100,
#   2013-08-02 00:00:00 UTC => 34
# }
Also, this will be much faster, because the grouping/counting is done by the database itself, without passing every record to your Rails code via a Product.all call and without building an ActiveRecord object for each row (even irrelevant ones).
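And since the original problem is monthly buckets, the same gem has group_by_month, which sidesteps the fixed 30-day step entirely; a short sketch:
# Counts per calendar month for the last year, bucketed by the database.
Product.where("created_at > ?", 1.year.ago).group_by_month(:created_at).count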

Is there an easy way to fetch year from a date string in ruby?

might be a stupid question...
I am new to Ruby, and recently I've been writing a rake task to merge multiple tables into a general one. One thing I need to do is fetch the date from the database, convert the date into two integers (year and month), and then save them into two separate columns.
I finished this task file last week, but unfortunately the file was removed by accident, so I have to write the code again. I don't remember how I manipulated the date in the original file; I think the approach I took there was more straightforward than my current code. The current code is as follows.
fetched_time = DateTime.strptime(pr.fetched_time, "%Y-%m-%d")
dr.year  = fetched_time.strftime('%Y').to_i
dr.month = fetched_time.strftime('%m').to_i
I have tried many keywords in my searches, but none of the results were helpful. Is the code above the best way to convert the date string to integers?
Thank you very much.
Yes, it's possible using Date#year:
require 'date'
d = Date.parse("20-08-2013")
d.year # => 2013
now = Time.now.to_s
# => "2013-09-10 11:09:14 -0500"
fetched_time=DateTime.strptime(now, "%Y-%m-%d").to_s
# => "2013-09-10T00:00:00+00:00"
year = Date.parse(fetched_time).year
# => 2013
month = Date.parse(fetched_time).month
# => 9
year.class
# => Fixnum
month.class
# => Fixnum
Or
fetched_date=Date.strptime(now, "%Y-%m-%d").to_s
# => "2013-09-10"
date = Date.parse(fetched_date)
# => #<Date: 2013-09-10 ((2456546j,0s,0n),+0s,2299161j)>
Wouldn't you rather use a Date object than a String anyway? What do timestamps consist of? I'm new to Rails and ActiveRecord.
What are you setting your ActiveRecord::Base.default_timezone to be?
In case you want to know what those extra numbers in a Date object are, try plugging them into Date.jd:
Date.jd(2299161)
# => #<Date: 1582-10-15 ((2299161j,0s,0n),+0s,2299161j)>
Date.jd(2456546)
# => #<Date: 2013-09-10 ((2456546j,0s,0n),+0s,2299161j)>
They are Julian Day Numbers. That last one is the date of calendar reform for Italy and some Catholic countries.
Date::ITALY
# => 2299161
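Tying this back to the original rake task, a minimal sketch assuming pr.fetched_time is a "YYYY-MM-DD..." string and dr has integer year and month columns:
date = Date.parse(pr.fetched_time)
dr.year  = date.year   # e.g. 2013
dr.month = date.month  # e.g. 9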

How can I speed up my Ruby/Rake task, which counts occurrences of dates among 300K date strings?

I have an array of 300K strings which represent dates:
date_array = [
  "2007-03-25 14:24:29",
  "2007-03-25 14:27:00",
  ...
]
I need to count occurrences of each date in this array (e.g., all date strings for "2011-03-25"). The exact time doesn't matter -- just the date. I know the range of dates within the file. So I have:
Date.parse('2007-03-23').upto Date.parse('2011-10-06') do |date_to_count|
  count = 0
  date_array.each do |date_string|
    if Date.parse(date_string) >= date_to_count &&
       Date.parse(date_string) <= date_to_count
      count += 1
    end
  end
  puts "#{date_to_count} occurred #{count} times."
end
Counting occurrences of just one date takes longer than 60 seconds on my machine. In what ways can I optimize the performance of this task?
Possibly useful notes: I'm using Ruby 1.9.2. This script is running in a Rake task with rake 0.9.2. The date_array is loaded from a CSV file. On each iteration, the count is saved as a record in my Rails project database.
You don't need to parse the dates at all if they are all formatted the same way. Knowing your data is one of the most powerful tools you can have.
If the datetime strings are all in the same format (yyyy-mm-dd HH:MM:SS) then you could do something like
date_array.group_by { |datetime| datetime[0..9] }
This will give you a hash with the date strings as the keys and arrays of the matching datetime strings as the values:
{
  "2007-05-06" => [...],
  "2007-05-07" => [...],
  ...
}
So you'd have to get the length of each array
date_array.group_by { |datetime| datetime[0..9] }.each do |date_string, dates|
  puts "#{date_string} occurred #{dates.length} times."
end
Of course, that method wastes memory by building arrays of dates you don't actually need, so how about:
A more memory-efficient method
date_counts = {}
date_array.each do |date_string|
  date = date_string[0..9]
  date_counts[date] ||= 0 # initialize count if necessary
  date_counts[date] += 1
end
You'll end up with a hash with the date strings as the keys and the counts as values
{
  "2007-05-06" => 123,
  "2007-05-07" => 456,
  ...
}
Putting everything together
date_counts = {}
date_array.each do |date_string|
  date = date_string[0..9]
  date_counts[date] ||= 0 # initialize count if necessary
  date_counts[date] += 1
end

Date.parse('2007-03-23').upto Date.parse('2011-10-06') do |date_to_count|
  puts "#{date_to_count} occurred #{date_counts[date_to_count.to_s].to_i} times."
end
This is a really awful algorithm to use. You're scanning through the entire list for each date, and further, you're parsing the same date twice for no apparent reason. That means for N dates in the range and M dates in the list you're doing N*M*2 date parses.
What you really need is to use group_by and do it in one pass:
dates = date_array.group_by do |date_string|
  Date.parse(date_string)
end
Then you can use this as a reference for your counts:
Date.parse('2007-03-23').upto Date.parse('2011-10-06') do |date_to_count|
  puts "#{date_to_count} occurred #{dates[date_to_count] ? dates[date_to_count].length : 0} times."
end

Thinking Sphinx with a date range

I am implementing a full text search API for my rails apps, and so far have been having great success with Thinking Sphinx.
I now want to implement a date range search, and keep getting the "bad value for range" error.
Here is a snippet of the controller code; I'm a bit stuck on what to do next.
@search_options = { :page => params[:page], :per_page => params[:per_page] || 50 }
unless params[:since].blank?
  # make sure date is in specified format - YYYY-MM-DD
  d = nil
  begin
    d = DateTime.strptime(params[:since], '%Y-%m-%d')
  rescue
    raise ArgumentError, "Value for since parameter is not a valid date - please use format YYYY-MM-DD"
  end
  @search_options.merge!(:with => { :post_date => d..Time.now.utc })
end
logger.info @search_options
@posts = Post.search(params[:q], @search_options)
When I look at the log, I see this bit, which seems to imply the date hasn't been converted into the same format as Time.now.utc:
with post_date => 2010-05-25T00:00:00+00:00..Tue Jun 01 17:45:13 UTC 2010
Any ideas? Basically I am trying to have the API request pass in a "since" date to see all posts after a certain date. I am specifying that the date should be in the YYYY-MM-DD format.
Thanks for your help.
Chris
EDIT: I just changed the date parameters merge statement to this
@search_options.merge!(:with => { :post_date => d.to_date..DateTime.now })
and now I get this error
undefined method `to_i' for Tue, 25 May 2010:Date
So obviously there is something still not set up right...
Let's say d = "2010-12-10".
Then :post_date => (d.to_time.to_i..Time.now.to_i) would have gotten you there. I just did this in my project and it works great.
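In the controller from the question, that would look roughly like this (a sketch only, assuming post_date is indexed as a Sphinx timestamp/integer attribute):
d = DateTime.strptime(params[:since], '%Y-%m-%d')
# Compare as integer epoch seconds so both ends of the range are the same type.
@search_options.merge!(:with => { :post_date => (d.to_time.to_i..Time.now.to_i) })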
I finally solved this; it takes a slightly different approach, but it works fine.
I was trying to put the date-range search inside a sphinx_scope (in the model) or as a :condition or :with (in the controller). This did not work, so instead I had to implement it inside the define_index in the model.
So what I did was put a check in the define_index to see if a record fell within a date range, the date range being defined by some SQL code, as shown below. In this case, I wanted to see if "start_date" fell within a date between now and 30 days ago, and an "end_date" fell within today and 30 days from now.
If the dates fell within the ranges, the code below causes the :live to be 0 or 1, depending on whether it falls outside or inside the date ranges (respectively):
define_index do
  # fields:
  ...
  # attributes:
  has "CASE WHEN start_date > DATE_ADD(NOW(), INTERVAL -30 DAY) AND end_date < DATE_ADD(NOW(), INTERVAL 30 DAY) THEN 1 ELSE 0 END", :type => :integer, :as => :live
  ...
  # delta:
  ...
end
Then in your controller, all you have to do is check if :live => 1 to obtain all records that have start_dates and end_dates within the date ranges.
I used a sphinx_scope like this:
sphinx_scope(:live) {
  { :with => { :live => 1 } }
}
and then in my controller:
@models = Model.live.search(...)
To make sure it works well, you of course need to implement frequent reindexing to make sure the index is up to date, i.e. the correct records are :live => 1 or 0!
Anyway, this is probably a bit late for you now, but I implemented it and it works like a charm!!!
Wouldn't it work if you replaced
d = DateTime.strptime(params[:since], '%Y-%m-%d')
by
Time.parse(params[:since]).strftime("%Y-%m-%d")
(It seems the first one doesn't return a date in the expected format)
