How to identify a set of dates from a string in rails - ruby-on-rails

I have the following strings
"sep 04 apr 06"
"29th may 1982"
"may 2006 may 2008"
"since oct 11"
Output
"September 2004 and April 2006"
"29 May 1982"
"May 2006 and May 2008"
"October 2011"
Is there a way to obtain the dates from these strings? I used the gem 'dates_from_string', but it is unable to correctly obtain the dates in the first scenario.

When you say "unfortunately I can't predict in which format the date is going to be in", you imply that you actually need natural language parsing, which is something the core Date and DateTime classes cannot and should not do.
So you will need to pre-process the strings into a format that the stricter parser understands, such as DateTime.parse('sep 04'). For your examples, it could be as simple as:
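datestring = 'sep 04 apr 06'
# scan returns every match; String#match would return only the first one
matches = datestring.scan(/[a-z]{3}\s\d{2,4}/i)
# note: Date.parse reads 'sep 04' as 4 September of the current year
if matches.many? # many? comes from ActiveSupport (Rails)
  matches.map { |m| Date.parse(m) }.join(' and ')
else
  Date.parse(datestring)
end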
However, if you want true natural language parsing, have a look at Chronic, which has all sorts of fancy parsers, such as Chronic.parse('summer').
Edit: on closer inspection, it seems Chronic too can only identify one date per string, so your example 'sep 04 apr 06' still needs some pre-processing.
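A minimal sketch of that pre-processing, assuming every date of interest is a three-letter month abbreviation followed by a two- or four-digit year. It covers 'sep 04 apr 06', 'may 2006 may 2008' and 'since oct 11', but it would drop the day from '29th may 1982'. The helper name month_year_dates is hypothetical.

```ruby
require 'date'

# Month abbreviations from the standard library ("Jan".."Dec"); index 0 is nil.
MONTHS = Date::ABBR_MONTHNAMES.compact.join('|')
MONTH_YEAR = /\b(#{MONTHS})\s+(\d{4}|\d{2})\b/i

# Hypothetical helper: returns one Date (first of the month) per "mon yy" pair.
def month_year_dates(str)
  str.scan(MONTH_YEAR).map do |mon, year|
    fmt = year.size == 4 ? '%b %Y' : '%b %y'
    Date.strptime("#{mon} #{year}", fmt)
  end
end

dates = month_year_dates('sep 04 apr 06')
puts dates.map { |d| d.strftime('%B %Y') }.join(' and ')
```

Using strptime with an explicit format, rather than Date.parse, is what makes '04' a year instead of a day of the month.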

The approach I took is as follows:
1. Divide the string into an array of words.
2. If the array contains fewer than two words, return an array containing all the date strings found; else go to step 3.
3. If the array contains at least three words and the first three words represent a date, save it, delete the first three words in the array and repeat step 2; else go to step 4.
4. If the first two words represent a date, save it, delete the first two words in the array and repeat step 2; else go to step 5.
5. Delete the first word in the array and go to step 2.
I search for dates using the class method Date::strptime. strptime employs a format string. For example, '%d %b %Y' searches for the day of the month, followed by a space, followed by a (case-insensitive) three-character month abbreviation ('Jan', 'Feb',...,'Dec'), followed by a four-digit year. (I initially considered using Date::parse, but that does not discriminate dates adequately.)
Code
I first generate all the strptime format strings of interest for month, day and year:
MON = %w{ %b %B } # '%b' for 'Jan', '%B' for 'January'
YR = %w{ %y %Y } # '%y' for '11', '%Y' for 2011
DAY = %w{ %d } # '4', '04' or '28'
PERM3 = MON.product(YR, DAY).
  flat_map { |arr| arr.permutation(3).to_a }.
  map { |arr| arr.join(' ') }
#=> ["%b %y %d", "%b %d %y", "%y %b %d", "%y %d %b", "%d %b %y", "%d %y %b",
# "%b %Y %d", "%b %d %Y", "%Y %b %d", "%Y %d %b", "%d %b %Y", "%d %Y %b",
# "%B %y %d", "%B %d %y", "%y %B %d", "%y %d %B", "%d %B %y", "%d %y %B",
# "%B %Y %d", "%B %d %Y", "%Y %B %d", "%Y %d %B", "%d %B %Y", "%d %Y %B"]
I then do the same for permutations of day and month and month and year:
PERM2 = MON.product(YR).
  concat(MON.product(DAY)).
  flat_map { |arr| arr.permutation(2).to_a }.
  map { |arr| arr.join(' ') }
#=> ["%b %y", "%y %b", "%b %Y", "%Y %b", "%B %y", "%y %B",
# "%B %Y", "%Y %B", "%b %d", "%d %b", "%B %d", "%d %B"]
I then proceed as follows:
require 'date'
def pull_dates(str)
  arr = str.split
  dates = []
  while arr.size > 1
    if arr.size > 2
      a = depunc(arr[0,3])
      if date?(a, PERM3)
        dates << a.join(' ')
        arr.shift(3)
        next
      end
    end
    a = depunc(arr[0,2])
    if date?(a, PERM2)
      dates << a.join(' ')
      arr.shift(2)
      next
    end
    arr.shift
  end
  dates
end
depunc removes a punctuation character at the beginning and at the end of the string arr.join(' ').
def depunc(arr)
  arr.join(' ').gsub(/^\W|\W$/,'').split
end
date? determines if the three- or two-element array arr represents a date. I first obtain a "cleaned" string from arr, and then search through the applicable strptime format strings (the argument perm), looking for one that shows the cleaned string can be converted to a date.
def date?(arr, perm)
  clean = to_str_and_clean(arr)
  perm.any? do |s|
    begin
      Date.strptime(clean, s)
      true
    rescue ArgumentError # strptime raises this when the format does not fit
      false
    end
  end
end
to_str_and_clean returns a cleaned string with punctuation removed and with suffixes such as 'st', 'nd', 'rd', and 'th' stripped from the numerical representation of the day.
def to_str_and_clean(arr)
  arr.map { |s| s[0][/\d/] ? s.to_i.to_s : s }.join(' ').tr('.?!,:;', '')
end
Example
Let's try it.
str =
"Bubba sighted a flying saucer on sep 04 2013 and again in apr 06. \
Greta was born on 29th may 1982. Hey, may 2006 may 2008 are two years apart.\
We have been at loose ends since oct 11 of this year."
pull_dates(str)
#=> ["sep 04 2013", "apr 06", "29th may 1982", "may 2006 may", "oct 11"]
Well, as you see, it's not perfect. Some tweaking is required, but this might get you started.

You can use the DateTime class like so:
DateTime.parse('sep 04 apr 06')
which outputs a DateTime object:
#<DateTime: 2006-04-04T00:00:00+00:00 ((2453830j,0s,0n),+0s,2299161j)>

You can use the DateTime.strptime method.
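As a rough illustration for the "29th may 1982" case, assuming the string holds exactly one date in day, month, year order: the ordinal suffix has to be removed first, because the '%d' directive only consumes the digits.

```ruby
require 'date'

raw = '29th may 1982'

# Drop an ordinal suffix (st, nd, rd, th) that directly follows digits.
cleaned = raw.sub(/(\d+)(?:st|nd|rd|th)\b/i, '\1')

# '%d %b %Y' = day of month, abbreviated month name, four-digit year.
date = DateTime.strptime(cleaned, '%d %b %Y')
puts date.strftime('%-d %B %Y')
```

Unlike DateTime.parse, strptime raises an error when the string does not fit the format, so an unexpected layout is not silently misread.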

Related

Rails admin Date fields

When I update or create a record that has Date fields, it works fine when the locale is en, but when the locale is ar it sets a nil value. Am I missing anything?
ar:
  date:
    formats:
      short: "%B %Y"
      short_month: "%B %Y"
      short_day_month: "%B %d"
      short_day_month_year: "%d %B %Y"
      short_day_month_year_weekday: "%A %d/%m/%Y"
      month_only: "%B"
      short_month_day_year: '%B %-d, %Y'
      long: '%d %B %Y'
Try defining them under the time key instead of date:
ar:
  time:
    formats:
      short: "%B %Y"
      short_month: "%B %Y"
      short_day_month: "%B %d"
      short_day_month_year: "%d %B %Y"
      short_day_month_year_weekday: "%A %d/%m/%Y"
      month_only: "%B"
      short_month_day_year: '%B %-d, %Y'
      long: '%d %B %Y'

Efficiency of Ruby code: hash of month + frequency into formatted sorted array

I've hacked up some code that fulfills its purpose, but it feels very clunky and inefficient. Each entry in a large table has a month + year string associated with it: "September 2016" etc. From these I create an array of months and their frequencies, ordered most recent first, to be used in a dropdown selection form: ['November 2016 (5)', 'September 2016 (5)'].
@months = []
banana = Post.pluck(:month)
# array of all months posted in, e.g. ['September 2016', 'July 2017', ...]
strawberry = banana.each_with_object(Hash.new(0)) { |key, hash| hash[key] += 1 }
# hash of unique month + frequency
strawberry.each { |k, v| strawberry[k] = "(#{v.to_s})" }
# value into string with brackets
pineapple = (strawberry.sort_by { |k, _| Date.strptime(k, "%b %Y") }).reverse
# sorts into array of months ordered by most recent
pineapple.each { |month, frequency| @months.push("#{month}" + " " + "#{frequency}") }
# array of formatted months + frequency, e.g. ['July 2017 (5)', 'September 2016 (5)']
I was hoping some of the Ruby gurus here could advise me in some ways to improve this code. Any ideas or suggestions would be greatly appreciated!
Thanks!
['September 2016', 'July 2017', 'September 2016', 'July 2017']
.group_by { |e| e } # .group_by(&:itself) since Ruby2.3
.sort_by { |k, _| Date.parse(k) }
.reverse
.map { |k, v| "#{k} (#{v.count})" }
#⇒ [
# [0] "July 2017 (2)",
# [1] "September 2016 (2)"
# ]
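On Ruby 2.7 or later, the counting step can also be sketched with Enumerable#tally, which avoids building the intermediate arrays that group_by creates. The months array below is sample data standing in for Post.pluck(:month).

```ruby
require 'date'

# Sample data; in the question this would come from Post.pluck(:month).
months = ['September 2016', 'July 2017', 'September 2016', 'July 2017']

# tally builds the { month => count } hash in one pass.
labels = months.tally
               .sort_by { |month, _| Date.strptime(month, '%B %Y') }
               .reverse
               .map { |month, count| "#{month} (#{count})" }

p labels
```

If the table is large, letting the database count with Post.group(:month).count is probably cheaper than plucking every row, assuming month is a plain column on the Post model.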

How to convert a date range string into the datetime data type

I need to convert the following raw strings (date ranges) into Ruby datetime values.
How can I do this in Rails?
raw string
"2014 April/July 24-1"
convert to ruby datetime variable
start_date = 2014-04-24
end_date = 2014-07-01
raw string
"2015 April 06-20"
convert to ruby datetime variable
start_date = 2015-04-06
end_date = 2015-04-20
This may help:
# In order to generate
# year = 2014
# months = "April/July"
# days = "24-1"
/(?<year>\d{4})\s*(?<months>\w+\/\w+)\s*(?<days>\d{1,2}\-\d{1,2})/ =~ "2014 April/July 24-1"
date1 = "#{year} #{months.split('/')[0]} #{days.split('-')[0]}"
date2 = "#{year} #{months.split('/')[1]} #{days.split('-')[1]}"
start_date = Date.strptime(date1, "%Y %b %d") #=> Thu, 24 Apr 2014
end_date = Date.strptime(date2, "%Y %b %d") #=> Tue, 01 Jul 2014
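The pattern above requires a slash between two month names, so it does not match the second input, "2015 April 06-20". A sketch that handles both forms, assuming full English month names and the same "year months days" layout; parse_range is a hypothetical helper name.

```ruby
require 'date'

# Hypothetical helper: the second month is optional in the pattern.
RANGE = %r{\A(?<year>\d{4})\s+(?<months>[A-Za-z]+(?:/[A-Za-z]+)?)\s+(?<days>\d{1,2}-\d{1,2})\z}

def parse_range(raw)
  m = raw.match(RANGE)
  return nil unless m

  first_month, last_month = m[:months].split('/')
  last_month ||= first_month # single month: both ends share it
  first_day, last_day = m[:days].split('-')

  [[first_month, first_day], [last_month, last_day]].map do |month, day|
    Date.strptime("#{m[:year]} #{month} #{day}", '%Y %B %d')
  end
end

start_date, end_date = parse_range('2014 April/July 24-1')
puts start_date, end_date

start_date, end_date = parse_range('2015 April 06-20')
puts start_date, end_date
```

The helper returns nil for input that does not fit the layout, so callers can decide how to handle unexpected strings.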

How to convert the strings 2012, December, 15 to a DateTime object

I first receive the strings "2012" and "December".
After that I receive the day strings 1, 15 and 30.
I should convert them into DateTime objects:
2012-12-01
2012-12-15
2012-12-30
How can I do it?
Date.strptime('2012 December', '%Y %B')
#=> Sat, 01 Dec 2012
date = Date.strptime('2012 December 10', '%Y %B %d')
#=> Mon, 10 Dec 2012
date.strftime("%Y-%m-%d")
#=> "2012-12-10"
You have this string:
[1] pry(main)> str = "2012 December 1,15, 30"
=> "2012 December 1,15, 30"
And you want to get three dates from it (1st, 15th and 30th of December). First of all, get the year, month and days from this string:
[2] pry(main)> m = /(?'year'\d*)\s(?'month'[a-zA-Z]*)(?'days'[\d, ]*)/.match str
=> #<MatchData
"2012 December 1,15, 30"
year:"2012"
month:"December"
days:" 1,15, 30">
Then split the days into an array and iterate over it, combining each day with the year and month to build a date object (you can use Date.strptime as shivam showed above, or Time.parse):
[10] pry(main)> require 'time'
=> true
[11] pry(main)> m[:days].split(',').collect {|day| Time.parse "#{m[:year]} #{m[:month]} #{day}"}
=> [2012-12-01 00:00:00 +0100,
2012-12-15 00:00:00 +0100,
2012-12-30 00:00:00 +0100]
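If plain dates without a time zone are enough, the same idea can be sketched with Date.strptime, assuming the string always has the layout "year month day-list" with a full month name.

```ruby
require 'date'

str = '2012 December 1,15, 30'

# Split on whitespace into at most three parts: year, month, the day list.
year, month, days = str.split(' ', 3)

dates = days.split(',').map do |day|
  Date.strptime("#{year} #{month} #{day.strip}", '%Y %B %d')
end

puts dates.map { |d| d.strftime('%Y-%m-%d') }
```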

DateTime parse undefined method `year'

I'm trying to parse a string to DateTime and I'm experiencing an error when I try to drop the year:
undefined method `year' for "Monday, Aug 25, 10:30":String
Controller
dates = []
temps = []
dt = []
@data['data'].flatten.each do |data|
  dates << data.keys
  temps << data.values
end
dates.flatten.each do |date|
  dt << DateTime.parse(date).strftime("%A, %b %d, %H:%M")
end
json
{"status": "ok", "data": [{"2014-08-25 10:30:00": 12.6}]}
Your example isn't clear because you're not showing your use of the year method. Note, though, that strftime returns a String, so calling year on its result raises NoMethodError, and that you've left out %y from your strftime string. I would recommend playing around in the Rails console to get the methods you're looking for. For example:
[1] pry(main)> DateTime.parse("2014-08-25 10:30:00")
Mon, 25 Aug 2014 10:30:00 +0000
[2] pry(main)> DateTime.parse("2014-08-25 10:30:00").year
2014
[3] pry(main)> DateTime.parse("2014-08-25 10:30:00").strftime("%A, %b %d %y, %H:%M")
"Monday, Aug 25 14, 10:30"
[4] pry(main)> DateTime.parse("2014-08-25 10:30:00").strftime("%A, %b %d %y, %H:%M").year
NoMethodError: undefined method `year' for "Monday, Aug 25 14, 10:30":String
from (pry):4:in `<main>'
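One way around this, sketched here with a hash standing in for one element of the question's JSON data, is to keep the parsed DateTime objects and only call strftime when building the display strings.

```ruby
require 'date'

# Stand-in for one entry of the "data" array from the question's JSON.
data = { '2014-08-25 10:30:00' => 12.6 }

# Keep the DateTime objects so methods such as year stay available.
parsed = data.keys.map { |key| DateTime.parse(key) }

years  = parsed.map(&:year)
labels = parsed.map { |dt| dt.strftime('%A, %b %d, %H:%M') }

p years
p labels
```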
