Extracting date values from hashes within Ruby array - ruby-on-rails

I have an array that contains a series of hashes as listed below.
[{:pgs=>6, :oo=>"No", :ms_no=>"D3273", :auth=>"Johnson", :ms_recd=>"Mar 14", :ready_for_ce=>"Mar 15", :proj_back_from_ce=>"Apr 1", :final_ok=>"Jul 25", :pub_date=>"Aug 5", :notes=>" "},
{:pgs=>17, :oo=>"No", :ms_no=>"R4382", :auth=>"Jacobs", :ms_recd=>"Apr 12", :ready_for_ce=>"Apr 16", :proj_back_from_ce=>"May 17", :final_ok=>"Jul 10", :pub_date=>"June 10 ", :notes=>" "},
{:pgs=>15, :oo=>"No", :ms_no=>"L3291", :auth=>"Smith", :ms_recd=>"Mar 25", :ready_for_ce=>"Mar 26", :proj_back_from_ce=>"Apr 22", :final_ok=>"Jun 21", :pub_date=>"Aug 10 ", :notes=>"Au prompted for cx 4/30", nil=>" 5/15."}]
I need to take two date values from each hash: the one with the key :ms_recd and the one with the key :pub_date. I will then determine how many days elapsed between the two dates (for example, 18 days).
Fortunately, I have the last part pretty much figured out. I just need to do
ms_recd1 = DateTime.parse('Apr 24')
pub_date1 = DateTime.parse('Aug 15')
(pub_date1 - ms_recd1).to_i
This returns 113 (meaning 113 days). So let's say that for three hashes I pull out spans of 113, 162, and 94 days; I'd then average them to 123 days.
My question is, how do I pull these date values out of this array to do this? I feel like this should be simple, but I can't figure out how it should work.

My question is, how do I pull these date values out of this array to do this?
You could write the following to get those dates:
array.map { |h| h.values_at(:ms_recd, :pub_date) }
Full solution, as @robertodecurnex commented:
array.map { |h| (DateTime.parse(h[:pub_date]) - DateTime.parse(h[:ms_recd])).to_i }
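Putting it together, a minimal Ruby sketch of the whole calculation, with the hashes trimmed down to the two keys that matter:

```ruby
require 'date'

records = [
  { ms_recd: "Mar 14", pub_date: "Aug 5" },
  { ms_recd: "Apr 12", pub_date: "June 10" },
  { ms_recd: "Mar 25", pub_date: "Aug 10" }
]

# Number of days between receipt and publication for each record
spans = records.map do |h|
  (DateTime.parse(h[:pub_date]) - DateTime.parse(h[:ms_recd])).to_i
end

# Average span across all records, as a float
average = spans.sum.to_f / spans.size
```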

Related

DynamoDB Timeseries: Querying large timespans of data

I have a simple timeseries table:
{
  "n": "EXAMPLE",    # Name, Hash Key
  "t": 1640893628,   # Unix Timestamp, Range Key
  "v": 10            # Value being stored
}
Every 15 minutes I will poll data and insert into the table. If I want to query values between a 24-hour period, this works well - this would equate to a total of 96 records.
Now, say I want to query a larger timespan - 1 or 2 years. This is now tens of thousands of records, and (in my opinion) impractical to do regularly. This will require multiple queries to retrieve larger time ranges which would negatively impact response times as well as being much more costly.
I have thought of a couple of potential solutions to this problem:
1. Replicate data in another table, with larger increments. A table with a single record every 6 hours, for example.
2. Have another table to store common query results, such as records for "EXAMPLE" for the past week, month, and year (respectively). I would periodically update records in the new table to hold every N'th record in the main table (a total of 100). Something like:
{
  "n": "EXAMPLE#WEEKLY",
  "v": [
    {
      "t": 1640893628,
      "v": 10
    },
    {
      "t": 1640993628,
      "v": 15
    },
    ... 98 more.
  ]
}
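Picking every N'th record so that roughly 100 evenly spaced points remain can be sketched like this (a hypothetical Ruby helper, names mine):

```ruby
# Keep ~`target` evenly spaced points from a longer series.
# Returns the series unchanged when it is already small enough.
def downsample(points, target = 100)
  return points if points.size <= target
  step = points.size.fdiv(target)           # fractional stride between kept points
  (0...target).map { |i| points[(i * step).floor] }
end
```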
I believe #2 is a solid approach. It seems to me like this would be a common enough problem, so I would love to hear about how other people have approached it.
More options present themselves if you can convert your unix timestamps into ISO 8601-type strings like 2021-12-31T09:27:58+00:00.
If so, DynamoDB's begins_with key condition expression lets us query for discrete calendar time buckets. December 2021, for example, is queryable using n = id1 AND begins_with(t, "2021-12"). The same goes for days and hours. We can take this one step further by adding other periods as index sort keys.
Some rolling windows are possible, too: n = id1 AND t > [24 hours ago] gives us the last 24 hours.
n (PK)   t (SK)             hour_bucket (LSI1 SK)   week (LSI2 SK)
id1      2021-12-31T10:45   2021-12-31T09-12        2021-52
id1      2021-12-31T13:00   2021-12-31T13-15        2021-52
id1      2022-06-01T22:00   2022-06-01T22-24        2022-22
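Deriving these sort-key and bucket strings from a unix timestamp is mechanical. A small Ruby sketch (attribute names mirror the table above; UTC is assumed):

```ruby
require 'time'

# Derive ISO-8601-style sort key and bucket strings from a unix timestamp (UTC)
def bucket_keys(unix_ts)
  t = Time.at(unix_ts).utc
  {
    t:     t.strftime("%Y-%m-%dT%H:%M"),  # full sort key
    month: t.strftime("%Y-%m"),           # month bucket for begins_with
    week:  t.strftime("%G-%V")            # ISO week, e.g. "2021-52"
  }
end

keys = bucket_keys(1640893628)
```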
If you are looking for arbitrary time-series queries, you might consider Athena, as the other answer suggested, or AWS's serverless Timestream, which is a "purpose-built time series database that makes it easy to store and analyze trillions of time series data points per day."
You could export the table to Amazon S3 and run Amazon Athena on the exported data. Here’s a blog post describing the process: https://aws.amazon.com/blogs/aws/new-export-amazon-dynamodb-table-data-to-data-lake-amazon-s3/

In Snowflake, how do I get the date in the format I want?

I have been facing an issue while retrieving the year from a Snowflake table.
My table has values as below:
year: 20
day: 10
month: 02
I need the DOB value as 2020-10-02. When I use concat_ws I get the expected result; however, the year is padded with 00, so the DOB prints as 0020-10-02.
Also, when the year column holds 99, it should display as 1999.
I have created the query below:
select to_date(concat_ws('-', json:year::varchar, json:month::varchar, json:date::varchar), 'MM/DD/YYYY') from XXX;
Please also suggest any relevant functions.
Take a look at this:
YY
Two-digit year, controlled by the TWO_DIGIT_CENTURY_START session parameter; e.g., when set to 1980, values of 79 and 80 are parsed as 2079 and 1980 respectively.
select TO_DATE('99-10-02','YY-MM-DD');
There's no way to automagically display the "right" year: if some of your users are going to be very young and others very old, how could you know for sure whether a person was born in the 20th or 21st century?
I didn't quite get how your data is stored, so I'll assume a JSON structure, since one appears in your query.
set json = '{
  "elements": [
    { "year": "01", "month": "02", "day": "10" },
    { "year": "99", "month": "02", "day": "10" },
    { "year": "20", "month": "02", "day": "10" }
  ]
}'::varchar;
Now I'll parse the JSON and extract the values; I'm including this step so you can make sure we have the same data structure.
CREATE OR REPLACE TEMP TABLE test AS
select t2.VALUE: day::varchar dob_day,
t2.VALUE: month::varchar dob_month,
t2.VALUE: year::varchar dob_year
from (select parse_json($json) as json) t,
lateral flatten(input => parse_json(t.json), path => 'elements') t2
Now for the part that will interest you. It's a dirty trick: it assumes that if the two-digit year is greater than the current two-digit year, then it cannot be 20xx and must instead be 19xx.
SELECT to_date(concat_ws('-',
IFF(t2.dob_year > RIGHT(extract(year, current_timestamp), 2), '19'||t2.dob_year , '20'||t2.dob_year ),
t2.dob_month,
t2.dob_day)) proper_date
FROM test t2;
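The same century-pivot rule is easy to express outside SQL. A hypothetical Ruby helper for illustration (the function name and `pivot` keyword are mine):

```ruby
require 'date'

# Century pivot: a two-digit year greater than the current two-digit year
# is taken as 19xx, otherwise 20xx -- the same rule as the IFF above.
def expand_year(two_digit, pivot: Date.today.year % 100)
  yy = two_digit.to_i
  yy > pivot ? 1900 + yy : 2000 + yy
end
```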
Changing the date format in your to_date to 'YY-MM-DD' should give you the DOB you want, and I suggest using try_to_date if you suspect data issues, as it returns NULL when the value is not a valid date.
NOTE: if you are using US formatting, use 'YY-DD-MM' (month at the end).
select to_date(concat_ws('-', json:year::varchar, json:month::varchar, json:date::varchar), 'YY-MM-DD') from XXX;
Also, if you want to safely check that the DOB is not in the future, add IFF as follows:
IFF(CURRENT_DATE > to_date(concat_ws('-', json:year::varchar, json:month::varchar, json:date::varchar), 'YY-MM-DD'), to_date(concat_ws('-', json:year::varchar, json:month::varchar, json:date::varchar), 'YY-MM-DD'), NULL);

Calculate day count, day, and month name from a timestamp in Swift 3?

In my current project I need to show the most recent three months' event details from the given response (eventTimestamp) array.
This is the response:
[
  {
    "patientEventsId": 11,
    "userId": 72,
    "patientId": "CDMRI-U-2017030341",
    "doctorId": "CDMRIDR2017030012",
    "doctorEventsId": 18,
    "doctorEventName": "Hypoglycemia",
    "eventTimestamp": "2017-03-31 11:54:15",
    "recordTimestamp": "2017-03-31 11:54:30",
    "reviewed": false
  }
]
I need to calculate:
1. the names of the most recent three months
2. the dates within each of those months that appear in the response
3. the number of events (count) on each such date in that month
To find the difference between two NSDate timestamps using Swift, check the following links:
Link-1
Link-2
Link-3
Link-4
Link-5
Link-6
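The question targets Swift, but the grouping and counting logic is language-agnostic. Here is a minimal Ruby sketch of the idea, using made-up sample timestamps:

```ruby
require 'date'

timestamps = ["2017-03-31 11:54:15", "2017-03-31 12:10:00", "2017-02-14 09:00:00"]

# Group event dates by calendar month name...
by_month = timestamps
  .map { |ts| DateTime.parse(ts).to_date }
  .group_by { |d| d.strftime("%B %Y") }   # e.g. "March 2017"

# ...then count events per date within each month
counts = by_month.transform_values { |dates| dates.tally }
```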

How to get date out of a cell containing the string "2017|03"?

Here is my data:
I am trying to build a SUMIFS formula to sum the sessions, if the month = "last month" (i.e., parsed out of these strings), and the Channel Grouping = "Display".
Here's what I have so far:
=SUMIFS(H3:H,F3:F,________,G3:G,"Direct")
Since this is a string, not a date, I am not sure how to get it to match "last month".
Why not build up a string like this (or just hard-code it?)
=sumifs(H3:H,F3:F,year(today())&"|"&text(month(today())-1,"00"),G3:G,"Direct")
This builds up a string equal to "2017|03" by taking the year from today's date (2017) and one less than today's month number; at the time of writing that is April, so 4-1=3. The TEXT function formats it with a leading zero. So the whole thing is "2017" & "|" & "03", which gives "2017|03"; this is compared against column F.
Note: January would be a special case (for January 2018, the existing formula would give "2018|00" for the previous month), so it would need a bit of extra code to cover this case and be fully automatic.
By 'hard-code it' I mean just put 2017|03 in as a literal string like this
=sumifs(H3:H,F3:F,"2017|03",G3:G,"Direct")
then just change it manually for different months.
Here is a more general formula
=sumifs(H3:H,F3:F,year(eomonth(today(),-1))&"|"&text(month(eomonth(today(),-1)),"00"),G3:G,"Direct")
Just change the -1 to -2 etc. for different numbers of months.
EDIT
In light of @Max Makhrov's answer, this can be shortened significantly to
=sumifs(H3:H,F3:F,text(eomonth(today(),-1),"YYYY|MM"),G3:G,"Direct")
I would like to add two more options:
1
This formula is slightly shorter and more powerful, because it gives full control over the date format:
=TEXT(TODAY(),"YYYY|MM")
formula syntax is here:
https://support.google.com/docs/answer/3094139?hl=en
2
In your case, converting the date to a string is more efficient because it is calculated once in the formula, so there are fewer calculations. But sometimes you need to convert text into a date. In that case I prefer using regular expressions:
=JOIN("/",{REGEXEXTRACT("2017|03","(\d{4})\|(\d{2})"),1})*1
How it works
REGEXEXTRACT("2017|03","(\d{4})\|(\d{2})") gives output in 2 separate cells:
2017 03
{..., 1} appends a 1 to the right of the extracted values:
2017 03 1
JOIN("/", ...) joins the input:
2017/03/1
This looks like a date, but to make it a real date, multiply it by 1:
"2017/03/1"*1 converts the date-like string into the number 42795, which is the serial number for 1 March 2017.
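As a side note, the same two-way conversion between "YYYY|MM" strings and dates is straightforward in Ruby (not part of the spreadsheet answer, just for comparison):

```ruby
require 'date'

# Parse a "YYYY|MM" string into a Date (day defaults to the 1st)...
d = Date.strptime("2017|03", "%Y|%m")

# ...and format a Date back into the "YYYY|MM" form
s = Date.new(2017, 3, 1).strftime("%Y|%m")
```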

Count number of products since last 30 days in google spreadsheet

I have products I've sold which I would like to count for the last 30 days.
I've tried this formula:
=COUNTIFS(C:C;"=My_Item";A:A;">=11 novembre 2014")
It works. But now, instead of "11 novembre 2014", I would like something like "TODAY()-30", but it keeps returning "0".
I assume it is a problem of date format but can't figure out what happens.
Any ideas?
Make sure the column with dates is formatted as a date. Then I think this should work:
=COUNTIFS(C:C;"=My_Item";A:A;">="&TODAY()-30)
or alternatively:
=SUMPRODUCT((C:C="My_Item")*(A:A>=TODAY()-30))
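For comparison outside the spreadsheet, the same "count matching items in the last 30 days" logic can be sketched in Ruby (sample data invented):

```ruby
require 'date'

sales = [
  { item: "My_Item", date: Date.new(2014, 11, 20) },
  { item: "My_Item", date: Date.new(2014, 9, 1) },
  { item: "Other",   date: Date.new(2014, 11, 25) }
]

# Count sales of `item` dated within the 30 days up to `today`
def count_recent(sales, item, today: Date.today)
  sales.count { |s| s[:item] == item && s[:date] >= today - 30 }
end

n = count_recent(sales, "My_Item", today: Date.new(2014, 12, 1))
```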
