SPSS - flag cases within a calendar month

I have a list of prisoners and when their prison term started (PrisonStart) and when it ended (PrisonEnd). If they're still in prison, PrisonEnd is blank.
I would like to flag prisoners who were in prison at least one full calendar month during a 6-month period (1/1/16 to 5/30/16).
compute PeriodBeg = date.mdy(01,01,16).
compute PeriodEnd = date.mdy(05,30,16).
formats PeriodBeg PeriodEnd (adate10).
execute.
Any suggestions for how best to go about this? Seems I might need to compare prisoners' start and end dates separately for each month during the 6-month period (like below), and then select any prisoner with at least one full month, but I'm wondering if there's a more efficient way.
if ((PrisonStart le date.mdy(1,1,2016)) and ((PrisonEnd ge date.mdy(1,31,2016)) | missing(PrisonEnd))) InPrisonJan = 1.
if ((PrisonStart le date.mdy(2,1,2016)) and ((PrisonEnd ge date.mdy(2,29,2016)) | missing(PrisonEnd))) InPrisonFeb = 1.
etc.
execute.
Some sample data below. The first two prisoners should be flagged as having been in prison for at least one full calendar month during the 6-month period (OneMonth = 1). The last four prisoners were not in prison for at least one full calendar month during the 6-month period (OneMonth = 0).
data list list /PrisonerID (F8.0) PrisonStart (adate10) PrisonEnd (adate10) PeriodBeg (adate10) PeriodEnd (adate10).
begin data
1 10.3.14 7.12.16 1.1.16 5.30.16
2 2.9.16 4.1.16 1.1.16 5.30.16
3 5.2.16 10.11.16 1.1.16 5.30.16
4 12.1.13 2.8.14 1.1.16 5.30.16
5 1.7.16 1.20.16 1.1.16 5.30.16
6 1.1.17 3.2.17 1.1.16 5.30.16
end data.

The following syntax avoids spelling out the last day of each month separately, so it can be automated across any number of months. The trick for getting the last day of a month is to ask for day "0" of the NEXT month:
do repeat inpr=inPrison1 to inPrison5/ mon=1 to 5.
compute inpr=( PrisonStart<=date.dmy(1,mon,2016) and
(PrisonEnd>=date.dmy(0,mon+1,2016) or missing (PrisonEnd)) ).
end repeat.
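For anyone who wants to sanity-check the logic outside SPSS, here is a minimal Python sketch of the same idea (it is not SPSS syntax). The helper names month_end and one_full_month are made up for this illustration, and the months checked are assumed to be January to May 2016, as in the do repeat loop above. The last day of a month is taken as the day before the 1st of the following month, which is the same trick as asking for day 0.

```python
from datetime import date, timedelta

def month_end(year, month):
    """Last day of a month: the day before the 1st of the next month."""
    if month == 12:
        return date(year, 12, 31)
    return date(year, month + 1, 1) - timedelta(days=1)

def one_full_month(start, end, year=2016, months=range(1, 6)):
    """Return 1 if the stay covers at least one whole calendar month.

    end=None stands for a blank PrisonEnd (still in prison).
    """
    for m in months:
        first, last = date(year, m, 1), month_end(year, m)
        if start <= first and (end is None or end >= last):
            return 1
    return 0

# Sample data from the question: PrisonerID -> (PrisonStart, PrisonEnd)
prisoners = {
    1: (date(2014, 10, 3), date(2016, 7, 12)),
    2: (date(2016, 2, 9), date(2016, 4, 1)),
    3: (date(2016, 5, 2), date(2016, 10, 11)),
    4: (date(2013, 12, 1), date(2014, 2, 8)),
    5: (date(2016, 1, 7), date(2016, 1, 20)),
    6: (date(2017, 1, 1), date(2017, 3, 2)),
}
flags = {pid: one_full_month(s, e) for pid, (s, e) in prisoners.items()}
print(flags)  # {1: 1, 2: 1, 3: 0, 4: 0, 5: 0, 6: 0}
```

In SPSS itself the five inPrison flags would still need to be combined into a single OneMonth flag; the sketch does that step by returning as soon as one month matches.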

Related

count how many specific days are in a time period

I want to count, say, how many Mondays we have from 2022-02-01 to 2022-03-01. I found something like this:
=SUMPRODUCT(WEEKDAY(B4:C4)=2) - B4 and C4 are the dates
But it returns 0. I assume it only checks whether each of the two dates is that weekday. Any ideas how I can do this for a date range instead? In other words, how do I count how many Mondays there are in February?
I also found this
=NETWORKDAYS.INTL(B4;C4;"1000000")
but this returns 25
You can take advantage of the NETWORKDAYS.INTL function by using the string method to make all the days as weekend except for Monday.
The String method states:
weekends can be specified using seven 0’s and 1’s, where the first number in the set represents Monday and the last number is for Sunday. A zero means that the day is a work day, a 1 means that the day is a weekend. For example, “0000011” would mean Saturday and Sunday are weekends.
In this case since you only want to know the Mondays, the string would be "0111111" and the function would look like:
=NETWORKDAYS.INTL(StartDate,EndDate,"0111111")
I think this is right. It counts inclusively, so you would get five Mondays for a range starting on Monday 7th Feb 2022 and ending on Monday 7th March 2022, for example.
=floor((B2-(A2+7-weekday(A2,12)))/7)+1
where A2 and B2 contain the start date and end date.
Obvs nul points for me again but for the record this could be generalised if you put the day number in C2 (e.g. 1 if you want to find Sundays, 2 for Mondays):
=floor((B2-(A2+7-weekday(A2,10+C2)))/7)+1
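The arithmetic behind that formula can be sketched in Python to check it against a brute-force count. This is only an illustration: count_weekday is a made-up helper, and it uses Python's weekday numbering (Monday = 0 ... Sunday = 6) rather than the spreadsheet's. The idea is the same: jump to the first matching weekday on or after the start date, then count whole weeks up to the end date.

```python
from datetime import date, timedelta

def count_weekday(start, end, weekday):
    """Count dates in [start, end] (inclusive) that fall on `weekday`.

    weekday uses Python's convention: Monday=0 ... Sunday=6.
    """
    # Days to move forward from start to reach the first matching weekday.
    offset = (weekday - start.weekday()) % 7
    first = start + timedelta(days=offset)
    if first > end:
        return 0
    return (end - first).days // 7 + 1

MONDAY = 0
print(count_weekday(date(2022, 2, 1), date(2022, 3, 1), MONDAY))  # 4
print(count_weekday(date(2022, 2, 7), date(2022, 3, 7), MONDAY))  # 5
```

The first call is the range from the question (four Mondays: 7, 14, 21 and 28 Feb), the second is the inclusive Monday-to-Monday example mentioned above.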

Formula to calculate historical data based off a specific condition?

I am trying to observe historical trends on customer acquisitions (new and returning) and am looking to use a formula to automate it for me.
Essentially, I am looking to determine the average number of new customers we acquire on a specific day, specific week, and specific month. For example: what is the average number of customers we have acquired every Monday for the past 6 months, or what is the average number of customers we acquire in the first week of every month?
Solution:
You can use the date operators in your QUERY statement to filter by month, week, or even day of week.
Examples:
every Monday for past 6 months
=query(A1:B, "select avg(B) where datediff(todate(now()),todate(A)) < 180 and dayofweek(A) = 2", 1)
first week of every month
=query(A1:B, "select month(A),avg(B) where day(A) <= 7 group by month(A) offset 1", 1)
You would need to tweak the sample queries to match your data range and the columns you need to average and compare.
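To make the filtering logic of the first query concrete, here is a rough Python equivalent. Everything in it is illustrative: average_on_weekday is a made-up helper, the rows are generated test data rather than real figures, and unlike the query it also excludes dates in the future.

```python
from datetime import date, timedelta
from statistics import mean

def average_on_weekday(rows, weekday, today, window_days=180):
    """Average the counts of rows that fall on `weekday` (Monday=0)
    and lie within `window_days` days before `today`."""
    picked = [
        count
        for day, count in rows
        if 0 <= (today - day).days < window_days and day.weekday() == weekday
    ]
    return mean(picked) if picked else None

# Hypothetical daily data: 10 new customers on Mondays, 4 on other days.
today = date(2022, 3, 1)
rows = []
for k in range(200):
    day = today - timedelta(days=k)
    rows.append((day, 10 if day.weekday() == 0 else 4))

print(average_on_weekday(rows, 0, today))  # 10
```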
References:
QUERY()
Query Language Reference | Scalar Functions

Need to return production finish date of an order based on the weekly production Qty in excel

I have 2 tables
Table-1 = Order details
Table-2 = Production details.
Explanation of color inside table:
Yellow color = Output Qty week wise and product wise.
Green color = My expectation. Example: the second shirt order (Qty 10) has a delivery date of 14 Jan, and there are 2 more shirt orders (order num 1 & 4) with delivery dates earlier than 14 Jan. So its finish week will be 4, since orders 1 & 4 (total Qty 6) will be produced by week 2 as per Table-2 (total Qty = 7 (3+4)).
Thanks for helping me write the formula in cells E2 to E6.
Table1:
Table2:
Work out the sum of quantities for the same product and dates including this one using sumifs.
Compare it to the cumulative sum of the numbers produced for this product using match.
=ArrayFormula(match(true,sumifs(C$2:C$6,B$2:B$6,B2,D$2:D$6,"<="&D2)<=sumif(column(H:K),"<="&column(H:K),index(H$3:K$4,match(B2,G$3:G$4,0),0)),0))
I'm assuming for the time being that you couldn't have two rows with the same product and delivery date. If this could happen, you could refine the formula for the situation where (say) the first delivery could be sent in week 2 but the next delivery would be in week 3.
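Since the tables themselves are not shown here, the two steps can be illustrated with a small Python sketch on made-up data. Only the totals mentioned in the question are taken from it (orders 1 and 4 sum to 6, order 2 has quantity 10 and is due 14 Jan, weeks 1 and 2 produce 3 and 4 shirts); the individual quantities, the other dates and the output for weeks 3 and 4 are hypothetical, and finish_week is an illustrative name.

```python
from datetime import date
from itertools import accumulate

def finish_week(orders, weekly_output, order_id):
    """Week in which cumulative production first covers this order plus
    all orders of the same product due on or before its delivery date."""
    _, product, _, delivery = next(o for o in orders if o[0] == order_id)
    # Step 1: cumulative demand (what SUMIFS computes in the formula).
    demand = sum(q for _, p, q, d in orders if p == product and d <= delivery)
    # Step 2: first week whose cumulative output reaches the demand (the MATCH).
    for week, produced in enumerate(accumulate(weekly_output[product]), start=1):
        if produced >= demand:
            return week
    return None  # not finished within the planned weeks

# (order number, product, quantity, delivery date) -- partly hypothetical
orders = [
    (1, "shirt", 2, date(2022, 1, 5)),
    (2, "shirt", 10, date(2022, 1, 14)),
    (4, "shirt", 4, date(2022, 1, 10)),
]
weekly_output = {"shirt": [3, 4, 5, 6]}  # weeks 3 and 4 are hypothetical

print([finish_week(orders, weekly_output, n) for n in (1, 4, 2)])  # [1, 2, 4]
```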

Return an Integer between 1 and 52, based on a given Date

I have a client that's giving me data sets that are broken down into quarters, periods (a block of four weeks in a quarter), and weeks. I'm writing a quick reference algorithm to return the quarter, period, week given a date and year and vice versa.
Their data is always broken down into 52 weeks, where week 1 always contains Jan 1st and starts with the Monday before or at Jan 1st. This is how they handle the 365 / 7 = 52.142857 conundrum.
So, is there a gem or built in function (cweek returns 1-53), that would give me a week number based on the premise that week 1 always contains Jan 1st or do I need to design something additional?
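Way 1. Date#strftime
require 'date'
Date.new(2016,1,1).strftime("%U").to_i + 1 # week starts with Sunday
Date.new(2016,1,1).strftime("%W").to_i + 1 # week starts with Monday
# Caveat: off by one when Jan 1 is itself the first day of the week (e.g. Monday 2018-01-01 gives 2 with %W).
Way 2. Date#cweek
Date.new(2016,1,1).cweek % 53 + 1 # week starts with Monday
# Caveat: only right when Jan 1 falls in ISO week 53 (as in 2016); 2018-01-01 gives 2 and 2017-01-01 gives 53.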
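A more direct way to get "week 1 contains Jan 1 and starts on the Monday on or before it" is to compute that Monday and count whole weeks from there. The sketch below is in Python rather than Ruby and the function names are made up. It assumes that late-December days on or after the next year's week-1 Monday belong to week 1 of the next year. A year can still come out with a week 53 under this rule (2017, for instance), and how the client folds that into 52 weeks is not stated in the question, so it is left as is.

```python
from datetime import date, timedelta

def week1_start(year):
    """Monday on or before January 1st of the given year."""
    jan1 = date(year, 1, 1)
    return jan1 - timedelta(days=jan1.weekday())  # weekday(): Monday=0

def week_number(d):
    """Week number where week 1 is the Monday-based week containing Jan 1."""
    if d >= week1_start(d.year + 1):
        return 1  # assumed: these days already belong to next year's week 1
    return (d - week1_start(d.year)).days // 7 + 1

print(week_number(date(2016, 1, 1)))    # 1
print(week_number(date(2016, 1, 4)))    # 2
print(week_number(date(2018, 1, 1)))    # 1
print(week_number(date(2016, 12, 31)))  # 1
```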

Store the day of the week and time?

I have a two-part question about storing days of the week and time in a database. I'm using Rails 4.0, Ruby 2.0.0, and Postgres.
I have certain events, and those events have a schedule. For the event "Skydiving", for example, I might have Tuesday and Wednesday and 3 pm.
Is there a way for me to store the record for Tuesday and Wednesday in one row or should I have two records?
What is the best way to store the day and time? Is there a way to store day of week and time (not datetime) or should these be separate columns? If they should be separate, how would I store the day of the week? I was thinking of storing them as integer values, 0 for Sunday, 1 for Monday, since that's how the wday method for the Time class does it.
Any suggestions would be super helpful.
Is there a way for me to store the record for Tuesday and Wednesday in one row or should I have two records?
There are several ways to store multiple time ranges in a single row. @bma already provided a couple of them. That might be useful to save disk space with very simple time patterns. The clean, flexible and "normalized" approach is to store one row per time range.
What is the best way to store the day and time?
Use a timestamp (or timestamptz if multiple time zones may be involved). Pick an arbitrary "staging" week and just ignore the date part while using the day and time aspect of the timestamp. Simplest and fastest in my experience, and all date and time related sanity-checks are built-in automatically. I use a range starting with 1996-01-01 00:00 for several similar applications for two reasons:
The first 7 days of the month coincide with the day of the week (Monday = 1 ... Sunday = 7).
It is a leap year at the same time (providing Feb. 29 for yearly patterns).
Range type
Since you are actually dealing with time ranges (not just "day and time") I suggest using the built-in range type tsrange (or tstzrange). A major advantage: you can use the arsenal of built-in Range Functions and Operators. Requires Postgres 9.2 or later.
For instance, you can have an exclusion constraint building on that (implemented internally by way of a fully functional GiST index that may provide additional benefit), to rule out overlapping time ranges. Consider this related answer for details:
Preventing adjacent/overlapping entries with EXCLUDE in PostgreSQL
For this particular exclusion constraint (no overlapping ranges per event), you need to include the integer column event_id in the constraint, so you need to install the additional module btree_gist. Install once per database with:
CREATE EXTENSION btree_gist; -- once per db
Or you can have one simple CHECK constraint to restrict the allowed time period using the "range is contained by" operator <@.
Could look like this:
CREATE TABLE event (event_id serial PRIMARY KEY, ...);
CREATE TABLE schedule (
event_id integer NOT NULL REFERENCES event(event_id)
ON DELETE CASCADE ON UPDATE CASCADE
, t_range tsrange
, PRIMARY KEY (event_id, t_range)
, CHECK (t_range <@ '[1996-01-01 00:00, 1996-01-09 00:00)') -- restrict period
, EXCLUDE USING gist (event_id WITH =, t_range WITH &&) -- disallow overlap
);
For a weekly schedule use the first seven days, Mon-Sun, or whatever suits you. Monthly or yearly schedules in a similar fashion.
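To make the "staging week" idea concrete, here is a small Python sketch (Python only for illustration; the real work would happen in Postgres). The names STAGING_MONDAY, to_staging and overlaps are made up, the one-hour event length is an assumption, and ranges are treated as half-open [start, end), which is what the overlap test below mirrors from the && operator.

```python
from datetime import datetime, time, timedelta

STAGING_MONDAY = datetime(1996, 1, 1)  # 1996-01-01 was a Monday

def to_staging(isodow, t):
    """Map an ISO day of week (Mon=1 ... Sun=7) and a time of day
    to a timestamp inside the staging week."""
    day = STAGING_MONDAY.date() + timedelta(days=isodow - 1)
    return datetime.combine(day, t)

def overlaps(a, b):
    """True if two half-open ranges [start, end) share any instant."""
    return a[0] < b[1] and b[0] < a[1]

# "Skydiving" on Tuesday and Wednesday at 3 pm, assumed to last one hour.
tuesday = (to_staging(2, time(15)), to_staging(2, time(16)))
wednesday = (to_staging(3, time(15)), to_staging(3, time(16)))
clash = (to_staging(2, time(15, 30)), to_staging(2, time(17)))

print(tuesday[0])                    # 1996-01-02 15:00:00
print(overlaps(tuesday, wednesday))  # False
print(overlaps(tuesday, clash))      # True
```

Because the day of the month equals the ISO day of the week in that first week, the stored timestamps stay easy to read by eye.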
How to extract day of week, time, etc?
@CDub provided a module to deal with it on the Ruby end. I can't comment on that, but you can do everything in Postgres as well, with impeccable performance.
SELECT ts::time AS t_time -- get the time (practically no cost)
SELECT EXTRACT(DOW FROM ts) AS dow -- get day of week (very cheap)
Or in similar fashion for range types:
SELECT EXTRACT(DOW FROM lower(t_range)) AS dow_from -- day of week lower bound
, EXTRACT(DOW FROM upper(t_range)) AS dow_to -- same for upper
, lower(t_range)::time AS time_from -- start time
, upper(t_range)::time AS time_to -- end time
FROM schedule;
ISODOW instead of DOW for EXTRACT() returns 7 instead of 0 for Sundays. There is a long list of what you can extract.
This related answer demonstrates how to use range type operator to compute a total duration for time ranges (last chapter):
Calculate working hours between 2 dates in PostgreSQL
Check out the ice_cube gem.
It can create a schedule object for you which you can persist to your database. You need not create two separate records. For the second part, you can create a schedule based on any rule and you need not worry about how it will be saved in the database. You can use the methods provided by the gem to get whatever information you want from the persisted schedule object.
Depending how complex your scheduling needs are, you might want to have a look at RFC 5545, the iCalendar scheduling data format, for ideas on how to store the data.
If your needs are pretty simple, then that is probably overkill. Postgresql has many functions to convert date and time to whatever format you need.
For a simple way to store relative dates and times, you could store the day of week as an integer as you suggested, and the time as a TIME datatype. If you can have multiple days of the week that are valid, you might want to use an ARRAY.
Eg.
ARRAY[2,3]::INTEGER[] = Tues, Wed as Day of Week
'15:00:00'::TIME = 3pm
[EDIT: Add some simple examples]
/* Custom range type over the time type */
CREATE TYPE timerange AS RANGE (subtype = time);
--drop table if exists schedule;
create table schedule (
event_id integer not null, /* should be an FK to "events" table */
day_of_week integer[],
time_of_day time,
time_range timerange,
recurring text CHECK (recurring IN ('DAILY','WEEKLY','MONTHLY','YEARLY'))
);
insert into schedule (event_id, day_of_week, time_of_day, time_range, recurring)
values
(1, ARRAY[1,2,3,4,5]::INTEGER[], '15:00:00'::TIME, NULL, 'WEEKLY'),
(2, ARRAY[6,0]::INTEGER[], NULL, '(08:00:00,17:00:00]'::timerange, 'WEEKLY');
select * from schedule;
event_id | day_of_week | time_of_day | time_range | recurring
----------+-------------+-------------+---------------------+-----------
1 | {1,2,3,4,5} | 15:00:00 | | WEEKLY
2 | {6,0} | | (08:00:00,17:00:00] | WEEKLY
The first entry could be read as: the event is valid at 3pm Mon - Fri, with this schedule occurring every week.
The second entry could be read as: the event is valid Saturday and Sunday between 8am and 5pm, occurring every week.
The custom range type "timerange" is used to denote the lower and upper boundaries of your time range.
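The leading '(' means "exclusive" and the trailing ']' means "inclusive", or in other words "later than 8am and up to and including 5pm". For "greater than or equal to 8am and less than 5pm" the literal would be '[08:00:00,17:00:00)'.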
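As a rough illustration of how an application could read such rows, here is a Python sketch that checks whether a given moment matches a schedule entry. The helper names pg_dow and is_valid and the dictionary layout are made up for the example. Days are numbered as in Postgres DOW (Sunday = 0), and the range is treated as written in the insert, i.e. '(' exclusive and ']' inclusive.

```python
from datetime import datetime, time

def pg_dow(dt):
    """Day of week as Postgres EXTRACT(DOW ...) numbers it: Sunday=0 ... Saturday=6."""
    return dt.isoweekday() % 7

# The two rows from the insert above, as plain dictionaries.
schedule = [
    {"event_id": 1, "day_of_week": [1, 2, 3, 4, 5],
     "time_of_day": time(15), "time_range": None},
    {"event_id": 2, "day_of_week": [6, 0],
     "time_of_day": None, "time_range": (time(8), time(17))},
]

def is_valid(row, dt):
    """True if dt falls on one of the row's days and matches its time or range."""
    if pg_dow(dt) not in row["day_of_week"]:
        return False
    t = dt.time()
    if row["time_of_day"] is not None:
        return t == row["time_of_day"]
    lower, upper = row["time_range"]
    return lower < t <= upper  # '(' excludes the lower bound, ']' includes the upper

print(is_valid(schedule[0], datetime(2013, 11, 12, 15, 0)))  # True  (a Tuesday)
print(is_valid(schedule[1], datetime(2013, 11, 10, 8, 0)))   # False (Sunday, exactly 8am)
print(is_valid(schedule[1], datetime(2013, 11, 10, 17, 0)))  # True  (Sunday, exactly 5pm)
```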
Why not just store the datestamp then use the built in functionality for Date to get the day of the week?
2.0.0p247 :139 > Date.today
=> Sun, 10 Nov 2013
2.0.0p247 :140 > Date.today.strftime("%A")
=> "Sunday"
strftime sounds like it can do everything for you. Here are the specific docs for it.
Specifically for what you're talking about, it sounds like you'd need an Event table that has_many :schedules, where a Schedule would have a start_date timestamp...

Resources