I'm using SPSS and have a dataset made up of individuals' responses to a survey question. The data are longitudinal, so each subject has taken the survey at least twice and some as many as four or five times.
My variables are ID (scale), date of survey completion (date - dd-mmm-yyyy), and response to survey question (scale).
The dataset is sorted by ID, then date (ascending). Each date corresponds to survey time 1, time 2, etc. What I would like to do is compute a new variable, Time, that indexes the survey completion dates for a particular participant. I would then like to use that variable to do a long-to-wide restructuring of the dataset.
So, I'd like to accomplish the following and am not sure how to go about doing it:
1) I have something like this:
ID Date Assessment_Answer
----------------------------------
1 01-Jan-2009 4
1 01-Jan-2010 1
1 01-Jan-2011 5
2 15-Oct-2012 6
2 15-Oct-2013 0
2) Want to compute another variable that would give me this:
ID Date Assessment_Answer Time
-----------------------------------------
1 01-Jan-2009 4 Time1
1 01-Jan-2010 1 Time2
1 01-Jan-2011 5 Time3
2 15-Oct-2012 6 Time1
2 15-Oct-2013 0 Time2
3) And restructure so that I have something like this:
ID Time1 Time2 Time3 Time4
--------------------------
1 4 1 5
2 6 0
You can use sequential case processing to create a variable that is a counter within each ID. So for example:
*Making fake data.
DATA LIST FREE / ID (F1.0) Date (DATE11) Assessment_Answer (F1.0).
BEGIN DATA
1 01-Jan-2009 4
1 01-Jan-2010 1
1 01-Jan-2011 5
2 15-Oct-2012 6
2 15-Oct-2013 0
END DATA.
*Making counter within ID.
SORT CASES BY Id Date.
DO IF ($casenum = 1) OR (Id <> LAG(ID)).
COMPUTE Time = 1.
ELSE.
COMPUTE Time = LAG(Time) + 1.
END IF.
FORMATS Time (F2.0).
EXECUTE.
Now you can use CASESTOVARS to reshape the data as you requested. The index values become variable suffixes, so you end up with Assessment_Answer.1, Assessment_Answer.2, and so on.
CASESTOVARS
/ID = Id
/INDEX = Time
/DROP Date.
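For intuition, here is the same counter-then-pivot logic sketched in Ruby (an illustration only, not part of the SPSS solution), using the sample rows from above:

require "date"

# [ID, date, answer] rows, as in the question.
rows = [
  [1, Date.new(2009, 1, 1), 4],
  [1, Date.new(2010, 1, 1), 1],
  [1, Date.new(2011, 1, 1), 5],
  [2, Date.new(2012, 10, 15), 6],
  [2, Date.new(2013, 10, 15), 0],
]

# Sort by ID then date, number each ID's rows 1..n (the Time counter),
# then pivot to one {time => answer} hash per ID (the CASESTOVARS step).
wide = rows.sort_by { |id, date, _| [id, date] }
           .group_by { |id, _, _| id }
           .transform_values do |rs|
             rs.each_with_index.map { |(_, _, answer), i| [i + 1, answer] }.to_h
           end
# => {1=>{1=>4, 2=>1, 3=>5}, 2=>{1=>6, 2=>0}}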
I'd like to query InfluxDB using InfluxQL and exclude any rows from 0 to 5 minutes after the hour.
Seems pretty easy to do using the time field (the number of nanoseconds since the epoch) and a little modulus math. But the problem is that any WHERE clause with even the simplest calculation on time returns zero records.
How can I get what I need if I can't perform calculations on time? How can I exclude any rows from 0 to 5 minutes after the hour?
# Returns 10 records
SELECT * FROM "telegraf"."autogen"."processes" WHERE time > 0 LIMIT 10
# Returns 0 records
SELECT * FROM "telegraf"."autogen"."processes" WHERE (time/1) > 0 LIMIT 10
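One possible workaround (a sketch under assumptions, not a confirmed InfluxQL feature): since the WHERE clause rejects calculations on time, apply the modulus client-side after fetching the rows. A minimal Ruby sketch over already-parsed result rows (the rows array here is hypothetical):

require "time"

# Hypothetical rows parsed from the query above; only "time" matters here.
rows = [
  { "time" => "2019-03-01T10:03:00Z", "cpu" => 1.0 },
  { "time" => "2019-03-01T10:07:00Z", "cpu" => 2.0 },
]

# Seconds past the hour = epoch time modulo 3600; drop the first 5 minutes (300 s).
kept = rows.reject { |row| Time.parse(row["time"]).to_i % 3600 < 300 }
# => keeps only the 10:07 row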
I'm trying to build a sheet where I can see how much I have to pay each month.
Let's say I have the following table
Current installment (CI)  Total installments (TI)  Installment amount (IA)
--------------------------------------------------------------------------
1                         3                        $100
1                         1                        $200
2                         3                        $150
1                         3                        $75
2                         4                        $150
1                         1                        $50
So, for the first month I would sum the installment amounts of the rows where TI - CI >= 1. For the following month I would do the same but with TI - CI >= 2, and so on.
And the result would be something like this
1st month debt   $475 (the result of 100+150+75+150)
2nd month debt   $325 (the result of 100+75+150)
3rd month debt   $100
Is this possible at all?
try:
=IFNA(SUM(FILTER(C$2:C, (B$2:B-A$2:A)>=ROW(A1))))
and drag down. ROW(A1) evaluates to 1 in the first cell and increments as you drag, so each cell sums the amounts of the rows with TI - CI >= that month number; IFNA blanks the months where FILTER finds no matches.
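A minimal Ruby mirror of the same TI - CI >= month logic, with the table's values hard-coded (an illustration only):

# [current installment, total installments, amount] per row.
rows = [
  [1, 3, 100], [1, 1, 200], [2, 3, 150],
  [1, 3, 75],  [2, 4, 150], [1, 1, 50],
]

(1..2).each do |month|
  # Keep rows with at least `month` installments still to come, then sum the amounts.
  debt = rows.select { |ci, ti, _| ti - ci >= month }
             .sum { |_, _, ia| ia }
  puts "Month #{month}: $#{debt}"
end
# Month 1: $475
# Month 2: $325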
I have 2 tables
Table-1 = Order details
Table-2 = Production details.
Explanation of the colors inside the tables:
Yellow = output qty, week-wise and product-wise.
Green = my expectation. For example, the second order of shirts (qty 10) has a delivery date of 14 Jan, and there are 2 more shirt orders (order nums 1 & 4) with delivery dates earlier than 14 Jan. So the finish week will be 4, because orders 1 & 4 (total qty 6) will have been produced by week 2 as per Table-2 (cumulative output 7 = 3 + 4).
Please help me write the formula for cells E2 to E6.
Table1:
Table2:
Work out the sum of quantities for the same product with delivery dates up to and including this one, using SUMIFS.
Compare it to the cumulative production for that product, using MATCH to find the first week whose running total covers it.
=ArrayFormula(match(true,sumifs(C$2:C$6,B$2:B$6,B2,D$2:D$6,"<="&D2)<=sumif(column(H:K),"<="&column(H:K),index(H$3:K$4,match(B2,G$3:G$4,0),0)),0))
I'm assuming for the time being that you can't have two rows with the same product and delivery date. If that could happen, you would need to refine the formula for the situation where (say) the first delivery could be sent in week 2 but the next delivery would be in week 3.
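The same comparison sketched in Ruby, with hypothetical data reconstructed from the shirt example in the question (the real tables were posted as images, so these numbers are assumptions):

require "date"

# [order num, product, qty, delivery date] -- hypothetical values.
orders = [
  [1, "shirt", 3,  Date.new(2023, 1, 5)],
  [2, "shirt", 10, Date.new(2023, 1, 14)],
  [4, "shirt", 3,  Date.new(2023, 1, 10)],
]
# Production output per product for weeks 1..4 -- hypothetical values.
weekly_output = { "shirt" => [3, 4, 5, 6] }

finish_weeks = orders.map do |num, product, _qty, delivery|
  # SUMIFS step: total qty ordered for this product up to and including this delivery date.
  demand = orders.select { |_, p, _, d| p == product && d <= delivery }
                 .sum { |_, _, q, _| q }
  # MATCH step: first week whose cumulative production covers that demand.
  cum = 0
  week = weekly_output[product].index { |out| (cum += out) >= demand }
  [num, week + 1]
end
# => [[1, 1], [2, 4], [4, 2]]  (order 2 finishes in week 4, as described)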
I have 2 models, Member and Slot,
and there is a join table between them named MemberSlot for a has_many :through relationship.
A user comes along and registers for a slot.
Slot has a date field named slot_date.
For the metrics on the index page I have to show how many times each member registered for a slot, broken down by day of week using the slot_date field.
Like
User Monday Tuesday Wednesday Thursday Saturday Sunday Total
---------------------------------------------------------------
Faisal 0 1 1 3 0 1 6
User2 1 0 0 0 1 2 4
User3 0 0 0 0 0 0 0
and then allow the user to sort this table by a column such as Monday or Total.
I tried something like this:
Slot.joins(member_slots: [:member]).group("members.id").group_by_day_of_week(:slot_date).count
The issue is that I can't sort the result by a specific day, sorting by total isn't possible at all, and I then have to fetch each member in a loop to display the rows.
So it's not workable.
Can someone suggest a proper way to handle this?
Waiting for your response.
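One way to make this sortable in a single query (a sketch, not a tested answer) is to pivot the counts in SQL with conditional aggregation instead of group_by_day_of_week. This assumes PostgreSQL, the associations above, and a members.name column (an assumption):

# EXTRACT(DOW ...) is 0=Sunday .. 6=Saturday in PostgreSQL.
day_columns = (0..6).map do |d|
  "COUNT(slots.id) FILTER (WHERE EXTRACT(DOW FROM slots.slot_date) = #{d}) AS dow_#{d}"
end

rows = Member
  .left_joins(member_slots: :slot)   # left join keeps members with zero slots
  .group("members.id")
  .select("members.*", *day_columns, "COUNT(slots.id) AS total")
  .order("total DESC")               # or e.g. "dow_1 DESC" to sort by Mondays

rows.each { |m| puts [m.name, m.dow_1, m.total].join("\t") }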
Hello guys, I'm working on an interesting real-time application, which is as follows: I have a Meter model and a MeterInfo model.
class Meter
  has_many :meter_infos
  # field: id
end

class MeterInfo
  belongs_to :meter
  # fields: meter_id, voltage
end
Every two minutes a new row is saved to the meter_infos table, so you can imagine there is a huge data set in there.
What I want to do is fetch exactly one voltage record per meter, for 10 meters at a time, at 10-minute intervals within one day.
So the result would be something like this
id created_at meter_id voltage
2001 2017-10-19 15:40:00 2 100
2001 2017-10-19 15:45:00 1 100
2001 2017-10-19 15:39:00 3 100
2001 2017-10-19 15:48:00 4 100
2001 2017-10-19 15:38:00 5 100
2001 2017-10-19 15:42:00 6 100
...
...
I've tried several queries, but because it takes too much time to find the records, the request times out. Here is what I have tried:
(('2017-07-02 00:00:00').to_datetime.to_i ..
('2017-07-02 23:59:59').to_datetime.to_i).step(10.minutes) do |date|
query = "SELECT created_at, meter_id, voltage
FROM meter_infos
WHERE created_at between '#{Time.at(date).utc}' and
'#{Time.at(date).utc + 10.minutes}'
AND meter_id in (1,2,3,4,5)
ORDER BY id desc limit 1"
voltages = ActiveRecord::Base.connection.execute(query)
end
This times out even in the development environment.
Then I tried to use PostgreSQL's generate_series like below:
query= "SELECT meter_id,voltage, count(id) as ids
, GENERATE_SERIES( timestamp without time zone '2017-10-19',
timestamp without time zone '2017-10-19',
'10 min') as time_range
from meter_infos
where meter_infos.created_at between '2017-10-19 00:00:01'::timestamp and '2017-10-19 23:59:59'::timestamp
and meter_infos.meter_id in (1,2,3,4,5)
GROUP BY meter_id, voltage
ORDER BY meter_id ASC limit 1"
sbps_plot = ActiveRecord::Base.connection.execute(query)
This is faster but gives me wrong data.
I am using Ruby on Rails and PostgreSQL.
Can somebody help me write a faster query to pull this data over time, or suggest a procedure for handling time-series data analysis?
Thanks in advance.
You have records every two minutes, but you want to get a sample record from each ten-minute interval. Here's my suggested solution:
You can take the modulus of the epoch time of the created_at timestamp with 600 (ten minutes in seconds), then compare this against some tolerance value (e.g. less than 119 seconds) in case the timestamps of your records aren't aligned to perfect ten-minute intervals. Think of it as retrieving the first record with a created_at inside a two-minute window following each ten-minute mark of the day.
For example,
MeterInfo
  .where(
    meter_id: [1, 2, 3, 4, 5],
    created_at: your_date.beginning_of_day..your_date.end_of_day
  )
  # Seconds past the last 10-minute mark = epoch time % 600; keep only rows
  # that fall inside the two-minute tolerance window after each mark.
  .where("(cast(extract(epoch from created_at) as integer) % 600) < 119")
Give that a try and see if it works for you.
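And if the tolerance window ever catches two readings (say a meter reported twice within those 119 seconds), here is a follow-up sketch to keep just the first record per meter per ten-minute bucket (your_date is the same placeholder as above):

samples = MeterInfo
  .where(
    meter_id: [1, 2, 3, 4, 5],
    created_at: your_date.beginning_of_day..your_date.end_of_day
  )
  .where("(cast(extract(epoch from created_at) as integer) % 600) < 119")
  .order(:created_at)
  # Bucket by meter and 10-minute interval, keeping the earliest row in each.
  .group_by { |mi| [mi.meter_id, mi.created_at.to_i / 600] }
  .transform_values(&:first)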