How to select SpatRaster layers from their names? - mean

I have a SpatRaster (150 x 150 x 1377) that shows the temporal evolution of precipitation. Each layer is a given hour in a 2-month interval, but some hours are missing, so the dataset isn't continuous. The layer names are strings of the form "YYYYMMDDhhmm".
I need the mean value for every three-hour interval, whether the interval is complete or has missing data. For complete intervals I want to average the three layers; for intervals with missing data I want to average the two available layers or, if two are missing, use the single remaining layer as the average.
How can I use the layer names to decide how to act?
I've already tried the code below, but it averages three consecutive layers by index, not by hour. How can I convert the names to date-time values (with the tidyverse) so that I can use rollapply() to check whether the layer two steps back has the date-time I expect? Is there another way to check this?
HSAF <- rast(paste0(resfolder, c(
  "HSAF_final1_5.tif", "HSAF_final6_10.tif", "HSAF_final11_15.tif",
  "HSAF_final16_20.tif", "HSAF_final21_25.tif", "HSAF_final26_30.tif",
  "HSAF_final31_N04.tif", "HSAF_finalN05_N08.tif", "HSAF_finalN09_N13.tif",
  "HSAF_finalN14_N18.tif", "HSAF_finalN19_N23.tif", "HSAF_finalN24_N28.tif",
  "HSAF_finalN29_N30.tif"
)))
index <- names(HSAF)
j <- 2
# First group of three layers (i = 1) initialises the output raster
for (i in seq(1, 3, by = 3)) {
  third_el <- HSAF[index[i + j]]
  second_el <- HSAF[index[i + j - 1]]
  first_el <- HSAF[index[i + j - 2]]
  newraster <- c(first_el, second_el, third_el)
  newraster <- mean(newraster, filename = paste0(tempfile(), ".tif"))
  names(newraster) <- paste0(index[i + j - 2], index[i + j - 1], index[i + j])
}
# Remaining groups of three consecutive layers are appended one by one
for (i in seq(4, 1374, by = 3)) {
  third_el <- HSAF[index[i + j]]
  second_el <- HSAF[index[i + j - 1]]
  first_el <- HSAF[index[i + j - 2]]
  subraster <- c(first_el, second_el, third_el)
  subraster <- mean(subraster, filename = paste0(tempfile(), ".tif"))
  names(subraster) <- paste0(index[i + j - 2], index[i + j - 1], index[i + j])
  add(newraster) <- subraster
}
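One possible way to drive the selection from the names rather than from the index is to parse each name as a timestamp and group the names by the three-hour window they fall in. The sketch below is in Python and only illustrates that grouping logic; it is not terra code. It assumes fixed windows starting at 00:00, 03:00, 06:00 and so on, and the function names and sample layer names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def three_hour_bin(name):
    """Map a 'YYYYMMDDhhmm' layer name to the start of its 3-hour window."""
    t = datetime.strptime(name, "%Y%m%d%H%M")
    return t.replace(hour=t.hour - t.hour % 3, minute=0)


def group_layer_names(names):
    """Group layer names by 3-hour window; missing hours simply do not appear."""
    groups = defaultdict(list)
    for name in names:
        groups[three_hour_bin(name)].append(name)
    return dict(sorted(groups.items()))


# Hypothetical names: 02:00 is missing from the first window,
# and only 05:00 is present in the second window.
names = ["202301010000", "202301010100", "202301010500"]
groups = group_layer_names(names)
for start, members in groups.items():
    print(start.strftime("%Y%m%d%H%M"), members)
```

Each group then holds one, two or three names, so averaging whatever is in the group covers all three cases in the question. In R, a grouping vector built the same way could probably be passed as the index argument of terra's tapp() with fun = mean, though that is untested here.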

Related

Find nodes with 3+ occurrences in a 10 minute period

I have a list of nodes with a startTime property. I need to determine if the list contains a clump of 3 or more nodes with a startTime within 10 minutes of each other. I don't need to get the nodes that are in the clump, I just need a boolean indicating the existence of such a clump.
I am at a loss; everything I have tried fails so badly that it is not worth posting.
I feel that I am missing something easy.
This should be doable.
First you'll need to match the nodes, order them by startTime, and collect the start times into a list.
From there, you'll need to get the relevant pairings (each entry, and the entry 2 indices ahead for the end of the duration) that will comprise a group of 3, then see if the start times of that pair occur within 10 minutes of each other.
Assuming for the sake of example :Event nodes with a startTime property, you might use this query to get the results you want:
MATCH (e:Event)
WITH e
ORDER BY e.startTime ASC
WITH collect(e.startTime) AS times
WITH times, range(0, size(times) - 3) as indices
RETURN any(index in indices WHERE times[index + 2] <= times[index] + duration({minutes:10}))
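The same sliding-pair check can be sketched outside the database. This is a minimal Python illustration of the logic, assuming the start times are available as datetime values; the function name has_clump and its parameters are hypothetical.

```python
from datetime import datetime, timedelta


def has_clump(start_times, size=3, window=timedelta(minutes=10)):
    """Return True if any `size` start times fall within `window` of each other."""
    times = sorted(start_times)
    # After sorting, a clump exists exactly when some entry and the entry
    # (size - 1) positions ahead of it are no more than `window` apart.
    return any(
        times[i + size - 1] - times[i] <= window
        for i in range(len(times) - size + 1)
    )


day = datetime(2024, 1, 1)
clumped = [day.replace(hour=12, minute=m) for m in (0, 4, 9)] + [day.replace(hour=13)]
spread = [day.replace(hour=12, minute=m) for m in (0, 6, 12)]
print(has_clump(clumped), has_clump(spread))
```

Sorting first is what makes it enough to compare each entry only with the one two positions ahead.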

Using SPSS IF syntax to create a new variable from two categorical variables

I want to create a new variable from two other variables.
The first is SEX (0=male, 1=female; there were no other genders selected by respondents though we had planned for that possibility) whereas the second is RACE9 (0=white, 1=racialized). The new variable is named SEXRACE9.
While the following code produces counts for white males, racialized males, white females and racialized females, the code fails to produce a count for total male or total female.
* Create combined sex and race categorical variable.
IF (sex=0 AND (race9=0 OR race9=1)) sexrace9=1. /*Total males - glitchy.
IF sex=0 AND race9=1 sexrace9=2. /*White males.
IF sex=0 AND race9=0 sexrace9=3. /*Racialized males.
IF (sex=1 AND (race9=0 OR race9=1)) sexrace9=4. /*Total females - glitchy.
IF sex=1 AND race9=1 sexrace9=5. /*White females.
IF sex=1 AND race9=0 sexrace9=6. /*Racialized females.
EXECUTE.
Am I missing something? Alternately, does anyone have a solution for how to insert a count for total males and total females using COMPUTE? Any help is greatly appreciated.
You are missing two key aspects:
Your sexrace9 variable is intended to define mutually exclusive groups (i.e., each case belongs to exactly one group, and no case can qualify for more than one group).
SPSS syntax runs sequentially, line by line, so a syntax line can overwrite the results of previous lines.
More to the point:
IF (sex=0 AND (race9=0 OR race9=1)) sexrace9=1.
is being partially overwritten by
IF sex=0 AND race9=1 sexrace9=2. /*White males.
because white males qualify for both sexrace9=1 and sexrace9=2, and then by the line
IF sex=0 AND race9=0 sexrace9=3. /*Racialized males.
because racialized males qualify for both sexrace9=1 and sexrace9=3.
So I am guessing that no cases have sexrace9=1 after running your syntax :)
Exactly the same logic applies to females.
I am not sure what you want to achieve with your Total males and Total females syntax lines. You already have the sex variable to differentiate between males and females.
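The overwriting can be reproduced in a few lines. The sketch below is Python, not SPSS; it mimics the six IF statements being applied in order to each case, using four hypothetical cases that cover every sex/race9 combination.

```python
from collections import Counter


def assign_sexrace9(sex, race9):
    """Apply the six IF rules in order; later matches overwrite earlier ones."""
    code = None
    if sex == 0 and race9 in (0, 1):
        code = 1  # "total males", always overwritten by one of the next two rules
    if sex == 0 and race9 == 1:
        code = 2
    if sex == 0 and race9 == 0:
        code = 3
    if sex == 1 and race9 in (0, 1):
        code = 4  # "total females", always overwritten by one of the next two rules
    if sex == 1 and race9 == 1:
        code = 5
    if sex == 1 and race9 == 0:
        code = 6
    return code


# Hypothetical cases as (sex, race9) pairs, one per combination
cases = [(0, 0), (0, 1), (1, 0), (1, 1)]
codes = [assign_sexrace9(sex, race9) for sex, race9 in cases]
print(codes)

# The totals are already available from the sex variable alone
totals_by_sex = Counter(sex for sex, _ in cases)
print(totals_by_sex)
```

No case ends up with code 1 or 4, which matches the behaviour described above, while a plain count on sex gives the totals.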

SUM(LAST()) on GROUP BY

I have a series, disk, that contains a path (/mnt/disk1, /mnt/disk2, etc.) and the total space of a disk. It also includes free and used values. These values are updated at a specified interval. What I would like to do is query for the sum of the last() total of each path. I would also like to do the same for free and for used, to get an aggregate of the total size, free space, and used space of all of the disks on my server.
I have a query here that will get me the last(total) of all the disks, grouped by its path (for distinction):
select last(total) as total from disk where path =~ /(mnt\/disk).*/ group by path
Currently, this returns 5 series, each containing 1 row (the latest) and the value of its total. I then want to take the sum of those series, but I cannot just wrap the last(total) into a sum() function call. Is there a way to do this that I am missing?
Carrying on from my comment above about nested functions.
Building a toy example:
CREATE DATABASE FOO
USE FOO
Assuming your data is updated less frequently than[1] once a minute:
CREATE CONTINUOUS QUERY disk_sum_total ON FOO
BEGIN
SELECT sum("total") AS "total_1m" INTO disk_1m_total FROM "disk"
GROUP BY time(1m)
END
Then push some values in:
INSERT disk,path="/mnt/disk1" total=30
INSERT disk,path="/mnt/disk2" total=32
INSERT disk,path="/mnt/disk3" total=33
And wait more than a minute. Then:
INSERT disk,path="/mnt/disk1" total=41
INSERT disk,path="/mnt/disk2" total=42
INSERT disk,path="/mnt/disk3" total=43
And wait a minute+ again. Then:
SELECT * FROM disk_1m_total
name: disk_1m_total
-------------------
time total_1m
1476015300000000000 95
1476015420000000000 126
The two values are 30+32+33=95 and 41+42+43=126.
From there, it's trivial to query:
SELECT last(total_1m) FROM disk_1m_total
name: disk_1m_total
-------------------
time last
1476015420000000000 126
Hope that helps.
[1] Picking a CQ interval shorter than the update interval prevents minor timing jitter from causing data to be accidentally summed twice for a given group. There might be some "zero update" intervals, but no "double counting" intervals. I typically run the query twice as fast as the updates. If the CQ sees no data for a window, no CQ result is written for that window, so last() will still give the correct answer. For example, I left the CQ running overnight and pushed no new data in: last(total_1m) gives the same answer, not zero for "no new data".
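For reference, the quantity the question asks for (the sum, across paths, of each path's most recent value) can be sketched outside InfluxDB. This is a minimal Python illustration using the toy values from above; the integer timestamps and the function name sum_of_last are hypothetical.

```python
def sum_of_last(points, field):
    """Sum the most recent value of `field` for each distinct path."""
    latest = {}
    # Iterating in time order means the last write per path wins
    for point in sorted(points, key=lambda p: p["time"]):
        latest[point["path"]] = point[field]
    return sum(latest.values())


# Toy data mirroring the two rounds of INSERTs (timestamps are made up)
points = [
    {"time": 1, "path": "/mnt/disk1", "total": 30},
    {"time": 1, "path": "/mnt/disk2", "total": 32},
    {"time": 1, "path": "/mnt/disk3", "total": 33},
    {"time": 2, "path": "/mnt/disk1", "total": 41},
    {"time": 2, "path": "/mnt/disk2", "total": 42},
    {"time": 2, "path": "/mnt/disk3", "total": 43},
]
print(sum_of_last(points, "total"))
```

Unlike the continuous query, this takes the latest value per path directly, so it does not depend on all paths being updated within the same window.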

Index Match? Or some other function?

‘Student Needs’! Columns I through O contain information on when each student attends an intervention class. Intervention classes take place during the second half of the classes (Science or social studies) or during the second half of Co-taught classes (math or ELA). Science and social studies interventions are done on either Monday/Wednesday or Tuesday/Friday (Thursdays have a special schedule that we do not need to consider). Math and ELA interventions occur on all four days.
In ‘Student Master’!, each student’s schedule is listed for both MW and TF. In Columns E, G, K, and M, I would like to populate any of the interventions that are listed in the ‘Student Needs’! sheet. For instance, Lindsey Lukowski has Social Skills on MW2 (Mondays and Wednesdays 2nd hour). So in cell ‘Student Master’! G31 should return “Social Skills”.
William Watters is getting Read Naturally and Reading Comp during his 5th Hour Co-Taught ELA. So ‘Student Master’! K51 and K52 should both return Read Naturally & Reading Comp (in the same cell).
Here is the workbook:
https://docs.google.com/spreadsheets/d/1aW7ExATzMn9Rf8IFLI4v-CQiqsXnxyDm8PxqMW999bY/edit?usp=sharing
Here is a complex formula that seems to do what you want. I have tested it in a copy of your sheet.
Just change E$2 for different columns.
=IFERROR(INDIRECT("'Student Needs'!"&CHAR(72+IFERROR(MATCH("HR "&E$2,ARRAYFORMULA(REGEXEXTRACT(INDIRECT("'Student Needs'!I"&REGEXEXTRACT($A3,"[0-9]+")+5&":N"&REGEXEXTRACT($A3,"[0-9]+")+5),"[A-Z ]+[0-9]")),0),MATCH($C3,ARRAYFORMULA(REGEXEXTRACT(INDIRECT("'Student Needs'!I"&REGEXEXTRACT($A3,"[0-9]+")+5&":N"&REGEXEXTRACT($A3,"[0-9]+")+5),"[A-Z]+")),0)))&5))
Also I am not sure where "6- Science" should go? Is this also HR 6?
In order for it to work with actual
=IFERROR(INDIRECT("'Student Needs'!"&CHAR(72+IFERROR(MATCH("HR "&E$2,ARRAYFORMULA(REGEXEXTRACT(INDIRECT("'Student Needs'!I"&ROW(VLOOKUP($A3,'Student Needs'!$A$1:$B,2,false))&":N"&ROW(VLOOKUP($A3,'Student Needs'!$A$1:$B,2,false))),"[A-Z ]+[0-9]")),0),MATCH($C3,ARRAYFORMULA(REGEXEXTRACT(INDIRECT("'Student Needs'!I"&ROW(VLOOKUP($A3,'Student Needs'!$A$1:$B,2,false))&":N"&ROW(VLOOKUP($A3,'Student Needs'!$A$1:$B,2,false))),"[A-Z]+")),0)))&5))

Sybase compare columns with duplicate row ids

So far I have a query with a result set (in a temp table) with several columns, but I am only concerned with four. One is a customer ID (varchar), one is Date (smalldatetime), one is Amount (money) and the last is Type (char). I have multiple rows with the same customer ID and want to evaluate them based on Date, Amount and Type. For example:
Customer ID   Date      Amount   Type
A             1-1-10    200      blue
A             1-1-10    400      green
A             1-2-10    400      green
B             1-11-10   100      blue
B             1-11-10   100      red
For all occurrences of A I want to compare them to identify only one, first by earliest date, then by greatest Amount, then if still tied by comparing Types. I would then return one row for each customer.
I would provide some of the query but I am at home now after spending two days trying to get a correct result. It looks something like this:
(query to populate #tempTable)
GROUP BY customer_id
HAVING date_cd =
(SELECT MIN(date_cd)
FROM order_table ot
WHERE ot.customerID = #tempTable.customerID
)
OR date_cd IS NULL
I assume the HAVING would result in only one row per customer_id. This did not end up being the case since there were some ties there.
I am not sure I can do the OR - there are some with NULL values here - and it did not account for the step to the next comparison if they were all the same anyway. I am not seeing a way to avoid doing some row processing of the temp table with some kind of IF or WHERE loop.
As I write, I am thinking maybe I should use #tempTable.date_cd in the HAVING clause instead of looking at the original table, but that should return the same dates?
Am I on the right track or is there something missing? Suggestions? More info??
Try the query below:
select * from #tempTable
GROUP BY customer_id
HAVING isnull(date_cd, "1900/01/01") = min(isnull(date_cd, "1900/01/01"))
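The query above only compares dates, so ties on the earliest date can still return several rows per customer. The full tie-break order from the question (earliest date, then greatest amount, then type) can be sketched as a single sort key. This is a Python illustration of the logic, not Sybase syntax; it assumes the sample dates are month-day-year, that ties on Type are broken alphabetically (the question does not say how), and that NULL dates are treated as 1900-01-01 as in the query above.

```python
from datetime import date

NULL_DATE = date(1900, 1, 1)  # stand-in for NULL dates, mirroring the isnull() default


def pick_one_per_customer(rows):
    """Keep one row per customer: earliest date, then largest amount, then type."""
    def sort_key(row):
        customer, day, amount, kind = row
        return (customer, day or NULL_DATE, -amount, kind)

    chosen = {}
    for row in sorted(rows, key=sort_key):
        chosen.setdefault(row[0], row)  # the first row seen per customer wins
    return list(chosen.values())


# Sample rows from the question as (customer, date, amount, type)
rows = [
    ("A", date(2010, 1, 1), 200, "blue"),
    ("A", date(2010, 1, 1), 400, "green"),
    ("A", date(2010, 1, 2), 400, "green"),
    ("B", date(2010, 1, 11), 100, "blue"),
    ("B", date(2010, 1, 11), 100, "red"),
]
result = pick_one_per_customer(rows)
for row in result:
    print(row)
```

In SQL, the same ordering would typically be expressed as successive filters (minimum date, then maximum amount among those rows, then minimum type), one level per tie-break.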
