How to create a new table from a dataset in Google Sheets? - google-sheets

I have a logger which records a date/time and a value, for example:
05/06/21 11:29:43 0
05/06/21 11:29:48 0
05/06/21 11:29:53 0
05/06/21 11:29:58 1
05/06/21 11:30:03 1
05/06/21 11:30:08 1
05/06/21 11:30:13 0
The samples are recorded over a 30 day period.
I'd like to create a new dataset which filters this logging data to record the start and end time for each value within the current sequence i.e.
05/06/21 11:29:43 05/06/21 11:29:53 0
05/06/21 11:29:58 05/06/21 11:30:08 1
05/06/21 11:30:13 05/06/21 11:30:13 0
I would also like a new dataset which is the COUNT of each value by date, i.e.
05/06/21 0 4
05/06/21 1 3
How could either of these best be achieved using Google Sheets?

Part 1 - to go in cell I3 of the sample sheet:
=arrayformula({iferror(vlookup(query(if(if(A2:A<>"",B3:B,)=if(A2:A<>"",B1:B,),if(B2:B=if(A2:A<>"",B3:B,),,row(A2:A)),if(B2:B=if(A2:A<>"",B3:B,),row(A2:A),)),"where Col1 is not null",0),{row($A:$A),$A:$A},2,false),),iferror(vlookup(query(if(if(A2:A<>"",B3:B,)="",if(A2:A<>"",row(A2:A),),if(B2:B=if(A2:A<>"",B3:B,),,row(A2:A))),"where Col1 is not null",0),{row($A:$A),$A:$B},{2,3},false),),query(if(if(A2:A<>"",B3:B,)="",if(A2:A<>"",row(A2:A),),if(B2:B=if(A2:A<>"",B3:B,),,row(A2:A))),"where Col1 is not null",0)-query(if(if(A2:A<>"",B3:B,)=if(A2:A<>"",B1:B,),if(B2:B=if(A2:A<>"",B3:B,),,row(A2:A)),if(B2:B=if(A2:A<>"",B3:B,),row(A2:A),)),"where Col1 is not null",0)+1})
The working for part 1:
Looking at sequential values of Amps in column B, we capture the start date/time from column A at the beginning of the sequence, then the stop date/time at the end of the sequence. Where there is only one value in a sequence (cell B21), the row is the start and stop value. The end value (cell B119) is also a stop value.
The following combines 3 formulas in an array {1,2,3}:
#1 - get start date/time (vlookup to get col A):
=arrayformula(iferror(vlookup(query(if(if(A2:A<>"",B3:B,)=if(A2:A<>"",B1:B,),if(B2:B=if(A2:A<>"",B3:B,),,row(A2:A)),if(B2:B=if(A2:A<>"",B3:B,),row(A2:A),)),"where Col1 is not null",0),{row($A:$A),$A:$A},2,false),))
It looks at column B to see where the value below the current cell matches, but the one above is different (the start of a sequence). It also matches where the values above and below are the same as each other but different from the current cell (single entry).
#2 - get stop date/time and Amps (vlookup to get col A and B):
=arrayformula(iferror(vlookup(query(if(if(A2:A<>"",B3:B,)="",if(A2:A<>"",row(A2:A),),if(B2:B=if(A2:A<>"",B3:B,),,row(A2:A))),"where Col1 is not null",0),{row($A:$A),$A:$B},{2,3},false),))
It looks at column B to see where the value below the current cell is different (the end of a sequence). This covers a single entry, where the values above and below are both different, and the last row, where the value below is blank (stop value).
#3 - get count (stop row# minus start row#, +1):
=arrayformula(query(if(if(A2:A<>"",B3:B,)="",if(A2:A<>"",row(A2:A),),if(B2:B=if(A2:A<>"",B3:B,),,row(A2:A))),"where Col1 is not null",0)-query(if(if(A2:A<>"",B3:B,)=if(A2:A<>"",B1:B,),if(B2:B=if(A2:A<>"",B3:B,),,row(A2:A)),if(B2:B=if(A2:A<>"",B3:B,),row(A2:A),)),"where Col1 is not null",0)+1)
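If you want to sanity-check the formula's output outside Sheets, here is a minimal Python sketch of the same idea: group consecutive rows with equal values, then take the first timestamp, last timestamp, value and row count of each group. It is not part of the sheet; the `runs` helper and the in-memory `samples` list (copied from the question) are assumptions for illustration only.

```python
from itertools import groupby

# Sample rows copied from the question: (timestamp, value)
samples = [
    ("05/06/21 11:29:43", 0),
    ("05/06/21 11:29:48", 0),
    ("05/06/21 11:29:53", 0),
    ("05/06/21 11:29:58", 1),
    ("05/06/21 11:30:03", 1),
    ("05/06/21 11:30:08", 1),
    ("05/06/21 11:30:13", 0),
]


def runs(rows):
    """Return (start, stop, value, count) for each run of equal consecutive values."""
    result = []
    # groupby only merges adjacent rows, which is exactly what a "sequence" means here
    for value, group in groupby(rows, key=lambda row: row[1]):
        block = list(group)
        result.append((block[0][0], block[-1][0], value, len(block)))
    return result


for run in runs(samples):
    print(run)
```

A single-entry sequence simply comes out with the same start and stop timestamp and a count of 1, matching the single-entry case described above.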
Part 2 - to go in cell E2 of the sample sheet:
=arrayformula(query({A:B,text(A:A,"dd/mm/yy")},"select Col3,Col2,count(Col2) where Col1 is not null group by Col3,Col2 label Col2 'Value', Col3 'Day', count(Col2) 'Count' ",0))
The QUERY function selects the Amps values, counts their occurrences, and groups the counts by day.
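For comparison, the same "count of each value by date" can be sketched in a few lines of Python. This is only an illustration of the grouping logic, not something the sheet needs; `count_by_day` is a hypothetical helper, and it assumes each timestamp is a string whose date part comes before the first space.

```python
from collections import Counter

# Sample rows copied from the question: (timestamp, value)
samples = [
    ("05/06/21 11:29:43", 0),
    ("05/06/21 11:29:48", 0),
    ("05/06/21 11:29:53", 0),
    ("05/06/21 11:29:58", 1),
    ("05/06/21 11:30:03", 1),
    ("05/06/21 11:30:08", 1),
    ("05/06/21 11:30:13", 0),
]


def count_by_day(rows):
    """Count rows per (date, value) pair; the date is the text before the first space."""
    counts = Counter((stamp.split(" ")[0], value) for stamp, value in rows)
    # Sorted as plain text, so dd/mm/yy dates are not in true chronological order
    return sorted(counts.items())


for (day, value), count in count_by_day(samples):
    print(day, value, count)
```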

Related

Google Sheets: Get max value and corresponding values from other columns in a query function

I'm a bit lost with Google Sheets.
This is just a example with Google Finance to illustrate my problem!
I use a code like this
=GOOGLEFINANCE("AAPL", "all" , DATE(2022,6,21), DATE(2022,6,29))
to get this table:
My goal is to get the max value in Column 6 (Volume) with the corresponding values from Column 1(Date) & 3(High). The output has to be in the same cell as the formula (the whole table should never show up).
It should basically look like this:
I use this code to get the max value from Column 6
=QUERY(GOOGLEFINANCE("AAPL", "all" , DATE(2022,6,21), DATE(2022,6,29)),"select Max(Col6) label Max(Col6)''")
but I can't find a solution to add the corresponding values from Col1 and Col3 to the output.
Option 01
Paste this formula to get the "desired output", no need for the table either! Link to the Sheet.
=SORTN(QUERY(GOOGLEFINANCE("AAPL", "all" , DATE(2022,6,21), DATE(2022,6,29))," Select Col1,Col3,Col6 ",0),1,,3,0)
Explanation
1 - QUERY the input, in this case GOOGLEFINANCE("AAPL", "all" , DATE(2022,6,21), DATE(2022,6,29)), and set "query" to select Col1, Col3 and Col6 (Date, High and Volume), with [headers] set to 0.
2 - SORTN the result and set the number of items to return [n] to 1 to get the top result. Because the QUERY function's [headers] is set to 0, SORTN returns only one row. Set [sort_column] to 3, which is the position of Volume (Col6) in the query result, and [is_ascending] to 0.
Option 02
To output the result with headers.
=SORTN(QUERY(GOOGLEFINANCE("AAPL", "all" , DATE(2022,6,21), DATE(2022,6,29))," Select Col1,Col3,Col6 "),2,,3,0)
Explanation
1 - QUERY the input, in this case GOOGLEFINANCE("AAPL", "all" , DATE(2022,6,21), DATE(2022,6,29)), and set "query" to select Col1, Col3 and Col6 (Date, High and Volume).
2 - SORTN the result and set the number of items to return [n] to 2 to get the top result plus the header row. Set [sort_column] to 3, which is the position of Volume (Col6) in the query result, and [is_ascending] to 0.
I hope that helps.
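To make the intended result concrete, here is a rough Python sketch of "find the row with the highest volume and return its date and high". The rows below are made-up numbers, not real AAPL data, and `max_volume_row` is a hypothetical helper; the column order (date, open, high, low, close, volume) is assumed to match the GOOGLEFINANCE "all" output.

```python
# Made-up rows for illustration only: (date, open, high, low, close, volume)
rows = [
    ("2022-06-21", 10.0, 12.0, 9.0, 11.0, 500),
    ("2022-06-22", 11.0, 13.0, 10.0, 12.0, 900),
    ("2022-06-23", 12.0, 15.0, 11.0, 14.0, 700),
]


def max_volume_row(table):
    """Return (date, high, volume) from the row with the largest volume."""
    # Index 5 is the volume column; max() keeps the whole row, not just the volume
    date, _open, high, _low, _close, volume = max(table, key=lambda row: row[5])
    return date, high, volume


print(max_volume_row(rows))
```

Note that in this made-up data the highest High and the highest Volume fall on different days, so sorting on the wrong column would give a different row.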
Try:
=QUERY(GOOGLEFINANCE("AAPL", "all" , DATE(2022,6,21), DATE(2022,6,29)),
"select Col1, Col3, Max(Col6) group by Col1, Col3 order by Max(Col6) desc limit 1 label Max(Col6) ''",0)

Return a value of first & last row matching criterion in Google Sheets

I have 2 columns of numbers, the idea is to find the first & last value in column L that respects a criterion and return the value from the same row in column K.
As the criterion is "higher than 99% of the max value in column L", I tried the MINIFS formula, but I cannot use this as a criterion.
I guess the solution will include the MATCH, INDEX formula but I cannot find the right combination
In this specific example, we want to return the column K value from the first row whose column L value is higher than 0,99*max(L3:L62) (=3.0879...), so it should return 19.
The condition then holds for a number of rows until the value drops below the 99% threshold again. The last such row is 58.
Link to sheet :
https://docs.google.com/spreadsheets/d/1MUkYDPoR1NxB8qWcYr_2Fp91FgUbnUOpfGd7EUuWCOg/edit?usp=sharing
You can also try the following:
For first value
=Index(K3:K62;Min(IFERROR(1/(1/((Row(K3:K62)*(L3:L62>0,99*max(L3:L62)))))))-2)
For last value
=Index(K3:K62;Max(IFERROR(1/(1/((Row(K3:K62)*(L3:L62>0,99*max(L3:L62)))))))-2)
Try
=query({K3:L62};"select Col1 where Col2 > "&substitute(to_text(0,99*MAX(L3:L62));",";".")&" limit 1";0)
The query string needs US number notation (a dot instead of a comma as the decimal separator), which is why SUBSTITUTE is used.
To get the last row:
=query({K3:L62};"select Col1 where Col2 > "&SUBSTITUTE(to_text(0,99*MAX(L3:L62));",";".")&" order by Col1 desc limit 1";0)
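The logic behind both answers (filter the rows above the threshold, then take the first and the last match) can be sketched in Python as below. This is only an illustration: `first_last_above` is a hypothetical helper and the test numbers are invented, not taken from the linked sheet.

```python
def first_last_above(keys, values, fraction=0.99):
    """Return the keys of the first and last rows whose value exceeds fraction * max(values)."""
    threshold = fraction * max(values)
    # Keep the original row order so "first" and "last" refer to position in the sheet
    hits = [key for key, value in zip(keys, values) if value > threshold]
    if not hits:
        return None
    return hits[0], hits[-1]


# Invented example: column K holds 1..6, column L peaks at 3.0
k_column = [1, 2, 3, 4, 5, 6]
l_column = [1.0, 2.0, 2.99, 3.0, 2.98, 1.5]
print(first_last_above(k_column, l_column))
```

Unlike the `order by Col1 desc limit 1` variant, this sketch takes the last match by position, so it does not depend on column K being sorted.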

Google sheets Query function with Arrayformula

For each email ID, I want to get the latest 10 records by timestamp. How do I get the results with arrayformula? The Query function is not important as long as I can still achieve this with arrayformula. Here is the sample data:
https://docs.google.com/spreadsheets/d/1YAHA02VM-5MXzVKhkxu_eODPKObpoz441mGX8lOFu5M/edit?usp=sharing
Try this on another sheet, row 1:
=arrayformula(query({query({Sheet1!$A:$C},"order by Col1 desc,Col2",1),{"Dupe position";countifs(query({Sheet1!$A2:$C},"select Col2 order by Col1 desc,Col2",0),query({Sheet1!$A2:$C},"select Col2 order by Col1 desc,Col2",0),row(Sheet1!$A2:$C),"<="&row(Sheet1!$A2:$C))}},"select Col1,Col2,Col3 where Col1 is not null and Col4 <= 10 order by Col1",1))
You can adjust the number of records found by adjusting Col4 <= 10, and also the final sort by altering order by Col1 at the end of the formula.
Explanation
This gets the data from Sheet1, sorts it by date desc then email asc:
query({Sheet1!$A:$C},"order by Col1 desc,Col2",1)
Then to the side of this data, a COUNTIFS() is used to get the number each time an email appears in the list above (since it's sorted desc, 1 represents the most recent instance).
countifs(<EmailColumnData>,<EmailColumnData>,row(<EmailColumn>),"<="&row(<EmailColumn>))
In place of <EmailColumnData> in the COUNTIFS() is:
query({Sheet1!$A2:$C},"select Col2 order by Col1 desc,Col2",0)
In place of <EmailColumn> above, we only want the row number so we don't need the actual data. We can use:
Sheet1!$A2:$C
Various {} work as arrays to bring the data together.
E.g., {a,b,c;d,e,f} would result in three columns, with a, b, c in row 1 and d, e, f in row 2. A comma (,) starts a new column and a semicolon (;) starts a new row.
A final query around everything gets the 3 columns we need, where the count number in col 4 is <=10, then sorts the output by Col1 (date asc).
On second thoughts, maybe this is bit cheeky, but this might do it ( taken from conditional rank idea )
=ArrayFormula(filter(A2:C,countifs(A2:A,">="&A2:A,B2:B,B2:B)<=10,A2:A<>""))
EDIT
The above assumes (because the data is time-stamped) that dups shouldn't occur. If they do and the data is pre-sorted, you can use the row number as a proxy for the time stamp, as suggested by @Aresvik.
Alternatively, you could count separately
(a) only rows with a later timestamp
plus
(b) rows with the same time stamp but with earlier (or identical) row number
=ArrayFormula(filter(A2:C,countifs(A2:A,">"&A2:A,B2:B,B2:B)+countifs(A2:A,"="&A2:A,B2:B,B2:B,row(A2:A),"<="&row(A2:A))<=10,A2:A<>""))
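The counting rule in (a) plus (b) can be written out in Python to check it on a small case. This is a sketch under assumptions: `latest_per_email` is a hypothetical helper, the rows are invented (timestamp, email, score) tuples, and integers stand in for real timestamps.

```python
def latest_per_email(rows, limit=10):
    """Keep rows ranked within `limit` per email, newest first, ties broken by row order."""
    kept = []
    for i, (stamp, email, score) in enumerate(rows):
        # (a) rows for the same email with a strictly later timestamp
        later = sum(1 for s, e, _ in rows if e == email and s > stamp)
        # (b) rows for the same email with the same timestamp at or above this row
        tied = sum(
            1
            for j, (s, e, _) in enumerate(rows)
            if e == email and s == stamp and j <= i
        )
        if later + tied <= limit:
            kept.append((stamp, email, score))
    return kept


# Invented data: email "b" has three rows sharing one timestamp
rows = [
    (1, "a", 10),
    (2, "a", 20),
    (3, "a", 30),
    (3, "b", 5),
    (3, "b", 6),
    (3, "b", 7),
]
print(latest_per_email(rows, limit=2))
```

With `limit=2`, the oldest row for "a" is dropped, and for "b" the third of the tied rows is dropped because its row number pushes its rank to 3.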
I have added a new sheet ("Erik Help") with the following formula in A1:
=ArrayFormula({"Submitted Time","Email","Score";SORT(SPLIT(FLATTEN(QUERY(SORT(TRANSPOSE(SPLIT(TRANSPOSE(QUERY(IF(Sheet1!B2:B=TRANSPOSE(UNIQUE(FILTER(Sheet1!B2:B,Sheet1!B2:B<>""))),Sheet1!A2:A&"|"&Sheet1!B2:B&"|"&Sheet1!C2:C,),,COUNTA(Sheet1!A2:A)))," ",0,1)),SEQUENCE(MAX(COUNTIF(Sheet1!B2:B,Sheet1!B2:B))),0),"LIMIT 10")),"|",1,0),1,0)})
The number of records is set after LIMIT.
The order is set by the final two numbers: 1,0 (meaning "sort by column 1 in reverse order," which, as currently set, is sorting in reverse order by date/time).

Rolling count with array formula in Google Sheets

I have the dataset below. Col1 is given data and Col2 is the rolling count of the previous 5 rows of Col1 (inclusive).
Date      Col1  Col2
01/04/20  2     1
02/04/20  1     2
03/04/20  4     3
04/04/20        3
05/04/20        3
06/04/20  5     3
07/04/20  2     3
08/04/20        2
09/04/20        2
10/04/20  1     3
11/04/20        2
12/04/20        1
13/04/20        1
14/04/20        1
15/04/20  1     1
Is there a way to use arrayformula to do this rather than inputting a count formula into every cell in Col2 going down?
You can use Countifs with a condition on the rows:
=ArrayFormula(filter(countifs(B2:B,">0",row(B2:B),"<="&row(B2:B),row(B2:B),">"&row(B2:B)-5),A2:A<>""))
assuming the numbers are positive
To include any number, you can use:
=ArrayFormula(filter(countifs(isnumber(B2:B),true,row(B2:B),"<="&row(B2:B),row(B2:B),">"&row(B2:B)-5),A2:A<>""))
If you wanted to show rows corresponding to future dates as blanks, you could add an If statement:
=ArrayFormula(filter(if(A2:A>today(),"",countifs(isnumber(B2:B),true,row(B2:B),"<="&row(B2:B),row(B2:B),">"&row(B2:B)-5)),A2:A<>""))
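As a cross-check of what the COUNTIFS conditions compute, here is a small Python sketch of the same rolling count: for each row, count the non-blank entries among that row and the four before it. `rolling_count` is a hypothetical helper, and blanks in Col1 are represented as `None`; the data is the Col1 column from the question.

```python
def rolling_count(values, window=5):
    """For each position, count non-blank values in the last `window` rows (inclusive)."""
    counts = []
    for i in range(len(values)):
        start = max(0, i - window + 1)  # clamp at the top of the column
        counts.append(sum(1 for v in values[start : i + 1] if v is not None))
    return counts


# Col1 from the question, with None for blank cells
col1 = [2, 1, 4, None, None, 5, 2, None, None, 1, None, None, None, None, 1]
print(rolling_count(col1))
```

The result lines up with the Col2 values shown in the question.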

how to count elements in google spreadsheets matching a certain date range

I have some data in columns with a timestamp in the first column and data columns.
A B C D
+++++++++++++++++++++++++++++++++++
20.5.2011 1 2 5
18.5.2011 3 5 4
12.5.2013 4 7 5
I am able to successfully filter column data based on the timestamp with this Google Sheets formula. The formula below returns the sum of all integers in column B that have a corresponding 2011 timestamp.
=ArrayFormula(SUMIF(TEXT($A:$A;"yyyy");year(today())-1;$B:$B))
The above sums the values 1 and 3 from column B and returns 4.
The question is, how would I calculate the average for the above values 1 and 3 resulting in 2? My current approach is to divide the above formula by the count() of items that match the date criterion but I cannot get it to work.
=ArrayFormula(SUMIF(TEXT($A:$A;"yyyy");year(today())-1;$B:$B))/<FORMULA FOR THE DIVISOR>
Any ideas?
You can use COUNTIF in much the same way as you used SUMIF:
=ArrayFormula(SUMIF(TEXT($A:$A;"yyyy");YEAR(TODAY())-1;$B:$B)/COUNTIF(TEXT($A:$A;"yyyy");YEAR(TODAY())-1))
(this would currently return the average of all the 2012 entries).
You can simplify this a little by using the YEAR function in the comparison array:
=ArrayFormula(SUMIF(YEAR($A:$A);YEAR(TODAY())-1;$B:$B)/COUNTIF(YEAR($A:$A);YEAR(TODAY())-1))
You can also generate a table of sums, averages or counts quite easily with QUERY:
=QUERY(A:B;"select year(A), avg(B) where A is not null group by year(A) label year(A) 'Year', avg(B) 'Average'")
and if you just wanted the average for 2012 as a single value:
=QUERY(A:B;"select avg(B) where year(A) = "&(YEAR(TODAY())-1)&" group by year(A) label avg(B) ''")
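The sum-divided-by-count idea is easy to verify on the question's three rows with a short Python sketch. This is illustrative only: `average_for_year` is a hypothetical helper, and it assumes the timestamps are text in day.month.year form as shown in the question.

```python
from datetime import datetime
from statistics import mean

# Columns A and B from the question: (timestamp, value)
rows = [("20.5.2011", 1), ("18.5.2011", 3), ("12.5.2013", 4)]


def average_for_year(table, year):
    """Average the values whose timestamp falls in the given year, or None if none match."""
    matches = [
        value
        for stamp, value in table
        if datetime.strptime(stamp, "%d.%m.%Y").year == year
    ]
    # Guard against dividing by zero when no row matches the year
    return mean(matches) if matches else None


print(average_for_year(rows, 2011))
```

Returning `None` for a year with no rows mirrors the case where the COUNTIF divisor would be zero in the sheet.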
