This question concerns summarizing a data table in Google Sheets using predetermined criteria.
I have a data table like so in range A1:B8:
A B
-------------
310 890.00
210 875.00
100 849.00
80 845.00
70 842.00
61 842.00
60 841.00
53 825.50
I also have a criteria table which specifies the boundaries for the ranges to be used in the merged table. The criteria table looks like this.
START END
------------
210 310
95 200
69 90
53 65
The criteria table is derived independently from the data table and as you can see, not all the values in the criteria table are present in the data table.
How can I use ARRAYFORMULA so that the final table has the following data?
START END MAX MIN START VALUE END VALUE
210 310 890 875 875 890
95 200 849 849 849 849
69 90 845 842 842 845
53 65 842 825.5 825.5 842
Here are the starting formulas used to calculate the MAX, MIN, START VALUE, and END VALUE
MAX FORMULA
INDEX(
query(
{ ARRAYFORMULA(VALUE(table1!$A$1:$A$8)),
ARRAYFORMULA(VALUE(table1!$B$1:$B$8)) },
" select max(Col2)
where Col1>="& VALUE(table2!A2) &" and Col1<=" & VALUE(table2!B2) &
" label max(Col2) ''"
),
1,
1
)
MIN FORMULA
INDEX(
query(
{ ARRAYFORMULA(VALUE(table1!$A$1:$A$8)),
ARRAYFORMULA(VALUE(table1!$B$1:$B$8)) },
" select min(Col2)
where Col1>="& VALUE(table2!A2) &" and Col1<=" & VALUE(table2!B2) &
" label min(Col2) ''"
),
1,
1
)
START VALUE FORMULA
INDEX(
query(
{ ARRAYFORMULA(VALUE(table1!$A$1:$A$8)),
ARRAYFORMULA(VALUE(table1!$B$1:$B$8)) },
" select Col2
where Col1>="& VALUE(table2!A2) &" and Col1<=" & VALUE(table2!B2) &
" order by Col1 asc"
),
1,
1
)
END VALUE FORMULA
INDEX(
query(
{ ARRAYFORMULA(VALUE(table1!$A$1:$A$8)),
ARRAYFORMULA(VALUE(table1!$B$1:$B$8)) },
" select Col2
where Col1>="& VALUE(table2!A2) &" and Col1<=" & VALUE(table2!B2) &
" order by Col1 desc"
),
1,
1
)
Here is the link to the publicly editable Google Sheet with the sample data on it.
How can I use ARRAYFORMULA in Google Sheets so that the details are auto-populated for a very large dataset?
I have looked at FILTER and VLOOKUP. But since not all the values in the criteria table are actually in the data table, I am having trouble utilizing their ability to work well with ARRAYFORMULA. Please enlighten me. Thanks.
Not exactly what you want, but very much easier:
The range E1:F9 is named ArStart and is used to fill column C with:
=ArrayFormula(vlookup(A2:A,ArStart,2))
The output is then produced by a pivot table.
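As a cross-check outside Sheets, the summary the question asks for can be sketched in Python. This is only an illustration of the logic (per criteria range: the max and min of column B, plus the values at the lowest and highest matching key in column A), not a replacement for the formulas. The function name `summarize` and the choice of `None` for ranges with no matching rows are assumptions made for this sketch.

```python
# Data table (A1:B8 in the question): key in column A, value in column B.
data = [(310, 890.00), (210, 875.00), (100, 849.00), (80, 845.00),
        (70, 842.00), (61, 842.00), (60, 841.00), (53, 825.50)]

# Criteria table: inclusive (START, END) bounds on column A.
criteria = [(210, 310), (95, 200), (69, 90), (53, 65)]


def summarize(data, criteria):
    """Return one (start, end, max, min, start_value, end_value) row per range."""
    rows = []
    for start, end in criteria:
        # Keep rows whose key falls inside the range, ordered by key ascending.
        hits = sorted((k, v) for k, v in data if start <= k <= end)
        if not hits:
            # Assumption: a range with no matching rows yields empty cells.
            rows.append((start, end, None, None, None, None))
            continue
        values = [v for _, v in hits]
        # hits[0] is the row with the lowest key, hits[-1] the highest key.
        rows.append((start, end, max(values), min(values),
                     hits[0][1], hits[-1][1]))
    return rows


summary = summarize(data, criteria)
for row in summary:
    print(row)
```

The bounds do not need to exist in the data table, since each range is matched with inequalities rather than exact lookups.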
I want to calculate the RPD (relative percent difference) for different groups in a Google Sheet.
I can do it for the whole array, but it fails when I add the groups. The same approach works when I use it for PERCENTRANK, though.
For all values, this works fine:
=(B20-AVERAGE($B$20:$B$30))/(AVERAGE($B$20:$B$30))*100
Try A
ArrayFormula((B20-AVERAGE($B$20:$B$30))/(AVERAGE(IFERROR(B$20:B$32*IF($A$20:$A$32=$A20,1,""),""),B20)))*100
Try B
=ARRAYFORMULA(B20-average(IFERROR(($B$20:$B$32)*IF($A$20:$A$32=$A20,1,""),B20)))/AVERAGE(B20:B34)*100
sheet as screenshot shows
Any idea?
Example table: I made all values equal except the first one to check whether the calculation is right. The expected result is in the last column.
Group Value Result
PG 40 18.92
PG 33 -1.89
SF 40 18.92
PG 33 -1.89
SG 40 18.92
SG 33 -1.89
PG 33 -1.89
SF 33 -1.89
SF 33 -1.89
SF 33 -1.89
SG 33 -1.89
SG 33 -1.89
The strings in "Group" can appear in any order, so I need a way to handle this. In the final version I will have 25 columns to calculate according to "Group".
To calculate the difference of each value and its group's average, and divide the result by the group average, try this:
=lambda(
groupAverage,
to_percent((B2 - groupAverage) / groupAverage)
)(
average(filter(B$2:B$14, A$2:A$14 = A2))
)
To evaluate the same through the whole column with an array formula, try this:
=lambda(
groups, values,
map(
groups, values,
lambda(
group, value,
lambda(
groupAverage,
if(
len(group),
to_percent((value - groupAverage) / groupAverage),
iferror(1/0)
)
)(
iferror(average(filter(values, groups = group)))
)
)
)
)(
A2:A, B2:B
)
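To make the per-group calculation easier to check by hand, here is a rough Python sketch of the same idea: average each group, then express every value as a percent difference from its own group's average. The function name `group_rpd` is made up for this illustration, and the rows are the group/value pairs from the question's table.

```python
from collections import defaultdict

# (group, value) pairs taken from the question's example table.
rows = [("PG", 40), ("PG", 33), ("SF", 40), ("PG", 33), ("SG", 40), ("SG", 33),
        ("PG", 33), ("SF", 33), ("SF", 33), ("SF", 33), ("SG", 33), ("SG", 33)]


def group_rpd(rows):
    """Return (value - group average) / group average * 100 for every row."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for group, value in rows:
        totals[group] += value
        counts[group] += 1
    averages = {g: totals[g] / counts[g] for g in totals}
    # The order of the groups does not matter: each row looks up its own average.
    return [(value - averages[group]) / averages[group] * 100
            for group, value in rows]


rpd = group_rpd(rows)
for (group, value), pct in zip(rows, rpd):
    print(group, value, round(pct, 2))
```

Because every row is measured against its own group, the results for each group sum to zero.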
To get the previous value in the same group on a row-by-row basis, try this:
=single( iferror( sort( filter( { B$1:B1, row(B$1:B1) }, A$1:A1 = A2 ), 2, false ), B2 ) )
To get the relative change to previous value in the same group, try this:
=to_percent( iferror((B2 - C2) / B2) )
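The two row-by-row formulas above can also be sketched in Python, assuming rows are processed top to bottom. The helper name `relative_change_to_previous` and the short sample list are hypothetical. As in the formulas, the first row of a group falls back to its own value, and the difference is divided by the current value.

```python
def relative_change_to_previous(rows):
    """For each (group, value) row, compare the value with the previous value
    seen in the same group; the first row of a group compares with itself."""
    last_seen = {}
    changes = []
    for group, value in rows:
        previous = last_seen.get(group, value)  # fall back to the current value
        # Mirrors (B2 - C2) / B2; a zero value gives None instead of an error.
        changes.append((value - previous) / value if value else None)
        last_seen[group] = value
    return changes


# Hypothetical sample rows in (group, value) form.
sample = [("PG", 40), ("PG", 33), ("SF", 40), ("PG", 33), ("SF", 33)]
changes = relative_change_to_previous(sample)
for row, change in zip(sample, changes):
    print(row, change)
```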
See the sample spreadsheet.
I need column D summed wherever columns A-C are identical. Column E shows the output I want. I am only using Google Sheets functions right now and have not learned how to write scripts. This formula is the closest I've gotten:
=SUM(filter(D:D;COUNTIF(A2:A&B:B&C:C;A2:A&B:B&C:C)>1))
However, it does not distinguish between different text strings; it just sums every duplicate.
Thanks for any help!
A B C D E
papaya 10/10/2022 500 42 42
papaya 15/12/2022 550 30 59
papaya 15/12/2022 550 29 59
Pineapple 16/11/2022 400 55 55
Pineapple 09/11/2022 400 63 78
Pineapple 09/11/2022 400 15 78
use:
=QUERY(A:E; "select A,B,C,sum(D) where D is not null group by A,B,C label sum(D)''")
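For comparison, the grouping that this QUERY performs can be sketched in Python: sum column D for every distinct combination of columns A, B and C. The names `group_sums` and `column_e` are made up for this sketch; `column_e` additionally maps each total back onto the original rows, which is the layout shown in column E of the question.

```python
from collections import defaultdict

# Rows (A, B, C, D) from the question's table; dates are kept as plain strings.
rows = [("papaya", "10/10/2022", 500, 42),
        ("papaya", "15/12/2022", 550, 30),
        ("papaya", "15/12/2022", 550, 29),
        ("Pineapple", "16/11/2022", 400, 55),
        ("Pineapple", "09/11/2022", 400, 63),
        ("Pineapple", "09/11/2022", 400, 15)]


def group_sums(rows):
    """Sum column D for every distinct (A, B, C) combination."""
    totals = defaultdict(int)
    for a, b, c, d in rows:
        totals[(a, b, c)] += d
    return dict(totals)


totals = group_sums(rows)
# One total per original row, matching the layout of column E.
column_e = [totals[(a, b, c)] for a, b, c, _ in rows]
print(column_e)
```

Using the whole (A, B, C) combination as the key is what keeps different text strings apart, which the COUNTIF attempt in the question does not do.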
update
use in M2:
=INDEX(LAMBDA(bc; g; i; IFERROR(g/VLOOKUP(bc; QUERY({bc\i*1};
"select Col1,sum(Col2) where Col2 > 0 group by Col1 label sum(Col2)''"); 2; )))
(B2:B&C2:C; G2:G; I2:I))
I'm working with two Google Sheets: the master and a copy used to create a 'dashboard' for analytics.
Master Sheet
name quantity price/quantity
RozMo 10 1.75
Tam 3 3.65
Gurba 36 12
Tam 30 0.55
RozMo 25 0.75
RozMo 5 0.50
RozMo 2 0.35
Gurba 150 8.75
Dashboard Sheet - Desired Output
name quantity price/quantity
RozMo 42 0.939
Tam 33 0.831
Gurba 186 9.379
Dashboard Sheet - This is how far I've got
name quantity price/quantity
RozMo 42
Tam 33
Gurba 186
Formulae used
To get the unique names
=UNIQUE('Master Sheet'!$A$2:$A)
To get quantity
=SUMIFS('Master Sheet'!$B$2:$B,'Master Sheet'!$A$2:$A,A2)
How do I populate the third column?
See how this works for you (I cannot test it, since you did not provide access to the spreadsheet):
=ArrayFormula(QUERY({'Master Sheet'!A2:C,'Master Sheet'!B2:B*'Master Sheet'!C2:C},"Select Col1, SUM(Col2), SUM(Col4)/SUM(Col2) WHERE Col1 Is Not Null GROUP BY Col1 LABEL Col1 'name', SUM(Col2) 'quantity', SUM(Col4)/SUM(Col2) 'price/qty' FORMAT SUM(Col4)/SUM(Col2) '0.000'"))
This one formula should produce all headers and results, formatted according to your full "desired result." If not, share a link to your spreadsheet (or a copy of it).
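Since the formula could not be tested against the real spreadsheet, here is a small Python sketch of the arithmetic it relies on: the third column is a quantity-weighted average, that is, the sum of quantity times price divided by the sum of quantity per name. The function name `dashboard` is hypothetical, and the sketch assumes every name has a non-zero total quantity.

```python
# Master Sheet rows from the question: (name, quantity, price per quantity).
master = [("RozMo", 10, 1.75), ("Tam", 3, 3.65), ("Gurba", 36, 12.0),
          ("Tam", 30, 0.55), ("RozMo", 25, 0.75), ("RozMo", 5, 0.50),
          ("RozMo", 2, 0.35), ("Gurba", 150, 8.75)]


def dashboard(rows):
    """Return {name: (total quantity, quantity-weighted average price)}."""
    quantity = {}
    cost = {}
    for name, qty, price in rows:
        quantity[name] = quantity.get(name, 0) + qty
        cost[name] = cost.get(name, 0.0) + qty * price
    # Assumption: total quantity per name is never zero.
    return {name: (quantity[name], cost[name] / quantity[name])
            for name in quantity}


result = dashboard(master)
for name, (qty, avg_price) in result.items():
    print(name, qty, round(avg_price, 3))
```

Rounded to three decimals, Tam comes out at 0.832 rather than the 0.831 shown in the desired output, which suggests the question truncated that figure instead of rounding it.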
I have blood pressure data in two columns (SYS & DIA) and want to count the states according to these rules:
critical SYS > 180 OR DIA > 120
high stage 2 SYS > 140 OR DIA > 90
high stage 1 SYS 130-139 OR DIA 80-89
elevated SYS 120-129 AND DIA < 80
normal SYS < 120 AND DIA < 80
so that for
SYS DIA
120 73 (that's elevated)
123 81 (high stage 1)
112 83 (high stage 1)
129 68 (elevated)
118 72 (normal)
119 80 (high stage 1)
The result should be
normal: 1
elevated: 2
high stage 1: 3
For "elevated":
=COUNTIFS(SYS;">=120";SYS;"<=129";DIA;"<80") seems to work fine.
How do I handle "high stage 1" and other cases with OR?
I'm also considering working around the issue by adding a third column (as shown above) with a description of state via something like:
=IFS(OR(ISBETWEEN(SYS;130;139);ISBETWEEN(DIA;80;89));"High stage 1";AND(ISBETWEEN(SYS;120;129);DIA<80);"Elevated") and so on...
and then I guess I can just count words in that column. Still, one formula for all these states will get kind of messy and I suppose I'm missing a cleaner solution.
see:
=INDEX(QUERY(IF(B11:B="",,
IF(((A11:A>=120)*(A11:A<=129))*(B11:B<80), A4,
IF((A11:A<120)*(B11:B<80), A5,
IF((A11:A>180)+(B11:B>120), A1,
IF((A11:A>140)+(B11:B>90), A2,
IF(((A11:A>=130)*(A11:A<=139))+((B11:B>=80)*(B11:B<=89)), A3)))))),
"select Col1,count(Col1) where Col1 is not null group by Col1 label count(Col1)''"))
You can keep a reference table (1 on the screenshot) to look up conditions by SYS or DIA values (presuming only whole numbers are used):
SYS DIA Condition
0 0 normal
120 80 elevated
130 80 high stage 1
141 91 high stage 2
181 121 critical
Then, using this reference, a formula in a single cell (2 on the screenshot):
=QUERY(
FILTER(
VLOOKUP(
QUERY(
SPLIT(
FLATTEN(
ROW(A9:A) & "♥" & {IFNA(MATCH(A9:A, H2:H6)), IFNA(MATCH(B9:B, I2:I6))}
),
"♥"
),
"SELECT MAX(Col2)
GROUP BY Col1
LABEL MAX(Col2) ''",
),
{SEQUENCE(ROWS(J2:J6)), J2:J6},
2
),
A9:A <> "",
B9:B <> ""
),
"SELECT Col1, COUNT(Col1) GROUP BY Col1 LABEL COUNT(Col1) ''",
)
If you need the condition for every row (3 on the screenshot), you can use this formula:
=ARRAYFORMULA(
IFS(
A9:A = "",,
B9:B = "",,
True,
VLOOKUP(
QUERY(
SPLIT(
FLATTEN(
ROW(A9:A) & "♥" & {IFNA(MATCH(A9:A, H2:H6)), IFNA(MATCH(B9:B, I2:I6))}
),
"♥"
),
"SELECT MAX(Col2)
GROUP BY Col1
LABEL MAX(Col2) ''",
),
{SEQUENCE(ROWS(J2:J6)), J2:J6},
2
)
)
)
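The idea behind the reference table can be sketched in Python: find the highest step each reading has reached in its own column, take the worse of the two positions, and use it to pick the condition. The names `SYS_STEPS`, `DIA_STEPS`, `CONDITIONS` and `classify` are made up for this sketch, and it assumes non-negative whole-number readings, as the answer does.

```python
from bisect import bisect_right
from collections import Counter

# Reference table from the answer: lower bounds for each condition.
SYS_STEPS = [0, 120, 130, 141, 181]
DIA_STEPS = [0, 80, 80, 91, 121]
CONDITIONS = ["normal", "elevated", "high stage 1", "high stage 2", "critical"]


def classify(systolic, diastolic):
    """Return the condition for one reading, taking the worse of SYS and DIA."""
    if systolic < 0 or diastolic < 0:
        raise ValueError("readings must be non-negative")
    # bisect_right gives the 1-based position of the last step <= the reading,
    # the same position a sorted MATCH lookup would report.
    level = max(bisect_right(SYS_STEPS, systolic),
                bisect_right(DIA_STEPS, diastolic))
    return CONDITIONS[level - 1]


# Sample readings from the question.
readings = [(120, 73), (123, 81), (112, 83), (129, 68), (118, 72), (119, 80)]
labels = [classify(s, d) for s, d in readings]
counts = Counter(labels)
print(labels)
print(dict(counts))
```

Taking the maximum of the two positions is what stands in for the OR in the rules: a reading is placed in the most severe condition that either value reaches.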
I have 3 sheets that have the exact same format
Sheet1
A B C D
George 10 2 8
Nick 15 89 0
Mike 13 1 50
Lucas 9 -5 12
Sheet2
A B C D
Nick 1 9 5
Mike 1 10 6
George 11 22 5
Lucas 10 5 2
Panos 55 0 1
Sheet3
A B C D
Panos 0 9 1
George 1 2 5
Nick 7 2 1
Lucas 1 5 1
I want to query the range {'Sheet1'!A1:D5; 'Sheet2'!A1:D5; 'Sheet3'!A1:D5}
And get something like MAX(Col2:Col4) Group By Col1
Which would return something like:
George 22
Nick 89
Mike 50
Lucas 12
Panos 55
I tried:
=sort(query({'Sheet1'!A1:D5; 'Sheet2'!A1:D5;'Sheet3'!A1:D5}, "select Col1, MAX(Col2:Col4) Group by Col1 Label MAX(Col2:Col4) '' " ),2, FALSE)
and
=sort(query({'Sheet1'!A1:D5; 'Sheet2'!A1:D5;'Sheet3'!A1:D5}, "select Col1, MAX(MAX(Col2),MAX(Col3), MAX(Col4)) Group by Col1 " ),2, FALSE)
Both didn't work. Any ideas?
Please try:
=query(sort(transpose(query({Sheet1!A1:D5;Sheet2!A1:D5;Sheet3!A1:D5},"select max(Col2), max(Col3), max(Col4) pivot Col1"))),"select Col1, max(Col2) group by Col1 label(Col1) ''")
To sum up your question: it requires finding the MAX across the columns to the right as well as down the rows. QUERY does not have such a 2D function.
So, use helper columns E and F in each sheet:
Max of B&C:
E2:
=ARRAYFORMULA(IF(B2:B>C2:C,B2:B,C2:C))
Max of B,C&D:
F2:
=ARRAYFORMULA(IF(D2:D>E2:E,D2:D,E2:E))
Now, Use Query:
Query:
=ARRAYFORMULA(QUERY({Sheet1!A2:F;Sheet2!A2:F;Sheet3!A2:F}, "Select Col1,max(Col6) where Col1 is not null group by Col1 order by max(Col6) desc"))
Notes:
Change ranges to suit
You could also simply use MAX for each row without the ARRAYFORMULA
Theoretically, for a single-cell solution, you could enter this formula to find the max of 3 real numbers
Another approach, perhaps a bit simpler but needing two queries:
=sort(unique(({Sheet1!A1:A5;Sheet2!A1:A5;Sheet3!A1:A5})))
to get the names starting in (say) F2
Then this to get the maximum values for each name in (say) G2 and pulled down
=max(query({Sheet1!A$1:D$5;Sheet2!A$1:D$5;Sheet3!A$1:D$5},"select max(Col2),max(Col3),max(Col4) where Col1='"&F2&"'"))
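To check any of these approaches against the sample data, the target result can be sketched in Python: stack the three sheets, take the largest value in each row, then keep the largest row maximum per name. The function name `max_per_name` is hypothetical; the data is copied from the question.

```python
# Rows (name, B, C, D) from the three sheets in the question.
sheet1 = [("George", 10, 2, 8), ("Nick", 15, 89, 0),
          ("Mike", 13, 1, 50), ("Lucas", 9, -5, 12)]
sheet2 = [("Nick", 1, 9, 5), ("Mike", 1, 10, 6), ("George", 11, 22, 5),
          ("Lucas", 10, 5, 2), ("Panos", 55, 0, 1)]
sheet3 = [("Panos", 0, 9, 1), ("George", 1, 2, 5),
          ("Nick", 7, 2, 1), ("Lucas", 1, 5, 1)]


def max_per_name(*sheets):
    """Return (name, max over all columns and sheets), largest value first."""
    best = {}
    for sheet in sheets:
        for name, *values in sheet:
            row_max = max(values)  # max across the columns of one row
            if name not in best or row_max > best[name]:
                best[name] = row_max  # max down the rows for this name
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


ranking = max_per_name(sheet1, sheet2, sheet3)
for name, value in ranking:
    print(name, value)
```

The values agree with the output listed in the question, here sorted in descending order as in the question's own SORT attempt.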