Can Google spreadsheets broadcast rows or columns in their functions, in particular in SUMIF, if different range dimensions are used for two arguments?
For example, I was hoping SUMIF(B1:F1, "Offices", B2:F4) would return 56+23+23, because the 1x5 range B1:F1 would be repeated in the row dimension to match the 3x5 range B2:F4. (This repeating is called dimension broadcasting.) Unfortunately, this doesn't work: SUMIF just ignores the 2 rows it has no criteria for and returns 56.
     A      B            C        D     E          F
1    Month  Maintenance  Offices  Cars  Employees  Cars
2    Jan    23           56       43    23         56
3    Feb    12           23       67    43         21
4    March  44           23       45    56         45
Question: Can I specify a criterion in SUMIF where the column or row stays fixed, as is possible with conditional formatting? Put differently, how can one specify a criterion in SUMIF such that the criterion range is broadcast?
Why DSUM does not work: Notice that DSUM(B1:F4,"Offices",B1:F4) could do the trick, except that it fails here because B1:F4 isn't a proper database: there are two columns named "Cars". Moreover, DSUM requires the column headers to be adjacent to their data, while I imagine I'd want to place the total sums between the header and the data. That being said, I also want to learn what is, in my opinion, a powerful concept, not just the DSUM function.
Conditional formatting: Google spreadsheets do offer broadcasting in formatting. For example, if I format the range A1:F4 on the condition =A$1="Cars", then the two "Cars" columns are formatted, even though they are columns D and F.
Comparison with numpy, which does broadcasting: numpy, which can be used as a spreadsheet-like library when programming in Python, does something called (dimension) broadcasting. Consider an array (read: spreadsheet) a of 1 row and 3 columns and another array b of 3 rows and 3 columns. We could then ask numpy to multiply both element-wise (read: cell-wise), and it will repeat the single row of a three times to match the dimensions of b:
import numpy
a = numpy.array([
[0, 1, 2]
])
b = numpy.array([
[0, 1, 2],
[3, 4, 5],
[7, 8, 9],
])
a * b
Outcome: notice that the entire first column is multiplied by 0, the entire second by 1 and the entire third by 2:
numpy.array([
[0, 1, 4],
[0, 4, 10],
[0, 8, 18],
])
=SUMPRODUCT(QUERY(TRANSPOSE(A1:F10), "where Col1 = 'Offices'"))
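To make the intended result concrete, here is a minimal pure-Python sketch of what a broadcast SUMIF would compute on the sample data from the question. It is not how Sheets evaluates any formula; the function name sumif_broadcast and the list-of-lists layout are assumptions made for illustration only.

```python
# Sample data from the question: header row B1:F1 and values B2:F4.
headers = ["Maintenance", "Offices", "Cars", "Employees", "Cars"]
rows = [
    [23, 56, 43, 23, 56],  # Jan
    [12, 23, 67, 43, 21],  # Feb
    [44, 23, 45, 56, 45],  # March
]


def sumif_broadcast(headers, rows, criterion):
    """Sum every cell whose column header equals `criterion`.

    The single header row is conceptually repeated for each data row,
    which is the broadcasting behaviour the question asks about.
    (Hypothetical helper, not a Sheets or numpy API.)
    """
    return sum(
        value
        for row in rows
        for header, value in zip(headers, row)
        if header == criterion
    )


print(sumif_broadcast(headers, rows, "Offices"))  # 102 (56 + 23 + 23)
print(sumif_broadcast(headers, rows, "Cars"))     # 277 (both "Cars" columns)
```

Because the match is done per column header, duplicate headers such as the two "Cars" columns are simply both included.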
Related
I need a single cell formula to create a sequence of numbers with a limit in Google Sheets as shown in the image.
3 rows repeat the value
then Increment by 5
Use this formula. You can adjust the SEQUENCE() and REPT(rg&",",3) parameters to your needs.
In this example, the parameters are SEQUENCE(number_of_unique_numbers, columns, start_at, increment_by)
and REPT(rg&",", repeat_n_times).
=ArrayFormula(FLATTEN(SPLIT(BYROW(SEQUENCE(3,1,5,5),
LAMBDA(rg, IF(rg="",,REPT(rg&",",3)))),",")))
Option 02
Based on Themaster's answer, we can use LAMBDA with named parameters:
u: unique values, s: start, r: repeat n times
=LAMBDA(u,s,r, FLATTEN(MAKEARRAY(u,r,LAMBDA(u,r,u*s))))
(4,5,3)
Used formulas help
ARRAYFORMULA - FLATTEN - SPLIT - BYROW - SEQUENCE - LAMBDA - IF - REPT - MAKEARRAY
Use MAKEARRAY with FLATTEN. Multiply the row index number by 5:
=FLATTEN(MAKEARRAY(4,3,LAMBDA(i,j,i*5)))
Output
5
5
5
10
10
10
15
15
15
20
20
20
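For comparison, the same "build a grid, then flatten it row by row" idea can be sketched in Python. This is only an illustration of the logic; the function name repeated_sequence and its parameter names are assumptions, not anything from Sheets.

```python
def repeated_sequence(unique, repeat, step):
    """Return step, 2*step, ..., unique*step, each value repeated `repeat` times.

    The outer loop plays the role of the 1-based row index i in MAKEARRAY;
    the inner loop walks the columns, so reading the result in order is the
    same as flattening the grid row by row.
    """
    return [i * step for i in range(1, unique + 1) for _ in range(repeat)]


print(repeated_sequence(4, 3, 5))
# [5, 5, 5, 10, 10, 10, 15, 15, 15, 20, 20, 20]
```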
A different approach, but probably not a good one...:
=arrayformula(mround(sequence(12,1,2),3)*(5/3))
try:
=INDEX(FLATTEN(SEQUENCE(4, 1, 5, 5)*SEQUENCE(1, 3, 1, 0)))
or:
=INDEX(FLATTEN(5*MAKEARRAY(4, 3, LAMBDA(x, O ,x))))
I have raw data that contains four types of defect (EC, VC, BC, NC).
First type of defect calculation (EC, VC, BC):
For example, to count the EC defects, I collect all the rows that contain EC=0. If EC scores 0 twice in the same row, it still counts as only 1 EC for that row.
The same goes for VC and BC.
Second type of defect calculation (NC):
The same calculation as above, except that if the NC defect is repeated in the same row, the occurrences are summed.
The headers VC, EC, BC, NC sit above each column containing the scores.
What I need is to calculate the defects for each row.
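The two counting rules can be written down as a short Python sketch, which may help when checking a formula's output by hand. The column layout and scores below are made up for illustration, and the helper name count_defects is an assumption; blank cells are modelled as None.

```python
def count_defects(headers, row):
    """Return per-type defect counts for one data row.

    A defect is a score of exactly 0; blank cells (None) are ignored.
    EC, VC and BC count at most once per row; NC counts every occurrence.
    """
    hits = {"EC": 0, "VC": 0, "BC": 0, "NC": 0}
    for header, score in zip(headers, row):
        if header in hits and score == 0:
            hits[header] += 1
    # Cap the once-per-row types at 1, leave NC as a running sum.
    return {k: (v if k == "NC" else min(v, 1)) for k, v in hits.items()}


headers = ["EC", "VC", "NC", "EC", "NC", "BC"]  # hypothetical header row
row = [0, 1, 0, 0, 0, None]                      # hypothetical scores
print(count_defects(headers, row))
# {'EC': 1, 'VC': 0, 'BC': 0, 'NC': 2}
```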
I tried the below formula for VC,EC,BC.
=ARRAYFORMULA(IF(ROW(E2:E)=2,"EC",IF(LEN(E2:E)=0,IFERROR(1/0),IF(COUNTIFS($E$1:$BH$1,"EC",$E:$BH,0)>=1,1,0))))
The formula works without ARRAYFORMULA, but it is tedious to drag it down each time the data is updated.
I also tried the formula below for NC.
=ARRAYFORMULA(IF(ROW(H2:H)=2,"NC",IF(LEN(H2:H)=0,IFERROR(1/0),COUNTIFS($E$1:$BH$1,"NC",$E:$BH,0))))
Sample Sheet: Test Sheet
={"EC"; ARRAYFORMULA(IF(LEN(TRIM(FLATTEN(QUERY(TRANSPOSE(IF(FILTER(
INDIRECT("E3:"&ADDRESS(MAX((E:Z<>"")*(ROW(A:A))), COLUMNS(1:1))),
E1:1="EC")=0, 1, )),,9^9))))>0, 1, 0))}
={"VC"; ARRAYFORMULA(IF(LEN(TRIM(FLATTEN(QUERY(TRANSPOSE(IF(FILTER(
INDIRECT("E3:"&ADDRESS(MAX((E:Z<>"")*(ROW(A:A))), COLUMNS(1:1))),
E1:1="VC")=0, 1, )),,9^9))))>0, 1, 0))}
={"BC"; ARRAYFORMULA(IF(LEN(TRIM(FLATTEN(QUERY(TRANSPOSE(IF(FILTER(
INDIRECT("E3:"&ADDRESS(MAX((E:Z<>"")*(ROW(A:A))), COLUMNS(1:1))),
E1:1="BC")=0, 1, )),,9^9))))>0, 1, 0))}
={"NC"; ARRAYFORMULA(MMULT(IF(FILTER(
INDIRECT("E3:"&ADDRESS(MAX((E:Z<>"")*(ROW(A:A))), COLUMNS(1:1))),
E1:1="NC")=0, 1, 0), SEQUENCE(COUNTIF(E1:1, "NC"))^0))}
I am struggling to get the sum of the top X values for a row.
Let's say the top 3 in this case.
     A     B  C  D  E
1    john  1  4  3  2
2    Mary  4  5  1  2
So the total of the top 3 values would be
name  Total
John  9
Mary  11
I can get a single LARGE number but can't figure out how to get the top 5 in the row (and then sum them). Most examples have the values in columns, but my data is in rows.
Answer:
Sum of top 5 in a row:
=ARRAYFORMULA(SUM(IFERROR(LARGE(A1:1, {1, 2, 3, 4, 5}), 0)))
Rundown of this formula:
Use the array notation to denote the first, second, third, fourth and fifth highest in the row with {1, 2, 3, 4, 5}
Set this array as the n parameter for a LARGE formula, with the row as the range parameter
If the LARGE throws an error for whatever reason, return the value 0 for that value of n
SUM all answers
Wrap inside an ARRAYFORMULA so that LARGE gets all array values and not just the first.
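The steps above can be mirrored in a small Python sketch, using the two sample rows from the question with N = 3. The function name sum_top_n is an assumption for illustration; skipping the name cell stands in for the fact that text is not a candidate for LARGE.

```python
import heapq


def sum_top_n(values, n):
    """Sum the n largest numeric entries of a row; text cells are skipped."""
    numbers = [
        v for v in values
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    # nlargest returns fewer than n items when the row is short, which plays
    # the same role as the IFERROR(..., 0) fallback for ranks that do not exist.
    return sum(heapq.nlargest(n, numbers))


print(sum_top_n(["john", 1, 4, 3, 2], 3))  # 9
print(sum_top_n(["Mary", 4, 5, 1, 2], 3))  # 11
```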
References:
Using arrays in Google Sheets - Docs Editors Help
LARGE - Docs Editors Help
IFERROR - Docs Editors Help
SUM - Docs Editors Help
ARRAYFORMULA - Docs Editors Help
I have a scoring spreadsheet for a competition I'm working on. Competitors' place/rank are converted into points towards the overall series based on a chart of corresponding values. For ties, the sum of the points covered by all of the tied places are split evenly among the tied competitors (i.e. 2-way tie for 3rd; if 3rd usually gets 10 points and 4th usually gets 8, these competitors will receive (10+8)/2 (2 being the # of tied competitors), so they each receive 9 points).
I have a formula which does this exact calculation:
=IFERROR(IF(ISBLANK($A4:$A),,SUM(INDEX(SeriesPoints, E4:E):INDEX(SeriesPoints, MIN(E4:E + COUNTIF(E$4:E, E4:E) - 1, ROWS(SeriesPoints)))) / COUNTIF(E$4:E, E4:E), 0))
Where 'SeriesPoints' is a 2 column array; column 1 is the places/ranks (1:125) and column 2 is their corresponding point values. Column 'E' is the competitors' rank from the competition.
I have been unable to convert this formula to an ARRAYFORMULA() so I can avoid dragging it down the entire sheet (possibly up to 1000+ competitors over the series).
I'm mildly proficient with MMULT(), so I understood that would be a good approach for switching out SUM(), however, I haven't been able to create a matrix of the values to be summed.
INDEX():INDEX() doesn't work with ARRAYFORMULA() so I've tried switching to VLOOKUP(). With VLOOKUP() I've been able to produce the start and end values of the range of values for a tie, but not the full list. For example, if there is a 3-way tie for 4th, I can produce the respective points for 4th and 6th (the bounds of the tie).
In an attempt to list out even just the numbers from 4:6, I've hit a wall converting what would be a simple ROW() or SEQUENCE() formula to a matrix/array.
The following formula produces an array of the upper and lower bounds of ties or the single place should there be no tie, although the single place gets repeated.
=ARRAYFORMULA(IF(COUNTIF(E$4:E,E4:E)=1,E4:E,{E4:E,E4:E+COUNTIF(E$4:E,E4:E)-1}))
I'm assuming if I can get VLOOKUP({#:#}) to fill properly, I'll be where I need to be.
From here, I feel confident in my abilities to wrap a VLOOKUP() for the actual point values, an MMULT() to sum across these rows for the total, then a simple division to produce the correct point value.
Spreadsheet: https://docs.google.com/spreadsheets/d/1lpNewR3p4i7ZHmlFGLlG1tLuxgO-6onSeH8mWTeclBw/edit?usp=sharing
Currently, my workspace is off to the right. The original formula is in F4 and my test codes are working on column G instead of E.
So for sample placements of 1,1,3,3,3,6,7,8 and sample points values of 1000, 850,738,663,633,603,573,550 I expect the output to be 925 for the two 1st place tied competitors, 678 for the tied 3rd places, 603 for 6th, 573 for 7th, and 550 for 8th.
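As a cross-check of those expected numbers, here is a minimal Python sketch of the tie-splitting rule. It only restates the arithmetic and is not a translation of the Sheets formula; the function name tie_split_points and the plain list standing in for column 2 of SeriesPoints are assumptions.

```python
from collections import Counter


def tie_split_points(places, points):
    """Average the points of all places covered by each tie.

    `points[0]` is the value for 1st place, `points[1]` for 2nd, and so on.
    A tie that runs past the end of the chart only sums the places that
    exist, similar to the MIN(..., ROWS(SeriesPoints)) clamp in the formula.
    """
    counts = Counter(places)
    result = []
    for place in places:
        tied = counts[place]
        covered = points[place - 1 : place - 1 + tied]  # slicing clamps at the end
        result.append(sum(covered) / tied)
    return result


places = [1, 1, 3, 3, 3, 6, 7, 8]
points = [1000, 850, 738, 663, 633, 603, 573, 550]
print(tie_split_points(places, points))
# [925.0, 925.0, 678.0, 678.0, 678.0, 603.0, 573.0, 550.0]
```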
I'd appreciate any and all help!
=ARRAYFORMULA(IFERROR(IFERROR(VLOOKUP(G4:G, QUERY({INDIRECT("G4:G"&counta(A4:A)+3),
VLOOKUP(ROW(INDIRECT("A1:A"&COUNTA(A4:A))), SeriesPoints, 2, 0)},
"select Col1,sum(Col2) group by Col1 label sum(Col2)''", 0), 2, 0))/
IFERROR(VLOOKUP(G4:G, QUERY(G4:G,
"select G,count(G) where G is not NULL group by G label count(G)''", 0), 2, 0))))
I am trying to get GETPIVOTDATA to work right while using dates. I have looked at multiple questions here on SO that are for GETPIVOTDATA, but none of them use a date in a reference.
I can create a pivot table with the following data and pull out the total for a given division and subdivision. But I can't crack the code for handling dates correctly in the Google Sheets version of GETPIVOTDATA, even though my code works in MS Excel.
This data comes from the Google Docs support page: https://support.google.com/docs/answer/6167538?hl=en
division  subdivision  product number  number of units  Date       price per unit
east      1            1               14               3/1/2018   $10
east      2            1               15               3/1/2018   $11
west      1            1               11               3/3/2018   $10
west      2            1               21               3/4/2018   $9
east      3            1               16               3/1/2018   $8
west      3            1               18               3/6/2018   $12
east      4            1               11               3/7/2018   $9
east      1            2               10               3/1/2018   $9
east      2            2               9                3/9/2018   $13
west      1            2               12               3/10/2018  $10
west      2            2               15               3/1/2018   $10
east      3            2               12               3/12/2018  $9
west      3            2               16               3/1/2018   $12
east      4            2               12               3/14/2018  $9
The pivot table is anchored at H1, and its columns are division, subdivision, Date and SUM of number of units, in cells H1, I1, J1 and K1 respectively.
23 =GETPIVOTDATA(K1,H1,"division", "east", "subdivision", 4)
#REF! =GETPIVOTDATA(K1,H1,"division", "east", "subdivision", 4, "Date", datevalue("2018-3-07"))
#REF! =GETPIVOTDATA(K1,H1,"division", "east", "subdivision", 4, "Date", DATE(2018, 3, 7))
It should return "11" which is the intersection of east, 4 and 3/7
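The two expected values (23 and 11) can be double-checked outside the spreadsheet with a few lines of Python that group the sample rows the same way the pivot table does. This is just a sanity check of the arithmetic; the variable names are assumptions, and the product number and price columns are left out because they do not affect the totals.

```python
from collections import defaultdict

# (division, subdivision, number of units, date) copied from the sample data.
data = [
    ("east", 1, 14, "3/1/2018"), ("east", 2, 15, "3/1/2018"),
    ("west", 1, 11, "3/3/2018"), ("west", 2, 21, "3/4/2018"),
    ("east", 3, 16, "3/1/2018"), ("west", 3, 18, "3/6/2018"),
    ("east", 4, 11, "3/7/2018"), ("east", 1, 10, "3/1/2018"),
    ("east", 2, 9, "3/9/2018"), ("west", 1, 12, "3/10/2018"),
    ("west", 2, 15, "3/1/2018"), ("east", 3, 12, "3/12/2018"),
    ("west", 3, 16, "3/1/2018"), ("east", 4, 12, "3/14/2018"),
]

by_subdivision = defaultdict(int)  # totals per (division, subdivision)
by_date = defaultdict(int)         # totals per (division, subdivision, date)
for division, subdivision, units, day in data:
    by_subdivision[(division, subdivision)] += units
    by_date[(division, subdivision, day)] += units

print(by_subdivision[("east", 4)])       # 23
print(by_date[("east", 4, "3/7/2018")])  # 11
```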
The #REF! errors come back with "Field combination not found in pivot table for function GETPIVOTDATA" even though it seems like all of the fields are listed. As you can see, I can get my summary value if I use only division and subdivision, but not when I add the Date field. I have tried multiple ways to match the date value in the pivot table.
I am flustered. What silly thing am I missing here? Please check that your answer actually works in GoogleSheets before suggesting it :)
Thanks!
I know that I'm absurdly late, but this was bothering me as well and I could not figure it out.
I finally realized that the GETPIVOTDATA method uses the total rows, and will throw an error if the correct totals are not there.
Hopefully this helps people who find this like I did.
It turns out that the value argument (technically the pivot_item argument) for the date argument (original_column) must be text that matches the format of the date as it is formatted in the SOURCE of the pivot table, i.e. in the data.
So if the date item is formatted as 3/7/2018 in the original data, then, regardless of how you format the date in the pivot table, a formula that works would be:
=GETPIVOTDATA(K1,H1,"division", "east", "subdivision", 4, "Date", "3/7/2018")
If in the data, there is a subsequent reoccurrence of the same value but is formatted differently, e.g. 3/7 (no year), then as far as I can tell, the first occurrence of that value will be used as the reference format. So the formula above would capture all 3/7/2018 data* assuming the first 3/7/2018 data point is formatted as such.
If another date in the data is formatted as 3/8 Thu (first occurrence), then that's the text that needs to be used in the Pivot Item of the formula.
Google's definition of the Pivot Item is:
pivot_item… - [optional] repeatable - The name of the row or column shown in the pivot table corresponding to original_column that you want to retrieve.
It says the name of the row or column. Maybe they were very literal about it, but most likely not, considering this function is borrowed from Excel, and Excel uses the value independent of its format.
*subject to the other pivot columns/items (division & subdivision)
Combining answers from bsoo and Amos47 fixed it for me.
You need to make sure that "Show Totals" is ticked on your pivot table.
For dates, don't use datevalue("2018-3-07"); instead, use the TEXT function to match the format that you have in the pivot table. This can vary based on the spreadsheet locale. For example, if you are referencing a cell A4 you might use =GETPIVOTDATA(K1,H1,"division", "east", "subdivision", 4, "Date", text(A4,"yyyy-mm-dd")).
With dates, I find that using either the ISO 8601 format "yyyymmdd" or the text version of the month "dd-mmm-yy" helps to avoid errors if the spreadsheet switches between US and European date formats.
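The text-matching behaviour described in these answers can be pictured with a small Python sketch: the same calendar date produces different strings depending on the format, and only the string that matches the stored label finds the value. The dictionary below is a hypothetical stand-in for the pivot table's Date labels, not how GETPIVOTDATA is implemented.

```python
from datetime import date

# Hypothetical stand-in for the pivot table's Date labels (east, subdivision 4),
# keyed by the text exactly as the dates are displayed in the source data.
units_by_label = {"3/7/2018": 11, "3/14/2018": 12}

d = date(2018, 3, 7)
as_displayed = f"{d.month}/{d.day}/{d.year}"  # "3/7/2018"
as_iso = d.isoformat()                         # "2018-03-07"

print(units_by_label.get(as_displayed))  # 11
print(units_by_label.get(as_iso))        # None: same date, different text
```

This is why formatting the lookup date with the same pattern as the label (for example via TEXT in the sheet) matters more than passing a "real" date value.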