Create array formula that calculates the top 20 values - google-sheets

I have a spreadsheet for calculating attendance statistics. Column I has the names of each of the members, and column H calculates the percentage of practices each member has attended. Here is the list of functions I use to calculate the top 20 people:
J2: =INDEX(I$2:I$23,MATCH(LARGE(H$2:H$23,1),H$2:H$23,0))
J3: =INDEX(I$2:I$23,MATCH(LARGE(H$2:H$23,2),H$2:H$23,0))
J4: =INDEX(I$2:I$23,MATCH(LARGE(H$2:H$23,3),H$2:H$23,0))
J5: =INDEX(I$2:I$23,MATCH(LARGE(H$2:H$23,4),H$2:H$23,0))
...
However, each time a new member joins the team or an old member quits, I have to edit the range in all 20 cells. This takes a long time.
Is there a way I can simplify this into one simple ARRAYFORMULA?

An alternative Query:
=query(H:I,"select I order by H desc limit 20")

Never mind, I solved my own problem! If anyone else out there is struggling with this as I was, use this:
=query(H2:I23, " select * where I<>'' order by H desc ")
It creates two columns: the first contains the percentages and the second contains the names, sorted from highest to lowest percentage. If you don't want the percentages, shrink the first column as small as it will go (or replace * with I in the query so that only the names are returned).
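For anyone who wants to check the logic outside the sheet, here is a minimal Python sketch of what the two queries above do together: skip rows with a blank name, sort by percentage descending, and keep the first 20. The function name top_members and the sample data are hypothetical and not taken from the spreadsheet.

```python
def top_members(rows, limit=20):
    """rows: list of (percentage, name) pairs, mirroring columns H and I."""
    # where I <> '' : skip rows that have no name
    named = [(pct, name) for pct, name in rows if name != ""]
    # order by H desc : highest attendance first (sorted() is stable, so ties keep sheet order)
    ranked = sorted(named, key=lambda r: r[0], reverse=True)
    # limit 20 : keep only the first `limit` names
    return [name for _, name in ranked[:limit]]


# Hypothetical sample data
sample = [(0.9, "Ann"), (0.5, "Bob"), (0.0, ""), (0.75, "Cy")]
print(top_members(sample, limit=2))  # ['Ann', 'Cy']
```

Because the filter runs over whatever rows are present, adding or removing a member needs no change to the logic, which is the same reason the open-ended range H:I works in the QUERY version.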

Related

Techniques to accommodate new entries in google sheets

As you can see, I transpose codes into unique column headings so that debits and credits can be analysed and summed. The sums are transposed in another sheet to create a summary profit/loss account. I need help replicating the sum formula in column I so that it covers any newly added unique codes, and advice on whether (and how) I should use ARRAYFORMULA for the individual cell output.
EDIT
Actual output looks like this:
My problem is how to automatically accommodate new entries/codes in the totals row and the main body of cells. The data belongs to a residents' committee, so I can only show anonymised data as an image.
EDIT 2
Actual input is imported from bank records, then coded:
Query is pretty good for the SUM part.
Starting in column I, you can do:
=ArrayFormula(INDEX(QUERY(
0+OFFSET(I4,0,0,ROWS(F6:F),COUNTA(UNIQUE(F4:F))),
"select "&
JOIN(
",",
"sum(Col"&SEQUENCE(COUNTA(UNIQUE(F4:F)))&")"
)
),2))
The 0+ or the VALUE in the second one (they both do the same thing here) transforms the data cells to default to 0 if blank, otherwise the query fails. This also lets us refer to the columns by sequence number, which is what we do in the second argument. We build the query into something that looks like select sum(Col1),sum(Col2),...,sum(ColN). Since this gives us a header by default, we could relabel everything in the query statement, but that gives too much extra code, so the easier thing to do is use INDEX to select the sums.
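As a rough illustration of the string-building step only, this Python sketch produces the same kind of query text that the JOIN and SEQUENCE calls assemble. The function name build_sum_query is made up for the example, and the column count is passed in directly instead of being derived from COUNTA(UNIQUE(F4:F)).

```python
def build_sum_query(n_columns):
    """Return 'select sum(Col1),sum(Col2),...,sum(ColN)' for n_columns columns."""
    # SEQUENCE(n) -> 1..n; "sum(Col" & k & ")" -> one term per column
    terms = [f"sum(Col{k})" for k in range(1, n_columns + 1)]
    # JOIN(",", ...) -> comma-separated list appended to "select "
    return "select " + ",".join(terms)


print(build_sum_query(3))  # select sum(Col1),sum(Col2),sum(Col3)
```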
The EQ part is fairly straightforward to Arrayify. Starting in I4:
=ArrayFormula(
(FILTER(F4:F,F4:F<>"")=FILTER(I2:2,I2:2<>""))*
IF(
Array_constrain(G4:G,COUNTA(FILTER(F4:F,F4:F<>"")),1),
G4:G,
-H4:H
)
)
The FILTERs just filter out the blank cells, and the Array_Constrain sizes the G column to the same size as the filtered F column.

Query particular row + remove X columns + and sum the rest in one formula?

I have a CSV file that I'm pulling from a database. It's in an awkward layout so I need to reorganise it and display the result in a separate sheet.
Here is a dummy example of the data structure I get.
https://docs.google.com/spreadsheets/d/1sTfjr-rd0vMIeb3qgBaq9SC8felJ1Pb4Vk_fMNXQKQg/edit?usp=sharing
It looks like that. The database grows every day by date, and sometimes by country, so I need to account for that in my formula.
I need to pull the data for each country and display it by date.
I don't need the data from columns A, C and D. And when a country has multiple states, I need to sum them into one column.
It should look like this and keep growing downwards. I'm going to use this table for a chart.
What I've tried so far
=TRANSPOSE(QUERY(IMPORTRANGE("url_to_a_separate_sheet_where_I_importing_a_row_csv_file", "CSV-source-sheet!A1:500"), "SELECT * WHERE Col2='Germany'"))
This works, kinda, but it pulls in unnecessary columns and I can't figure out how to sum countries with multiple states. When I add select sum(*) I get a long error. I assume that's because of the unnecessary columns, which the formula can't sum, and I don't know how to omit them. I'm stuck.
I tried OFFSET and skipping, with no luck. Any ideas?
try:
=ARRAYFORMULA(TRANSPOSE(QUERY({Sheet2!B:B, Sheet2!E:BE},
"select Col1,"&TEXTJOIN(",", 1,
"sum(Col"&ROW(INDIRECT("Sheet2!A2:A"&COUNTA(Sheet2!1:1)-5))&")")&"
where Col1 is not null
group by Col1
label Col1'Date'", 1)))

Google Sheets sum rows with the same first cell value grouped by first row value

I have dynamic data for an online shop with sales by product, by week split into columns:
I want to create a header row of the unique weeks and summarise the total sales by product and week in a dynamic table, using QUERY and/or an array formula if possible. However, array formulas and queries seem to be designed for data arranged exclusively in columns, so maybe I need to transpose it in some way? Any ideas?
you can do:
=QUERY(B2:E, "select B,C+D,E label C+D''", 0)
or:
=ARRAYFORMULA({IF(B99=C99, B100:B+C100:C, B100:B),
IF(C99=D99, C100:C+D100:D, C100:C),
IF(D99=E99, D100:D+E100:E, D100:D),
IF(E99=F99, E100:E+F100:F, E100:E)})
Okay, so I took my own advice and did a transpose to get the data into a state that Query can work with and then re-transposed it back to get the format I wanted. However, it's not exactly dynamic as I'd have to edit the formula if we added or took away any products.
=Transpose(query(transpose(A2:E13),"Select Col1, Sum(Col2), Sum (Col3), Sum(Col4), Sum(Col5), Sum(Col6) ,Sum(Col7), Sum(Col8), Sum(Col9), Sum(Col10), Sum(Col11), Sum(Col12) group by Col1",1))
Which produces a nice tabular result:
Any ideas how to make the formula more dynamic?
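One way to think about a more dynamic version is that nothing in the grouping logic depends on the number of products. This Python sketch models the idea, assuming a layout where the header row holds one week label per sales column and each data row holds a product name followed by its sales. That layout, the function name sum_by_week and the sample figures are all assumptions, since the original screenshot is not available.

```python
def sum_by_week(weeks, rows):
    """weeks: header labels, one per sales column; rows: list of (product, [sales...])."""
    # Unique weeks in first-seen order (dict keys preserve insertion order)
    unique_weeks = list(dict.fromkeys(weeks))
    result = []
    for product, sales in rows:
        totals = dict.fromkeys(unique_weeks, 0)
        # Columns that share a week label are added together
        for week, amount in zip(weeks, sales):
            totals[week] += amount
        result.append((product, [totals[w] for w in unique_weeks]))
    return unique_weeks, result


# Hypothetical sample: two columns belong to week W2
weeks = ["W1", "W2", "W2", "W3"]
rows = [("Apples", [1, 2, 3, 4]), ("Pears", [5, 0, 1, 2])]
print(sum_by_week(weeks, rows))
# (['W1', 'W2', 'W3'], [('Apples', [1, 5, 4]), ('Pears', [5, 1, 2])])
```

Note that QUERY's group by returns the groups in sorted order, whereas this sketch keeps the weeks in the order they first appear.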

Reference Specific Row in Named Range within another Named Range

I'm writing a spreadsheet to keep track of a small business' financials. They operate a few Rooms for rent, and the structure of the document is made so that each sheet holds a year's worth of booking for all the rooms.
Essentially, each row defines a specific date, while each room spans a few columns (they don't just want to track whether or not a room is booked, but also record client names and other remarks), one of which holds the calculated daily income (some factors alter the daily rate each room will generate).
So this is all fine and dandy, and I've created named ranges for each month of the year, and for each room.
For example, rows 6:36 will represent the month of January, while columns C:I will represent Room 1. Room 2 will span J:P and so forth.
Now, in another sheet, I wanted to make a dashboard which lists the earning for each room, per month. It's a very simple table with 12 rows (one for each month) and 10 columns (1 for each room) where I planned to sum up all the earnings.
So my issue is that I can't find a way to retrieve a specific column of a named range for a room ('vertical named range'), which is also limited in a named range for a month ('horizontal named range'). I had read about using ARRAYFORMULA(INDEX(named_range, ,wished_column)) but that only works for a single named range. My knowledge of these two functions being non-existent, I didn't manage to extend it to a 2-named-range version...
(I mean I did try something along the lines of ARRAYFORMULA(INDEX(January, , INDEX(Room1, , 3))) but that didn't work)
So because there isn't a one-to-one relation from the Dashboard cells to the Rooms cells, my current only solution is to manually reference everything, which you'll understand is inefficient and time-consuming...
My question, in fine, is: how can I retrieve the range that results from the intersection of 2 (or more) named ranges? Once I have that resulting range, I know it will be very easy to use INDEX().
Define a named range Base as
A:Z
Define a range named Horizontal as
6:36
Define a range named Vertical as
C:I
Then the intersection of the vertical and horizontal ranges is given by:
index(Base,row(Horizontal),COLUMN(Vertical)):index(Base,row(Horizontal)+rows(Horizontal)-1,COLUMN(Vertical)+columns(Vertical)-1)
This can be verified by using it in a function e.g.
=countblank(index(Base,row(Horizontal),COLUMN(Vertical)):index(Base,row(Horizontal)+rows(Horizontal)-1,COLUMN(Vertical)+columns(Vertical)-1))
gives the result 7 * 31 = 217 in my sheet because I haven't filled in any of the cells.
The Offset version of this would be:
=countblank(offset(A1,row(Horizontal)-1,COLUMN(Vertical)-1):offset(A1,row(Horizontal)+rows(Horizontal)-2,COLUMN(Vertical)+columns(Vertical)-2))
or more simply:
=countblank(offset(A1,row(Horizontal)-1,COLUMN(Vertical)-1,rows(Horizontal),COLUMNS(Vertical)))
So this works well in OP's case where you have two fully overlapping ranges like this:
Partial Overlap
Suppose you have two partially overlapping ranges like this:
You can use a variation on the standard overlap formula (This is one of the early references to it as used with a date range)
max(start1,start2) to min(end1,end2)
So the previous formula becomes
=countblank(index(Base,max(row(index(Partial1,1,1)),row(index(Partial2,1,1))),max(COLUMN(index(Partial1,1,1)),column(index(Partial2,1,1)))):
index(Base,min(row(index(Partial1,1,1))+rows(Partial1)-1,row(index(Partial2,1,1))+rows(Partial2)-1),min(COLUMN(index(Partial1,1,1))+columns(Partial1)-1,column(index(Partial2,1,1))+columns(Partial2)-1)))
and the offset version is
=countblank(offset(A1,max(row(offset(Partial1,0,0)),row(offset(Partial2,0,0)))-1,max(COLUMN(offset(Partial1,0,0)),column(offset(Partial2,0,0)))-1):
offset(A1,min(row(offset(Partial1,0,0))+rows(Partial1)-2,row(offset(Partial2,0,0))+rows(Partial2)-2),min(COLUMN(offset(Partial1,0,0))+columns(Partial1)-2,column(offset(Partial2,0,0))+columns(Partial2)-2)))
I have tested this on ranges C2:F10 and D3:G11 which gives the result 24 as expected.
However, if there is no overlap, this can still give a non-zero result, so a suitable test needs adding to the formula:
=if(and(max(row(index(Partial1,1,1)),row(index(Partial2,1,1)))<=min(row(index(Partial1,1,1))+rows(Partial1)-1,row(index(Partial2,1,1))+rows(Partial2)-1),
max(column(index(Partial1,1,1)),column(index(Partial2,1,1)))<=min(column(index(Partial1,1,1))+columns(Partial1)-1,column(index(Partial2,1,1))+columns(Partial2)-1)),"Overlap","No overlap")
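The max(start1,start2) to min(end1,end2) rule and the overlap test can be checked independently of the sheet. This is a minimal Python sketch, assuming each range is written as a (top_row, left_col, bottom_row, right_col) tuple with 1-based, inclusive coordinates. The function names intersect and cell_count are hypothetical.

```python
def intersect(r1, r2):
    """Each range is (top_row, left_col, bottom_row, right_col), 1-based and inclusive."""
    top = max(r1[0], r2[0])      # max of the two start rows
    left = max(r1[1], r2[1])     # max of the two start columns
    bottom = min(r1[2], r2[2])   # min of the two end rows
    right = min(r1[3], r2[3])    # min of the two end columns
    # Same test as the IF(AND(...)) formula: start must not pass end on either axis
    if top > bottom or left > right:
        return None
    return (top, left, bottom, right)


def cell_count(r):
    """Number of cells in a range tuple."""
    return (r[2] - r[0] + 1) * (r[3] - r[1] + 1)


# C2:F10 and D3:G11 overlap in D3:F10
overlap = intersect((2, 3, 10, 6), (3, 4, 11, 7))
print(overlap, cell_count(overlap))  # (3, 4, 10, 6) 24
```

Returning None for the no-overlap case plays the same role as the "No overlap" branch of the formula. Without that check, a negative height times a negative width could still come out as a positive cell count.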
Perhaps the best approach in Google Sheets is to go back to the full version of the Offset call OFFSET(cell_reference, offset_rows, offset_columns, [height], [width]) . Although this is rather long, it will return a #Value! error if there is no overlap:
=Countblank(offset(A1,
max(row(offset(Partial1,0,0)),row(offset(Partial2,0,0)))-1,
max(COLUMN(offset(Partial1,0,0)),column(offset(Partial2,0,0)))-1,
min(row(offset(Partial1,0,0))+rows(Partial1),row(offset(Partial2,0,0))+rows(Partial2))-max(row(offset(Partial1,0,0)),row(offset(Partial2,0,0))),
min(COLUMN(offset(Partial1,0,0))+columns(Partial1),column(offset(Partial2,0,0))+columns(Partial2))-max(COLUMN(offset(Partial1,0,0)),column(offset(Partial2,0,0)))
))
Notes
Why did I have to introduce some more indexes (indices?) in the second formula to make it work? Because if you use the row function with a range in an array context, you get an array of row numbers which isn't what I want. As it happens, in the first formula you are not using it in an array context, so you just get the first row and column of the given range which is fine. In the second formula, Max and Min try to evaluate all the rows in the array, which gives the wrong answer, so I have used Index(range,1,1) to force it to look only at the top left hand corner of each range. The other thing is that both index and offset return a reference, so it is valid to use the construct Index(...):Index(...) or Offset(...):Offset(...) to define a new range.
I have also tested the above in Excel (where as mentioned the Index version would be preferable). In this case Base would be set to $1:$1048576.
Although in Excel you have the Intersect Operator (single space) so it's not necessary to use an Index or Offset formula at all e.g. the first example above would simply be:
=COUNTBLANK(Vertical Horizontal)
and if there is no overlap the formula returns a #NULL! error.
"I've created named ranges for each month of the year, and for each
room. For example, rows 6:36 will represent the month of January,
while columns C:I will represent Room 1. Room 2 will span J:P and so
forth."
What I suggest is that if "January" is defined for columns C to whatever (the last column of the last room), then that's all you need.
You haven't shown us the layout of the dashboard. But let's assume that at the very least you're interested in the income generated by each room.
=query({January},"select sum(Col3) label sum(Col3)'' ")
In this image, the range called "January" is highlighted. Note that it does NOT include the header. Note also that it can be many columns wide; in this example, I've just made up a few columns, but your range should cover all the columns for rooms 1 to n.
Syntax: QUERY(data, query, [headers])
Data: This formula queries the range called "January". That range can be on the same sheet or on another sheet (such as your Dashboard). Reminder: in this screenshot, my version of "January" is highlighted.
Query to sum the income earned: "select sum(Col3) label sum(Col3)'' "
Query to count Number of People: "select count(Col2) label count(Col2)'' "
Col2 & Col4 = Number of People for Room#1 and Room#2 respectively.
Col3 & Col5 = Income for Room#1 and Room#2 respectively.
[headers]: You can ignore them.
This formula delivers just the value of the query; even though it includes a "label", the label will not print.
Modify and adapt these formulae to create the other information required for your Dashboard.

Transpose column and add separator column

I'm trying to transpose a column from one sheet into a row of another sheet, with a new blank column separating each result:
=TRANSPOSE(Sheet1!A1:A30)
What's the easiest way to achieve this without having to add a blank row between each of the rows in the original sheet?
Thanks
I think this may be the easiest way
split(textjoin("||",,Sheet1!A1:A30),"|",,false)
This answer is based on Tom's answer:
split(textjoin("||",,Sheet1!A1:A30),"|",,false)
I like the solution because it is simple.
More general question would be:
How to add N extra separator columns with a formula
Here's the formula:
=TRANSPOSE(SPLIT(JOIN("|"&rept("|",1),A1:A30),"|",1,0))
where
"|" is a rare char you do not have in your dataset
rept("|",1) is to get N separator columns. Change 1 to N.
The only problem with this formula is the JOIN function's limit of 50000 characters.
The final function won't give that error with a large dataset.
Please try:
=TRANSPOSE(ArrayFormula(TRIM(SPLIT(QUERY(A1:A30&"|"&rept("|",1),,2^99),"|",1,0))))
QUERY replaces JOIN and has no such limit.
TRIM is needed because QUERY adds spaces at the end of each line.
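To make the effect of the N separators concrete, here is a small Python sketch of the list that the JOIN-based formula ends up with after splitting: each value followed by N empty cells, except after the last one. The function name with_separators is made up for the example, and it models only the resulting order of cells, not the pipe-joining trick itself.

```python
def with_separators(values, n=1):
    """Return values with n empty strings inserted between consecutive items."""
    out = []
    for i, v in enumerate(values):
        out.append(v)
        # JOIN only places the delimiter between items, so nothing follows the last one
        if i < len(values) - 1:
            out.extend([""] * n)
    return out


print(with_separators(["a", "b", "c"]))  # ['a', '', 'b', '', 'c']
```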
Going further in depth on the issue above (question):
What would you write if you want a certain text for each new column?
E.g. I have several datasets (columns) with 1) drilling resistance and 2) associated depths, all of which I will extract from another sheet into this new one.
I have a list of borehole names which I will transpose and insert as text above the columns with 1).
Then I want to add a column for each borehole with the height (2). How do I then automate adding text for each new column reading "height (m.a.s.l.) boreholenumber", where the latter could just be picked from the borehole name list?
And by the way, the SPLIT function doesn't exist in my Excel program :( How do I get it?

Resources