Google Sheets Count Specific Sequences - google-sheets

I have a complex employee schedule spanning one year with 25 individuals in a Google Sheets format. Each individual may have more than one duty on a given day and these are delimited by commas currently. Here is a shortened and simplified sample as I cannot attach the original sheet to respect anonymity of coworkers:
| | 1/1/2022 | 1/2/2022 | 1/3/2022 | 1/4/2022 | 1/5/2022 | 1/6/2022 | 1/7/2022 | 1/8/2022 | 1/9/2022 | 1/10/2022 | 1/11/2022 | 1/12/2022 | 1/13/2022 |
|:---------|----------|----------|----------|----------|----------|----------|----------|----------|----------|-----------|-----------|-----------|-----------|
| Person 1 | | | Office, , | Lab, , | Office, , | Rounder, , | Rounder, Night Call , | Back Up Call, Rounder, | Back Up Call, Rounder, | Rounder, , | Rounder, , | Rounder, , | Office, , |
| Person 2 | Rounder, , | Rounder, , | Rounder, , | Rounder, , | Rounder, , | Office, , | Office, , | , , | , , | Office, , | Office, , | Office, , | Office, , |
| Person 3 | Back Up Call | Night Call, Rounder, | Back Up Call, Rounder, | Office,, | Office, , | Lab, , | Lab, , | , , | , , | Office, , | Lab, , | Office, , | Rounder, , |
| Person 4 | , | , | Vacation, | Vacation, | Office, | Rounder, | Rounder, | Rounder, | Night Call, Rounder | Rounder, | Rounder, | Rounder, | Office, |
| Person 5 | , | , | Vacation, | Back Up Call, | Night Call, , | Back Up Call, | Vacation, | , | , | Vacation, | Vacation, | Vacation, | Vacation, |
To ensure fairness, I need to quantify how often certain events occur. I was helped greatly by member Osm with a solution to Count the same event occurring multiple days in a row. I have worked through that solution and understand it now, but have hit another snag. I need to count the frequency of the following sequences:
-Back Up Call | Back Up Call | Night Call
-Back Up Call | Night Call | Back Up Call
-Night Call | Back Up Call | Back Up Call
I would like the output to look something like this:
| | Backup/Backup/Call | Backup/Call/Backup | Call/Backup/Backup |
|:---------|:---:|:---:|:---:|
| Person 1 | 0 | 0 | 0 |
| Person 2 | 2 | 0 | 0 |
| Person 3 | 1 | 1 | 0 |
| Person 4 | 0 | 2 | 1 |
| Person 5 | 3 | 0 | 0 |
So far I have tried variations of the IF function, but I am new to array formulas and the differences in the syntax they allow are tripping me up. I have also begun working with REGEXREPLACE, representing each of these text strings as a different number and summing them, but this does not let me determine the order in which the shifts occurred.
Does anyone have a solution which might work for this? Thank you very much, in advance.

Get the count
Use this formula to get the count
=ArrayFormula(IF($E$3:$E$509="",,
{BYROW($F3:$509,
LAMBDA(v, COUNTIF(SPLIT(REGEXREPLACE(TEXTJOIN(",", 1, TRIM(REGEXREPLACE(
IFNA(REGEXEXTRACT(
TRANSPOSE(QUERY(TRANSPOSE(LAMBDA(x,y, IF(COLUMN(x)<=MAX(IF(y="",,COLUMN(x))),IF(y="","Empty", y),""))(v,TRIM(REGEXREPLACE(v, ", ,|(,)[^,]*$", "")))), " Where Col1 <> '' ")),
REGEXREPLACE(TRIM(A$2), " \| ", "|")),"NaError"), "[[:punct:]]", ""))),
REGEXREPLACE(A$2, " \| ", ","), "♥"), ","), "=♥"))),
BYROW($F3:$509,
LAMBDA(v, COUNTIF(SPLIT(REGEXREPLACE(TEXTJOIN(",", 1, TRIM(REGEXREPLACE(
IFNA(REGEXEXTRACT(
TRANSPOSE(QUERY(TRANSPOSE(LAMBDA(x,y, IF(COLUMN(x)<=MAX(IF(y="",,COLUMN(x))),IF(y="","Empty", y),""))(v,TRIM(REGEXREPLACE(v, ", ,|(,)[^,]*$", "")))), " Where Col1 <> '' ")),
REGEXREPLACE(TRIM($B$2), " \| ", "|")),"NaError"), "[[:punct:]]", ""))),
REGEXREPLACE(B$2, " \| ", ","), "♥"), ","), "=♥"))),
BYROW($F3:$509,
LAMBDA(v, COUNTIF(SPLIT(REGEXREPLACE(TEXTJOIN(",", 1, TRIM(REGEXREPLACE(
IFNA(REGEXEXTRACT(
TRANSPOSE(QUERY(TRANSPOSE(LAMBDA(x,y, IF(COLUMN(x)<=MAX(IF(y="",,COLUMN(x))),IF(y="","Empty", y),""))(v,TRIM(REGEXREPLACE(v, ", ,|(,)[^,]*$", "")))), " Where Col1 <> '' ")),
REGEXREPLACE(TRIM(C$2), " \| ", "|")),"NaError"), "[[:punct:]]", ""))),
REGEXREPLACE(C$2, " \| ", ","), "♥"), ","), "=♥")))}))
Display table
Assuming the first tab is named Sheet1, paste this formula in another sheet
=ArrayFormula({ "Persons", REGEXREPLACE(Sheet1!A2:C2, " \| ", CHAR(10)); Sheet1!E3:E, Sheet1!A3:C})
Named function
See here how you can create a named function to make the workflow easier, or make a copy of this sheet example and see this demo on how to use it.
Notes
Keep an eye on the ranges $F3:$500 and $E$3:$E$500. If you have fewer or more than 500 rows, adjust accordingly. I set it to 500 to avoid missing references.
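To sanity-check the sheet's counts outside Google Sheets, the counting idea can be sketched in Python. This is a rough sketch of one reading of the approach, not a translation of the formula: it assumes each day is reduced to the first duty from the target sequence found in that cell, and that occurrences are counted without overlap. The names `day_token` and `count_sequence` are made up for this illustration.

```python
import re

def day_token(cell, duties):
    """Return the first duty from `duties` found in the cell text, or None."""
    pattern = "|".join(re.escape(d) for d in duties)
    match = re.search(pattern, cell)
    return match.group(0) if match else None

def count_sequence(cells, sequence):
    """Count non-overlapping runs of `sequence` over consecutive days."""
    duties = list(dict.fromkeys(sequence))  # distinct duties, order kept
    tokens = [day_token(c, duties) for c in cells]
    target = list(sequence)
    n = len(target)
    count = i = 0
    while i + n <= len(tokens):
        if tokens[i:i + n] == target:
            count += 1
            i += n  # skip past the match so runs do not overlap
        else:
            i += 1
    return count

# Person 1 from the sample data (two leading empty days).
person_1 = ["", "", "Office, ,", "Lab, ,", "Office, ,", "Rounder, ,",
            "Rounder, Night Call ,", "Back Up Call, Rounder,",
            "Back Up Call, Rounder,", "Rounder, ,", "Rounder, ,",
            "Rounder, ,", "Office, ,"]

print(count_sequence(person_1, ("Night Call", "Back Up Call", "Back Up Call")))  # 1
```

Whether overlapping runs should count separately is a policy choice; change `i += n` to `i += 1` if they should.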
Resources
| **Streaks of** | | | | | | | | | | | | | | | | | | | | | | | | |
|:------------------------------------------:|:------------------------------------------:|:------------------------------------------:|:-:|----------|--------------|----------------------|------------------------|---------------|---------------|---------------|-----------------------|------------------------|------------------------|------------|------------|------------|------------|--------------|------------|--------------|---|--------------|------------|--------------|
| Back Up Call \| Back Up Call \| Night Call | Back Up Call \| Night Call \| Back Up Call | Night Call \| Back Up Call \| Back Up Call | | 1/1/2022 | 1/2/2022 | 1/3/2022 | 1/4/2022 | 1/5/2022 | 1/6/2022 | 1/7/2022 | 1/8/2022 | 1/9/2022 | 1/10/2022 | 1/11/2022 | 1/12/2022 | 1/13/2022 | | | | | | | | |
| **0[Formula Here]** | 0 | 1 | | Person 1 | | | Office, , | Lab, , | Office, , | Rounder, , | Rounder, Night Call , | Back Up Call, Rounder, | Back Up Call, Rounder, | Rounder, , | Rounder, , | Rounder, , | Office, , | | | | | | | |
| 0 | 0 | 0 | | Person 2 | Rounder, , | Rounder, , | Rounder, , | Rounder, , | Rounder, , | Office, , | Office, , | , , | , , | Office, , | Office, , | Office, , | Office, , | | | | | | | |
| 0 | 1 | 0 | | Person 3 | Back Up Call | Night Call, Rounder, | Back Up Call, Rounder, | Office,, | Office, , | Lab, , | Lab, , | , , | , , | Office, , | Lab, , | Office, , | Rounder, , | | | | | | | |
| 0 | 0 | 0 | | Person 4 | , | , | Vacation, | Vacation, | Office, | Rounder, | Rounder, | Rounder, | Night Call, Rounder | Rounder, | Rounder, | Rounder, | Office, | | | | | | | |
| 0 | 3 | 0 | | Person 5 | , | , | Vacation, | Back Up Call, | Night Call, , | Back Up Call, | Vacation, | , | , | Vacation, | Vacation, | Vacation, | Vacation, | Back Up Call | Night Call | Back Up Call | | Back Up Call | Night Call | Back Up Call |

Related

Create a pivot table where the resulting columns are the product of the original columns and rows?

This might be impossible to do without a ton of expensive scripting, but I would like to run it by the experts in case I'm missing something. It's hard to explain (because it's nonsensical.. i.e. not my choice), so I'll just give a very simplified example.
My source data sheet is like this...
+----------+-------+------+--------+
| Date | Time | Cars | Trucks |
+----------+-------+------+--------+
| 01/01/19 | 08:00 | 2 | 12 |
| 01/01/19 | 12:00 | 4 | 10 |
| 01/01/19 | 20:00 | 6 | 8 |
| 01/02/19 | 08:00 | 8 | 6 |
| 01/02/19 | 12:00 | 10 | 4 |
| 01/02/19 | 20:00 | 12 | 2 |
+----------+-------+------+--------+
.. and I want to have another sheet dynamically display it like ...
+----------+---------------+---------------+---------------+
| | 08:00 | 12:00 | 20:00 |
+----------+------+--------+------+--------+------+--------+
| | Cars | Trucks | Cars | Trucks | Cars | Trucks |
+----------+------+--------+------+--------+------+--------+
| 01/01/19 | 2 | 12 | 4 | 10 | 6 | 8 |
| 01/02/19 | 8 | 6 | 10 | 4 | 12 | 2 |
+----------+------+--------+------+--------+------+--------+
In other words, a column for each time and category combined.
Keep in mind that, in reality, this is a large data set. Also, I have a little bit of flexibility in the headers, in the sense that the two header rows in the output could be one. Something like "Cars 8:00", "Trucks 8:00", "Cars 12:00"... etc
Does anybody know how this could be done with a pivot table? Or some other simple-ish method?
Here's a live version of the same example...
https://docs.google.com/spreadsheets/d/1npQikx3Zwa2QZwDAk8IxyawYw2hkeYpPe9Nh4ImkZAE/edit?usp=sharing
try:
=ARRAYFORMULA({{TEXT(SUBSTITUTE(SPLIT(TRANSPOSE(QUERY(TRANSPOSE(QUERY(
QUERY({Source!A2:B, TRANSPOSE(QUERY(TRANSPOSE(Source!C2:D),,999^99))},
"select Col1,max(Col3) where Col1 is not null group by Col1 pivot Col2"),
"limit 0", 1)),,999^99)), " "), "1899-12-30", ), "hh:mm"), ""}; {"",
SPLIT(REPT(Source!C1&" "&Source!D1&" ", COUNTUNIQUE(Source!B2:B)), " ")};
TRANSPOSE(QUERY(TRANSPOSE(SPLIT(TRANSPOSE(QUERY(TRANSPOSE(QUERY(QUERY({Source!A2:B,
TRANSPOSE(QUERY(TRANSPOSE(Source!C2:D),,999^99))},
"select Col1,max(Col3) where Col1 is not null group by Col1 pivot Col2"),
"offset 1", 0))&" ",,999^99)), " ", 1, 0)), "where "&JOIN(" or ", "Col"&
ROW(INDIRECT("A1:A"&COUNTUNIQUE(Source!A2:A)))&" is not null"), 0))})
spreadsheet demo
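For comparison, the reshaping the question asks for can be sketched in Python. This is only an illustration of the target layout using the single-header variant ("Cars 08:00", "Trucks 08:00", ...), not a description of how the formula above works; the function name `pivot_by_time` is hypothetical.

```python
def pivot_by_time(rows, categories):
    """Turn (date, time, value...) rows into one row per date,
    with one column per time/category combination."""
    dates, times, cells = [], [], {}
    for date, time, *values in rows:
        if date not in dates:
            dates.append(date)
        if time not in times:
            times.append(time)
        for category, value in zip(categories, values):
            cells[(date, time, category)] = value
    header = ["Date"] + [f"{c} {t}" for t in times for c in categories]
    table = [[d] + [cells.get((d, t, c), "") for t in times for c in categories]
             for d in dates]
    return header, table

source = [
    ("01/01/19", "08:00", 2, 12),
    ("01/01/19", "12:00", 4, 10),
    ("01/01/19", "20:00", 6, 8),
    ("01/02/19", "08:00", 8, 6),
    ("01/02/19", "12:00", 10, 4),
    ("01/02/19", "20:00", 12, 2),
]

header, table = pivot_by_time(source, ["Cars", "Trucks"])
print(header)
for line in table:
    print(line)
```

Missing date/time combinations come out as empty strings rather than raising an error, which is an assumption about how gaps should be shown.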

Query a count of unique attributes in a table

I have a table like this:
+--------+-------+--------+-------+
| attr1 | attr2 | attr3 | attr4 |
+--------+-------+--------+-------+
| purple | wine | clear | 10.0 |
| red | wine | solid | 20.0 |
| red | beer | cloudy | 10.0 |
| purple | ale | clear | 34.0 |
| blue | ale | solid | 16.0 |
+--------+-------+--------+-------+
that I want to transform like this:
+--------+-------+-------+-------+-------+
| | attr1 | attr2 | attr3 | attr4 |
+--------+-------+-------+-------+-------+
| purple | 2 | | | |
| red | 2 | | | |
| blue | 1 | | | |
| wine | | 2 | | |
| beer | | 1 | | |
| ale | | 2 | | |
| clear | | | 2 | |
| solid | | | 2 | |
| cloudy | | | 1 | |
| 10.0 | | | | 2 |
| 20.0 | | | | 1 |
| 34.0 | | | | 1 |
| 16.0 | | | | 1 |
+--------+-------+-------+-------+-------+
This pivoted or cross-table will show me the count of each attribute value in their respective columns.
How do I use the Google Query language to display such a cross-table?
Well if the data were laid out in two columns it would be straightforward e.g. for something like this
Attrib Column
Red 1
Red 1
Green 1
Blue 1
Beer 2
Ale 2
Ale 2
you could use a query like
=query(A:B,"select A,count(A) where A<>'' group by A pivot B")
So the problem is to organise OP's data into two columns.
This can be done by what is by now a fairly standard split/join/transpose technique
=ArrayFormula(split(transpose(split(textjoin("|",true,if(A2:D="","",A2:D&" "&column(A2:D))),"|"))," "))
This gives a two-column list: each value next to the number of the column it came from.
You could either run the query on the result of this or combine the two like this
=ArrayFormula(query({"Attrib","Number";split(transpose(split(textjoin("|",true,if(A2:D="","",column(A2:D)&"-"&A2:D&" "&column(A2:D))),"|"))," ")},"Select Col1,count(Col1) group by Col1 pivot Col2"))
I have joined the column number to the attribute e.g. 1-blue so that it sorts into the right order. If you don't like it, you could get rid of it using regexreplace.
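The same two steps (unpivot into value/column pairs, then count per pair) can be sketched in Python to check the expected output. This is an illustrative sketch, not the formula's implementation; `cross_count` is a hypothetical name, and keying each value by its column index plays the role of the `1-blue` prefix.

```python
from collections import Counter

def cross_count(headers, rows):
    """Count each value per column and lay the counts out as a cross-table."""
    # Unpivot: one (column index, value) pair per non-empty cell.
    pairs = [(col, value)
             for row in rows
             for col, value in enumerate(row)
             if value != ""]
    counts = Counter(pairs)
    # Counter keeps first-seen order; a stable sort then groups by column.
    ordered = sorted(counts, key=lambda pair: pair[0])
    table = [[""] + list(headers)]
    for col, value in ordered:
        line = [value] + [""] * len(headers)
        line[col + 1] = counts[(col, value)]
        table.append(line)
    return table

headers = ["attr1", "attr2", "attr3", "attr4"]
data = [
    ["purple", "wine", "clear", "10.0"],
    ["red", "wine", "solid", "20.0"],
    ["red", "beer", "cloudy", "10.0"],
    ["purple", "ale", "clear", "34.0"],
    ["blue", "ale", "solid", "16.0"],
]

for line in cross_count(headers, data):
    print(line)
```

Because the key includes the column index, a value that appears in two different columns gets two separate rows instead of being merged.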
Edit
Slightly shorter formula - I didn't need to put the headers in separately:
=ArrayFormula(query(split(transpose(split(textjoin("|",true,if(A2:D="","",column(A2:D)&"-"&A2:D&" Attr"&column(A2:D))),"|"))," "),
"Select Col1,count(Col1) group by Col1 pivot Col2",0))
Edit 2
I was being a bit thick there, should have used first row of OP's data as attribute labels instead of column numbers
=ArrayFormula(query(split(transpose(split(textjoin("|",true,if(A2:D="","",column(A2:D)&"-"&A2:D&" "&A1:D1)),"|"))," "),
"Select Col1,count(Col1) group by Col1 pivot Col2",0))
Edit 3
Should have chosen a better pair of delimiters
=ArrayFormula(query(split(transpose(split(textjoin("😊",true,if(A2:D="","",column(A2:D)&"-"&A2:D&"🍺"&A1:D1)),"😊")),"🍺"),
"Select Col1,count(Col1) group by Col1 pivot Col2",0))

Return MAX values (Top 1 and 2) from list

I have a Google sheet with data of different players attacks and their corresponding damage.
Sheet1
| Player | Attack | Damage |
|:---------|:-------:|-------:|
| Iron Man | Melee | 50 |
| Iron Man | Missile | 2500 |
| Iron Man | Unibeam | 100 |
| Iron Man | Dash | 125 |
| Superman | Melee | 9000 |
| Superman | Breath | 200 |
| Superman | X-ray | 0 |
| Superman | Laser | 1500 |
| Hulk | Smash | 500 |
| Hulk | Throw | 500 |
| Hulk | Stomp | 500 |
| Hulk | Jump | 325 |
In my second sheet, I want to list each player and display their two best attacks and the corresponding damage. Like this:
Sheet2
| Player | # | Attack | Damage | Comment |
|:---------|:---:|:-------:|-------:|----------:|
| Iron Man | 1 | Missile | 2500 | |
| Iron Man | 2 | Dash | 125 | |
| Superman | 1 | Melee | 9000 | Very nice |
| Superman | 2 | Laser | 1500 | |
| Hulk | 1 | Smash | 500 | |
| Hulk | 2 | Stomp | 500 | |
Update:
Some attacks may produce exactly the same damage. If this happens, I just want to return the first one in alphabetical order.
I am now using the following formulas:
Damage-column: =MAX(FILTER(Sheet1!C:C,Sheet1!A:A=A2))
Attack-column: =JOIN(",",FILTER(Sheet1!B:B,Sheet1!A:A=A2,Sheet1!C:C=C2))
This returns the best attack/damage. For example on row 1/2:
| Player | # | Attack | Damage | Comment |
|:---------|:---:|:-------:|-------:|----------:|
| Iron Man | 1 | Missile | 2500 | |
| Iron Man | 2 | Missile | 2500 | |
But not the second best. How do I modify the formula on the second row to return the second best attack/damage?
Update:
Using =LARGE(Sheet1!C:C;B3) in the second row I can get the second best attack from Sheet1, but it doesn't segment by player.
Update 2:
=ArrayFormula(LARGE(IF(Player="Iron Man",Damage),B2)) (using named ranges) returns both first and second best damage. Still trying to figure out how to return the attack-name.
With the data you provided I was able to produce the expected outcome by using this formula:
=query({ArrayFormula(iferror(SORT(ROW(Sheet1!A2:A),SORT(ROW(Sheet1!A2:A),Sheet1!A2:A,1),1)-MATCH(Sheet1!A2:A,SORT(Sheet1!A2:A),0))) , sort(A2:C, 1, 0, 3, 0)}, "Select Col2, Col1, Col3, Col4 where Col1 < 3 ")
See if you can get this to work on your data.
Sample link
EDIT: Based on the comments below, here's an updated version.
=query({ArrayFormula(iferror(SORT(ROW(Sheet1!A2:A),SORT(ROW(Sheet1!A2:A),Sheet1!A2:A,1),1)-MATCH(Sheet1!A2:A,SORT(Sheet1!A2:A),0))) , Sheet1!A2:C}, "Select Col2, Col1, Col3, Col4 where Col1 < 3 order by Col2")
I used two different formulas:
To get the maximum damage per Player =FILTER(FILTER($D$3:$D$10,$B$3:$B$10 = $F3), FILTER($D$3:$D$10,$B$3:$B$10 = $F3) = LARGE(FILTER(D3:D10,B3:B10 = $F3),$G3))
To get the attack =INDEX($C$3:$C$10, MATCH($F3&$I3,$B$3:$B$10&$D$3:$D$10, 0))
Give JPV credit but use this for getting the answer on another sheet:
=query({ArrayFormula(iferror(SORT(ROW(Sheet1!A2:B),SORT(ROW(Sheet1!A2:A),Sheet1!A2:A,1),1)-MATCH(Sheet1!A2:A,SORT(Sheet1!A2:A),0)-ROW()+1)) , sort(Sheet1!A2:C, 1, 1, 3, 0)}, "Select Col2, Col1, Col3, Col4 where Col1 < 3")
Please, try this formula:
=QUERY({Sheet1!$A$2:$C},
"select Col2, Col3 where Col1 = '"&A2&"' order by Col3 desc limit 1 offset "&B2-1)
It sorts the data by Col3 (the metric) in descending order, limits the result to 1 row, and offsets the result by the rank minus one, so you get the top 1, top 2, and so on. In my sample, Superman has two equal values of 9000; the formula returns "Melee" first because it comes first in the table, but you can add a secondary sort to the query text to get alphabetical order.
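The ranking rule from the question (highest damage first, ties broken alphabetically by attack name, top two per player) can be sketched in Python to verify the expected Sheet2 output. This is a sketch of the rule itself, not of any of the formulas above; `top_attacks` is a hypothetical name.

```python
def top_attacks(rows, n=2):
    """Return (player, rank, attack, damage) for each player's n best attacks."""
    by_player = {}
    for player, attack, damage in rows:
        by_player.setdefault(player, []).append((attack, damage))
    result = []
    for player, attacks in by_player.items():
        # Highest damage first; equal damage falls back to attack name A-Z.
        ranked = sorted(attacks, key=lambda a: (-a[1], a[0]))
        for rank, (attack, damage) in enumerate(ranked[:n], start=1):
            result.append((player, rank, attack, damage))
    return result

sheet1 = [
    ("Iron Man", "Melee", 50), ("Iron Man", "Missile", 2500),
    ("Iron Man", "Unibeam", 100), ("Iron Man", "Dash", 125),
    ("Superman", "Melee", 9000), ("Superman", "Breath", 200),
    ("Superman", "X-ray", 0), ("Superman", "Laser", 1500),
    ("Hulk", "Smash", 500), ("Hulk", "Throw", 500),
    ("Hulk", "Stomp", 500), ("Hulk", "Jump", 325),
]

for line in top_attacks(sheet1):
    print(line)
```

For Hulk, the three attacks tied at 500 are ordered Smash, Stomp, Throw, so the top two are Smash and Stomp, matching the expected table.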

List of the most frequently occurring words in the row

I'm looking for a way to show (in a Google spreadsheet) the most frequently occurring word in a row. If several words tie for the highest count, I want to display all of them, separated by a semicolon.
Explanation:
For example, I want to fill the last column with values as below:
+---+------+------+------+------+------+-------------------+
| | A | B | C | D | E | F |
+---+------+------+------+------+------+-------------------+
| 1 | Col1 | Col2 | Col3 | Col4 | Col5 | Expected response |
| 2 | A | A | C | D | E | A |
| 3 | A | A | B | B | B | B |
| 4 | A | A | B | B | E | A, B |
| 5 | A | B | C | D | E | A, B, C, D, E |
+---+------+------+------+------+------+-------------------+
Here's what I have achieved (formula for cell F2):
=INDEX(A2:E2; MODE(MATCH(A2:E2; A2:E2; 0)))
but it doesn't work for 4th and 5th row as I expect.
This works in Office 365 Excel, but probably will not in Excel online, as it is an array formula.
=TEXTJOIN(", ",TRUE,INDEX(A2:E2,,N(IF({1},MODE.MULT(IF(((MATCH(A2:E2,A2:E2,0)=COLUMN(A2:E2))*(COUNTIF(A2:E2,A2:E2)=MAX(COUNTIF(A2:E2,A2:E2)))),COLUMN(A2:E2)*{1;1}))))))
Being an array formula, it needs to be confirmed with Ctrl-Shift-Enter instead of Enter when exiting edit mode. If done correctly, Excel will put {} around the formula.
EDIT:
To do it with Google Sheets as you now want:
=join(", ",filter(A2:E2,column(A2:E2)=match(A2:E2,A2:E2,0),countif(A2:E2,A2:E2)=max(countif(A2:E2,A2:E2))))
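The logic of that filter (keep the first occurrence of each value, and only those whose count equals the maximum count in the row) can be sketched in Python to check the expected column. The function name `most_frequent` is hypothetical, and returning an empty string for an empty row is an assumption of this sketch.

```python
from collections import Counter

def most_frequent(row, sep=", "):
    """Join the values that share the highest count, in first-seen order."""
    counts = Counter(row)
    if not counts:
        return ""
    top = max(counts.values())
    # Counter keeps first-seen order, so ties come out left to right.
    return sep.join(value for value, n in counts.items() if n == top)

print(most_frequent(["A", "A", "C", "D", "E"]))  # A
print(most_frequent(["A", "A", "B", "B", "E"]))  # A, B
```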
F2:
=JOIN(",",SORTN(TRANSPOSE(A2:E2),1,1,ARRAY_CONSTRAIN(FREQUENCY(MATCH(A2:E2,A2:E2,0),COLUMN(A2:E2)),COUNTA(A2:E2),1),0))
See syntax # https://support.google.com/docs/table/25273

COUNTA with FILTER using multiple conditions

I want to count the number of companies (Col B) that have a status of 'On System' in Col A, grouped by their postcode area (e.g., SW10, SW11 etc)
As an example, the figures in the 'On System' column reflect what the formula should result in.
A | B | C | D | E | F | G |
----------|---------|----------|---|---|----------|-----------|
Status | Name | Postcode | | | Area | On System |
----------|---------|----------|---|---|----------|-----------|
On System | ABC Ltd | SW10 4ED | | | SW10 | 1 |
----------|---------|----------|---|---|----------|-----------|
On System | XYZ Ltd | SW11 5RF | | | SW11 | 2 |
----------|---------|----------|---|---|----------|-----------|
On System | GBH Ltd | SW11 5GR | | | SW12 | 0 |
----------|---------|----------|---|---|----------|-----------|
Fresh | DEF Ltd | SW11 7GG | | | SW13 | 0 |
----------|---------|----------|---|---|----------|-----------|
Fresh | GHI Ltd | SW12 5F5 | | | SW14 | 0 |
----------|---------|----------|---|---|----------|-----------|
I've used the following formula (the below example counts companies in SW10 that are 'On System'), but with no success.
=COUNTA(IFERROR(FILTER(C:C, C:C=F3&" *", A:A="On System" )))
I'm under the impression that IFERROR removes empty results or something similar. Without it, I just get a value of 1, even if there are no SW10 rows with an On System status.
Any ideas?
To count the 'on system' with postal code SW10 try:
=sumproduct(A:A="On System", regexmatch(C:C, "SW10"))
Of course you can replace the strings with cell references.
Or -shorter- use COUNTIFS() with a wildcard (*)
=countifs(A:A, "On System", C:C, "SW10*")
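To check the expected 'On System' column, the same two-condition count can be sketched in Python. This is a sketch under assumptions: `count_on_system` is a hypothetical name, the comparison is case-sensitive here, and the prefix includes the trailing space (as the question's `F3&" *"` intended) so that an area like SW1 would not also match SW10.

```python
def count_on_system(rows, area, status="On System"):
    """Count rows with the given status whose postcode starts with the area."""
    prefix = area + " "  # the space keeps "SW1" from matching "SW10 4ED"
    return sum(1 for row_status, _name, postcode in rows
               if row_status == status and postcode.startswith(prefix))

companies = [
    ("On System", "ABC Ltd", "SW10 4ED"),
    ("On System", "XYZ Ltd", "SW11 5RF"),
    ("On System", "GBH Ltd", "SW11 5GR"),
    ("Fresh", "DEF Ltd", "SW11 7GG"),
    ("Fresh", "GHI Ltd", "SW12 5F5"),
]

for area in ["SW10", "SW11", "SW12", "SW13", "SW14"]:
    print(area, count_on_system(companies, area))
```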

Resources