Google Sheets counting with multiple criteria - google-sheets

I have a Google Sheet where I have several columns of data, and I want to get a count of how many rows match two criteria, where one of the criteria is matching either one of two values.
Here’s an example of the data I have:
What I want to do is things like: get a count of how many rows have “Yes” in column A, and either “A” or “C” in column B. Or how many rows are “No” and either “I” or “X”.
I’ve come up with this:
=COUNTIFS($A1:$A21,"Yes",B1:B21,"="&"A")+COUNTIFS($A1:$A21,"Yes",B1:B21,"="&"C")
…but that feels clunky, and makes it harder to update if I decide to shift columns around. Not to mention really bad if I want to combine multiple bits of information into a single cell, such as this:
=(COUNTIFS($A1:$A21,"Yes",B1:B21,"="&"A")+COUNTIFS($A1:$A21,"Yes",B1:B21,"="&"C")) & "/" & (COUNTIFS($A1:$A21,"No",B1:B21,"="&"A")+COUNTIFS($A1:$A21,"No",B1:B21,"="&"C"))
I mean, that’s just awful. It works, but it’s awful.
I’ve tried using OR() without success, and also tried curly-bracket syntax without success. I fully acknowledge I may have done both of them wrong, but if so, darned if I can figure out what I missed. Any Sheets mavens willing to take pity on an old dude and show me a much smarter way to do this?

Shortest one so far:
=SUMPRODUCT(REGEXMATCH(A1:A8&B1:B8,"(?i)Yes(A|C)"))
CONCATENATE both columns using & and use REGEX on the result.
(?i) Case insensitive
yes(A|C) yes followed by A or C
SUM up all the trues.
For a complex condition,
=ARRAYFORMULA(SUM(--REGEXMATCH(A1:A8&B1:B8,"(?i)yes(A|C)"))&"/"&SUM(--REGEXMATCH(A1:A8&B1:B8,"(?i)no(I|X)")))
Note that there should be no trailing spaces following yes/No and no leading spaces before A or C etc. If there are, use TRIM.
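For anyone who wants to sanity-check the concatenate-then-regex idea outside Sheets, here is a minimal Python sketch of the same logic. The sample rows and the helper name count_matches are made up for illustration; they are not from the question's sheet.

```python
import re

# Hypothetical sample rows: (column A, column B)
rows = [
    ("Yes", "A"),
    ("Yes", "B"),
    ("yes", "C"),
    ("No", "I"),
    ("No", "A"),
    ("No", "X"),
]

def count_matches(rows, pattern):
    """Count rows whose concatenated A+B text matches the pattern."""
    regex = re.compile(pattern)
    # strip() plays the role of TRIM; search() is a partial match, like REGEXMATCH
    return sum(1 for a, b in rows if regex.search(a.strip() + b.strip()))

yes_count = count_matches(rows, r"(?i)yes(A|C)")
no_count = count_matches(rows, r"(?i)no(I|X)")
summary = f"{yes_count}/{no_count}"  # same "x/y" shape as the combined formula
print(summary)
```

Because the match is partial, a column B value longer than one letter (say "AB") would also count, so this only behaves like the COUNTIFS version when column B holds single letters.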

I would use query with variables. In F1 put:
=query(A:B,"select count(A) where A ='"&C2&"' AND B='"&D2&"' OR A ='"&C2&"' AND B='"&E2&"'")
In C2 enter "Yes" or "No" and in D2 and E2 enter the B letters (or leave blank). Enter whatever headers you want in C1, D1, and E1.

I'm not sure if this is exactly what you're looking for, but you could simplify it a bit by just creating a "pairings" list.
E2: =COUNTIFS(A:A,$C2,B:B,$D2)
E3: =COUNTIFS(A:A,$C3,B:B,$D3)
...
E6: =SUM(E2:E5)
The benefit is that it's flexible - you can add as many pairings as you want later on. Also, no complexity of array formulas.

For complex logic, use more powerful commands like query or filter.
count the rows with “Yes” in column A, and either “A” or “C” in column B.
becomes
=query(A:C, "select count(A) where A='Yes' and (B='A' or B='C')")
or
=query(A:C, "select count(A) where A='Yes' and (B='A' or B='C') label count(A) ''")
if you don't want to have a column header such as "count".
This is pretty much stating the goal in English (well, SQL version of it).

Simplify:
Create a column C that is TRUE if B is A or C, FALSE otherwise.
Create a column D that is TRUE if B is I or X, FALSE otherwise.
Create a column E that is TRUE if A is "yes" and C is TRUE.
Create a column F that is TRUE if A is "no" and D is TRUE
Create a column G that is column E or column F.
Sum up the values in column G.
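The helper-column steps above can be sketched in Python, one list per helper column. The sample data is hypothetical, and the lower-casing is an assumption to mimic the case-insensitive comparison Sheets does.

```python
# Hypothetical sample rows: (column A, column B)
rows = [
    ("Yes", "A"),
    ("Yes", "B"),
    ("No", "I"),
    ("No", "C"),
    ("Yes", "C"),
    ("No", "X"),
]

# Column C: B is A or C.  Column D: B is I or X.
col_c = [b in ("A", "C") for _, b in rows]
col_d = [b in ("I", "X") for _, b in rows]

# Column E: A is "yes" and C is TRUE.  Column F: A is "no" and D is TRUE.
col_e = [a.lower() == "yes" and c for (a, _), c in zip(rows, col_c)]
col_f = [a.lower() == "no" and d for (a, _), d in zip(rows, col_d)]

# Column G: E or F, then sum it up (True counts as 1).
col_g = [e or f for e, f in zip(col_e, col_f)]
total = sum(col_g)
print(total)
```

Summing col_e or col_f on its own gives the two separate counts the question asks for.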

Related

Unnest two columns in google sheet

I have a table like this one here (basically it's data from a Google Form with multiple-choice answers in columns A and B and non-multiple-choice data in column C). I need a separate row for each multiple-choice answer.
Column A | Column B | Email
A,B | XX,YY | 1#gmail.com
A,C | FF,DD | 2#gmail.com
I tried to un-nest the first column and keep the remaining columns like this
I tried several approaches I found with flatten and split with array formulas but I don't know where to start really.
Any help or hint would be much appreciated!
You can use the split function on the column A and after that, use the index function. Considering the table, you can use:
=index(split(A2,","),1,1)
The split function separates the text at the given delimiter, returning an array with 1 row and 2 columns; the index function then returns the first row and first column of that array. To return the second element from column A, just change it to
=index(split(A2,","),1,2)
I think there's no easy solution for this. You're asking for as many combinations of elements as multiple-choice selections have been made. Every function in Google Sheets has its strengths and its limits on how many elements it can express. One very useful function here is REDUCE. With REDUCE, and with the comma-separated elements counted using COUNTA, you can build this formula:
=QUERY(REDUCE({"Col A","Col B","Email"},SEQUENCE(COUNTA(A2:A)),LAMBDA(z,c,{z;LAMBDA(ax,bx,
REDUCE({"","",""},SEQUENCE(ax),LAMBDA(w,a,
{w;
REDUCE({"","",""},SEQUENCE(bx),LAMBDA(y,b,
{y;INDEX(SPLIT(INDEX(A2:A,c),","),,a),INDEX(SPLIT(INDEX(B2:B,c),","),,b),INDEX(C2:C,c)}
))})))
(COUNTA(SPLIT(INDEX(A2:A,c),",")),COUNTA(SPLIT(INDEX(B2:B,c),",")))})),
"Where Col1 is not null",1)
Since I had to use an "initial value" in every REDUCE, I then used QUERY to filter out the empty rows.
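As I read the nested REDUCE formula, it emits one row for every pairing of a column A value with a column B value, carrying the email along. A minimal Python sketch of that reading, using the sample table from the question (the function name unnest is hypothetical, and the asker's exact desired output was only shown in an image, so treat this as an assumption):

```python
from itertools import product

# Sample rows from the question: (Column A, Column B, Email)
rows = [
    ("A,B", "XX,YY", "1#gmail.com"),
    ("A,C", "FF,DD", "2#gmail.com"),
]

def unnest(rows, sep=","):
    """Return one output row per (A value, B value) pair of each input row."""
    out = []
    for col_a, col_b, email in rows:
        a_values = col_a.split(sep)  # like SPLIT(INDEX(A2:A,c),",")
        b_values = col_b.split(sep)  # like SPLIT(INDEX(B2:B,c),",")
        # Outer loop over A values, inner loop over B values, as in the formula
        for a, b in product(a_values, b_values):
            out.append((a, b, email))
    return out

for row in unnest(rows):
    print(row)
```

If only column A should be unnested, loop over a_values alone and keep col_b unchanged.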

COUNTIFS w/ Multiple Criteria in Same Column

I'm trying to count instances of letters (like letters C through Z, excluding RR) within the same column if they are listed next to a name. Here's my sample Google Sheet that you can edit.
I'm trying to insert the formula in cell F3 that is highlighted yellow. So far I have...
=arrayformula(SUM(COUNTIFS(A2:A,{"C","D","E","F","G"},C2:C,E3:E)))
It seems like what I have should work, but it's giving me a #VALUE error, saying, "ARRAY arguments to COUNTIFS are of different size".
It seems like REGEXMATCH could be used inside the COUNTIF formula to make it easier to restrict the search to the range of letters I need, but not sure how to construct the formula.
Thanks for your help!
UPDATE
This formula below works but only for column A. I actually need to specify a different range of letters to be counted in column B and totaled in column K.
=QUERY(A2:C,"select C,count(A) where A matches 'C|D|E|F|G|H|I|J|K' group by C label count(A)''", 0)
Seems like this post almost answers it.
Current progress:
={QUERY(A2:C,"select C,count(A) where A matches 'C|D|E|F|G|H|I|J|K' and A is not null group by C label count(A)''", 0),QUERY(A2:C,"select count(B) where B matches 'F|G|H|I|J|K' and B is not null group by C label count(B)''", 0)}
As we discussed and tested on your sample sheet, this should come as close as possible to the data you want to filter and display.
A 2-formula solution was found to work as well. Answer is in the same spreadsheet linked above in Sheet2. It should prevent the first formula from not working should someone need a different one.
J2 - =QUERY(A2:C,"select C,count(A) where A matches 'C|D|E|F|G|H|I|J|K' group by C label count(A)''", 0)
L2 - =ArrayFormula(IFNA(vlookup(J2:J,QUERY(A2:C,"select C,count(B) where B matches 'F|G|H|I|J|K' group by C label count(B)''", 0),2,0)))
This formula should work for you if you paste it in cells F3 to F7.
I currently have cells P1 to P26 (you'd probably want to hide this column) occupied with the letters of the alphabet in order, making it so that you can select P3:P26 (C-Z) to put into your first condition for your COUNTIFS.
=arrayformula(SUM(COUNTIFS(A2:A, $P$3:$P$26,C2:C,E3)))
You'll have to implement these formulas in the other places that you want them too, but it shouldn't be hard to change this formula to work in the other places as well.
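The idea behind the helper-column COUNTIFS (count rows where the letter is in an allowed list and the name matches) can be sketched in Python like this. The sample rows and names are invented, and the comparison here is case-sensitive, unlike COUNTIFS.

```python
import string

# Letters C through Z, the same set as P3:P26 in the helper column
allowed = set(string.ascii_uppercase[2:])

# Hypothetical sample rows: (letter in column A, name in column C)
rows = [
    ("C", "Ann"),
    ("A", "Ann"),
    ("RR", "Ann"),
    ("Z", "Bob"),
    ("D", "Ann"),
    ("B", "Bob"),
]

def count_for(name):
    """Count rows for this name whose letter is in the allowed set."""
    return sum(1 for letter, who in rows if who == name and letter in allowed)

print(count_for("Ann"), count_for("Bob"))
```

"RR" is excluded automatically because it is not one of the single letters in the allowed set.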

Google Sheets: Find a Row that Matches Only a Few Specific Characteristics

I can't seem to find the right equation to find a cell from a row that matches only a few specific characteristics. In this example, I am trying to find the equation for Column D which would be the cell in A that has the same cells for B & C.
Hope this makes sense!
I'll provide two options.
If you're sure your data will only ever have zero or one match, you can place the following formula into D2 of an otherwise empty range D2:D...
=ArrayFormula(IF(A2:A="",,SUBSTITUTE(VLOOKUP(B2:B&C2:C,{B2:B&C2:C,A2:A},2,FALSE)&VLOOKUP(B2:B&C2:C,SORT({B2:B&C2:C,A2:A,ROW(A2:A)},3,0),2,FALSE),A2:A,"")))
However, if you think more than one match may turn up and you want "None" to be returned if there is no match, you can use the following formula in D2 or an otherwise empty range D2:D...
=ArrayFormula(IF(A2:A="",,REGEXREPLACE(REGEXEXTRACT(REGEXREPLACE(SUBSTITUTE(VLOOKUP(B2:B&C2:C,TRIM(SPLIT(FLATTEN(QUERY(QUERY({B2:B&C2:C&"~",A2:A&","}, "Select MAX(Col2) where Col2 IS NOT NULL GROUP BY Col2 PIVOT Col1"),, 9^9)),"~")),2,FALSE),A2:A,""),"^[,\s]+$","None"),"([^,\s].+[^,\s])[,\s]*$"),"[,\s]+",", ")))
The second formula will work even if there will only ever be zero or one match; it's just not necessary to have it be that lengthy. And the second formula is only as lengthy because it was unclear from your posted examples whether the data in Col A, B and C will really only ever be one word or not; so the formula is built to assume there will not always be one-word strings in those columns.
Either formula will provide results for the entire column without dragging.
Here's an option, You can use this formula in column D2:
=iferror(textjoin(", ",true,query($A$2:$C,"Select A where A is not null and A != '"&$A2&"' and B = '"&$B2&"' and C = '"&$C2&"'",0)),"None")
Limitation:
You need to manually drag the formula to its succeeding rows. Arrayformula() cannot be used in looping the query string values.
What it does:
Using query(), filter the data from A2:C for rows that have the same last name (column B) and food (column C) as the current row, but a different first name (column A).
If there are multiple results, use textjoin() to combine them with ", " as the delimiter.
If no match is found, query() returns an error, so iferror() is used to set the default value to "None".
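The same three steps can be sketched in Python. The sample names and the function others_like are hypothetical, chosen only to show the filter, join, and fallback.

```python
# Hypothetical sample rows: (first name, last name, food)
rows = [
    ("John", "Smith", "Pizza"),
    ("Jane", "Smith", "Pizza"),
    ("Bob", "Jones", "Pasta"),
    ("Amy", "Smith", "Pizza"),
]

def others_like(rows, index):
    """First names of other rows sharing this row's last name and food."""
    first, last, food = rows[index]
    matches = [
        f for f, l, fd in rows
        if f and f != first and l == last and fd == food  # the query() filter
    ]
    # textjoin(", ") for the matches, iferror(..., "None") for no match
    return ", ".join(matches) if matches else "None"

for i in range(len(rows)):
    print(rows[i][0], "->", others_like(rows, i))
```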

Only apply complex arrayformula() to rows with certain value in dataset

I have a quite complex formula (complex to me, I mean) that Tom Sharpe helped me build to aggregate values and order them by month in a row. You can find the details in the original post, but I think you'll only need the final formula, which is:
=ArrayFormula(mmult(sequence(1,counta(A2:A),1,0), if((C2:index(C:C,counta(C:C))<=eomonth(G2,sequence(1,datedif(G2,H2,"M")+1,0)))* (D2:index(D:D,counta(D:D))>=eomonth(G2,sequence(1,datedif(G2,H2,"M")+1,0))),E2:index(E:E,counta(E:E)),0)))
and here is the result -> [J1:U1]
Now, what I need to do as the final step is to group the data by a certain label (John or Jane in the example) on separate rows, while maintaining the order/aggregation by month along the row. In the example, this would mean having one row with only 'John' data and, below it, one with 'Jane' values.
I am struggling to understand how to adapt the formula to do so.
I have tried:
Using another array to first return a list of these labels with query(unique()) or something like that, but then i struggle looping in it with the other formula.
A bit more simplistic but it could work after all: on the 1st row (the cell next to where the data will be returned) writing 'John', on row 2 'Jane' and then using filter() to only pull data that matches. The 'John, Jane' value is for the example but the real labels won't be that many, the list of labels don't need to be dynamic.
The thing with these solutions is that they work when used separately, but I can't figure out how to nest them in the first arrayformula() that Tom helped me with, as I am just beginning with Google Sheets queries.
I don't really need necessarily the complete formula/code but maybe just directions or tips to visualize the way i could solve this.
Thanks to all who might contribute
With hindsight I might have done better to go down the route of using a query to calculate the sums on my previous answer rather than Mmult.
This uses the same method as before to create a 2d array of amounts vs dates (going across) and individuals (going down). Then it uses Textjoin to generate a query to group by name with the required number of columns.
=ArrayFormula(query({A2:A,if((C2:C<=eomonth(G2,sequence(1,datedif(G2,H2,"M")+1,0)))* (D2:D>=eomonth(G2,sequence(1,datedif(G2,H2,"M")+1,0))),E2:E,0)},
"select Col1,sum(Col"&textjoin("),sum(Col",,sequence(1,datedif(G2,H2,"M")+1,2))&") where Col1 is not null group by Col1"))
This is the generated query
select Col1,sum(Col2),sum(Col3),sum(Col4),sum(Col5),sum(Col6),sum(Col7),sum(Col8),sum(Col9),sum(Col10),sum(Col11),sum(Col12),sum(Col13) where Col1 is not null group by Col1
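To make the "amounts vs month-ends, grouped by name" idea concrete, here is a rough Python sketch of the same calculation. It assumes columns C and D are start and end dates and E is the amount, and it counts months by calendar month, which only agrees with DATEDIF(G2,H2,"M")+1 when the day of G2 is not later than the day of H2. The function names and sample data are hypothetical.

```python
import calendar
from collections import defaultdict
from datetime import date

def end_of_month(start, offset):
    """Last day of the month `offset` months after `start` (like EOMONTH)."""
    month_index = start.year * 12 + (start.month - 1) + offset
    year, month = divmod(month_index, 12)
    month += 1
    return date(year, month, calendar.monthrange(year, month)[1])

def monthly_totals(rows, first, last):
    """Sum amounts per name for each month-end between first and last."""
    n_months = (last.year - first.year) * 12 + (last.month - first.month) + 1
    month_ends = [end_of_month(first, k) for k in range(n_months)]
    totals = defaultdict(lambda: [0] * n_months)
    for name, start, end, amount in rows:
        for k, month_end in enumerate(month_ends):
            # Same test as (C <= eomonth) * (D >= eomonth) in the formula
            if start <= month_end and end >= month_end:
                totals[name][k] += amount
    return dict(totals)

# Hypothetical sample rows: (name, start date, end date, amount)
rows = [
    ("John", date(2021, 1, 10), date(2021, 3, 15), 100),
    ("Jane", date(2021, 2, 1), date(2021, 2, 28), 50),
    ("John", date(2021, 3, 1), date(2021, 3, 31), 20),
]
result = monthly_totals(rows, date(2021, 1, 1), date(2021, 3, 1))
print(result)
```

Each dictionary entry corresponds to one output row of the grouped query, with one total per month column.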
Ideally there should be an extra section saying label sum(Col2) '' etc. to suppress the 'Sum' headers.
=ArrayFormula(query({A2:A,if((C2:C<=eomonth(G2,sequence(1,datedif(G2,H2,"M")+1,0)))* (D2:D>=eomonth(G2,sequence(1,datedif(G2,H2,"M")+1,0))),E2:E,0)},
"select Col1,sum(Col"&textjoin("),sum(Col",,sequence(1,datedif(G2,H2,"M")+1,2))&") where Col1 is not null group by Col1 label sum(Col" & textjoin(") '', sum(Col",,sequence(1,datedif(G2,H2,"M")+1,2)) & ") ''"))

pulling row number into query google spreadsheet

I have a data set that looks like this: starting on A1 with "1"
1 a
2 b
3 c
4 d
Column A is an arrayformula =arrayformula(row(b1:b))
Column B is manual input
I want to query the data and find the row of an item by matching column B, so I have code like this:
=query(A1:B,"select A where B like '%c%'")
this should give me "3"
My question:
is there a way to pull the 1-4 numbers into the query line? with something like array formula row(b1:b). I don't want to waste an extra column on column A
so basically I want just the manual input and when i query it gives me the row number.
No script code please.
I've tried a few things and it didn't work.
Looking for a solutions that starts with
=query()
You can also use a formula to pull in more than one row in the dataset which matches the condition, if this is important to you:
=arrayformula(filter(row(B:B); B:B="c"))
You can also get wildcard-style matching with regular expressions, under certain circumstances: the values you match are text or can be treated as text (numbers can be, but booleans need more steps), and the dataset is not huge. For example:
=arrayformula(filter(row(B:B); regexmatch(B:B, "(c|d)")))
You could also use standard spreadsheet wildcard operators, e.g.
=arrayformula(filter(row(B:B); countif(B:B, "*c*")))
Explanation: In this case, the filter is true when countif is greater than zero, i.e. when it sees something with a letter c in it, because spreadsheets treat a value greater than zero as a boolean true. So for each row where there is a countif match there is a filter match, and that row is displayed. The regexmatch version above works the same way: it produces a true when a row matches either c or d.
Personally, I wanted to learn regex a bit, so I would go towards the regexmatch option. But that is your choice.
You can also, of course, create the match outside of the cell. This makes it easy to create a list of matches that you want to satisfy elsewhere on the sheet. So you could have a column of words or parts of words, from Z2 downwards, and then join them together in cell Z1 for example like this
="("&join("|",filter(Z2:Z50,len(Z2:Z50)))&")"
Then your filter function would look like this:
=arrayformula(filter(row(B:B), regexmatch(B:B, Z1)))
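Here is a small Python sketch of the same pattern-building trick: drop the blank terms, join the rest with "|", wrap them in parentheses, and return the row numbers that match. The column values come from the question's example; the terms list is hypothetical.

```python
import re

# Column B from the question's example, rows 1 to 4
column_b = ["a", "b", "c", "d"]

# Hypothetical contents of Z2:Z50, including blank cells
terms = ["c", "", "d", ""]

# Same as ="("&join("|",filter(Z2:Z50,len(Z2:Z50)))&")"
pattern = "(" + "|".join(t for t in terms if len(t)) + ")"

# Same as filter(row(B:B), regexmatch(B:B, Z1)); rows are 1-based
matching_rows = [
    i for i, value in enumerate(column_b, start=1)
    if re.search(pattern, value)
]
print(pattern, matching_rows)
```

Terms containing regex metacharacters (such as "." or "+") would need escaping, e.g. with re.escape, before being joined.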
If you want to use like operator in the query function, you can try something like this:
=arrayformula(query(if({1,0}, B:B,row(B:B)),"select Col2 where Col1 like '%c%' "))
You can also use the regular expressions in the query function, for example:
=arrayformula(query(if({1,0}, B:B,row(B:B)),"select Col2 where Col1 matches '(.*c.*|.*d.*)' "))
I'm not entirely clear on the question, but as I understand it, you want to be able to enter a formula, and have it return the row number of the matched item in a range? I'm not sure where array formulas come in.
If I've understood your question correctly, this should do the trick:
=MATCH("C",B1:B,0)
In your example, this returns 3.
Please forgive me if I've misunderstood your question.
Note: If there are multiple matches, this will return the row number for the first instance of your search.
=QUERY({A1:A,ARRAYFORMULA(ROW(A1:A))},"SELECT Col2 WHERE Col1 LIKE '%c%'")

Resources