Google Sheets CountIFS(range, "*string*") but getting exact string - google-sheets

Looking for a way to count cells in a range containing the word "Transgender", while ignoring cells that only contain "Transgender Man" or "Transgender Woman". A cell can contain both "Transgender" and "Transgender Man" separated by commas, and that should still be counted. But if a cell contains "Transgender Man" but not "Transgender" as its own entry, it needs to be ignored. If a cell only says "Transgender" and does not contain "Transgender Man", it needs to be counted.
The problem is that my formula:
countIFS($P$3:$P, "*Transgender*")
also counts cells that only contain "Transgender Man" or "Transgender Woman" when I don't want it to.

Another approach could be to use QUERY(), something like:
=Query(P3:P,"Select Count(P) where P matches '(?:^|.*,\s*)Transgender(?:\s*,.*|$)' label count(P) ''")
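To check which cell values the pattern in the QUERY above accepts, here is a minimal Python sketch. It assumes that Python's re.fullmatch behaves like the whole-cell match that "matches" performs in QUERY; the function name and the sample cells are made up for illustration.

```python
import re

# Same pattern as in the QUERY formula: "Transgender" must be a whole
# comma-separated entry, not just the start of "Transgender Man".
PATTERN = re.compile(r"(?:^|.*,\s*)Transgender(?:\s*,.*|$)")


def count_standalone(cells):
    """Count cells in which "Transgender" appears as its own entry."""
    return sum(1 for cell in cells if PATTERN.fullmatch(cell))


# Hypothetical sample data standing in for P3:P
sample_cells = [
    "Transgender",
    "Transgender Man",
    "Transgender, Transgender Man",
    "Transgender Woman, Transgender",
    "Transgender Woman",
    "",
]
standalone_count = count_standalone(sample_cells)
print(standalone_count)
```

Only the first, third and fourth sample cells are counted, since those are the ones where "Transgender" stands alone between commas or cell boundaries.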

What could also work, at the cost of some overhead, is something like the following:
COUNTIFS(range, "*transgender*") - COUNTIFS(range, "transgender man") - COUNTIFS(range, "transgender woman")
I.e., count all cells that contain the word "transgender" anywhere, then subtract all cells that contain exactly "transgender man" or "transgender woman" and nothing else. Note that a cell such as "Transgender Man, Non-binary" would still be counted, so this only works if those two values never share a cell with other entries.
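The subtraction idea can be sketched in Python to see what it does and does not count. This is an illustration only: the function name and sample lists are hypothetical, and lowercasing stands in for the case-insensitive comparison that COUNTIFS performs.

```python
def count_by_subtraction(cells):
    """Mimic COUNTIFS("*transgender*") minus the two exact-match COUNTIFS."""
    lowered = [cell.lower() for cell in cells]  # COUNTIFS ignores case
    containing = sum("transgender" in c for c in lowered)
    only_man = sum(c == "transgender man" for c in lowered)
    only_woman = sum(c == "transgender woman" for c in lowered)
    return containing - only_man - only_woman


# Hypothetical sample data
clean_cells = [
    "Transgender",
    "Transgender Man",
    "Transgender Woman",
    "Transgender, Transgender Man",
]
mixed_cells = ["Transgender Man, Non-binary"]

clean_result = count_by_subtraction(clean_cells)
mixed_result = count_by_subtraction(mixed_cells)
print(clean_result, mixed_result)
```

The second result shows the approach's blind spot: a cell that combines "Transgender Man" with some other entry is not an exact match, so it is not subtracted.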

Related

Google Sheets: Formula that checks if a specific string repeats down any number of rows between two surrounding strings?

Sample data here.
In my sheet, I mark header rows in the A-column. If all rows between a given header row and the next one are marked as "Ignore" in the B-column, then I'd like that header row to be formatted in a different color.
How do you build a formula that can check whether the string "Ignore" appears on all of the rows, however many there are, between two A-column cells with a given string?
Checking for an unknown number of rows is beyond my skillset in formula-building.
EDIT:
I've added a few new conditions that make this slightly more complicated.
A top header row, which should be ignored.
Some rows in column A have data in non-header rows. So, the dynamic range has to check for the exact string that marks a header row and how many rows it takes before that string repeats in the column.
Some B-column rows are blank. Blank doesn't mean "Ignore", so if all B-column rows beneath a header are blank, the header shouldn't have the special format.
try:
=(NOT(REGEXMATCH(ROW($A1)&"", INDEX(TEXTJOIN("|", 1, "×",
IFERROR(SORT(UNIQUE(FILTER(VLOOKUP(ROW($A1:$A),
IF($A1:$A<>"", {ROW($A1:$A), ROW($A1:$A)}), 2, 1),
$B1:$B<>"Ignore", $B1:$B<>"")), 1, 0)))))))*($A1<>"")
update:
=NOT(REGEXMATCH(ROW($A2)&"", "^"&TEXTJOIN("$|^", 1, "×",
IFERROR(SORT(UNIQUE(FILTER(IFNA(VLOOKUP(IF(($A2:$A<>"")*($A2:$A<>"*"),, ROW($A2:$A)),
IF(($A2:$A<>"")*($A2:$A<>"*"), {ROW($A2:$A), ROW($A2:$A)}), 2, 1)),
$B2:$B<>"ignore", $C2:$C<>"")), 1, 0)))&"$"))*($A2<>"")*($A2<>"*")
step-by-step formula explanation
This is essentially the same as player0's answer, only with a slightly shorter formula.
=if(A1<>"",len(SUBSTITUTE(TEXTJOIN("",,B2:INDEX(B:B,MATCH(true,isblank(B2:B),0)+row()-1,1)),"Ignore",""))=0,"")
Explanation of Dynamic Range
The hardest part of this is matching the groups of values in column B. To do this, I built the range from an INDEX function, with the two ends of the range separated by a colon. So just as one would write B2:B3, one can write B2:INDEX(...).
To get the lower bound, I match the first blank cell (note that ="" won't work). This gives the distance from the cell the function is being called from. We then need to add the row it's being called from, then go one cell higher (subtract one), as we don't want the blank cell but the one above it. Combining these, INDEX(B:B,MATCH(true,isblank(B2:B),0)+row()-1,1) gets the dynamic lower bound.
After that, there are a variety of ways to solve it. I used TEXTJOIN and SUBSTITUTE and checked for a length of zero, but there are lots of other ways.
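The dynamic-range logic can be sketched in Python as follows. This is only an illustration of the idea, not a translation of the formula: the list stands in for column B, indexes are 0-based, and the function name is hypothetical.

```python
def block_is_all_ignore(col_b, header_index):
    """True if every B value under the header, up to the first blank, is "Ignore"."""
    start = header_index + 1
    # Walk down to the first blank cell (the MATCH/ISBLANK step).
    end = start
    while end < len(col_b) and col_b[end] != "":
        end += 1
    # Join the block and remove every "Ignore" (the TEXTJOIN/SUBSTITUTE step).
    joined = "".join(col_b[start:end])
    # Nothing left over means the block held only "Ignore" values.
    # Note: an empty block also yields True in this sketch.
    return len(joined.replace("Ignore", "")) == 0


# Hypothetical column B; headers sit on rows 0 and 4, where B is blank.
col_b = ["", "Ignore", "Ignore", "", "", "Ignore", "Keep", ""]
first_block = block_is_all_ignore(col_b, 0)
second_block = block_is_all_ignore(col_b, 4)
print(first_block, second_block)
```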
Paste this formula in C1 to get a helper column that can be hidden:
=AND( A1<>"", LOWER(B2)= "ignore")
Paste this formula in conditional formatting and set "Apply to range" to A1:A1000 (take a look at the Example Sheet):
=$B:$B="Ignore"

How to use sumif with arrayformula?

I am having trouble with arrayformula.
I have some data at Col A & B, SUMIF($A$2:$A2,"ABC",$B$2:$B2) works perfectly fine, but I'd like to use arrayformula so I don't have to drag down the formula.
But using ArrayFormula(SUMIF($A$2:$A2,$C$1,$B$2:$B2)) doesn't do anything at all, is there any way I can make it work? I have no idea how.
Here is another option (say, in E2):
=ArrayFormula(IF(A2:A="",,SUMIF(ROW(A2:A)*IF(A2:A="ABC",1,9^9),"<="&ROW(A2:A),B2:B)))
How It Works:
IF(A2:A="",, means "If a row in A2:A is blank, do nothing for that row."
ROW(A2:A)*IF(A2:A="ABC",1,9^9) will create a number for each row: the row number multiplied by 1 for rows matching "ABC" (resulting in the row number itself, since anything times 1 is itself) or multiplied by 9^9 (i.e., some enormous number) for all rows that are not "ABC".
This is matched against the condition "<="&ROW(A2:A), so for each row, only the "ABC" rows at or above it contribute to the sum.
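The same key trick can be sketched in Python to make the row-by-row behaviour visible. The function name, the first_row parameter and the sample columns are assumptions for illustration; None stands in for the blank result the formula leaves on empty rows.

```python
BIG = 9 ** 9  # the "enormous number" that pushes non-matching rows out of reach


def running_sumif(col_a, col_b, target="ABC", first_row=2):
    """Running sum of col_b over rows whose col_a equals target."""
    rows = [first_row + i for i in range(len(col_a))]
    # Key per row: the row number for matches, a huge number otherwise.
    keys = [row * (1 if a == target else BIG) for row, a in zip(rows, col_a)]
    results = []
    for row, a in zip(rows, col_a):
        if a == "":
            results.append(None)  # IF(A2:A="",, ...) leaves blank rows empty
        else:
            # SUMIF(keys, "<=" & row, col_b)
            results.append(sum(b for key, b in zip(keys, col_b) if key <= row))
    return results


# Hypothetical data for A2:A6 and B2:B6
col_a = ["ABC", "X", "ABC", "", "ABC"]
col_b = [1, 2, 3, 4, 5]
running = running_sumif(col_a, col_b)
print(running)
```

Non-blank rows that are not "ABC" still show the running total so far, because their own huge key never satisfies the "<=" test but the earlier "ABC" keys do.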
try in E2:
=INDEX(IF(A2:A="ABC",MMULT(1*TRANSPOSE(IF((TRANSPOSE(ROW(
INDIRECT("A2:A"&MAX(ROW(A2:A)*(A2:A<>"")))))>=ROW(
INDIRECT("A2:A"&MAX(ROW(A2:A)*(A2:A<>"")))))*(
INDIRECT("A2:A"&MAX(ROW(A2:A)*(A2:A<>"")))=TRANSPOSE(
INDIRECT("A2:A"&MAX(ROW(A2:A)*(A2:A<>""))))),
INDIRECT("B2:B"&MAX(ROW(A2:A)*(A2:A<>""))), 0)), ROW(
INDIRECT("A2:A"&MAX(ROW(A2:A)*(A2:A<>""))))^0),))

Google Sheets: extracting numbers from multiple cells that contain text and numbers for one column of data?

I'm working in Google Sheets. I have a few hundred cells that contain text and numbers. The cells contain employee names and their ID#s. I want to extract the ID#s and compile them into one list. I have the formula below that will let me complete the task, but only for one cell, not for a range of cells (even if I select a range and add it to the formula):
=transpose(split(regexreplace(regexreplace(A1,"\s\d+\s"," "),"[^\d\.]"," ")," "))
For example, cell A1 would contain, "Tammy - 123456, Bob - 654987, Mike - 321456" and repeat similar until you get to something like cell DT75 "Marcus - 35768, Bruce - 95126, Lisa - 789123". Some cells in the sheet are blank. The above formula will give me the ID#s from A1 in their own cells:
123456
654987
321456
I'd like to get one column of all the ID#s in the sheet that I could then copy and paste into a completely different proprietary database. Am I coming at this the wrong way? Is a script a better angle?
Since you want your original range to be multi-column, you could try a slightly modified version of player0's formula, like this:
Use CONCATENATE to put all data in a single string.
Use REGEXREPLACE to remove everything but the numbers from your string.
Use SPLIT to divide your string into several cells, with a blank space as the separator.
Use FLATTEN to put all resulting values into a single column.
=FLATTEN(SPLIT(REGEXREPLACE(CONCATENATE(A:DT), "[A-Za-z-,]+", )," "))
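The four steps above can be sketched in Python to show what each stage does to the text. This is a rough illustration under assumptions: the grid is a hypothetical stand-in for A:DT, the function name is made up, and the IDs come back as strings rather than numbers.

```python
import re


def extract_ids(grid):
    """Concatenate every cell, strip non-numeric text, and split into IDs."""
    # CONCATENATE: join all cells into one string, with no separator.
    joined = "".join(cell for row in grid for cell in row)
    # REGEXREPLACE: drop letters, commas and hyphens; the spaces remain.
    digits_and_spaces = re.sub(r"[A-Za-z,-]+", "", joined)
    # SPLIT + FLATTEN: split on whitespace into a single flat list.
    return digits_and_spaces.split()


# Hypothetical two-row, two-column range with a blank column
grid = [
    ["Tammy - 123456, Bob - 654987, Mike - 321456", ""],
    ["Marcus - 35768, Bruce - 95126, Lisa - 789123", ""],
]
ids = extract_ids(grid)
print(ids)
```

Because the cells are joined without a separator, the sketch relies on the spaces around each hyphen to keep the last ID of one cell apart from the first ID of the next.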
try:
=INDEX(FLATTEN(SPLIT(QUERY(REGEXREPLACE(A1:A, "[A-Za-z-,]+", ),,9^9), " ")))
for multi-column:
=INDEX(FLATTEN(SPLIT(FLATTEN(QUERY(REGEXREPLACE(A1:C, "[A-Za-z-,]+", ),,9^9)), " ")))

How can I search for a substring in a cell, using several cells as the search argument? [Google Sheets]

In Google Sheets,
I'm trying to indicate whether each cell in a specific column (Let's call it "Target column") contains any of the words listed in a group of cells (Let's call it "Word warehouse").
The idea is that each cell in Target column that isn't empty AND doesn't contain any word from Word warehouse will add +1 to some other cell in the spreadsheet.
For example, if my column contains any of {"No", "Not", "None", "Negative"} then I will ignore it. If it contains anything else (and is not empty) then it will be counted.
Using Search or Vlookup doesn't help since they expect a single string value rather than a range of cells (Word warehouse).
You can try following formula:
=--ArrayFormula((SUM((--ISNUMBER(SEARCH(TRANSPOSE($D$1:$D$4),A1))))=0)*(A1<>""))
In this example, range A1:A7 is the Target column and range D1:D4 is the Word warehouse.
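The logic of that formula, applied to the whole Target column at once, can be sketched in Python. The names and sample data are hypothetical; lowercasing stands in for the case-insensitive SEARCH, and skipping blank warehouse entries is an extra assumption that the formula itself does not make.

```python
def count_without_words(target_cells, warehouse):
    """Count non-empty cells that contain none of the warehouse words."""
    # Assumption: blank warehouse entries are skipped.
    words = [w.lower() for w in warehouse if w != ""]
    count = 0
    for cell in target_cells:
        if cell == "":
            continue  # the (A1<>"") part of the formula
        text = cell.lower()
        # SUM(--ISNUMBER(SEARCH(...))) = 0 means no word was found.
        if not any(word in text for word in words):
            count += 1
    return count


# Hypothetical Target column and Word warehouse
target = ["No thanks", "Yes", "", "nothing", "Maybe", "NEGATIVE result", "none"]
warehouse = ["No", "Not", "None", "Negative"]
uncounted_free = count_without_words(target, warehouse)
print(uncounted_free)
```

Note that "nothing" is ignored as well, since the search is a plain substring match and "no" occurs inside it.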
I may have an answer that works for you. See my sample sheet here.
The key formula, C2 in the sample sheet, is:
=QUERY(A2:B,"SELECT A WHERE UPPER(A) MATCHES '" &
UPPER(".*" & JOIN(".*|.*",FILTER(B2:B,B2:B<>"")) & ".*") &
"' ",0)
where A2:A is your "target column" and B2:B is your "word warehouse".
This tests each word or phrase in column A against the (filtered) list of words (or phrases) in column B, and produces a list of all of the ones that match.
By counting the total number of entries in column A, and subtracting the count of the number that matched, you get a count of all of the ones that didn't match. This can be done with this formula - D2 in my sample sheet:
=COUNTA(A2:A) -
COUNTA(QUERY(A2:B,"SELECT A WHERE UPPER(A) MATCHES '" &
UPPER(".*" & JOIN(".*|.*",FILTER(B2:B,B2:B<>"")) & ".*") &
"' ",0))
Note that I've made the match case-insensitive. This can easily be undone by removing the UPPER function in the two places it appears in the formula. The formula also counts partial matches, e.g. "Catcher" matches "cat" in the word warehouse. This could also be easily changed.
I also only count one match per phrase, even if it contains several of the words in the warehouse.
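As a rough cross-check of the counting logic, here is a Python sketch that builds the same joined pattern and subtracts the matches from the non-empty total. The function name and sample data are hypothetical, and re.fullmatch is assumed to behave like the whole-value match that MATCHES performs. As in the formula, the warehouse words are treated as regular-expression text.

```python
import re


def count_unmatched(target_cells, warehouse):
    """Non-empty target cells minus those matching any warehouse word."""
    words = [w for w in warehouse if w != ""]  # FILTER(B2:B,B2:B<>"")
    # UPPER(".*" & JOIN(".*|.*", words) & ".*")
    pattern = (".*" + ".*|.*".join(words) + ".*").upper()
    filled = [c for c in target_cells if c != ""]  # COUNTA skips blanks
    matched = [c for c in filled if re.fullmatch(pattern, c.upper())]
    return len(filled) - len(matched)


# Hypothetical target column and word warehouse
targets = ["Catcher", "dog", "The CAT sat", "bird", ""]
words = ["cat", "fish", ""]
unmatched = count_unmatched(targets, words)
print(unmatched)
```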
Let me know if this helps.

Find all cells, in a column of characters or strings, that match "a"

How do I look through a column of characters or strings, and find all cells in said column that matches with "a" and then add a value from a cell in the same row to (E2) the sum?
I have illustrated it in this picture:
In D2 put:
=SUMIF(A2:A,"=a",B2:B)
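For comparison, the same conditional sum can be written as a short Python sketch. The names and sample data are hypothetical, and the lowercasing reflects my understanding that SUMIF criteria ignore case.

```python
def sumif_equals(col_a, col_b, criterion="a"):
    """Sum the col_b values on rows where col_a equals the criterion."""
    wanted = criterion.lower()
    return sum(b for a, b in zip(col_a, col_b) if a.lower() == wanted)


# Hypothetical data for A2:A and B2:B
col_a = ["a", "b", "a", "c", "A"]
col_b = [1, 2, 3, 4, 5]
total = sumif_equals(col_a, col_b)
print(total)
```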
