Count occurrences of a specific word in Google Spreadsheet - google-sheets

I have some cells with text. I need to count the occurrences of a specific word (not a list) from those cells.
Example sheet: https://docs.google.com/spreadsheets/d/1WECDbepLtZNwNfUmjxfKlCbLJgjyUBB72yvWDMzDBB0/edit?usp=sharing
So far I found one way to count it in English by using SUBSTITUTE to replace all these words with "":
=(LEN(B1)-LEN(SUBSTITUTE(UPPER(B1),UPPER(A5),"")))/LEN(A5)
However, I don't know why but it doesn't work in German.
Edited:
I don't want to count "Hero" in "Heroes". However, I'd like to count "afk" in "AFK-Spiel" (German for example). Is it possible?

If you want to count occurrences of the word "Hero":
=COUNTIF(SPLIT(JOIN(" ", B1:B3), " -."&CHAR(10)), "Hero")
Where:
B1:B3: cells with text
"Hero": the word to count
Explanation
JOIN(" ", B1:B3): Concatenates all cells with text
SPLIT(..., " -."&CHAR(10)): Creates an array of words, splitting on spaces, hyphens, periods and line breaks
COUNTIF(..., "Hero"): Counts the array items equal to "Hero" (COUNTIF ignores case)
Example
if input text is:
Hero Hero-666 heroes heroic
➔ then formula will return 2.
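To make the JOIN / SPLIT / COUNTIF steps easier to follow, here is a minimal Python sketch of the same logic. It is only an illustration, not something that runs in Sheets; the function name count_word is made up for this example.

```python
import re

def count_word(cells, word):
    """Count whole-token matches of `word`, ignoring case (hypothetical helper)."""
    text = " ".join(cells)                       # JOIN(" ", B1:B3)
    # SPLIT(..., " -." & CHAR(10)): split on space, hyphen, period or newline,
    # dropping empty pieces as SPLIT does by default
    tokens = [t for t in re.split(r"[ \-.\n]", text) if t]
    # COUNTIF(..., word): case-insensitive equality test per token
    return sum(1 for t in tokens if t.casefold() == word.casefold())

print(count_word(["Hero Hero-666 heroes heroic"], "Hero"))  # 2
```

Because the hyphen is a delimiter, "AFK-Spiel" is split into "AFK" and "Spiel", so "afk" is counted there while "Hero" inside "Heroes" is not.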
If you want to count occurrences of the string "Hero"
(even when nested in another word, e.g. "Heroes"):
=COUNTA(SPLIT(UPPER(JOIN(" ",B1:B3)), "HERO", false, false))-1
Where:
B1:B3: cells with text
"HERO": the string to count
Explanation
JOIN(" ", B1:B3): Concatenates all cells with text
UPPER(...): Converts the text to upper case
SPLIT(..., "HERO"): Splits on each occurrence of the string
COUNTA(...)-1: Counts how many splits were made
Example
if input text is:
Hero Hero-666 heroes heroic
➔ then formula will return 4.
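The "split and count the pieces" idea can be sketched in Python as follows. This only mirrors the logic of the formula; count_substring is a hypothetical name, and the search string is assumed to be non-empty.

```python
def count_substring(cells, needle):
    """Count non-overlapping, case-insensitive occurrences of `needle`."""
    text = " ".join(cells).upper()        # UPPER(JOIN(" ", B1:B3))
    pieces = text.split(needle.upper())   # SPLIT on the whole string, keeping empty pieces
    return len(pieces) - 1                # n occurrences produce n + 1 pieces

print(count_substring(["Hero Hero-666 heroes heroic"], "hero"))  # 4
```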

In your sheet you mention that the count should be 14.
Considering that, I believe you are looking for a solution that also counts words like heroes or Hero.
If you want to include variations of hero, like Hero or Heroes you can use the following:
Case insensitive for any language formula:
=COUNTIF(SPLIT(CONCATENATE(B1:B3), " "), "*heRO*")
You can even have *heRO* placed in a cell like A7 and use
=COUNTIF(SPLIT(CONCATENATE(B1:B3), " "), A7)
If you want just the word Hero, remove the asterisks * around it.
It also works for any language (including German).

try:
=ARRAYFORMULA(COUNTA(IFERROR(SPLIT(QUERY(SUBSTITUTE(
UPPER(B1:B3), UPPER(A5), "♦"),,99^99), "♦")))-1)
and for german:
=ARRAYFORMULA(COUNTA(IFERROR(SPLIT(QUERY(SUBSTITUTE(
UPPER(C1:C3), "HELD", "♦"),,99^99), "♦")))-1)

Related

Is there a function in Google Sheets to return the string next to a string that I match from a column of strings?

Okay, Sheet1!F:F is a list of words in English. The same word occurs multiple times, and the sheet is organized by chapter, with the words listed in the order they appear in each chapter. G:G needs to be that word in Arabic. H:H needs to be the definition in English. I:I needs to be the definition in Arabic.
Sheet2!A:A has the word in English, B:B the word in Arabic, C:C the definition in English, D:D the definition in Arabic.
Is there a function that would allow me to find the word from Sheet1!F:F in Sheet2!A:A and return Sheet2!B:B in Sheet1!G:G?
Here are some snippets of an example sheet.
Sheet1!
Sheet2!
You want to fill "Word AR" in Sheet1 column G from "Word AR" in Sheet2 column B; in other words, find the Arabic word for each English word in another table.
Paste this formula in Sheet1 cell G2, and drag it down.
=IF(F2="",,IFNA(INDEX(Sheet2!$B$2:$B,MATCH(F2,Sheet2!$A$2:$A,0)),"No Match"))
Breakdown:
1 - MATCH finds the matching row in the range Sheet2!$A$2:$A, with [search_type] set to 0 so that it finds the exact value even when the range is unsorted.
2 - INDEX returns a cell's content from a range given a row and column. Our reference is Sheet2!$B$2:$B; since it is a single column, [column] can be set to 1 or left blank, and the result of MATCH is passed as [row].
3 - Handle the #N/A error with the IFNA function, setting [value_if_na_error] to "No Match".
4 - The IF function, IF(F2="",,[value_if_false]), calculates only when the cell in column F is not blank.
Hope that answers your question.
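For readers who find the nested formula hard to parse, the breakdown above corresponds roughly to this Python sketch. The function name and the sample word lists are hypothetical, and unlike MATCH in Sheets the comparison here is case-sensitive.

```python
def lookup_arabic(word, english_col, arabic_col, missing="No Match"):
    """Return the entry of arabic_col on the row where english_col equals word."""
    if word == "":                        # IF(F2="",, ...): leave blank rows blank
        return ""
    try:
        row = english_col.index(word)     # MATCH(F2, Sheet2!A2:A, 0): first exact match
    except ValueError:
        return missing                    # IFNA(..., "No Match")
    return arabic_col[row]                # INDEX(Sheet2!B2:B, row)

# Hypothetical stand-ins for Sheet2 columns A and B
english = ["book", "pen"]
arabic = ["كتاب", "قلم"]
print(lookup_arabic("pen", english, arabic))
```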
One option would be to use a VLOOKUP formula. For example:
=ifna(arrayformula(vlookup(A2:A, Sheet2!A2:D, 2,0)))
Sheet1:
Sheet2:
This formula can be adjusted to fit your needs:
=ifna(arrayformula(vlookup(F3:F, Sheet2!A2:D, 2,0)))
This can be placed in cell G3 of your Sheet1 and it will auto-fill down the column. Repeat this for the next two columns (H and I), incrementing the column index from 2 each time (i.e. =ifna(arrayformula(vlookup(F3:F, Sheet2!A2:D, 3,0))), etc.)

Google Sheets: extracting numbers from multiple cells that contain text and numbers for one column of data?

I'm working in Google Sheets. I have a few hundred cells that contain text and numbers. The cells contain employee names and their ID#s. I want to extract the ID#s and compile them into one list. I have the formula below that will let me complete the task, but only for one cell, not for a range of cells (even if I select a range and add it to the formula):
=transpose(split(regexreplace(regexreplace(A1,"\s\d+\s"," "),"[^\d\.]"," ")," "))
For example, cell A1 would contain, "Tammy - 123456, Bob - 654987, Mike - 321456" and repeat similar until you get to something like cell DT75 "Marcus - 35768, Bruce - 95126, Lisa - 789123". Some cells in the sheet are blank. The above formula will give me the ID#s from A1 in their own cells:
123456
654987
321456
I'd like to get one column of all the ID#s in the sheet that I could then copy and paste into a completely different proprietary database. Am I coming at this the wrong way? Is a script a better angle?
Since you want your original range to be multi-column, you could try a slightly modified version of player0's formula, like this:
Use CONCATENATE to put all the data into a single string.
Use REGEXREPLACE to remove everything but the numbers (and the spaces between them) from that string.
Use SPLIT to divide the string into several cells, with the space as the separator.
Use FLATTEN to put all resulting values into a single column.
=FLATTEN(SPLIT(REGEXREPLACE(CONCATENATE(A:DT), "[A-Za-z-,]+", )," "))
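The four steps can be sketched in Python like this. It is an illustration of the logic only: extract_ids is a made-up name, the grid is a list of rows, and the IDs come back as strings, whereas SPLIT in Sheets would turn them into numbers.

```python
import re

def extract_ids(grid):
    """Collect every ID number from a 2-D grid of cells into one flat list."""
    # CONCATENATE(A:DT): glue all cells together (blank cells add nothing)
    text = "".join(cell for row in grid for cell in row)
    # REGEXREPLACE(..., "[A-Za-z-,]+", ): drop letters, hyphens and commas
    digits_and_spaces = re.sub(r"[A-Za-z,-]+", "", text)
    # SPLIT(..., " ") + FLATTEN: one ID per list entry
    return digits_and_spaces.split()

grid = [
    ["Tammy - 123456, Bob - 654987, Mike - 321456", ""],
    ["Marcus - 35768, Bruce - 95126, Lisa - 789123", ""],
]
print(extract_ids(grid))
```

Note that this approach relies on the " - " between each name and ID: it is what keeps the last ID of one cell separated from the first ID of the next after concatenation.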
try:
=INDEX(FLATTEN(SPLIT(QUERY(REGEXREPLACE(A1:A, "[A-Za-z-,]+", ),,9^9), " ")))
for multi-column:
=INDEX(FLATTEN(SPLIT(FLATTEN(QUERY(REGEXREPLACE(A1:C, "[A-Za-z-,]+", ),,9^9)), " ")))

Filter a string based on a variable number of substrings in Google Sheets

I have a cell with a variable number of substrings separated by a comma.
To search:
"first,second,third"
"primero,segundo,tercero,cuarto"
"eins,zwei"
and I have a column with many strings that are composed by some of the substrings:
Column with full items
"first,second,third,fourth"
"primero,segundo,tercero,cuarto,quinto"
"primero,tercero,cuarto"
"eins,zwei,drei"
...and so on...
I would like to find the items in the column above that contain all of the substrings being searched for. This is not a big issue when the number of substrings is fixed, but it becomes harder when it varies. I have a horrible formula that counts the number of commas, then uses an IF for each possible number of substrings and several FIND(index(SPLIT(A4,","),2) calls, one per substring. The formula is gigantic and hard to handle.
Can you think of a better way of doing it?
Here there is an example of what I would like to do. The blue cells are the ones that should have the formula.
https://docs.google.com/spreadsheets/d/1pD9r4JF48cVSNGqA4D69lSyasWxTvAcOhWWu1xW2mgw/edit?usp=sharing
Thanks in advance!
Thank you all for your help! In the end, I used the QUERY function.
=QUERY(E:F,"select E where F contains '" & textjoin("' AND F contains '",TRUE,split(A2,",")) &"'" )
If you are interested, you can see the solution applied in the original spreadsheet :)
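The QUERY above builds one "F contains '...'" clause per substring and joins them with AND. A minimal Python sketch of that filter, with a hypothetical function name and the sample strings from the question:

```python
def filter_items(items, search):
    """Keep the items that contain every comma-separated substring in `search`."""
    needles = [n for n in search.split(",") if n]   # split(A2, ",")
    # "F contains 'a' AND F contains 'b' ...": case-sensitive substring tests
    return [item for item in items if all(n in item for n in needles)]

items = [
    "first,second,third,fourth",
    "primero,segundo,tercero,cuarto,quinto",
    "primero,tercero,cuarto",
    "eins,zwei,drei",
]
print(filter_items(items, "primero,segundo,tercero,cuarto"))
```

Because the clause list is generated from the split, the number of substrings can vary without changing the formula.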

How can I extract the exact part of the text on the cell of google sheet when the text can change?

In a Google Sheets spreadsheet, I have the cell A1 with value "people 12-14 ABC". I want to extract the exact match "ABC" into another cell. The contents of cell A1 can change, e.g. to "woman 60+ ABCD". For this input, I would want to extract "ABCD". If A1 was instead "woman 12-20 CAE", I would want "CAE".
There are 5 possible strings that the last part may be: (ABC, ABCD, AB, CAE, C), while the first portions are very numerous (~400 possibilities).
How can I determine which of the 5 strings is in A1?
If the first part "only" has lower case or numbers and the last part "only" UPPER case,
=REGEXREPLACE(D3;"[^A-E]";)
Anchor: Space
=REGEXEXTRACT(A31;"\s([A-E]+)$")
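The space-anchored pattern can be tried outside Sheets as well. Here is a small Python sketch using the same regular expression; extract_code is a hypothetical name, and returning None when nothing matches is a choice made for this example (REGEXEXTRACT returns an error instead).

```python
import re

def extract_code(cell):
    """Return the trailing run of letters A-E that follows whitespace, if any."""
    match = re.search(r"\s([A-E]+)$", cell)   # same pattern as in REGEXEXTRACT
    return match.group(1) if match else None

for value in ["people 12-14 ABC", "woman 60+ ABCD", "woman 12-20 CAE"]:
    print(extract_code(value))
```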
If you can guarantee well-formatted input, this is simply a matter of splitting the contents of A1 into its component parts (e.g. "gender_filter", "age range", and "my 5 categories"), and selecting the appropriate index of the resultant array of strings.
To convert a cell's contents into an array of that content, the SPLIT() function can be used.
B1 = SPLIT(A1, " ")
would put entries into B1, C1, and D1, where D1 has the value you want, provided your gender filter and age ranges contain no spaces.
Since you probably don't want to have those excess junk values, you want to contain the result of split entirely in B1. To do this, we need to pass the array generated by SPLIT to a function that can take a range or array input. As a bonus, we want to sub-select a part of this range (specifically, the last one). For this, we can use the INDEX() function
B1 = INDEX(SPLIT(A1, " "), 1, COUNTA(SPLIT(A1, " ")))
This tells the INDEX function to access the first row and the last column of the range produced by SPLIT, which for the inputs you have provided, is "ABC", "ABCD", and "CAE".
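The same "split, then take the last piece" idea looks like this in Python. It is a sketch under the well-formatted-input assumption stated above; last_token is a made-up name.

```python
def last_token(cell):
    """Split on spaces and return the final non-empty piece ('' for a blank cell)."""
    parts = [p for p in cell.split(" ") if p]   # SPLIT(A1, " ") drops empty pieces
    return parts[-1] if parts else ""           # INDEX(..., 1, COUNTA(...))

print(last_token("people 12-14 ABC"))   # ABC
```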

Count all nonwhite character

I'm trying to count the number of items in a given table, but the table may contain cells holding only various whitespace characters.
I would like to count only the cells that contain letters or numbers.
I tried
COUNTIF(<range>, "?*")
Is there a list of the wildcards available in Google Spreadsheets? I need something like "*" that matches only numbers and letters.
Maybe use REGEXMATCH to match letters or digits?
=ArrayFormula(sum(N(regexmatch(A2:L18&"", "\w|\d"))))
Change the range to suit.
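As an illustration of what the formula counts, here is a minimal Python sketch. The function name is hypothetical, and re.ASCII is used on the assumption that \w in Sheets matches only ASCII letters, digits and the underscore.

```python
import re

def count_word_cells(cells):
    """Count cells containing at least one letter, digit or underscore."""
    # A2:L18 & "" coerces every cell to text; str() plays that role here
    return sum(1 for c in cells if re.search(r"\w", str(c), flags=re.ASCII))

print(count_word_cells(["abc", " ", "\t", "", 42, "\u00a0", "x1"]))  # 3
```

Since \w already includes digits, the "|\d" alternative in the pattern is redundant but harmless.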
See also this spreadsheet
