Count/sum occurrences of word in cell in column - Google Sheets - google-sheets

I have two columns of text/abbreviations i want to summarize in a (or two respectively) tables.
Example:
Column 1: SiteScopeInput
Swap_G09+Remove_U21+Rollout_L07+Swap_L08+Swap_L18+Rollout_L21+Rollout_N35
Remove_U21+Rollout_L07+Swap_L08+Swap_L18+Rollout_L21+Rollout_N35
Swap_G09+Rollout_L07+Swap_L08+Swap_L09+Swap_L18+Rollout_L21+Swap_L26+Rollout_N35
Swap_G09+Remove_U21+Rollout_L07+Swap_L08+Swap_L18+Rollout_L21+Rollout_N35
Here I'd like to summarize how many times words occure - for example:
Swap_G09
Remove_U1
Rollout_L07
and so on.
I've tried a combination of
ArrayFormula, LEN and REGEXREPLACE
but my beginner mind can't wrap around this task.
Any tips? Thanks!

If you want to treat Swap_G09 etc. as a separate word, then a combination of split, flatten and query should do it:
=ArrayFormula(query(FLATTEN(split(filter(A:A,A:A<>""),"+")),"select Col1,count(Col1) where Col1 is not null group by Col1 label Col1 'Word'"))

Related

How to find the highest N values in each group in Google Sheets

I want to use a formula to find the highest N values in each group in a Google Spread Sheets.
I tried this formula from infoinspired.com (credit to Prashanth):
=ArrayFormula(QUERY({SORT(A2:B;1;true;2;false);IFERROR(row(A2:A)-match(query(SORT(A2:B;1;true;2;false);"Select Col1");query(SORT(A2:B;1;true;2;false);"Select Col1");0))};"Select Col1,Col2 where Col3<3"))
But all it return is an Array_Literal error:
This is what I expect:
What is wrong with it?
You have to put a comma, not a semi colon before IFERROR. It's creating two columns, one twice larger than the other instead of three columns ;)
=ArrayFormula(QUERY({SORT(A2:B,1,true,2,false),IFERROR(row(A2:A)-match(query(SORT(A2:B,1,true,2,false),"Select Col1"),query(SORT(A2:B,1,true,2,false),"Select Col1"),0))},"Select Col1,Col2 where Col3<3"))
alternative formula:
=QUERY(SORT({{A2:B}\MAP(A2:A;B2:B;lambda(ax;bx;IFERROR(Rank(bx;Filter(B$2:$B;A$2:$A=ax);0);IFERROR(1/0))))};1;0;3;1);"Select Col1, Col2 Where Col3<3")
-

Efficient way to collate/aggregate specific data in Google Sheets

I'm looking for an efficient way to gather and aggregate some date in Google Sheets. I've been looking at the query function, pivot tables, and Index + Match formulas, but so far I've not found a way that brings me to the result I'm looking for. I have a set of data which looks more or less as follows.
The fields with an X represent irrelevant data which I don't want to show up in my end result. They only serve to illustrate that there are columns of data that I don't want in between the columns of data that I do want. The data in those columns is of varying types and of varying values per type, they are not actually fields with an "X" in it. Only the fields with numbers are of interest along with the related names at the top and left of those. The intent is to create a list that looks more or less like this.
I've highlighted those yellow fields because that data has been aggregated. For example, in the original file field D3 shows a relation between Laura and Pete with the number 1, and field L3 also shows a relation between Laura and Pete, so the number in that field is to be added to the number in the other field resulting in an aggregated total of 2 for that particular combination.
I would really appreciate any suggestions that can help me get to an elegant and efficient solution for this. The only solutions I can come up with would involve multiple "in-between" sheets and there just has to be a better way.
UPDATE:
Solved by applying the solution in player0's answer. I just had to switch around the order of Col1 and Col2 in the formula to get the table sorted the way I needed it. Formula looks like below now. Many thanks to both player0 and Erik Tyler for their efforts.
=INDEX(QUERY(SPLIT(FLATTEN(A2:A&"×"&D1:N1&"×"&D2:N), "×"),
"select Col2,Col1,sum(Col3)
where Col2 is not null
and Col3 is not null
group by Col2,Col1
label sum(Col3)''", ))
try:
=INDEX(QUERY(SPLIT(FLATTEN(A2:A&"×"&D1:N1&"×"&D2:N), "×"),
"where Col3 is not null and Col2 is not null", ))
update:
=INDEX(QUERY(SPLIT(FLATTEN(A2:A&"×"&D1:N1&"×"&D2:N), "×"),
"select Col1,Col2,sum(Col3)
where Col3 is not null
and Col2 is not null
group by Col1,Col2
label sum(Col3)''", ))
Given your current data set (which only appears to extend to Col N), place the following somewhere to the right of Col N:
=ArrayFormula(SPLIT(TRANSPOSE(QUERY(TRANSPOSE(QUERY(SPLIT(QUERY(FLATTEN(FILTER(IF(NOT(ISNUMBER(D2:N)),,D1:N1&"~ "&A2:A&"|"&D2:N),A2:A<>"")),"Select * WHERE Col1 Is Not Null"),"|"),"Select Col1, SUM(Col2) GROUP BY Col1 LABEL SUM(Col2) ''")&"~ "),,2)),"~ ",0,1))
It would be better if this were placed in a different sheet from the original data. Supposing that your original data sheet is named Sheet1, place the following version of the above formula into a new sheet:
=ArrayFormula(SPLIT(TRANSPOSE(QUERY(TRANSPOSE(QUERY(SPLIT(QUERY(FLATTEN(FILTER(IF(NOT(ISNUMBER(INDIRECT("Sheet1!D2:"&ROWS(Sheet1!A:A)))),,Sheet1!D1:1&"~ "&Sheet1!A2:A&"|"&INDIRECT("Sheet1!D2:"&ROWS(Sheet1!A2:A))),Sheet1!A2:A<>"")),"Select * WHERE Col1 Is Not Null"),"|"),"Select Col1, SUM(Col2) GROUP BY Col1 LABEL SUM(Col2) ''")&"~ "),,2)),"~ ",0,1))
This separate-sheet approach and formula allows for the original data to extend indefinitely past Col N.

How can I separate a column into multiple columns based on values?

I have searched on a lot of pages but I cannot find a solution to my problem except in reverse order. I have simplified what I do, but I have a query that comes looking for information in my data sheet. Here there are 3 columns, the date, the amount and the source.
I would like, with a query function, to be able to make different columns which counts the information of column C based on the values of its cells per month, like this
I'm okay with the start of the formula
=QUERY(A2:C,"select month(A)+1, sum(B), count(C) where A is not null group by month(A)+1")
But as soon as I try a little different things by putting 2 query together in an arrayformula, obviously the row count doesn't match as some minus are 0 for some sources.
Do you have a solution for what I'm trying to do? Thank you in advance :)
Solution:
It's not possible in Google Query Language to have a single query statement that has one result grouped by one column and another result grouped by another.
The first two columns can be like this:
=QUERY(A2:C,"select month(A)+1, sum(B) where A is not null group by month(A)+1 label month(A)+1 'Month', sum(B) 'Amount'")
To create the column labels for the succeeding columns, use in the first row, in my example, I1:
=TRANSPOSE(UNIQUE(C2:C))
Then from cell I2, enter this:
=COUNTIFS(arrayformula(month($A$2:$A)),$G2,$C$2:$C,I$1)
Then drag horizontally and vertically to apply to the entire table.
Results:
try:
=INDEX({
QUERY({MONTH(A2:A), B2:C},
"select Col1,sum(Col2) where Col2 is not null group by Col1 label Col1'month',sum(Col2)'amount'"),
QUERY({MONTH(A2:A), B2:C, C2:C},
"select count(Col3) where Col2 is not null group by Col1 pivot Col4")})

Calculate total of values in columns?

I have a Google Sheets document with four sheets. In each sheet, there are two columns: first is a name, and second a value.
I need to get the total of the values from the four sheets in a fifth sheet where the names from the four name-columns correspond.
The column with the values is called differently in most sheets.
How can I accomplish this task?
You can use Query + Arrays, like on image below:
Code:
=QUERY({Sheet1!A2:B;Sheet2!A2:B;Sheet3!A2:B;Sheet4!A2:B};
"select Col1,sum(Col2) where Col1 is not null group by Col1 label Col1 'Name',sum(Col2) 'Total'";0)
Explanation:
First: you combine 4 sheets using an Array. Ranges start from row 2 to omit different columns headers:
{Sheet1!A2:B;Sheet2!A2:B;Sheet3!A2:B;Sheet4!A2:B}
Second: Using Query to perform a total. In Query there are some extra operations:
Col1 is not null - for not showing empty rows in totals
label Col1 'Name',sum(Col2) 'Total' - for naming result columns
Syntax may be a little different depending on your local settings, so look at the working copy:
Link to working copy
Is that serves your needs?

Count borrowed books by type

I am collecting stats for a small library.
I have people completing Google forms when they check out books. I have one field where they enter the call number for each book they borrow. These call numbers fall into three major classifications and take the following form (REF.XX.YYYY, FA.XX.YYYY, AA.XX.YYYY) where X and Y are a combination of letters and numerals.
The result in the spreadsheet would look something like this:
Table
Shared table
I am trying to come up with a way to count the number of instances each classification designation comes up (REF, FA, or AA) throughout the column. For instance, REF should be 3, FA should be 0 and AA should be 1. I made several tries with COUNTIF function but without any success. Any clue about what could be such formula?
The expected output is in red. I want to obtain such output using formulas.
Any help would be appreciated.
You can try getting those classifications using regexextract and then use a query to count everything.
=ArrayFormula(query({A2:A\regexextract(B2:B; "^[A-Z]*")}; "Select Col2, count(Col1) where Col2 <>'' group by Col2 label count(Col1) ''"))
I've added this formula to your table. AA is not showing up in the result because it is not present in the source data.

Resources