Counting number of occurrences in column? - google-sheets

What would be a good approach to calculate the number of occurrences in a spreadsheet column? Can this be done with a single array formula?
Example (column A is input, columns B and C are to be auto-generated):
| A | B | C |
+-------+-------+-------+
| Name | Name | Count |
+-------+-------+-------+
| Joe | Joe | 2 |
| Lisa | Lisa | 3 |
| Jenny | Jenny | 2 |
| Lisa | | |
| Lisa | | |
| Joe | | |
| Jenny | | |

A simpler approach to this
At the beginning of column B, type
=UNIQUE(A:A)
Then in column C, use
=COUNTIF(A:A, B1)
and copy them in all row column C.
Edit: If that doesn't work for you, try using semicolon instead of comma:
=COUNTIF(A:A; B1)

Try:
=ArrayFormula(QUERY(A:A&{"",""};"select Col1, count(Col2) where Col1 != '' group by Col1 label count(Col2) 'Count'";1))
22/07/2014 Some time in the last month, Sheets has started supporting more flexible concatenation of arrays, using an embedded array. So the solution may be shortened slightly to:
=QUERY({A:A,A:A},"select Col1, count(Col2) where Col1 != '' group by Col1 label count(Col2) 'Count'",1)

=COUNTIF(A:A;"lisa")
You can replace the criteria with cell references from Column B

Just adding some extra sorting if needed
=QUERY(A2:A,"select A, count(A) where A is not null group by A order by count(A) DESC label A 'Name', count(A) 'Count'",-1)

=arrayformula(if(isblank(B2:B),iferror(1/0),mmult(sign(B2:B=TRANSPOSE(A2:A)),A2:A)))
I got this from a good tutorial - can't remember the title - probably about using MMult

Put the following in B3 (credit to #Alexander-Ivanov for the countif condition):
={UNIQUE(A3:A),ARRAYFORMULA(COUNTIF(UNIQUE(A3:A),"=" & UNIQUE(A3:A)))}
Benefits: It only requires editing 1 cell, it includes the name filtered by uniqueness, and it is concise.
Downside: it runs the unique function 3x
To use the unique function only once, split it into 2 cells:
B3: =UNIQUE(A3:A)
C3: =ARRAYFORMULA(COUNTIF(B3:B,"=" & B3:B))

Related

How to transpose & split multiple columns and repeat specific cells in a column

I am looking to transpose, split, and keep the correct corresponding Category/Reference Number.
Column A: Category / Reference Number.
Column B: Email (CSV)
| A | B | | A | B |
|001|Email1,Email2,Email3| |001|Email1|
|002|Email4,Email5,Email6| |001|Email2|
| | | |001|Email3|
| | | |002|Email4|
| | | |002|Email5|
| | | |002|Email6|
Here is another post which is similar to what I am looking to accomplish. The only difference is in this post, the OP requested that the formula duplicates data X times. Here is the formula that is used:
=ARRAYFORMULA({TRANSPOSE(SPLIT(CONCATENATE(REPT(B2:B&",", A2:A^2)), ",")),
TRANSPOSE(SPLIT(CONCATENATE(REPT(C2:C&",", A2:A)), ","))})
I have tried modifying this formula by removing the "^2", "A2:A" replacing with a COUNTIF (to determine the number of emails in each row), and keep breaking the formula.
What am I doing wrong?
Here is my sheet.
try:
=ARRAYFORMULA(TRIM(QUERY(SPLIT(FLATTEN(IF(IFERROR(SPLIT('Form Responses'!C2:C, ","))="",,
'Form Responses'!B2:B&"×"&SPLIT('Form Responses'!C2:C, ","))), "×"),
"where Col2 is not null")))

use ARRAYFORMULA to find all matching rows/values in a source table from a lookup table

I have a source table like so. As you can see, each each Lookup Value can have one or more Result.
| Lookup Value | Result |
|--------------|--------|
| a | a1 |
| a | a2 |
| a | a3 |
| b | b1 |
| b | b2 |
| c | c1 |
| c | c2 |
Then I have an input table like so:
| queries | results | | | | | |
|---------|---------|---------|---------|---------|---------|---------|
| a | ... | ... | ... | ... | ... | ... |
| c | ... | ... | ... | ... | ... | ... |
The ... for each row should be the transposed values from the lookup table. So, for example, the above table would look like this:
| queries | results | | |
|---------|---------|----|----|
| a | a1 | a2 | a3 |
| c | c1 | c2 | |
Right now I have to use multiple formulas like so:
I am trying to replace it with a single ARRAYFORMULA but it doesn't seem to work.
Is there another way to do this? Basically lookup all the matching rows from a lookup table and then transpose them?
Suppose your "Lookup Value" and "Result" data run from A1:B (with headers in A1 and B1). And suppose that your "queries" list is in D1:D (header in D1) with the "results" header in E1.
Depending on the maximum number of possible matches in B:B for any value in A:A, you could use this in E2:
=ArrayFormula(IFERROR(VLOOKUP(D2:D,QUERY(FILTER({A2:B,COUNTIFS(A2:A,A2:A,ROW(A2:A),"<="&ROW(A2:A))},A2:A<>""),"Select Col1, MAX(Col2) Group By Col1 Pivot Col3"),SEQUENCE(1,10,2),0)))
If your maximum possible matches is fewer than 10 or more than 10, feel free to edit the second argument of the SEQUENCE function accordingly.
Understand that, with such an array formula that is asked to process a range, you wouldn't be able to put other data anywhere below or to the right of your "queries and results" that you've asked the array formula to assess or fill. So if you want data under it, you'll need to limit your VLOOKUP from D2:D to, say, D2:D50 (or whatever your max queries range would be). Likewise, if that second argument of the SEQUENCE function is 10, you'll have "reserved" 10 columns (i.e., E:N) for possible results, and you won't be able to put data there or you'll "break" the array formula. That being the case, you may want to give yourself some sort of visual line of demarcation around the area you've reserved for the formula's use (e.g., change the background color of the block or place a border around it, etc.).
try:
=ARRAYFORMULA(IFERROR(VLOOKUP(D:D, SPLIT(TRANSPOSE(QUERY(TRANSPOSE(IF(ISNUMBER(
QUERY(QUERY(A:B, "select A,count(A) where A is not null group by A pivot B"), "offset 1", 0)),
QUERY(QUERY(A:B, "select A,count(A) where A is not null group by A pivot B"), "limit 0", 1),
QUERY(QUERY(A:B, "select A,count(A) where A is not null group by A pivot B"), "offset 1", 0)))
,,999^99)), " "), TRANSPOSE(ROW(INDIRECT("A2:A"&COUNTUNIQUE(B:B)))), 0)))

Google Sheets function to group and concat rows

Sample sheet: https://docs.google.com/spreadsheets/d/1AeP0sxDi0-3aaesUdCNTKfricIimjTMFaKO-FX9_g50/edit?usp=sharing
I am trying to find a formula that will group a table on a column and concat the values from all the rows in another column.
For example, if this is my table:
| name | value |
|-------|---------|
| one | alpha |
| two | bravo |
| three | charlie |
| one | delta |
| two | echo |
| four | foxtrot |
| two | golf |
| three | hotel |
| four | india |
This is what I want the formula to output:
| one | alpha, delta |
| two | bravo, echo, golf |
| three | charlie, hotel |
| four | foxtrot, india |
I wish I could share some formula that gets me close but I can't find anything. I thought maybe this formula but, as you can see from the sample sheet, it does not work.
=ARRAYFORMULA(JOIN(", ", TRANSPOSE(FILTER(B2:B, A2:A = {UNIQUE(A2:A)}))))
My thought was, get a unique list of values in the name column, and then use arrayformula to get a list of values in the value column where the name column equals each value in the unique list. :/
try:
=ARRAYFORMULA(REGEXREPLACE(TRIM(SPLIT(TRANSPOSE(
QUERY(QUERY({A2:A&"♦", B2:B&","},
"select max(Col2)
where Col1 !=''
group by Col2
pivot Col1"),,999^99)), "♦")), ",$", ))
or:
=ARRAYFORMULA(IFNA(VLOOKUP(UNIQUE(A2:A),
REGEXREPLACE(TRIM(SPLIT(TRANSPOSE(
QUERY(QUERY({A2:A&"♦", B2:B&","},
"select max(Col2)
where Col1 !=''
group by Col2
pivot Col1"),,999^99)), "♦")), ",$", ), {1, 2}, 0)))

Return a list of all pairs row/column if condition of cell value is met

Given a Google Sheet table similar to the following:
+---+---A--+---B--+---C--+---D--+
| 1 | | col1 | col2 | col3 |
|---+------+------+------+------+
| 2 | row1 | 1 | X | 45 |
| 3 | row2 | 5 | | |
| 4 | row3 | 4 | | 34 |
+---+------+------+------+------+
where row1, col1 are header labels and "X" is also a valid value for the combination "row/column";
I need to retrieve a list of all the possible combinations row/column headers that are not null, meaning in this example:
row1 | col1
row2 | col1
row3 | col1
row1 | col2
row1 | col3
row3 | col3
I tried in different ways such as using the ISBLANK function or the QUERY one as:
=QUERY(A1:D4, "SELECT A,B,C,D WHERE B IS NOT NULL OR C IS NOT NULL OR D IS NOT NULL",1)
but is simply a subset of the precedent table AND I cannot GROUP BY because there's no aggregate function;
Consider a spreadsheet with 3 sheets, "Main", "LabelTable", and "List".
"Main" contains the data:
+---+---A--+---B--+---C--+---D--+
| 1 | | col1 | col2 | col3 |
|---+------+------+------+------+
| 2 | row1 | 1 | X | 45 |
| 3 | row2 | 5 | | |
| 4 | row3 | 4 | | 34 |
+---+------+------+------+------+
A named range from A2:D4 is created and named "Data".
My solution was to first make an accompanying table on "LabelTable" of the form:
+-------+---A--+------B-----+------C-----+------D-----+
| 1 | | col1 | col2 | col3 |
|-------+------+------------+------------+------------+
| 2 | row1 | row1 col1; | row1 col2; | row1 col3; |
| 3 | row2 | row2 col1; | row2 col2; | row2 col3; |
| 4 | row3 | row3 col1; | row3 col2; | row3 col3; |
+-------+------+------------+------------+------------+
where the number of rows and columns matches the size of the table on "Main".
If you have a large table, you can easily generate these by using the formula in B2 as =$A2&","&B$1&";" and copying across and down to the bottom right of your required range.
I then created a named range "Labels" from B2:D4 (or for larger tables, the full extent of the contents of this table)
On "List", I then entered the formula:
=transpose(split(concatenate(arrayformula(if(not(ISBLANK(Data)),Labels,""))),";"))
This produces a list of pairs, like "Row1Col1", corresponding to every value that is not blank.
Description
This uses ISBLANK() to determine if a cell is null or not. In combination with ARRAYFORMULA() on the "outside" this creates a table of TRUE or FALSE values. It then wraps the TRUE/FALSE table with IF() to create a table of labels corresponding to where the TRUE/FALSE table has TRUE. This is where the named range "Labels" comes in - it provides the cell reference of where there was not a null value.
All of this is then wrapped in CONCATENATE() to give a single cell with the references joined. Then SPLIT() is used to turn this into an array - and this is also why ";" was added into the cell references in "Labels". Finally, TRANSPOSE() is used to turn the 1xN array into an Nx1 array, i.e., a single column.
Limitations
If you have a dynamic table then you'll need to also make "Labels" dynamic to match. This method does not do that.
If for some reason there is a requirement that means you can't have the accompanying table on "LabelTable" then this method will not work. Some more will need to be done to "build that in" to the formula.

Select GROUP BY column without selecting the Sum() Column

Say I have searched multiple sheets with a query and gotten the following table in Google Sheets: (its nested, so it will never actually be displayed)
| Item | Amount | ID |
------------------------------
| cat | 3 | 1 |
------------------------------
| dog | 2 | 2 |
------------------------------
| dog | 4 | 2 |
------------------------------
| bird | 1 | 3 |
------------------------------
| bird | 2 | 3 |
------------------------------
| dog | 1 | 2 |
------------------------------
Obviously, If I want to get the Sum of Dogs and Birds from this table I could do something like this:
(-not exact syntax, just an example)
"SELECT Col1, sum(Col2) WHERE Col1 = 'dog' or 'bird' GROUP BY Col1 ORDER BY Sum(Col2)"
And I should get something like the following:
| Item | Amount |
---------------------
| dog | 7 |
---------------------
| bird | 3 |
---------------------
BUT - Is there a way I can return ONLY Col1 (As in still do the Grouping and ordering) so that If I was to put it side by side with a result that showed all Columns it would still line up correctly?
OK, Solved it!!! Quite Simple Really....
All I have to do is to add another layer of nesting.
SO:
=QUERY(table, "SELECT Col1, sum(Col2) WHERE Col1 = 'dog' or 'bird' GROUP BY Col1 ORDER BY Sum(Col2))"
Becomes:
=QUERY(
QUERY(table, "SELECT Col1, sum(Col2) WHERE Col1 = 'dog' or 'bird' GROUP BY Col1 ORDER BY Sum(Col2)"),
"SELECT Col1"
)
Again, Syntax not exact, but its the Logic I was after :)
:)

Resources