Count uniques in comma separated column list - google-sheets

In one column I have a list of comma (and whitespace) separated responses to a question such as "what music genre do you listen to?"
Alternative, EDM, Electronic, Hip Hop,
Drum & Bass (D&B), Indie, House, R&B, Rap, Rock
Indie, House, R&B, Rap, Rock,
Rap, Rock
I want a function that will return the unique values of the respones. So something like
UNIQUE(SPLIT({A:A})
(Although that doesn't quite work).
Desired output is a column of unique like:
Rock
Rap
...
Hip Hop

If your data starts in column A1, try this formula:
=QUERY(SORT(UNIQUE(FLATTEN(ARRAYFORMULA(TRIM(SPLIT(FILTER(A1:A,LEN(A1:A)),",",1,1)))))),"where Col1 <>''",0)
This gets the data in column A, ignoring blank cells, splits the cells on the commas, trims any leading or trailing spaces, "flattens" the resulting columns into one, gets unique values, sorts, them, and then removes any blank rows.
UPDATE: A more efficient version of the above formula is:
=QUERY(UNIQUE(IFERROR(FLATTEN(ARRAYFORMULA(TRIM(SPLIT(A1:A,",",1,1)))))),"where Col1 <>'' order by Col1")
Note that FLATTEN is an unsupported function that may possibly be removed from Google Sheets at some point. There are other ways of performing its function, if necessary.
The caution, provided by Matt King, on the use of FLATTEN.
If this doesn't work for some reason, please share a sample of your sheet.

use:
=INDEX(UNIQUE(FLATTEN(TRIM(SPLIT(TEXTJOIN(",", 1, A:A), ",")))))

Related

Unnest two columns in google sheet

I have a table like this one here (basically it's data from a google form with multiple choice answers in column A and B and non-muliple choice data in column C) I need a separate row for each multiple choice answer.
Column A
Column B
Email
A,B
XX,YY
1#gmail.com
A,C
FF,DD
2#gmail.com
I tried to un-nest the first column and keep the remaining columns like this
enter image description here
I tried several approaches I found with flatten and split with array formulas but I don't know where to start really.
Any help or hint would be much appreciated!
You can use the split function on the column A and after that, use the index function. Considering the table, you can use:
=index(split(A2,","),1,1)
The split function separate the text using the delimiter indicated, returning an array with 1 line and 2 columns; the index function will return the first line and the first column from this array. To return the second element from the column A, just change to
=index(split(A2,","),1,2)
I think there's no easy solution for this. You're asking for as many combinations of elements as multiple-choice elections have been made. Any function in Google Sheets has its potentials and limitations about how many elements it can express. One very useful formula here is REDUCE. With REDUCE and sequences of elements separated by commas counted with COUNTA, you can stablish this formula:
=QUERY(REDUCE({"Col A","Col B","Email"},SEQUENCE(COUNTA(A2:A)),LAMBDA(z,c,{z;LAMBDA(ax,bx,
REDUCE({"","",""},SEQUENCE(ax),LAMBDA(w,a,
{w;
REDUCE({"","",""},SEQUENCE(bx),LAMBDA(y,b,
{y;INDEX(SPLIT(INDEX(A2:A,c),","),,a),INDEX(SPLIT(INDEX(B2:B,c),","),,b),INDEX(C2:C,c)}
))})))
(COUNTA(SPLIT(INDEX(A2:A,c),",")),COUNTA(SPLIT(INDEX(B2:B,c),",")))})),
"Where Col1 is not null",1)
Since I had to use a "initial value" in every REDUCE, I then used QUERY to filter the empty values:

Google Sheets: "Text to Columns", but in single column?

I have a huge data file. Each line may contain one or more entries separated by a pipe:
one|two|three
alpha
uno|dos
beta
gamma
I can use "Text to Columns" to remove the | and put each entry in its own column:
one two three
alpha
uno dos
beta
gamma
But my desired result is to have all items in a single column:
one
alpha
uno
beta
gamma
two
three
dos
How can I easily get all entries into a single column? (order does NOT matter)
try:
=ARRAYFORMULA(QUERY(FLATTEN(IFERROR(SPLIT(FLATTEN(A:Z), "|"))), "where Col1 is not null"))
Assume Range A1:A5 is your data
=TRANSPOSE(SPLIT(JOIN("|",A1:A5),"|"))
JOIN them to add "|" to all of them first
then SPLIT text that has "|" to be 1 row, many columns
then TRANSPOSE to get just one column going
This is the basic one I can think of, anyways there are a lot of ways to do this

Using SEARCH function to count occurrence of multiple values

So I've Two lists in Google sheets. one is a (relatively short) list of names, let's say a rooster of employees. The second list is (rather a long) list of shifts, which notes the employees who were present.
for example:
List A - (rooster):
___________________
Mike
Linda
Carrie
Dave
List B - (Import_shift_data):
____________________________
Mike, John
Dave, Linda, Mike
Carrie
Dave, John
Linda
Mike
Dave, Carrie, John, Mike
My goal is to count the presence of each employee.
Now, here are the tricky parts:
List B updates every day, and each cell contains more than one name.
List A also updates, as some employees join the team and other leave.
Each shift could by a day shift, or a night shift (listed in another column next to List B) and I need to count them separately.
The Day/night column is in a parallel column next to shift column, and has one of two values, "Day" or "Night"
So my notion was to create an array formula, who can expand or shrink based on the number of values in List A. The problems is, I Can't yield and results from using the whole {list A} as the first argument in the SEARCH function.
I've tried the foloowing:
=Arrayformula(IF(INDIRECT("A2"):INDIRECT(CONCATENATE("A",MAX(Arrayformula(IF(isblank($A:$A),"",Row($A:$A)))))) = 0,"",COUNTIFs('Import_shift_data'!$P:$P,INDIRECT("A2"):INDIRECT(CONCATENATE("A",MAX(Arrayformula(IF(isblank($A:$A),"",Row($A:$A)))))),'Import_shift_data'!$M:$M,"Night")))
.
But this formula only works for a shift with a single employee.
I also wrote this one:
=Countifs(Arrayformula(ISNUMBER(SEARCH(A2,'Import_shift_data'!$P:$P))),"true",'Import_shift_data'!$M:$M,"Night")
which works fine, but I need to manually drag it up or down every time List A (The rooster) is updated.
So my end game is to have two arrays, one that counts night shifts for each employee, and one who counts day shifts. those arrays should automatically shrink or expand by the size of the rooster. (List A)
Note: If relevant, I may also note that the names in {List A} may contain more than one word, in case there are two employees with the same first name.
A copy of the spreadsheet:
https://drive.google.com/open?id=1HRDAy9-T_rflFpzanZq0fmHpV0jTZg6Rc4vHyOu-1HI
day shift:
=ARRAYFORMULA(QUERY(TRIM(TRANSPOSE(SPLIT(TEXTJOIN(", ", 1, B2:B), ","))),
"select Col1,count(Col1) group by Col1 label count(Col1)''", 0))
night shift:
=ARRAYFORMULA(QUERY(TRIM(TRANSPOSE(SPLIT(TEXTJOIN(", ", 1, C2:C), ","))),
"select Col1,count(Col1) group by Col1 label count(Col1)''", 0))
I Think I've found the Solution, I've used player0's idea of rearranging the data vector and split non-single shifts into single cells.
so basically it goes:
=Arrayformula(CountiF(Transpose(SPlit(Textjoin(" , ",TRUE,QUERY('Import_shift_data'!A:P, "select P where M = 'Night' ", 1))," , ",False)),INDIRECT("A2"):INDIRECT(CONCATENATE("A",MAX(Arrayformula(IF(isblank($A:$A),"",Row($A:$A))))))))
Thanks player0 !

VLOOKUP remove spaces when cell is empty

This a simple customer sheet:
A B C D
ID First Middle Last
1 John Doe
2 Jane Maia Doe
And in F1 I put this vlookup code:
=VLOOKUP($G$1;$A$1:$D$3;2;FALSE)&" "&VLOOKUP($G$1;$A$1:$D$3;3;FALSE)&" "&VLOOKUP($G$1;$A$1:$D$3;4;FALSE)
When I lookup ID 2, it's perfect nicely spaced between the vlookups
But when I lookup ID 1 you see 2 spaces between the first and last name, because there is no middle name here.
How can I manage that I always see 1 space between the vlookups?
One way you could achieve the result you're looking for is to simply replace multiple spaces with a single space.
=REGEXREPLACE(JOIN(" ",ARRAYFORMULA(VLOOKUP(G1,A:D,{2,3,4},FALSE))),"\s{2,}"," ")
This formula looks up G1 in your table (A:D). VLOOKUP can be used in an ARRAYFORMULA to efficiently retrieve all of the columns you want in one shot. Your JOIN joins all of the retrieved columns, inserting a space between each value. Finally, your REGEXREPLACE function looks for multiple consecutive spaces and replaces them with a single space.
Alternatively, you could filter the resulting array (i.e. the result of what your VLOOKUP returns). The following formula looks up the array of first, middle, and last name, and then filters out any empty cells before joining the remaining elements with a space.
=JOIN(" ",FILTER(VLOOKUP(I1,A:D,{2,3,4},FALSE),INDIRECT("B"&MATCH(I1,A:A,0)&":D"&MATCH(I1,A:A,0))<>""))
all you need is TRIM fx and:
=ARRAYFORMULA(TRIM(TRANSPOSE(QUERY(TRANSPOSE(IFERROR(
VLOOKUP(G1:G2, A1:D3, {2,3,4}, 0))),,999^99))))

How do I order a mixed text and integer field in a pivot table in Google Sheets?

Let's say that we have two columns on a sheet:
Name Room
-------------
Steve A1
Jill A1
Sam A1
Steve A2
...
Lisa A10
Sally A11
Jim A11
My actual dataset has up to a hundred of these rooms.
The issue I'm running into is with pivot tables. When I want to get a list of rooms and the count (counta is the one I'm using) it works, but the order is not what I wanted. It comes out as:
Room Count
--------------
A1 3
A10 1
A11 2
...
A2 1
I guess I can kind of see why it would be doing that. I'd much rather have it list it out in order. A1, A2, A3... A10, A11, A12, etc.
Is there an easy way to do this without some sort of data manipulation?
An "easy" way to do this without "data manipulation" is to copy the PT, Paste special, Paste values only and then drag the relevant rows (presumably at most only 8) to where you want them. The easiest way is probably with "data manipulation", for example:
=if(len(A1)=2,SUBSTITUTE(A1,"A","A0"),A1)
(Though in you case, whichever column would be the right one, it would not be ColumnA.)
I suggest you transform the string elements into number values using a lookup table.
I've created a sample spreadsheet here.
The input data in the 'input' sheet has the keys as you described.
The next sheet is the "lookup table" to translate each key into a value number. I suggest choosing large numbers to leave room for future intermediate numbers if needed
Pivot 1 is based on the original data as you described
Pivot 2 is based on the re-calculated room name using the lookup table.
The formula I used for the re-calculation is:
=VALUE(SUBSTITUTE(A2,MID(A2,1,1),VLOOKUP(MID(A2,1,1),'Lookup table'!$A$1:$B$2,2)))
I was a little lazy with the string lookup in the original name (MID), assuming your string is the first character and is 1 character long. This can be mended specifically with pattern matching.

Resources