Google Sheet: Creating combination of all names from a column - google-sheets

I have a sheet with the following structure:
fname   lname    role
Nick    Fury     Manager
Tony    Stark    Manager
Bruce   Banner   Employee
Steve   Rogers   Employee
Clint   Barton   Employee
I want to create another sheet with every pairwise combination of the employees' full names.
So, I would like the output to be:
names
Bruce Banner - Steve Rogers
Bruce Banner - Clint Barton
Steve Rogers - Clint Barton
I got the filter and concatenate part working with
=ARRAYFORMULA(FILTER(Roster!C2:C&" " & Roster!D2:D, Roster!K2:K = "Employee"))
But I am not sure how to create the combinations.

You may try:
=index(split(query(unique(map(lambda(z,flatten(z& "🐠" &transpose(z)))(filter(A:A&" "&B:B,C:C="Employee")),lambda(c,join("🐠",sort(unique(transpose(split(c,"🐠")))))))),"select Col1 where Col1 contains '🐠'"),"🐠"))
In case the newer functions are available to you, here's another approach:
=let(a,filter(A:A&" "&B:B,C:C="Employee"),b,byrow(a,lambda(z,wraprows(z,counta(a),z))),query(map(tocol(b),tocol(b,,1),lambda(c,d,if(lt(c,d),{c,d},))),"where Col1<>''"))

To get unique combinations without using split(), use reduce():
=let(
  names, filter(A2:A & " " & B2:B, C2:C = "Employee"),
  reduce(
    { "name 1", "name 2" }, names,
    lambda(
      result, name,
      reduce(
        result, filter(names, names <> name),
        lambda(
          result, otherName,
          if(
            name < otherName,
            {
              result;
              { name, otherName }
            },
            result
          )
        )
      )
    )
  )
)
Text string manipulation and split() can mistreat dates, Booleans and numbers in certain formats, together with text strings that look like those types. That won't matter with names like Bruce Banner but split() will coerce text strings like 1 2, 1 2 3 and 1111-2-3 to dates, while the formula above will faithfully reproduce the original values.
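For readers who want to check the expected output outside the spreadsheet, the same "filter, then pair each name once" logic can be sketched in Python. This is only an illustration: the `roster` list and the `employee_pairs` helper are hypothetical names, and the rows simply mirror the sample data from the question.

```python
from itertools import combinations

# Hypothetical roster rows (fname, lname, role), copied from the question's sample data.
roster = [
    ("Nick", "Fury", "Manager"),
    ("Tony", "Stark", "Manager"),
    ("Bruce", "Banner", "Employee"),
    ("Steve", "Rogers", "Employee"),
    ("Clint", "Barton", "Employee"),
]


def employee_pairs(rows, role="Employee"):
    """Return every unordered pair of full names for people with the given role."""
    # Same step as FILTER(fname & " " & lname, role = "Employee").
    names = [f"{fname} {lname}" for fname, lname, r in rows if r == role]
    # combinations() yields each pair exactly once, in roster order.
    return [f"{a} - {b}" for a, b in combinations(names, 2)]


pairs = employee_pairs(roster)
for pair in pairs:
    print(pair)
```

Note that `combinations` keeps the roster order within each pair, whereas the `name < otherName` test in the formula above orders each pair alphabetically, so the two can list the same pair with the names swapped.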

Related

How do I set value depending on other columns value, according to some kind of completion key?

I'm using Google Sheets to get an overview on my banking transactions.
I would like to put every transaction in a category, for example, grocery, transport,...
I solved this with this short formula:
=IFS(REGEXMATCH(A1;$K$2); $L$2;REGEXMATCH(A1;$K$3);$L$3;REGEXMATCH(A1;$K$4);$L$4;true;"other")
In which column A is the one I want to check, with all the transactions, K has the shop names I'm checking on, for example "shopname1", and L is the category I would like to put transactions for this shop in.
This works fine, but the list of shops and categories is only 3 rows long while I'm testing it. In real use it will be quite long, which means my IFS statement will also be very long. It isn't very modular either when it comes to future changes, so I would like to improve it.
I would like to know a way to, for example, let it check against a couple of lists, one for every category.
I hope this makes sense, and that someone has an idea!
You can try this (in B2):
=ARRAYFORMULA(
  IF(
    A:A = "";;
    VLOOKUP(
      ROW(A:A);
      {
        FILTER(ROW(A:A); A:A <> "")\
        REGEXREPLACE(
          TRIM(
            TRANSPOSE(QUERY(
              IF(
                NOT(REGEXMATCH(
                  TRANSPOSE(FILTER(A:A; A:A <> ""));
                  "(?i)" & FILTER(K2:K; K2:K <> "")
                ));;
                FILTER(L2:L; K2:K <> "") & ", "
              );;
              COUNTA(K2:K)
            )) & "other"
          );
          ", other$";
        )
      };
      2;
    )
  )
)
It returns "other" if there were no matches; otherwise it returns all matching categories, separated by ", ".
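The lookup-table idea behind this formula may be easier to follow in a short Python sketch. Everything here is an assumption for illustration: `rules` stands in for columns K (shop pattern) and L (category), and `categorize` is a hypothetical helper, not part of the spreadsheet.

```python
import re

# Hypothetical lookup table standing in for columns K (shop pattern) and L (category).
rules = [
    ("shopname1", "grocery"),
    ("shopname2", "transport"),
    ("shopname3", "clothing"),
]


def categorize(description, rules, default="other"):
    """Return all categories whose pattern matches, joined by ', ', or the default."""
    matches = [
        category
        for pattern, category in rules
        # Case-insensitive match, like the "(?i)" prefix in the formula.
        if re.search(pattern, description, flags=re.IGNORECASE)
    ]
    return ", ".join(matches) if matches else default


print(categorize("Card payment SHOPNAME1 12/03", rules))
print(categorize("Unknown merchant", rules))
```

Adding a shop then only means adding a row to the table; the matching logic itself never grows, which is the modularity the question asks for.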

Google Sheets - combine data from multiple rows to single row or cell with arrayformula

I have an export from our student information system that has multiple rows for each student, depending on how many contact email addresses the parent entered.
Sample data from the export
I would like to combine all the contact addresses either into multiple columns on the same row or even all to the same cell would be fine. After many attempts through a lot of searching, I can get it to work with =join(char(10), filter(extract.csv!G:G,extract.csv!A:A=J2)) and manually filling the formula down. (Although I'd rather not have the return first but rather just between the results, but I can live with it if it's not possible.)
What I'd love is to have that in an arrayformula so that I don't have to copy it down, but I can't figure out how to adjust the last reference to the J row. If I leave it as is, it puts the same values in every cell to match the J2 data.
with arrayformula
Or is there another way to get what I'm trying for? Thanks for any help... I'm just a teacher who loves to code and automate things muddling through and learning bits and pieces as I go!
It's probably worth posting this, because it's how far I got just creating some representative data of my own before I realised that you'd posted a sheet for us (thank you).
=ArrayFormula(if(mod(sequence(countunique(A2:A),D2,0),D2)<countif(A2:A,unique(filter(A2:A,A2:A<>""))),
vlookup(vlookup(unique(filter(A2:A,A2:A<>"")),{A2:A,row(A2:A)},2,false)+MOD(sequence(COUNTUNIQUE(A2:A),D2,0),D2),{row(A2:A),B2:B},2,false),))
D2 is a helper cell which contains the maximum number of contacts per student - either from formula or entered manually.
I will have a go with your data, but I wasn't quite clear whether the student's own email should come first, followed by the parent contacts? I'm kind of hoping that the secondary email isn't populated because it would complicate things further.
Here's how it looks with your data - same formula with slightly different columns:
=ArrayFormula(if(mod(sequence(countunique(A2:A),I2,0),I2)<countif(A2:A,unique(filter(A2:A,A2:A<>""))), vlookup(vlookup(unique(filter(A2:A,A2:A<>"")),{A2:A,row(A2:A)},2,false)+MOD(sequence(COUNTUNIQUE(A2:A),I2,0),I2),{row(A2:A),G2:G},2,false),))
where I2 is currently set to 5 - it can be worked out from
=max(countif(A2:A,unique(filter(A2:A,A2:A<>""))))
if you want to make it more dynamic.
The issue being that I can't think of an easy way to remove the blank email address for the first student at the moment (I'm a bit surprised that the download contains blank addresses - data quality?).
I have added a new sheet ("Erik Help"), which is a duplicate of your "AutoFillData" sheet. In my sheet, I cleared Column AD and then placed the following formula in AD1:
=ArrayFormula({"Contact Emails";IF(J2:J="",,IFERROR(VLOOKUP(J2:J,{UNIQUE(FILTER(extract.csv!A2:A,extract.csv!A2:A<>"")),SUBSTITUTE(TRIM(TRANSPOSE(QUERY(TRANSPOSE(IF(ISERROR(VLOOKUP(UNIQUE(FILTER(extract.csv!A2:A,extract.csv!A2:A<>""))&""&TRANSPOSE(UNIQUE(FILTER("|"&{extract.csv!E2:E;extract.csv!F2:F;extract.csv!G2:G},{extract.csv!E2:E;extract.csv!F2:F;extract.csv!G2:G}<>"")))&"",extract.csv!A2:A&"|"&extract.csv!E2:E&"|"&extract.csv!F2:F&"|"&extract.csv!G2:G,1,FALSE)),,TRANSPOSE(UNIQUE(FILTER({extract.csv!E2:E;extract.csv!F2:F;extract.csv!G2:G},{extract.csv!E2:E;extract.csv!F2:F;extract.csv!G2:G}<>"")))))," ",COUNTA(UNIQUE(FILTER({extract.csv!E2:E;extract.csv!F2:F;extract.csv!G2:G},{extract.csv!E2:E;extract.csv!F2:F;extract.csv!G2:G}<>""))))))," ",CHAR(10))},2,FALSE)))})
Explaining this formula fully would take quite a long time.
In general, what it does is form a virtual 2D grid (never seen by the user) with the unique list of student IDs running vertically at left and the unique list of all email addresses (with an appended delimiter) running horizontally across the top. If the combination of student ID and that email address is found in any string formed by the mash-up of studentID|email1|email2|email3, then that email address fills the virtual grid; if not, then that cell of the grid is left null.
This leaves a grid where all possible emails are filled in horizontally somewhere across from each unique ID, rather than being on separate lines.
Finally, a quirk in the QUERY function is used to combine all non-null entries per row. That is, the QUERY function can have any number of headers, not just 0 or 1. By having QUERY request every email section of the grid as headers and then TRIMing out spaces, we wind up with all the emails for each student ID together.
Then it's just a matter of replacing the remaining spaces with a line return character, i.e., CHAR(10).
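The end result described above (one cell per student, emails separated by line breaks) amounts to a group-and-join. A minimal Python sketch of that outcome, using made-up rows and a hypothetical `emails_by_student` helper, looks like this; it does not reproduce the virtual-grid trick itself, only what the grid is used to compute.

```python
from collections import defaultdict

# Hypothetical export rows: (student_id, contact_email). Blank emails can occur.
rows = [
    ("1001", "mom@example.com"),
    ("1001", "dad@example.com"),
    ("1002", ""),
    ("1002", "guardian@example.com"),
    ("1001", "mom@example.com"),
]


def emails_by_student(rows):
    """Map each student ID to its unique, non-blank emails joined by newlines."""
    grouped = defaultdict(list)
    for student_id, email in rows:
        emails = grouped[student_id]  # ensures every student gets an entry
        if email and email not in emails:
            emails.append(email)
    # "\n" plays the role of CHAR(10) in the formula.
    return {sid: "\n".join(emails) for sid, emails in grouped.items()}


contacts = emails_by_student(rows)
print(contacts["1001"])
```

Because the join only puts separators between entries, there is no leading line break, and blank addresses are skipped rather than producing empty lines.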
Here are a few solutions (in sheets kishkin 1 and kishkin 2 respectively).
Emails in one row in separate columns:
=ARRAYFORMULA(
  IF(
    J2:J = "",,
    TRIM(
      SPLIT(
        VLOOKUP(
          J2:J,
          SPLIT(
            TRANSPOSE(QUERY(
              QUERY(
                FILTER({extract.csv!A:A & "♥", extract.csv!G:G & "♦"}, extract.csv!A:A <> ""),
                "SELECT MAX(Col2)
                 GROUP BY Col2
                 PIVOT Col1",
                1
              ),, COUNTA(extract.csv!A:A)
            )),
            "♥"
          ),
          2,
        ),
        "♦"
      )
    )
  )
)
Emails in a single cell:
={
  "CONTACT EMAILS";
  ARRAYFORMULA(
    IF(
      J2:J = "",,
      REGEXREPLACE(
        VLOOKUP(
          J2:J,
          SPLIT(
            TRANSPOSE(QUERY(
              QUERY(
                FILTER({extract.csv!A:A & "♥", extract.csv!G:G & CHAR(10)}, extract.csv!A:A <> ""),
                "SELECT MAX(Col2)
                 GROUP BY Col2
                 PIVOT Col1",
                1
              ),, COUNTA(extract.csv!A:A)
            )),
            "♥"
          ),
          2,
        ),
        "(?m)^\s+|\s+$",
      )
    )
  )
}

Using SEARCH function to count occurrence of multiple values

I have two lists in Google Sheets. One is a (relatively short) list of names, let's say a roster of employees. The second is a (rather long) list of shifts, which notes the employees who were present.
For example:
List A - (roster):
___________________
Mike
Linda
Carrie
Dave
List B - (Import_shift_data):
____________________________
Mike, John
Dave, Linda, Mike
Carrie
Dave, John
Linda
Mike
Dave, Carrie, John, Mike
My goal is to count the presence of each employee.
Now, here are the tricky parts:
List B updates every day, and each cell contains more than one name.
List A also updates, as some employees join the team and other leave.
Each shift could be a day shift or a night shift (listed in another column next to List B), and I need to count them separately.
The Day/night column is in a parallel column next to shift column, and has one of two values, "Day" or "Night"
So my idea was to create an array formula that can expand or shrink based on the number of values in List A. The problem is, I can't get any results from using the whole {List A} as the first argument of the SEARCH function.
I've tried the following:
=Arrayformula(IF(INDIRECT("A2"):INDIRECT(CONCATENATE("A",MAX(Arrayformula(IF(isblank($A:$A),"",Row($A:$A)))))) = 0,"",COUNTIFs('Import_shift_data'!$P:$P,INDIRECT("A2"):INDIRECT(CONCATENATE("A",MAX(Arrayformula(IF(isblank($A:$A),"",Row($A:$A)))))),'Import_shift_data'!$M:$M,"Night")))
But this formula only works for a shift with a single employee.
I also wrote this one:
=Countifs(Arrayformula(ISNUMBER(SEARCH(A2,'Import_shift_data'!$P:$P))),"true",'Import_shift_data'!$M:$M,"Night")
which works fine, but I need to manually drag it up or down every time List A (the roster) is updated.
So my end goal is to have two arrays: one that counts night shifts for each employee, and one that counts day shifts. Those arrays should automatically shrink or expand with the size of the roster (List A).
Note: if relevant, the names in {List A} may contain more than one word, in case there are two employees with the same first name.
A copy of the spreadsheet:
https://drive.google.com/open?id=1HRDAy9-T_rflFpzanZq0fmHpV0jTZg6Rc4vHyOu-1HI
day shift:
=ARRAYFORMULA(QUERY(TRIM(TRANSPOSE(SPLIT(TEXTJOIN(", ", 1, B2:B), ","))),
"select Col1,count(Col1) group by Col1 label count(Col1)''", 0))
night shift:
=ARRAYFORMULA(QUERY(TRIM(TRANSPOSE(SPLIT(TEXTJOIN(", ", 1, C2:C), ","))),
"select Col1,count(Col1) group by Col1 label count(Col1)''", 0))
I think I've found the solution. I used player0's idea of rearranging the data vector and splitting multi-person shifts into single cells.
So basically it goes:
=Arrayformula(CountiF(Transpose(SPlit(Textjoin(" , ",TRUE,QUERY('Import_shift_data'!A:P, "select P where M = 'Night' ", 1))," , ",False)),INDIRECT("A2"):INDIRECT(CONCATENATE("A",MAX(Arrayformula(IF(isblank($A:$A),"",Row($A:$A))))))))
Thanks player0 !
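The split-then-count idea used in both answers can be sketched in Python as follows. The `roster`, the `shifts` list (including which rows are "Day" or "Night") and the `count_presence` helper are all hypothetical; the name lists are taken from the question's example, but the day/night labels are invented for the illustration.

```python
from collections import Counter

# List A from the question.
roster = ["Mike", "Linda", "Carrie", "Dave"]

# List B from the question, paired with made-up Day/Night labels.
shifts = [
    ("Mike, John", "Day"),
    ("Dave, Linda, Mike", "Night"),
    ("Carrie", "Day"),
    ("Dave, John", "Night"),
    ("Linda", "Day"),
    ("Mike", "Night"),
    ("Dave, Carrie, John, Mike", "Day"),
]


def count_presence(roster, shifts, shift_type):
    """Count how often each roster member appears in shifts of the given type."""
    counts = Counter()
    for names, kind in shifts:
        if kind == shift_type:
            # Split each cell into individual names, like SPLIT(TEXTJOIN(...)).
            counts.update(name.strip() for name in names.split(","))
    # Report only roster members; people not on the roster (e.g. John) are dropped.
    return {name: counts[name] for name in roster}


day_counts = count_presence(roster, shifts, "Day")
night_counts = count_presence(roster, shifts, "Night")
print(day_counts)
print(night_counts)
```

Matching on whole names after splitting, rather than searching for a substring, also avoids counting one employee inside another's name when two names share a prefix.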

VLOOKUP partial names and with typing or spelling errors

Column A - full names
Column B - first names
Column C - surnames
Goal: find people in both lists (list 2 has its names split into 2 columns*)
Problem: can't search for a surname from a full name, and names are misspelled or typed wrongly, example
Column A: Jack Doyle
Column B: Jack
Column C: Doyles
I've got something like the following for looking up the first name (even if this name was Jackson), but I can't figure out the wildcard/LEFT/RIGHT for the surname, especially considering the known errors.
=VLOOKUP(LEFT(B138,4)&"*",A$1:A$999,1,FALSE)
I've tried wildcards before and after, a tilde, just not sure where to go...
*Speaking of this, is there an easy way to append a bunch of surnames to first names in back-to-back columns? A=Jack,B=Doyle,C=Jack Doyle (must have the space obv)?
"with typing or spelling errors"
This can't really be done. But you can force auto-suggestions if you put all names from all 3 columns into one column and then apply data validation with a hidden dropdown.
B4:
=ARRAYFORMULA(IFERROR(SPLIT(A4:A, " ")))
D4:
=ARRAYFORMULA(B4:B&" "&C4:C)
E4:
=ARRAYFORMULA(QUERY(SPLIT(D4:D, " "), "select Col2,Col1", 0))
I3:
=ARRAYFORMULA(IFERROR(VLOOKUP(H3:H5, {B4:B,C4:C;C4:C,B4:B}, 2, 0)))
P3:
=UNIQUE(QUERY({B4:B;C4:C}, "where Col1 is not null", 0))
spreadsheet demo

Categorize cells by keywords

I am not certain if Excel can do this, but I am trying to simplify the data dump that I get from Twitter.
Basically what I would like to do is this:
If the tweet (in Column A) contains apple OR orange OR pear then it can be classified (in Column B) as "fruit" BUT if it has carrot OR squash OR lettuce it will be classified as "vegetable". If it has none of these then can be classified as "none"
Is this possible?
Thanks in advance.
Here is an approach using an array constant or a range.
=IF(SUMPRODUCT(IF(ISERROR(SEARCH({"apple","orange","pear"},A1)),0,1))>0,"Fruit",IF(SUMPRODUCT(IF(ISERROR(SEARCH({"carrot","squash","lettuce"},A1)),0,1))>0,"Vegetable","None"))
If both a fruit and a vegetable are present in a string, the formula returns "Fruit", because it tests for fruit first (e.g. "more apple on salad than lettuce" will return "Fruit").
You can also use a range that contains your list instead of the array constant.
For example, you can put your fruit list in Column C (C1:C3) and your vegetable list in Column D (D1:D3). Your formula would then be:
=IF(SUMPRODUCT(IF(ISERROR(SEARCH(C$1:C$3,A1)),0,1))>0,"Fruit",IF(SUMPRODUCT(IF(ISERROR(SEARCH(D$1:D$3,A1)),0,1))>0,"Vegetable","None"))
But you need to enter it as an array formula using Ctrl+Shift+Enter.
The same results and the same precedence rule apply when both a fruit and a vegetable appear in a string. HTH.
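The precedence rule described above (test fruit first, then vegetable, then fall back to "None") is a first-match-wins lookup. A small Python sketch of that rule, with a hypothetical `CATEGORIES` list and `classify` helper, could look like this:

```python
# Hypothetical ordered keyword lists; earlier entries win when several match.
CATEGORIES = [
    ("Fruit", ["apple", "orange", "pear"]),
    ("Vegetable", ["carrot", "squash", "lettuce"]),
]


def classify(text, categories=CATEGORIES, default="None"):
    """Return the label of the first category with a keyword found in the text."""
    lowered = text.lower()  # SEARCH is case-insensitive, so compare in lower case
    for label, keywords in categories:
        if any(keyword in lowered for keyword in keywords):
            return label
    return default


print(classify("more apple on salad than lettuce"))
print(classify("Roasted squash tonight"))
print(classify("nothing relevant here"))
```

As with the formula, this is a plain substring test, so a word such as "pineapple" would also count as a match for "apple".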
Sure.
Try this formula
=IF(
  OR(
    NOT(ISERROR(SEARCH("apple",A1))),
    NOT(ISERROR(SEARCH("pear",A1))),
    NOT(ISERROR(SEARCH("orange",A1)))
  ),
  "fruit",
  IF(
    OR(
      NOT(ISERROR(SEARCH("carrot",A1))),
      NOT(ISERROR(SEARCH("squash",A1))),
      NOT(ISERROR(SEARCH("lettuce",A1)))
    ),
    "veggie",
    "none"
  )
)
