Google Sheets: Find a Row that Matches Only a Few Specific Characteristics - google-sheets

I can't seem to find the right equation to find a cell from a row that matches only a few specific characteristics. In this example, I am trying to find the equation for Column D which would be the cell in A that has the same cells for B & C.
Hope this makes sense!

I'll provide two options.
If you're sure your data will only ever have zero or one match, you can place the following formula into D2 of an otherwise empty range D2:D...
=ArrayFormula(IF(A2:A="",,SUBSTITUTE(VLOOKUP(B2:B&C2:C,{B2:B&C2:C,A2:A},2,FALSE)&VLOOKUP(B2:B&C2:C,SORT({B2:B&C2:C,A2:A,ROW(A2:A)},3,0),2,FALSE),A2:A,"")))
However, if you think more than one match may turn up and you want "None" to be returned if there is no match, you can use the following formula in D2 or an otherwise empty range D2:D...
=ArrayFormula(IF(A2:A="",,REGEXREPLACE(REGEXEXTRACT(REGEXREPLACE(SUBSTITUTE(VLOOKUP(B2:B&C2:C,TRIM(SPLIT(FLATTEN(QUERY(QUERY({B2:B&C2:C&"~",A2:A&","}, "Select MAX(Col2) where Col2 IS NOT NULL GROUP BY Col2 PIVOT Col1"),, 9^9)),"~")),2,FALSE),A2:A,""),"^[,\s]+$","None"),"([^,\s].+[^,\s])[,\s]*$"),"[,\s]+",", ")))
The second formula will work even if there will only ever be zero or one match; it's just not necessary to have it be that lengthy. And the second formula is only as lengthy because it was unclear from your posted examples whether the data in Col A, B and C will really only ever be one word or not; so the formula is built to assume there will not always be one-word strings in those columns.
Either formula will provide results for the entire column without dragging.

Here's an option, You can use this formula in column D2:
=iferror(textjoin(", ",true,query($A$2:$C,"Select A where A is not null and A != '"&$A2&"' and B = '"&$B2&"' and C = '"&$C2&"'",0)),"None")
Limitation:
You need to manually drag the formula to its succeeding rows. Arrayformula() cannot be used in looping the query string values.
What it does?
Using query(), filter the data from A2:C that has the same current row last name(Column B) and food(Column C) at the same time having a different first name(Column A)
If there are multiple results, use textjoin() to combine them with ", " as its delimiter.
If there is no matched found, it will return an error, hence use iferror() to set the default value to "None"
Output

Related

Summing specific values in a joined list

I am having some difficulties summing up some values in Google Sheets. In my spreadsheet, from multiple other tabs, values and bonuses are combined into one cell (Cell B1 in this example). The format of each "unit" of data is Name,5%xxx (Where "Name" is the name of the item, "5%" represents the sum I want to add, mostly always a percentage, and "xxx" separates one unit from the next). As you can see in cell B1, there are two instances where "Parkour" receives a bonus to sum up (from different sources).
Parkour,5%xxxParkour (Subskill: Sense of Balance),10%xxxParkour,2%xxx
Parkour
0.07
Parkour (Subskill: Sense of Balance)
H2H Combat: Parkour
The formula in cell B2 is:
=IFERROR(SUM(ARRAYFORMULA(IFERROR(VALUE(MID(FILTER(SPLIT(TEXTJOIN("",TRUE,filter(B$1,regexmatch(B$1,$A2)=TRUE)),"xxx"),SEARCH($A2,SPLIT(TEXTJOIN("",TRUE,filter(B$1,regexmatch(B$1,$A2)=TRUE)),"xxx"))),len($A2)+2,1000)),""))),"")
(Dragged down through the rest of the list) (Could not figure out how to make the formula "in line" on the question.)
Expected Results:
B2 = .07 (Working)
B3 = .1 (Not working)
B4 = Blank (Working)
The goal of the formula is to look into cell B1, and split everything out by "xxx". Then, filter the array of items with only exact matches with the line item in column A, then split again by the comma and add up those values. It worked for the first line item, but not the second. (Unsure why, but I strongly believe it has something to do with the parenthesis. When I removed the parenthesis from the name in Column A (and adjusted cell B1 to not have parenthesis), it worked. However, given the structure of the data, parenthesis are required, and I need to find a way for it to work with them.)
When I removed the IFERROR wrap around it in cell B3, I get this error note:
Function SUM parameter 1 expects number values. But " is a text and cannot be coerced to a number.
Any help is greatly appreciated.
You may find useful combining SPLIT with QUERY like this. It will group names and sum percentages:
=QUERY(INDEX (IFERROR(SPLIT(FLATTEN(INDEX(SPLIT(B1:B100,"xxx"))),","))),"SELECT Col1,SUM(Col2) where Col1 is not null group by Col1")
PS: invented a couple of extra line
UPDATE
I've thought you had another goal, try this formula. Having the previous chart generated by QUERY, I used VLOOKUP to match first column and return second one:
=INDEX(IFERROR (VLOOKUP(A2:A,QUERY(INDEX (SPLIT(FLATTEN(SPLIT(B1,"xxx")),",")),"SELECT Col1,SUM(Col2) where Col1 is not null group by Col1"),2,0)))

COUNTIFS w/ Multiple Criteria in Same Column

I'm trying to count instances of letters (like letters C through Z, excluding RR) within the same column if they are listed next to a name. Here's my sample Google Sheet that you can edit.
I'm trying to insert the formula in cell F3 that is highlighted yellow. So far I have...
=arrayformula(SUM(COUNTIFS(A2:A,{"C","D","E","F","G"},C2:C,E3:E)))
It seems like what I have should work, but it's giving me a #VALUE error, saying, "ARRAY arguments to COUNTIFS are of different size".
It seems like REGEXMATCH could be used inside the COUNTIF formula to make it easier to restrict the search to the range of letters I need, but not sure how to construct the formula.
Thanks for your help!
UPDATE
This formula below works but only for column A. I actually need to specify a different range of letters to be counted in column B and totaled in column K.
=QUERY(A2:C,"select C,count(A) where A matches 'C|D|E|F|G|H|I|J|K' group by C label count(A)''", 0)
Seems like this post almost answers it.
Current progress:
={QUERY(A2:C,"select C,count(A) where A matches 'C|D|E|F|G|H|I|J|K' and A is not null group by C label count(A)''", 0),QUERY(A2:C,"select count(B) where B matches 'F|G|H|I|J|K' and B is not null group by C label count(B)''", 0)}
As we have discussed / tested on your sample sheet. This should work as close as possible to the data that you would want to filter/display.
A 2-formula solution was found to work as well. Answer is in the same spreadsheet linked above in Sheet2. It should prevent the first formula from not working should someone need a different one.
J2 - =QUERY(A2:C,"select C,count(A) where A matches 'C|D|E|F|G|H|I|J|K' group by C label count(A)''", 0)
L2 - =ArrayFormula(IFNA(vlookup(J2:J,QUERY(A2:C,"select C,count(B) where B matches 'F|G|H|I|J|K' group by C label count(B)''", 0),2,0)))
This formula should work for you if you paste it in cells F3 to F7.
I currently have cells P1 to P26 (you'd probably want to hide this column) occupied with the letters of the alphabet in order, making it so that you can select P3:P26 (C-Z) to put into your first condition for your COUNTIFS.
=arrayformula(SUM(COUNTIFS(A2:A, $P$3:$P$26,C2:C,E3)))
You'll have to implement these formulas in the other places that you want them too, but it shouldn't be hard to change this formula to work in the other places as well.

Why my ArrayFormula is giving error? How do I correct it? (I'm not looking for another Arrayformula as solutions!)

I wanted a ArrayFormula at C1 which gives the required result as shown.
Entry sheet:
(Column C is my required column)
Date Entered is the date when the Name is Assigned a group i.e. a, b, c, d, e, f
Criteria:
The value of count is purely on basis of Date Entered (if john is assigned a on lowest date(10-Jun) then count value is 1, if rose is assigned a on 2nd lowest date(17-Jun) then count value is 2).
The value of count does not change even when the data is sorted in any manner because Date Entered column values is always permanent & does not change.
New entry date could be any date not necessarily highest date (If a new entry with name Rydu is assigned a on 9-Jun then the it's count value will become 1, then john's (10-Jun) will become 2 and so on)
Example:
After I sort the data in any random order say like this:
Random ordered sheet:
(Count value remains permanent)
And when I do New entries in between (Row 4th & 14th) and after last row (Row 17th):
Random Ordered sheet:
(Doesn't matter where I do)
I already got a ArrayFormula which gives the required result:
={"AF Formula1"; ArrayFormula(IF(B2:B="", "", COUNTIFS(B$2:B, "="&B2:B, D$2:D, <"&D2:D)+1))}
I'm not looking for another Arrayformula as solutions. What I want is to know what is wrong in my ArrayFormula? and how do I correct it?
I tried to figure my own ArrayFormula but it's not working:
I got Formula for each cell:
=RANK($D2,FILTER($D$2:$D, $B$2:$B=$B2),1)
I figured out Filter doesn't work with ArrayFormula so I had to take a different approach.
I took help from my previous question answer (Arrayformula at H3) which was similar since in both cases each cell FILTER formula returns more than 1 value. (It was actually answered by player0)
Using the same technique I came up with this Formula which works absolutely fine :
=RANK($D2, ARRAYFORMULA(TRANSPOSE(SPLIT(VLOOKUP($B2, SUBSTITUTE(TRIM(SPLIT(FLATTEN(QUERY(QUERY({$B:$B&"×", $D:$D}, "SELECT MAX(Col2) WHERE Col2 IS NOT NULL GROUP BY Col2 PIVOT Col1", 1),, 9^9)), "×")), " ", ","), 2, 0), ","))), 1)
Now when I tried converting it to ArrayFormula:
($D2 to $D2:$D & $B2 to $B2:$B)
=ARRAYFORMULA(RANK($D2:$D,TRANSPOSE(SPLIT(VLOOKUP($B2:$B, SUBSTITUTE(TRIM(SPLIT(FLATTEN(QUERY(QUERY({$B:$B&"×", $D:$D}, "SELECT MAX(Col2) WHERE Col2 IS NOT NULL GROUP BY Col2 PIVOT Col1", 1),, 9^9)), "×")), " ", ","), 2, 0), ",")), 1))
It gives me an error "Did not find value '' in VLOOKUP evaluation", I figured out that the problem is only in VLOOKUP when I change $B2 to $B2:$B.
I'm sure VLOOKUP works with ArrayFormula, I fail to understand where my formula is going wrong! Please help me correct my ArrayFormula.
Here is the editable sheet link
if I understand correctly, you are trying to "rank" B column based on D column dates in such way that dates are in theoretical ascending order so if you randomize your dataset, the "rank" of each entry would stay same and not change based on the randomness you introduce.
therefore the correct formula would be:
={"fx"; INDEX(IFNA(VLOOKUP(B2:B&D2:D,
{INDEX(SORT({B2:B&D2:D, D2:D}, 2, 1),,1),
IFERROR(1/(1/COUNTIFS(
INDEX(SORT(B2:D, 3, 1),,1),
INDEX(SORT(B2:D, 3, 1),,1), ROW(B2:B), "<="&ROW(B2:B))))}, 2, 0)))}
{"fx"; ...} array of 2 tables (header & actual table) under each other eg. ;
outer shorter INDEX or longer ARRAYFORMULA (doesnt matter which one) is needed coz we are processing an array
IFNA for removing possible #N/A errors from VLOOKUP function when VLOOKUP fails to find a match
we VLOOKUP joint B and D column B2:B&D2:D in our virtual table {} and returning second 2 column if there is an exact match 0
our virtual table {INDEX(SORT({B2:B&D2:D, D2:D}, 2, 1),,1), ...} we VLOOKUP from is constructed with 2 columns next to each other eg. ,
we are getting the first column by creating an array of 2 columns {B2:B&D2:D, D2:D} next to each other where we SORT this array by date/2nd column 2, in ascending order 1 but all we need after sorting is the 1st column so we use INDEX where we bring all rows ,, and the first column 1
now lets take a look on how we getting the 2nd column of our virtual table by using COUNTIFS which will mimic the "rank"
IFERROR(1/(1/ is used to remove all zero values from the output (all empty rows would have 0 in it as the "rank")
under COUNTIFS we put 2 pairs of arguments: "if column is qual to column" and "if row is larger or equal to next row increment it by 1" ROW(B2:B), "<="&ROW(B2:B))
for "if column is qual to column" we do this twice and use range B2:D and sort it by date/3rd column 3 in ascending order 1 and of this we again need only the 1st column so we INDEX it and return all rows ,, and first column 1
with this formula you can add, remove or randomize your dataset and you will always get the right value for each of your rows
as for why your formula doesnt work... to not get #N/A error for vlookup you would need to define the end row of the range but still, the result wont be as you would expect coz formula is not the right one for this job.
as mentioned there are functions that are not supported under AF like SUM,AND,OR and then there are also functions which work but in a different way like IFS or with some limitations like SPLIT,GOOGLEFINANCE,etc.
I have answered you on the tab in your shared sheet called My Practice thusly:
You cannot split a two column array as you have attempted to do in cell CI2. That is why your formula does not work. You can only split a ONE column array.
I understand you are trying to learn, but attempting to use complicated formulas like that is going to make it harder I'm afraid.

Google sheets conditional formatting based on =QUERY result

I am trying to conditionally format a row in Google Sheets based on the result of a QUERY operation. I can get the QUERY to return the value I want (either 0 or non-zero), however QUERY insists on returning a header row.
Since the QUERY now takes up 2 rows for every row of data, changing the format of the row based off the QUERY result starts to get weird after just a few rows.
The problem I am ultimately trying to solve in the case where I enter a list of names, and I want to compare each name to a list of "NSF Names". If a name is entered, and it exists on the NSF list, I would like to flag the entry by highlighting the row red.
Help is greatly appreciated.
Edit: Formula, as requested:
=query(A:D,"select count(A) where C contains '"&E1&"' and D contains '"&E2&"'")
A:D is the data set (A is a numeric ID, B is full name, C and D are first and last names respectively).
E1 and E2 are placeholders for the person's first and last name, but would eventually be replaced with parsing the person's name, as it's inputted on the sheet (TRIM(LEFT(" ", A2) etc...)
I can get the QUERY to return the value I want (either 0 or non-zero),
however QUERY insists on returning a header row.
There might be better ways to achieve what you want to do, but in answer to this quote:
=QUERY(A:D,"select count(A) where C contains '"&E1&"' and D contains '"&E2&"' label count(A) ''")
A query seems a long-winded approach (=countifs(C1:C7,E$1,D1:D7,E$2) might be adequate) but the label might be trimmed off it by wrapping the query in:
index(...,2)

pulling row number into query google spreadsheet

I have a data set that looks like this: starting on A1 with "1"
1 a
2 b
3 c
4 d
Column A is an arrayformula =arrayformula(row(b1:b))
Column B is manual input
i want to query the database and finding the row of the item by match column B so i have code as such
=query("A1:B","select A where B like '%c%')
this should give me "3"
My question:
is there a way to pull the 1-4 numbers into the query line? with something like array formula row(b1:b). I don't want to waste an extra column on column A
so basically I want just the manual input and when i query it gives me the row number.
No script code please.
I've tried a few things and it didn't work.
Looking for a solutions that starts with
=query()
You can also use a formula to pull in more than one row in the dataset which matches the condition, if this is important to you:
=arrayformula(filter(row(B:B); B:B="c"))
And you can have wildcard type operators, under certain circumstances (you are going to match text or items that can look like text (so numbers can be treated as text - but boolean will need more steps); that the dataset is not huge), using regular expressions. e.g.
=arrayformula(filter(row(B:B); regexmatch(B:B, "(c|d)")))
You could also use standard spreadsheet wildcard operators, e.g.
=arrayformula(filter(row(B:B); countif(B:B, "*c*")))
Explanation: In this case, the filter will be true when countif is greater than zero, i.e. when it sees something with a letter c in it, since spreadsheets see a value greater than zero as a boolean true and so, for that row where there is a countif match, there will be a a filter match, and so it will display that row (indeed, it is a similar situation with the regexmatch creating a true when there is a match of either c or d, in the case above).
Personally, I wanted to learn regex a bit, so I would go towards the regexmatch option. But that is your choice.
You can also, of course, create the match outside of the cell. This makes it easy to create a list of matches that you want to satisfy elsewhere on the sheet. So you could have a column of words or parts of words, from Z2 downwards, and then join them together in cell Z1 for example like this
="("&join("|",filter(Z2:Z50,len(Z2:Z50)))&")"
Then your filter function would look like this:
=arrayformula(filter(row(B:B), regexmatch(B:B, Z1)))
If you want to use like operator in the query function, you can try something like this:
=arrayformula(query(if({1,0}, B:B,row(B:B)),"select Col2 where Col1 like '%c%' "))
You can also use the regular expressions in the query function, for example:
=arrayformula(query(if({1,0}, B:B,row(B:B)),"select Col2 where Col1 matches '(.*c.*|.*d.*)' "))
I'm not entirely clear on the question, but as I understand it, you want to be able to enter a formula, and have it return the row number of the matched item in a range? I'm not sure where array formulas come in.
If I've understood your question correctly, this should do the trick:
=MATCH("C",B1:B,0)
In your example, this returns 3.
Please forgive me if I've misunderstood your question.
Note: If there are multiple matches, this will return the row number for the first instance of your search.
=QUERY({A1:A,ARRAYFORMULA(ROW(A1:A))},"SELECT Col2 WHERE Col1 LIKE '%c%'")

Resources