Match Duplicates, Compare for Most Recent, Place Results

I'd like to do the following:
Identify duplicates in Sheet2!B:B
For the rows with duplicates, compare the dates in Sheet2!A:A
Identify the most recent Sheet2!A:A date within each group of matching Sheet2!B:B duplicates
Deliver that most recent date to the cell in Sheet1!B:B that corresponds to the duplicated value in Sheet2!B:B
Example:
To populate Sheet1!A2, the formula needs to compare the dates in Sheet2!A5:A8 (based on detecting matches in Sheet2!B5:B8) to find the most recent, which is 3/30/2016.

Paste this formula in Sheet1, for example in cell E1:
=QUERY(Sheet2!A:B,
"select B, max(A) where not A is null group by B label B 'Name', max(A) 'Most Recent'")
The result: you get the whole report from a single formula.
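To make the grouping logic easier to follow outside of Sheets, here is a minimal Python sketch of what the QUERY does: keep the latest date for each name and skip rows with no date. The function name and the sample rows are hypothetical, not taken from the sample sheet.

```python
from datetime import date


def most_recent_by_name(rows):
    """rows: iterable of (date, name) pairs, laid out like Sheet2!A:B."""
    latest = {}
    for when, name in rows:
        if when is None:  # mirrors "where not A is null"
            continue
        if name not in latest or when > latest[name]:
            latest[name] = when  # mirrors max(A) ... group by B
    # Sorted by name so the output order is stable.
    return sorted(latest.items())


# Hypothetical sample data
sample = [
    (date(2016, 3, 1), "Alice"),
    (date(2016, 3, 30), "Alice"),
    (date(2016, 2, 14), "Bob"),
    (None, "Bob"),
]
result = most_recent_by_name(sample)
```

Each name appears once in the result, paired with its most recent date, which is the same shape as the two-column report the formula produces.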

Related

Google Sheets, splitting cell values within a Query?

I want to split the values in each cell, which is either blank or contains one or more comma-separated tags. Can I do this from within the QUERY? If not, how would I copy the column to a scratch column that is longer because each cell's values are split into one or more rows?
This formula works nicely to show tags and counts, but treats each cell as a single text value:
=QUERY(Notes!D1:D, "Select D, count(D)
where D matches '^(?!(?:Labels|Tags)$).+'
group by D order by count(D) DESC label count(D) ''")
I also have this formula, which returns an array of non-blank, comma-separated values in a range:
=ArrayFormula(SPLIT(filter(Notes!D1:D, not(isblank(Notes!D1:D))), ","))
But this also has the problem that it splits values across columns (instead of rows), so I can't use the results as a simple range.
I have tried wrapping occurrences of D, the data column, with ArrayFormula. Each time I get a #VALUE! error from QUERY.
From what I understand you're trying to do, you may find it useful to FLATTEN your range so that everything ends up in one column:
=FLATTEN(ArrayFormula(SPLIT(filter(Notes!D1:D, not(isblank(Notes!D1:D))), ",")))
If needed, you can add TRIM too so you don't get unwanted spaces:
=FLATTEN(ArrayFormula(TRIM(SPLIT(filter(Notes!D1:D, not(isblank(Notes!D1:D))), ","))))
I don't know what your end goal is, but you can wrap this in a QUERY to count the tags, as in your post. Because the input is now a computed array rather than a sheet range, refer to its column as Col1:
=QUERY(FLATTEN(ArrayFormula(TRIM(SPLIT(filter(Notes!D1:D, not(isblank(Notes!D1:D))), ",")))),"Select Col1,COUNT(Col1) group by Col1 order by count(Col1) DESC label count(Col1) ''",)
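As a rough illustration of the FILTER, SPLIT, TRIM, FLATTEN and QUERY steps chained above, the same pipeline can be sketched in Python. This is only an analogy under assumptions: the function name and sample cells are hypothetical, and `str.strip` only approximates TRIM (it removes leading and trailing whitespace).

```python
from collections import Counter


def count_tags(cells):
    """cells: list of strings, each blank or holding comma-separated tags."""
    tags = []
    for cell in cells:
        if not cell:                    # FILTER(..., NOT(ISBLANK(...)))
            continue
        for part in cell.split(","):    # SPLIT(..., ",")
            part = part.strip()         # roughly TRIM
            if part:                    # drop empty pieces in this sketch
                tags.append(part)       # FLATTEN: one long column
    counts = Counter(tags)
    # "order by count(Col1) DESC", ties broken alphabetically here
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical sample column
tag_counts = count_tags(["red, blue", "", "blue", "green,red,blue"])
```

The point is that splitting happens before counting, so a cell with three tags contributes three rows to the tally instead of one.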

Google Sheets: Selecting values in a column that don't exist in another column taking into account number of occurrences

I have two columns in a Google Sheet that have values in them. Column A has all possible values (including duplicates) and Column B has some of the values in A (also including duplicates).
I wanted to find out which values in Column A do not appear in Column B, also taking into account the number of occurrences of these values.
The MATCH function works well; however, I wanted to have three instances of HQR123 appearing in the Pending column, as there are 4 occurrences of this value in Column A vs only 1 occurrence in Column B. If another instance of HQR123 is entered in Column B, then only two instances should appear in the Pending column.
Is this possible?
Thanks and regards,
Shalin.
You could try something like
=sort(
index(
filter(A2:A,
isna(
match(A2:A&countifs(A2:A, A2:A, row(A2:A), "<="&row(A2:A)),
B2:B&countifs(B2:B, B2:B, row(B2:B), "<="&row(B2:B))
, 0)
))))
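The trick in this formula is that each value is joined with its running occurrence count (the COUNTIFS with the row condition), so the 1st, 2nd and 3rd HQR123 become distinct keys that can be matched one for one. A minimal Python sketch of the same idea follows; the function names and sample lists are hypothetical.

```python
from collections import defaultdict


def occurrence_keys(values):
    """Pair each value with its running count: x, x, y -> (x,1), (x,2), (y,1)."""
    seen = defaultdict(int)
    keys = []
    for value in values:
        seen[value] += 1
        keys.append((value, seen[value]))
    return keys


def pending(all_values, done_values):
    """Values of all_values not yet 'used up' by done_values, sorted."""
    done = set(occurrence_keys(done_values))
    return sorted(
        value
        for value, n in occurrence_keys(all_values)
        if (value, n) not in done   # ISNA(MATCH(...)): key has no partner
    )


# Hypothetical sample data
column_a = ["HQR123", "HQR123", "ABC1", "HQR123", "HQR123"]
column_b = ["HQR123", "ABC1"]
result = pending(column_a, column_b)
```

With four HQR123 entries in column A and one in column B, three remain pending; adding a second HQR123 to column B leaves two.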

vlookup get latest match of duplicates

I'm trying to display a filtered version of sheet1 data in sheet2:
I used vlookup on sheet2 columns C,D,E to display sheet1 columns B,D,A respectively.
i.e., Sheet 2 - Column C
=vlookup(A2,Sheet1!A3:D,3)
but I'm not sure how to make it work with duplicates and to only get the latest one.
I tried using vlookup on query result but it didn't work out (because I was referencing a reference?)
=sortn(query(Sheet1!A2:D6,"select * where A is not null order by B,A desc"),99^99, 2, 2, true)
How can I apply vlookup to get the latest match of duplicate rows? If it's not possible, how can I go about this (if possible, without having to add extra sheets)
If I wanted to use Vlookup and Query to do it, I would end up with something like this:
=ArrayFormula(vlookup(query(B2:B,"select min(B) where B is not null group by B label min(B) ''"),
query(A2:D,"select B,A,C+D,C,D where A is not null order by B,A"),{2,1,3,4,5}))
The first query gets the unique values of column B, and the second query returns the original data plus a total (C+D), sorted ascending by family name and then by timestamp. Because the range is sorted, VLOOKUP's default approximate match finds the last matching row for each family name, which in this case is the latest one. You could also sort on timestamp descending and use the exact-match form of VLOOKUP to find the first occurrence, which is probably a little slower:
=ArrayFormula(vlookup(query(B2:B,"select min(B) where B is not null group by B label min(B) ''"),
query(A2:D,"select B,A,C+D,C,D where A is not null order by B,A desc"),{2,1,3,4,5},0))
In reality I would probably use sort and sortn as I think you started to do in your question:
=sortn(sort(filter({A2:B,C2:C+D2:D,C2:D},A2:A<>""),2,1,1,0),999,2,2,1)
This time it's sorted on family name ascending then timestamp descending, then sortn removes duplicates.
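The sort-then-deduplicate idea can be sketched in Python as follows. This is only an illustration under assumptions: the function name, the column positions and the sample rows are hypothetical, and timestamps are ISO strings so that they sort correctly as text.

```python
def latest_per_key(rows, key_index=1, time_index=0):
    """Keep only the newest row for each key, ordered by key."""
    # Newest first, then a stable sort by key keeps newest first within each key.
    ordered = sorted(rows, key=lambda r: r[time_index], reverse=True)
    ordered.sort(key=lambda r: r[key_index])
    result, seen = [], set()
    for row in ordered:
        if row[key_index] not in seen:   # like SORTN removing duplicates
            seen.add(row[key_index])
            result.append(row)
    return result


# Hypothetical rows: (timestamp, family name, C, D)
rows = [
    ("2021-01-01", "Smith", 1, 2),
    ("2021-02-01", "Jones", 3, 4),
    ("2021-03-01", "Smith", 5, 6),
]
latest = latest_per_key(rows)
```

Because the first row seen for each family name is its newest, dropping later duplicates leaves exactly one latest row per name.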
Use in row 2:
=INDEX(IFNA(VLOOKUP(A2:A, SORT(Sheet1!A:D, ROW(Sheet1!A:D), 0), 3, 0)))

Google Sheets Combine a column with duplicates and update total sum in another column

This might be something fairly simple, but I'm struggling to find a way to do it.
In Column B, I have a list of foods required.
In Column C, I have the amount needed.
In Column D, I have the unit: g (for grams), ml (for millilitres), etc.
I would like to combine the duplicates in Column B and update the totals from Column C, with the g or ml in Column D beside it.
The list I have has been created by using an array formula based on dropdowns in another sheet.
I have seen people using UNIQUE formula in 1 column (this works) and then a SUMIF formula in another column and then a JOIN formula in another... I tried this but the SUMIF is always returning 0.
Would someone please be able to advise on how I can do this?
TIA :D
It's hard to be sure exactly what you need without seeing the data. But based on my understanding of solely what you've posted, this QUERY formula should generate a condensed mini-report:
=QUERY({B2:D},"Select Col1, SUM(Col2), Col3 WHERE Col1 Is Not Null GROUP BY Col1, Col3 LABEL SUM(Col2) ''")
In plain English, this means "Arrange the data from the range B2:D in the same order as the raw data, but sum the second column's data according to matches in both the first and third columns. Only return results for the raw data where the first column is not blank. Replace the default 'sum' header on the second column with nothing; I don't need it."
This formula assumes that every ingredient will always be attached to the same measurement (e.g., 'salt' in Col B is always paired with 'mg' in Col D, etc.). If this is not the case, you will wind up with ingredients being listed as many times as there are different measures in Col D.
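For readers who find the grouping easier to see in code, here is a minimal Python sketch of the same "group by food and unit, sum the amount" logic. The function name and the sample rows are hypothetical, since the real data was not posted.

```python
from collections import defaultdict


def condense(rows):
    """rows: (food, amount, unit) tuples, laid out like columns B:D."""
    totals = defaultdict(float)
    for food, amount, unit in rows:
        if not food:                      # WHERE Col1 Is Not Null
            continue
        totals[(food, unit)] += amount    # SUM(Col2) ... GROUP BY Col1, Col3
    return [(food, total, unit) for (food, unit), total in sorted(totals.items())]


# Hypothetical sample data
rows = [
    ("Flour", 200, "g"),
    ("Milk", 100, "ml"),
    ("Flour", 50, "g"),
    ("", 0, ""),
    ("Milk", 1, "cup"),
]
report = condense(rows)
```

Milk appears twice in the report because it was entered with two different units, which is the caveat described above.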

How to add and highlight rows on input of particular column in Google Sheet

In the attached google sheet.
Google Sheet - https://docs.google.com/spreadsheets/d/1KxqaI-GYWur0Knt_bShI0GURucqU_cawu_sCoG_8Xlc/edit?usp=sharing
I need to perform the following operations on input.
If a user enters 1 in column K for an ID, append the same number of rows as that ID already has.
For example, if I enter 1 in column K for ID = ID_3, which has 5 rows, I want to append five rows below the last instance of ID_3. In the appended rows, all values should be the same as for ID_3, except column C, which becomes ID_3_Append, and columns K and O, which should be blank.
If someone enters 1 in column M, we need to check which ID it belongs to, look for that ID in columns E, F, and G, and highlight in red every row where that ID appears in columns E, F, or G.
If someone enters 1 in change type, I need to update columns B to I in the Test_1 sheet from Put_list, but keep the Mark Status unchanged for any ID common to the earlier Test_1 and Put_list (if it has a Mark Status). We also need to highlight the dependent rows accordingly.
Once columns B to I are updated, we need to change the Name from 'Call' to 'Put'.
use:
=ARRAYFORMULA(QUERY({QUERY({Test_1!A:O;
IFERROR(IF(IFNA(VLOOKUP(Test_1!C2:C, SORT({Test_1!C2:C, Test_1!K2:K}, 2, 0), 2, 0))=1,
{Test_1!A2:A, Test_1!B2:B, Test_1!C2:C&"_Append", Test_1!D2:J, Test_1!X2:X*0,
Test_1!L2:L, Test_1!X2:Y*0, Test_1!O2:O},
{"","","","","","","","","","","","","","",""}),
{"","","","","","","","","","","","","","",""})},
"where Col1 is not null and not Col3 matches '"&
TEXTJOIN("|", 1, "×", UNIQUE(IF(IFNA(VLOOKUP(Test_1!C2:C,
SORT({Test_1!C2:C, Test_1!N2:N}, 2, 0), 2, 0))=1, Test_1!C2:C, )))&"' order by Col3", 1);
IF(LEN(TEXTJOIN("|", 1,
UNIQUE(IF(IFNA(VLOOKUP(Test_1!C2:C, SORT({Test_1!C2:C, Test_1!N2:N}, 2, 0), 2, 0))=1, Test_1!C2:C, ))))>0,
QUERY({IF(Put_list!A2:A="",,IFERROR(Put_list!A2:A*1, "Put")), Put_list!A2:H,
IFNA(VLOOKUP(Put_list!B2:B&"♦"&COUNTIFS(Put_list!B2:B, Put_list!B2:B, ROW(Put_list!B2:B), "<="&ROW(Put_list!B2:B)),
{Test_1!C2:C&"♦"&COUNTIFS(Test_1!C2:C, Test_1!C2:C, ROW(Test_1!C2:C), "<="&ROW(Test_1!C2:C)), Test_1!J2:O}, {2,3,4,5,6,7}, 0))},
"where Col3 matches '"&TEXTJOIN("|", 1,
UNIQUE(IF(IFNA(VLOOKUP(Test_1!C2:C, SORT({Test_1!C2:C, Test_1!N2:N}, 2, 0), 2, 0))=1, Test_1!C2:C, )),
UNIQUE(IF(IFNA(VLOOKUP(Test_1!C2:C, SORT({Test_1!C2:C, Test_1!N2:N}, 2, 0), 2, 0))=1, Test_1!C2:C&".+", )))&"'"),
{"","","","","","","","","","","","","","",""})}, "where Col1 is not null order by Col3", 1))
Explanation:
We start with an array of the {C, K} columns, sorted descending on column K, so any row where K contains 1 moves to the top.
This is convenient for VLOOKUP because it always returns the first match, which is exactly what we need once the array is sorted.
So we VLOOKUP the C values against the sorted array and return column K for every row with the same ID; 2 stands for the 2nd column of the sorted array and 0 stands for "exact match".
If no match is found, VLOOKUP outputs an #N/A error, so we wrap it in IFNA, which returns a blank instead.
Then we put this into an IF statement: if the VLOOKUP outputs 1, we output our columns (again in {array} form).
If the VLOOKUP outputs anything else, we output an array of 15 empty cells in a row {"","","", ...}. This is because we use arrays {} and all ranges in an array need to be the same size.
Our table has 15 columns, so if an error occurred we would get the error in one single cell against our 15 columns.
This way we avoid the "array_literal" error by having {15 columns; 15 cells in a row, which is also 15 columns}.
A semicolon (;) stacks two arrays/ranges under each other, while a comma (,) puts them next to each other.
So if the VLOOKUP results in 1, we assemble our array from ranges: columns A and B stay the same, then we append the phrase "_Append" to column C with &.
Columns D:J stay the same, then we force a column of zeros by multiplying an arbitrary empty column (X) by 0, and so on.
Then we again use {"","","", ...} within IFERROR to deal with any possible error at this point.
Next we put everything described above under our whole range Test_1!A:O and wrap it in a QUERY that filters out all rows where Col1 is empty. The 2nd condition of the query states
that Col3 of our array can't match the regex pattern we assemble with TEXTJOIN.
| stands for "or" in regex. The 1 in TEXTJOIN means "ignore empty cells". We also include × (a placeholder symbol assumed not to occur in the data) so the pattern is never empty, which could otherwise cause an error.
Again we perform the same VLOOKUP, the same IFNA and the same IF, but now we output only column C and keep only the UNIQUE values.
Then within the query we sort (order by) Col3, and the 1 at the end stands for the number of header rows.
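The core of the first part (spread a flag of 1 to every row with the same ID, copy those rows with "_Append", then sort by ID) can be sketched in Python. This is a simplified illustration, not a translation of the full formula: the function names, the three-column layout and the sample rows are hypothetical, and the sketch blanks the flag column in the copies where the formula writes zeros.

```python
def flagged_ids(rows, id_col, flag_col):
    """IDs that have a 1 in the flag column on at least one of their rows."""
    return {row[id_col] for row in rows if row[flag_col] == 1}


def with_appended_copies(rows, id_col=0, flag_col=1, blank_cols=(1,)):
    """Return rows plus '_Append' copies of every row whose ID is flagged."""
    flagged = flagged_ids(rows, id_col, flag_col)
    copies = []
    for row in rows:
        if row[id_col] in flagged:
            copy = list(row)
            copy[id_col] = f"{row[id_col]}_Append"
            for col in blank_cols:
                copy[col] = ""            # blank the input columns in the copy
            copies.append(copy)
    combined = [list(row) for row in rows] + copies
    combined.sort(key=lambda r: r[id_col])  # like "order by Col3", stable
    return combined


# Hypothetical rows: [ID, flag, other data]
rows = [["ID_1", "", "a"], ["ID_3", 1, "b"], ["ID_3", "", "c"]]
combined = with_appended_copies(rows)
```

Note that a single 1 on any ID_3 row is enough to duplicate all ID_3 rows, which is what the sorted VLOOKUP achieves in the formula.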
This may answer the second part of your question.
Select the range of rows you want the red conditional formatting on, for example A2:A12 in your sample sheet.
Select Format - Conditional Formatting, verify the range, and apply a custom formula, as follows:
=OR(
IF($E2<>"NA",IFERROR(MATCH($E2,FILTER($C$1:$C$12,$M$1:$M$12=1),0)),0),
IF($F2<>"NA",IFERROR(MATCH($F2,FILTER($C$1:$C$12,$M$1:$M$12=1),0)),0),
IF($G2<>"NA",IFERROR(MATCH($G2,FILTER($C$1:$C$12,$M$1:$M$12=1),0)),0))
The FILTER pulls which IDs have a "1" in the update column, ColM.
MATCH looks to see if the ID in the Dependency column is in that filtered list.
The IFERROR sets the result to zero if no match.
And the initial IF filters out the NA values in the Dependency column.
This logic is replicated for each Dependency column, so three IFs.
And the whole thing is wrapped in an OR function, so if any of the Dependency columns have a matching ID, the whole row is flagged for red.
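The same rule can be written as a short Python sketch, which may make the FILTER, MATCH and OR steps easier to check. The function name, the dictionary keys and the sample rows are hypothetical stand-ins for columns C, M and E:G.

```python
def rows_to_highlight(rows):
    """Return one boolean per row: True if any dependency is a flagged ID."""
    # FILTER: IDs that have a 1 in the update column
    flagged = {row["id"] for row in rows if row["update"] == 1}
    return [
        # OR over the three dependency columns, skipping "NA" like the IF does
        any(dep != "NA" and dep in flagged for dep in row["deps"])
        for row in rows
    ]


# Hypothetical sample data
rows = [
    {"id": "ID_1", "update": 1, "deps": ["NA", "NA", "NA"]},
    {"id": "ID_2", "update": 0, "deps": ["ID_1", "NA", "NA"]},
    {"id": "ID_3", "update": 0, "deps": ["ID_2", "NA", "NA"]},
]
highlight = rows_to_highlight(rows)
```

Only the row that depends on ID_1 is flagged; the row for ID_1 itself is not, because the rule looks at the dependency columns rather than the ID column.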
I've applied this in a tab added to your sheet, Test_1-GK.
Let me know if this helps.
I'm afraid I didn't understand the first part of your question. I got confused with this bit:
I want to append that five rows below where the last instance of ID_3 and in appended rows all the values should be same as ID_3 except column C which will be now be ID_3_Append, column K and O which should be blank for appended rows.
