I'm trying to write a script in Google Sheets that finds duplicates between 2 columns and highlights the entire row to indicate a duplicate. Here is the trick:
I have several columns in my sheet but am only comparing 2 of them for duplicates. Let's say columns A and B.
If a value in column A is found to be a duplicate of column B, it should highlight the entire row (or at least columns A, B, and C).
The duplicate in column B will also need to be highlighted, but only after 3 duplicates in column A have been found. In other words, it will be highlighted for every 3 duplicates that are found.
Currently, I have a script that highlights duplicates but it only highlights the specific cell when I'd like multiple cells in the column to be highlighted. Additionally, duplicates for one of the columns are highlighted only after 3 duplicates are found.
Any help with this will be greatly appreciated. Thanks!
The first part is easy:
Custom formula:
=$A2=$B2
Apply to Range:
A2:D13
Enter in A2 (Assumes a header row). Change D to last column.
The second part is a little hard to understand, but this should give you an idea. It checks column C for three duplicates and highlights the cell. Add to C2:
Custom formula:
=countif($C2:$C,$C2)>2
Apply to:
C2:C13
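To make the two rules concrete, here is a minimal Python sketch of what each custom formula evaluates per row. The function names and sample data are made up for illustration. It models the formulas as written above (a same-row comparison for the first rule, a count from the current row downward for the second) and uses exact string comparison, which is a simplification of how Sheets compares text.

```python
def same_row_duplicate(col_a, col_b):
    """Model of =$A2=$B2: True where A and B match on the same row."""
    return [a == b for a, b in zip(col_a, col_b)]


def three_or_more_from_here(col_c):
    """Model of =countif($C2:$C,$C2)>2: the range starts at the current
    row, so only occurrences at or below that row are counted."""
    return [col_c[i:].count(value) > 2 for i, value in enumerate(col_c)]


# Hypothetical sample data, one list per column (header row excluded).
col_a = ["x", "y", "z"]
col_b = ["x", "q", "z"]
col_c = ["k", "k", "m", "k", "k"]

row_flags = same_row_duplicate(col_a, col_b)
cell_flags = three_or_more_from_here(col_c)
```

Because the second range is relative, the lower occurrences of a value stop being flagged once fewer than three remain at or below them.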
I have a table in Google Sheets with a word list in column B and image names in column C. Not every cell in column C has an image name. I need an ARRAYFORMULA in cell D1 that outputs the word name and image name in column D whenever the corresponding row of column A is not empty. As the attached screenshot shows, some words have no image name in column C. In those cases I need the image name that was used last.
For example: in row 17, WORD 4 has no image name. In that case the image name should be Image 2 from cell C12, which was previously used for WORD 3. I have tried many different approaches but have never managed to do it with ARRAYFORMULA. The only solution I have right now is a formula in every row, which is not a good solution. I need to do it with ARRAYFORMULA, and I don't want to do it with a Google script.
➡ Spreadsheet link (Please check Tab 1)
➡ Please check the Screenshot
I have added a sheet ("Erik Help") with the following formula in D1:
=ArrayFormula({"Header";IF(A2:A="",,B2:B&" : "&VLOOKUP(ROW(A2:A),FILTER({ROW(A2:A),C2:C},C2:C<>""),2,TRUE))})
This one array formula creates a header and then fills the entire column with results.
You can change "Header" to whatever you like.
IF(A2:A="",, just leaves D2:D null if nothing is in that row of Col A.
Otherwise, whatever is in B2:B is concatenated with a space-colon-space and then a VLOOKUP of all rows within a FILTERed virtual array that contains only rows and Col-C data where Col C is not blank. Because TRUE is chosen as the final parameter, all rows will "look backward" to the last row where Col C did contain data and return that data as the final piece to be concatenated.
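The "look backward" behaviour described above is an approximate-match lookup on row numbers. A minimal Python sketch of the same idea follows, assuming made-up sample data and a hypothetical function name; it illustrates the logic and is not a translation of how Sheets evaluates the formula.

```python
from bisect import bisect_right


def last_image_at_or_above(rows, images):
    """For each row number, return the image from the nearest row at or
    above it whose image cell is non-blank (None if there is none yet)."""
    # Counterpart of FILTER({ROW(A2:A),C2:C},C2:C<>""): keep only filled rows.
    filled = [(r, img) for r, img in zip(rows, images) if img != ""]
    filled_rows = [r for r, _ in filled]
    result = []
    for r in rows:
        # Approximate match: position of the largest filled row <= r.
        pos = bisect_right(filled_rows, r)
        result.append(filled[pos - 1][1] if pos else None)
    return result


# Hypothetical sample data standing in for columns B and C.
rows = [2, 3, 4, 5, 6]
words = ["WORD 1", "WORD 2", "WORD 3", "WORD 4", "WORD 5"]
images = ["Image 1", "", "Image 2", "", ""]

combined = [
    f"{word} : {image}"
    for word, image in zip(words, last_image_at_or_above(rows, images))
]
```

The row numbers are already sorted, which is what an approximate match (TRUE as the last VLOOKUP argument) relies on.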
=if(isnumber(SEARCH("WORD",B2,1)),join(" : ",B2, indirect(ARRAYFORMULA(address(IFNA(match(2,1/($C$2:$C2<>"")))+1,COLUMN(C2))))),"")
Paste this formula into cell D1 and drag it down.
Example sheet: https://docs.google.com/spreadsheets/d/14ma-y3esh1S_EkzHpFBvLb0GzDZZiDsSVXFktH3Rr_E/edit?usp=sharing
In column B of ItemData sheet, I have achieved the result I want by copying the formula into every cell in the column, but I want to solve this using ArrayFormula instead.
In column C I have achieved the same result using ArrayFormula. However, for addition, column C is referring to cells in column B, while column B is referring to cells in column B. I.e. every cell in column B is adding 1 to the cell on the row above.
If I select the C3 formula text and paste it into the cell edit field for cell B3 (to not screw up cell references during copy - I know I could make them static references, but this is not my problem), the cell gets an error value of
#REF!
Error
Circular dependency detected. To resolve with iterative calculation, see File > Spreadsheet Settings.
Do note that the additions that need to be done are the same in both cases: Add 1 to the value of the cell on the previous row, so there is no circular reference involved. There is a starting value provided in B2, and cells in B3 and downwards should use the data from the B cell in the previous row.
Also, note that I did try File->Spreadsheet settings and enabling circular reference computation with max 25 items, but this only fills in the first two cells (B3 and B4).
How can I solve this problem? I would prefer having something like ArrayFormula, where the formula only exists in a single cell. But copy-pasting would be acceptable as long as any new rows, inserted in between or added at the bottom, would get the same formula added in column B.
Will matching items always be consecutive? It seems that way since you're comparing each Item cell to the cell above it right in your formula logic. That breaks an [unwritten?] rule of spreadsheet normalization; values' addresses themselves generally should not be treated as data.
If you're committed to it though, have you considered explicitly using location as a data source? Example:
=ARRAYFORMULA(IFS(
NOT(LEN(A3:A40)),,
ROW(A3:A40)-3-MATCH(A3:A40,A$3:A$40,0)<=VLOOKUP(VLOOKUP(A3:A40,Items!$A$2:$D,2,false),DataPerColor!$A$2:$B,2,false),ROW(A3:A40)-3-MATCH(A3:A40,A$3:A$40,0),
true,
))
Just like your formulas, all that does in English is:
for each row,
if there's no Item, don't output any ItemData,
if the number that belongs in this cell¹ is less than or equal to the lookup, print it,
otherwise, don't output any ItemData
But then what is ¹ "the number that belongs in this cell" and how can we calculate it without using column B? I abuse locations of things to get it. Looking down your row B, each number that appears is just:
this row's number, minus
the row where items start [always 3], minus
the row number [in just the Item rows] of the first row containing this row's Item
Using the second-to-last ItemC as an example: the first ItemC is the 16th item listing, and the one we're looking up… the "second-to-last ItemC" is in row 21 of the sheet. 21-3-16 = 2 …the number you wanted.
If you can stomach that, it's a single formula and does work according to your specifications.
The answer to the question "Highlighting Duplicate Rows in Google Sheets" works perfectly for highlighting the duplicate cells in a column. What I want to do is go one step further and highlight the rows that each of those duplicated cells are in.
So if I've got duplicated cells in column c that are highlighted, how do I also highlight the rows?
Thanks!
Here's the current formatting I have to highlight duplicates in Column C.
Current conditional formatting equation
Change "Apply to range" to A1:Z (change Z to the last column you want to highlight), and change the custom formula to =countif($C:$C,$C1)>1. You need to use the absolute reference ($) so that every cell in a row is tested against column C.
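The rule boils down to: a row is highlighted when its column C value occurs more than once anywhere in column C. A minimal Python sketch of that test, with a hypothetical function name and sample data, and using exact string comparison as a simplification:

```python
from collections import Counter


def rows_to_highlight(col_c):
    """Return one flag per row: True when that row's column C value
    occurs more than once anywhere in column C."""
    counts = Counter(col_c)
    return [counts[value] > 1 for value in col_c]


# Hypothetical column C values, one per row.
col_c = ["a", "b", "a", "c"]
flags = rows_to_highlight(col_c)
```

Every cell in a flagged row gets the same answer, because the test depends only on that row's column C value and not on the cell's own column.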
In Google Sheets I am using a FILTER function to pull names into column A and a timestamp into column B. Every time a second occurrence of a name shows up in columns A and B of the list, I want column C next to the prior occurrence to reference the new timestamp. In column D I will then calculate the difference between the name's timestamp and the next occurrence of that same name.
Currently I am using the following formula:
=IFERROR(INDEX(B3:B,MATCH(A2,A3:A,0)))
If I drag this formula down it does what I need it to do. However, the filter keeps adding rows to the bottom of the first two columns, so the formula keeps needing to be dragged down. The durations in column D are calculated with the following formula, which returns an array of results and expands automatically with the filter results:
=IFERROR(ARRAYFORMULA(IF(C2:C="","",C2:C-B2:B)))
I would like my index match formula to do the same, but it seems I cannot use the index formula with an arrayformula.
I attempted to achieve this by using a VLOOKUP combined with an OFFSET for the range. The first row gives me the result I want, but all the subsequent rows are not referencing the offset range, probably because the offset isn't changing with each new array result. Here is that attempt:
=IFERROR(ARRAYFORMULA(VLOOKUP(A2:A,OFFSET(A2:B,1,0),2,FALSE)))
Any ideas how this could be accomplished by placing a formula in one cell, or would this have to be accomplished with a script?
I have added an example spreadsheet of the current method HERE
Thanks in advance for any help.
Formula
Instead of INDEX, MATCH and OFFSET, try the following formula:
=ArrayFormula(IFERROR(VLOOKUP(
TRANSPOSE(VALUE(REGEXEXTRACT(QUERY(TRANSPOSE(
IF(FILTER(ROW(A2:A),LEN(A2:A))<TRANSPOSE(FILTER(ROW(A2:A),LEN(A2:A))),
IF(FILTER(A2:A,LEN(A2:A))=TRANSPOSE(FILTER(A2:A,LEN(A2:A))),
TRANSPOSE(FILTER(ROW(A2:A),LEN(A2:A))),
),)
),,2000000),"(\d+)"))),
FILTER({ROW(A2:A),B2:B},LEN(A2:A)),2,0)))
Formula description
This part creates a square matrix showing the row number of each matching value, but only if it is below the current row:
IF(FILTER(ROW(A2:A),LEN(A2:A))<TRANSPOSE(FILTER(ROW(A2:A),LEN(A2:A))),
IF(FILTER(A2:A,LEN(A2:A))=TRANSPOSE(FILTER(A2:A,LEN(A2:A))),
TRANSPOSE(FILTER(ROW(A2:A),LEN(A2:A))),
),)
This part takes the smallest row that matches the current row (the next occurrence of the row value):
TRANSPOSE(VALUE(REGEXEXTRACT(QUERY(TRANSPOSE( ),,2000000),"(\d+)")))
This part returns the related value, if any, otherwise a blank:
IFERROR(VLOOKUP( ,FILTER({ROW(A2:A),B2:B},LEN(A2:A)),2,0))
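The goal of the formula, stated procedurally, is: for each row, find the nearest later row with the same name and return its timestamp. The Python sketch below shows that logic with a single bottom-up pass instead of a square matrix; it is a different technique from the formula, and the function name and sample data are invented for illustration.

```python
def next_occurrence_timestamps(names, timestamps):
    """For each row, return the timestamp of the next later row with the
    same name, or None when the name does not appear again."""
    next_seen = {}  # name -> timestamp of the closest later occurrence
    result = [None] * len(names)
    # Walk bottom-up so next_seen always holds the nearest row below.
    for i in range(len(names) - 1, -1, -1):
        name = names[i]
        if name == "":
            continue  # blank rows are skipped, like FILTER(..., LEN(A2:A))
        result[i] = next_seen.get(name)
        next_seen[name] = timestamps[i]
    return result


# Hypothetical data for columns A and B (timestamps as plain numbers).
names = ["Ann", "Bob", "Ann", "Bob", "Ann"]
timestamps = [10, 20, 30, 40, 50]

column_c = next_occurrence_timestamps(names, timestamps)
# Column D: difference between the next occurrence and the current row.
column_d = [
    None if nxt is None else nxt - ts
    for nxt, ts in zip(column_c, timestamps)
]
```

Rows whose name never appears again stay empty, which corresponds to the blank that IFERROR returns in the formula.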
In the Google Sheet for my form, I have the answers in Row 2.
There are 109 columns, and for each column I need to check whether the rows below match the contents of Row 2 of that column. On top of that, I need conditional formatting for the cells that DO NOT match the contents of Row 2 in their respective column.
Is there a way to do this without adding a formula to each and every column?
You can do this with conditional formatting - for the "apply to range" section (pretending your data starts in column A and ends in D, although in reality you will put whatever the last column is) enter in
A2:D
then for the rule, choose custom formula and enter in this exact formula:
=if(eq(indirect(address(row(),COLUMN(),4)),indirect(ADDRESS(2,column(),2)))=TRUE,FALSE,TRUE)
This will dynamically highlight all of the answers that do not match the value in row 2 of their own column.
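What the rule computes for each cell is a comparison against the answer in row 2 of the same column. A minimal Python sketch of that check, assuming a hypothetical function name and sample responses, and using exact comparison as a simplification:

```python
def mismatches(answer_key, responses):
    """Return a grid of flags, True where a response differs from the
    answer key entry for its own column."""
    return [
        [cell != key for cell, key in zip(row, answer_key)]
        for row in responses
    ]


# Hypothetical data: row 2 holds the answers, later rows hold responses.
answer_key = ["A", "C", "B", "D"]
responses = [
    ["A", "C", "B", "D"],
    ["A", "B", "B", "A"],
]

flags = mismatches(answer_key, responses)
```

A single rule covers every column because each cell is compared with the key entry at its own column position, which is the role of column() in the formula.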