Highlight duplicates within cells - google-sheets

I need help finding duplicate words within single text cells in Google Sheets.
The data has a TRIM function applied.
e.g. =if(B10="","",trim(concatenate($I$3," ",trim(B10)," ",$J$3)))
The solution I tested needs some modifications.
Requirements: the number of duplicates in one cell is not important. It can be any word, in any position; it has to match the exact word, and the duplicates should not be removed.
For example, in a cell containing "Great Great Expectations" the formula should detect "Great" repeated twice; where no duplicate words are found, nothing should change.
The desired outcome is to highlight the cells in column "I" that contain duplicate words, or to create a message box listing them.
Tested solutions include VLOOKUP and an array formula as a standalone "dupe"/"no dupe" comparison.
The only solution that had any success was:
=IF(SUM(N(IFERROR(FIND(" "&MID(A2,ROW(OFFSET(A$1,,,LEN(A2))), MMULT(FIND(" ", {""," "} &A2&" ",ROW(OFFSET(A$1,,,LEN(A2)))),{1;-1}))&" ", " "&A2&" "), LEN(A2)+1)<ROW(OFFSET(A$1,,,LEN(A2)))))>0, "Dupes", "All good")
This consistently returns the correct result for standalone data, but I am not sure how to modify it to:
Run automatically
Highlight cells with duplicates
I am open to any solutions.
Thanks in advance
#Harun24Hr
The array formula returns "1" for blank lines
and a Ref range error for any other cell.
Conditional formatting is applying colour formatting to the entire range, with the data no longer visible.
Any ideas?
Thanks

Try the formula below. When you apply it as a conditional formatting rule, it is applied automatically to every cell in the selected range.
=ArrayFormula(INDEX(SORT(COUNTIFS(TRANSPOSE(SPLIT(A1," ")),UNIQUE(TRANSPOSE(SPLIT(A1," ")))),1,FALSE),1)>1)
Conditional formatting settings.
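For readers who want to check the logic outside the sheet, here is a minimal Python sketch of what the formula above appears to do: split the cell on spaces, count each word, and flag the cell when the highest count exceeds 1. The function name has_duplicate_word is hypothetical. One assumed difference: this sketch compares words case-sensitively, whereas COUNTIFS in Sheets should ignore case.

```python
from collections import Counter


def has_duplicate_word(cell_text: str) -> bool:
    """Return True when any space-separated word occurs more than once."""
    words = cell_text.split()        # roughly SPLIT(A1, " ")
    if not words:                    # blank cell: nothing to flag
        return False
    counts = Counter(words)          # roughly COUNTIFS over the UNIQUE words
    return max(counts.values()) > 1  # roughly INDEX(SORT(..., FALSE), 1) > 1


print(has_duplicate_word("Great Great Expectations"))  # True
print(has_duplicate_word("Great Expectations"))        # False
```

A blank cell returns False here, which is the behaviour you would want from the highlight rule as well.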

Related

Compiling a list using INDEX but need to skip certain rows

I'm compiling a list based on the first answers received between columns N and AF.
I'm using these two formulas:
=INDEX(N2:O2,MATCH(FALSE,ISBLANK(N2:O2),0))
and
=INDEX(R2:AF2,MATCH(FALSE,ISBLANK(R2:AF2),0))
Is there a way to combine them while not searching in columns P & Q?
These are generated from a Form response so can't just be switched around.
try:
=INDEX({N2:O2, R2:AF2}, MATCH(FALSE, ISBLANK({N2:O2, R2:AF2}), 0))
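As a rough illustration of the idea (not a replacement for the formula), the array literal {N2:O2, R2:AF2} simply glues the two ranges together and skips P and Q, after which MATCH finds the first cell that is not blank. A small Python sketch, with the hypothetical helper first_non_blank and made-up sample values:

```python
def first_non_blank(*ranges):
    """Return the first non-empty value across the given ranges, in order."""
    for cells in ranges:          # ranges are searched left to right
        for value in cells:
            if value != "":       # same test as NOT(ISBLANK(...))
                return value
    return None                   # the sheet formula would show #N/A here


n_to_o = ["", ""]                 # stands in for N2:O2
r_to_af = ["", "Ann", "Bob"]      # stands in for R2:AF2
print(first_non_blank(n_to_o, r_to_af))  # Ann
```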
If Sheet1 is an intake sheet of form results, you should not add any data, formulas or even formatting to that sheet. It virtually always causes issues. A form intake sheet should be left exactly as it is. A new sheet can then be used to bring over the results of the form intake sheet as you want to see them.
However, since you didn't specify any of that, I will supply a formula written to work in the same sheet as your posted example and in-sheet examples.
Clear an entire column and place the following in the top cell of that column:
=ArrayFormula({"Attendee Name"; IF(E2:E="",,IFERROR(REGEXEXTRACT(TRIM(TRANSPOSE(QUERY(TRANSPOSE(FILTER(IF(N2:AK="",,N2:AK&"~"),N1:AK1=N1)),,COLUMNS(N1:AK1)))),"\s*([^~]+)"),"(none listed)"))})
This one formula will produce a header (the text of which you can change within the formula itself as you like) and all valid results for all rows.
The inner IF will append a tilde (~) to any non-null entries in the range N2:AK.
FILTER will keep only those columns in this range where the header is the same as the header in N1 (i.e., "Attendee Name").
TRANSPOSE(QUERY(TRANSPOSE( ),,COLUMNS( ))) is colloquially called a "Query smash." It will form one cell from all horizontal results per row.
TRIM will cut any leading spaces and form a true string.
REGEXEXTRACT will pull the text from the first non-space character up to but not including the first tilde (from those appended in the first step); in other words, the first full valid entry from any column.
IFERROR will return a message if there is an error, with the likely error being that there were no valid entries for "Attendee name" in any column.
The outer IF will leave the cell blank if no training event exists in E2:E.
{ } forms a virtual array that places the header over all other results.
ArrayFormula( ) signifies that multiple results will be processed at once.
Because this is an array formula that is being "asked" to process every row, you cannot manually type into any cell of this results column. If you do, you will "break the array"; everything except what you just typed will disappear, leaving only an error in the formula cell. If you need to add or change a name, you need to do that in the raw results range (e.g., manually type a name or a new name in Col N), which will then turn up in the formula output range.
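The steps described above can be sketched in Python to make the data flow easier to follow. This is only an approximation under stated assumptions: the function name first_entry_per_row and the sample data are hypothetical, and the outer IF on E2:E is left out.

```python
import re


def first_entry_per_row(headers, rows, wanted_header, fallback="(none listed)"):
    """For each row, return the first non-empty value under a matching header."""
    # FILTER: keep only the columns whose header equals the wanted header
    keep = [i for i, h in enumerate(headers) if h == wanted_header]
    results = []
    for row in rows:
        # inner IF: append a tilde to every non-empty entry
        tagged = [row[i] + "~" for i in keep if row[i] != ""]
        # query smash + TRIM: one string per row, no leading spaces
        smashed = " ".join(tagged).strip()
        # REGEXEXTRACT: everything up to (not including) the first tilde
        match = re.match(r"\s*([^~]+)", smashed)
        # IFERROR: fall back to a message when nothing was found
        results.append(match.group(1) if match else fallback)
    return results


headers = ["Attendee Name", "Email", "Attendee Name"]
rows = [
    ["", "a@example.com", "Bob Smith"],
    ["Ann Lee", "", "Carl"],
    ["", "", ""],
]
print(first_entry_per_row(headers, rows, "Attendee Name"))
# ['Bob Smith', 'Ann Lee', '(none listed)']
```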

Google Sheets: extracting numbers from multiple cells that contain text and numbers for one column of data?

I'm working in Google Sheets. I have a few hundred cells that contain text and numbers. The cells contain employee names and their ID#s. I want to extract the ID#s and compile them into one list. I have the formula below that will let me complete the task, but only for one cell, not for a range of cells (even if I select a range and add it to the formula):
=transpose(split(regexreplace(regexreplace(A1,"\s\d+\s"," "),"[^\d\.]"," ")," "))
For example, cell A1 would contain, "Tammy - 123456, Bob - 654987, Mike - 321456" and repeat similar until you get to something like cell DT75 "Marcus - 35768, Bruce - 95126, Lisa - 789123". Some cells in the sheet are blank. The above formula will give me the ID#s from A1 in their own cells:
123456
654987
321456
I'd like to get one column of all the ID#s in the sheet that I could then copy and paste into a completely different proprietary database. Am I coming at this the wrong way? Is a script a better angle?
Since you want your original range to be multi-column, you could try a slightly modified version of player0's formula, like this:
Use CONCATENATE to put all data in a single string.
REGEXREPLACE to remove everything but the numbers from your string.
SPLIT to divide your string into several cells, blank space being the separator.
FLATTEN to put all resulting values into a single column.
=FLATTEN(SPLIT(REGEXREPLACE(CONCATENATE(A:DT), "[A-Za-z-,]+", )," "))
try:
=INDEX(FLATTEN(SPLIT(QUERY(REGEXREPLACE(A1:A, "[A-Za-z-,]+", ),,9^9), " ")))
for multi-column:
=INDEX(FLATTEN(SPLIT(FLATTEN(QUERY(REGEXREPLACE(A1:C, "[A-Za-z-,]+", ),,9^9)), " ")))
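Since the question also asks whether a script is a better angle, here is a small Python sketch of the same idea for comparison. Note the swapped technique: the formulas above strip letters, hyphens and commas and then split on spaces, while this sketch pulls out runs of digits directly. The function name extract_ids is hypothetical, and the input is assumed to be the cell texts already read into a list.

```python
import re


def extract_ids(cells):
    """Collect every run of digits from a list of cell texts, in order."""
    ids = []
    for text in cells:                    # blank cells contribute nothing
        ids.extend(re.findall(r"\d+", text))
    return ids


cells = [
    "Tammy - 123456, Bob - 654987, Mike - 321456",
    "",
    "Marcus - 35768, Bruce - 95126, Lisa - 789123",
]
for id_number in extract_ids(cells):
    print(id_number)
```

One caveat with either approach: a name that contains digits would also be picked up.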

check for duplicate rows (not just a single cell) in google sheets

Hello, I would like to check for duplicate rows, not just a single cell, in Google Sheets. I would like to apply this formula in conditional formatting so that it highlights the cells.
Here is a sample of what I want to catch:
I would like to catch a duplicate row, group, or pair of cells in exact order. Can anybody help me with the formula?
I tried searching and there seems to be no article about it yet. I also tried using COUNTIF on both rows and multiplying the results, but that does not solve it for a pair.
Let's say you have the following data:
https://ibb.co/sFhjN34
First, select the range A1:B1001.
Then, paste the following formula into the custom formula bar.
=AND(A1<>"",COUNTIF(ARRAYFORMULA($A:$A&$B:$B),index(ARRAYFORMULA($A:$A&$B:$B),ROW($A1),))>1)
Explanation:
ARRAYFORMULA($A:$A&$B:$B)
This creates a virtual array that concatenates columns A and B.
E.g. juice crackers -> juicecrackers
index(ARRAYFORMULA($A:$A&$B:$B),ROW($A1),)
Since conditional formatting loops through all rows of the range you selected earlier (A1:B1001), this part steps through ROW($A_) so that index(ARRAYFORMULA($A:$A&$B:$B),ROW($A_),) returns the combined word for the current row.
COUNTIF(ARRAYFORMULA($A:$A&$B:$B),index(ARRAYFORMULA($A:$A&$B:$B),ROW($A1),))>1
This counts how many times the combined word occurs in the array ARRAYFORMULA($A:$A&$B:$B).
If the count is more than 1, the row is a duplicate.
A1<>"" ignores blank cells.
Combine the two conditions. AND(A1<>"",COUNTIF(ARRAYFORMULA($A:$A&$B:$B) ....)
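The same counting logic can be sketched in Python, which may help when checking the rule against sample data. The function name duplicate_row_flags and the sample rows are hypothetical. One deliberate difference: the sketch compares (A, B) pairs as tuples, so "ab" + "c" and "a" + "bc" are not confused, which plain concatenation without a separator would do.

```python
from collections import Counter


def duplicate_row_flags(rows):
    """Return one True/False flag per row: True when the (A, B) pair repeats."""
    # count each pair, skipping rows whose first cell is blank (A1<>"")
    counts = Counter((a, b) for a, b in rows if a != "")
    return [a != "" and counts[(a, b)] > 1 for a, b in rows]


rows = [
    ("juice", "crackers"),
    ("milk", "bread"),
    ("juice", "crackers"),
    ("", "x"),
    ("ab", "c"),
    ("a", "bc"),
]
print(duplicate_row_flags(rows))
# [True, False, True, False, False, False]
```

In the sheet, a similar safeguard would be to put a separator between the two columns when concatenating, assuming the separator never occurs in the data.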
It's not quite as perfect as you'd like, but I think this is a start:
=AND($A1=$A2,$B1=$B2)
This doesn't highlight the last row of any matches it finds, but it might be serviceable for what you want (ex. if Row 1 through Row 3 match, it will only highlight Row 1 and Row 2).
Just change the columns to match the two you're working with, and then if you want it to highlight the entire row, change the Apply to range to A1:Z or however many columns you have.

How to highlight PARTIAL matching duplicates across 1 column in Google Sheets using conditional formatting

As the title says, I'm trying to highlight partial duplicates for 1 column in Google Sheets using conditional formatting.
Here's what I have so far:
=if(C1<>"",Countif(C$1:C,left(C1,5)& "*") > 1)
This works, but the issue is that LEFT makes the formula highlight only cells that match from the start.
So for instance, the formula won't highlight "1exampletest" and "2exampletest" because the first 5 characters are not the same...which is something I want the formula to be able to highlight.
Does anyone know the right formula for detecting partial duplicates regardless of when the duplicate is occurring?
There are several ways to do it, which may or may not work for you (because you did not provide a genuine sample):
_______________________________________________________________
or with REGEXMATCH like: =REGEXMATCH(A1, "example")
_______________________________________________________________
=OR(IF(C1<>"",COUNTIF(C$1:C,LEFT(C1,5)&"*")>1),IF(C1<>"",COUNTIF(C$1:C,"*"&RIGHT(C1,5))>1))
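To see what the combined LEFT/RIGHT rule above actually matches, here is a rough Python equivalent. It is a sketch under assumptions: the name partial_duplicate_flags is hypothetical, matching is case-sensitive here (COUNTIF should ignore case), and like the formula it only catches overlaps at the start or the end of a cell, not in the middle.

```python
def partial_duplicate_flags(values, n=5):
    """Flag values that share their first or last n characters with another value."""
    flags = []
    for value in values:
        if value == "":                 # C1<>"" : blank cells are never flagged
            flags.append(False)
            continue
        head, tail = value[:n], value[-n:]
        # COUNTIF(C$1:C, LEFT(C1,5)&"*") counts the cell itself too, hence > 1
        prefix_hits = sum(1 for other in values if other.startswith(head))
        # COUNTIF(C$1:C, "*"&RIGHT(C1,5))
        suffix_hits = sum(1 for other in values if other.endswith(tail))
        flags.append(prefix_hits > 1 or suffix_hits > 1)
    return flags


print(partial_duplicate_flags(["1exampletest", "2exampletest", "other", ""]))
# [True, True, False, False]
```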

Exclude empty cells when using SORT() in descending order

Use case
Sort formula against other sheet but exclude empty values after last item. Empty values get sorted at top, creating a whole bunch of blank space, and then data I care about.
=SORT('other sheet'!A1:C36,'other sheet'!D1:D36,FALSE)
A-C is the data I wish to show.
D is the column I wish to sort on.
Problem
The "36" must be manually updated each time I add/remove a row to 'other sheet'.
Possible solution would be:
Get the row number of the last non-empty cell in a column in Google Sheets as [last row].
=SORT('other sheet'!A1:C[last row],'other sheet'!D1:D[last row],FALSE)
What I tried
Lookup("",'other sheet'!A:A)
Result: #N/A
No examples in Help for finding empty cells
Get the last non-empty cell in a column in Google Sheets
Returns value not address. Could find that value in row but not as efficient. Also what if value is found in more than one place?
** Example Spreadsheet **
https://docs.google.com/spreadsheets/d/1bqiVe3pBYDJFtrO4EysSKTDq17lzY5r2b8sPV-KnTdI/edit#gid=0
I cannot recreate this in a new spreadsheet. I believe this may be a bug.
If you want to find the last row, you can use the following formula.
=SORT(INDIRECT("'other sheet'!A1:C"&QUERY(TRANSPOSE(FILTER(ROW('other sheet'!A:A),'other sheet'!A:A="")),"select Col1")),INDIRECT("'other sheet'!D1:D"&QUERY(TRANSPOSE(FILTER(ROW('other sheet'!A:A),'other sheet'!A:A="")),"select Col1")),FALSE)
The QUERY(TRANSPOSE(FILTER(...)),"select Col1") part is a formula to find the first blank cell in column A in 'other sheet'.
The INDIRECT(...) part returns a range reference based on that row number.
I hope this helps, even though it seems to be a very long time since your question.
Edited: I just found out that query can limit rows.
=SORT(INDIRECT("'other sheet'!A1:C"&QUERY(FILTER(ROW('other sheet'!A:A),'other sheet'!A:A=""),"limit 1")),INDIRECT("'other sheet'!D1:D"&QUERY(FILTER(ROW('other sheet'!A:A),'other sheet'!A:A=""),"limit 1")),FALSE)
Edited: Sorry, I didn't read the question carefully. If you want to remove the first blank cell when sorting in descending order, simply wrap the formula in a QUERY function with an empty query string.
=QUERY(SORT(INDIRECT("'other sheet'!A1:C"&QUERY(FILTER(ROW('other sheet'!A:A),'other sheet'!A:A=""),"limit 1")),INDIRECT("'other sheet'!D1:D"&QUERY(FILTER(ROW('other sheet'!A:A),'other sheet'!A:A=""),"limit 1")),FALSE),"")
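The intent of these formulas, as I read it, is: find where the data ends, keep only the rows before that point, and sort them in descending order on column D. A minimal Python sketch of that intent follows. The name sort_desc_without_blanks and the sample rows are hypothetical, and unlike the formula (whose range ends at the first blank row itself) the sketch stops just before it.

```python
def sort_desc_without_blanks(rows, key_index=3, keep_columns=3):
    """Keep rows before the first blank in column A, sort descending on column D."""
    # index of the first row whose column A is blank, or all rows if none is
    first_blank = next((i for i, r in enumerate(rows) if r[0] == ""), len(rows))
    data = rows[:first_blank]
    ordered = sorted(data, key=lambda r: r[key_index], reverse=True)
    return [r[:keep_columns] for r in ordered]   # show columns A-C only


rows = [
    ["a", 1, "x", 2],
    ["b", 2, "y", 5],
    ["c", 3, "z", 1],
    ["", "", "", ""],
    ["", "", "", ""],
]
print(sort_desc_without_blanks(rows))
# [['b', 2, 'y'], ['a', 1, 'x'], ['c', 3, 'z']]
```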
