I am trying to set up a Google Sheet arrayformula to find any duplicates without identifying the first instance of that value. I have a table with the values below:
A
1
2
3
2
4
2
Since number 2 is a duplicate value, I would like to identify it with a formula, but I do not want to flag its first occurrence in the column. This formula gets me the results I am looking for: =IF(COUNTIF($A$2:$A2,A2)=1, "Unique", "Duplicate")
A B
1 Unique
2 Unique
3 Unique
2 Duplicate
4 Unique
2 Duplicate
But when I try to convert this to an arrayformula, so that I don't have to drag the formula down manually when new rows are added, I get a different result. This is the arrayformula I used: =ARRAYFORMULA(IF($A$2:$A="", "", IF(COUNTIF($A$2:$A,A2:A)=1, "Unique", "Duplicate")))
A B
1 Unique
2 Duplicate
3 Unique
2 Duplicate
4 Unique
2 Duplicate
The problem is that the first value is also identified as duplicate. What would be the best way to convert =IF(COUNTIF($A$2:$A2,A2)=1, "Unique", "Duplicate") into an arrayformula?
You want this:
=ArrayFormula(IF(A2:A="",,IF(COUNTIFS(A2:A,A2:A,ROW(A2:A),"<="&ROW(A2:A))=1,"Unique","Duplicate")))
The problem with your COUNTIF is that you essentially asked "Is this number unique against every other number in this column? Or is it duplicated anywhere else in this column?" That is why 2 says "Duplicate" in all instances: because each of them is duplicated "somewhere else" in the column.
What you really want to be asking is "Up to this row, has this number already appeared?" That requires COUNTIFS with a second condition that only counts rows whose ROW() number is "up to" (i.e., "<=") the current row.
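To make the difference between the two counts concrete, here is a minimal Python sketch of the logic (a model of the idea, not a Sheets formula). The function names `label_whole_column` and `label_running` are made up for illustration, and blank cells are ignored for brevity.

```python
from collections import Counter

def label_whole_column(values):
    """Mimics COUNTIF(A2:A, A2:A): each value is counted over the whole column."""
    totals = Counter(values)
    return ["Unique" if totals[v] == 1 else "Duplicate" for v in values]

def label_running(values):
    """Mimics COUNTIFS with ROW(...) <= current row: only rows seen so far count."""
    seen = Counter()
    labels = []
    for v in values:
        seen[v] += 1  # the count includes the current row, like "<="
        labels.append("Unique" if seen[v] == 1 else "Duplicate")
    return labels

column_a = [1, 2, 3, 2, 4, 2]
print(label_whole_column(column_a))
# ['Unique', 'Duplicate', 'Unique', 'Duplicate', 'Unique', 'Duplicate']
print(label_running(column_a))
# ['Unique', 'Unique', 'Unique', 'Duplicate', 'Unique', 'Duplicate']
```

The second function leaves the first 2 as "Unique", which is the behaviour the question asks for.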
This is a very nice article on finding duplicate entries, but I would like to add one missing point: wrapping the values in TRIM makes the check more reliable. If a duplicate cell has spaces at the beginning or end, the formulas above will not treat it as a duplicate.
I did this to monitor my employees' worksheets as well, i.e.:
=ARRAYFORMULA(if(len(TRIM(C9:C)),(if((countif(TRIM(C9:C),TRIM(C9:C)))>1,"duplicate",)),))
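The effect of trimming can be sketched in Python as follows. This is only a model of the idea, assuming text values; `normalize` and `label_running_trimmed` are hypothetical names. Like TRIM in Sheets, the helper drops leading and trailing spaces and collapses repeated inner spaces before values are compared.

```python
def normalize(text):
    """Strip leading/trailing whitespace and collapse inner runs to one space."""
    return " ".join(text.split())

def label_running_trimmed(values):
    """Flag every occurrence after the first, comparing normalized text."""
    seen = {}
    labels = []
    for raw in values:
        key = normalize(raw)
        if key == "":
            labels.append("")  # blank cells stay blank
            continue
        seen[key] = seen.get(key, 0) + 1
        labels.append("Unique" if seen[key] == 1 else "Duplicate")
    return labels

print(label_running_trimmed(["abc", " abc", "abc ", "", "x"]))
# ['Unique', 'Duplicate', 'Duplicate', '', 'Unique']
```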
The image probably explains best what I would like to achieve:
The table I would like to create (at the moment I have only the first three columns):
I would like "New Volume" to hold the same value whenever the keyword repeats. I have the first three columns but cannot figure out which formula would create the "New Volume" column.
I would appreciate the help.
This is a vlookup job.
=vlookup(D3;$D$3:$E$8;2;false)
It searches column D for the identifier (itself taken from column D) and returns the value from the second column of the range. With the last argument set to false, vlookup does an exact match and always returns the first match it finds.
To avoid copying down the formula, you can nest it with arrayformula:
=ArrayFormula(ifna(vlookup(D3:D;$D$3:$E;2;false)))
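Why this fills "New Volume" with one value per keyword can be seen in a small Python sketch of a first-match lookup. The function `vlookup_exact` and the sample rows are invented for illustration; they are not taken from the linked sheet.

```python
def vlookup_exact(key, table, index):
    """Return column `index` (1-based) of the first row whose first cell equals key."""
    for row in table:
        if row[0] == key:
            return row[index - 1]
    return None  # stands in for #N/A, which IFNA turns into a blank

# Made-up keyword/volume pairs standing in for columns D and E.
rows = [("shoes", 100), ("hats", 40), ("shoes", 70), ("hats", 10)]

new_volume = [vlookup_exact(keyword, rows, 2) for keyword, _ in rows]
print(new_volume)  # [100, 40, 100, 40]
```

Every repeat of a keyword picks up the volume from the first row where that keyword appears.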
My solution is here:
https://docs.google.com/spreadsheets/d/1RpSsb6DmUs6lcPmZ1R6uPW-a3iWs9XEnSpc3b3XiihI/edit?usp=sharing
I just wanted a simple way to number columns or rows in a Google Sheet, and most answers I've found offer many options that are far more complicated than I needed them to be.
Example: I want to number every column in the active sheet, starting with 1 for Column A and counting up by 1, regardless of the content of any other cells on the sheet. If I add columns to the sheet later, I want them to automatically update with the correct column numbers.
Another way is to use SEQUENCE.
So putting =SEQUENCE(99) in A1 would number the first 99 rows, from 1 to 99.
To number columns, just rotate that array, with TRANSPOSE.
So if A1 held =TRANSPOSE(SEQUENCE(26))
that would number columns A to Z with the numbers 1 to 26.
If you want to number both columns and rows, try:
in A1: =SEQUENCE(999), and
in B1: =TRANSPOSE(SEQUENCE(25,1,2))
I realise that this is numbering a specific number of rows, or columns, but I often find that very useful. You could modify this to number all columns or rows by adding some count to determine the total number of rows or columns, and using that in place of the first parameter for the SEQUENCE function.
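For readers who want to check what these calls produce, here is a rough Python model of SEQUENCE and TRANSPOSE. The functions `sequence` and `transpose` are stand-ins written for this sketch, assuming SEQUENCE fills its grid row by row.

```python
def sequence(rows, columns=1, start=1, step=1):
    """Rough model of SEQUENCE: a rows x columns grid filled row by row."""
    return [[start + step * (r * columns + c) for c in range(columns)]
            for r in range(rows)]

def transpose(grid):
    """Swap rows and columns, like TRANSPOSE."""
    return [list(col) for col in zip(*grid)]

print(sequence(3))                           # [[1], [2], [3]]
print(transpose(sequence(25, 1, 2))[0][:3])  # [2, 3, 4]
print(transpose(sequence(25, 1, 2))[0][-1])  # 26
```

So the formula in B1 labels columns B to Z with 2 to 26, continuing from the 1 that =SEQUENCE(999) puts in A1.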
The simplest way I've found to do this is by putting either of the following formulas in A1:
For numbering rows: =ArrayFormula(ROW(A:A))
And for columns: =ArrayFormula(COLUMN(1:1))
After putting the formula in A1, I'll usually hide the column or row the formula is in so I don't accidentally change or delete it.
If I want the counting to start at 1 on the 2nd, 3rd, or 4th row or column, then adding a -1, -2, or -3 respectively to the end of the formula gets that done.
For example: To number columns starting with 1 in Column C, the formula I put in A1 is =ArrayFormula(COLUMN(1:1)-2).
This may be way more basic than most people on this site are generally looking for, but for some reason it took me an unexpectedly long time to find it or figure it out, so I thought maybe someone else would find it useful in the future.
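One side effect of the offset is worth seeing: the cells before the starting column show 0 or negative numbers. A tiny Python model, where `column_labels` is a hypothetical helper and the sheet is assumed to have only five columns:

```python
def column_labels(total_columns, offset=0):
    """Model of =ArrayFormula(COLUMN(1:1) - offset) across total_columns columns."""
    return [col - offset for col in range(1, total_columns + 1)]

print(column_labels(5))            # [1, 2, 3, 4, 5]
print(column_labels(5, offset=2))  # [-1, 0, 1, 2, 3]
```

With an offset of 2, columns A and B show -1 and 0, and the count reaches 1 in column C.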
I am trying to filter data by a column reference.
This does not work for me: =UNIQUE(A2:C6)
(Image: the formula not working)
What I actually want is shown here:
(Image: the result I want)
I want to find unique rows based on a single column: column B, which holds the Mo No.
Solution
In this case a simple UNIQUE statement is not enough. You are looking for a function that takes into account only one column for your uniqueness check.
SORTN is best suited for this job.
=SORTN(A1:C7,7,2,2,1)
Here is how it works:
n: The number of items to return. Must be greater than 0.
I have 7 rows, so at most 7 results.
display_ties_mode: A number representing the way to display ties.
In this case 2: show at most the first n (7) rows after removing duplicate rows.
sort_column1: The index of the column in range or a range outside of range containing the values to sort by.
In this case 2 as well, since the uniqueness check is performed on column B.
is_ascending: TRUE or FALSE indicating whether to sort sort_column in ascending order.
This is up to you; the 1 in the formula above sorts in ascending order.
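The combination of arguments above can be approximated in Python like this. It is a sketch only: `sortn_unique` is a made-up name, the sample rows are invented, and it assumes that for rows sharing a value in the sort column the first one in sort order is kept.

```python
def sortn_unique(rows, n, sort_column, ascending=True):
    """Sketch of SORTN(range, n, 2, sort_column, is_ascending)."""
    key = sort_column - 1  # Sheets column indexes are 1-based
    ordered = sorted(rows, key=lambda row: row[key], reverse=not ascending)
    result, seen = [], set()
    for row in ordered:
        if row[key] in seen:
            continue  # same value in the sort column: treat as a duplicate
        seen.add(row[key])
        result.append(row)
    return result[:n]

# Made-up rows: name, Mo No., city.
data = [("Ann", 222, "Pune"), ("Bob", 111, "Delhi"), ("Ann", 222, "Goa")]
print(sortn_unique(data, 7, 2))
# [('Bob', 111, 'Delhi'), ('Ann', 222, 'Pune')]
```

Note that the result comes back sorted by the chosen column rather than in the original row order.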
My spreadsheet contains 2 sheets with a different number of columns. I want to add a column that will display true or false (or any other 2 opposite values) for each row, depending on whether this row satisfies 2 criteria: sheet1!col1=sheet2!col1 and sheet1!col2=sheet2!col2.
You'll find an illustration below.
I've tried using
ARRAYFORMULA(VLOOKUP(A1&B1, {Sheet1!A1:A4&Sheet1!B1:B4,Sheet1!C1}, 3))
but I get an error message
vlookup evaluates to an out of bound range
So I wanted to try
QUERY({Sheet1!A1:B4,A1:B5}, "Select C where ")
but I couldn't figure out how to write the condition where (sheet1)col1=(sheet2)col1 & (sheet1)col2=(sheet2)col2 and I also don't know if I can work with tables of different dimensions. I finally tried
=MATCH(A1&B1,{Sheet1!A1:A&Sheet1!B1:B})
but it always returns 1.
Any idea please?
Sheet 1
Sheet 2
Your first formula is almost right. You are getting the error message because there is only one column in the curly brackets, so the column index has to be 1 rather than 3. Change it to
=ArrayFormula(vlookup(A1&B1,{Sheet2!A:A&Sheet2!B:B},1,false))
and add the 'false' to make sure it only does exact matches.
To make the query work you need the right syntax to access cells in the current sheet:
=query(Sheet2!A:B," select A,B where A='"&A1&"' and B='"&B1&"'")
To make the match work, you need to enter it as an array formula and add a zero to specify exact match:
=ArrayFormula(MATCH(A1&B1,{Sheet2!A:A&Sheet2!B:B},0))
However I would take flak from my colleagues if I didn't point out that there is an issue with the vlookup and match as shown above - toto&moto would match with not just toto&moto, but also with tot&omoto etc. The way round this is to add a separator character e.g.
=ArrayFormula(vlookup(A1&"|"&B1,{Sheet2!A:A&"|"&Sheet2!B:B},1,false))
=ArrayFormula(MATCH(A1&"|"&B1,{Sheet2!A:A&"|"&Sheet2!B:B},0))
These still need some tidying up if they are to report Yes and No, and also not to give false positive on blank rows - also the vlookup and match can be written as self-expanding array formulas - but that is the short answer to the question.
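The separator issue is easy to reproduce outside Sheets. Here is a minimal Python sketch; `find_pair` and the two sample rows are hypothetical, and the function returns a 1-based position like MATCH, or None when nothing matches.

```python
def find_pair(pair, table, sep=""):
    """Join both columns with sep and look for an exact match, like MATCH(..., 0)."""
    needle = sep.join(pair)
    keys = [sep.join(row) for row in table]
    return keys.index(needle) + 1 if needle in keys else None

# Made-up Sheet2 rows: both join to "totomoto" when no separator is used.
sheet2 = [("tot", "omoto"), ("toto", "moto")]

print(find_pair(("toto", "moto"), sheet2))       # 1 (false match on row 1)
print(find_pair(("toto", "moto"), sheet2, "|"))  # 2 (the correct row)
print(find_pair(("to", "tomoto"), sheet2, "|"))  # None
```

For the true/false column in the question, testing `find_pair(...) is not None` plays the role of wrapping the formula in a check for a missing match.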
I have 5 columns of numbers that I want to sort per row into another set of columns. I figured I need to use small() (e.g. small(a2:e2,1) for f2; small(a2:e2,2) for g2 and so on). Is there a way to iterate this for the next rows, if possible using only native Google spreadsheet formulas?
Thanks in advance
I was able to make a temporary workaround, but I had to use 3 cheat columns. It looks OK for now, but I imagine it will be troublesome for really large numbers.
Here's a sample sheet for reference: https://docs.google.com/spreadsheets/d/1MQTP2XkRsPRAnPQ5wLhkR8JoNVY6YOExVlOkkX8UeRs/edit#gid=0
The original data are in A3:E
The first cheat column (G3:G) simply creates a column of numbers from 1 to the largest number found in the source data. 1-9 is changed to 01-09 for easier searching. "#" is then added at the end; this will come in handy later:
Cheat Column 1 =filter(if(row(A:A)=max(A:E)+1,"#",text(row(A:A),"00")),row(A:A)<=max(A:E)+1)
The second cheat column (H3:H) combines each row into a string separated by "-" with a "#" marker:
Cheat Column 2 =filter(text(A3:A,"00")&"-"&text(B3:B,"00")&"-"&text(C3:C,"00")&"-"&text(D3:D,"00")&"-"&text(E3:E,"00")&"#",A3:A<>"")
The last cheat column (I3:I) sorts each line (from cheat column 2) by looking for each number from cheat column 1, from 01 up to the max number, and then the "#" char (this ensures that each line will still have the # end marker). "Find" will return the "position" of each number, or an error if it's not found. By using "if", we can make "find" return the actual number or "" instead.
=filter(arrayformula(if(iferror(find(transpose(filter(G3:G,G3:G<>"")),H3:H),""), transpose(filter(G3:G,G3:G<>"")),"")),A3:A<>"")
The formula above creates as many columns as there are numbers from cheat column 1. To prevent this, a "-" is added to each number then "Concatenate" is used to combine everything into one massive string with each set separated by "#". The string is then split using the "#" marker.
Cheat Column 3 =transpose(split(concatenate(filter(arrayformula(if(iferror(find(transpose(filter(G3:G,G3:G<>"")),H3:H),""),"-"&transpose(filter(G3:G,G3:G<>"")),"")),A3:A<>"")),"#"))
Each number is then separated into each corresponding column by using mid().
Small 1 =filter(mid(I3:I,2,2)*1,A3:A<>"")
Small 2 =filter(mid(I3:I,5,2)*1,A3:A<>"")
Small 3 =filter(mid(I3:I,8,2)*1,A3:A<>"")
Small 4 =filter(mid(I3:I,11,2)*1,A3:A<>"")
Small 5 =filter(mid(I3:I,14,2)*1,A3:A<>"")
Note that the formula above is only for numbers 1-99. For larger numbers, the Text() formulas should have more zeroes to correspond to the number of digits of the biggest number. The Mid() formulas should also be adjusted accordingly.
I would like to stress that I am very far from being a spreadsheet expert and that this solution is very "unoptimized". It requires several cheat columns; with the first one even having more rows than the original data. If anyone can help me get rid of the cheat columns (or at least the first one) I will be very grateful.
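For anyone trying to follow the cheat columns, the core trick for a single row can be modelled in Python roughly as below. This is a simplified sketch, not a translation of the formulas: `sort_row_via_strings` is a made-up name, the candidate list runs only up to that row's own maximum, and the "#" end marker is left out.

```python
def sort_row_via_strings(row, width=2):
    """Pad, join, then probe candidates in ascending order (values 1-99 for width=2)."""
    # Like cheat column 2: zero-padded values joined with "-".
    joined = "-".join(str(n).zfill(width) for n in row)
    # Like cheat column 1: every candidate from 01 up to the maximum.
    candidates = [str(n).zfill(width) for n in range(1, max(row) + 1)]
    # Like cheat column 3: keep the candidates that occur in the joined string.
    found = [c for c in candidates if c in joined]
    # Like the mid(...)*1 step: turn the text back into numbers.
    return [int(c) for c in found]

print(sort_row_via_strings([12, 3, 45, 7, 30]))  # [3, 7, 12, 30, 45]
print(sort_row_via_strings([5, 5, 1]))           # [1, 5]
```

The second call shows a limit of this sketch: a value that repeats within a row is found only once, so the output has fewer entries than the input.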
How about using SMALL like you mentioned in your question?
=small($A3:$E3,column()-columns($A3:$G3))
You will need to change the ranges accordingly. The last reference, $G3, is the cell just before the cell where the formula is placed.
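How the column arithmetic picks the rank can be sketched in Python. The helper names and the row values are invented; the sketch assumes the formula is filled across columns H to L, with G as the column just before the first formula cell.

```python
def small(values, k):
    """k-th smallest value, 1-based, like SMALL."""
    return sorted(values)[k - 1]

def sorted_cell(row_values, formula_column, columns_before):
    """Model of =small($A3:$E3, column() - columns($A3:$G3))."""
    k = formula_column - columns_before  # column H is 8, so 8 - 7 gives rank 1
    return small(row_values, k)

row3 = [9, 4, 7, 1, 6]  # made-up values for A3:E3
print([sorted_cell(row3, col, 7) for col in range(8, 13)])  # [1, 4, 6, 7, 9]
```

Because the rank comes from the position of the cell itself, the same formula can be copied across and down without editing it.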