Remove duplicates in the second column and keep the first entry in Google Sheets - google-sheets

I have a file that contains many columns and rows. I want to remove a duplicate entry in a certain column based on another column. I want to keep the first entry (if there is a duplicate) and erase the duplicates.
For example, this input:
ID      Value
31560   0
31560   0
30530   62.77
30530   62.77
30540   100
should become:
ID      Value
31560   0
31560
30530   62.77
30530
30540   100
I found the following code, but it just erases the duplicates and it is not based on a certain column.
=IF(A2="","",IF(COUNTIF($A2:A15,A2)=1,A2,""))

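To return only the unique ID/Value rows, use:
=ARRAY_CONSTRAIN(SORTN({A2:B, A2:A&B2:B}, 9^9, 2, 3, 0), 9^9, 2)
Or, to keep every row and blank out the Value wherever the same ID/Value pair has already appeared above, use:
=INDEX(IF(1=COUNTIFS(A2:A, A2:A, B2:B, B2:B, ROW(A2:A), "<="&ROW(A2:A)), B2:B, ))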
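For readers who want to check the expected result outside the sheet, the "keep the first, blank the rest" rule can be sketched in Python. This is only an illustration of the logic, not a translation of either formula; the function name blank_repeated_values is made up for this example, and the sample rows are the ones from the question.

```python
def blank_repeated_values(rows):
    """Keep the Value on the first occurrence of each (ID, Value) pair
    and replace it with an empty string on every later occurrence."""
    seen = set()
    result = []
    for row_id, value in rows:
        key = (row_id, value)
        if key in seen:
            # Later duplicate: keep the ID, erase the Value.
            result.append((row_id, ""))
        else:
            seen.add(key)
            result.append((row_id, value))
    return result


# Sample data from the question.
rows = [(31560, 0), (31560, 0), (30530, 62.77), (30530, 62.77), (30540, 100)]
for row_id, value in blank_repeated_values(rows):
    print(row_id, value)
```

The running "seen" set plays the role of the COUNTIFS(..., ROW(A2:A), "<="&ROW(A2:A)) running count: a value is kept only when its count so far is 1.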

Related

Google Sheets: Selecting values in a column that don't exist in another column taking into account number of occurrences

I have two columns in a Google Sheet that have values in them. Column A has all possible values (including duplicates) and Column B has some of the values in A (also including duplicates).
I want to find out which values in Column A do not appear in Column B, also taking into account the number of occurrences of these values.
The MATCH function works well; however, I want three instances of HQR123 to appear in the Pending column, as there are 4 occurrences of this value in Column A vs only 1 occurrence in Column B. If another instance of HQR123 is entered in Column B, then only two instances should appear in the Pending column.
Is this possible?
Thanks and regards,
Shalin.
You could try something like
=sort(
index(
filter(A2:A,
isna(
match(A2:A&countifs(A2:A, A2:A, row(A2:A), "<="&row(A2:A)),
B2:B&countifs(B2:B, B2:B, row(B2:B), "<="&row(B2:B))
, 0)
))))
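The idea behind the formula is to tag each value with its running occurrence number, so that the 2nd HQR123 in column A can only be matched by a 2nd HQR123 in column B. A rough Python sketch of the same occurrence-aware difference is below. The function name pending and every sample value except HQR123 are made up for this illustration.

```python
from collections import Counter


def pending(all_values, done_values):
    """Return the values of all_values that are not matched one-to-one
    by an occurrence in done_values, sorted like the SORT() wrapper."""
    remaining = Counter(done_values)
    result = []
    for value in all_values:
        if remaining[value] > 0:
            # This occurrence is matched by one occurrence in column B.
            remaining[value] -= 1
        else:
            result.append(value)
    return sorted(result)


# Hypothetical sample: 4 x HQR123 in column A, 1 x HQR123 in column B.
column_a = ["HQR123", "ABC001", "HQR123", "HQR123", "XYZ999", "HQR123"]
column_b = ["HQR123", "ABC001"]
print(pending(column_a, column_b))
```

With this sample, three HQR123 entries remain pending; adding a second HQR123 to column_b leaves two.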

How to add and highlight rows on input of particular column in Google Sheet

I am working with the attached Google Sheet.
Google Sheet - https://docs.google.com/spreadsheets/d/1KxqaI-GYWur0Knt_bShI0GURucqU_cawu_sCoG_8Xlc/edit?usp=sharing
I need to perform the following operations on input.
If a user enters 1 in column K for an ID, append as many rows as that ID already has.
For example, if I enter 1 in column K for ID = ID_3, which has 5 rows, I want to append five rows below the last instance of ID_3. In the appended rows all values should be the same as for ID_3, except column C, which should now be ID_3_Append, and columns K and O, which should be blank.
If someone enters 1 in column M, we need to check which ID it belongs to, look for that ID in columns E, F, and G, and highlight in red the rows where that ID appears in columns E, F, or G.
If someone enters 1 in change type, I need to update columns B to I in the Test_1 sheet from Put_list, but keep the Mark Status unchanged for IDs common to the earlier Test_1 and Put_list (if they have any Mark Status). We also need to highlight the dependent rows accordingly.
Once columns B to I are updated, we need to change the Name from 'Call' to 'Put'.
use:
=ARRAYFORMULA(QUERY({QUERY({Test_1!A:O;
IFERROR(IF(IFNA(VLOOKUP(Test_1!C2:C, SORT({Test_1!C2:C, Test_1!K2:K}, 2, 0), 2, 0))=1,
{Test_1!A2:A, Test_1!B2:B, Test_1!C2:C&"_Append", Test_1!D2:J, Test_1!X2:X*0,
Test_1!L2:L, Test_1!X2:Y*0, Test_1!O2:O},
{"","","","","","","","","","","","","","",""}),
{"","","","","","","","","","","","","","",""})},
"where Col1 is not null and not Col3 matches '"&
TEXTJOIN("|", 1, "×", UNIQUE(IF(IFNA(VLOOKUP(Test_1!C2:C,
SORT({Test_1!C2:C, Test_1!N2:N}, 2, 0), 2, 0))=1, Test_1!C2:C, )))&"' order by Col3", 1);
IF(LEN(TEXTJOIN("|", 1,
UNIQUE(IF(IFNA(VLOOKUP(Test_1!C2:C, SORT({Test_1!C2:C, Test_1!N2:N}, 2, 0), 2, 0))=1, Test_1!C2:C, ))))>0,
QUERY({IF(Put_list!A2:A="",,IFERROR(Put_list!A2:A*1, "Put")), Put_list!A2:H,
IFNA(VLOOKUP(Put_list!B2:B&"♦"&COUNTIFS(Put_list!B2:B, Put_list!B2:B, ROW(Put_list!B2:B), "<="&ROW(Put_list!B2:B)),
{Test_1!C2:C&"♦"&COUNTIFS(Test_1!C2:C, Test_1!C2:C, ROW(Test_1!C2:C), "<="&ROW(Test_1!C2:C)), Test_1!J2:O}, {2,3,4,5,6,7}, 0))},
"where Col3 matches '"&TEXTJOIN("|", 1,
UNIQUE(IF(IFNA(VLOOKUP(Test_1!C2:C, SORT({Test_1!C2:C, Test_1!N2:N}, 2, 0), 2, 0))=1, Test_1!C2:C, )),
UNIQUE(IF(IFNA(VLOOKUP(Test_1!C2:C, SORT({Test_1!C2:C, Test_1!N2:N}, 2, 0), 2, 0))=1, Test_1!C2:C&".+", )))&"'"),
{"","","","","","","","","","","","","","",""})}, "where Col1 is not null order by Col3", 1))
Explanation:
We start with an array of the {C, K} columns, which we sort descending on the K column, so if the K column contains 1 that row is moved up.
This is convenient for VLOOKUP because it always returns the first match for a value, which is exactly what we need once our array is sorted.
So we VLOOKUP the C values against our sorted array and return column K for all rows with the same ID. The 2 stands for the 2nd column of the sorted array and the 0 stands for "exact match".
If no match is found, VLOOKUP outputs an #N/A error, so we wrap it in IFNA; then, if no match is found, the output is an empty cell instead.
Then we put this into an IF statement: if our VLOOKUP outputs 1, we output our columns (again in {array} form).
Otherwise we output an array of 15 empty cells in a row, {"","","", ...}. This is because we use {} arrays, and all ranges in an array need to be of the same size.
Our table has 15 columns, so if some error happened we would get an error in one single cell against our 15 columns.
This way we avoid the "array_literal error" by having {15 columns; 15 cells in a row, which is also 15 columns}.
A semicolon (;) puts the two arrays/ranges under each other, while a comma (,) puts them next to each other.
So if the VLOOKUP results in 1, we assemble our array from ranges: the A and B columns stay the same, then we append the phrase "_Append" to the C column with &.
The D:J columns stay the same, then we force a column of zeros by multiplying an empty column (X) by 0, and so on.
Then again we use {"","","", ...} within IFERROR to deal with any possible error at this point.
Next we put everything described above under our whole range Test_1!A:O and wrap it in QUERY, where we filter out all rows where Col1 is empty. The 2nd condition of our query states
that Col3 of our array can't match the regex pattern that we assemble with TEXTJOIN.
| stands for "or" in regex. The 1 in TEXTJOIN means "ignore empty cells". We also include × (a unique symbol) in case TEXTJOIN would otherwise output an empty string, resulting in an error somewhere.
Again we perform the same VLOOKUP, the same IFNA and the same IF, but now we output only the C column and we are interested only in UNIQUE values.
Then within the query we order the table by Col3, and the 1 at the end stands for the number of header rows.
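The core of the first part (spread the 1 in column K to every row of the same ID, build "_Append" copies, then order by ID) can be sketched in Python as follows. This is a heavily simplified illustration, not a translation of the formula: the rows are dicts with only three made-up keys ("id" for column C, "flag" for column K, "name" for any other column), and the flag is blanked in the copies as the question asks.

```python
def append_flagged_rows(rows):
    """Duplicate every row whose ID has a 1 in the flag column.

    The copies get "_Append" added to the ID and an empty flag; the result
    is ordered by ID, so the copies land right after the original block.
    """
    # Same effect as the sorted VLOOKUP: an ID is flagged if ANY of its rows has 1.
    flagged_ids = {row["id"] for row in rows if row["flag"] == 1}
    appended = [
        {**row, "id": row["id"] + "_Append", "flag": ""}
        for row in rows
        if row["id"] in flagged_ids
    ]
    # sorted() is stable, so rows keep their relative order within each ID.
    return sorted(rows + appended, key=lambda row: row["id"])


# Hypothetical sample data.
rows = [
    {"id": "ID_2", "flag": "", "name": "Call"},
    {"id": "ID_3", "flag": 1, "name": "Call"},
    {"id": "ID_3", "flag": "", "name": "Call"},
    {"id": "ID_4", "flag": "", "name": "Call"},
]
for row in append_flagged_rows(rows):
    print(row["id"], row["flag"], row["name"])
```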
This may answer the second part of your question.
Select the range of rows you want the red conditional formatting on, for example A2:A12 in your sample sheet.
Select Format - Conditional Formatting, verify the range, and apply a custom formula, as follows:
=OR(
IF($E2<>"NA",IFERROR(MATCH($E2,FILTER($C$1:$C$12,$M$1:$M$12=1),0)),0),
IF($F2<>"NA",IFERROR(MATCH($F2,FILTER($C$1:$C$12,$M$1:$M$12=1),0)),0),
IF($G2<>"NA",IFERROR(MATCH($G2,FILTER($C$1:$C$12,$M$1:$M$12=1),0)),0))
The FILTER pulls which IDs have a "1" in the update column, ColM.
MATCH looks to see if the ID in the Dependency column is in that filtered list.
The IFERROR sets the result to zero if no match.
And the initial IF filters out the NA values in the Dependency column.
This logic is replicated for each Dependency column, so three IFs.
And the whole thing is wrapped in an OR function, so if any of the Dependency columns have a matching ID, the whole row is flagged for red.
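The same rule can be written out in Python to make the steps easier to follow. This is only a sketch of the logic described above, with made-up field names ("id" for column C, "update" for column M, "deps" for columns E, F and G) and made-up sample rows.

```python
def rows_to_highlight(rows):
    """Return one boolean per row: True if any of the row's dependencies
    is an ID that has 1 in the update column."""
    # FILTER step: the IDs with a 1 in the update column.
    flagged = {row["id"] for row in rows if row["update"] == 1}
    # IF / MATCH / OR steps: skip "NA", then test membership, then combine.
    return [
        any(dep != "NA" and dep in flagged for dep in row["deps"])
        for row in rows
    ]


# Hypothetical sample data.
rows = [
    {"id": "ID_1", "update": 1, "deps": ["NA", "NA", "NA"]},
    {"id": "ID_2", "update": "", "deps": ["ID_1", "NA", "NA"]},
    {"id": "ID_3", "update": "", "deps": ["ID_2", "NA", "NA"]},
]
print(rows_to_highlight(rows))
```

Only the row that depends on ID_1 is flagged here; the ID_1 row itself is not, because the rule looks at the dependency columns only.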
I've applied this in a tab added to your sheet, Test_1-GK.
Let me know if this helps.
I'm afraid I didn't understand the first part of your question. I got confused with this bit:
I want to append that five rows below where the last instance of ID_3 and in appended rows all the values should be same as ID_3 except column C which will be now be ID_3_Append, column K and O which should be blank for appended rows.

Array Formula for running column minimum excluding above rows

So I have a spreadsheet with a very long series of numbers in a single column. I'd like to have a second column next to it that shows the lowest value contained in the first column, excluding any rows above.
Right now I've accomplished this by using a formula in each row of the second column (i.e. =MIN($I2:I)), but I'd rather avoid having a formula in every row. Is there a way to accomplish this using a single Array Formula?
=ARRAYFORMULA(QUERY(TRANSPOSE(QUERY(TRANSPOSE(ARRAY_CONSTRAIN(SPLIT({"";
REPT("×99999", ROW(INDIRECT("A1:A"&COUNTA(A1:A)-1)))}&"×"&TEXTJOIN("×", 1,
INDEX(SORT({INDIRECT("A1:A"&COUNTA(A1:A)), ROW(INDIRECT("A1:A"&COUNTA(A1:A)))}, 2, 0)
,,1)), "×"), COUNTA(A1:A), COUNTA(A1:A))), "select "&TEXTJOIN(",", 1, IF(LEN(A1:A),
"min(Col"&ROW(A1:A)&")", ))&"")), "select Col2"))
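To check what the second column should contain, the "minimum of this row and everything below it" can be computed in one backwards pass. The Python sketch below only illustrates the expected values; it does not mirror how the array formula works, and the function name and sample numbers are made up.

```python
def running_min_from_row(values):
    """For each position, return the minimum of that value and all values
    below it (the per-row equivalent of =MIN($I2:I) filled down)."""
    result = [None] * len(values)
    current = float("inf")
    # Walk from the bottom up so each step only needs the previous minimum.
    for i in range(len(values) - 1, -1, -1):
        current = min(current, values[i])
        result[i] = current
    return result


# Hypothetical sample column.
print(running_min_from_row([5, 3, 8, 2, 7, 9]))
```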

Is it possible to get a particular row cell value using a formula that combines both row and column searches?

I am setting up a shift roster. The top row contains sequential dates, and the leftmost column contains staff names. A staff member can look along their row to see which shifts they are on for the coming weeks.
I want to automatically pull out the allocated shifts into another sheet to show a dynamic "Today's Staffing" which shows who is on duty in each role for the day e.g. for column TODAY'S DATE find which row contains MORNING SHIFT and return the FIRST COLUMN FOR THAT ROW which should contain the name.
I have access to both MS Excel and Google Sheets.
Is there a function/way that I can do this?
Example google sheet:
https://docs.google.com/spreadsheets/d/1VTYK39xuHT0-4s8O5398dnseXYsE0q54-os-rJNNVB8/edit?usp=sharing
=QUERY({INDIRECT(ADDRESS(2, MATCH(TODAY(), A1:1, 0), 4)&":"&
SUBSTITUTE(ADDRESS(1, MATCH(TODAY(), A1:1, 0), 4), 1, )), A2:A},
"where Col1 <>'OFF' and Col1 <>''")
If you want to run this with just 3 people, use:
=QUERY({INDIRECT(ADDRESS(2, MATCH(TODAY(), Sheet1!A1:1, 0), 4)&":"&
SUBSTITUTE(ADDRESS(4, MATCH(TODAY(), Sheet1!A1:1, 0), 4), 1, )), Sheet1!A2:A4},
"where Col1 <>'OFF' and Col1 <>'' order by Col1 desc")

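The roster lookup above boils down to two steps: find today's column in the date row, then list every name whose cell in that column is neither empty nor OFF. A small Python sketch of that logic follows. The function name, the staff names and the dates are all made up for illustration; only the OFF marker and the MORNING shift label come from the thread.

```python
from datetime import date


def staff_on_duty(dates, roster, today):
    """Return (shift, name) pairs for everyone working on the given day.

    dates  -- the header row of dates
    roster -- mapping of staff name to a list of shifts, one per date
    Raises ValueError if today is not in the header row.
    """
    col = dates.index(today)  # the MATCH(TODAY(), A1:1, 0) step
    return [
        (shifts[col], name)
        for name, shifts in roster.items()
        if shifts[col] not in ("OFF", "")
    ]


# Hypothetical sample roster.
dates = [date(2024, 1, 1), date(2024, 1, 2)]
roster = {
    "Alice": ["MORNING", "OFF"],
    "Bob": ["OFF", "EVENING"],
    "Cara": ["EVENING", "MORNING"],
}
print(staff_on_duty(dates, roster, date(2024, 1, 2)))
```
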
Is there a reason why my ArrayFormula is not working in the other cells of my column?

In a Google sheet with form responses, I made an additional column where I want to look up from each submission if the value left of my new column already occurs in a range on another sheet.
So this is going to be a Vlookup formula finally.
Unfortunately, I didn't make it to the Vlookup part yet because the ArrayFormula part is not working.
I started off by looking to the cell value at the left with this formula, which worked, but the ArrayFormula part of it DOESN'T work.
=ArrayFormula(indirect(ADDRESS(ROW(), COLUMN()-1)))
I know that some functions don't work very well with ArrayFormula,
but I don't see any reason why this should not work, because it only looks at its own row and column.
I hope the image shows the problem well enough
If you just want to repeat the values from the previous column (let's say column A, starting in row 2), you can try in column B (also in row 2)
=Arrayformula(if(len(A2:A), A2:A,))
Change range to suit. See if that helps?
UPDATE: To repeat the previous column (anywhere you input the formula) try (in row 2)
=offset(A1, 1, column()-2, rows(A1:A))
To 'limit' the output you can use any number instead of rows(A1:A) or replace it with COUNTA(A1:A)...
You can do something like this, which searches row 1 of the sheet for a specific header and then returns the values of that column:
=QUERY({INDIRECT("Sheet1!"&
ADDRESS(1, MATCH("job ID", Sheet1!1:1, 0), 4)&":"&
ADDRESS(1000000, MATCH("job ID", Sheet1!1:1, 0), 4))},
"select * where not Col1 matches 'job ID' and Col1 is not NULL", 0)
without sheet name:
=QUERY({INDIRECT(
ADDRESS(1, MATCH("job ID", 1:1, 0), 4)&":"&
ADDRESS(1000000, MATCH("job ID", 1:1, 0), 4))},
"select * where not Col1 matches 'job ID' and Col1 is not NULL", 0)

Resources