Remove rows based on duplicates in a single column in Google Sheets

I have a spreadsheet similar to this:
I'd like to remove all duplicate rows based on the data in the first column.
So in this screenshot, rows 1 and 2 would be kept, and row 3 would be removed. Any help would be greatly appreciated.
P.S. In my case I have columns from A to AU and rows from 2 to 9500.
Thank you.

Maya's answer and the solution AJPerez linked both work.
You can also use Filter View, which doesn't require deleting rows or creating new rows/sheets.
First, create a helper column, say to the left of all your data. If your data starts on row 1, insert a blank row above it; otherwise you are fine. Then, in the first row that holds data (say row 2), enter the formula
=iserror(match(B2,B$1:B1,0))
Replace "2" with the row number of the first row of your data and "1" with that number minus 1, then fill the formula down the rest of the column. (Unfortunately, arrayformula doesn't work here.) The formula outputs TRUE when the entry in B# has not occurred in the cells above it.
Note that this assumes your data now starts in column B and that column B is the one you want to filter on. If that's not the case, change the column letter accordingly.
Select the new helper column. Go to Data -> Filter Views... -> Create a new Filter. Select filter by value and check TRUE only.
Caveat: this filter only works if there are rows with a TRUE value. That will always be the case here, since the first occurrence of every entry is TRUE.
If you want to avoid the caveat in your future applications, you can use filter by conditions with custom formula. The formula b2 should work.
But to begin with, even without the helper column, the above formula should work. Why doesn't it? That would be a good question for Google Support should it exist.
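For reference, the logic of the helper-column formula can be sketched outside the sheet. The function below is only an illustration (the name firstOccurrenceFlags is hypothetical, not part of Sheets or Apps Script): it returns true for a value that has not appeared earlier in the list, which is what =iserror(match(B2,B$1:B1,0)) computes row by row. One assumption to be aware of: this sketch compares values exactly, whereas MATCH ignores letter case, so the two can differ for entries that differ only in case.

```javascript
// Hypothetical helper: flags[i] is true when values[i] has not occurred
// in values[0..i-1], mirroring the TRUE/FALSE output of the helper column.
function firstOccurrenceFlags(values) {
  var seen = new Set();
  return values.map(function (value) {
    var isFirst = !seen.has(value);
    seen.add(value);
    return isFirst;
  });
}

// Made-up values standing in for column B.
var exampleFlags = firstOccurrenceFlags(["apple", "pear", "apple", "fig", "pear"]);
console.log(exampleFlags);
```

Rows whose flag is true are the ones the filter view keeps visible.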

You can also use a Google Apps Script to do this. To open the script editor from Google Sheets:
Choose the menu Tools > Script Editor (in newer versions of Sheets, Extensions > Apps Script).
Copy and paste the following script:
function removeDuplicates() {
  var sheet = SpreadsheetApp.getActiveSheet();
  var data = sheet.getDataRange().getValues();
  var newData = []; // rows to keep
  var ids = [];     // first-column values seen so far
  for (var i = 0; i < data.length; i++) {
    var row = data[i];
    // Keep the row only if its first-column value has not been seen yet
    if (ids.indexOf(row[0]) === -1) {
      ids.push(row[0]);
      newData.push(row);
    }
  }
  // Replace the sheet contents with the de-duplicated rows
  sheet.clearContents();
  sheet.getRange(1, 1, newData.length, newData[0].length).setValues(newData);
}
This assumes that you want to dedupe based on the contents of the first column of your sheet. If not, you can change the column index in row[0] from 0 to any other index you wish.
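With around 9500 rows, calling indexOf on a growing array for every row does a lot of repeated scanning. A possible variation, sketched here as a plain function so it can be tried outside Sheets, tracks the keys already seen in a Set. The names dedupeByColumn and keyIndex are invented for this illustration; in Apps Script you would pass it the result of getValues() and write the result back with setValues(), as the script above does.

```javascript
// Hypothetical helper: keeps the first row for each distinct value found
// in column keyIndex (0-based) and drops later rows with the same value.
function dedupeByColumn(rows, keyIndex) {
  var seen = new Set();
  return rows.filter(function (row) {
    var key = row[keyIndex];
    if (seen.has(key)) {
      return false; // duplicate key: drop this row
    }
    seen.add(key);
    return true; // first occurrence: keep this row
  });
}

// Made-up sample data: the third row repeats the key of the first.
var sampleRows = [["id1", "a"], ["id2", "b"], ["id1", "c"]];
console.log(dedupeByColumn(sampleRows, 0));
```

Keys are compared the way a Set compares them, so Date cells would need converting to a number or string before they can match each other.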

I would...
Create a new sheet
For column A, do =Unique(Sheet1!A:A)
Simply use VLOOKUP to populate the other columns. This will deliver the first value associated with the duplicates.

Related

Google Sheets - Auto import data from sheet 1 and not have it delete from the new sheet 2

I used IMPORTRANGE to connect my two google sheets. Sheet1 has information on things we need to order, Sheet2 is used as a storage for everything once it has been ordered. However, I want the data in Sheet1 to be deleted as it is ordered but then it deletes from Sheet2.
How can I get information from Sheet1 to auto copy to Sheet2 based on the Order#, and not be deleted when it is removed from Sheet1?
You need a script if you want to achieve that by deleting the rows in Sheet1.
If you want to use only Sheets standard formulas, there is another option that might satisfy the use case, but without actually deleting the rows from Sheet1.
What you need to do is add a new yes/no or checkbox column (named "deleted" or similar) to Sheet1. When you want to "delete" a row, you will just check the checkbox in that column.
On Sheet2 you will wrap the IMPORTRANGE into QUERY, something like
=QUERY(IMPORTRANGE(...), "select * where Col1=TRUE")
assuming the first column in Sheet1 is a checkbox that is ticked when a row is marked as "deleted".
If a script is an option, you want the row actually deleted from Sheet1, and you can still add a dedicated checkbox column (preferably the first column), then the following will solve your issue.
See Script and Output for reference.
Script
function onEdit(e) {
  var range = e.range;
  var value = e.value;
  var sheet = range.getSheet();
  // Act only when a checkbox in Sheet1!A:A has just been ticked
  if (sheet.getName() == 'Sheet1' && value == 'TRUE' && range.getColumn() == 1) {
    // Row number of the ticked checkbox
    var row = range.getRow();
    // Columns B through the last column, i.e. everything except the checkbox
    var values = sheet.getRange(row, 2, 1, sheet.getLastColumn() - 1).getValues()[0];
    // Append the row's data after the last row of Sheet2
    sheet.getParent().getSheetByName('Sheet2').appendRow(values);
    // Then delete the row from Sheet1
    sheet.deleteRow(row);
  }
}
The script above copies the row to Sheet2 and then deletes it from Sheet1 as soon as the row's checkbox is ticked.
Output:
Note:
Writing the script without a checkbox in the first column is harder and more of a hassle, since onEdit doesn't hold the old values when a row is deleted.
An onChange trigger can detect that a row was deleted, but it doesn't provide the deleted values either.
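If ticking boxes one at a time does not fit the workflow, one alternative is to process all ticked rows in a single pass. The sketch below shows only the array logic, under the same assumption as the script above that column A holds the checkbox; partitionByCheckbox is a hypothetical name. In Apps Script, the archived rows could be appended to Sheet2 and the kept rows written back to Sheet1.

```javascript
// Hypothetical helper: splits rows into those to keep in Sheet1 and those
// to archive in Sheet2, based on a boolean checkbox value in column A.
function partitionByCheckbox(rows) {
  var kept = [];
  var archived = [];
  rows.forEach(function (row) {
    if (row[0] === true) {
      // Drop the checkbox column, as the onEdit script above does
      archived.push(row.slice(1));
    } else {
      kept.push(row);
    }
  });
  return { kept: kept, archived: archived };
}

// Made-up sample: the first and third rows are ticked.
var orderRows = [[true, "A1", "x"], [false, "A2", "y"], [true, "A3", "z"]];
console.log(partitionByCheckbox(orderRows));
```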

How to get list of row that has value from a range of data based on column header as the criteria?

I have a range of data populated from a google form in sheet "Data".
I need to make a report (in sheet "Report") that shows a student's attendance, based on the name selected from a drop-down list. Blank entries are excluded from the report, as seen in columns I to K.
desired result
I have shared the sheet here
I have tried to use filter
=filter(Data!B1:I20,Data!E1:E20<>"")
but I still can't find a way to change the condition in this formula so that it refers to Report!A1.
What is the correct formula to achieve this?
Thank you for your help.
Using Apps Script, you can get a sheet by its name with the function getSheetByName(name).
Example:
// The code below logs the index of a sheet named "Expenses"
var sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("Expenses");
if (sheet != null) {
  Logger.log(sheet.getIndex());
}
Then, to find the rows in column E that have a value, you can read the range with the getRange(a1Notation) method:
// Get a range A1:D4 on sheet titled "Invoices"
var ss = SpreadsheetApp.getActiveSpreadsheet();
var range = ss.getRange("Invoices!A1:D4");
// Get cell A1 on the first sheet
var sheet = ss.getSheets()[0];
var cell = sheet.getRange("A1");
Then you would iterate over those rows and keep the ones where column E has a value.
Once you have them, you can get the Report sheet and write the values there with setValues(values).
For more information on the functions you may need, see the SpreadsheetApp reference and the overview on Extending Google Sheets.
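The iteration step can be sketched as a plain function on the 2D array that getValues() returns. This is only an illustration under stated assumptions: the first row of the data holds the column headers, the header to filter on is the name selected in Report!A1, and rowsWithValueUnderHeader is a hypothetical name. The sample data is made up.

```javascript
// Hypothetical helper: returns the data rows (header row excluded) whose
// cell under the column named headerName is not blank.
function rowsWithValueUnderHeader(data, headerName) {
  var headers = data[0];
  var col = headers.indexOf(headerName);
  if (col === -1) {
    return []; // no such header: nothing to report
  }
  return data.slice(1).filter(function (row) {
    var cell = row[col];
    return cell !== "" && cell !== null && cell !== undefined;
  });
}

// Made-up attendance data: one column per student.
var attendance = [
  ["Date", "Ann", "Bob"],
  ["1 Jan", "present", ""],
  ["2 Jan", "", "late"],
  ["3 Jan", "sick", "present"]
];
console.log(rowsWithValueUnderHeader(attendance, "Ann"));
```

The returned rows can then be written to the Report sheet with setValues, provided the target range has the same number of rows and columns.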

Set column value via script in google sheets

I am trying to index my Google sheet based on an if condition: if the third column is not empty, I want the first column of that row to get an index number. In the code below, the if statement is working fine, but I am unable to set the value of the index in the first column.
var sheetFrom = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("production");
var data = sheetFrom.getDataRange().getValues();
var count = 0;
var farmerCount = 0;
for(n=3;n<data.length;++n){ // iterate in the array, row by row
if (data[n][2]!=""){
farmerCount++;
data[n][0].setValue(farmerCount);
//sheetFrom.getRange(n, 0).setValue(farmerCount);
}
Thanks in advance.
Range indexing in Google Apps Script is 1-based, while the data array is 0-based, so you want to use
sheetFrom.getRange(n + 1, 1).setValue(farmerCount);
The row is n + 1 because getDataRange() starts at A1, so data[n] corresponds to sheet row n + 1.
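Calling setValue once per row can be slow on larger sheets. A common alternative is to compute the whole index column in memory and write it back in one call. The counting logic from the question can be sketched as a plain function; buildIndexColumn and its parameters are hypothetical names, and rows whose check column is blank keep whatever their first column already contained.

```javascript
// Hypothetical helper: builds a one-column 2D array of running index
// numbers for data[startIndex..], counting only rows where the cell in
// column checkCol (0-based) is not blank.
function buildIndexColumn(data, startIndex, checkCol) {
  var column = [];
  var counter = 0;
  for (var n = startIndex; n < data.length; n++) {
    if (data[n][checkCol] !== "") {
      counter++;
      column.push([counter]);
    } else {
      column.push([data[n][0]]); // leave the existing value unchanged
    }
  }
  return column;
}

// Made-up sample: three header rows, then three data rows.
var production = [
  ["h", "", ""], ["h", "", ""], ["h", "", ""],
  ["", "x", "Ann"], ["", "x", ""], ["", "x", "Bob"]
];
var indexColumn = buildIndexColumn(production, 3, 2);
console.log(indexColumn);
// In Apps Script the result could be written back in one call, for example:
// sheetFrom.getRange(3 + 1, 1, indexColumn.length, 1).setValues(indexColumn);
```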

Creating a Macro in google spreadsheet to search and then write text

What I am trying to accomplish is I would like to search for a term in one cell, if that cell has the term write text to another cell. My specific example would be I would like to search for the term 'DSF' in column 4. If I find 'DSF' it would then write 'w' in column 5 & write '1.2' in column 3. This is searched per row.
I do understand that .setValue will write the needed text, but I do not understand how to create a search function. Some help would be greatly appreciated.
EDIT
Here is the code I am working with at the moment. I am modifying it from something I found.
function Recalls()
{
var sh = SpreadsheetApp.getActiveSheet();
var data = sh.getDataRange().getValues(); // read all data in the sheet
for(n=0;n<data.length;++n){ // iterate row by row and examine data in column D
if(data[n][3].toString().match('dsf')=='dsf'){ data[n][4] = 'w'}{ data[n][2] = '1.2'};// if column D contains 'dsf' then set value in index [4](E)[2](C)
}
//Logger.log(data)
//sh.getRange(1,1,data.length,data[3].length).setValues(data); // write back to the sheet
}
With the commented-out lines enabled (removing the //), it works properly, but it overwrites the whole sheet, which will not work since I have formulas in a lot of the cells. Also, maybe I did not realize this, but is there a way to do a live update, so that once I enter text into a cell it searches the sheet again? Otherwise, having to 'run' the macro will not save me much time in the long run.
Try this. It runs whenever the sheet is edited. It reads only columns C, D and E into the array and writes back only those columns, which should avoid overwriting your formulas elsewhere. It looks for 'DSF' or 'dsf' in column D, in either case, including cells that contain dsf along with other text. Give it a try and let me know if I didn't understand your issue.
function onEdit() {
  var sh = SpreadsheetApp.getActiveSheet();
  var lr = sh.getLastRow(); // last row number with data
  if (lr < 2) return; // nothing below the header row
  // Read only columns C, D and E, from row 2 through the last row
  var data = sh.getRange(2, 3, lr - 1, 3).getValues();
  for (var n = 0; n < data.length; n++) { // examine column D row by row
    // Case-insensitive match for 'dsf' anywhere in the cell
    if (/dsf/i.test(String(data[n][1]))) {
      data[n][2] = 'w';   // column E
      data[n][0] = '1.2'; // column C
    }
  }
  // Write back only columns C, D and E
  sh.getRange(2, 3, data.length, data[0].length).setValues(data);
}
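The matching step can be checked on its own, outside the sheet. The sketch below assumes each row is a three-element array holding the values of columns C, D and E, in that order; markDsfRows is a hypothetical name and the sample values are made up.

```javascript
// Hypothetical helper: for each [C, D, E] row, if D contains "dsf" in any
// letter case, set C to '1.2' and E to 'w'. Returns new rows and leaves
// the input untouched.
function markDsfRows(rows) {
  return rows.map(function (row) {
    var out = row.slice();
    if (/dsf/i.test(String(out[1]))) {
      out[0] = "1.2";
      out[2] = "w";
    }
    return out;
  });
}

var cde = [["", "DSF", ""], ["", "other", ""], ["", "my dsf item", ""], ["", 42, ""]];
console.log(markDsfRows(cde));
```

Converting the cell with String() keeps the test from failing on numeric cells, which have no text methods.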

arrayformula that can "skip" rows

I need to introduce functionality into a Google spreadsheet that will allow the user to edit the result of an array formula. The reason for the requirement is that an ARRAYFORMULA sets a default value for a group of cells, but the user sometimes needs to overwrite these defaults. I'd like to know if this is even remotely possible.
example:
Row(#)|Array_1 |Array_2
------------------------------------
1 |a |=arrayformula(Array_1)
2 |b |""
3 |c |""
4 |d |""
So all rows in Array_2 are populated by an array formula. However, the user wants to go directly to the second cell in Array_2 and change its value. Of course, by design, ARRAYFORMULA will break. Is there some way to modify ARRAYFORMULA so that it simply skips over the cell that the user has edited and continues on its way as if nothing has happened?
I realize this is an old problem but I was searching for this today and made a script that works for me.
This script puts a formula in an adjacent cell when a cell is edited in the second column. This way you can just overwrite the formula if you need to input something manually and you don't need to have the formulas go into all of the rows beforehand. I had people accidentally edit the formula and mess it up most of the time when they were pre-filled, so this works better for me.
function onEdit() {
  var ss = SpreadsheetApp.getActiveSpreadsheet();
  var sheetList = ["Sheet1", "Sheet2", "Sheet3"]; // list of sheets to run script on
  for (var i = 0; i < sheetList.length; i++) {
    var sheetName = ss.getSheetByName(sheetList[i]);
    // only runs if sheet from sheetList is found
    if (sheetName != null) {
      var aCell = sheetName.getActiveCell();
      var col = aCell.getColumn();
      var adjacentCell = aCell.offset(0, -1);
      var formula = 'INPUT FORMULA HERE'; // put the formula you want in the adjacentCell here. Don't use it in an arrayformula
      // only runs if active cell is in column 2, if the adjacentCell is empty, and if the active cell is not empty (otherwise it runs if you delete something in column 2)
      if (col == 2 && adjacentCell.getValue() == "" && aCell.getValue() != "") {
        adjacentCell.setValue(formula);
      }
    }
  }
}
Will changing the value not throw out the output of the remaining formulas?
If not, you could set up 2 new tabs: one which will receive the user over-ride values, and another "reflection" tab which you populate with
IF(tabOverride!Rx:Cy, tabOverride!Rx:Cy, tabArray!Rx:Cy)
Basically, the new tabs are cloned layouts of your array tab: an override input layer, plus a presentation layer that uses the logic IF('override value exists', 'then show override', 'else show array output') to return the desired values.
Hope that makes sense!
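The override-layer idea can be sketched as a plain function on two equally shaped 2D arrays, one holding the array-formula defaults and one holding the user's overrides. The name mergeOverrides is hypothetical, and treating a blank string as "no override" is an assumption of this sketch (the IF formula would additionally treat 0 as "no override").

```javascript
// Hypothetical helper: returns a grid that shows the override where one
// exists (non-blank) and the default value everywhere else.
function mergeOverrides(defaults, overrides) {
  return defaults.map(function (row, r) {
    return row.map(function (value, c) {
      var override = overrides[r] ? overrides[r][c] : undefined;
      var hasOverride = override !== undefined && override !== null && override !== "";
      return hasOverride ? override : value;
    });
  });
}

// Made-up sample matching the question's table: the user overrides row 2.
var defaultValues = [["a"], ["b"], ["c"], ["d"]];
var userOverrides = [[""], ["B!"], [""], [""]];
console.log(mergeOverrides(defaultValues, userOverrides));
```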
