How to compare two columns in a spreadsheet - google-sheets

I have 30 columns and 1000 rows, and I would like to compare column 1 with another column. If the values don't match, I would like to colour the cell red. Below is a small sample of the data in my spreadsheet:
A B C D E F ...
1 name sName email
2
3
.
n
Because I have a large dataset, I want to store my columns in an array; the first row is the heading. This is what I have done so far, but when testing I get an empty result. Can someone tell me what I am doing wrong?
var index = [];
var sheet = SpreadsheetApp.getActiveSheet();
function col(){
var data = sheet.getDataRange().getValues();
for (var i = 1; i <= data.length; i++) {
te = index[i] = data[1];
Logger.log(columnIndex[i])
if (data[3] != data[7]){
// column_id.setFontColor('red'); <--- I can set the background like this
}
}
}
From the code you can see I am scanning the whole spreadsheet: data[1] gets the heading, and the if statement (data[3] != data[7]) compares two columns. I still have to work on my colour variable, but that can be done once I get the data that I need.

This tutorial may help with your problem. It uses a Google Apps Script to compare two columns. If differences are found, the script points them out; if no differences are found at all, the script outputs the text "[ id. ]". Just customize the code for your own function.
Here is the code used to achieve this kind of comparison:
function stringComparison(s1, s2) {
// lets test both variables are the same object type if not throw an error
if (Object.prototype.toString.call(s1) !== Object.prototype.toString.call(s2)){
throw("Both values need to be an array of cells or individual cells")
}
// if we are looking at two arrays of cells make sure the sizes match and only one column wide
if( Object.prototype.toString.call(s1) === '[object Array]' ) {
if (s1.length != s2.length || s1[0].length > 1 || s2[0].length > 1){
throw("Arrays of cells need to be same size and 1 column wide");
}
// since we are working with an array intialise the return
var out = [];
for (var r = 0; r < s1.length; r++){ // loop over the rows and find differences using diff sub function
out.push([diff(s1[r][0], s2[r][0])]);
}
return out; // return response
} else { // we are working with two cells so return diff
return diff(s1, s2)
}
}
function diff (s1, s2){
var out = "[ ";
var notid = false;
// loop to match each character
for (var n = 0; n < s1.length; n++){
if (s1.charAt(n) == s2.charAt(n)){
out += "–";
} else {
out += s2.charAt(n);
notid = true;
}
out += " ";
}
out += " ]"
return (notid) ? out : "[ id. ]"; // if notid(entical) return output or [id.]
}
For more information, just check the tutorial link above and this SO question on how to compare two Spreadsheets.
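For the original goal of flagging rows where two columns differ, the comparison can be kept separate from the sheet access. The following is a minimal sketch, assuming the data has already been read with sheet.getDataRange().getValues() (a 2D array whose first row is the heading). The name findMismatchedRows and its parameters are hypothetical and not part of the tutorial's code:

```javascript
// Hypothetical helper: returns the 1-based sheet row numbers where two columns differ.
// data: 2D array as returned by getValues(); colA, colB: 0-based column indexes.
// Note: Date cells come back as Date objects and would need comparing via getTime()
// (not handled in this sketch).
function findMismatchedRows(data, colA, colB) {
  var mismatches = [];
  // Start at index 1 to skip the heading row; stop before data.length.
  for (var i = 1; i < data.length; i++) {
    if (data[i][colA] !== data[i][colB]) {
      mismatches.push(i + 1); // array index 0 corresponds to sheet row 1
    }
  }
  return mismatches;
}

var sample = [
  ["name", "sName", "email"],
  ["ann", "ann", "ann@example.com"],
  ["bob", "rob", "bob@example.com"],
  ["cy", "cy", "cy@example.com"]
];
var rows = findMismatchedRows(sample, 0, 1); // [3]
```

In Apps Script, each returned row number could then be passed to sheet.getRange(row, column).setFontColor('red'). The loop in the question runs to i <= data.length and reads data[1] rather than data[i][column], so it never compares the cells of each row.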

Related

Highlight near duplicates in conditional formatting to highlight values with one character difference

I'm currently using this formula to highlight duplicates in my spreadsheet.
=ARRAYFORMULA(COUNTIF(A$2:$A2,$A2)>1)
Quite simple, it allows me to skip the first occurrence and only highlight 2nd, 3rd, ... occurrences.
I would like the formula to go a bit further and highlight near duplicates as well.
Meaning if there is only one character difference between 2 cells, then it should be considered as a duplicate.
For instance: "Marketing", "Marketng", "Marketingg" and "Market ing" would all be considered the same.
I've made a sample sheet in case my requirement is not straightforward to understand.
Thanks in advance.
Answer
Unfortunately, this cannot be done through formulas alone; Apps Script is needed as well. The process for achieving your desired result is described below.
In Google Sheets, go to Extensions > Apps Script, paste the following code [1], and save.
function TypoFinder(range, word) { // created by https://stackoverflow.com/users/19361936
if (!Array.isArray(range) || word == "") {
return false;
}
var distances = range.flat().map(cell => Levenshtein(String(cell), String(word))); // Flatten the 2D range, then compute each cell's Levenshtein distance to word.
var accumulator = 0;
for (var i = 0; i < distances.length; i++) {
if (distances[i] < 2) {
accumulator++
} // Keep track of how many times there's a Levenshtein distance of 0 or 1.
}
return accumulator > 1;
}
function Levenshtein(a, b) { // created by https://stackoverflow.com/users/4269081
if (a.length == 0) return b.length;
if (b.length == 0) return a.length;
// swap to save some memory O(min(a,b)) instead of O(a)
if (a.length > b.length) {
var tmp = a;
a = b;
b = tmp;
}
var row = [];
// init the row
for (var i = 0; i <= a.length; i++) {
row[i] = i;
}
// fill in the rest
for (var i = 0; i < b.length; i++) {
var prev = i;
for (var j = 0; j < a.length; j++) {
var val;
if (b.charAt(i) == a.charAt(j)) {
val = row[j]; // match
} else {
val = Math.min(row[j] + 1, // substitution
prev + 1, // insertion
row[j + 1] + 1); // deletion
}
row[j] = prev;
prev = val;
}
row[a.length] = prev;
}
return row[a.length];
}
In cell B2, enter =TypoFinder($A$2:$A2,$A2). Autofill that formula down the column by dragging.
Create a conditional formatting rule for column A. Using Format Rules > Custom Formula, enter =B2:B.
At this point, you might wish to hide column B. To do so, right click on the column and press Hide Column.
The above explanation assumes the column you wish to highlight is Column A and the helper column is column B. Adjust appropriately.
Note that I have assumed you do not wish to highlight repeated blank cells as duplicates. If I am incorrect, remove || word == "" from line 2 of the provided snippet.
Explanation
The concept you have described is called Levenshtein Distance, which is a measure of how close together two strings are. There is no built-in way for Google Sheets to process this, so the Levenshtein() portion of the snippet above implements a custom function to do so instead. Then the TypoFinder() function is built on top of it, providing a method for evaluating a range of data against a specified "correct" word (looking for typos anywhere in the range).
Next, a helper column is used because Sheets has difficulties parsing custom formulas as part of a conditional formatting rule. Finally, the rule itself is implemented to check the helper column's determination of whether the row should be highlighted or not. Altogether, this highlights near-duplicate results in a specified column.
[1] Adapted from duality's answer to a related question.
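To see how the one-character rule behaves on the examples from the question, here is a small self-contained sketch that can be run outside of Sheets. It uses a standard two-row Levenshtein implementation rather than the exact function above, and flagNearDuplicates is a hypothetical helper that mirrors what the helper column computes (whether an earlier non-blank value is within one edit):

```javascript
// Standard two-row dynamic-programming Levenshtein distance.
function levenshtein(a, b) {
  var prev = [];
  for (var j = 0; j <= b.length; j++) prev[j] = j; // distance from "" to b[0..j)
  for (var i = 1; i <= a.length; i++) {
    var curr = [i]; // distance from a[0..i) to ""
    for (var k = 1; k <= b.length; k++) {
      var cost = a.charAt(i - 1) === b.charAt(k - 1) ? 0 : 1;
      curr[k] = Math.min(prev[k] + 1,         // deletion
                         curr[k - 1] + 1,     // insertion
                         prev[k - 1] + cost); // match or substitution
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical helper: flags[i] is true when some earlier non-blank value
// is within maxDistance edits of values[i].
function flagNearDuplicates(values, maxDistance) {
  return values.map(function (value, i) {
    if (value === "") return false;
    for (var j = 0; j < i; j++) {
      if (values[j] !== "" && levenshtein(values[j], value) <= maxDistance) {
        return true;
      }
    }
    return false;
  });
}

var words = ["Marketing", "Marketng", "Marketingg", "Market ing", "Sales"];
var flags = flagNearDuplicates(words, 1);
// flags: [false, true, true, true, false]
```

The first occurrence stays unflagged, matching the behaviour of the original COUNTIF rule.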

Google Sheets: speed up an array for loop

Is it possible to speed up hiding many rows when the range is longer than 300 rows?
I also can't manage to move the focus to the top of the sheet once the rows are hidden; I can only get the focus onto another sheet.
Here is my code (with French variable names). I'm not sure whether I need to show my spreadsheet. Thank you very much.
var LastRow = sheet.getLastRow()-1;
var ToHide = [];
for (var i=1 ; i < LastRow +1 ; i++){
if ( sheet.getRange(i,1).isChecked() == null){ ToHide.push(i); }
}
for (var j=0 ; j<ToHide.length ; j++){ MyActiveSheet.hideRows(ToHide[j+1],1); }
ToHide.forEach(function (d){ FeuilleActive.hideRows(d); }); // also tried .hideRows(d,1)
SpreadsheetApp.getActive().getSheets()[5].getRange(1,1).activate(); //
To reduce the number of calls, you can group the row numbers in your array into series of consecutive numbers. Each series gives you the starting index and the number of rows to hide, so .hideRows() runs once per series rather than once per row, which shortens the runtime.
Example Code:
function myFunction() {
var a = [1,2,3,4,6,7,8,9,11,13,14];
const result = a.reduce((r, n) => {
const lastSubArray = r[r.length - 1];
if(!lastSubArray || lastSubArray[lastSubArray.length - 1] !== n - 1) {
r.push([]);
}
r[r.length - 1].push(n);
return r;
}, []);
//result output: [[1.0, 2.0, 3.0, 4.0], [6.0, 7.0, 8.0, 9.0], [11.0], [13.0, 14.0]]
result.forEach(e => {
var index = e[0];
var numRows = e.length;
Logger.log("Index: "+ index);
Logger.log("numRows: "+numRows);
// MyActiveSheet.hideRows(index,numRows);
})
}
References:
Ori Drori's answer on how to group series of consecutive numbers in an Array.
hideRows(rowIndex, numRows)
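The first loop in the question also calls sheet.getRange(i,1) once per row, which tends to be slower than reading the column in one batch. As a rough sketch, the column could be read once, for example with sheet.getRange(1, 1, sheet.getLastRow(), 1).getValues(), and turned directly into runs of rows to hide. The helper name findRunsToHide is hypothetical, and the predicate below assumes that checkbox cells read back as true/false while any other value means the cell is not a checkbox:

```javascript
// Hypothetical helper: scans one column of values and returns runs of
// consecutive rows to hide as { start, count } (start is a 1-based row number).
// shouldHide is a caller-supplied predicate applied to each cell value.
function findRunsToHide(columnValues, shouldHide) {
  var runs = [];
  var current = null; // the run being extended, or null when not inside a run
  for (var i = 0; i < columnValues.length; i++) {
    if (shouldHide(columnValues[i][0])) {
      if (current) {
        current.count++;
      } else {
        current = { start: i + 1, count: 1 };
        runs.push(current);
      }
    } else {
      current = null;
    }
  }
  return runs;
}

// Assumption: a non-boolean value means the cell is not a checkbox.
var column = [[true], ["x"], ["y"], [false], [""], ["z"]];
var runs = findRunsToHide(column, function (v) { return typeof v !== "boolean"; });
// runs: [{ start: 2, count: 2 }, { start: 5, count: 2 }]
```

Each run then maps to a single sheet.hideRows(run.start, run.count) call.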

Processing each Row with pivot data from other cell

==================================
UPDATE 11 December 2019
My Question is more about Macro Script
The GOAL (in illustration)
to change below raw sheet:
to more readable format:
Basically, what I'm doing is splitting the campaign name on the separator and parsing it.
I have no problem when the function only processes a single cell, for example:
on the "Report" sheet, cell B2 takes data from "Data" B2 ONLY.
I run into a problem when the returned data requires a conditional that depends on other cells. So while processing cell B2, it also requires content from E2, D2, etc.
=====================================
I'm pulling data from the Google Ads/Analytics API into a specific worksheet in Google Sheets (I call it 'Raw Data').
I use a naming pattern for the campaigns, so I can easily split the name on a separator to get specific data.
For example:
With this, by using an underscore as the separator, I can split the campaign name into various pieces of data:
Campaign Objective: Sales
Campaign Title: TBMB
Network: SEM
Branch: All
Targeting: Keywords
..etc
Then I created a new sheet called CReport, which contains the same data as the Raw Data sheet but in a much better visualization for marketing people.
After searching on Google, I found a solution for a self-referencing cell.
The script goes like this:
function getSegment(data,index){
temp=data.split("_");
return temp[index-1];
}
function dataParse(input,dataSegment){
return Array.isArray(input) ? input.map(function(e){
return e.map(function(f){
if(f!=""){
return getSegment(f,dataSegment);
}
}
)}
) : "false usage";
}
So if I want a column with the Network name, I can place a formula like this in row 2 (row 1 is the table header):
=ArrayFormula(dataParse('RAW DATA'!B2:B;2))
Now my question:
This works for a self-referencing cell, meaning that data taken from B2 in the RAW DATA sheet is the only data referenced by the corresponding cell in the Campaign Report (CReport) sheet.
What if cell B2 on the CReport sheet requires data not only from B2 in RAW DATA but also from cell D2?
What do I need to add to my function?
I'm expecting the code to look something like this:
function dataParse(input,dataSegment){
return Array.isArray(input) ? input.map(function(e){
return e.map(function(f){
if(f!=""){
segmentData=getSegment(f,dataSegment);
if(segmentData=="google"){
returnData=get reference from column D //<---
}else{
returnData=get reference from column E //<---
}
return returnData
}
}
)}
) : "false usage";
}
Hope it's clear enough.
Thanks in Advance !
I modified your function in this way:
// range (String): It will be used to get the info in a range
function dataParse(input,dataSegment, range){
var val = "";
return Array.isArray(input) ? input.map(function(e, index){
return e.map(function(f){
if(f!=""){
// If col D has value google then take info from col B
if(f === "google") val = getDesiredRangeValue("B", range, index);
// else take info from col E
else val = getDesiredRangeValue("E", range, index);
// Take segment as needed
return getSegment(val,dataSegment);
}
}
)}
) : "false usage";
}
To make it work, I added an extra argument to the function. You now need to pass the range as a string in A1 notation in your ArrayFormula, because the input argument only gives you the values in the cells; the extra argument makes it possible to look up values from other columns. Always pass the same range in both places, as these examples show:
=ArrayFormula(dataParse('RAW DATA'!D2:D5, 2,"D2:D5"))
or
=ArrayFormula(dataParse('RAW DATA'!D2:D, 2,"D2:D"))
Notice I also added a new function called getDesiredRangeValue, which takes the values from the column you need, depending on whether a cell in column D has the value google. This is how the function looks:
/*
// A1 (String): The col from where you will want the info
// range (String): It will be used to get the info in a range
// index (Integer): It gives the index number from the main array gotten in the input arg
*/
function getDesiredRangeValue(A1, range, index){
var rowNumbers = range.match(/\d+/g);
// Check whether the range has an end row or is open-ended (no end row specified)
if(rowNumbers.length > 1){
var rangeCol = ss.getRange(A1 + rowNumbers[0] + ":" + A1 + rowNumbers[1]).getValues();
} else {
var rangeCol = ss.getRange(A1 + rowNumbers[0] + ":" + A1).getValues();
}
// It returns the whole value from each cell in the specified col
return rangeCol[index][0];
}
Code
Now your whole code will look like this:
// Global var
var ss = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("RAW DATA");
function getSegment(data,index){
temp=data.split("_");
return temp[index-1];
}
/*
// A1 (String): The col from where you will want the info
// range (String): It will be used to get the info in a range
// index (Integer): It gives the index number from the main array gotten in the input arg
*/
function getDesiredRangeValue(A1, range, index){
var rowNumbers = range.match(/\d+/g);
// Check whether the range has an end row or is open-ended (no end row specified)
if(rowNumbers.length > 1){
var rangeCol = ss.getRange(A1 + rowNumbers[0] + ":" + A1 + rowNumbers[1]).getValues();
} else {
var rangeCol = ss.getRange(A1 + rowNumbers[0] + ":" + A1).getValues();
}
// It returns the whole value from each cell in the specified col
return rangeCol[index][0];
}
// range (String): It will be used to get the info in a range
function dataParse(input,dataSegment, range){
var val = "";
return Array.isArray(input) ? input.map(function(e, index){
return e.map(function(f){
if(f!=""){
// If col D has value google then take info from col B
if(f === "google") val = getDesiredRangeValue("B", range, index);
// else take info from col E
else val = getDesiredRangeValue("E", range, index);
// Take segment as needed
return getSegment(val,dataSegment);
}
}
)}
) : "false usage";
}
Docs
These are the docs I used to help you:
Class Sheet
Custom Functions
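As an alternative to passing the range as an A1 string, a custom function can receive several ranges as arguments, and Sheets hands each one over as a 2D array. The following is only a sketch under that approach: the name dataParseBy and its parameter names are hypothetical, and it assumes all three ranges are single columns of the same height:

```javascript
// Same splitting helper as in the question (1-based segment index).
function getSegment(data, index) {
  var temp = String(data).split("_");
  return temp[index - 1];
}

// Hypothetical variant: keys, whenGoogle and otherwise are same-sized
// single-column ranges, received by the custom function as 2D arrays.
function dataParseBy(keys, whenGoogle, otherwise, dataSegment) {
  if (!Array.isArray(keys)) return "false usage";
  return keys.map(function (row, i) {
    var key = row[0];
    if (key === "") return [""]; // leave blank rows blank
    // Pick the source cell on the same row, depending on the key value.
    var source = key === "google" ? whenGoogle[i][0] : otherwise[i][0];
    return [getSegment(source, dataSegment)];
  });
}

var out = dataParseBy(
  [["google"], ["bing"], [""]],
  [["Sales_TBMB_SEM"], ["x_y_z"], [""]],
  [["a_b_c"], ["Leads_Promo_GDN"], [""]],
  2
);
// out: [["TBMB"], ["Promo"], [""]]
```

In a sheet this might be called as =dataParseBy('RAW DATA'!D2:D, 'RAW DATA'!B2:B, 'RAW DATA'!E2:E, 2). Because the function processes whole arrays itself, it avoids reading the sheet once per row.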

Google Sheets - Clone row but with only data from one cell

I'm looking to clone a row 3x, but only keeping data from one column.
So essentially I have the following [Name / Time / Booking], and each row is populated with all 3 properties. I'm trying to create 3 blank rows underneath each current row, populated with only the person's name.
I can't work out how to do it in scripting and can't find a plugin to do this. My data set has over 10,000 rows, so doing it manually isn't an option.
What I have:
What I want:
UPDATED code:
function duplicateRows() {
var sh, v, arr, c, dup;
sh = SpreadsheetApp.getActive()
.getSheetByName('Blad1')
v = sh.getRange(1, 1, sh.getLastRow(), 40)
.getValues();
arr = [v[0]];
v.splice(1)
.forEach(function (r, i) {
arr.push(r)
c = 0
while (c < 3) {
dup = makeEmptyArrayXEl(40)
dup[0] = r[0];
arr.push(dup)
c += 1;
}
})
sh.getRange(1, 1, arr.length, arr[0].length)
.setValues(arr);
}
function makeEmptyArrayXEl(num) {
var arr = [];
for (var i = 0; i < num; i++) {
arr.push("")
}
return arr;
}
Would this work for you? It requires a free column to the left of Booking in the original data set. The formula below goes in a new sheet.
=ArrayFormula(sort({A2:A4,B2:B4,C2:C4;A2:A4,D2:D4,D2:D4;A2:A4,D2:D4,D2:D4;A2:A4,D2:D4,D2:D4},1,FALSE))

Search Google Sheet column for matching text and print matches

I have a table with Long Words like 'Condemnation' and 'Income' in column A, and Shorter Words such as 'Con' and 'Come' in column B.
I'd like to create a cell to the right which will search through the 'LONG WORD' column if it contains the text of the 'SHORTER WORD' column and print them as a pair.
I only need it to return the first instance it comes across as it goes down.
I have looked at various MATCH and LOOKUP commands, but none seem quite to be able to do the 'return one matching word from a whole column' bit.
Thanks
Tardy
I've thrown together a script based solution for you. Other solutions that require a formula on every line where you might have partials will end up bogging down the sheet by quite a bit for large data sets. This should generate a range of matches after a couple seconds for data several tens of thousands of rows long.
Note: Since you opted to not provide a sample dataset, I had to assume how it's laid out. However, this will work regardless of where your columns are, as long as they are titled as Full Words, Partials, and Matches.
Link to spreadsheet (Must be signed into a google account to use the button): Google Sheet
Just click the Get Matches button to have it generate the matches.
The source is a bit more complex/dynamic than it needs to be, but I had a bunch of functions already laying around that I just reused.
Source:
//Retrieves all the necessary word matches
function GetWordMatches() {
var spreadsheet = SpreadsheetApp.openById('1s0S2iJ7L0wEXgVsKrpuK-aLysaxfHYRDQgp3ShPR8Ns').getSheetByName('Matches');
var dataRange = spreadsheet.getDataRange();
var valuesRange = dataRange.getValues();
var columns = GetColumns(valuesRange, dataRange.getNumColumns(), 0);
var fullWordsData = GetColumnAsArray(valuesRange, columns.columns['Full Words'].index, true, 1);
var partialsArray = GetColumnAsArray(valuesRange, columns.columns['Partials'].index, true, 1);
var partialsData = GeneratePartialsRegexArray(partialsArray);
var matches = GenerateMatches(fullWordsData, partialsData);
WriteMatchesToSheet(spreadsheet, columns.columns['Matches'].index, matches, partialsArray);
}
//Writes the matches to the sheet
function WriteMatchesToSheet(spreadsheet, matchesColumnIndex, matches, partialsArray){
var sortedMatches = SortByKeys(matches, partialsArray);
var dataRange = spreadsheet.getRange(2, matchesColumnIndex+1, sortedMatches.length);
dataRange.setValues(sortedMatches);
}
//Generates an array of matches for the full words and partials
function GenerateMatches(fullwordsData, partialsData){
var output = [];
var totalLoops = 0;
for(var i = 0; i < fullwordsData.length; i++){
totalLoops++;
for(var ii = 0; ii < partialsData.length; ii++){
totalLoops++;
var result = fullwordsData[i].match(partialsData[ii].regex)
if(result){
output.push([fullwordsData[i], partialsData[ii].value]);
partialsData.splice(ii, 1);
break;
}
}
}
if(partialsData.length > 0){
var missedData = GenerateMissedPartialsArray(partialsData);
output = output.concat(missedData);
}
return output;
}
//Generates a missed partials array based on the partials that found no match.
function GenerateMissedPartialsArray(partialsData){
var output = [];
for(var i = 0; i < partialsData.length; i++){
output.push(['No Match', partialsData[i].value])
}
return output;
}
//Generates the regex array for the partials
function GeneratePartialsRegexArray(partialsArray){
var output = [];
for(var i = 0; i < partialsArray.length; i++){
output.push({regex: new RegExp(partialsArray[i], 'i'), value: partialsArray[i]});
}
return output;
}
//http://stackoverflow.com/a/13305008/3547347
function SortByKeys(itemsArray, sortingArray){
var itemsMap = CreateItemsMap(itemsArray), result = [];
for (var i = 0; i < sortingArray.length; ++i) {
var key = sortingArray[i];
result.push([itemsMap[key].shift()]);
}
return result;
}
//http://stackoverflow.com/a/13305008/3547347
function CreateItemsMap(itemsArray) {
var itemsMap = {};
for (var i = 0, item; (item = itemsArray[i]); ++i) {
(itemsMap[item[1]] || (itemsMap[item[1]] = [])).push(item[0]);
}
return itemsMap;
}
//Gets a column of data as an array
function GetColumnAsArray(valuesRange, columnIndex, ignoreBlank, startRowIndex){
var output = [];
for(var i = startRowIndex; i < valuesRange.length; i++){
if(ignoreBlank){
if(valuesRange[i][columnIndex] !== ''){
output.push(valuesRange[i][columnIndex]);
}
continue;
}
output.push(valuesRange[i][columnIndex]);
}
return output;
}
//Gets a columns object for the sheet for easy indexing
function GetColumns(valuesRange, columnCount, rowIndex)
{
var columns = {
columns: {},
length: 0
}
Logger.log("Populating columns...");
for(var i = 0; i < columnCount; i++)
{
if(valuesRange[rowIndex][i] !== ''){
columns.columns[valuesRange[rowIndex][i]] = {index: i ,value: valuesRange[rowIndex][i]};
columns.length++;
}
}
return columns;
}
A note on some decisions: I opted to not use map, or other more concise array functions for the sake of performance.
This works too:
=QUERY(FILTER($D$1:$D$3,REGEXMATCH(A1,"(?i)"&$D$1:$D$3)),"limit 1")
We use REGEXMATCH; (?i) makes the search case-insensitive, and limit 1 in QUERY returns only the first occurrence.
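The same first-match logic can be sketched in script form. This is a minimal illustration, not the code from the answer above: firstContainedWord is a hypothetical helper, and it uses a plain case-insensitive substring test instead of a regular expression, so special characters in the short words need no escaping:

```javascript
// Hypothetical helper: returns the first entry of words that occurs inside text,
// ignoring case, or "" when none does.
function firstContainedWord(text, words) {
  var haystack = String(text).toLowerCase();
  for (var i = 0; i < words.length; i++) {
    var word = String(words[i]);
    // Skip blank entries, which would otherwise match every text.
    if (word !== "" && haystack.indexOf(word.toLowerCase()) !== -1) {
      return word;
    }
  }
  return "";
}

var shortWords = ["Con", "Come"];
var a = firstContainedWord("Condemnation", shortWords); // "Con"
var b = firstContainedWord("Income", shortWords);       // "Come"
var c = firstContainedWord("Banana", shortWords);       // ""
```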
OK, I think I've found an answer. I'll post it here in case it's of use to anyone else.
To give credit where credit's due, I found it here.
This does what I was looking for:
=INDEX($D$1:$D$3,MATCH(1,COUNTIF(A1,"*"&$D$1:$D$3&"*"),0))
It does slow EVERYTHING down a lot because everything is cross-referencing like mad (I had 3000 lines on my spreadsheet), but if there's a list of words in D1-3 it will see if cell A1 contains one of those words and print the word it matches with.
Thanks to everyone who offered solutions, particularly @douglasg14b. If there is one that is less taxing in terms of memory, that would be great, but this does the trick in a slow kind of way!
Thanks
Tardy
MATCH and LOOKUP don't work for partial matches.
One alternative is to use SEARCH or FIND together with other functions in an array formula.
Example:
Column A contains a list of long strings
Cell B1 contains a short string
Cell C1 contains a formula that returns the first long string in column A that contains the short string in B1
=ArrayFormula(INDEX(A1:A,SORT(IF(search(B1,A1:A),ROW(A1:A),),1,TRUE)))
Data
+---+--------------+-------+-------------+
| | A | B | C |
+---+--------------+-------+-------------+
| 1 | Orange juice | apple | Apple cider |
| 2 | Apple cider | | |
| 3 | Apple pay | | |
+---+--------------+-------+-------------+

Resources