How to compare if the last trail of two URLs match? - google-sheets

I have tried to create a spreadsheet which lets me know if the last trailing step of two URLs are matching. (In case you are wondering: It is for redirect mapping because when I have old URLs and match them to a new structure the last trail stays the same more often than not.)
The said spreadsheet is working more or less and can be found here: https://docs.google.com/spreadsheets/d/1m3E5NQYSUGe4Kxn4BrpcgWj4WhotKG88JCx93ER3U1c/edit?usp=sharing
I did the following:
Pull old and new URLs in two separate sheets (oldcrumb/newcrumb)
Split the URLs in these sheets into separate cells splitting at every "/"
Compare the last filled Cell in a Row between the two sheets
Unfortunately my solution is somewhat clunky. For about 600 initial rows I need another 1200 rows because I use 2 different sheets and also use split formula to get hands on the last trail.
Also I have just implemented comparison for a certain range (Rows I to H in the split row sheets) of trailing position with several IF-conditions. If URLs are supershort or superlong, nothing will be displayed in the "Last trail matching?" row.
Also at some point I get a warning because I would reach the maximum amount of usable cells in Google Sheets (around 200000 or so?).
So is there any more elaborate/elegant way to do what I did in a super awkward and heavy loaded approach?

The following formula does it:
=regexreplace(A2, ".*/", "")=regexreplace(C2, ".*/", "")
Here, regexreplace removes everything up to and including the last slash: .* matches any number of any characters and is greedy, so .*/ extends as far as the final slash. The equality is then tested between the tails after the last slash.
Also works as an array formula:
=arrayformula(regexreplace(A2:A11, ".*/", "")=regexreplace(C2:C11, ".*/", ""))
or as an array formula that allows for blank rows in the input range:
=arrayformula(if(isblank(A2:A), "", regexreplace(A2:A, ".*/", "")=regexreplace(C2:C, ".*/", "")))

I would go with a compare method using lastIndexOf() and slice()
Something like:
/**
 * Added for use as a custom function
 * @customfunction
 */
function compareurl(oldurl, newurl) {
  // Take everything after the last "/" in each URL
  var a = oldurl.slice(oldurl.lastIndexOf("/") + 1);
  var b = newurl.slice(newurl.lastIndexOf("/") + 1);
  return a === b;
}
It finds the last / in each URL and creates a new string from the last token, then compares the two.
Edit: Just noticed you may not be using Apps Script. I would still go with a custom function and the method above, then use =compareurl() in a cell.
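If the custom-function route is taken, it may help to let the function accept whole ranges, since Sheets passes a range to a custom function as a two-dimensional array. A minimal sketch under that assumption; the names lastSegment and COMPARE_LAST_SEGMENT, and the decision to ignore a single trailing slash, are illustrative choices and not part of the answer above:

```javascript
/**
 * Returns the text after the last "/" of a URL, ignoring one trailing slash.
 * (Ignoring the trailing slash is an assumption made for this sketch.)
 */
function lastSegment(url) {
  var s = String(url).replace(/\/$/, "");
  return s.slice(s.lastIndexOf("/") + 1);
}

/**
 * Compares the last path segment of two URLs, or of two equally sized ranges.
 * @customfunction
 */
function COMPARE_LAST_SEGMENT(oldUrls, newUrls) {
  // A single cell arrives as a plain value, a range as a 2D array.
  if (!Array.isArray(oldUrls)) {
    return lastSegment(oldUrls) === lastSegment(newUrls);
  }
  return oldUrls.map(function (row, r) {
    return row.map(function (cell, c) {
      // Leave the result blank where the input cell is blank.
      if (cell === "") return "";
      return lastSegment(cell) === lastSegment(newUrls[r][c]);
    });
  });
}
```

Called as =COMPARE_LAST_SEGMENT(A2:A, C2:C), this would fill one result per row without any helper sheets.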

Related

Google Sheet: How to use arrayformula to copy data from one sheet to another?

In a Google spreadsheet, I want to sync A2:G500 in sheet1 to sheet2, I've been aware of the following two methods:
use IMPORTRANGE: put the following formula in A1 of sheet2:
=IMPORTRANGE("spreadsheet_url","sheet1!A2:G500")
It works, but it feels like I am overdoing it; besides, there seems to be a performance issue.
In A2 of sheet2, put formula =sheet1!A2, then drag the formula to G500 in sheet2. This one is intuitive and simple to do. However, it doesn't work if sheet1 is a form response sheet - when new response is added, sheet2 won't automatically get it.
For learning purpose, I'm wondering if there is a way to do this using Arrayformula. Besides, I want to find a way to make this sync more care-free, meaning if there are indefinite rows of data I won't have to go back to this sheet every now and then and change the formula or manually drag the formula. Is this possible? And is Arrayformula the right way to go for this purpose?
I would recommend an { array expression }, like this:
={ Sheet1!A2:G }
This is more or less the same as
=arrayformula(Sheet1!A2:G)
...but I prefer the {} syntax because it allows you to specify non-adjacent columns. For example, you can skip columns D and F like this:
={ Sheet1!A2:C, Sheet1!E2:E, Sheet1!G2:G }
In spreadsheets where the locale uses the comma as decimal mark instead of the period, use a backslash \ instead of comma as horizontal separator.
To skip rows, use the semicolon ; as vertical separator. For example, you can skip rows 2:9 like this:
={ Sheet1!A1:G1; Sheet1!A10:G }
The open-ended range reference A10:G means "columns A to G starting in row 10 and extending all the way to the bottom of the sheet."
You can also leave out the row number to get an open-ended range reference like A:G which means "columns A to G from the very top to the bottom of the sheet." This reference will behave the same as A1:G in almost all situations. I have made it a habit to always include the start row in the reference because that way the formula will automatically adjust in the event a row is inserted above row 1.
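The two separators can be pictured as two ways of joining blocks of cells. A rough model in plain JavaScript, treating each range as a 2D array; hstack and vstack are names made up for this illustration and are not Sheets or Apps Script functions:

```javascript
// Side-by-side join, like { left , right }:
// both blocks need the same number of rows.
function hstack(left, right) {
  if (left.length !== right.length) throw new Error("row counts differ");
  return left.map(function (row, i) {
    return row.concat(right[i]);
  });
}

// Stacked join, like { top ; bottom }:
// both blocks need the same number of columns.
function vstack(top, bottom) {
  if (top[0].length !== bottom[0].length) throw new Error("column counts differ");
  return top.concat(bottom);
}
```

The size checks mirror the fact that an array expression in Sheets reports an error when the joined ranges do not line up.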
When the source sheet is a form responses sheet, another tactic is needed. Form responses are always inserted in newly created rows that cannot be referenced directly in advance.
To avoid the range reference from adjusting when you dynamically copy form responses to another sheet, start the copy from row 1, like this:
={ 'Form Responses 1'!A1:A }
Alternatively, use an array formula, like this:
=arrayformula( 
  if( 
    row('Form Responses 1'!A1:A) = 1,
    "Enter column header here", 
    'Form Responses 1'!A1:A
  ) 
)
An even better way to deal with form responses is to aggregate the data directly to whatever reports you need with the query() function.
It's either:
ArrayFormula(Sheet1!A2:G500) for the 499 lines, or
ArrayFormula(Sheet1!A2:G) if you want to sync everything from line 2 down
=ARRAYFORMULA(Sheet1!A:G)
Does this not work?
try in row 1:
={""; INDEX(sheet1!A2:A)}
This will solve your form issues when you use it in the first row. If you already have something in row 1, you can put it in double quotes like this:
={"header"; INDEX(sheet1!A2:A)}
In the case of multiple columns it looks like this:
={"","","","","","",""; INDEX(sheet1!A2:G)}

Can change shape of range with ARRAYFORMULA() in Google Sheets?

My intention is to convert a single line of data into rows consisting of a specific number of columns in Google Sheets.
For example, starting with the raw data:
     A     B         C         D     E         F
1    id1   attr1-1   attr2-1   id2   attr2-1   attr2-2
And the expected result is:
(by dividing columns by three)
     A     B         C
1    id1   attr1-1   attr1-2
2    id2   attr2-1   attr2-2
I already know that it's possible a bit manually, like:
=ARRAYFORMULA({A1:C1;D1:F1})
But I have to start over with it every time the target range is moved OR the subset size needs to be changed (in the case above it was three)!
So I guess there will be a much more graceful way (i.e. formula does not require manual update) to do the same thing and suspect ARRAYFORMULA() is the key.
Any help will be appreciated!
I added a new sheet ("Erik Help") where I reduced your manually entered parameters from two to one (leaving only # of columns to be entered in A2).
The formula that reshapes the grid:
=ArrayFormula(IFERROR(VLOOKUP(SEQUENCE(ROUNDUP(COUNTA(7:7)/A2),A2),{SEQUENCE(COUNTA(7:7),1),FLATTEN(FILTER(7:7,7:7<>""))},2,FALSE)))
SEQUENCE is used to shape the grid according to whatever is entered in A2. The number of rows is the count of items in Row 7 divided by the number in A2, rounded up to the next whole number; the number of columns is just whatever number is entered in A2.
Example: If there are 11 items in Row 7 and you want 4 columns, ROUNDUP(11/4)=3 rows to the SEQUENCE and your requested 4 columns.
Then, each of those numbers in the grid is VLOOKUP'ed in a virtual array consisting of a vertical SEQUENCE of ordered numbers matching the number of data pieces in Row 7 (in Column 1) and a FLATTENed (vertical) version of the Row-7 data pieces themselves (in Column 2). Matches are filled into the original SEQUENCE grid, while non-matches are left blank by IFERROR
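The same logic may be easier to follow outside the formula. A sketch in plain JavaScript, assuming the row arrives as a flat array and blank cells are empty strings; reshapeRow is a name made up for this illustration:

```javascript
function reshapeRow(row, columns) {
  // FILTER(7:7, 7:7<>""): keep only the non-empty cells.
  var items = row.filter(function (v) {
    return v !== "";
  });
  // ROUNDUP(COUNTA(7:7)/A2): number of rows needed.
  var rows = Math.ceil(items.length / columns);
  var grid = [];
  for (var r = 0; r < rows; r++) {
    var line = [];
    for (var c = 0; c < columns; c++) {
      // SEQUENCE numbers the grid cells 1, 2, 3, ... row by row.
      var n = r * columns + c + 1;
      // VLOOKUP finds item n; IFERROR leaves a blank when there is none.
      line.push(n <= items.length ? items[n - 1] : "");
    }
    grid.push(line);
  }
  return grid;
}
```

With 11 items and 4 columns this yields 3 rows, the last of which ends in one blank cell, matching the example above.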
Though it's a bit messy, I managed to get it done thanks to the SEQUENCE() function.
It constructs a grid from a given number of rows and columns, and that was exactly what I was looking for.
For reference set up a sheet with the sample data here:
https://docs.google.com/spreadsheets/d/1p972tYlsPvC6nM39qLNjYRZZWGZYsUnGaA7kXyfJ8F4/edit#gid=0
Use a custom formula
Although you have already solved this, if you do this kind of thing a lot it could be beneficial to look into Apps Script and custom formulas.
In this case you could use something like:
function transposeSingleRow(range, size) {
  // initialize new range
  let newRange = [];
  // initialize counter to keep track
  let count = 0;
  // start while loop to go through row (range[0])
  while (count < range[0].length) {
    // add a slice of the original range to the new range
    newRange.push(
      range[0].slice(count, count + size)
    );
    // increment counter
    count += size;
  }
  return newRange;
}
Which works like this:
The nice thing about the formula here is that you select the range, and then you put in a number to represent its throw, or how many elements make up a complete row. So if instead of 3 attributes you had 4, instead of calling:
=transposeSingleRow(A7:L7, 3)
you could do:
=transposeSingleRow(A7:L7, 4)
Additionally, if you want this conversion to be permanent and not dependent on formula recalculation, it would be necessary to run it fully in Apps Script without using formulas.
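A minimal sketch of what a formula-free version might look like. The sheet names "Source" and "Target", the range "A7:L7", the chunk size 3, and the function names are placeholders invented for this example. One detail worth noting is that setValues expects a rectangular array, so the last chunk is padded with blanks here:

```javascript
// Pure helper: cut one row into chunks of `size`, padding the last chunk
// with "" so that every row has the same length.
function chunkRow(row, size) {
  var out = [];
  for (var i = 0; i < row.length; i += size) {
    var chunk = row.slice(i, i + size);
    while (chunk.length < size) chunk.push("");
    out.push(chunk);
  }
  return out;
}

// Apps Script entry point (runs only inside Apps Script).
// "Source", "Target" and "A7:L7" are placeholder names for this sketch.
function writeReshapedRow() {
  var ss = SpreadsheetApp.getActiveSpreadsheet();
  var row = ss.getSheetByName("Source").getRange("A7:L7").getValues()[0];
  var grid = chunkRow(row, 3);
  // Write the reshaped values starting at A1 of the target sheet.
  ss.getSheetByName("Target")
    .getRange(1, 1, grid.length, grid[0].length)
    .setValues(grid);
}
```

Only chunkRow is plain JavaScript; writeReshapedRow relies on the SpreadsheetApp service and would be run from the script editor or a trigger.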
Reference
Apps Script
Custom Functions

Import multiple ranges to a single sheet

I have two different sheets, with two of the same ranges (age). I want to combine these two separate ranges into one on a different sheet. Current formula / function I am using:
={(importrange("https...", "Sheet1!A2:A100"));(importrange(""https...", "Sheet2!A2:A100"))}"))
What am I doing wrong?
I was able to bring in one range at a time with this formula / function:
=IMPORTRANGE("https...", "Sheet1!A2:A100")
=IMPORTRANGE("https...", "Sheet2!A2:A100")
but I need them to be in one column together (the order does not matter, I just need the values to be pulled across).
Try two IMPORTRANGE functions within one formula separated by a semi-colon and wrapped in braces (e.g. { and } that you type yourself)
={IMPORTRANGE("https://docs.google.com/spreadsheets/d/1mYWnO8vzyb5o4jzp-Ti-369nSyQoCfg-WzqaaTb94tE", "Sheet1!A2:A10");IMPORTRANGE("https://docs.google.com/spreadsheets/d/1mYWnO8vzyb5o4jzp-Ti-369nSyQoCfg-WzqaaTb94tE", "Sheet2!A2:A")}
If you do not have a set number of rows in the source sheet1 (e.g. A2:A100), then the retrieved data from sheet2 will start on the 101st row with blanks above it. To get around this, concatenate a dynamic 'last populated' row number onto the range string.
={IMPORTRANGE("https://docs.google.com/spreadsheets/d/1mYWnO8vzyb5o4jzp-Ti-369nSyQoCfg-WzqaaTb94tE", "Sheet1!A2:A"&match(1E+99, IMPORTRANGE("https://docs.google.com/spreadsheets/d/1mYWnO8vzyb5o4jzp-Ti-369nSyQoCfg-WzqaaTb94tE", "Sheet1!A:A")));IMPORTRANGE("https://docs.google.com/spreadsheets/d/1mYWnO8vzyb5o4jzp-Ti-369nSyQoCfg-WzqaaTb94tE", "Sheet2!A2:A")}
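The idea behind the MATCH part can be sketched outside the sheet. MATCH(1E+99, column) is commonly relied upon to return the position of the last numeric entry, and that number is then glued onto the range string. The plain JavaScript below is only a model of that idea; lastNumericPosition and boundedRange are invented names, and the sketch returns 0 where MATCH would return an error:

```javascript
// Position (1-based) of the last numeric entry in a column of values;
// 0 if the column holds no numbers at all.
function lastNumericPosition(columnValues) {
  for (var i = columnValues.length - 1; i >= 0; i--) {
    if (typeof columnValues[i] === "number") return i + 1;
  }
  return 0;
}

// Builds a bounded range string such as "Sheet1!A2:A5".
function boundedRange(sheetName, column, startRow, columnValues) {
  return sheetName + "!" + column + startRow + ":" + column +
    lastNumericPosition(columnValues);
}
```

Because the position is counted over the whole column A:A, it is already a sheet row number and can be used directly as the end of the range.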
What am I doing wrong?
You have a couple of double inverted commas too many and unmatched parentheses (also some unnecessary spaces and parentheses). The following should work, after granting authorisation if required:
={importrange(" k e y 1 ","Sheet1!A2:A100");importrange(" k e y 2 ","Sheet2!A2:A100")}
It might help to compare 'yours' and 'mine' in a word processor and fixed width font.

How to assign a unique ID to a google form input?

Google Forms - I have set up a Google Form and I want to assign a unique ID to each of the completed incoming form inputs. My intention is to use the unique ID as an input for another Google Form I have created, which I will use to link the two completed forms. Is there another, easier way to do this?
I'm not a programmer but I have programming resources available to me if needed.
I was also banging my head at this and finally found a solution.
I compose a 6-digit number that gets generated automatically for every row and is composed of:
3 digits of the row number - that gives the uniqueness (you can use more if you expect more than 998 responses), concatenated with
3 digits of the timestamp converted to a number - that prevents guessing the number
Follow these instructions:
Create an additional column in the spreadsheet linked to your form, let's call it: "unique ID"
Row number 1 should be populated with column titles automatically
In row number 2, under column "Unique ID", add the following formula:
=arrayformula( if( len(A2:A), "" & text(row(A2:A) - row(A2) + 2, "000") & RIGHT(VALUE(A2:A), 3), iferror(1/0) ) )
Note: An array formula applies automatically to the entire column.
Make sure you never delete that row, even if you clear up all the results from the form
Once a new submission is populated, its "Unique ID" will appear automatically
Formula explanation:
Column A should normally hold the timestamp. If the timestamp is not empty, then this gives the row number: row(A2:A) - row(A2) + 2
Using text with the "000" format I pad it with leading zeros to a 3-digit number.
Then I concatenate it with the timestamp converted to a number using VALUE and trim it to the three right-most digits using RIGHT
Voila! A number that is both unique and hard-to-guess (as the submitter has no access to the timestamp).
If you would like more confidence, obviously you could use more digits for each of the parts.
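To make the composition concrete, here is a small model of the formula in plain JavaScript. It assumes the timestamp has already been converted to its numeric serial value (what VALUE returns); makeUniqueId is a name invented for this sketch, and the way JavaScript writes a number as text may differ in detail from Sheets:

```javascript
function makeUniqueId(rowNumber, timestampSerial) {
  // text(row, "000"): pad the row number with zeros to at least 3 digits.
  var rowPart = String(rowNumber).padStart(3, "0");
  // RIGHT(VALUE(timestamp), 3): last 3 characters of the number as text.
  var timePart = String(timestampSerial).slice(-3);
  return rowPart + timePart;
}
```

In this model a row number above 999 simply produces a longer ID, and a serial value with a very short fractional part (for example one ending in ".5") would put the decimal point among the last three characters, so the result is not always purely numeric.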
You can apply unique ID numbers using an arrayformula next to the form data. In row 1 of the first rightmost empty column you can use something like
=arrayformula(if(row(A1:A)=1,"UNIQUE ID",if(len(A1:A)>0,98+row(A1:A),iferror(1/0))))
A few comments regarding the explanation provided by @Ying, which I will try to expand on, as it is very good.
> Column A should normally hold the timestamp.
In my case, it is date+time stamp.
> 4. Make sure you never delete that row, even if you clear up all the results from the form
That issue can easily be avoided by placing the formula in the header like this
={"calculated_id";arrayformula( if( len(C2:C); "" & text(row(C2:C) - row(C2) + 2; "000") & RIGHT(VALUE(C2:C); 3); iferror(1/0) ) )}
This formula provides a string for one cell, and a formula for the next one, which happens to be an array formula which will cover all the cells below.
Note: Depending on your language settings you may need to use ";" or "," as separator among parameters.
> 5. Once a new submission is populated, its "Unique ID" will appear automatically
Issue
And here is the issue I see with this solution.
If the Google Form allows responders to edit their responses, the date+time stamp will change and so will the calculated_id.
A workaround is to have 2 columns, one is the calculated_id and the other will be static_id.
static_id will take whatever is on calculated_id only if itself has no data, otherwise it will stay as it is.
Doing that we will have an ID that will not change no matter how many updates the response experiences.
The short formula for static_id is
=IF(AND(IFERROR(K2)<>0;K2<>"");K2;L2)
The long one is
={"static_id";ArrayFormula(IF(AND(IFERROR(M2:M)<>0;M2:M<>"");M2:M;L2:L))}
M or K -> static_id
L -> calculated_id
Remember to put this last one on the header of the column. I tend to change the color to purple when it has a formula behind, so I don't mess with it by mistake.
Extra info.
The numeric value from the date/time stamp differs when it comes from both or just one. Here are some examples.
Note that the number of digits on the fractional part differ quite a lot depending on the case.

Get the last non-empty cell in a column in Google Sheets

I use the following function
=DAYS360(A2, A35)
to calculate the difference between two dates in my column. However, the column is ever expanding and I currently have to manually change 'A35' as I update my spreadsheet.
Is there a way (in Google Sheets) to find the last non-empty cell in this column and then dynamically set that parameter in the above function?
There may be a more eloquent way, but this is the way I came up with:
The function to find the last populated cell in a column is:
=INDEX( FILTER( A:A ; NOT( ISBLANK( A:A ) ) ) ; ROWS( FILTER( A:A ; NOT( ISBLANK( A:A ) ) ) ) )
So if you combine it with your current function it would look like this:
=DAYS360(A2,INDEX( FILTER( A:A ; NOT( ISBLANK( A:A ) ) ) ; ROWS( FILTER( A:A ; NOT( ISBLANK( A:A ) ) ) ) ))
To find the last non-empty cell you can use INDEX and MATCH functions like this:
=DAYS360(A2; INDEX(A:A; MATCH(99^99;A:A; 1)))
I think this is a little bit faster and easier.
If A2:A contains dates contiguously then INDEX(A2:A,COUNT(A2:A)) will return the last date. The final formula is
=DAYS360(A2,INDEX(A2:A,COUNT(A2:A)))
Although the question is already answered, there is an eloquent way to do it.
Use just the column name to denote last non-empty row of that column.
For example:
If your data is in A1:A100 and you want to be able to add some more data to column A, say it can be A1:A105 or even A1:A1234 later, you can use this range:
A1:A
So to get last non-empty value in a range, we will use 2 functions:
COUNTA
INDEX
The answer is =INDEX(B3:B,COUNTA(B3:B)).
Here is the explanation:
COUNTA(range): Returns number of values in a range, we can use this to get the count of rows.
INDEX(range, row, col): Returns the content of a cell, specified by row and column offset. If the column is omitted then the whole row is returned.
Examples:
INDEX(A1:C5,1,1) = A1
INDEX(A1:C5,1) = A1,B1,C1 # Whole row since the column is not specified
INDEX(A1:C5,1,2) = B1
INDEX(A1:C5,1,3) = C1
INDEX(A1:C5,2,1) = A2
INDEX(A1:C5,2,2) = B2
INDEX(A1:C5,2,3) = C2
INDEX(A1:C5,3,1) = A3
INDEX(A1:C5,3,2) = B3
INDEX(A1:C5,3,3) = C3
For the picture above, our range will be B3:B. So we will count how many values are there in range B3:B by COUNTA(B3:B) first. In the left side, it will produce 8 since there are 8 values while it will produce 9 in the right side. We also know that the last value is in the 1st column of the range B3:B so the col parameter of INDEX must be 1 and the row parameter should be COUNTA(B3:B).
PS: please upvote @bloodymurderlive's answer since he wrote it first, I'm just explaining it here.
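A small model of INDEX(range, COUNTA(range)) in plain JavaScript may help to see when it works. The helper names countA, index and lastByCountA are invented for this sketch, and blank cells are represented as empty strings:

```javascript
// COUNTA: number of non-empty cells in a column.
function countA(column) {
  return column.filter(function (v) {
    return v !== "";
  }).length;
}

// INDEX(column, n): the n-th cell, counting from 1.
function index(column, n) {
  return column[n - 1];
}

// INDEX(range, COUNTA(range)): assumes the values have no gaps.
function lastByCountA(column) {
  return index(column, countA(column));
}
```

For a column like a, b, c followed by blanks this returns c. If there is a blank between a and c, the count is 2 and the function returns the blank second cell instead, which is why this approach suits columns that are filled without gaps.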
My favorite is:
=INDEX(A2:A,COUNTA(A2:A),1)
So, for the OP's need:
=DAYS360(A2,INDEX(A2:A,COUNTA(A2:A),1))
If the column is expanded only by contiguously added dates, as in my case, I used just the MAX function to get the last date.
The final formula will be:
=DAYS360(A2; MAX(A2:A))
Here's another one:
=indirect("A"&max(arrayformula(if(A:A<>"",row(A:A),""))))
With the final equation being this:
=DAYS360(A2,indirect("A"&max(arrayformula(if(A:A<>"",row(A:A),"")))))
The other equations on here work, but I like this one because it makes getting the row number easy, which I find I need to do more often. Just the row number would be like this:
=max(arrayformula(if(A:A<>"",row(A:A),"")))
I originally tried to find just this to solve a spreadsheet issue, but couldn't find anything useful that just gave the row number of the last entry, so hopefully this is helpful for someone.
Also, this has the added advantage that it works for any type of data in any order, and you can have blank rows in between rows with content, and it doesn't count cells with formulas that evaluate to "". It can also handle repeated values. All in all it's very similar to the equation that uses max((G:G<>"")*row(G:G)) on here, but makes pulling out the row number a little easier if that's what you're after.
Alternatively, if you want to put a script on your sheet you can make it easy on yourself if you plan on doing this a lot. Here's that script:
function lastRow(sheet, column) {
  var ss = SpreadsheetApp.getActiveSpreadsheet();
  if (column == null) {
    // No column given: return the last row with content on the sheet
    if (sheet != null) {
      var sheet = ss.getSheetByName(sheet);
    } else {
      var sheet = ss.getActiveSheet();
    }
    return sheet.getLastRow();
  } else {
    // Column given: scan it for the last cell that is not ""
    var sheet = ss.getSheetByName(sheet);
    var lastRow = sheet.getLastRow();
    var array = sheet.getRange(column + 1 + ':' + column + lastRow).getValues();
    var final = null;
    for (var i = 0; i < array.length; i++) {
      // getValues() returns one single-element array per row
      if (array[i][0] !== '') {
        final = i + 1;
      }
    }
    if (final != null) {
      return final;
    } else {
      return 0;
    }
  }
}
Here you can just type in the following if you want the last row of the sheet that you're currently editing:
=LASTROW()
or if you want the last row of a particular column from that sheet, or of a particular column from another sheet you can do the following:
=LASTROW("Sheet1","A")
And for the last row of a particular sheet in general:
=LASTROW("Sheet1")
Then to get the actual data you can either use indirect:
=INDIRECT("A"&LASTROW())
or you can modify the above script at the last two return lines (the last two since you would have to put both the sheet and the column to get the actual value from an actual column), and replace the variable with the following:
return sheet.getRange(column + final).getValue();
and
return sheet.getRange(column + lastRow).getValue();
One benefit of this script is that you can choose if you want to include equations that evaluate to "". If no arguments are added, equations evaluating to "" will be counted, but if you specify a sheet and column they will not be counted. Also, there's a lot of flexibility if you're willing to use variations of the script.
Probably overkill, but all possible.
This works for me. Get last value of the column A in Google sheet:
=index(A:A,max(row(A:A)*(A:A<>"")))
(It also skips blank rows in between if any)
This seems like the simplest solution that I've found to retrieve the last value in an ever-expanding column:
=INDEX(A:A,COUNTA(A:A),1)
For strictly finding the last non-empty cell in a column, this should work...
=LOOKUP(2^99, A2:A)
What about this formula for getting the last value:
=index(G:G;max((G:G<>"")*row(G:G)))
And this would be a final formula for your original task:
=DAYS360(G10;index(G:G;max((G:G<>"")*row(G:G))))
Suppose that your initial date is in G10.
I went a different route. Since I know I'll be adding something into a row/column one by one, I find out the last row by first counting the fields that have data. I'll demonstrate this with a column:
=COUNT(A5:A34)
So, let's say that returned 21. A5 is 4 rows down, so I need to get the 21st position from the 4th row down. I can do this using INDIRECT, like so:
=INDIRECT("A"&COUNT(A5:A34)+4)
It's finding the amount of rows with data, and returning me a number I'm using as an index modifier.
For the last non-empty row in a column:
=ARRAYFORMULA(INDIRECT("A"&MAX(IF(A:A<>"", ROW(A:A), ))))
For the last non-empty column in a row:
=ARRAYFORMULA(INDIRECT(ADDRESS(1, MAX(IF(1:1<>"", COLUMN(1:1), )), 4)))
This will give the contents of the last cell:
=indirect("A"&max(ARRAYFORMULA(row(a:a)*--(a:a<>""))))
This will give the address of the last cell:
="A"&max(ARRAYFORMULA(row(a:a)*--(a:a<>"")))
This will give the row of the last cell:
=max(ARRAYFORMULA(row(a:a)*--(a:a<>"")))
Maybe you'd prefer a script. This script is way shorter than the huge one posted above by someone else:
Go to script editor and save this script:
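function getLastRow(range) {
  // Drop trailing rows whose first cell is an empty string
  // (strict comparison, so a cell containing 0 is not treated as empty)
  while (range.length > 0 && range[range.length - 1][0] === '') range.pop();
  return range.length;
}
Once this is done you just need to enter this in a cell: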
=getLastRow(A:A)
Calculate the difference between latest date in column A with the date in cell A2.
=MAX(A2:A)-A2
To find last nonempty row number (allowing blanks between them) I used below to search column A.
=ArrayFormula(IFNA(match(2,1/(A:A<>""))))
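The way an amateur does it is "=CONCATENATE("A",COUNTUNIQUE(A1:A9999))", where A1 is the first cell in the column, and A9999 is farther down that column than I ever expect to have any entries. Note that this only gives the right row if the column has no blank cells above the last entry and no repeated values, since COUNTUNIQUE counts distinct values. The resulting A# can be used with the INDIRECT function as needed.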
Ben Collins is a Google sheets guru, he has many tips on his site for free and also offers courses. He has a free article on dynamic range names and I have used this as the basis for many of my projects.
https://www.benlcollins.com/formula-examples/dynamic-named-ranges/
Disclaimer, I have nothing to gain by referring Ben's site.
Here is a screenshot of one of my projects using dynamic ranges:
Cell D3 has this formula which was shown above except this is as an array formula:
=ArrayFormula(MAX(IF(L2s!A2:A1009<>"",ROW(2:1011))))
Cell D4 has this formula:
="L2s!A2:E"&D3
This may work:
=DAYS360(A2,INDEX(A2:A,COUNTA(A2:A)))
To pick the last in a column of arbitrary, non-empty values ignoring the header cell (A1):
=INDEX(A2:A,COUNTA(A2:A))
With the introduction of LAMBDA and REDUCE functions we can now compute the row number in a single pass through the cells (Several of the solutions above filter the range twice.) and without relying on magic text or numeric values.
=lambda(rng,
REDUCE(0, rng, lambda(maxrow, cell, if(isblank(cell),maxrow,row(cell)) ) )
)(A:A)
which can be nicely packaged into a Named Function for usage like
=LAST_ROWNUM(A:A)
It works on columns with interspersed blanks, and multi-column ranges (because REDUCE iterates over the range in row-first), and partial columns (like A20:A), still returning the actual row number (not the offset within the range).
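For readers more at home in script than in LAMBDA, the same single-pass idea can be written with Array.prototype.reduce. This is a sketch in plain JavaScript and not the Named Function itself; lastRowNum is an invented name, values is assumed to be the 2D array of a range, and startRow is the sheet row of its first line:

```javascript
// values: 2D array as returned for a range; startRow: sheet row of values[0].
function lastRowNum(values, startRow) {
  return values.reduce(function (maxRow, rowValues, i) {
    // A row counts as non-empty if any of its cells has content.
    var hasContent = rowValues.some(function (cell) {
      return cell !== "";
    });
    // Later rows overwrite earlier ones, so the final value is the last hit.
    return hasContent ? startRow + i : maxRow;
  }, 0);
}
```

As with the formula, blanks in between do not matter, a partial range such as A20:A still yields the real sheet row when startRow is 20, and an entirely blank range gives 0.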
This can then be combined with Index to return the value
=DAYS360(A2, Index(A:A, LAST_ROWNUM(A:A)))
(In truth, though, I suspect that the OP's date values are monotonic (even if with blanks in between), and that he could get away with
=DAYS360(A2, MAX(A2:A))
This solution is identified above as relying on the dates being "contiguous" - whether that means "no blanks" or "no missing dates" I'm not certain - but either stipulation is not necessary.)
