I have a Google Sheet with hundreds of image URLs from Dropbox, and I want to write a regex that matches the prefix part of each URL, that is, everything except the file name and extension. I'll then replace the match with another URL.
What I have:
https://www.dropbox.com/s/dtauvpuy3a5qyu4/A1001.jpg
https://www.dropbox.com/s/dtauvpuy3a5qyu4/A1001.jpg
https://www.dropbox.com/s/dtauvpuy3a5qyu4/A1001.jpg
https://www.dropbox.com/s/d0xedx0j72v5uub/A1002-1.jpg
https://www.dropbox.com/s/d0xedx0j72v5uub/A1002-1.jpg
What I expect:
https://anotherurl.com/A1001.jpg
https://anotherurl.com/A1001.jpg
https://anotherurl.com/A1001.jpg
https://anotherurl.com/A1002-1.jpg
https://anotherurl.com/A1002-1.jpg
Please, how can I do that?
Suppose your original URLs are in the range A2:A of some sheet. Clear some other column entirely (e.g., B:B), and place the following formula in B2:
=ArrayFormula(IF(A2:A="",,REGEXREPLACE(A2:A,"(.+//).+(/[^/]+)$","$1"&"another.url"&"$2")))
Where you see "another.url", you can either manually enter some other URL text between the quotation marks, or you can enter the new URL text in some cell (say, B1) and use the cell reference instead:
=ArrayFormula(IF(A2:A="",,REGEXREPLACE(A2:A,"(.+//).+(/[^/]+)$","$1"&B1&"$2")))
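To check the pattern outside Sheets, here is a minimal Python sketch of the same replacement. It assumes Python's `re` module behaves like the Sheets regex engine for this particular pattern, and the helper name `swap_host` is made up for illustration:

```python
import re

# Same pattern as in the formula: group 1 is the scheme ("https://"),
# group 2 is the last "/" plus the file name.
PATTERN = re.compile(r"(.+//).+(/[^/]+)$")


def swap_host(url: str, new_host: str) -> str:
    """Replace everything between the scheme and the file name with new_host."""
    # A function replacement avoids any escaping issues inside new_host.
    return PATTERN.sub(lambda m: m.group(1) + new_host + m.group(2), url)


urls = [
    "https://www.dropbox.com/s/dtauvpuy3a5qyu4/A1001.jpg",
    "https://www.dropbox.com/s/d0xedx0j72v5uub/A1002-1.jpg",
]
converted = [swap_host(u, "anotherurl.com") for u in urls]
for line in converted:
    print(line)
```

Strings that do not match the pattern are returned unchanged, which is also how REGEXREPLACE treats them.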
Related
I need to get the text from URLs.
The formula INDEX(SPLIT(B3,"/",),"5") works if the URL is formatted as a direct link.
However, if the URL is a redirect link, it extracts "picassoRedirect.html" instead. For a redirect URL, I need to use the formula RIGHT(INDEX(SPLIT(B3,"%",),"23"),10) to get the text I want.
Is there a way I can combine both formulas and get the text in either case?
Use regexextract(), like this:
=arrayformula( { "ASIN"; iferror( regexextract(B3:B, "dp(?:%2F|/)(\w+)") ) } )
This array formula will fill a whole column in one go. Put the formula in cell C2.
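As a rough check of why one pattern covers both cases, here is a Python sketch using the same regex. The two URLs are hypothetical stand-ins (the original question does not show its URLs), and `extract_asin` is an illustrative name:

```python
import re

# "dp" followed by either a literal "/" or its URL-encoded form "%2F",
# then the run of word characters that follows.
ASIN_PATTERN = re.compile(r"dp(?:%2F|/)(\w+)")


def extract_asin(url: str) -> str:
    """Return the captured ID, or an empty string when nothing matches
    (similar to wrapping REGEXEXTRACT in IFERROR)."""
    match = ASIN_PATTERN.search(url)
    return match.group(1) if match else ""


# Hypothetical example URLs, not taken from the question.
direct = "https://www.example.com/dp/B000000000"
redirect = (
    "https://www.example.com/picassoRedirect.html"
    "?url=https%3A%2F%2Fwww.example.com%2Fdp%2FB000000000%2Fref"
)

print(extract_asin(direct))
print(extract_asin(redirect))
```

In the redirect case the capture stops at the next "%", because "%" is not a word character.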
I would like to take a URL and extract characters to multiple cells in Google Sheets.
The format of these URLs is:
https://www.sideshow.com/collectibles/{PRODUCT_NAME}-{PRODUCT_ID}
This is an example of the URL:
https://www.sideshow.com/collectibles/marvel-scarlet-witch-sideshow-collectibles-300485
I'd like to extract the following to 2 different cells:
Product ID: 300485 written to cell A1 as 300485
This 6-digit number changes with every URL, but it is always the last 6 characters of the URL.
Product Name: marvel-scarlet-witch written to cell B1 as Marvel Scarlet Witch (no dashes, proper case format)
The Product Name, marvel-scarlet-witch, will change with every URL but its position in the URL is constant.
I'm not sure if all this can be done in a single action or if two separate ones would be required.
Alternatively you can also try (assuming string in D1)
=split(proper(substitute(regexreplace(D1, ".*\/(.*?)-side.*?([^-]\d+)$", "$2✓$1"), "-", " ")), "✓")
Change range to suit.
See this link for a brief explanation about the regular expression.
Assuming that the string is in cell C1.
To extract the Product ID from the given URL format, use this formula in cell A1:
=RIGHT(C1,6)
To extract the Product Name from the given URL format, use this formula in cell B1:
=PROPER(SUBSTITUTE(REGEXEXTRACT(C1,".*/(.*)-sideshow-collectibles"), "-", " "))
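Both extractions can be checked outside Sheets with a short Python sketch. It assumes, as the formula does, that the product name is always followed by "-sideshow-collectibles"; `parse_product` is an illustrative name:

```python
import re

# Same pattern as the REGEXEXTRACT call: capture whatever sits between
# the last "/" and "-sideshow-collectibles".
NAME_PATTERN = re.compile(r".*/(.*)-sideshow-collectibles")


def parse_product(url: str) -> tuple[str, str]:
    """Return (product_id, product_name) from a Sideshow collectibles URL."""
    product_id = url[-6:]  # counterpart of RIGHT(C1, 6)
    match = NAME_PATTERN.search(url)
    # Dashes become spaces and each word is capitalized, like PROPER(SUBSTITUTE(...)).
    name = match.group(1).replace("-", " ").title() if match else ""
    return product_id, name


url = "https://www.sideshow.com/collectibles/marvel-scarlet-witch-sideshow-collectibles-300485"
product_id, product_name = parse_product(url)
print(product_id)
print(product_name)
```

If a URL lacks the "-sideshow-collectibles" marker, the sketch returns an empty name, whereas REGEXEXTRACT would return an error.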
So I have a document with 30k+ emails. The problem is that, during the export, random characters appeared after the emails, something like name#email.com2019-10-10T0545152019-10-10T054515f or name#email.com00000000000700392019-11-28T070033f
My question is: how do I remove everything after ".com" or ".fr" in all the cells?
You could try using REGEXREPLACE.
=REGEXREPLACE(A1,"(\.com|\.fr).*", "$1")
Try
=REGEXEXTRACT(A1,".+\.com|.+\.fr")
Working from what other people added, you can get all emails from the column A and use regular expressions to get the values. Using ARRAYFORMULA you can do it in a single formula:
=ARRAYFORMULA(IF(A:A<>""; REGEXEXTRACT(A:A; ".+\.(?:com|fr)"); ""))
Rundown
ARRAYFORMULA applies the formula to the entire column.
REGEXEXTRACT extracts part of the string using a regular expression.
IF is a conditional. Here it skips empty cells, which prevents an error.
References
ARRAYFORMULA (Docs Editor Help)
REGEXEXTRACT (Docs Editor Help)
IF (Docs Editor Help)
Supposing your raw-data email list were in A2:A, try this in row 2 of an otherwise empty column (e.g., B2):
=ArrayFormula(IF(A2:A="",,REGEXEXTRACT(A2:A,"^.+\.\D+")))
In plain English, this means "Extract everything from the start of the string through the last dot that is followed by non-digits, plus that run of non-digits."
This should pull up to any suffix (e.g., .com, .co, .biz, .org, .ma.gov, etc.).
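To compare the two extraction patterns from these answers on the sample strings in the question, here is a minimal Python sketch. It assumes Python's `re` treats these patterns the same way Sheets does; the function names are illustrative:

```python
import re

# Fixed list of suffixes, as in ".+\.(?:com|fr)".
FIXED_SUFFIX = re.compile(r".+\.(?:com|fr)")
# Any suffix made of non-digits, as in "^.+\.\D+".
ANY_SUFFIX = re.compile(r"^.+\.\D+")


def clean_fixed(cell: str) -> str:
    """Keep the text through '.com' or '.fr'; empty string if neither is found."""
    match = FIXED_SUFFIX.search(cell)
    return match.group(0) if match else ""


def clean_any(cell: str) -> str:
    """Keep the text through the last dot plus the non-digits right after it."""
    match = ANY_SUFFIX.search(cell)
    return match.group(0) if match else ""


samples = [
    "name#email.com2019-10-10T0545152019-10-10T054515f",
    "name#email.com00000000000700392019-11-28T070033f",
]
fixed_results = [clean_fixed(s) for s in samples]
any_results = [clean_any(s) for s in samples]
print(fixed_results)
print(any_results)
```

The second pattern relies on the junk starting with a digit, which holds for both samples. If the junk could start with a letter, the fixed-suffix pattern would be the safer choice.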
I copied a list of company names from a website; each name carries its own hyperlink.
Now I would like to paste the names into column B of a Google spreadsheet and the list of links into column C.
The sample spreadsheet is shown here.
Column B shows name and column C shows its link like http://.....
The syntax of the =HYPERLINK function is as follows:
HYPERLINK(url, [link_label])
Is there any way I can make the [link_label] become link url itself?
Or is there any other way to list all the hyperlink of a sheet on a column?
The square brackets in a Google function indicate that the parameter is optional. HYPERLINK defaults to display the link itself if the [link_label] is omitted.
In other words, =HYPERLINK("www.example.com") will display as www.example.com.
See Google's documentation.
How can I extract a specific word or words from a URL to display in another column on Google Spreadsheets? The URL is https://seatgeek.com/bands/katy-perry?p=3 and I have to extract "katy perry" from this URL. I also have to create a second formula that will display the same URL with a date from another column on the spreadsheet.
Look up regular expressions for VBA. This way you can perform pattern matching with a lot of flexibility.
Here:
http://www.macrostash.com/2011/10/08/simple-regular-expression-tutorial-for-excel-vba/
or better yet, here:
How to use Regular Expressions (Regex) in Microsoft Excel both in-cell and loops
How's this - change A3 as needed to match the Cell with the URL:
=SUBSTITUTE(MID(A3,SEARCH(";",SUBSTITUTE(A3,"/",";",4))+1,FIND("?",SUBSTITUTE(A3,"/",";",4))-SEARCH(";",SUBSTITUTE(A3,"/",";",4))-1),"-"," ")
What this is doing is switching out the '/' right before 'katy-perry' with a unique (to that cell) mark, the semi-colon. Then, using MID(), extract the info between the substituted ';' and the '?'.
Edit: This should work with any name length (i.e. 'katy-perry','katyyyyyy-peeerrryyy'). Note that it assumes that you will ALWAYS have a URL with four '/' before the artist's name.
The single sample URL you provided leaves one wondering whether the format is standard across the other URLs you may have listed. If this is typical of the way the other URLs are constructed, you can locate the question mark and the last forward slash to parse out katy-perry. Here it is in steps, then all together.
The following instructions assume that https://seatgeek.com/bands/katy-perry?p=3 is in A1.
Append a question mark to the end just in case there isn't one in the URL and use the first question mark found to strip off anything right of that.
=LEFT(A1, FIND("?", A1&"?")-1)
Replace all forward slashes with 99 spaces.
=SUBSTITUTE(LEFT(A1, FIND("?", A1&"?")-1), "/", REPT(" ", 99))
Peel off the right-most 99 characters and trim off extra spaces.
=TRIM(RIGHT(SUBSTITUTE(LEFT(A1, FIND("?", A1&"?")-1), "/", REPT(" ", 99)), 99))
The result should be katy-perry. This formula is Google-Spreadsheet friendly.
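The three steps can be mirrored in a short Python sketch to see why the padding trick isolates the last path segment. The function name `last_path_segment` is illustrative, and the sketch assumes the segment is shorter than 99 characters, as the formula does:

```python
def last_path_segment(url: str, pad: int = 99) -> str:
    """Mimic the LEFT / SUBSTITUTE / RIGHT / TRIM steps from the formula."""
    # Step 1: append "?" so a match always exists, then cut at the first "?".
    padded = url + "?"
    base = padded[: padded.index("?")]
    # Step 2: replace every "/" with `pad` spaces.
    spaced = base.replace("/", " " * pad)
    # Step 3: take the right-most `pad` characters and trim the spaces.
    return spaced[-pad:].strip()


segment = last_path_segment("https://seatgeek.com/bands/katy-perry?p=3")
artist = segment.replace("-", " ")
print(segment)
print(artist)
```

Because each slash becomes a gap at least as wide as the window, the last 99 characters can only contain the final segment plus leading spaces.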