Extracting Info from URL using Google Sheets - google-sheets

I would like to take a URL and extract characters to multiple cells in Google Sheets.
The format of these URLs is:
https://www.sideshow.com/collectibles/{PRODUCT_NAME}-{PRODUCT_ID}
This is an example of the URL:
https://www.sideshow.com/collectibles/marvel-scarlet-witch-sideshow-collectibles-300485
I'd like to extract the following to 2 different cells:
Product ID: 300485 written to cell A1 as 300485
This 6 digit number will change on every URL but it will always be the last 6 characters of the URL
Product Name: marvel-scarlet-witch written to cell B1 as Marvel Scarlet Witch (no dashes, proper case format)
The Product Name, marvel-scarlet-witch, will change with every URL but its position in the URL is constant.
I'm not sure if all this can be done in a single action or if two separate ones would be required.

Alternatively you can also try (assuming string in D1)
=split(proper(substitute(regexreplace(D1, ".*\/(.*?)-side.*?([^-]\d+)$", "$2✓$1"), "-", " ")), "✓")
Change range to suit.
See this link for a brief explanation about the regular expression.

Assuming that the string is in cell C1.
To extract the Product ID from the given URL format, use this formula in cell A1:
=RIGHT(C1,6)
To extract the Product Name from the given URL format, use this formula in cell B1:
=PROPER(SUBSTITUTE(REGEXEXTRACT(C1,".*/(.*)-sideshow-collectibles"), "-", " "))
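If you want to sanity-check the extraction logic outside Sheets, here is a minimal Python sketch of the same idea. The function name parse_sideshow_url is hypothetical, and it assumes the last path segment always ends in -sideshow-collectibles- followed by the numeric ID, as in the example URL.

```python
import re


def parse_sideshow_url(url: str) -> tuple[str, str]:
    """Return (product_id, product_name) for a URL in the question's format."""
    # Last path segment: everything after the final slash.
    slug = url.rsplit("/", 1)[-1]
    # Assumption: the slug looks like "<name>-sideshow-collectibles-<digits>".
    match = re.fullmatch(r"(.*)-sideshow-collectibles-(\d+)", slug)
    if match is None:
        raise ValueError(f"unexpected URL format: {url}")
    name, product_id = match.groups()
    # Dashes become spaces, then title case, roughly like PROPER(SUBSTITUTE(...)).
    return product_id, name.replace("-", " ").title()


url = "https://www.sideshow.com/collectibles/marvel-scarlet-witch-sideshow-collectibles-300485"
print(parse_sideshow_url(url))
```

The ID is returned as text here so that any leading zeros would survive.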

Related

Combining a Cell Reference and Wildcard as Criterion in Google Sheets CountIf Formula

I'm struggling with writing the proper syntax for this formula in Google Sheets. In one sheet called Game Log, in the H column I have data that can be a range of names (1 - 10 names per row). I'm trying to construct a COUNTIF statement that would search for the name in all the rows for that column. There can be several other names in the same column so I need to use the wildcard * to find any occurrence of the name in each row. So for example, the current code below would count all occurrences of Adam in the rows.
=COUNTIF('Game Log'!H3:H102, "*Adam*")
What I would like to do is replace the hard-coded "Adam" with a cell reference (in this case B2). Is it possible to combine that cell reference with the wildcard? I know the code below doesn't work (as it would count occurrences of the literal text B2), but is something like this possible?
=COUNTIF('Game Log'!H3:H102, "*B2*")
Have you tried something like this?
=COUNTIF('Game Log'!H3:H102, "*" & B2 & "*")
That ought to look for any string value, followed by the cell value, followed again by any string value. The & operator concatenates the "*" wildcards with the value in B2 into a single criterion string (for example "*Adam*"), so COUNTIF matches any cell that contains that value anywhere in its text.
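To see why the concatenation works, here is a rough Python sketch of the same pattern-building step. The helper name count_containing and the sample cells are made up for illustration; fnmatch is only a stand-in for COUNTIF's wildcard matching, not the same engine.

```python
from fnmatch import fnmatchcase


def count_containing(cells: list[str], name: str) -> int:
    """Count cells whose text contains `name`, ignoring case."""
    # Same idea as "*" & B2 & "*": build one pattern string from the pieces.
    pattern = "*" + name.lower() + "*"
    # COUNTIF ignores case, so both sides are lowercased before matching.
    # Caveat: fnmatch also treats "?" and "[...]" in `name` as wildcards.
    return sum(1 for cell in cells if fnmatchcase(cell.lower(), pattern))


# Hypothetical rows from column H of the Game Log sheet.
rows = ["Adam, Bob", "Carl", "adam", "Bob, Adamson"]
print(count_containing(rows, "Adam"))
```

Note that "Adamson" is counted as well, exactly as "*Adam*" would count it in COUNTIF.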

Regex to find URL without the file name and extension

I have a Google Sheet with hundreds of image URLs from Dropbox, and I want to write a regex that matches the prefix part of each URL, without the file name and extension. I'll then replace the match with another URL.
What I have:
https://www.dropbox.com/s/dtauvpuy3a5qyu4/A1001.jpg
https://www.dropbox.com/s/dtauvpuy3a5qyu4/A1001.jpg
https://www.dropbox.com/s/dtauvpuy3a5qyu4/A1001.jpg
https://www.dropbox.com/s/d0xedx0j72v5uub/A1002-1.jpg
https://www.dropbox.com/s/d0xedx0j72v5uub/A1002-1.jpg
What I expect:
https://anotherurl.com/A1001.jpg
https://anotherurl.com/A1001.jpg
https://anotherurl.com/A1001.jpg
https://anotherurl.com/A1002-1.jpg
https://anotherurl.com/A1002-1.jpg
Please, how can I do that?
Suppose your original URLs are in the range A2:A of some sheet. Clear some other column entirely (e.g., B:B), and place the following formula in B2:
=ArrayFormula(IF(A2:A="",,REGEXREPLACE(A2:A,"(.+//).+(/[^/]+)$","$1"&"another.url"&"$2")))
Where you see "another.url", you can either manually enter some other URL text between the quotation marks, or you can enter the new URL text in some cell (say, B1) and use the cell reference instead:
=ArrayFormula(IF(A2:A="",,REGEXREPLACE(A2:A,"(.+//).+(/[^/]+)$","$1"&B1&"$2")))
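For anyone who wants to test the pattern before pasting it into a sheet, here is a small Python sketch using the same regular expression. The function name swap_prefix is hypothetical; the pattern itself is the one from the formula.

```python
import re


def swap_prefix(url: str, new_host: str) -> str:
    """Keep the scheme and the file name, replace everything in between."""
    # Group 1 is the scheme up to "//", group 2 is the final "/file.ext".
    pattern = r"(.+//).+(/[^/]+)$"
    return re.sub(pattern, lambda m: m.group(1) + new_host + m.group(2), url)


print(swap_prefix("https://www.dropbox.com/s/d0xedx0j72v5uub/A1002-1.jpg", "anotherurl.com"))
```

A replacement function is used instead of a template string so that a backslash in new_host cannot be misread as a group reference.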

Google Sheets' dynamic data range in query function

Is it possible to have a dynamic data range when using the QUERY function in Google Sheets?
What I would like to do is use a drop-down to change the data range used in the QUERY function.
For example, I have 4 tables in 4 different sheets. On my main sheet, I want to use my drop-down to run a query on the selected table.
Is it necessary to do it with a script?
You can make a dynamic query without using a script.
The query string can contain a reference to other cells.
Example in Sheets.
This example has a pulldown for the data set in B2, a pulldown for the value set in B4. The data ranges include one from another sheet. I am using named ranges to simplify the lookup process. Each data set n is named DataN.
You can separate out the query string from the cell with the actual query function call. The trick is to build up a query string using INDIRECT, COLUMN, and VALUE. I placed this in cell A10:
="select " & mid("ABCDEFGH",COLUMN(INDIRECT(B2)),1) & " where " & mid("ABCDEFGH",VALUE(COLUMN(INDIRECT(B2)))+1,1) & "=" & """" & B4 & """"
The four quotes let us place a literal quote in the query string. The '&' character does string concatenation.
The use of MID as a way of translating the COLUMN function to a letter I got from here.
Then your cell with the query uses the values of the data set pulldown (B2) and the value of the query string (A10) like this:
=QUERY(INDIRECT(B2),A10,1)
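The string-building step in A10 can be hard to read, so here is a Python sketch of what it produces. The function name build_query is hypothetical, and first_col stands in for the value of COLUMN(INDIRECT(B2)), i.e. the 1-based column number where the selected named range starts.

```python
def build_query(first_col: int, value: str) -> str:
    """Mimic the A10 formula: select the first column where the next one equals value."""
    letters = "ABCDEFGH"
    # MID is 1-based, Python indexing is 0-based, hence the -1.
    select_col = letters[first_col - 1]
    # The column right after it holds the value being filtered on.
    where_col = letters[first_col]
    # The literal double quotes correspond to the """" pieces in the formula.
    return f'select {select_col} where {where_col}="{value}"'


print(build_query(1, "red"))
```

Like the formula, this sketch only covers ranges that start in columns A through G.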

How can I extract the exact part of the text on the cell of google sheet when the text can change?

In a Google Sheets spreadsheet, I have the cell A1 with value "people 12-14 ABC". I want to extract the exact match "ABC" into another cell. The contents of cell A1 can change, e.g. to "woman 60+ ABCD". For this input, I would want to extract "ABCD". If A1 was instead "woman 12-20 CAE", I would want "CAE".
There are 5 possible strings that the last part may be: (ABC, ABCD, AB, CAE, C), while the first portions are very numerous (~400 possibilities).
How can I determine which of the 5 strings is in A1?
If the first part "only" has lower case or numbers and the last part "only" UPPER case,
=REGEXREPLACE(D3;"[^A-E]";)
Or, anchoring on the space before the final part:
=REGEXEXTRACT(A31;"\s([A-E]+)$")
If you can guarantee well-formatted input, this is simply a matter of splitting the contents of A1 into its component parts (e.g. "gender_filter", "age range", and "my 5 categories"), and selecting the appropriate index of the resultant array of strings.
To convert a cell's contents into an array of that content, the SPLIT() function can be used.
B1 = SPLIT(A1, " ")
would put entries into B1, C1, and D1, where D1 has the value you want, provided your gender filter and age range contain no spaces.
Since you probably don't want to have those excess junk values, you want to contain the result of split entirely in B1. To do this, we need to pass the array generated by SPLIT to a function that can take a range or array input. As a bonus, we want to sub-select a part of this range (specifically, the last one). For this, we can use the INDEX() function
B1 = INDEX(SPLIT(A1, " "), 1, COUNTA(SPLIT(A1, " ")))
This tells the INDEX function to access the first row and the last column of the range produced by SPLIT, which for the inputs you have provided, is "ABC", "ABCD", and "CAE".
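The same split-and-take-the-last-part approach can be sketched in Python. The function name last_code is hypothetical, and the check against the five allowed strings is an extra guard that the formulas above do not perform.

```python
ALLOWED_CODES = {"ABC", "ABCD", "AB", "CAE", "C"}


def last_code(text: str) -> str:
    """Return the final space-separated token, like INDEX(SPLIT(...), 1, COUNTA(...))."""
    parts = text.split()
    code = parts[-1] if parts else ""
    # Extra guard (not in the formulas): reject anything outside the five known codes.
    if code not in ALLOWED_CODES:
        raise ValueError(f"unexpected trailing code in: {text!r}")
    return code


for sample in ("people 12-14 ABC", "woman 60+ ABCD", "woman 12-20 CAE"):
    print(last_code(sample))
```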

Google Sheets: "Bob Smith" --> "bsmith" formula?

I'm trying to pull data from another Google Sheet to feed another Google Sheet. I need to pull from a full name field, which will have something like, "Bob Smith" and then I need to have it rewrite into the new Google Sheet as "bsmith".
Basically, "Get first letter of the first string, then concatenate the entire second string, and then make all lowercase."
So far I've gotten =LEFT(A28,1) working to grab the first letter of a string, but then not sure how to grab the second word and then concatenate.
To get the 2nd word you need to FIND() the first space then read from that position + 1 to the end of the string using MID(). & is used for concatenation.
=lower(left(A28,1) & mid(A28, find(" ", A28) + 1, len(A28)))
Try this for a Google sheet specific solution:
=LOWER(REGEXREPLACE(A2,"^(\w).*?(\w+$)","$1$2"))
It uses a regular expression, which is more flexible and easier to adapt to variations than LEFT and/or MID.
Shorter:
=lower(left(A28)&index(split(A28," "),2))
(Assumes only ever two words.)
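For comparison, here is a minimal Python sketch of the same transformation. The function name username is hypothetical. Like the regex answer, it takes the last word, so middle names are skipped; it assumes the name has at least two words.

```python
def username(full_name: str) -> str:
    """First letter of the first word plus the whole last word, lowercased."""
    words = full_name.split()
    # Assumption: at least a first and a last name are present.
    if len(words) < 2:
        raise ValueError(f"expected at least two words: {full_name!r}")
    return (words[0][0] + words[-1]).lower()


print(username("Bob Smith"))
```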
