Get root domain from list of subdomains - google-sheets

I am using Google Sheets and have a list of subdomains:
app.example.com
appserver.example.com
bigstone.example.com
cpanel.example.com
cpanel.example3.com
cpanel.example4.com
cpanel.example2.com
cpanel.example2.com
I would like to get:
example.com
example2.com
example3.com
example4.com
Find below my example sheet:
Google Sheet Example
I tried =left(F2, find(".", F2,2)), but I only get .app etc.
Any suggestions on what I am doing wrong?

Maybe this will work for you in cell E2:
=ARRAYFORMULA(UNIQUE(IFERROR(MID(A2:A,FIND(".",A2:A)+1,100))))

You were on the right path, but take the text from the right-hand side instead:
=RIGHT(A1, LEN(A1) - FIND(".", A1))
FIND returns the position of the first dot, so RIGHT keeps the LEN(A1) - FIND(".", A1) characters that follow it, that is, everything after the first dot.
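For readers who want to check the logic outside a spreadsheet, here is a minimal Python sketch of the same idea: drop everything up to the first dot, then de-duplicate. The function names are made up for illustration, and the sketch assumes every hostname has exactly one subdomain label in front of the root domain; deeper names such as a.b.example.com would need a different rule.

```python
def strip_first_label(hostname: str) -> str:
    """Return everything after the first dot, or the input unchanged if it has no dot."""
    _, dot, rest = hostname.partition(".")
    return rest if dot else hostname


def unique_root_domains(hostnames):
    """Strip the first label from each non-empty hostname and return sorted unique results."""
    return sorted({strip_first_label(h) for h in hostnames if h})


# The list from the question, including its duplicate entry.
subdomains = [
    "app.example.com",
    "appserver.example.com",
    "bigstone.example.com",
    "cpanel.example.com",
    "cpanel.example3.com",
    "cpanel.example4.com",
    "cpanel.example2.com",
    "cpanel.example2.com",
]

roots = unique_root_domains(subdomains)
print(roots)
```

The set comprehension plays the role of UNIQUE in the first answer, and partition plays the role of FIND plus RIGHT or MID.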

Unable import text using importxml and xpath inside div

I'm using IMPORTXML with an XPath in Google Sheets to scrape the download count from a Japanese website. I want to save the number/text inside this red box.
Here's the link:
https://www.photo-ac.com/main/detail/4465781?title=%E3%82%A2%E3%82%B2%E3%83%8F%E8%9D%B6%E3%81%A8%E3%83%92%E3%83%A3%E3%82%AF%E3%83%8B%E3%83%81%E3%82%BD%E3%82%A6
Here's my formula:
=IMPORTXML("https://www.photo-ac.com/main/detail/4465781?title=アゲハ蝶とヒャクニチソウ", "/html/body/div[17]/div/div/div/div[2]/div[7]/div[1]/div[1]/div/div[3]/div[2]/div[1]//text()")
The formula doesn't work. Why?
Thank you.
When I tested your formula, it failed with the error "Could not fetch url:". However, the same URL can be requested from Google Apps Script using UrlFetchApp, so this answer proposes a custom function written in Apps Script. The sample script is as follows.
Sample script:
Copy and paste the following script into the script editor of the Google Spreadsheet, save it, and enter the formula =SAMPLE("URL") in a cell. If the function name is not found, reopen the Spreadsheet and try again. The script runs as a custom function.
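function SAMPLE(url) {
  // Fetch the page and match the text from the download label to the end of its line.
  const value = UrlFetchApp.fetch(url).getContentText().match(/ダウンロード:.+/);
  if (!value) throw new Error("Value was not retrieved.");
  // match() returns an array; return only the matched text.
  return value[0];
}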
Result:
When above script is used, the following result is obtained.
Note:
This sample script is written for the current HTML of https://www.photo-ac.com/main/detail/4465781?title=アゲハ蝶とヒャクニチソウ. If the HTML structure of the page changes, the script might stop working, so please be careful.
References:
Custom Functions in Google Sheets
fetch(url)
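The matching step of the script can be tried offline. The following Python sketch applies the same regular expression to a string; the sample text and the number in it are hypothetical stand-ins, not the real page content, and the function name is made up for illustration.

```python
import re

# Same pattern as in the Apps Script sample: the label followed by the rest of the line.
LABEL_PATTERN = re.compile(r"ダウンロード:.+")


def extract_download_line(page_text: str) -> str:
    """Return the first line fragment that starts at the download label."""
    match = LABEL_PATTERN.search(page_text)
    if match is None:
        raise ValueError("Value was not retrieved.")
    return match.group(0)


# Hypothetical page fragment, used only to exercise the pattern.
sample_text = "<div>\nダウンロード:123\n</div>"
result = extract_download_line(sample_text)
print(result)
```

Because `.` does not match a newline, the match stops at the end of the line that contains the label, which is why the pattern depends on the page layout staying the same.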

importxml of url with Hebrew returns in encoding other than UTF-8 that chrome doesn't recognize

For example, in the dummy spreadsheet (tab 'desired outcome'), under "Link 1" you will see this URL:
http://www.promotion-il.co.il/service/%5DE%5E4%5D9%5E5-%5E8%5D9%5D7-%5D7%5E9%5DE%5DC%5D9-%5DC%5E2%5E1%5E7%5D9%5DD/
However, the actual URL in UTF-8 is:
http://www.promotion-il.co.il/service/%D7%9E%D7%A4%D7%99%D7%A5-%D7%A8%D7%99%D7%97-%D7%97%D7%A9%D7%9E%D7%9C%D7%99-%D7%9C%D7%A2%D7%A1%D7%A7%D7%99%D7%9D/
The actual URL string that contains Hebrew is:
http://www.promotion-il.co.il/service/מפיץ-ריח-חשמלי-לעסקים/
I will also add that the same URL has returned with a proper UTF-8 encoding for other blog posts. (See second example in the same tab).
Why is it happening?
How can it be fixed?
Thanks in advance!
This is the solution I eventually came up with:
I saw that two replacements were needed to fix a broken imported URL:
5D --> D7%9
5E --> D7%A
I used this formula in a separate column to apply them:
=ARRAYFORMULA(SUBSTITUTE(SUBSTITUTE((<COLUMN WITH IMPORTED URLS HERE>),"5D","D7%9"),"5E","D7%A"))
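The two replacements can be checked against the URLs from the question with a short Python sketch. The function name is made up for illustration. One observation that may explain the pattern: the broken escapes look like Unicode code points (for example %5DE for the letter U+05DE) rather than UTF-8 bytes (%D7%9E), which is why swapping the prefix is enough here. The sketch assumes, like the formula, that "5D" and "5E" occur only inside these broken escapes; a blind replacement would also alter any other occurrence of those two-character strings.

```python
from urllib.parse import unquote


def repair_hebrew_escapes(url: str) -> str:
    """Apply the two substitutions from the answer, in the same order as the formula."""
    return url.replace("5D", "D7%9").replace("5E", "D7%A")


# Broken URL as it appears in the question.
broken = (
    "http://www.promotion-il.co.il/service/"
    "%5DE%5E4%5D9%5E5-%5E8%5D9%5D7-%5D7%5E9%5DE%5DC%5D9-%5DC%5E2%5E1%5E7%5D9%5DD/"
)

fixed = repair_hebrew_escapes(broken)
print(fixed)           # percent-encoded UTF-8 form
print(unquote(fixed))  # human-readable form with Hebrew characters
```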

Using google query to download parts of a published sheet

This works:
curl 'https://docs.google.com/spreadsheets/d/e/2PACX-1vS3iBtVf4i_won5zAN9NGPqhcd6CcTb-4QHxpisSjCmlgV95B6mFmZvtMaC9GPvD7m8kD-6XLkVAhfc/pub?gid=911257845&single=true&output=csv'
however I want to only pick up rows where count > 300.
The query before encoding would be
select * where F > 300
After encoding
select%20*%20where%20F%3E300
So the url becomes
https://docs.google.com/spreadsheets/d/e/2PACX-1vS3iBtVf4i_won5zAN9NGPqhcd6CcTb-4QHxpisSjCmlgV95B6mFmZvtMaC9GPvD7m8kD-6XLkVAhfc/pub?gid=911257845&output=csv&tq=select%20*%20where%20F%3E300
The URL above retrieves a file, but it returns the whole file and doesn't filter.
Note that a published web sheet has the form
https://docs.google.com/spreadsheets/d/e/KEY/pub?gid=GID
https://docs.google.com/spreadsheets/d/e/2PACX-1vS3iBtVf4i_won5zAN9NGPqhcd6CcTb-4QHxpisSjCmlgV95B6mFmZvtMaC9GPvD7m8kD-6XLkVAhfc/pub?gid=911257845
This works. Adding &output=csv to it (no space before the &) works, and it downloads as a csv file. This opens in excel and shows the data in the table.
I tried this:
https://docs.google.com/spreadsheets/d/e/2PACX-1vS3iBtVf4i_won5zAN9NGPqhcd6CcTb-4QHxpisSjCmlgV95B6mFmZvtMaC9GPvD7m8kD-6XLkVAhfc/pub?gid=911257845&output=csv&tq=select%20*%20where%20F%3E%20300
and
https://docs.google.com/spreadsheets/d/e/2PACX-1vS3iBtVf4i_won5zAN9NGPqhcd6CcTb-4QHxpisSjCmlgV95B6mFmZvtMaC9GPvD7m8kD-6XLkVAhfc/gviz/tq?gid=911257845&output=csv&tq=select%20*%20where%20F%3E300
and get errors -- resource not available.
The page above should be public for people who want to try.
This may come down to the difference between publishing a sheet and sharing a whole spreadsheet with anyone who has the link.
I've created a new page that uses importrange() to pull in the page from the main sheet, and that one is public.
https://docs.google.com/spreadsheets/d/1-lqLuYJyHAKix-T8NR8wV8ZUUbVOJrZTysccid2-ycs/edit?usp=sharing
How about this modification?
Modification points:
To run a query, use a URL of the form https://docs.google.com/spreadsheets/d/### file ID ###/gviz/tq?gid=###&tq=### query ###.
When select%20*%20where%20%F%3E300 is decoded, it is select * where %F>300.
select * where F > 300 encodes to select%20%2a%20where%20F%20%3e%20300.
To output CSV, use tqx=out:csv.
Share the Spreadsheet as follows:
On Google Drive
On the Spreadsheet file
right-click -> Share -> Advanced -> Click "change" at "Private - Only you can access"
Check "On Anyone with the link"
Click "Save"
At "Link to share", copy URL.
Retrieve file ID from https://docs.google.com/spreadsheets/d/### file ID ###/edit?usp=sharing
Modified curl command :
curl 'https://docs.google.com/spreadsheets/d/### file ID ###/gviz/tq?gid=911257845&tq=select%20%2a%20where%20F%20%3e%20300&tqx=out:csv'
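To avoid encoding the query by hand, the URL can also be assembled in code. This is a minimal Python sketch, not part of the original answer: the function name is made up, and "FILE_ID" is a placeholder for the real file ID, which is not known here. Python's quote emits uppercase hex digits (%2A, %3E), which are equivalent to the lowercase ones shown above.

```python
from urllib.parse import quote


def build_gviz_csv_url(file_id: str, gid: str, query: str) -> str:
    """Build a gviz/tq URL that runs `query` on one sheet and asks for CSV output."""
    # safe="" percent-encodes every reserved character, including "*" and ">".
    encoded_query = quote(query, safe="")
    return (
        f"https://docs.google.com/spreadsheets/d/{file_id}/gviz/tq"
        f"?gid={gid}&tqx=out:csv&tq={encoded_query}"
    )


# "FILE_ID" is a placeholder; substitute the ID taken from the sharing URL.
url = build_gviz_csv_url("FILE_ID", "911257845", "select * where F > 300")
print(url)
```

The resulting string can be passed to curl in single quotes, as in the command above.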
Reference :
Query Language Reference
If I misunderstand your question, I'm sorry.
Edit :
The two URLs below compare your URL with the one in my answer, which has been matched to yours part by part.
1. Your URL
https://docs.google.com/spreadsheets/d/e/2PACX-1vS3iBtVf4i_won5zAN9NGPqhcd6CcTb-4QHxpisSjCmlgV95B6mFmZvtMaC9GPvD7m8kD-6XLkVAhfc/gviz/tq?gid=911257845&output=csv&tq=select%20*%20where%20F%3E300
Broken into its parts, the URL is:
https://docs.google.com/spreadsheets/d/e/
The e/ is not required.
2PACX-1vS3iBtVf4i_won5zAN9NGPqhcd6CcTb-4QHxpisSjCmlgV95B6mFmZvtMaC9GPvD7m8kD-6XLkVAhfc
This is not the file ID of the spreadsheet.
/gviz/tq
gid=911257845
output=csv
tq=select%20*%20where%20F%3E300
2. In my answer matched to your URL
https://docs.google.com/spreadsheets/d/### file ID ###/gviz/tq?gid=###&tqx=out:csv&tq=### query ###
Broken into its parts, the URL is:
https://docs.google.com/spreadsheets/d/
### file ID ###
The file ID is the part of the sharing URL between /d/ and /edit.
/gviz/tq
gid=###
You can use gid=911257845.
tqx=out:csv
This has to be used instead of output=csv.
tq=### query ###
You can use tq=select%20*%20where%20F%3E300.
Note :
The parts of the two URLs correspond to each other in order.
Please also share the Spreadsheet as follows. This is different from "Publish to the web" on the Spreadsheet.
On Google Drive
On the Spreadsheet file
right-click -> Share -> Advanced -> Click "change" at "Private - Only you can access"
Check "On Anyone with the link"
Click "Save"
At "Link to share", copy URL.
Retrieve file ID from https://docs.google.com/spreadsheets/d/###

Google Sheets! Issue with Image() Function

Let me try to explain my issue.
I have an image in my Drive with the URL "https://drive.google.com/open?id=0B7Qr0kRr6yOOLXB1OE5lUUxBRjQ".
I converted it to "https://drive.google.com/uc?export=download&id=0B7Qr0kRr6yOOLXB1OE5lUUxBRjQ".
That is, I replaced "open?id" with "uc?export=download&id".
I put the modified URL from step 2 above in Sheet1!A1.
In Sheet1!A5, =Image(A1,1) doesn't work,
but =Image("https://drive.google.com/uc?export=download&id=0B7Qr0kRr6yOOLXB1OE5lUUxBRjQ",2) works and shows the image.
Q: Why does the IMAGE function work when I pass the URL directly, but not when I pass a reference to the cell containing the URL?
Please help me find the issue! Thanks in advance.
Use =Image(&A1,1)
Here the "&" puts the value of A1 into the IMAGE formula, which then displays the image.

Current site URL in Liferay 6.2

I cannot figure out how to get the URL of the current site in Liferay. For example, if I have created four sites (site1, site2, site3, site4), their URLs will be:
http://localhost:8080/web/site1/
http://localhost:8080/web/site2/
http://localhost:8080/web/site3/
http://localhost:8080/web/site4/
But how can I get these URLs from Velocity (in a theme)? I tried a few options:
$themeDisplay.getPathFriendlyURLPublic() - /web
$themeDisplay.getPortalURL() - http://localhost:8080
$themeDisplay.getURLHome() - http://localhost:8080/web/guest
$themeDisplay.getURLCurrent() - /web/site1/home
I need to get just http://localhost:8080/web/actualsite/.
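All right, after a few hours of trying I found a solution.
To get the current site's URL in Velocity, use:
$layout.getGroup().friendlyURL
This expression returns the '/site-name' part. Prefixing it with the values listed in the question, as in ${themeDisplay.getPortalURL()}${themeDisplay.getPathFriendlyURLPublic()}${layout.getGroup().friendlyURL}, should then give http://localhost:8080/web/site-name.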
Try this in your theme vm. This should give you current complete url.
$portalUtil.getCurrentCompleteURL($request)
Output : http://localhost:8080/web/site4/home
