Error trying to get page title in google sheets - google-sheets

Can you help me with the correct XPath to select the text of the title "Quotes to Scrape" contained in the <a> tag inside an <h1> tag on the following webpage?
I need to use this XPath on the IMPORTXML function in Google Sheets, but I'm not sure if the XPath is correct.
=IMPORTXML("https://quotes.toscrape.com/","//div[@class='col-md-8']/h1/a")
I have an error in Google Sheets.
I am expecting to get the text inside /h1/a.

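The answer was to use a semicolon instead of a comma as the argument separator (Sheets expects a semicolon in locales that use the comma as the decimal separator):
=IMPORTXML("http://quotes.toscrape.com/";"//div[@class='col-md-8']/h1/a")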

I'm writing this answer as a community wiki, for visibility to the community, since the solution was provided by @margusl in the comments section.
The issue was a typo in the formula. The formula with the issue was
=IMPORTHTML("https://quotes.toscrape.com/","//div[@class='col-md-8']/h1/a")
However, IMPORTHTML doesn't take an XPath; its second argument is a query that is either "list" or "table".
So to fix the issue, you need to fix the typo and use:
=IMPORTXML("http://quotes.toscrape.com/","//div[@class='col-md-8']/h1/a")
Or you can also use:
=IMPORTXML("http://quotes.toscrape.com/","/html/body/div/div[1]/div[1]/h1/a")
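To sanity-check an XPath outside Sheets, a minimal Python sketch with the standard library can help. The markup below is a hand-written, simplified stand-in for the page header (an assumption, not the real page source), and `select_text` is a hypothetical helper. ElementTree supports only a subset of XPath, so `.//` replaces the leading `//`, and the absolute path starts below the `html` root.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the page header (not the real source).
SAMPLE = """
<html><body>
  <div class="container">
    <div class="row header-box">
      <div class="col-md-8">
        <h1><a href="/">Quotes to Scrape</a></h1>
      </div>
    </div>
  </div>
</body></html>
"""


def select_text(markup, path):
    """Return the stripped text of every element matching an ElementTree path."""
    root = ET.fromstring(markup)
    return [(el.text or "").strip() for el in root.findall(path)]


# Attribute-based path, mirroring //div[@class='col-md-8']/h1/a
titles = select_text(SAMPLE, ".//div[@class='col-md-8']/h1/a")

# Positional path, mirroring /html/body/div/div[1]/div[1]/h1/a
# (the root element here is already <html>, so the path starts at ./body)
same = select_text(SAMPLE, "./body/div/div[1]/div[1]/h1/a")

print(titles, same)
```

Both paths select the same `<a>` element in this sample. The attribute-based one is usually less fragile, because it does not depend on how many wrapper `div`s precede the header.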

Related

Google Spreadsheets ImportXML / XPath - I get "data:image/svg+xml..."

I am trying to get the URLs of the images from a web link.
For this I use the IMPORTXML function with an XPath in Google Sheets.
This is the code:
=IMPORTXML(B1;"//*[@id='gallery-1']/figure/div/a//img/@src")
So far everything works and I get the URL of each image. However, for each URL obtained I also get the following text:
data:image/svg+xml,%3Csvg%20xmlns='http://www.w3.org/2000/svg'%20viewBox='0%200%20500%20400'%3E%3C/svg%3E
I don't know how to prevent this text from appearing. I have been looking at the source code of the page link and this text does not appear anywhere. Therefore I consider that it is something related to spreadsheet.
I have checked the xpath again and again but I also find that there is an error ...
Here is a screenshot of what happened for your reference.
https://drive.google.com/file/d/1Tv_zk0JUI9EjCTU0kKfzRuQr936-88I7/view?usp=sharing
Any help is greatly appreciated. Thanks
Try:
=QUERY(IMPORTXML(B1, "//*[@id='gallery-1']/figure/div/a//img/@src"),
"where not Col1 starts with 'data'")

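The same filtering idea as the QUERY clause above, sketched in Python for readers who post-process the column elsewhere. The sample URLs are made up for illustration, and `drop_data_uris` is a hypothetical helper name.

```python
def drop_data_uris(values):
    """Keep only entries that do not start with the 'data:' URI scheme."""
    return [v for v in values if not v.startswith("data:")]


# Hypothetical sample of what the import might return (example.com URLs are made up).
raw = [
    "https://example.com/img/photo-1.jpg",
    "data:image/svg+xml,%3Csvg%20xmlns='http://www.w3.org/2000/svg'"
    "%20viewBox='0%200%20500%20400'%3E%3C/svg%3E",
    "https://example.com/img/photo-2.jpg",
]

clean = drop_data_uris(raw)
print(clean)
```

Filtering on the `data:` prefix, rather than on the full placeholder string, keeps the rule working if the placeholder's dimensions differ between images.
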
Attempting to import from a XPath, seems to always yield blank information

In my Google Sheet I'm working on a database of what my cards are worth, and it doesn't seem to grab the information no matter which XPath I try.
The website I'm trying to take information from is available here. *This is the hyperlink I'm feeding to the formula.
I'm attempting to grab the worth box information in the top right corner. Here are the XPaths I've attempted so far:
"//a[@id='worthBox']/h4"
"/html/body/div[4]/div[1]/div[2]/form/div[1]/div[2]/div/a/h4"
"/h4"
"/h4[0-20]"
"//a[@id='worthBox'][1]/h4"
"//div[@id='estimate-box']/a/h4"
"//div[@id='estimate-box']/a[1]/h4"
Can someone explain to me why it doesn't seem to wanna fetch, is it even possible?
Thank you so much for your time and help!
On that page, the value is inserted by JavaScript, but IMPORTXML retrieves the HTML without running JavaScript, so it cannot see the result. I think your XPaths describe the page as it looks after JavaScript has run, which is why they return nothing. However, it seems that the value you expect can be retrieved with a different XPath.
Modified xpath:
//input[@id='medianHiddenField']/@value
Sample formula:
=IMPORTXML(A1,"//input[@id='medianHiddenField']/@value")
In this case, the URL https://mavin.io/search?q=Lugia%20NM%209%2F111%20-PSA&bt=sold# is in cell "A1".
Reference:
IMPORTXML
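The difference between the served HTML and the rendered page can be illustrated with a small Python sketch. The markup is a hypothetical reconstruction (the real page structure is an assumption, and `12.34` is a made-up value): the visible `<h4>` is empty until JavaScript fills it, while the hidden input already carries the number.

```python
import xml.etree.ElementTree as ET

# Hypothetical static markup, as a fetcher that does not run JavaScript might see it.
STATIC_HTML = """
<div id="estimate-box">
  <a id="worthBox"><h4></h4></a>
  <input type="hidden" id="medianHiddenField" value="12.34" />
</div>
"""

root = ET.fromstring(STATIC_HTML)

# The element the original XPaths target exists, but has no text yet.
visible = root.find(".//a[@id='worthBox']/h4")
visible_text = visible.text or ""

# ElementTree cannot select attributes with /@value, so read the attribute
# from the matched element instead.
hidden = root.find(".//input[@id='medianHiddenField']")
hidden_value = hidden.get("value")

print(repr(visible_text), hidden_value)
```

Under these assumptions, an XPath aimed at the `<h4>` matches an empty element, which would explain the blank cells, whereas the attribute on the hidden input is available without any script execution.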

Getting “#N/A” error when using importhtml formula

I was trying to import a table from hket website to run some analysis of my own.
When I used: =importxml("http://www1.hket.com/finance/chart/industry-index.do","//*[@id='eti-finance-chart-table']") which represents the link to the site, I am getting the "N/A" error.
The importxml works fine with gurufocus site.
Can you help me out? I haven't been able to figure out what the issue could be.
From what I understand, hket doesn't use HTML or XML format for their table. If that is the case, is there a script I can use in Google Sheets that will let me extract data from hket?
You can see the culprit if you run this formula:
=IMPORTXML("http://www1.hket.com/finance/chart/industry-index.do", "//*")

Google Spreadsheet getting text with importxml

I've tried this and other versions to no avail? Can anyone help please?
=IMPORTXML("http://performance.morningstar.com/fund/ratings-risk.action?t=MWTRX", "//*[@id='div_ratings_risk']/table/tbody/tr[4]/td[3]/text()")
As explained in the comments to your original question, the div element with the id #div_ratings_risk is initially empty and does not contain a table.
So Google Sheets cannot parse content that is not there yet and still has to be loaded.
The content (table) you are trying to fetch into your spreadsheet is loaded dynamically using jQuery from another URL. You can find that URL with, e.g., the Chrome developer tools by filtering for XHR requests.
If you parse the content directly from that URL, it will work. So you would need to change your formula to that URL and adapt your XPath like so:
=IMPORTXML("http://performance.morningstar.com/ratrisk/RatingRisk/fund/rating-risk.action?&t=XNAS:MWTRX&region=usa&culture=en-US&cur=&ops=clear&s=0P00001G5L&ep=true&comparisonRemove=null&benchmarkSecId=&benchmarktype=", "//table/tbody/tr[4]/td[3]/text()")
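One way to check whether a container is still empty in the served HTML is to parse the raw markup without a browser. The sketch below uses Python's `html.parser`; the two sample strings are hypothetical stand-ins for the initial page and for the XHR response (not real Morningstar output), and `TableInsideId` and `has_table` are illustrative names.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class TableInsideId(HTMLParser):
    """Detect whether a <table> appears inside the element with a given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0            # nesting depth inside the target element
        self.found_target = False
        self.found_table = False

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        if self.depth:
            self.depth += 1
            if tag == "table":
                self.found_table = True
        elif dict(attrs).get("id") == self.target_id:
            self.found_target = True
            self.depth = 1

    def handle_endtag(self, tag):
        if tag not in VOID and self.depth:
            self.depth -= 1


def has_table(markup, target_id):
    """Return True if the element with target_id exists and contains a table."""
    parser = TableInsideId(target_id)
    parser.feed(markup)
    parser.close()
    return parser.found_target and parser.found_table


# Hypothetical initial page: the container exists but is empty.
INITIAL = '<html><body><div id="div_ratings_risk"></div></body></html>'
# Hypothetical state after the dynamic content has been inserted.
LOADED = ('<div id="div_ratings_risk"><table><tbody>'
          '<tr><td>3</td></tr></tbody></table></div>')

print(has_table(INITIAL, "div_ratings_risk"), has_table(LOADED, "div_ratings_risk"))
```

If the check fails on the page URL, that suggests looking for the data request in the browser's network tab, as described above, rather than adjusting the XPath further.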

How to formulate a hyperlink in OpenDocument/ OpenOffice Calc

I am using OpenOffice Calc and trying to create a link to another worksheet by using the HYPERLINK formula function.
For some reason the following won't work and I cannot find the solution anywhere on the web.
=HYPERLINK(#New.A1,'new') and I've also tried =HYPERLINK(#New.A1;'new')
What is the correct way to do this?
Use double quotes.
=HYPERLINK("#New.A1"; "new")
There is an example at https://help.libreoffice.org/Calc/Spreadsheet_Functions#HYPERLINK.
