How to get Website Value into Spreadsheet? - google-sheets

From the website https://goldenmark.com/pl/mysaver/ceny-zlota/ I want to get the bolded value in "Aktualny kurs: 6940,28 PLN/uncję" (the current rate per ounce) into my spreadsheet. How can I do this?
Thank you

You could try xpath and the importxml function:
https://support.google.com/docs/answer/3093342?hl=en

The element you want to import is filled in by JavaScript.
Google Sheets is not able to import JavaScript-generated content.
You can always test this by disabling JavaScript for a given site; whatever is still displayed can be scraped.
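The same effect can be reproduced outside Sheets. Below is a minimal sketch in Python, assuming a made-up page (STATIC_HTML, the id "price", and the helper names are all hypothetical and not taken from the real site): a parser that does not run scripts finds the target element empty, which is roughly the position IMPORTXML is in.

```python
from html.parser import HTMLParser

# Hypothetical page: the <b> element is empty in the served HTML and is
# only filled in by the script once a browser runs it.
STATIC_HTML = """
<html><body>
  <p>Aktualny kurs: <b id="price"></b></p>
  <script>
    document.getElementById("price").textContent = "6940,28 PLN";
  </script>
</body></html>
"""


class TextById(HTMLParser):
    """Collect the text found inside elements that carry a given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0      # greater than 0 while inside the target element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def static_text(html, element_id):
    """Return the text of the element as it appears without running scripts."""
    parser = TextById(element_id)
    parser.feed(html)
    return "".join(parser.chunks).strip()


# The script is never executed, so the element stays empty.
print(repr(static_text(STATIC_HTML, "price")))
```

If the value were present in the served HTML, static_text would return it, and an XPath in IMPORTXML would have something to match.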

Related

Google Spreadsheets ImportXML / XPath - I get "data:image/svg+xml..."

I am trying to get the URLs of the images on a web page.
For this I use the IMPORTXML function with an XPath in Google Sheets.
This is the formula:
=IMPORTXML(B1;"//*[@id='gallery-1']/figure/div/a//img/@src")
So far this works and I get the URL of each image; however, alongside each URL I also get the following text:
data:image/svg+xml,%3Csvg%20xmlns='http://www.w3.org/2000/svg'%20viewBox='0%200%20500%20400'%3E%3C/svg%3E
I don't know how to prevent this text from appearing. I have looked at the source code of the page and this text does not appear anywhere, so I assume it is something related to the spreadsheet.
I have checked the xpath again and again but I also find that there is an error ...
Here is a screenshot of what happened for your reference.
https://drive.google.com/file/d/1Tv_zk0JUI9EjCTU0kKfzRuQr936-88I7/view?usp=sharing
Any help is greatly appreciated. Thanks
Try:
=QUERY(IMPORTXML(B1, "//*[@id='gallery-1']/figure/div/a//img/@src"),
"where not Col1 starts with 'data'")
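For comparison, here is the same filtering idea as a short Python sketch. The list of src values is invented for illustration, and drop_data_uris is a hypothetical helper name; it keeps only entries that do not begin with the "data:" scheme, which is slightly stricter than the 'data' prefix used in the QUERY clause.

```python
def drop_data_uris(srcs):
    """Keep only src values that are not inline data: URIs."""
    return [s for s in srcs if not s.startswith("data:")]


# Hypothetical scrape result: real image URLs mixed with SVG placeholders.
srcs = [
    "https://example.com/img/photo-1.jpg",
    "data:image/svg+xml,%3Csvg%20xmlns='http://www.w3.org/2000/svg'%3E%3C/svg%3E",
    "https://example.com/img/photo-2.jpg",
]

clean = drop_data_uris(srcs)
print(clean)
```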

Attempting to import from a XPath, seems to always yield blank information

In my Google spreadsheet I'm building a database of what my cards are worth, and it doesn't seem to grab the information no matter which XPath I try.
The website I'm trying to take information from is available here. *This is the hyperlink I'm feeding in.
I'm attempting to grab the worth box in the top right corner; here are the XPaths I've tried so far:
"//a[@id='worthBox']/h4"
"/html/body/div[4]/div[1]/div[2]/form/div[1]/div[2]/div/a/h4"
"/h4"
"/h4[0-20]"
"//a[@id='worthBox'][1]/h4"
"//div[@id='estimate-box']/a/h4"
"//div[@id='estimate-box']/a[1]/h4"
Can someone explain why it doesn't fetch anything? Is it even possible?
Thank you so much for your time and help!
At that URL, the value is inserted by JavaScript. IMPORTXML retrieves the HTML without running JavaScript, so it cannot see the result. I think your XPaths describe the page as it looks after JavaScript has run, which is why they cannot be used. However, it seems the value you expect can be retrieved with another XPath.
Modified xpath:
//input[@id='medianHiddenField']/@value
Sample formula:
=IMPORTXML(A1,"//input[@id='medianHiddenField']/@value")
In this case, the URL https://mavin.io/search?q=Lugia%20NM%209%2F111%20-PSA&bt=sold# is placed in cell "A1".
Reference:
IMPORTXML

Google ImportXML from QGIS metadata file

I am trying to capture elements of a qmd file (which is XML markup) using the Google Sheets IMPORTXML function. Based on "How to use importXML function with a file from Google Drive?" I think I've got the file importing correctly, but I can't seem to capture any of the tags.
Here's what I am trying -
=importXML("https://drive.google.com/uc?id=1AI2C8hQnSOuuoyJXizYBszGmpMXW8xxT&export=download","\\identifier")
Here's what the qmd/xml file looks like
<!DOCTYPE qgis PUBLIC 'http://mrcc.com/qgis.dtd' 'SYSTEM'>
<qgis version="3.9.0-Master">
<identifier>Z:/My Drive/Mangoesmapping/Spatial Projects/2019/DSC/132_Ongoing_Asset_Updates/Working/Sewerage_Updates/Sewerage_Manholes_InspectionShafts.TAB</identifier>
<parentidentifier>Sewerage Manhole Infrastructure</parentidentifier>
<language>AUS</language>
<type>dataset</type>
<title>Sewerage Manholes within Douglas Shire Council</title>
<abstract>Sewerage Manholes within Douglas Shire Council. Most data has been updated based on field work, review of existing AsCon files and discussion with council staff responsible for the assets in 2018/2019. In Port Douglas most of the infrastructure has been surveyed in. </abstract>
<keywords vocabulary="gmd:topicCategory">
<keyword>Infrastructure</keyword>
<keyword>Sewerage</keyword>
If I use
=importXML("https://drive.google.com/uc?id=1AI2C8hQnSOuuoyJXizYBszGmpMXW8xxT&export=download","*")
I get
But I really would like to just get the elements I want by placing the importxml for each tag in the cell I need it in.
You want to retrieve ### of <identifier>###</identifier> from https://drive.google.com/uc?id=1AI2C8hQnSOuuoyJXizYBszGmpMXW8xxT&export=download
That is how I understood your question. If my understanding is correct, how about this answer?
Issue:
In your question, the formula =importXML("https://drive.google.com/uc?id=1AI2C8hQnSOuuoyJXizYBszGmpMXW8xxT&export=download","\\identifier") uses \\identifier as the xpath. From your data, it seems that you are trying to retrieve ### of <identifier>###</identifier>.
To select matching nodes no matter where they are in the document, // must be used instead of \\. This is described in the XPath Tutorial listed in the references.
Modified formula:
So the formula with \\identifier can be modified as follows.
=importXML("https://drive.google.com/uc?id=1AI2C8hQnSOuuoyJXizYBszGmpMXW8xxT&export=download","//identifier")
As an alternative, based on the data in your question, you can use the xpath /qgis/identifier instead of //identifier. So the following formula also works.
=importXML("https://drive.google.com/uc?id=1AI2C8hQnSOuuoyJXizYBszGmpMXW8xxT&export=download","/qgis/identifier")
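The difference between the two paths can be checked offline. The sketch below uses Python's xml.etree.ElementTree on an abridged, hand-closed copy of the sample data (the shortened identifier text and the closing tags are assumptions made so that the snippet is well-formed). ElementTree supports only a subset of XPath and evaluates paths relative to the root element, so ".//identifier" stands in for //identifier and "./identifier" stands in for /qgis/identifier.

```python
import xml.etree.ElementTree as ET

# Abridged, well-formed stand-in for the qmd file from the question.
QMD = """<qgis version="3.9.0-Master">
  <identifier>Sewerage_Manholes_InspectionShafts.TAB</identifier>
  <parentidentifier>Sewerage Manhole Infrastructure</parentidentifier>
  <language>AUS</language>
  <type>dataset</type>
  <keywords vocabulary="gmd:topicCategory">
    <keyword>Infrastructure</keyword>
    <keyword>Sewerage</keyword>
  </keywords>
</qgis>"""

root = ET.fromstring(QMD)  # root is the <qgis> element

# Matches <identifier> at any depth below the root (like //identifier).
anywhere = [e.text for e in root.findall(".//identifier")]

# Matches only direct children of <qgis> (like /qgis/identifier).
direct = [e.text for e in root.findall("./identifier")]

# A nested path, one step per level (like /qgis/keywords/keyword).
keywords = [e.text for e in root.findall("./keywords/keyword")]

print(anywhere, direct, keywords)
```

Both identifier queries return the same single element here because <identifier> only occurs directly under the root; they would differ if the tag also appeared deeper in the tree.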
References:
IMPORTXML
XPath Tutorial

Getting “#N/A” error when using importhtml formula

I was trying to import a table from hket website to run some analysis of my own.
When I used =importxml("http://www1.hket.com/finance/chart/industry-index.do","//*[@id='eti-finance-chart-table']"), where the first argument is the link to the site, I got the "#N/A" error.
The importxml works fine with gurufocus site.
Can you help me out? I haven't been able to figure out what the issue could be.
From what I understand, hket doesn't use HTML or XML format for its table. If that is the case, is there a script I can use in Google Sheets that will let me extract data from hket?
You can see the culprit if you run this formula:
=IMPORTXML("http://www1.hket.com/finance/chart/industry-index.do", "//*")

Google Spreadsheet getting text with importxml

I've tried this and other versions to no avail. Can anyone help, please?
=IMPORTXML("http://performance.morningstar.com/fund/ratings-risk.action?t=MWTRX", "//*[@id='div_ratings_risk']/table/tbody/tr[4]/td[3]/text()")
As explained in the comments to your original question, the div element with the id div_ratings_risk is initially empty and does not contain a table.
Google Sheets cannot parse content that is not yet there and still has to be loaded.
The content (the table) you are trying to fetch into your spreadsheet is loaded dynamically with jQuery from another URL. You can find that URL with, for example, the Chrome developer tools by filtering for XHR requests.
If you parse the content directly from that URL, it will work. So you need to change your formula to use that URL and adapt your XPath like so:
=IMPORTXML("http://performance.morningstar.com/ratrisk/RatingRisk/fund/rating-risk.action?&t=XNAS:MWTRX&region=usa&culture=en-US&cur=&ops=clear&s=0P00001G5L&ep=true&comparisonRemove=null&benchmarkSecId=&benchmarktype=", "//table/tbody/tr[4]/td[3]/text()")
