Google Spreadsheet getting text with importxml

I've tried this and other versions to no avail. Can anyone help, please?
=IMPORTXML("http://performance.morningstar.com/fund/ratings-risk.action?t=MWTRX", "//*[@id='div_ratings_risk']/table/tbody/tr[4]/td[3]/text()")

As explained in the comments on your original question, the div element with the id div_ratings_risk is initially empty and does not contain a table.
Google Sheets cannot parse content that is not there yet and still has to be loaded.
The table you are trying to fetch data from is loaded dynamically, using jQuery, from another URL. You can find that URL with, for example, the Chrome developer tools by filtering for XHR requests.
If you parse the content directly from that URL, it works. So you need to point your formula at that URL and adapt your XPath like so:
=IMPORTXML("http://performance.morningstar.com/ratrisk/RatingRisk/fund/rating-risk.action?&t=XNAS:MWTRX&region=usa&culture=en-US&cur=&ops=clear&s=0P00001G5L&ep=true&comparisonRemove=null&benchmarkSecId=&benchmarktype=", "//table/tbody/tr[4]/td[3]/text()")
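To make the difference concrete, here is a minimal Python sketch using only the standard library. The two markup strings are made-up stand-ins (not the real Morningstar responses): one for the initial page with its empty container, one for the fragment that is requested afterwards. The same kind of XPath finds nothing in the first and a cell in the second.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the initial page: the container exists but is empty.
INITIAL_PAGE = (
    "<html><body>"
    "<div id='div_ratings_risk'></div>"
    "</body></html>"
)

# Hypothetical stand-in for the fragment the browser requests afterwards:
# a table with five rows of three cells each.
rows = "".join(
    f"<tr><td>r{r}c1</td><td>r{r}c2</td><td>r{r}c3</td></tr>" for r in range(1, 6)
)
XHR_FRAGMENT = f"<div><table><tbody>{rows}</tbody></table></div>"


def first_text(markup, path):
    """Return the text of the first node matching path, or None if nothing matches."""
    node = ET.fromstring(markup).find(path)
    return None if node is None else node.text


# ElementTree paths are relative, so "//" is written as ".//" here.
before = first_text(INITIAL_PAGE, ".//*[@id='div_ratings_risk']/table/tbody/tr[4]/td[3]")
after = first_text(XHR_FRAGMENT, ".//table/tbody/tr[4]/td[3]")

print(before)  # nothing to select: the div has no table yet
print(after)   # fourth row, third cell of the fragment
```

The point is only that the XPath itself is fine; what matters is which document it runs against.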

Related

Attempting to import from a XPath, seems to always yield blank information

Currently in my Google doc, I'm working on a database of what my cards are worth, and it doesn't seem to want to grab the information no matter which XPath I try.
The website I'm trying to take the information from is available here. *This is the hyperlink I'm feeding in.
In the top right corner I'm attempting to grab the worth box information. Here are the XPaths I've attempted so far:
"//a[@id='worthBox']/h4"
"/html/body/div[4]/div[1]/div[2]/form/div[1]/div[2]/div/a/h4"
"/h4"
"/h4[0-20]"
"//a[@id='worthBox'][1]/h4"
"//div[@id='estimate-box']/a/h4"
"//div[@id='estimate-box']/a[1]/h4"
Can someone explain to me why it doesn't seem to want to fetch anything? Is it even possible?
Thank you so much for your time and help!

On that page, the value is inserted by JavaScript. IMPORTXML retrieves the HTML without running JavaScript, so it cannot see the result of the script. I think your XPaths describe the page as it looks after the JavaScript has run, which is why they cannot be used. The value you expect can, however, be retrieved with a different XPath.
Modified xpath:
//input[@id='medianHiddenField']/@value
Sample formula:
=IMPORTXML(A1,"//input[@id='medianHiddenField']/@value")
In this case, the URL https://mavin.io/search?q=Lugia%20NM%209%2F111%20-PSA&bt=sold# is put in cell "A1".
Reference:
IMPORTXML
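As a rough illustration of why the hidden input works while the worth box does not, here is a small Python sketch. The markup is a simplified, made-up stand-in for the static HTML (the value 12.34 is invented); it assumes the hidden field is present in the served page while the box is only filled in later by a script.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified static HTML: the hidden input is in the served markup,
# while the visible worth box is empty until a script fills it in.
STATIC_HTML = (
    "<html><body><form>"
    "<input type='hidden' id='medianHiddenField' value='12.34'/>"
    "<div id='estimate-box'></div>"
    "</form></body></html>"
)

root = ET.fromstring(STATIC_HTML)

# Counterpart of //input[@id='medianHiddenField']/@value
hidden = root.find(".//input[@id='medianHiddenField']")
median = hidden.get("value") if hidden is not None else None

# Counterpart of //div[@id='estimate-box']/a/h4, which only exists after the script ran
worth_heading = root.find(".//div[@id='estimate-box']/a/h4")

print(median)
print(worth_heading)
```

ElementTree reads the attribute with get() instead of an /@value step, but the idea is the same as in the formula above.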

Google ImportXML from QGIS metadata file

I am trying to capture elements of a qmd file (which is XML markup) using Google Sheets importxml. Based on How to use importXML function with a file from Google Drive? I think I've got the file importing correctly, but I can't seem to capture any of the tags.
Here's what I am trying -
=importXML("https://drive.google.com/uc?id=1AI2C8hQnSOuuoyJXizYBszGmpMXW8xxT&export=download","\\identifier")
Here's what the qmd/xml file looks like
<!DOCTYPE qgis PUBLIC 'http://mrcc.com/qgis.dtd' 'SYSTEM'>
<qgis version="3.9.0-Master">
<identifier>Z:/My Drive/Mangoesmapping/Spatial Projects/2019/DSC/132_Ongoing_Asset_Updates/Working/Sewerage_Updates/Sewerage_Manholes_InspectionShafts.TAB</identifier>
<parentidentifier>Sewerage Manhole Infrastructure</parentidentifier>
<language>AUS</language>
<type>dataset</type>
<title>Sewerage Manholes within Douglas Shire Council</title>
<abstract>Sewerage Manholes within Douglas Shire Council. Most data has been updated based on field work, review of existing AsCon files and discussion with council staff responsible for the assets in 2018/2019. In Port Douglas most of the infrastructure has been surveyed in. </abstract>
<keywords vocabulary="gmd:topicCategory">
<keyword>Infrastructure</keyword>
<keyword>Sewerage</keyword>
If I use
=importXML("https://drive.google.com/uc?id=1AI2C8hQnSOuuoyJXizYBszGmpMXW8xxT&export=download","*")
I get
But I really would like to just get the elements I want by placing the importxml for each tag in the cell I need it in.

You want to retrieve ### from <identifier>###</identifier> at https://drive.google.com/uc?id=1AI2C8hQnSOuuoyJXizYBszGmpMXW8xxT&export=download
That is how I understood your question. If my understanding is correct, how about this answer?
Issue:
Your formula =importXML("https://drive.google.com/uc?id=1AI2C8hQnSOuuoyJXizYBszGmpMXW8xxT&export=download","\\identifier") uses \\identifier as the XPath. Judging from your data, you are trying to retrieve ### from <identifier>###</identifier>.
In this case // must be used instead of \\, because // "selects nodes in the document from the current node that match the selection no matter where they are". This is described in the XPath Tutorial listed under References.
Modified formula:
So =importXML("https://drive.google.com/uc?id=1AI2C8hQnSOuuoyJXizYBszGmpMXW8xxT&export=download","\\identifier") can be modified as follows.
=importXML("https://drive.google.com/uc?id=1AI2C8hQnSOuuoyJXizYBszGmpMXW8xxT&export=download","//identifier")
As an alternative, based on the data in your question, you can also use the XPath /qgis/identifier instead of //identifier, which gives the following formula.
=importXML("https://drive.google.com/uc?id=1AI2C8hQnSOuuoyJXizYBszGmpMXW8xxT&export=download","/qgis/identifier")
References:
IMPORTXML
XPath Tutorial
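If you want to check the two paths outside of Sheets, the following Python sketch runs them against a shortened copy of the file from the question. The identifier text is abbreviated, the DOCTYPE line and the abstract are left out, and the closing tags are assumed, since the question only shows the start of the file.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the qmd file from the question; closing tags are assumed.
QMD = """<qgis version="3.9.0-Master">
  <identifier>Sewerage_Manholes_InspectionShafts.TAB</identifier>
  <parentidentifier>Sewerage Manhole Infrastructure</parentidentifier>
  <language>AUS</language>
  <type>dataset</type>
  <title>Sewerage Manholes within Douglas Shire Council</title>
  <keywords vocabulary="gmd:topicCategory">
    <keyword>Infrastructure</keyword>
    <keyword>Sewerage</keyword>
  </keywords>
</qgis>"""

root = ET.fromstring(QMD)  # root is the <qgis> element

# Counterpart of //identifier: search at any depth (ElementTree writes it as .//)
anywhere = root.find(".//identifier").text

# Counterpart of /qgis/identifier: a direct child of the <qgis> root
direct = root.find("identifier").text

# One tag per cell works the same way, e.g. //title or //keyword
title = root.find(".//title").text
keywords = [k.text for k in root.findall(".//keyword")]

print(anywhere, direct, title, keywords)
```

Both lookups return the same identifier text, which matches the two formulas above giving the same result.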

Google Sheet find cell by URL parameter

I have a database of elements, each element has its own QR Code. After reading the code I would like to be able to open the worksheet on a specific tab and jump to the appropriate cell (according to the element name). Calling a worksheet through a URL with the #gid parameter allows you to open a tab.... the "range" parameter allows you to jump to a specific cell.... and what if I want to search for an item by name? Something like: https://docs.google.com/spreadsheets/d/1fER4x1p.../edit#gid=82420100&search=element_name.... is it possible?

Google has not introduced this yet.
But you can look into Google Apps Script (the macro-like scripting for Google Sheets) to achieve this.
A simpler approach is to just filter the data, although this obviously changes your requirement. For example, you can create a filter with the name you are looking for, and then you will get a URL for it.
This is the URL to a sample of this; it should open the spreadsheet and filter the data when loaded. This is the icon to look for to create the filters.
Here is some documentation to get you started with Google Apps Script, but I don't have a direct link that shows how to catch the URL parameters and process them. What I can tell you is that this is a much more complicated approach than just a URL, because it involves programmatic processing on the spreadsheet side.
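Another possible workaround, since the question says the gid and range parameters already work: do the name lookup when the QR code is generated rather than when it is scanned. The Python sketch below assumes you have the list of element names in row order; the function name, the sample names, and the column letter are made up for illustration, and the link goes stale if rows are later reordered.

```python
def cell_link(edit_url, gid, names, target, column="A", first_row=1):
    """Build a link that opens the tab `gid` and jumps to the row holding `target`.

    `names` is the list of element names in sheet order, starting at `first_row`.
    Raises ValueError if `target` is not in the list.
    """
    offset = names.index(target)
    return f"{edit_url}#gid={gid}&range={column}{first_row + offset}"


# The spreadsheet URL is kept truncated exactly as in the question.
EDIT_URL = "https://docs.google.com/spreadsheets/d/1fER4x1p.../edit"

# Hypothetical element names, in the order they appear in column A.
names = ["pump_01", "valve_07", "element_name"]

link = cell_link(EDIT_URL, 82420100, names, "element_name")
print(link)
```

The resulting URL is what would be encoded in the QR code for that element.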

finding an accurate Xpath for Google Sheets ImportXML [duplicate]

This question already has answers here:
Scraping data to Google Sheets from a website that uses JavaScript
(2 answers)
Closed 4 months ago.
So, I'm trying to use the ImportXML function in Google sheets to scrape some data from a website (https://www.cargurus.com/Cars/m-Bob-Johnson-Certified-Collection-sp402449), and I'm having trouble finding a path that works. This is the section I'm looking to pull.
I've tried using Chrome's Inspect Element and Copy XPath, which gives me
//*[@id="ratingFilter_ContainerId"]/div
and returns #N/A
I've used a Chrome plug-in called Scraper, which gives me //div[13]/div/div[2]/div[2]/div/label and returns #N/A
I've even tried going through the code and making as direct a path as I could from scratch and came up with //body/div[1]/div[1]/main/div[1]/div[1]/div[11]/div[1]/div[1]/div[2]/div[2]/div[1]/div[1]/div[3]/div[1]/div[4]/div[2]/div[13]/div[1]/div[2]/div[2]/div
which also returns #N/A
So any tips for finding an accurate XPath would really be appreciated.

The expression
//*[@id="ratingFilter_ContainerId"]
executed on a fetched document selects a div element two levels above the one you show.
When extended by another step subexpression:
//*[@id="ratingFilter_ContainerId"]/div
it selects the div which contains the 'Deal Ratings' caption with the '(clear)' link at the right side, and the options list you need.
What you are interested in is rather
$fetched-document/descendant::div[@id="ratingFilter_OptionListContainer"]
EDIT
BTW, are you sure you fetch the page properly? When I load it into my browser, the page seems to load some additional data, which is noted with a 'Loading listings...' splash. Maybe you're trying to execute your query on an incomplete page...?

Google XPATH importxml can find "show" but not "showcount" or "count" [duplicate]

This question already has answers here:
Scraping data to Google Sheets from a website that uses JavaScript
(2 answers)
Closed last month.
Using this webpage as an example http://forums.macrumors.com/showthread.php?t=1688317
On a google spreadsheet, the following DO NOT work with importxml():
//a[contains(@href,"showpost")]/@href
//a[contains(@href,"showcount")]/@href
//*[@id="postcount18545482"]
The last one (//*[@id="postcount18545482"]) was copied directly from Chrome's element viewer.
The following DO work but exclude any results with the word "showcount", "postcount", or "showpost":
//div[contains(@id,"post_message")]/@id
//a[contains(@href,"show")]/@href
//a[contains(@href,"post")]/@href
Is there something special about the word "count" when working with importxml() or XPATH? How can I get the missing entries?

ImportXML function in Google Docs spreadsheet can not process data that is created in a two-step process. For example, when an authentication token must be retrieved first before making the url request, or when the URL tells the server to dynamically create an xml output after which the user is redirected to the output, even when the URL stays the same. You might want to look into Google Apps Scripts (http://code.google.com/googleapps/appsscript/index.html) to handle this case.
Taken from here
In your particular case the anchor parameters get set in the vbulletin_post_loader.js script called after the page container is loaded.
...
pc_obj=fetch_object("postcount"+this.postid);
openWindow("showpost.php?"+(SESSIONURL?"s="+SESSIONURL:"")
+(pc_obj!=null?"&postcount="+PHP.urlencode(pc_obj.name):"")+"&p="+A)
...
In other words, when importXML() scans the page, the nodes containing 'showpost' or 'postcount' in href are not yet on the page.
It looks like importXML() works with static pages only and is not able to handle dynamically loaded content.
Try to find another way of obtaining the number of posts in a thread.
