I am trying to fetch data from this website (https://www.covidhotspots.in/?city=Mumbai&source=share) using IMPORTXML in Google Sheets, but it returns no data.
I am applying the formula below, but it gives me #N/A:
=IMPORTXML("https://www.covidhotspots.in/?city=Mumbai&source=share","//li/text()")
I want to fetch the geocodes, as shown in the images below.
Issue
IMPORTXML can only read the static HTML source of a website. Elements and components that are added to the page dynamically therefore cannot be retrieved by IMPORTXML.
If you view the page source in your browser and look at the parent ul element that should contain the li elements you want, you will find that the ul is empty. Its children are inserted dynamically by a JavaScript script, which is why an IMPORTXML call targeting that ul returns nothing: the element is empty in the source HTML.
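The difference between the static source and the rendered page can be sketched outside of Sheets. The following Python snippet is only an illustration: both HTML strings, the "hotspots" id, the script name and the coordinate values are made up for this example and are not taken from the actual site. It counts the li elements inside a ul using the standard library parser, which, like IMPORTXML, does not run any JavaScript.

```python
from html.parser import HTMLParser


class ListItemCounter(HTMLParser):
    """Counts <li> start tags that appear inside any <ul> element."""

    def __init__(self):
        super().__init__()
        self.ul_depth = 0
        self.li_count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "ul":
            self.ul_depth += 1
        elif tag == "li" and self.ul_depth > 0:
            self.li_count += 1

    def handle_endtag(self, tag):
        if tag == "ul" and self.ul_depth > 0:
            self.ul_depth -= 1


def count_list_items(html):
    """Return the number of <li> elements nested in <ul> elements."""
    parser = ListItemCounter()
    parser.feed(html)
    parser.close()
    return parser.li_count


# Hypothetical static source: the <ul> stays empty until a script fills it.
STATIC_SOURCE = """
<html><body>
  <ul id="hotspots"></ul>
  <script src="app.js"></script>
</body></html>
"""

# Hypothetical DOM after the script has run in a browser.
RENDERED_DOM = """
<html><body>
  <ul id="hotspots"><li>19.07, 72.87</li><li>19.12, 72.84</li></ul>
</body></html>
"""

print(count_list_items(STATIC_SOURCE))  # what a static fetch would see
print(count_list_items(RENDERED_DOM))   # what the browser shows
```

A parser that only sees the static source finds no list items, which matches the empty result in the sheet.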
Possible workaround
Sometimes you can find the URL of the data source that is being inserted dynamically by looking through the website's JavaScript files, but this is tedious to do.
I hope this has helped you. Let me know if you need anything else or if you did not understand something. :)
Related
I've tried to get a table's data from https://www.set.or.th/th/market/index/set/agro/agri into Google Sheets:
=IMPORTHTML("https://www.set.or.th/th/market/index/set/agro/agri","table",1)
I changed list to table and am still unable to get the data.
My expected output in Sheets is
EE bunch of numbers
GFPT bunch of numbers
LEE bunch of numbers
. bunch of numbers
. bunch of numbers
VPO bunch of numbers
I'm writing this answer as a community wiki, since the issue was resolved from the comments section, in order to provide a proper response to the question.
The content you're trying to extract is loaded with JavaScript, and the IMPORT functions can't extract content that is loaded with JavaScript. You can check this article.
To verify this, click the 'Lock' icon beside the browser's address bar, select 'Site settings', and set JavaScript to 'Block'. Reload the page and, as you can see in the screenshot below, the site needs JavaScript enabled to load certain content.
I am trying to retrieve the price from a custom search page in Google Sheets using IMPORTXML(), but I get an empty result error.
=IMPORTXML("https://starcitygames.com/search/?hawksearchable=card_nametext%3A+%22Shaman%20of%20the%20Great%20Hunt%22+AND+search_includetext%3A+%22default%22&language=English&filter_set=Fate%20Reforged&finish=Non-foil","/html/body/div/div[1]/main/main/div[7]/div/div[2]/div[2]/div/div[1]/div[1]/div[2]")
Formula
This is an example
It looks like the results in that HTML page are generated in your browser using JavaScript. Google Sheets does not run JavaScript so it never sees the HTML that you see in the browser. See https://www.reddit.com/r/googlesheets/wiki/import-html-xml/ for more information and a possible way forward.
After multiple tests and some research, I have had no success importing the data of this table (div) into a Google slide.
None of the formulas I tested actually work, including this simple test to extract the first column/line "Name":
=importxml("https://ecosystem.lafrenchtech.com/lists/18872/list?showGrid=false", "//span[@class='table-column-text']")
:(
Could anyone help me?
Thanks in advance.
Answer:
I've tested your function on a test sheet and it returns empty content.
According to an answer at Google Sheets importXML Returns Empty Value, IMPORTXML cannot retrieve data that is populated by a script; this is a limitation of the function. I have checked that when JavaScript is disabled for the ecosystem.lafrenchtech.com site in the Chrome browser, the table never loads. This confirms that the table is populated by a script, which is why the function returns empty content.
A possible alternative is to check whether ecosystem.lafrenchtech.com offers an API from which you can directly get the data shown in their table, using an API key (if one is available). However, this would require you to use Apps Script to parse the data from their API and then write it to your spreadsheet, which is quite tedious for what should be a simple process.
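As a rough illustration of the parsing step only, here is a minimal Python sketch that turns a JSON payload into a header row plus data rows, the shape a spreadsheet range expects. Whether the site offers an API at all is unknown; the payload, the "items" key and the "name" and "city" fields below are hypothetical. In practice the same logic would have to be ported to Apps Script and fed with a real response.

```python
import json

# Hypothetical API response; the real site may expose nothing like this.
SAMPLE_RESPONSE = """
{"items": [
  {"name": "Example A", "city": "Paris"},
  {"name": "Example B", "city": "Lyon"}
]}
"""


def json_to_rows(payload, list_key, columns):
    """Convert a JSON string into a header row followed by one row per item.

    Fields missing from an item are filled with an empty string so that
    every row has the same number of columns.
    """
    data = json.loads(payload)
    rows = [list(columns)]
    for item in data.get(list_key, []):
        rows.append([item.get(col, "") for col in columns])
    return rows


rows = json_to_rows(SAMPLE_RESPONSE, "items", ["name", "city"])
for row in rows:
    print(row)
```

Keeping every row the same width matters because a rectangular block of values is what gets written to a sheet range in one call.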
Note:
Your post was tagged google-slides.
I am interested in importing a table from a website. The table does not load all rows at first; it expands as the user scrolls, eventually reaching the end.
I'm using a GoodReads account as an example.
I want to import all rows; however, as the url doesn't change, I believe I will need to use the IMPORTXML function rather than IMPORTHTML. However, I have not been able to identify an XPath that works.
IMPORTHTML displays the rows that initially populate when the page runs (url, "table", 2)
IMPORTXML currently displays text from the rows that initially populate when the page runs, all in a single cell
The following link has both options in individual sheets:
https://docs.google.com/spreadsheets/d/14jHGRyHKf866jrZiIfX2-GX6hZ2SE6vi_DezckpXaRs/edit?usp=sharing
Upon checking the website you are trying to get data from, I found that the automatic expansion of data while scrolling is controlled by JavaScript. It therefore cannot be fetched using the IMPORT functions in Google Sheets, which is why you can only fetch the initially available data.
To check whether a website's content is controlled by JavaScript, click the lock button beside the URL, go to Site Settings, set JavaScript to 'Block', then refresh the website and see the difference.
In your case, after blocking JavaScript, the website turned into this:
I am trying to build a spreadsheet using Google Sheets and the IMPORTXML/IMPORTHTML functions. However, I cannot find a solution for this URL, since the page has tabs on a persistent URL: https://www.morningstar.co.uk/uk/etf/snapshot/snapshot.aspx?id=0P0001CY2T&tab=3&InvestmentType=FE
My goal is to extract the value & growth tables, but I do not see a way to do that. I can only make it work on the main page of the URL, https://www.morningstar.co.uk/uk/etf/snapshot/snapshot.aspx?id=0P0001CY2T, which is data I don't intend to use.
I tried IMPORTHTML with a table selection, but it does not display any data when the first URL is used. I also tried IMPORTXML with both the full XPath and the XPath for the items I'm interested in, and neither works.
Options used:
=importhtml("https://www.morningstar.co.uk/uk/etf/snapshot/snapshot.aspx?id=0P0001CY2T&tab=3&InvestmentType=FE";"table";"2")
=importxml("https://www.morningstar.co.uk/uk/etf/snapshot/snapshot.aspx?id=0P0001CY2T&tab=3&InvestmentType=FE";"//#html/body/div/sal-components-pillar-cards-process/div/div[2]/div/div[3]/div[2]/div/sal-components-mip-style-measures/div/div[3]/div/div[1]/sal-components-mip-measures/div/div[2]/div/div[2]/div/div/div/table/tbody/tr[1]/td[2]")
Any ideas?
It seems that the table you are trying to fetch is controlled by JavaScript, which the IMPORT functions in Google Sheets cannot handle. Thus, the table can't be scraped this way.
You can check whether a website, or a table in a website, is controlled by JavaScript as follows: click the lock button on the left side of the address bar, click Site settings, look for JavaScript, and block it. If you then reload the website, you should notice a difference compared with before JavaScript was blocked.
In this case, if you try it on your end, you will notice that after blocking JavaScript on the website, you can no longer see the tables.
The IMPORT functions of Google Sheets are not able to handle JavaScript elements. If you disable JS, you are left with the following (and only this can be imported):