How to retrieve data to Google Sheets - google-sheets

I'm trying to pull the table data from https://www.set.or.th/th/market/index/set/agro/agri
into Google Sheets with
=IMPORTHTML("https://www.set.or.th/th/market/index/set/agro/agri","table",1)
I changed "list" to "table" and I'm still unable to get the data.
My expected output in Sheets is
EE bunch of numbers
GFPT bunch of numbers
LEE bunch of numbers
. bunch of numbers
. bunch of numbers
VPO bunch of numbers

I'm writing this answer as a community wiki, since the issue was resolved from the comments section, in order to provide a proper response to the question.
The content you're trying to extract is loaded with JavaScript, and the IMPORT functions can’t extract content that is loaded with JavaScript. You can check this article.
To confirm this, click the ‘Lock’ icon beside the browser’s address bar, select ‘Site settings’, and set JavaScript to ‘Block’. Reload the page and you will see that the site needs JavaScript enabled to load certain content.
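The same check can be approximated outside the browser by looking at the HTML exactly as the server delivers it, before any script runs. The following is a minimal sketch in Python, not the way the IMPORT functions work internally; the two sample pages and the name count_static_tables are made up for illustration.

```python
from html.parser import HTMLParser


class TableRowCounter(HTMLParser):
    """Counts <table> and <tr> start tags in raw HTML (no JavaScript is executed)."""

    def __init__(self):
        super().__init__()
        self.tables = 0
        self.rows = 0

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables += 1
        elif tag == "tr":
            self.rows += 1


def count_static_tables(html):
    """Return (table_count, row_count) found in the HTML source as delivered."""
    counter = TableRowCounter()
    counter.feed(html)
    return counter.tables, counter.rows


# Hypothetical page whose table is already present in the HTML source.
static_page = "<html><body><table><tr><td>EE</td><td>1.23</td></tr></table></body></html>"
# Hypothetical page that ships an empty container and fills it in with JavaScript.
js_page = '<html><body><div id="app"></div><script src="app.js"></script></body></html>'

print(count_static_tables(static_page))  # (1, 1)
print(count_static_tables(js_page))      # (0, 0)
```

If the raw source of a real page gives a count of zero tables, a table import has nothing to read, which matches what you see after blocking JavaScript in the browser.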

ImportXML / ImportHTML workaround with URL Tabs on Google Sheets [duplicate]

This question already has answers here:
Scraping data to Google Sheets from a website that uses JavaScript
I am trying to build a spreadsheet in Google Sheets using the IMPORTXML/IMPORTHTML functions. However, I can't find a solution for this URL, because the page uses tabs on a persistent URL: https://www.morningstar.co.uk/uk/etf/snapshot/snapshot.aspx?id=0P0001CY2T&tab=3&InvestmentType=FE
My goal is to extract the value & growth tables, but I don't see a way to do that. I can only make it work on the main page of the URL: https://www.morningstar.co.uk/uk/etf/snapshot/snapshot.aspx?id=0P0001CY2T which holds data I don't intend to use.
I tried IMPORTHTML with table selection, but it displays no data when the first URL is used. I also tried IMPORTXML with both the full XPath and the XPath for the items I'm interested in, and neither works.
Options used:
=importhtml("https://www.morningstar.co.uk/uk/etf/snapshot/snapshot.aspx?id=0P0001CY2T&tab=3&InvestmentType=FE";"table";"2")
=importxml("https://www.morningstar.co.uk/uk/etf/snapshot/snapshot.aspx?id=0P0001CY2T&tab=3&InvestmentType=FE";"//#html/body/div/sal-components-pillar-cards-process/div/div[2]/div/div[3]/div[2]/div/sal-components-mip-style-measures/div/div[3]/div/div[1]/sal-components-mip-measures/div/div[2]/div/div[2]/div/div/div/table/tbody/tr[1]/td[2]")
Any ideas?
It seems that the table you are trying to fetch is rendered by JavaScript, which the IMPORT functions in Google Sheets cannot handle. Thus, the table can't be scraped.
You can check whether a website, or a table in a website, is JavaScript-controlled as follows. Click the lock button on the left side of the address bar, click Site settings, look for JavaScript, and block it. When you reload the website, you should notice a difference compared with before blocking JavaScript.
In this case, if you try it on your end, you will notice that after blocking JavaScript on the website, you won't be able to see the tables anymore.
The IMPORT functions of Google Sheets are not able to handle JavaScript elements. If you disable JS, what remains on the page is the only content that can be imported.

importxml function in Googlesheet

First of all, I'm completely incompetent, and my hours-long attempts at making this work have been fruitless. So, please, is there someone who can help me?
I have
table id="..........." tablesorter class="........"
They are on the same line of code, and I'm only able to scrape by the first one. For me it's important to scrape by the second one. I've tried different ways, but nothing works.
investing
In the image, in the part highlighted on the left where the drop-down menu is, it's possible to select the different American markets (Nasdaq, DowJones, S&P500 etc.). When I select a market other than DowJones, the URL of the page always remains the same, while the part that I highlighted on the right changes (tablesorter class = "............").
In my sheet, I've done this, but it doesn't let me scrape a different market (only the default table that you see when you open the webpage).
spreadsheet
Your main problem is that IMPORTXML can only retrieve information from static content in websites. Therefore, any content inserted dynamically can't be retrieved by this function.
In your case, you can check which content is not static by going to the website https://it.investing.com/equities/americas and disabling JavaScript on it. To do so in Chrome, please follow this guide.
Because JavaScript adds the dynamic content to the site, when you disable it you will observe that the information that should change with the dropdown no longer changes. This means that it was inserted dynamically and therefore can't be accessed by IMPORTXML. I have attached an image below showing this.
As a workaround to this you will need to use other web scraping techniques.

How to extract data of previous week based on drop down from the Yahoo Fantasy Football

Hi, I am using Yahoo Fantasy Football and I have designed a Google Sheet to get the score data, which is working fine. The link to the sheet is below.
Google Sheet Link
I have changed the permission to Editor. I have made a drop-down that holds the week numbers. Basically, my idea is that choosing a week number should populate the data for that week from Yahoo Fantasy Football. For importing data, I am using this command.
=importhtml("https://football.fantasysports.yahoo.com/f1/683375","table",1)
and this command is working well.
I tried using the same command, but it does not work for the week numbers. The source of the page is shown below.
So, according to the given picture, here is the week number. I want to implement the same in the Google Sheet by using the drop-down. I have implemented the drop-down, but it does not work. Is there a way to link both using scripts or a command, so that when I choose a week from the drop-down in the Google Sheet, the corresponding data is populated? Please take a look at the Google Sheet given above. I am also getting this error, although it was working fine before. How can that be resolved as well?
Thanks
IMPORTHTML cannot retrieve elements dynamically inserted by a script. In your case the content of the Week matchups is inserted dynamically and therefore will not be retrieved (it will return empty). Moreover, IMPORTHTML only gets data from tables or lists, and if you inspect what seems to be a table in Week Matchup, it is actually just a series of divs. If the content were not inserted dynamically, you would need to use IMPORTXML to get the data from these divs.
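To illustrate the difference between a real table and a series of divs, here is a small Python sketch. The markup and the class names (matchup, team, score) are invented for illustration and are not taken from the Yahoo page. It shows that a table/list lookup finds nothing in div-based markup, while an XPath-style query still reaches the values. In Sheets, the analogous query for static content would be something along the lines of =IMPORTXML(url, "//div[@class='score']"), again assuming such a class existed.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup: it looks like a table on screen but is built from divs.
snippet = """
<section>
  <div class="matchup">
    <div class="team">Team A</div>
    <div class="score">101.5</div>
  </div>
  <div class="matchup">
    <div class="team">Team B</div>
    <div class="score">98.2</div>
  </div>
</section>
"""

root = ET.fromstring(snippet)

# No <table>, <ul>, or <ol> elements exist, so a table/list import has nothing to return.
tables_and_lists = [el for tag in ("table", "ul", "ol") for el in root.iter(tag)]

# An XPath-style query can still select the div contents by attribute.
scores = [el.text for el in root.findall(".//div[@class='score']")]

print(len(tables_and_lists))  # 0
print(scores)                 # ['101.5', '98.2']
```

This only works because the sample markup is present in the source. Content inserted by a script after the page loads would be missing from the source in the first place.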
If you still want to retrieve this information I am afraid that you will need to look for other web scraping techniques.

Not able to fetch website data using Importxml in Google Sheets

I am trying to fetch this website (https://www.covidhotspots.in/?city=Mumbai&source=share) data using Importxml in Google Sheets but it gives me no data.
I am trying to apply the formula below, but it gives me #N/A
=IMPORTXML("https://www.covidhotspots.in/?city=Mumbai&source=share","//li/text()")
I want to fetch the geocodes shown in the images below
Issue
IMPORTXML can only read the HTML source of a website. Therefore, elements and components of a website that are added dynamically cannot be retrieved by IMPORTXML.
If you view the source of the website in your browser, focusing specifically on the parent ul element that contains the li elements you want to retrieve, you will find that the ul element is empty. This is because its children are inserted dynamically by a JavaScript script. That is why nothing is returned when you call IMPORTXML against that ul: it is empty in the source HTML.
Possible workaround
Sometimes you can find the URL of the data source being inserted dynamically in the website's JavaScript files, but that is a tedious task.
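As a rough sketch of that search, the Python snippet below scans JavaScript source text for quoted URLs that look like data endpoints. Everything in it is an assumption for illustration: the function name find_candidate_data_urls, the hint substrings, and the sample script with its endpoints are invented, and a real site may build its URLs in ways a simple pattern will not catch.

```python
import re

# Matches quoted absolute URLs or root-relative paths inside JavaScript source.
URL_PATTERN = re.compile(r"""["'](https?://[^"'\s]+|/[^"'\s]+)["']""")


def find_candidate_data_urls(js_source, hints=("api", ".json")):
    """Return quoted URLs in js_source that contain any of the hint substrings."""
    urls = URL_PATTERN.findall(js_source)
    return [u for u in urls if any(h in u.lower() for h in hints)]


# Hypothetical script content; the endpoint names are invented for illustration.
sample_js = """
fetch("https://example.com/api/hotspots?city=Mumbai").then(render);
var logo = "/static/img/logo.png";
var fallback = '/data/hotspots.json';
"""

print(find_candidate_data_urls(sample_js))
# ['https://example.com/api/hotspots?city=Mumbai', '/data/hotspots.json']
```

Any URL found this way still has to be checked by hand, since it may need headers or parameters, or may not return the data you expect.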
I hope this has helped you. Let me know if you need anything else or if there is something you did not understand. :)

Google Sheets embed into website with formatted table

I usually record students' marks in a Google Sheet. However, Google's embed provides a "mirror" of the sheet and looks exactly like it. This means I have to resize the cells to show the complete names and apply formatting. Is there a better way of displaying this information without resizing the cells? I need a method that automatically displays the data without any configuration, if possible without the use of Google Apps Script. Here is the sample data. The data will be sent to parents and will also be printed.
https://docs.google.com/spreadsheets/d/1nKcShloX5R4OvhuEtRogCHG18V1YTR5v9Hb19Jobm88/pubhtml
You can use the Google Visualization API. This automatically resizes the cells and produces a neat, minimal-looking table.
This is the usage:
https://docs.google.com/spreadsheets/d/<sheet-id>/gviz/tq?tqx=out:html&tq&gid=2
So in your case:
https://docs.google.com/spreadsheets/d/1nKcShloX5R4OvhuEtRogCHG18V1YTR5v9Hb19Jobm88/gviz/tq?tqx=out:html&tq&gid=2
The difference is very clear.
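If you need this link for several sheets or tabs, the URL pattern above can be assembled programmatically. This is a minimal Python sketch; the helper name gviz_html_url is made up, and it only reproduces the pattern shown in this answer.

```python
def gviz_html_url(sheet_id, gid):
    """Build the gviz HTML-table URL pattern shown above for a sheet ID and tab gid."""
    return (
        f"https://docs.google.com/spreadsheets/d/{sheet_id}"
        f"/gviz/tq?tqx=out:html&tq&gid={gid}"
    )


print(gviz_html_url("1nKcShloX5R4OvhuEtRogCHG18V1YTR5v9Hb19Jobm88", 2))
```

The gid value identifies the tab inside the spreadsheet, so changing it selects a different tab of the same file.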
