How to extract website data into Google Sheets - google-sheets

I want to extract the rating of a hotel room from Expedia into a Google Sheet. Unfortunately, my code doesn't work. Can you take a look?
=IMPORTXML("https://www.expedia.co.in/Manila-Hotels-ZEN-Rooms-Pioneer-Street.h17368867.Hotel-Information","//span[@class='rating-number']")

importXml() can import simple HTML, but it does not work with most modern sites. In this case, too, importXml() was not able to retrieve div, span, and similar elements. Nevertheless, you can fetch the whole HTML and then parse it.
Just change your XPath query to //html. From the page source you can see where and how the rating appears, so with the imported HTML in A2 you can get it with =mid(A2, find("out of 5.0", A2)-4, 3).
Sample sheet is here. I tested it with a few other hotels and it worked fine.
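To make the MID/FIND arithmetic easier to follow, here is a minimal Python sketch of the same logic. The helper names and the sample markup are hypothetical (the real Expedia source may look different); the only thing taken from the answer is the idea of locating "out of 5.0" and reading the 3 characters that start 4 positions before it.

```python
def sheets_find(needle: str, text: str) -> int:
    """Mimic Sheets' FIND: 1-based position, error if the text is absent."""
    return text.index(needle) + 1


def sheets_mid(text: str, start: int, length: int) -> str:
    """Mimic Sheets' MID: `start` is 1-based."""
    return text[start - 1:start - 1 + length]


def extract_rating(html: str) -> str:
    # Same shape as =mid(A2, find("out of 5.0", A2)-4, 3)
    return sheets_mid(html, sheets_find("out of 5.0", html) - 4, 3)


# Hypothetical snippet of page source, for illustration only.
sample = '<span class="rating">4.2 out of 5.0</span>'
rating = extract_rating(sample)
print(rating)
```

This only works while the rating is exactly three characters long (such as 4.2) and is followed by a single space, which is the same assumption the formula makes.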

Related

Assistance with ImportXML in Google Sheets

I'm putting together a spreadsheet so I can keep track of items my store has for preorder and making it so it updates when stock levels change. I am trying to get it so my sheet tells me whether the webpage button says pre-order or add to basket. Sometimes this works and then other times it doesn't.
Here is the formula I have:
=IMPORTXML("https://www.smythstoys.com/uk/en-gb/video-games-and-tablets/gaming-merchandise/harry-potter-lumos-logo-light/p/209816","//html/body/div[7]/section/div/div/div[2]/div[1]/div[5]/div/div/div/div[2]/form/button")
This particular one seems to return #N/A. I'm teaching myself, so I don't know IMPORTXML or HTML well enough to tell why this might or might not work.
I was also wondering if it were possible to only retrieve certain bits of text but not all of it. So, I have this:
=IMPORTXML("https://www.smythstoys.com/uk/en-gb/video-games-and-tablets/gaming-merchandise/harry-potter-lumos-logo-light/p/209816","//html/body/div[7]/section/div/div/div[2]/div[1]/div[5]/div/div/div/div[2]/form/div[1]/span[14]/table/tbody/tr/td[2]")
which returns "Out of Stock. Expected Stock August 2022". Is it possible to only retrieve the "August 2022" part of that text?
Thank you so much to anyone who can help.
To return the button text
=index(importxml(url,"//button[@id='addToCartButton']"),1,1)
To get only the availability date
=regexextract(IMPORTXML(_______________),".*: (.*)")
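As a rough check of what that pattern captures, here is a small Python sketch. It assumes the imported text contains a colon followed by a space, which the pattern requires; the question quotes the text without a colon, so the second helper (a hypothetical alternative, not part of the answer) matches a trailing month and year instead.

```python
import re


def after_colon(text: str):
    """Same pattern as the REGEXEXTRACT above: capture what follows the last ': '."""
    m = re.search(r".*: (.*)", text)
    return m.group(1) if m else None


def month_year(text: str):
    """Hypothetical alternative: capture a trailing 'Month YYYY'."""
    m = re.search(r"([A-Z][a-z]+ \d{4})\s*$", text)
    return m.group(1) if m else None


with_colon = "Out of Stock. Expected Stock: August 2022"
without_colon = "Out of Stock. Expected Stock August 2022"

print(after_colon(with_colon))
print(after_colon(without_colon))
print(month_year(without_colon))
```

If the cell text really has no colon, the first pattern finds nothing, so a pattern along the lines of the second one may be the safer choice.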

ImportXML Function on Google Sheets

I'm having a tough time pulling info in on Google Sheets using the ImportXML function. I want to pull in the price of a crypto coin so that I have a real-time feed. The link that I'm hoping to pull from is:
https://www.dextools.io/app/uniswap/pair-explorer/0x40f0e70a7d565985b967bcdb0ba5801994fc2e80
I've tried out a lot of different formulas and keep getting an #N/A or an error. Some of the ones I've tried:
Copy XPATH fully:
=IMPORTXML("https://www.dextools.io/app/uniswap/pair-explorer/0x40f0e70a7d565985b967bcdb0ba5801994fc2e80","/html/body/app-root/div[3]/div/main/app-uniswap/div/app-pairexplorer/app-layout/div/div/div[2]/div[2]/ul/li[2]/span")
Shortened XPATH (also tried deleting the second slash before 'li' but that didn't work):
=IMPORTXML("https://www.dextools.io/app/uniswap/pair-explorer/0x40f0e70a7d565985b967bcdb0ba5801994fc2e80","//li[2]/span")
Include class:
=IMPORTXML("https://www.dextools.io/app/uniswap/pair-explorer/0x40f0e70a7d565985b967bcdb0ba5801994fc2e80","//li[2]/span[@class='ng-tns-c93-2 ng-star-inserted']")
Does anyone have thoughts? Thanks!
With JavaScript disabled, the page is empty, which means it can't be scraped by any Google Sheets import formula.
To avoid this problem, consider using a proper API service that gives you easy access to the data.
For instance, you could get the ZERO price in USD using
=IMPORTDATA("https://cryptoprices.cc/ZERO/")
If you need it relative to ETH, you could divide one price by the other
=IMPORTDATA("https://cryptoprices.cc/ZERO/")/IMPORTDATA("https://cryptoprices.cc/ETH/")
Or use a more advanced API such as CoinGecko's
https://www.coingecko.com/en/api

Google Sheets ImportXML - Extract Class Information

First post on Stack Overflow! I have minimal IT/Dev background, and I was trying to learn how to scrape data using the ImportXML function in Google Sheets to get a little experience with it. I've run into a speed bump and am hoping you can help!
I've been successful in my attempts to pull the data I would like so far, but there is a tiny amount of information I would also like to extract, but can't really figure it out thus far. I can see the information in Google DevTools. (Screenshot attached)
The data is stored in the class definition line and gives the "Last seen" time, which is accessible in one of two ways.
URL : https://us.tamrieltradecentre.com/pc/Trade/SearchResult?ItemID=11807&SortBy=Price&Order=asc
The desired ImportXML formula would pull either the text or the URL extension, since either of those two pieces of information would do.
Thanks for your help!
*EDIT Added Google Sheets Screenshot
Devtool Screenshot
Google Sheets Screenshot
Red Circles for Values I Would Like To Import
You want to retrieve the "Last seen" values, such as "1 Hour ago", from the URL using IMPORTXML.
When I checked the page at that URL, I found that values like "1 Hour ago" are inserted by JavaScript. Unfortunately, they cannot be retrieved using IMPORTXML in that case, because IMPORTXML cannot evaluate JavaScript.
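The following Python sketch illustrates the point with made-up markup (the class name and script are hypothetical, not taken from the actual site). A parser that only reads the delivered HTML, as IMPORTXML does, sees an empty cell, because the text appears only after a browser runs the script.

```python
from html.parser import HTMLParser


class TextByClass(HTMLParser):
    """Collect text found inside elements that carry a given class."""

    def __init__(self, wanted_class: str):
        super().__init__()
        self.wanted_class = wanted_class
        self.depth = 0      # > 0 while inside a matching element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self.depth or self.wanted_class in classes:
            self.depth += 1

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def static_text(html: str, wanted_class: str) -> str:
    parser = TextByClass(wanted_class)
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks).strip()


# Hypothetical page: the cell is empty until the script runs in a browser.
js_filled = (
    '<td class="last-seen"></td>'
    '<script>document.querySelector(".last-seen").textContent = "1 Hour ago";</script>'
)
# Hypothetical page where the server sends the text directly.
server_filled = '<td class="last-seen">1 Hour ago</td>'

print(repr(static_text(js_filled, "last-seen")))
print(repr(static_text(server_filled, "last-seen")))
```

A quick way to check a real page is to view its source (not the DevTools element tree): if the value is missing there, no import formula will find it.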

Cell/column/row divider between formula/import result

This is a Google Sheets case, but if you are good enough with MS Excel and know the solution, don't be shy and share it with the community. Your experience could be relevant to all of us.
Hello over there! I still haven't found any relevant solution on Google, so I'm posting my problem here.
I have a sample sheet and I import data via formula to cell via:
=UNIQUE(INDIRECT(C$2&"!"&$O3))
which means: import all unique values from Sheet!Range:range
It works fine, and I receive necessary data like:
But I want to see it like this:
with X (1, 2, 3, etc.) empty rows/cells/columns between them. I guess that the =SPLIT formula should help me with that, but when I use , as a separator, I receive only the first value, not the whole array of results that I need.
So is there any way to achieve the result shown in the picture above via a formula or Google Apps Script?
SAMPLE TEST SHEET with EDIT permission here:
https://docs.google.com/spreadsheets/d/1eYeFI8nL39kLNDkcyyjeqV8tc9LwG7P7cn2NzUIB7Wo
=ARRAYFORMULA(SUBSTITUTE(TRANSPOSE(SPLIT(QUERY(
"♂"&UNIQUE(INDIRECT(F$2&"!"&$G2))&"♂♀",,999^99), "♂")), "♀", ""))
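As I read it, the formula wraps every unique value in marker characters, joins everything into one string, splits on the first marker, and blanks out the second, which leaves one empty cell after each value. A minimal Python sketch of the resulting layout, with hypothetical helper names and a configurable gap, could look like this:

```python
def unique_in_order(values):
    """Like UNIQUE: drop duplicates, keep first-occurrence order."""
    return list(dict.fromkeys(values))


def spaced(values, gap=1):
    """Return each value followed by `gap` empty cells."""
    out = []
    for value in values:
        out.append(value)
        out.extend([""] * gap)
    return out


column = spaced(unique_in_order(["a", "b", "a", "c"]), gap=1)
print(column)
```

In the formula, the number of blank cells corresponds to how many extra marker pairs are appended after each value; the sketch exposes that as the `gap` parameter.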

Problems with Google Spreadsheets ImportXML function

I'm having some problems with ImportXML in my Google Spreadsheet. I currently have two sheets, each with its own ImportXML, retrieving (basically) the same data. The server providing the data has updated its feed service to require a user-specific "key" in the URL to track who is retrieving what. Prior to this change, my ImportXML worked just fine. They are about to turn off the non-key feeds, and my spreadsheets are about to break.
In the first (working) sheet, this is the feed.
I can import the data successfully by using the following syntax in cell A1:
=importXML("http://atilla.hinttech.nl/fseconomy/xml?id=18649&key=M3LRG43T&query=GroupLogByMonth&month=10","//GroupLogByMonth")
In the new (non-working) sheet, the URL to the feed (including my user-specific "keys") is here.
I am unable to create a working importXML on this sheet. None of my attempted XPath queries worked except "*", but that resulted in all elements being lumped into a single cell.
I have shared my spreadsheet file (link is in the comments below, as I am unable to post more than 2 links) with each of these sheets so that the above examples can be seen and played with. Any advice on the non-working sheet would be wonderful.
In the new XML feed there is no "GroupLogByMonth" tag. This might explain why your XPath query doesn't return anything when you look for it.
Did the format of the XML change too, along with the new URL?
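One way to find a usable XPath is to list the tag names the feed actually contains. The Python sketch below does this on a made-up feed; the tag names `Feed`, `Entry`, and `Id` are placeholders, since the structure of the real new feed is not shown here.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the new feed; the real tag names are unknown.
sample_feed = "<Feed><Entry><Id>1</Id></Entry><Entry><Id>2</Id></Entry></Feed>"
root = ET.fromstring(sample_feed)

# The old query finds nothing because the tag does not exist in this feed.
missing = root.findall(".//GroupLogByMonth")

# Listing the tags that do exist shows what an XPath could target instead.
tags = sorted({element.tag for element in root.iter()})

# Querying one of the existing tags returns one result per element.
ids = [element.text for element in root.findall(".//Id")]

print(missing, tags, ids)
```

In the sheet, the equivalent step is to open the feed URL in a browser, note the repeating element name, and use that name after `//` in the importXML query.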
