Getting Product category in Amazon - google-sheets

I've tried using IMPORTXML in Google Sheets to get an Amazon product's category, but it always returns #N/A.
Is there a way for me to get it?
Here's the Amazon link: https://www.amazon.co.uk/dp/B07H8Q5S1T

Try
=importxml(url,"//div[@id='nav-subnav']/@data-category")
or
=index(importxml(url,"//div[@id='nav-subnav']"),1)
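
To illustrate what the XPath above selects, here is a minimal Python sketch. It assumes a tiny, hypothetical page snippet (not Amazon's real markup) and uses the standard library's ElementTree, which supports the `[@id='...']` predicate but not a trailing `/@attribute` step, so the attribute is read with `.get()` instead. The function name `get_category` is made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup. The real Amazon page is much larger
# and may not be well-formed XML.
SAMPLE_PAGE = """
<html>
  <body>
    <div id="nav-subnav" data-category="kitchen">
      <a href="#">Home &amp; Kitchen</a>
    </div>
  </body>
</html>
"""


def get_category(page_source):
    """Return the data-category attribute of the nav-subnav div, or None."""
    root = ET.fromstring(page_source)
    # Same predicate as in the formula: a div whose id attribute is 'nav-subnav'
    node = root.find(".//div[@id='nav-subnav']")
    if node is None:
        return None
    # Equivalent of the trailing /@data-category step in the formula
    return node.get("data-category")


print(get_category(SAMPLE_PAGE))
```

The point is the `@` prefix: in XPath it addresses an attribute, both inside the predicate and in the final step of the first formula.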

Related

importHTML (Google Sheets) formula stopped working

I have used the following formula for almost a year, and it suddenly stopped working and no longer imports the table.
=IMPORTHTML("https://tradingeconomics.com/matrix";"table";1)
It gives me a "Could not fetch url: https://tradingeconomics.com/matrix" error. I also tried the IMPORTXML function, with the same problem.
I tried https://www.octoparse.com just to see if it was able to scrape the data. It can scrape and parse the data and export it to various formats (you need to install a program for it), although it doesn't solve the problem of automatically importing into Sheets via a formula. 😕
Any ideas about what the problem could be and how I need to adapt the formula?
Note: I can't code, unfortunately.
There have been several posts here and in other places about the same error message related to IMPORTHTML. Here are some previous questions about the same error message that were fixed without making any change:
Google spreadsheet importHTML Could not fetch URL
"Could not fetch URL" using IMPORTHTML
Sometimes the problem is caused by something on Google's side, and no change to the formula can fix it. The only thing to do is to report the problem to Google from the Help menu and wait. At this time the option is shown to me as "Help Sheets improve", but this might change without notice, as it has several times before.
You might also report it through the official Google Editors Help forum.
Related
How to know if Google Sheets IMPORTDATA, IMPORTFEED, IMPORTHTML or IMPORTXML functions are able to get data from a resource hosted on a website?
Try the cached version:
https://webcache.googleusercontent.com/search?q=cache:ZNJKOXQm2t4J:https://tradingeconomics.com/matrix+&cd=2&hl=en&ct=clnk
=IMPORTHTML("https://webcache.googleusercontent.com/search?q=cache:https://tradingeconomics.com/matrix", "table", 1)

Can text be scraped from Grammarly to google spreadsheet using IMPORTXML function?

I am trying to get texts from the Grammarly application imported into a Google spreadsheet using the IMPORTXML function. To do so, I follow the required syntax IMPORTXML(URL, xpath_query), but it keeps showing an error that the "imported content is empty".
However, the same steps work fine for importing data from other websites, and I am confused about what the matter with Grammarly might be. Is it because it does not allow data scraping at all?
Thanks for your help.
This is not possible because the content is behind a login gate. Google Sheets can't read such data.

IMPORTHTML on Google Sheets returning a #N/A error, but only in one document

I have a Google Sheets document where I track the prices of several stocks. I made this a couple of months ago, and have been experiencing this issue for the past couple of weeks:
This formula returns "#N/A"; the error description is: "Could not fetch url: https://finviz..."
=substitute(INDEX(IMPORTHTML("https://finviz.com/quote.ashx?t=VOO","table",11),8,2),"*","")
However, if I create a new Google Sheets document and use this exact formula, it works. Does anyone know what could be the problem?
I am having the same issue. Something must have changed at Finviz or Google :(
There are also some discussions in the Google support groups.
One possible solution could be to put all the symbols you're interested in into one URL, e.g. https://finviz.com/screener.ashx?v=161&t=FB,AAPL,GOOG,TSLA&ta=0&p=w
and then parse the resulting table.
Unfortunately, I am not very good at the parsing part and have to do it by trial and error.
But for example
=importxml("https://finviz.com/screener.ashx?v=161&t=FB,AAPL,GOOG,TSLA&ta=0&p=w";"//*[@id='screener-content']/table/tbody/tr[4]/td/table")
is at least showing some results in google docs. So this might be something to work with.
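
The "all symbols in one URL" idea can be sketched in Python as follows. This is only an illustration: the query parameters (`v`, `t`, `ta`, `p`) are copied from the URL above without knowing what each one means, and the function name `screener_url` is made up for this example.

```python
from urllib.parse import urlencode

FINVIZ_SCREENER = "https://finviz.com/screener.ashx"


def screener_url(tickers):
    """Build one screener URL covering every ticker in the list."""
    query = {
        "v": "161",               # copied from the example URL
        "t": ",".join(tickers),   # comma-separated list of symbols
        "ta": "0",                # copied from the example URL
        "p": "w",                 # copied from the example URL
    }
    # safe="," keeps the commas readable instead of encoding them as %2C
    return f"{FINVIZ_SCREENER}?{urlencode(query, safe=',')}"


print(screener_url(["FB", "AAPL", "GOOG", "TSLA"]))
```

The resulting string can then be pasted into a single IMPORTXML or IMPORTHTML call, so the sheet makes one request instead of one per ticker.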
It will work again by removing 'SUBSTITUTE' and switching to table 8.
A2 = stock ticker
=INDEX(IMPORTHTML("https://finviz.com/quote.ashx?t="&A2;"table";8);7;2)

Get Google Reviews with IMPORTXML function in G Sheets

I'm trying to import to a Google Sheet the number of reviews and average rating of a certain venue on Google Maps.
Taking as an example this page:
https://www.google.com/maps?cid=8807257593070771217
From Chrome's inspector, the XPath for the average should be:
//*[@id='pane']/div/div[1]/div/div/div[1]/div[3]/div[2]/div/div[1]/span[1]/span/span
However, it always returns empty.
Any idea why?
PS - This URL redirects to another, but that shouldn't be the problem as the same thing happens with Facebook and it returns the correct values.
Thanks in advance for any help
Per the comments here, you can't. If you want to scrape Google Maps, use Google's officially supported way to do that: their APIs.
You're probably interested in the Place Details, in particular.
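
As a rough sketch of the API route, the Python below builds a Place Details request URL and reads the rating fields out of a response. It assumes the legacy Place Details web service endpoint, a place ID you have already looked up (a different identifier from the `cid` in the URL above), and your own API key. The function names, the placeholder values, and the sample payload are all hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode

# Assumption: the legacy Place Details web service endpoint
PLACE_DETAILS_ENDPOINT = "https://maps.googleapis.com/maps/api/place/details/json"


def place_details_url(place_id, api_key):
    """Build a request URL asking only for the rating-related fields."""
    query = {
        "place_id": place_id,
        "fields": "rating,user_ratings_total",
        "key": api_key,
    }
    return f"{PLACE_DETAILS_ENDPOINT}?{urlencode(query, safe=',')}"


def extract_rating(payload):
    """Return (average rating, number of ratings) from a decoded JSON response."""
    result = payload.get("result", {})
    return result.get("rating"), result.get("user_ratings_total")


# Hypothetical response body, shaped like a Place Details reply
sample_payload = {"status": "OK", "result": {"rating": 4.5, "user_ratings_total": 120}}

print(place_details_url("YOUR_PLACE_ID", "YOUR_API_KEY"))
print(extract_rating(sample_payload))
```

Check the current Places API documentation before relying on this, since endpoint versions and field names may differ from what is assumed here.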
If you have access to a business's Google My Business page, you can use the API to pull in reviews that way: https://developers.google.com/my-business/content/review-data
Otherwise, https://serpapi.com can scrape Google reviews for you.

Has the Google Sheets Published URL Suddenly Changed to a Different Format?

Normally, when I use the Google Sheets API, I get a very predictable URL structure from the "Publish Sheet" menu option, that I use to extract the Spreadsheet ID with a regular expression and use it for other tasks on the Google Sheets API.
This has worked for years and is the way that Google's documentation recommends getting the Spreadsheet ID - from the URL.
e.g.
https://docs.google.com/spreadsheets/d/{MYSPREADSHEETID}/pubhtml
However, as of today, when publishing a spreadsheet, I now get a URL like this:
https://docs.google.com/spreadsheets/d/e/2PACX{BUNCH OF RANDOM CHARACTERS}/pubhtml
This breaks my code, as the bunch of random characters that appears after 2PACX is not the spreadsheet ID and does not work with the API.
Does anyone know if this is an unannounced change to Google's URL structure or some kind of bug?
I have no idea when or why Google decided to change their URL structure. The Google Sheets API documentation states that the spreadsheet ID should be pulled from the editing URL. It seems unlikely to me that this is a bug, since it has been going on for a while and seems permanent.
The solution is to pull the spreadsheet ID from the editing (or sharing) URL itself instead of from the URL of the published sheet.
I hope Google fixes this issue as this affects consistency across their URLs but for now, the only way to retrieve the spreadsheet ID is to get it from the editing or sharing URLs.
Hope this helps! :)
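
For reference, extracting the ID from an editing URL could look like the Python sketch below. The regular expression is an assumption based on the URL shapes shown in the question, the function name `extract_spreadsheet_id` is made up, and the IDs in the example URLs are placeholders.

```python
import re

# Captures the path segment that follows /spreadsheets/d/
SPREADSHEET_ID_PATTERN = re.compile(r"/spreadsheets/d/([a-zA-Z0-9_-]+)")


def extract_spreadsheet_id(url):
    """Return the spreadsheet ID from an editing or sharing URL, else None."""
    match = SPREADSHEET_ID_PATTERN.search(url)
    if match is None:
        return None
    candidate = match.group(1)
    # Published links look like /spreadsheets/d/e/2PACX.../pubhtml;
    # the "e" segment is not a spreadsheet ID, so reject it.
    if candidate == "e":
        return None
    return candidate


editing_url = "https://docs.google.com/spreadsheets/d/abc123_DEF-456/edit#gid=0"
published_url = "https://docs.google.com/spreadsheets/d/e/2PACX-placeholder/pubhtml"

print(extract_spreadsheet_id(editing_url))
print(extract_spreadsheet_id(published_url))
```

Returning None for the published form makes the failure explicit, rather than passing a value to the API that is not a spreadsheet ID.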
