Google Sheets importXML Returns Empty Value - google-sheets

I'm trying to scrape this website (https://kamadan.gwtoolbox.com/) with Google Sheets to get material costs for a game that I play. There are two tables, "Common Materials" and "Rare Materials", in a drop-down in the top right corner. I am trying to pull the values for both as the prices update. I copied the full XPath and used the function below in an empty cell on a sheet.
=importxml("https://kamadan.gwtoolbox.com/","/html/body/div[2]/div[1]/div/div[2]/table/tbody")
This returns a #N/A error saying it is returning an empty value.
I also tried it with the regular XPath:
=importxml("https://kamadan.gwtoolbox.com/","//*[@id='trader-overlay-items']")
This just returns a blank cell. I have also tried both methods on the ancestor and child elements, using Chrome's Inspect tool to copy their XPaths; they return one of the two results above.
Sorry if this is a really easy one. I am not familiar at all with XPath or HTML. I mostly dabble in VBA in Excel.

Answer:
IMPORTXML cannot retrieve data that is populated by a script, so it is not possible to retrieve the data in this table with this formula.
More Information:
As you've already mentioned, you can attempt to get the data directly from the table using:
=IMPORTXML("https://kamadan.gwtoolbox.com/","//table[@id='trader-overlay-items']")
This just returns a blank cell.
I went a step further and tried to reverse-engineer this by calling IMPORTXML on the HTML elements on the page in steps:
=IMPORTXML("https://kamadan.gwtoolbox.com/","html")
=IMPORTXML("https://kamadan.gwtoolbox.com/","html/body")
=IMPORTXML("https://kamadan.gwtoolbox.com/","html/body/div[1]")
=IMPORTXML("https://kamadan.gwtoolbox.com/","html/body/div[1]/div[0]")
...
html/body/div[1]/div[0] is the first path that returns no imported content, and importing html/body shows that the body contains only a template of the table rather than the information itself. In cell B1 there are references to 'Common materials' and 'Rare materials'.
In D1 we start to see JavaScript and JSON objects. IMPORTXML does not execute these, so their results cannot be retrieved.
If you disable JavaScript on the site, almost nothing is actually rendered, which is why the data can't be obtained using IMPORTXML.
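To illustrate the point outside of Sheets, here is a minimal Python sketch. The markup in SERVED_HTML is a hypothetical, heavily simplified stand-in for what a server might send before any script runs; it is not the real page source. It shows that an XPath query can find the table element and still come back with nothing to import, because the rows only exist after JavaScript has run in a browser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the HTML sent by the server before
# any script runs. The real page's markup is more complex.
SERVED_HTML = """
<html>
  <body>
    <div>
      <table id="trader-overlay-items"></table>
      <script>/* rows are added here by JavaScript in the browser */</script>
    </div>
  </body>
</html>
"""

root = ET.fromstring(SERVED_HTML)

# ElementTree supports a limited XPath subset, including [@attr='value'].
table = root.find(".//table[@id='trader-overlay-items']")

# The element itself is found, but it has no rows and no text content.
rows = table.findall(".//tr")
text = "".join(table.itertext()).strip()
print(len(rows), repr(text))
```

A query that matches an element with no content is consistent with the blank cell seen above, while a query for nodes that do not exist in the served HTML at all gives the "empty" error.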
References:
IMPORTXML - Docs Editors Help

Related

X-Path to a library search engine

I am writing a short scraper in Google Spreadsheets using XPath and IMPORTXML.
On that page, I am trying to get all the article titles (class 'library-document-summary') into B3 and the cells below, and all the URLs of those articles into C3 and the cells below.
However, I am getting nowhere, as my XPath queries always return empty results. Could someone with knowledge in this area help?
B2= https://resources.norrag.org/categories/591,595
=IMPORTXML(B2,"//div//a[@class='library-document-summary']/text()")
I don't think the IMPORTXML function supports XPaths that select text nodes. But I think if your XPath selects the a elements themselves, then their text content will be imported. e.g.
//div[@id='article_search_results']//a
... and for the links:
//div[@id='article_search_results']//a/@href
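The same idea (select the a elements, then read their text and href) can be sketched in Python. The PAGE markup below is a made-up sample, not the real search results page, and the hrefs are placeholders. Note that Python's ElementTree cannot select attributes with /@href the way full XPath can, so the sketch reads the attribute with .get("href") instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup; the real page's structure may differ.
PAGE = """
<html><body>
  <div id="article_search_results">
    <a class="library-document-summary" href="/library/1">First article</a>
    <a class="library-document-summary" href="/library/2">Second article</a>
  </div>
</body></html>
"""

root = ET.fromstring(PAGE)

# Step 1: locate the results container by its id attribute.
results = root.find(".//div[@id='article_search_results']")

# Step 2: select the link elements themselves, not their text nodes.
links = results.findall(".//a")

titles = [a.text for a in links]        # text content of each link
urls = [a.get("href") for a in links]   # ElementTree has no /@href selector
print(titles, urls)
```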

IMPORTXML on google sheets error- Imported content is empty

I want to get the price from Mercari, a Japanese online shop.
For example, from this link, I would like to get 1,488.
https://jp.mercari.com/item/m78226870756
when I copy the xpath of
<span class="number">
I get
//*[@id="item-info"]/section[1]/section[1]/div[1]/mer-price//span[2]
Now, using Google Sheets IMPORTXML
=IMPORTXML("https://jp.mercari.com/item/m78226870756","//*[@id='item-info']/section[1]/section[1]/div[1]/mer-price//span[2]")
I receive a
#N/A Imported content is empty.
I would really like to know how to get the price.
I am not familiar with this at all.
Any other way other than google sheet is also welcome.
You are getting the #N/A error because IMPORTXML (like the other IMPORT formulas) does not support scraping content that is generated by JavaScript. You can always test this by disabling JavaScript for a given site; whatever is left can usually be imported into Google Sheets.

How to use ImportXML to extract text within <span> with multiple classes?

I'm using Google Sheets' ImportXML function, trying to fetch member counts from discordapp.com invite links so that I can keep track of multiple servers' size and growth. The desired text is inside a span nested in other divs. From what I've read, I'd think my code would work, but the error says the content is empty. See details below:
My attempted code:
=ImportXML("https://discordapp.com/invite/steam","//span[@class='pillMessage-1btqlx medium-zmzTW- size16-14cGz5 height20-mO2eIN']")
Expected: Cell filled with current count, "24,013 Members".
Preferably: Cell filled with value 24013.
Actually: Cell: #N/A & Hovering: Error Imported content is empty.
How can I fix it to fetch the server's member count?
How about this answer?
It seems that on this site, a value like 24,013 is rendered by a script, so it cannot be retrieved directly by IMPORTXML(). But when I looked at the HTML, I found that the value is also included in the page's metadata. In this answer, as a workaround, the value is retrieved from the metadata. Please think of this as just one of several possible answers.
Modified formula:
=VALUE(REGEXEXTRACT(IMPORTXML(A1,"//meta[3]/@content"),"hang out with ([0-9,]+) "))
The url of https://discordapp.com/invite/steam is put to the cell "A1".
Content of metadata is retrieved using IMPORTXML().
In this case, I used //meta[3]/@content as the xpath.
The value is retrieved using REGEXEXTRACT().
The value is converted to the number using VALUE().
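The extraction and conversion steps above can be sketched in Python to show what the regular expression does. The meta_content string is a hypothetical example of what the meta tag might contain; only the "hang out with ... " pattern comes from the formula. Python's int() does not accept thousands separators, so the commas are stripped before converting.

```python
import re

# Hypothetical example of the meta tag's content attribute.
meta_content = (
    "Check out the Steam community on Discord - "
    "hang out with 24,013 other members and enjoy free voice and text chat."
)

# Same pattern as in the REGEXEXTRACT call: capture digits and commas
# that follow "hang out with " and are followed by a space.
match = re.search(r"hang out with ([0-9,]+) ", meta_content)

# Strip the thousands separators before converting to a number.
member_count = int(match.group(1).replace(",", "")) if match else None
print(member_count)
```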
Result:
When I tried the above formula, 24018 was retrieved.
References:
IMPORTXML
REGEXEXTRACT
VALUE
If I misunderstood your question and this was not the result you want, I apologize.

Using CONCATENATE with Google forms and sheets

I have a survey going out with Google Forms, but to analyse the results, I would need to concatenate some cells. However, due to the nature of Google Forms, whenever a new response is recorded, a new row is added. I've read around, looking at different forums and tutorials, but can't seem to find anything that works.
Some of the places I've looked are:
concatenate column values for each row that gets added after google form submission
https://productforums.google.com/forum/#!topic/docs/0Os52U-0i1k
So what I would need help with is if it's possible to concatenate results from a Google Form without having to manually copy the formula in the cells whenever there are new responses. I've tried ArrayFormula, but I can't seem to get it to work. Any help would be much appreciated!
ArrayFormula(A2:A & B2:B) should do the trick.
Note that the formula will persist even if you put it directly at the end of the form and then add a new field.
It will just be shifted to the right, so you don't need to worry about taking care of that when you modify your form.
The CONCATENATE function is a Google Sheets function that combines two or more text strings into a single string. It appears in the dropdown menu for functions above cell A1, and when you select it, it places a =CONCATENATE() formula in the selected cell.
Note that you may need to replace spaces with "&" if your text has spaces.
In order to perform this operation on Google Forms though, you will need to set up Form Embeds by making sure you have the input type of "google form embed." When embedded forms are enabled, there is no need for individual cells within a google sheet workbook with custom formulas next to each question result button as they're all being calculated.
You can find more info on CONCATENATE by referring to this.

Problems with Google Spreadsheets ImportXML function

I'm having some problems with ImportXML in my Google Spreadsheet. I currently have two sheets, each with its own ImportXML, retrieving (basically) the same data. The server providing the data has updated its feed service to require a user-specific "key" in the URL to track who is retrieving what. Prior to this change, my ImportXML worked just fine. They are about to turn off the non-key feeds, and my spreadsheets are about to break.
In the first (working) sheet, this is the feed.
I can import the data successfully by using the following syntax in cell A1:
=importXML("http://atilla.hinttech.nl/fseconomy/xml?id=18649&key=M3LRG43T&query=GroupLogByMonth&month=10","//GroupLogByMonth")
In the new (non-working) sheet, the URL to the feed (including my user-specific "keys") is here.
I am unable to create a working importXML on this sheet. None of my attempted Xpath queries worked, except "*"; but that resulted in all elements being lumped into a single cell.
I have shared my spreadsheet file (the link is in the comments below, as I am unable to post more than 2 links) with each of these sheets so that the above examples can be seen and played with. Any advice on the non-working sheet would be wonderful.
In the new XML feed there is no tag "GroupLogByMonth". This might explain why your XPath query doesn't return anything when you look for it.
Did the format of the XML change too, along with the new URL?
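One way to check which tags a feed actually contains is to list them before writing the XPath. Here is a minimal Python sketch; the NEW_FEED content and its tag names are invented for illustration and are not the real feed's structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed content; the real feed's tag names are not known here.
NEW_FEED = """
<MonthlyLogItems>
  <LogItem><Pilot>alice</Pilot><Distance>120</Distance></LogItem>
  <LogItem><Pilot>bob</Pilot><Distance>85</Distance></LogItem>
</MonthlyLogItems>
"""

root = ET.fromstring(NEW_FEED)

# Collect every distinct tag name in the document, including the root.
tag_names = sorted({element.tag for element in root.iter()})

# Check whether the tag used in the old XPath query still exists.
has_old_tag = "GroupLogByMonth" in tag_names
print(tag_names, has_old_tag)
```

If the feed declares a default XML namespace, ElementTree shows the tags as {namespace-uri}Name, so the same listing would also reveal that. A default namespace is another common reason a plain tag name in an XPath query matches nothing.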
