I use a Google Sheet and IMPORTHTML to get the content of this page:
https://meteor.dsac.fr/documentation.php
It works fine except for the last column: the returned value is "Consulter", but I would like to get the link instead (e.g. https://meteor.dsac.aviation-civile.gouv.fr/meteor-externe/#communication/18280).
Is there a way to do that with this function?
Thanks
Currently, IMPORTHTML cannot retrieve the attributes of a tag (such as the href of a link); it only returns the cell text. So, for example, how about the following sample formula?
Sample formula:
=QUERY({IMPORTHTML("https://meteor.dsac.fr/documentation.php","table",1),{"";IMPORTXML("https://meteor.dsac.fr/documentation.php","//td[4]/a/@href")}},"SELECT Col1,Col2,Col3,Col5")
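In this sample formula, IMPORTXML is used to retrieve the URLs in place of the "Consulter" text. The leading "" in {"";IMPORTXML(...)} adds an empty cell so that the link column lines up with the header row of the table, and QUERY then selects columns 1 to 3 plus the new column 5, skipping column 4 ("Consulter").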
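To make the array logic of that formula easier to follow, here is a minimal JavaScript sketch of the same column merge outside the spreadsheet. The helper name `mergeLinks` and the sample rows are made up for illustration; only the shape (four table columns, one link per data row) follows the formula above.

```javascript
// Hypothetical helper: mirrors {table, {""; links}} followed by
// QUERY(..., "SELECT Col1,Col2,Col3,Col5").
function mergeLinks(tableRows, links) {
  // Prepend an empty string so the link column lines up with the header row.
  const linkColumn = ["", ...links];
  return tableRows.map((row, i) => {
    // Keep columns 1-3, drop column 4 ("Consulter"), append the link (column 5).
    return [...row.slice(0, 3), linkColumn[i] ?? ""];
  });
}

// Made-up sample data, only to show the shape of the result.
const rows = [
  ["Date", "Type", "Title", ""],
  ["2021-01-01", "Info", "Example", "Consulter"],
];
const links = ["https://example.com/#communication/1"];
const merged = mergeLinks(rows, links);
console.log(merged);
```

If the number of links differs from the number of data rows, the missing cells are left empty here; in the sheet, a mismatch between the two ranges is worth checking for the same reason.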
Testing:
When this formula is used, the following result is obtained.
References:
IMPORTHTML
IMPORTXML
Using IMPORTXML in google sheets. I want to extract part of the result into one cell.
=IMPORTXML(B1,"//div[@class='orca-rating SwtJyda color-yellow tbody-6']/span")
I got the result spread over several columns. B8:F8
The inspect element is like this. I only want the value "2". It is in cell B8.
I think this can be done using substring-after. But I could not get the correct result.
In your situation, how about the following samples?
=REGEXREPLACE(JOIN("",IMPORTXML(B1,"//div[@class='orca-rating SwtJyda color-yellow tbody-6']/span")),"[^0-9]","")
=REGEXEXTRACT(JOIN("",IMPORTXML(B1,"//div[@class='orca-rating SwtJyda color-yellow tbody-6']/span")),"\((.*)\)")
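As a rough illustration of what the two formulas do, the following JavaScript sketch applies the same join-then-regex steps to a made-up list of pieces. The actual values returned by IMPORTXML for that page are not shown in the thread, so the `pieces` array is purely an assumption.

```javascript
// Made-up pieces standing in for the cells IMPORTXML spreads across a row.
const pieces = ["", "(", "2", ")", ""];

// Like JOIN("", ...): concatenate all pieces into a single string.
const joined = pieces.join("");

// Like REGEXREPLACE(joined, "[^0-9]", ""): drop every non-digit character.
const digitsOnly = joined.replace(/[^0-9]/g, "");

// Like REGEXEXTRACT(joined, "\((.*)\)"): capture the text inside parentheses.
const match = joined.match(/\((.*)\)/);
const insideParens = match ? match[1] : null;

console.log(digitsOnly, insideParens);
```

The first variant keeps every digit in the joined text, so it suits cases where the wanted number is the only number present; the second depends on the value being wrapped in parentheses.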
References:
REGEXREPLACE
REGEXEXTRACT
I use this formula. That works too.
=INDEX(IMPORTXML(B1,"//div[@class='orca-rating SwtJyda color-yellow tbody-6']/span"),3)
But tanaike's formula is very good.
I'm trying to add easily updatable prices to a Google Sheet.
I need the market price from
//*[@id="app"]/div/section[2]/section/div[1]/section[3]/div/section[1]/ul/li[1]/span[2]
https://www.tcgplayer.com/product/242811/pokemon-celebrations-celebrations-elite-trainer-box?Language=English
I need it to display just the one number from the XPath to a cell, and I can't seem to figure out where I am going wrong. I've been using the IMPORTXML function and it won't return a value.
=IMPORTXML(A2,"//*[@id='app']/div/section[2]/section/div[1]/section[3]/div/section[1]/ul/li[1]/span[2]")
where A2 is the URL.
In your situation, it seems that the market price cannot be retrieved directly from the URL https://www.tcgplayer.com/product/242811/pokemon-celebrations-celebrations-elite-trainer-box?Language=English. Fortunately, it seems that the value can be retrieved directly from the API endpoint. So, how about the following sample formula?
Sample formula:
=REGEXEXTRACT(JOIN(",",IMPORTDATA(A1)),"marketPrice:(.+?),")*1
or
=REGEXEXTRACT(QUERY(TRANSPOSE(IMPORTDATA(A1)),"WHERE Col1 matches 'marketPrice.+'"),"marketPrice:(.+)")*1
The cell "A1" has the URL of https://mpapi.tcgplayer.com/v2/product/242811/details.
In the case of https://www.tcgplayer.com/product/242811/pokemon-celebrations-celebrations-elite-trainer-box?Language=English, take the product ID 242811 from the URL and put it into the API endpoint, like https://mpapi.tcgplayer.com/v2/product/242811/details.
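The mapping from product page URL to API endpoint can be sketched in JavaScript as follows. The helper name `toDetailsEndpoint` is hypothetical, and the sketch assumes the numeric product ID always directly follows "/product/" in the page URL, as it does in the example above.

```javascript
// Hypothetical helper: builds the API endpoint from a product page URL,
// assuming the numeric product ID always follows "/product/".
function toDetailsEndpoint(productUrl) {
  const match = productUrl.match(/\/product\/(\d+)/);
  if (!match) {
    throw new Error("No product ID found in URL: " + productUrl);
  }
  return "https://mpapi.tcgplayer.com/v2/product/" + match[1] + "/details";
}

const endpoint = toDetailsEndpoint(
  "https://www.tcgplayer.com/product/242811/pokemon-celebrations-celebrations-elite-trainer-box?Language=English"
);
console.log(endpoint);
```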
Result:
Note:
The value returned from the URL is JSON data, so the following custom function can also be used. Copy and paste the script into the script editor of the Spreadsheet, save it, and then put the custom function =SAMPLE("url") in a cell.
const SAMPLE = url => JSON.parse(UrlFetchApp.fetch(url).getContentText()).marketPrice;
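A slightly more defensive variant of that custom function might look like the sketch below. It keeps the same assumption as SAMPLE, namely that `marketPrice` is a top-level key of the JSON response; the names `parseMarketPrice` and `MARKET_PRICE` are hypothetical. `UrlFetchApp` only exists inside Apps Script, so the parsing step is split out into a plain function that can be checked anywhere.

```javascript
// Pure helper: pulls marketPrice out of the JSON text
// (assumed to be a top-level key, as in SAMPLE).
function parseMarketPrice(jsonText) {
  const data = JSON.parse(jsonText);
  if (typeof data.marketPrice !== "number") {
    throw new Error("marketPrice not found in response");
  }
  return data.marketPrice;
}

// Custom function for Apps Script; UrlFetchApp is only defined there.
function MARKET_PRICE(url) {
  // muteHttpExceptions lets us inspect the status code instead of failing outright.
  const response = UrlFetchApp.fetch(url, { muteHttpExceptions: true });
  if (response.getResponseCode() !== 200) {
    throw new Error("Request failed with status " + response.getResponseCode());
  }
  return parseMarketPrice(response.getContentText());
}
```

In a cell this would be used as =MARKET_PRICE(A1), with A1 holding the API endpoint URL. A thrown error shows up in the cell as an error message rather than a silent blank.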
References:
IMPORTDATA
REGEXEXTRACT
Custom Functions in Google Sheets
It's not possible to scrape JavaScript-rendered content into Google Sheets with IMPORTXML.
I'm using the following query in my sheet to import total Spotify streams for artists. Example:
=IMPORTXML("https://chartmasters.org/spotify-streaming-numbers-tool/?artist_name=&artist_id=1uh2pZRWuOebEoQgFVKK7l&displayView=Disco","//tr[@class='careerTotals'][2]")
However it's returning one extra value I don't want ("EAS"). I would like to just have the artist name in A and the total streams in B. Any ideas? Thanks.
How about these modifications?
Modified formula:
=TRANSPOSE(IMPORTXML(A1,"//tr[@class='careerTotals'][2]/td[position()<3]"))
or
=QUERY(IMPORTXML(A1,"//tr[@class='careerTotals'][2]"),"SELECT Col1,Col2")
The URL of https://chartmasters.org/spotify-streaming-numbers-tool/?artist_name=&artist_id=1uh2pZRWuOebEoQgFVKK7l&displayView=Disco is put in the cell "A1".
In the first modified formula, the expected values are retrieved with the XPath //tr[@class='careerTotals'][2]/td[position()<3], which selects only the first two cells of the row, and TRANSPOSE puts them into columns.
In the second modified formula, all 3 values are retrieved and QUERY keeps only the first two columns.
Result:
This result is from the first modified formula; the second one gives the same result.
References:
TRANSPOSE
QUERY
I'm trying to use the IMPORTXML function on Google Sheets.
For example: =IMPORTXML("https://www.tiktok.com/@charlidamelio?lang=en", XMLPATH) should return "54.3M"
I used the Chrome inspector to copy the xpath, which gives me:
/html/body/div[1]/div/div[2]/div/div[1]/div/header/h2[1]/strong[2]
When I try this in Google Sheets it returns an error: #N/A (Import Content is Empty).
P.S. I'm open to other ways to get the data I need into the google sheet, it doesn't have to use the IMPORTXML function.
How about this answer?
In this answer, IMPORTXML and REGEXEXTRACT are used. It also supposes that the URL https://www.tiktok.com/@charlidamelio?lang=en is put in cell "A1".
Pattern 1:
In this pattern, "followerCount" is retrieved.
Sample formula:
=REGEXEXTRACT(IMPORTXML(A1,"//script[@id='__NEXT_DATA__']"),"followerCount"":(\d+)")
"followerCount" is retrieved from the script.
In this case, when =VALUE(REGEXEXTRACT(IMPORTXML(A1,"//script[@id='__NEXT_DATA__']"),"followerCount"":(\d+)")) is used, the retrieved value can be used as a number.
Result:
Pattern 2:
In this pattern, the follower count is retrieved from the meta description of the page.
Sample formula:
=REGEXEXTRACT(IMPORTXML(A1,"//meta[@name='description']/@content")," ([\w\d.]+) Fans")
The value 54.4M is extracted from the text "54.4M Fans" in the metadata.
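The two regular expressions can be tried outside the sheet with a short JavaScript sketch. The sample strings below are made-up stand-ins for the script content and the meta description (the real TikTok markup may differ), and the function names are hypothetical.

```javascript
// Made-up stand-ins for the page content; real TikTok markup may differ.
const scriptText = '{"stats":{"followerCount":54300000,"followingCount":1}}';
const description = "Example (@example) on TikTok | 1.2B Likes. 54.4M Fans. Bio";

// Pattern 1: digits after "followerCount": in the embedded JSON.
function followerCountFromScript(text) {
  const m = text.match(/"followerCount":(\d+)/);
  return m ? Number(m[1]) : null;
}

// Pattern 2: the token right before " Fans" in the meta description.
function followersFromDescription(text) {
  const m = text.match(/ ([\w\d.]+) Fans/);
  return m ? m[1] : null;
}

console.log(followerCountFromScript(scriptText));
console.log(followersFromDescription(description));
```

Pattern 1 yields an exact number, while Pattern 2 yields the abbreviated display string, so the choice depends on whether the value is needed for calculation or only for display.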
Result:
References:
IMPORTXML
REGEXEXTRACT
I'm importing some data with IMPORTXML into a Google Sheet, but I need to run another IMPORTXML (with a different regexp) when the first one displays "#N\D". I have tried with if.error, but it did nothing; the same goes for IF IMPORTXML(....)="#N\D".
What can I do?
thx
Instead of
IMPORTXML(meta!B1; "//*[@title='atto'][1]/div[2]/div[4]/a/@href")="#N\D"
try
ISERROR(IMPORTXML(meta!B1; "//*[@title='atto'][1]/div[2]/div[4]/a/@href"))
or
ISNA(IMPORTXML(meta!B1; "//*[@title='atto'][1]/div[2]/div[4]/a/@href"))
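The underlying idea, trying one extraction and falling back to a second one only when the first fails, can be sketched in JavaScript as follows. The helper `firstAvailable` and the two sample extractors are hypothetical and only illustrate the control flow, not the actual import.

```javascript
// Hypothetical helper: returns the first result that neither throws nor is empty.
function firstAvailable(extractors) {
  for (const extract of extractors) {
    try {
      const value = extract();
      if (value !== null && value !== undefined && value !== "") {
        return value;
      }
    } catch (err) {
      // Treat a thrown error like an import error and try the next extractor.
    }
  }
  return null;
}

// Made-up extractors: the first fails, so the second one is used.
const link = firstAvailable([
  () => { throw new Error("Imported content is empty"); },
  () => "https://example.com/fallback",
]);
console.log(link);
```

The point of the design is that the error is tested for as an error (as ISERROR does) rather than compared against the text the cell displays, which is why the string comparison in the question never matches.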