Extract substrings using IMPORTXML in Google Sheets - google-sheets

I am using IMPORTXML in Google Sheets and want to extract part of the result into one cell.
=IMPORTXML(B1,"//div[@class='orca-rating SwtJyda color-yellow tbody-6']/span")
The result is spread over several columns, B8:F8.
The inspected element looks like this. I only want the value "2". It is in cell B8.
I think this can be done using substring-after, but I could not get the correct result.

In your situation, how about the following samples?
=REGEXREPLACE(JOIN("",IMPORTXML(B1,"//div[@class='orca-rating SwtJyda color-yellow tbody-6']/span")),"[^0-9]","")
=REGEXEXTRACT(JOIN("",IMPORTXML(B1,"//div[@class='orca-rating SwtJyda color-yellow tbody-6']/span")),"\((.*)\)")
References:
REGEXREPLACE
REGEXEXTRACT

I use this formula. That works too.
=INDEX(IMPORTXML(B1,"//div[@class='orca-rating SwtJyda color-yellow tbody-6']/span"),3)
But tanaike's formula is very good.
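To make the three formulas above easier to compare, here is a minimal sketch in plain JavaScript (the language Google Apps Script is based on) of what each one does to the imported cells. The sample cell values are hypothetical, since the actual page content is not shown in the question; the function names are illustrative only.

```javascript
// Emulates JOIN("", cells): concatenate every imported cell into one string.
function joinCells(cells) {
  return cells.join("");
}

// Emulates REGEXREPLACE(text, "[^0-9]", ""): drop every non-digit character.
function digitsOnly(text) {
  return text.replace(/[^0-9]/g, "");
}

// Emulates REGEXEXTRACT(text, "\((.*)\)"): the text between "(" and the last ")".
function insideParentheses(text) {
  const m = text.match(/\((.*)\)/);
  return m ? m[1] : null;
}

// Emulates INDEX(cells, 3) on a single row: the third cell (1-based).
function thirdCell(cells) {
  return cells[2];
}

// Hypothetical values standing in for the IMPORTXML output in B8:F8.
const sampleCells = ["", "(", "2", ")", ""];
const joined = joinCells(sampleCells);
console.log(digitsOnly(joined), insideParentheses(joined), thirdCell(sampleCells));
```

The REGEXREPLACE variant keeps every digit found anywhere in the joined text, so it only isolates the rating if no other span contains digits; the REGEXEXTRACT and INDEX variants depend on the parentheses and on the cell position, respectively.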

Related

How to use IMPORTXML and SEQUENCE together in Google Sheet

=ARRAYFORMULA("https://www.amazon.com/product-reviews/B08C1W5N87/ref=cm_cr_arp_d_viewopt_rvwer?ie=UTF8&reviewerType=avp_only_reviews&sortBy=recent&pageNumber="&SEQUENCE(5,1,1,1))
I use the formula above to build the links I would like to scrape data from. There are 5 links.
=IMPORTXML(A6,"/html/body/div[1]/div[3]/div/div[1]/div/div[1]/div[5]/div[3]/div/div[*]/div/div/div[2]/a[1]/i")
I also use the formula above to scrape the data I want from a link. A6 refers to the first link that the first formula creates.
If possible, I would like to scrape the data from all 5 links and list the results in one column.
=IMPORTXML(ARRAYFORMULA("https://www.amazon.com/product-reviews/B08C1W5N87/ref=cm_cr_arp_d_viewopt_rvwer?ie=UTF8&reviewerType=avp_only_reviews&sortBy=recent&pageNumber="&SEQUENCE(5,1,1,1)),"/html/body/div[1]/div[3]/div/div[1]/div/div[1]/div[5]/div[3]/div/div[*]/div/div/div[2]/a[1]/i")
The formula above did not work.
=ARRAYFORMULA(IMPORTXML("https://www.amazon.com/product-reviews/B08C1W5N87/ref=cm_cr_arp_d_viewopt_rvwer?ie=UTF8&reviewerType=avp_only_reviews&sortBy=recent&pageNumber="&SEQUENCE(5,1,1,1),"/html/body/div[1]/div[3]/div/div[1]/div/div[1]/div[5]/div[3]/div/div[*]/div/div/div[2]/a[1]/i"))
The formula above did not work either. It always scrapes only the first link's data.
Thank you for your help in advance.
Keep in mind that IMPORTXML is itself a "type of arrayformula", so it is not supported under ARRAYFORMULA.
In your case, try to hardcode 5 IMPORTXML formulae into an array {} like:
={IMPORTXML();
IMPORTXML();
IMPORTXML();
etc}
Update:
With the new LAMBDA function, it's possible to do it in one go:
=INDEX(TRIM(FLATTEN(SPLIT(FLATTEN(BYCOL(
"https://www.amazon.com/product-reviews/B08C1W5N87/ref=cm_cr_arp_d_viewopt_rvwer?ie=UTF8&reviewerType=avp_only_reviews&sortBy=recent&pageNumber="&
SEQUENCE(1,5,1,1), LAMBDA(x, QUERY(IMPORTXML(x,
"/html/body/div[1]/div[3]/div/div[1]/div/div[1]/div[5]/div[3]/div/div[*]/div/div/div[2]/a[1]/i")&"×",,9^9)))), "×"))))

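The LAMBDA formula in the answer above can be read as "scrape each page, glue each page's results into one marked string, then split and flatten everything into a single column". A rough sketch of that flow in plain JavaScript follows; `scrapePage` is a hypothetical stand-in for IMPORTXML (no network request is made here), and the function names are assumptions for illustration.

```javascript
const BASE_URL =
  "https://www.amazon.com/product-reviews/B08C1W5N87/ref=cm_cr_arp_d_viewopt_rvwer" +
  "?ie=UTF8&reviewerType=avp_only_reviews&sortBy=recent&pageNumber=";

// Emulates BASE_URL & SEQUENCE(1, count, 1, 1): one URL per page number.
function buildPageUrls(count) {
  return Array.from({ length: count }, (_, i) => BASE_URL + (i + 1));
}

// scrapePage(url) is a hypothetical callback returning an array of strings,
// standing in for IMPORTXML(url, xpath).
function collectAll(urls, scrapePage) {
  // Per URL: append the "×" marker to each value and collapse the values into
  // one string, roughly what QUERY(range & "×",, 9^9) does inside BYCOL.
  const perPage = urls.map((url) =>
    scrapePage(url).map((value) => value + "×").join(" ")
  );
  // SPLIT on the marker, FLATTEN into one list, TRIM, and skip empty pieces.
  return perPage
    .flatMap((cell) => cell.split("×"))
    .map((piece) => piece.trim())
    .filter((piece) => piece !== "");
}

// Usage with a fake scraper that returns two values per page.
const fakeScrape = (url) => ["first from page " + url.slice(-1), "second from page " + url.slice(-1)];
console.log(collectAll(buildPageUrls(5), fakeScrape));
```

The marker character only needs to be something that never occurs in the scraped text; "×" is simply the one the formula uses.
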
Adapt a formula to ArrayFormula

I'm very new to Google Sheets formulas.
I'm trying to convert this formula:
=INDEX(PRX_ARM,MATCH(ELECTRICITE!E97,OFFSET(PRX_ARM_C1,0,MATCH(ELECTRICITE!G97,PRX_ARM_L1
,1)-1),1), MATCH(ELECTRICITE!G97&"1",PRX_ARM_L1,1))+
ELECTRICITE!H97*INDEX(PRX_REL,2,2)+
ELECTRICITE!I97*INDEX(PRX_REL,3,2)+
ELECTRICITE!J97*INDEX(PRX_BP_VOY,2,2)+
ELECTRICITE!K97*INDEX(PRX_BP_VOY,3,2)+
ELECTRICITE!L97*INDEX(PRX_BP_VOY,4,2)+
ELECTRICITE!M97*INDEX(PRX_BP_VOY,5,2)
So, naturally, I converted every single-cell reference to the range the ArrayFormula must be applied to:
=ArrayFormula(INDEX(PRX_ARM,MATCH(ELECTRICITE!E90:E5030,OFFSET(PRX_ARM_C1,0,MATCH(ELECTRICITE!G90:G5030,PRX_ARM_L1,1)-1),1),
MATCH(ELECTRICITE!G90&"1",PRX_ARM_L1,1))+
ELECTRICITE!H90:H5030*INDEX(PRX_REL,2,2)+
ELECTRICITE!I90:I5030*INDEX(PRX_REL,3,2)+
ELECTRICITE!J90:J5030*INDEX(PRX_BP_VOY,2,2)+
ELECTRICITE!K90:K5030*INDEX(PRX_BP_VOY,3,2)+
ELECTRICITE!L90:L5030*INDEX(PRX_BP_VOY,4,2)+
ELECTRICITE!M90:M5030*INDEX(PRX_BP_VOY,5,2))
But it does not work.
Do you know what I'm doing wrong?
Thanks in advance.
You can't use INDEX with ARRAYFORMULA; use VLOOKUP instead.

How to use vlookup or index/match to return a formula instead of a value?

I'm trying to use VLOOKUP or INDEX/MATCH in Google Sheets to pull and plug in a query formula, rather than just its resulting value.
For example, I'm trying to use this vlookup formula:
=ArrayFormula(IFERROR(VLOOKUP(A2:A100,'SMS VLOOKUP'!A2:$B$500,2,0),""))
to plug in the formula:
=QUERY(courtdates,"SELECT D, C, BO, H WHERE BO = date '"&TEXT(TODAY()-1,"yyyy-mm-dd")&"'",0)
As I understand it, you want to fetch a formula from one place using VLOOKUP and then use it somewhere else? Something like eval or evaluate()?
The first part is possible using FORMULATEXT(cell), but you can't convert the text of a formula into an actual formula. No built-in function does this.
The only way is to use a script.
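As a minimal sketch of the script route, assuming the formula is stored as plain text in one cell and should become a live formula in another, something like the following Google Apps Script could work. The target sheet name "Sheet1" and the cell addresses are hypothetical; `getValue` and `setFormula` are standard Range methods.

```javascript
// Reads formula text stored as plain text in one cell and writes it into
// another cell as a live formula. Both arguments are Range-like objects.
function applyStoredFormula(sourceRange, targetRange) {
  const text = String(sourceRange.getValue()).trim();
  if (!text.startsWith("=")) {
    throw new Error("Source cell does not contain formula text: " + text);
  }
  targetRange.setFormula(text);
  return text;
}

// Apps Script entry point. The sheet name "Sheet1" and the addresses B2 and C2
// are hypothetical placeholders; adjust them to the actual spreadsheet.
function plugInQueryFormula() {
  const ss = SpreadsheetApp.getActiveSpreadsheet();
  const source = ss.getSheetByName("SMS VLOOKUP").getRange("B2");
  const target = ss.getSheetByName("Sheet1").getRange("C2");
  applyStoredFormula(source, target);
}
```

If the source cell holds a real formula rather than text, `getValue` returns its computed result, so in that case `getFormula` would be the method to read from instead.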

How to figure out proper xpath for IMPORTXML in Google Sheets - N/A Error?

I'm trying to use the IMPORTXML function on Google Sheets.
For example: =IMPORTXML("https://www.tiktok.com/@charlidamelio?lang=en", XMLPATH) should return "54.3M"
I used the Chrome inspector to copy the xpath, which gives me:
/html/body/div[1]/div/div[2]/div/div[1]/div/header/h2[1]/strong[2]
When I try this in Google Sheets, it returns an error: #N/A (Import Content is Empty).
P.S. I'm open to other ways of getting the data I need into the Google Sheet; it doesn't have to use the IMPORTXML function.
How about this answer?
In this answer, IMPORTXML and REGEXEXTRACT are used. It also assumes that the URL https://www.tiktok.com/@charlidamelio?lang=en is in cell "A1".
Pattern 1:
In this pattern, "followerCount" is retrieved.
Sample formula:
=REGEXEXTRACT(IMPORTXML(A1,"//script[@id='__NEXT_DATA__']"),"followerCount"":(\d+)")
"followerCount" is retrieved from the script.
In this case, when =VALUE(REGEXEXTRACT(IMPORTXML(A1,"//script[@id='__NEXT_DATA__']"),"followerCount"":(\d+)")) is used, the retrieved value can be used as a number.
Pattern 2:
In this pattern, "followerCount" is retrieved.
Sample formula:
=REGEXEXTRACT(IMPORTXML(A1,"//meta[@name='description']/@content")," ([\w\d.]+) Fans")
The value of "54.4M Fans" is retrieved from the metadata.
References:
IMPORTXML
REGEXEXTRACT
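To illustrate what the two regular expressions in the patterns above capture, here is a small sketch in plain JavaScript. The sample script text and description string are hypothetical stand-ins for what IMPORTXML would return; the real page content may differ.

```javascript
// Emulates VALUE(REGEXEXTRACT(text, "followerCount"":(\d+)")):
// capture the digits after followerCount": and convert them to a number.
function followerCountFromScript(scriptText) {
  const m = scriptText.match(/followerCount":(\d+)/);
  return m ? Number(m[1]) : null;
}

// Emulates REGEXEXTRACT(text, " ([\w\d.]+) Fans"):
// capture the token that sits between a space and " Fans".
function fansFromDescription(description) {
  const m = description.match(/ ([\w\d.]+) Fans/);
  return m ? m[1] : null;
}

// Hypothetical inputs, shaped like the data the two patterns expect.
const sampleScript = '{"userData":{"followerCount":54300000,"followingCount":1}}';
const sampleDescription = "Sample profile on TikTok | 4.1B Likes. 54.4M Fans.";

console.log(followerCountFromScript(sampleScript));
console.log(fansFromDescription(sampleDescription));
```

Pattern 1 yields an exact count that can be used in arithmetic, while Pattern 2 yields the abbreviated display text, so which one fits depends on whether a number or a label is wanted.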

This array formula is not working

Hi, the following formula works, but the array-formula version does not.
Working formula:
=IF(V2:V=1,INDEX($E$2:$E,MATCH(T2&B2&"Delivered Time (Today)",$T$2:$T&$B$2:$B&$C$2:$C,0)),"")
This formula, however, does not work:
=ARRAYFORMULA(IF(V2:V=1,INDEX($E$2:$E,MATCH(T2&B2&"Delivered Time (Today)",$T$2:$T&$B$2:$B&$C$2:$C,0)),""))
Can someone show me how to fix this?
Example details:
Example Sheet is here
Unfortunately, not all Sheets functions work within an ARRAYFORMULA, and INDEX and MATCH are two that do not.
Instead, you can use VLOOKUP and construct an array to do the job of INDEX/MATCH:
=ArrayFormula(IF(V2:V=1,VLOOKUP(T2:T&B2:B&"Delivered Time (Today)",{T2:T&B2:B&C2:C,E2:E},2,0),))
You can see it working in this copy of your example sheet:
https://docs.google.com/spreadsheets/d/1dFVNfPn0R9goQaLjRvZEwggRthbkEY3nC3aqC2joPcw/edit?usp=sharing
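The idea behind the constructed array is to build a two-column table (concatenated key, value) and look each row's key up in it. A minimal sketch of that logic in plain JavaScript follows; the row objects use the column letters as hypothetical property names, and `null` stands in for the #N/A that VLOOKUP returns when no key matches.

```javascript
const TARGET_LABEL = "Delivered Time (Today)";

// rows: array of objects with properties T, B, C, E, V (one per sheet row).
function lookupDeliveredTime(rows) {
  // {T2:T & B2:B & C2:C, E2:E}: map each concatenated key to its E value.
  // The first occurrence wins, like VLOOKUP with an exact match.
  const byKey = new Map();
  for (const row of rows) {
    const key = row.T + row.B + row.C;
    if (!byKey.has(key)) {
      byKey.set(key, row.E);
    }
  }
  // IF(V=1, VLOOKUP(T & B & label, table, 2, 0), ): one result per row.
  return rows.map((row) => {
    if (row.V !== 1) return "";
    const key = row.T + row.B + TARGET_LABEL;
    return byKey.has(key) ? byKey.get(key) : null; // null stands in for #N/A
  });
}

// Usage with hypothetical rows.
const sampleRows = [
  { T: "t1", B: "b1", C: "Order Time", E: "09:00", V: 1 },
  { T: "t1", B: "b1", C: "Delivered Time (Today)", E: "12:30", V: 0 },
  { T: "t2", B: "b2", C: "Other", E: "15:00", V: 1 },
];
console.log(lookupDeliveredTime(sampleRows));
```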
