How to get child nodes through importxml xpath query? - google-sheets

I'm trying to get the separate <td>s of a <tr> that I'm importing through IMPORTXML to show up in separate cells in Google Sheets.
This formula should get my match data based on the match ID and the player ID I provide. I feel that simply adding /* or /td to the end of the XPath should work, but that's the end of my knowledge.
I tried adding /*, /td and others to the end of the XPath query, but that doesn't seem to work.
I even disabled JavaScript and inspected the website again, but to no avail.
FORMULA:
=IMPORTXML("https://www.dotabuff.com/matches/5011379854";"//tr[contains(@class,'9764136')]")
Also tried:
=IMPORTXML("https://www.dotabuff.com/matches/5011379854";"//td[parent::tr[contains(@class,'9764136')]]")
Which only gives the first of all the <td>s and not the rest.
The current output is all mushed together:
"19LemthTop (Off)ZeusCoreTop (Off) Roaminglost27108.7k127933650626.5k-183-/-5m7m21m31m"
The output that I want is separate <td> on separate lines:
"19
LemthTop (Off)ZeusCoreTop (Off) Roaminglost
2
7
10
8.7k
127
9
336
506
26.5k
-
183
-/-
5m7m21m31m"

Issue and workaround:
I tried to parse the values of each cell in the row, but unfortunately it seems that the td elements of a row cannot be retrieved directly as separate cells using an XPath with IMPORTXML. Fortunately, each table can be retrieved with IMPORTHTML, and each tab can also be accessed. Using them, how about the following workaround?
Retrieve a table from the URL using IMPORTHTML.
Retrieve the row that includes the name corresponding to 9764136 using QUERY.
Modified formula:
=TRANSPOSE(SPLIT(TEXTJOIN("#",TRUE,QUERY(IMPORTHTML(A1,"table",1), "where Col4 contains '"&IMPORTXML(A1,"//a[contains(@href,'9764136')]")&"'", 0)),"#",TRUE,TRUE))
The URL https://www.dotabuff.com/matches/5011379854 is put in cell "A1".
After the table is retrieved, the row is selected from the table by the query.
The important point of this workaround is the method. There are various formulas that could retrieve the values, so please treat the sample formula above as just one of them.
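To make the steps above concrete, here is a minimal Python sketch of the same pipeline: find the row for a player, join its non-empty cells with "#", then split them back into a vertical list. It is only an illustration of the logic, not what Sheets does internally. The markup, the class names, and the function name `cells_for_player` are hypothetical stand-ins and are not taken from the real page.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified stand-in for one table on the page.
MARKUP = """
<table>
  <tr class="player-1111111">
    <td>20</td><td><a href="/players/1111111">Other</a></td><td>5</td>
  </tr>
  <tr class="player-9764136">
    <td>19</td><td><a href="/players/9764136">Lemth</a></td><td>2</td>
  </tr>
</table>
""".strip()


def cells_for_player(markup, player_id):
    """Return the text of every <td> in the first row whose class mentions player_id."""
    root = ET.fromstring(markup)
    for tr in root.iter("tr"):
        if player_id in tr.get("class", ""):
            # itertext() also collects text nested in child elements such as <a>.
            return ["".join(td.itertext()).strip() for td in tr.findall("td")]
    return []


cells = cells_for_player(MARKUP, "9764136")
joined = "#".join(c for c in cells if c)  # plays the role of TEXTJOIN("#", TRUE, ...)
column = joined.split("#")                # plays the role of SPLIT(...) plus TRANSPOSE

for value in column:
    print(value)
```

Each printed line corresponds to one cell of the transposed result.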
Note:
If you use the above formula with another URL, an error might occur. Please be careful about this.
References:
IMPORTHTML
IMPORTXML
TEXTJOIN
SPLIT
TRANSPOSE

Related

Exclude all values in a column on a different sheet from a Google Sheets Query

So I have the following query:
=QUERY('sheet - Users'!A1:S, "Select A,B,C,F,G,O,Q,S where Q >= 44223 and not lower(O) matches '.*archived.*|.*archived' and not lower(C) matches '.*admin.*|.*admin' and not UPPER(C) matches '.*SMB.*' and not C matches '.*Shared Mailbox.*' and S >=90",1)
This works perfectly fine (as convoluted as it is), however, I have a list of exceptions that I need to remove from the results (the list could change, so ideally this needs to be dynamic and not hard coded).
I did some digging around and found this example query:
=query(C2:C8,"select C where C<>'"&JOIN("' and C<>'",D2:D10)&"'"&"")
But that doesn't seem to be working for me when I try to incorporate it into my query.
The data I need to exclude is on a sheet called: Exclusion List
And is in cells C2:C
Is anyone able to help?
Solution:
You can modify the exclusion list to reference C2:C from another sheet, like this:
=query(C2:C8,"select C where C<>'"&JOIN("' and C<>'",'Exclusion List'!C2:C)&"'"&"")
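As a rough illustration of what the JOIN builds, the Python sketch below assembles the same kind of query string from a list of exclusions. The function name `build_exclusion_query` and the sample addresses are hypothetical. The sketch skips blank entries on purpose: I have not verified how JOIN treats the empty cells of an open-ended range such as C2:C, so filtering them out first may be worth considering.

```python
def build_exclusion_query(column, exclusions):
    """Build a 'select ... where col<>'a' and col<>'b'' string from a list of values."""
    values = [v for v in exclusions if v != ""]  # drop blank cells
    if not values:
        return f"select {column}"
    # Same idea as JOIN("' and C<>'", range) wrapped in the opening and closing quotes.
    joined = f"' and {column}<>'".join(values)
    return f"select {column} where {column}<>'" + joined + "'"


query = build_exclusion_query("C", ["alice@example.com", "bob@example.com", ""])
print(query)
```

Printing the generated string is a quick way to check that the quotes and the `and` separators line up before passing it to QUERY.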

Using multiple function for COUNTIFS in google sheets

=COUNTIFS((Tab1!C2:Tab1!C250),"*sam*") & ((Tab1!B2:Tab1!B250), ">1-Nov-2020")
In the above formula, I'm trying to get the count of people whose name is sam and whose date is past 1-Nov-2020.
When I try to fetch the count using the above formula, it shows a formula parse error.
Please analyze it and tell me where I might have gone wrong.
You need to correct your syntax to:
=COUNTIFS('Tab1'!C2:C9,"sam", 'Tab1'!B2:B9,">1-Nov-2020")
Please read more on how the COUNTIFS function works.
EDIT (following OP's comment)
The correct syntax would be
COUNTIFS(criteria_range1, criterion1, [criteria_range2, criterion2, ...]) meaning:
=COUNTIFS('Tab1'!C2:C9,"sam", 'Tab1'!B2:B9,">1-Nov-2020", 'Tab1'!B2:B9,">=1-11-2020")
BUT
Since you are referring to dates, 1-Nov-2020 is the same as 1-11-2020.
So you only need
=COUNTIFS('Tab1'!C2:C9,"sam", 'Tab1'!B2:B9,">=1-11-2020")
OR
=COUNTIFS('Tab1'!C2:C9,"sam", 'Tab1'!B2:B9,">=1-Nov-2020")
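The key point is that COUNTIFS takes range/criterion pairs and counts only the rows where every pair matches. A small Python sketch of that AND logic, using made-up sample rows and a hypothetical helper named `countifs`:

```python
from datetime import date

# Hypothetical sample data: (name in column C, date in column B).
rows = [
    ("sam", date(2020, 11, 1)),
    ("sam", date(2020, 10, 31)),
    ("Sam", date(2020, 12, 5)),
    ("alex", date(2020, 11, 20)),
]


def countifs(rows, name, on_or_after):
    """Count rows where the name matches (ignoring case) AND the date is on or after the cutoff."""
    return sum(
        1
        for row_name, row_date in rows
        if row_name.lower() == name.lower() and row_date >= on_or_after
    )


total = countifs(rows, "sam", date(2020, 11, 1))
print(total)
```

A row is counted only when both conditions hold, which is why the two criteria belong inside a single COUNTIFS call rather than being joined with &.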

Google Sheets importxml query to remove one piece of data

I'm using the following query in my sheet to import total Spotify streams for artists. Example:
=IMPORTXML("https://chartmasters.org/spotify-streaming-numbers-tool/?artist_name=&artist_id=1uh2pZRWuOebEoQgFVKK7l&displayView=Disco","//tr[@class='careerTotals'][2]")
However it's returning one extra value I don't want ("EAS"). I would like to just have the artist name in A and the total streams in B. Any ideas? Thanks.
How about these modifications?
Modified formula:
=TRANSPOSE(IMPORTXML(A1,"//tr[@class='careerTotals'][2]/td[position()<3]"))
or
=QUERY(IMPORTXML(A1,"//tr[@class='careerTotals'][2]"),"SELECT Col1,Col2")
The URL https://chartmasters.org/spotify-streaming-numbers-tool/?artist_name=&artist_id=1uh2pZRWuOebEoQgFVKK7l&displayView=Disco is put in cell "A1".
In the first modified formula, the expected values are retrieved with the XPath //tr[@class='careerTotals'][2]/td[position()<3] and are put into columns using TRANSPOSE.
In the second modified formula, the expected values are selected from the 3 retrieved values using QUERY.
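For readers unfamiliar with `position()<3`, the Python sketch below mimics the selection on a made-up table: take the second row with the class careerTotals, then keep only its first two cells. The markup and the cell values are hypothetical, and plain list slicing stands in for the XPath predicates.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the table; the real page has more rows and columns.
MARKUP = """
<table><tbody>
  <tr class="careerTotals"><td>Name</td><td>Streams</td><td>EAS</td></tr>
  <tr class="careerTotals"><td>Artist</td><td>1,234</td><td>5.6</td></tr>
</tbody></table>
""".strip()

root = ET.fromstring(MARKUP)

# Rows with class="careerTotals"; index 1 is the second match, like [2] in the XPath.
matching = [tr for tr in root.iter("tr") if tr.get("class") == "careerTotals"]
second = matching[1]

# Keep only the first two cells, like td[position()<3].
first_two = [td.text for td in second.findall("td")[:2]]
print(first_two)
```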
Result:
This result is from the first modified formula. The second one gives the same result.
References:
TRANSPOSE
QUERY

Google Sheets - Arrayformula Query Split debugging

I have a spreadsheet where I manually copy and paste some data from pdf tables.
I've been using ArrayFormula with QUERY and SPLIT to split that info into different columns. It works flawlessly for 2 columns (date and amount), and for the other one (reference) it works most of the time.
Example that works:
PDF DATA: 4500063794 21.07.2020 187.50
COLUMNS IN SPREADSHEET (final result after split): Reference:4500063794, Date:21.07.2020, Amount:187.50
Formula for retrieving Reference column: =ArrayFormula(QUERY(SPLIT(C3:C7 ;" ");"select Col1";0))
This works along the spreadsheet without any problems
Another example that works:
PDF DATA: 447/20.6TBOS 04.07.2020 804.00
COLUMNS IN SPREADSHEET (final result after split): Reference:447/20.6TBOS, Date:04.07.2020, Amount:804.00
Formula for retrieving Reference column: =ArrayFormula(QUERY(SPLIT(C3:C7 ;" ");"select Col1";0))
This works along the spreadsheet without any problems
Example that DOES NOT work:
If I paste several rows like the 1st example, then add several rows like the 2nd example, and afterwards paste more data like the 1st example, SPLIT stops retrieving the Reference column for PDF data similar to "447/20.6TBOS 04.07.2020 804.00" (2nd example). It returns blank cells for these.
Can anyone shed some light on this?
Thanks in advance
Example Spreadsheet
It will work if you change the query to:
=ArrayFormula(INDEX(SPLIT(REGEXREPLACE(C3:C7; "\s"; "♥");"♥");ROW(C3:C7)-ROW(C3);1))
The formula replaces the whitespace with hearts (a rarely used character), splits each line on that character, and then returns the selected column.
To retrieve a different column, just change the last argument from 1 to 2 or 3:
)-ROW(C3); ==> 1 ))
You can use the same formula for the G column (don't forget to update the ranges), as the delimiters in both 4500063794 21.07.2020 187.50 and 447/20.6TBOS 04.07.2020 804.00 are the same (whitespace).
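The replace-then-split idea can be sketched in Python as follows. This only illustrates the text handling and keeps every token as text, so it does not reproduce how Sheets types the split values. The function name `split_column` is hypothetical.

```python
import re


def split_column(lines, column, marker="♥"):
    """Replace whitespace with a rare marker, split on it, and return one column (1-based)."""
    result = []
    for line in lines:
        replaced = re.sub(r"\s", marker, line)             # like REGEXREPLACE(..., "\s", "♥")
        parts = [p for p in replaced.split(marker) if p]   # like SPLIT, dropping empty pieces
        result.append(parts[column - 1] if column <= len(parts) else "")
    return result


pdf_rows = [
    "4500063794 21.07.2020 187.50",
    "447/20.6TBOS 04.07.2020 804.00",
]
references = split_column(pdf_rows, 1)
amounts = split_column(pdf_rows, 3)
print(references)
print(amounts)
```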

How to make a relative reference in an array formula in Google Sheets

Here's the straightforward version of my question:
I want to change the following formula to an array formula...
Original formula (from cell J2):
=if(F4="VM:",G4,J1)
My attempt at converting to an array formula (in cell K1):
=arrayformula(if(row(A:A)=1,G3,if(F:F = "VM:",G:G,indirect("K"&row(A:A)-1))))
This works on rows where F = "VM:", but returns a #REF error on other rows. Function INDIRECT parameter 1 value is 'K0'. It is not a valid cell/range reference.
Thoughts on how to fix this?
The more complex version of my question. i.e. Why am I trying to do this?...
I have a weird spreadsheet with data that should really be in a Wiki.
I want to create filter views for each person so they can easily filter on only their own vendors. The original formula will work, but as more vendors are added, I'd like for the formula to automatically work for those rows as well.
If there's a better way to do this, I'm listening.
I don't exactly understand your needs, but if you want the formula to populate automatically, then you only need this formula in the desired column in row 4 (you can change this to any other row; it will autofill down from that point):
=ArrayFormula(if(F4:F="VM:",G4:G,J1:J))
Is this what you are trying to get?
After clarification:
You need this code in J2 only:
=ArrayFormula(VLOOKUP(ROW(J2:J),
QUERY({F:G,ROW(G:G)},"select Col3,Col2 where Col1='VM:'",1)
,2,1)
)
Works for you?
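As I read it, the VLOOKUP in that formula does an approximate match on row numbers: for each row it finds the nearest row at or above it where F is "VM:" and returns that row's G value. The Python sketch below shows this fill-down logic with `bisect`. The sample rows and the name `fill_down` are hypothetical, and header handling is left out.

```python
from bisect import bisect_right


def fill_down(rows, marker="VM:"):
    """For each row, return the G value of the nearest row at or above it whose F equals marker."""
    keys, values = [], []
    for row_number, (f_value, g_value) in enumerate(rows, start=1):
        if f_value == marker:
            keys.append(row_number)   # sorted row numbers of the marker rows
            values.append(g_value)

    result = []
    for row_number in range(1, len(rows) + 1):
        # Index of the last marker row that is <= the current row, like a sorted VLOOKUP.
        i = bisect_right(keys, row_number) - 1
        result.append(values[i] if i >= 0 else None)
    return result


rows = [("VM:", "Ann"), ("", "x"), ("", "y"), ("VM:", "Bob"), ("", "z")]
filled = fill_down(rows)
print(filled)
```

Rows that come before the first "VM:" row have nothing to look up, so the sketch returns None for them.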
Maybe you just need to hide the errors?
=IFERROR(ARRAYFORMULA(IF(ROW(A:A)=1,G3,IF(F:F = "VM:",G:G,INDIRECT("K"&ROW(A:A)-1)))),)
