Google Sheets IMPORTXML query to remove one piece of data

I'm using the following query in my sheet to import total Spotify streams for artists. Example:
=IMPORTXML("https://chartmasters.org/spotify-streaming-numbers-tool/?artist_name=&artist_id=1uh2pZRWuOebEoQgFVKK7l&displayView=Disco","//tr[@class='careerTotals'][2]")
However it's returning one extra value I don't want ("EAS"). I would like to just have the artist name in A and the total streams in B. Any ideas? Thanks.

How about these modifications?
Modified formula:
=TRANSPOSE(IMPORTXML(A1,"//tr[@class='careerTotals'][2]/td[position()<3]"))
or
=QUERY(IMPORTXML(A1,"//tr[@class='careerTotals'][2]"),"SELECT Col1,Col2")
Here, the URL https://chartmasters.org/spotify-streaming-numbers-tool/?artist_name=&artist_id=1uh2pZRWuOebEoQgFVKK7l&displayView=Disco is in cell A1.
In the first formula, the XPath //tr[@class='careerTotals'][2]/td[position()<3] retrieves only the first two cells of the row, and TRANSPOSE lays them out across columns.
In the second formula, IMPORTXML retrieves all three values and QUERY keeps only the first two columns.
Result:
Both formulas return the same result.
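To make the selection logic concrete outside of Sheets, here is a minimal Python sketch. The markup in SAMPLE is invented for illustration and is not the real page; only the careerTotals class name comes from the question. Python's ElementTree supports only a subset of XPath, so list indexing and slicing stand in for [2] and position()<3.

```python
import xml.etree.ElementTree as ET

# Invented stand-in for the page's table; the real markup may differ.
SAMPLE = """
<table>
  <tr class="careerTotals"><td>Label</td><td>Streams</td><td>EAS</td></tr>
  <tr class="careerTotals"><td>Example Artist</td><td>1,234,567</td><td>890</td></tr>
</table>
"""


def first_two_cells(markup):
    """Return the text of the first two <td> cells of the second careerTotals row."""
    root = ET.fromstring(markup.strip())
    rows = root.findall(".//tr[@class='careerTotals']")
    second_row = rows[1]                      # XPath [2] is 1-based, so Python index 1
    cells = second_row.findall("td")
    return [cell.text for cell in cells[:2]]  # same cells as td[position()<3]


print(first_two_cells(SAMPLE))
```

The slice drops the third cell before anything reaches the sheet, which is the role of the first formula; the second formula fetches all three cells and discards the last one afterwards.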
References:
TRANSPOSE
QUERY

Related

Google Sheets QUERY formula that uses text from one column to match another column, returning text from the adjacent cell

I have a dropdown text column, Claims!B2:B, that should be matched against Ref!A2:A to return the corresponding text from Ref!B2:B.
I tried
=ArrayFormula(IF($B$2:$B="","", LOOKUP($B$2:$B,Ref!$A$2:$A,Ref!$B$2:$B)))
which gives inconsistent results,
=QUERY(Ref!A2:B,"Select B where A = "&B2:B&"")
which results in an error, and
=FILTER(Ref!B2:B,Ref!$A$2:$A=B2:B)
which gives wrong results and does not expand as an array.
I would like to know the simplest array formula for this scenario and, if possible, how to correct my other attempts, as part of my learning process.
For your Category column, please use
=INDEX(IFNA(VLOOKUP(B2:B,Ref!A2:C,3,0)))
and for your GST Stats column, use
=INDEX(IFNA(VLOOKUP(B2:B,Ref!A2:C,2,0)))
(The only difference between the two is the column index: 3 versus 2.)
Functions used:
INDEX
IFNA
VLOOKUP
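As a rough illustration of what an exact-match VLOOKUP wrapped in IFNA does for a whole column, here is a Python sketch. The rows in ref and the values in claims are made up; only the three-column layout of Ref!A2:C is taken from the formulas above, and vlookup_exact is a hypothetical helper name.

```python
def vlookup_exact(search_keys, table, column_index):
    """Exact-match lookup for each key; a miss yields "" (the IFNA part)."""
    first_match = {}
    for row in table:
        # VLOOKUP returns the first matching row, so keep the first one seen.
        first_match.setdefault(row[0], row)
    return [
        first_match[key][column_index - 1] if key in first_match else ""
        for key in search_keys
    ]


# Hypothetical Ref!A2:C rows: lookup key, GST status, category.
ref = [
    ["Acme", "Registered", "Hardware"],
    ["Bolt", "Exempt", "Services"],
]
# Hypothetical Claims!B2:B values, including a blank and an unknown key.
claims = ["Bolt", "", "Unknown", "Acme"]

print(vlookup_exact(claims, ref, 3))  # Category column
print(vlookup_exact(claims, ref, 2))  # GST Stats column
```

An exact match does not depend on row order. LOOKUP, by contrast, expects the search range to be sorted, which may explain the inconsistent results from the first attempt.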

Matching an ID value and extracting a string in Google Sheets

I have the following problem:
I have a column with wrong IDs.
Now I want to match those wrong IDs against another sheet, which contains the same IDs together with the correct link for each one:
What I came up with is the following:
=IFERROR(VLOOKUP(A2,'extract base'"B:F),"")"))
But obviously it doesn't work, haha. So basically it should be very easy: if the ID from sheet one matches the ID from sheet two, put the value of sheet two's column 2 into the second column of sheet one (custom_label in my example).
Any help is appreciated, thank you so much!
Your current VLOOKUP formula is not structured correctly at all, and your sheet reference 'extract base'"B:F is also not written correctly. Have you read the basic documentation on VLOOKUP usage and syntax?
Delete B2:B entirely.
Then place the following in B2:
=ArrayFormula(IF(A2:A="",,IFERROR(VLOOKUP(A2:A,'extract base'!B:F,5,FALSE))))
This formula should provide results, if any, for all rows (assuming that your second sheet is, in fact, called exactly extract base).

Google Sheets - Arrayformula Query Split debugging

I have a spreadsheet where I manually copy and paste some data from pdf tables.
I've been using ArrayFormula with QUERY and SPLIT to split that info into different columns. It works flawlessly for two columns (date and amount), and for the third one (reference) it works most of the time.
Example that works:
PDF DATA: 4500063794 21.07.2020 187.50
COLUMNS IN SPREADSHEET (final result after split): Reference:4500063794, Date:21.07.2020, Amount:187.50
Formula for retrieving Reference column: =ArrayFormula(QUERY(SPLIT(C3:C7 ;" ");"select Col1";0))
This works throughout the spreadsheet without any problems.
Another example that works:
PDF DATA: 447/20.6TBOS 04.07.2020 804.00
COLUMNS IN SPREADSHEET (final result after split): Reference:447/20.6TBOS, Date:04.07.2020, Amount:804.00
Formula for retrieving Reference column: =ArrayFormula(QUERY(SPLIT(C3:C7 ;" ");"select Col1";0))
This works throughout the spreadsheet without any problems.
Example that DOES NOT work:
If I paste several rows like the first example, then several rows like the second example, and afterwards paste more data like the first example, SPLIT stops retrieving the Reference column for PDF data similar to "447/20.6TBOS 04.07.2020 804.00" (second example). It returns blank cells for these rows.
Can anyone shed some light on this?
Thanks in advance
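SPLIT turns numeric-looking references such as 4500063794 into numbers, and QUERY gives each column a single data type based on the majority of its values, returning the minority values as blanks. That is why the text references disappear once the numeric ones outnumber them. It will work if you replace QUERY with INDEX: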
=ArrayFormula(INDEX(SPLIT(REGEXREPLACE(C3:C7; "\s"; "♥");"♥");ROW(C3:C7)-ROW(C3);1))
The formula replaces each whitespace character with ♥ (a character unlikely to appear in the data), splits on it, and INDEX then returns the chosen column.
To return a different column, just change the final 1 to 2 or 3:
)-ROW(C3); ==> 1 ))
You can use the same formula for the G column (don't forget to update the ranges), since both 4500063794 21.07.2020 187.50 and 447/20.6TBOS 04.07.2020 804.00 use the same delimiter (whitespace).
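For comparison, the split-and-pick step can be sketched in Python using the two sample lines from the question. The function name split_column is hypothetical. Every field stays a string here, so a text reference is never blanked out because of the type of its neighbours.

```python
import re


def split_column(lines, column):
    """Split each line on whitespace and return the 1-based column as text."""
    result = []
    for line in lines:
        fields = re.split(r"\s+", line.strip())
        # Lines with too few fields yield an empty string instead of failing.
        result.append(fields[column - 1] if column <= len(fields) else "")
    return result


# The two sample lines from the question.
rows = [
    "4500063794 21.07.2020 187.50",
    "447/20.6TBOS 04.07.2020 804.00",
]

print(split_column(rows, 1))  # references
print(split_column(rows, 2))  # dates
print(split_column(rows, 3))  # amounts
```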

How to get child nodes through importxml xpath query?

I'm trying to get the separate <td>s of a <tr> that I'm importing through IMPORTXML to show up in separate cells in Google Sheets.
This formula should get my match data based on the match ID and the player ID I provide. I feel that simply adding /* or /td to the end of the XPath should work, but that's the limit of my knowledge.
I tried adding /*, /td and others to the end of the XPath query, but that doesn't seem to work.
I even disabled JavaScript and inspected the website again, but to no avail.
FORMULA:
=IMPORTXML("https://www.dotabuff.com/matches/5011379854";"//tr[contains(@class,'9764136')]")
Also tried:
=IMPORTXML("https://www.dotabuff.com/matches/5011379854";"//td[parent::tr[contains(@class,'9764136')]]")
Which only gives the first of all the <td>s and not the rest.
The current output is all mushed together:
"19LemthTop (Off)ZeusCoreTop (Off) Roaminglost27108.7k127933650626.5k-183-/-5m7m21m31m"
The output that I want has each <td> on a separate line:
"19
LemthTop (Off)ZeusCoreTop (Off) Roaminglost
2
7
10
8.7k
127
9
336
506
26.5k
-
183
-/-
5m7m21m31m"
Issue and workaround:
I tried to parse the values of each row, but unfortunately it seems that the td elements cannot be parsed directly, row by row, using an XPath with IMPORTXML. Fortunately, each table can be retrieved with IMPORTHTML, and each tab can also be accessed. Using these, how about the following workaround?
Retrieve a table from the URL using IMPORTHTML.
Retrieve the row that includes the name corresponding to 9764136 using a query.
Modified formula:
=TRANSPOSE(SPLIT(TEXTJOIN("#",TRUE,QUERY(IMPORTHTML(A1,"table",1), "where Col4 contains '"&IMPORTXML(A1,"//a[contains(@href,'9764136')]")&"'", 0)),"#",TRUE,TRUE))
The URL https://www.dotabuff.com/matches/5011379854 is in cell A1.
After the table is retrieved, the query picks the matching row from it.
The important point of this workaround is the methodology. There are various formulas that could retrieve these values, so please treat the sample formula above as just one of them.
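The filter, join, split and transpose chain can be sketched in Python as follows. The table rows are hypothetical and much shorter than the real page's table, and rows_as_column is an invented name; the sketch only mirrors the order of operations in the formula.

```python
def rows_as_column(table, name, delimiter="#"):
    """List the non-empty cells of every row whose 4th column contains name."""
    # QUERY(... "where Col4 contains 'name'")
    matching = [row for row in table if name in str(row[3])]
    # TEXTJOIN("#", TRUE, ...): join the cells, ignoring empty ones
    joined = delimiter.join(str(cell) for row in matching for cell in row if cell != "")
    # SPLIT(..., "#") drops empty pieces; TRANSPOSE then stacks them vertically
    return [part for part in joined.split(delimiter) if part]


# Hypothetical table: the 4th column holds the player name.
table = [
    ["", "", "18", "OtherPlayer", "3", "5"],
    ["", "", "19", "Lemth", "2", "7"],
]

print(rows_as_column(table, "Lemth"))
```

One caveat of this approach, in the sketch and in the formula alike, is that a cell containing the delimiter itself would be split into two pieces.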
Note:
If you use the above formula with other URLs, an error might occur. Please be careful about this.
References:
IMPORTHTML
IMPORTXML
TEXTJOIN
SPLIT
TRANSPOSE

How to retrieve a cell value and write it in another sheet checking two different columns

I have two sheets, Progress and App1stSession, in the same spreadsheet.
Progress sheet has these columns
Date Campaign Sessions 1stSessions
App1stSession sheet has these columns
Date Campaign 1stSessions
I need to retrieve 1stSessions values from App1stSession. The match can be done on Date and Campaign values.
I've written this formula
INDEX(App1stSession!$A$1:C,
AND(MATCH($B1,App1stSession!$B$1:B,0),MATCH($A1,App1stSession!$A$1:A,0))),3)
Of course it doesn't work, because AND returns 0 or 1.
So I've tried this solution
INDEX(App1stSession!$A$1:C,IF(AND(MATCH($B1,App1stSession!$B$1:C,0),MATCH($A1
,App1stSession!$A$1:A,0)),MATCH($C1,App1stSession!$C$1:C,0)),3)
I supposed that the second MATCH could retrieve the right row, but this solution doesn't work either.
Finally I've tried
=QUERY(App1stSession!$A$1:A,"SELECT """&App1stSession!$C1:C&""" WHERE
("""&App1stSession!$A$1:A&"""="""&$A1&""")AND("""&App1stSession!$B$1:B&"""="""&$C1&""")")
Again it doesn't work, though I suppose that is a matter of syntax.
Any suggestion ?
Thanks
The way I would do it is to insert a unique ID column (the search key) in column A of your App1stSession sheet, before the rest of your data, built dynamically from the date and campaign values:
=B1&C1
Then, on your Progress sheet, use this kind of formula to join the same two values on the fly and look them up with VLOOKUP:
=VLOOKUP(A1&B1,App1stSession!A:D,4,false)
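The composite-key idea can be sketched in Python like this. The dates, campaign names and counts are made up, lookup_first_sessions is a hypothetical name, and None stands in for the #N/A that VLOOKUP would return on a miss.

```python
def lookup_first_sessions(progress_rows, app_rows):
    """Match on date and campaign together and return the 1stSessions values.

    progress_rows: (date, campaign) pairs from the Progress sheet.
    app_rows: (date, campaign, first_sessions) rows from App1stSession.
    """
    by_key = {}
    for date, campaign, first_sessions in app_rows:
        # Helper column =B1&C1; keep the first row per key, as VLOOKUP does.
        by_key.setdefault(date + campaign, first_sessions)
    # VLOOKUP(A1&B1, ..., false): exact match on the joined key.
    return [by_key.get(date + campaign) for date, campaign in progress_rows]


# Hypothetical data.
app = [
    ("2020-07-01", "Summer", 12),
    ("2020-07-01", "Winter", 5),
    ("2020-07-02", "Summer", 9),
]
progress = [
    ("2020-07-02", "Summer"),
    ("2020-07-01", "Winter"),
    ("2020-07-03", "Summer"),
]

print(lookup_first_sessions(progress, app))
```

Plain concatenation can in principle produce the same key from different pairs of values. If that is a concern, adding the same separator character in both the helper column and the lookup formula should avoid it.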
