When I import data, it comes in this format (image 1), with blank spaces. I would like to know if there is any way to adjust so that these blanks disappear, the two models expected (image 2 and 3) if there was any way to reach them would be important to me.
Remembering that all dates have / and all times have :
I tried to filter from QUERY, but when trying to "Select Col1, Col2, Col4 Where Col2 is not null" the dates disappear and only the times remain, I tried via REGEXMATCH to separate the dates from the times using / and : but also I was not successful.
I also tried it via IMPORTXML, but some data ends up not being imported correctly on some pages of the site, for IMPORTHTML these errors do not happen. The XML's I used were:
"//tr[#class='no-date-repetition-new' and ..//td[#class='team team-a']] | //tr[#class='no-date-repetition-new live-now' and ..//td[#class='team team-a']]"
"//td[#class='team team-a']/a | //td[#class='team team-a strong']/a"
The current formula is as follows:
=IMPORTHTML("https://int.soccerway.com/national/austria/1-liga/20192020/regular-season/r54328/","table",1)
IMPORTHTML Original:
Expected formats:
---
Rather than filtering what you need is to restructure the imported data.
Anyway, I think that the easier solution to get the final result is to use multiple IMPORTXML formulas.
URL
A1: https://int.soccerway.com/national/austria/1-liga/20192020/regular-season/r54328/
Headers
A2: //table[contains(#class,'matches')]/thead/tr/th
Day
A3: //td[contains(#class,'date')]/parent::tr
Teams and Score
A4: //td[contains(#class,'team-a')]/parent::tr
A6: =transpose(IMPORTXML($A$1,A2))
A7: =IMPORTXML($A$1,A3)
B7: =IMPORTXML(A1,A4)
You might want to replace the formula on A6 by static values in order to place them properly.
You can join 2 queries together (one next to the other) in a single formula, to get your results
={QUERY(IMPORTHTML("https://int.soccerway.com/national/austria/1-liga/20192020/regular-season/r54328/","table",1),
"select Col1 where Col2 is null and not Col1 contains '*'",1),
QUERY(IMPORTHTML("https://int.soccerway.com/national/austria/1-liga/20192020/regular-season/r54328/","table",1),
"select Col1, Col2, Col3, Col4 where Col2 is not null label Col1 'Time'",1)}
How the formula works:
As you notice the data part of both queries is the same in both of them. What is actually different is "what we ask for from the query"
In the first one we use "select Col1 where Col2 is null and not Col1 contains '*'"
In the second one "select Col1, Col2, Col3, Col4 where Col2 is not null label Col1 'Time'"
We create an array by joining them together as in ={1stQUERY,2ndQUERY}
Related
I made columns that simply lists all the schools for each student under a date column header. I only need to count the schools - I don't want to see any student data. I have a form that needs to be submitted to the state and there is a cell with the date we are submitting. I want to use an IMPORTRANGE to go to the other spreadsheet and find the column where the date matches the form date and pull back the count of schools in that column. I tried using INDEX/MATCH, etc., and while I have those working on the form for other data already on my spreadsheet - I can't seem to get them to work on the 'outside' spreadsheet. Any help is appreciated! I'm attaching an example spreadsheet with more explanation. I hope it's clear. Thank you.
Example Document
ztiaa has provided a good solution. I will suggest another which pivots the data differently. I feel that having the dates go across columns is going to quickly become unwieldy. So this approach leaves the dates running down the right side with the schools running as column headers. My solution also includes exceptions such as snow days, as those will likely be important to see at a glance as well. Just be careful to use the same notation for such exceptions. For instance, always use "snowday" (not, eg., "snow day") or "prep day" (not, e.g., "prep").
In Sheet1, use IMPORTRANGE to bring in the data exactly as you have it listed in your sample sheet (i.e., dates at top, school names or exceptions under each date header).
Then, in a new sheet, place the following formula in cell A1:
=ArrayFormula({QUERY(SPLIT(FLATTEN(FILTER(Sheet1!1:1&"~"&INDIRECT("Sheet1!2:"&ROWS(Sheet1!A:A)),Sheet1!1:1<>"")),"~",1,0),"Select Col1, COUNT(Col1) WHERE Col2 Is Not Null GROUP BY Col1 PIVOT Col2 FORMAT Col1 'mmm dd'"),TRANSPOSE({"District Totals",MMULT(SEQUENCE(1,ROWS(Sheet1!A2:A),1,0),IF(FILTER(INDIRECT("Sheet1!2:"&ROWS(Sheet1!A:A)),Sheet1!1:1<>"")="",0,1))})})
This assumes that the first sheet is, in fact, named Sheet1. If it is not, you'll need to change every instance of that sheet name in the formula.
It will probably further aid visual ease if you freeze Row 1 and Column 1 (View > Freeze > 1 row / 1 column).
ADDENDUM (after seeing realistic copy of OP's sheet)
=ArrayFormula({TRANSPOSE(QUERY(SPLIT(FLATTEN(FILTER(Sheet1!1:1&"~"&INDIRECT("Sheet1!2:"&ROWS(Sheet1!A:A)),Sheet1!2:2<>"")),"~",1,0),"Select Col1, COUNT(Col1) WHERE Col2 Is Not Null GROUP BY Col1 PIVOT Col2 FORMAT Col1 'mmm dd'"));{"District Totals",MMULT(SEQUENCE(1,ROWS(Sheet1!A2:A),1,0),IF(FILTER(INDIRECT("Sheet1!2:"&ROWS(Sheet1!A:A)),Sheet1!2:2<>"")<>"",1,0))}})
If you prefer to see the name of the exception (e.g., "snowday") rather than the count of 1 for each exception in the "District totals" line:
=ArrayFormula({TRANSPOSE(QUERY(SPLIT(FLATTEN(FILTER(Sheet1!1:1&"~"&INDIRECT("Sheet1!2:"&ROWS(Sheet1!A:A)),Sheet1!2:2<>"")),"~",1,0),"Select Col1, COUNT(Col1) WHERE Col2 Is Not Null GROUP BY Col1 PIVOT Col2 FORMAT Col1 'mmm dd'"));{"District Totals",IF(MMULT(SEQUENCE(1,ROWS(Sheet1!A2:A),1,0),IF(FILTER(INDIRECT("Sheet1!2:"&ROWS(Sheet1!A:A)),Sheet1!2:2<>"")<>"",1,0))=1,FILTER(Sheet1!2:2,Sheet1!2:2<>""),MMULT(SEQUENCE(1,ROWS(Sheet1!A2:A),1,0),IF(FILTER(INDIRECT("Sheet1!2:"&ROWS(Sheet1!A:A)),Sheet1!2:2<>"")<>"",1,0)))}})
If you want exceptions (e.g., "snowday") to always appear at the bottom instead of in alphabetical order within the school-name list:
=ArrayFormula({TRANSPOSE(QUERY(SPLIT(FLATTEN(FILTER(Sheet1!1:1&"~"&IF(REGEXMATCH(UPPER(INDIRECT("Sheet1!2:"&ROWS(Sheet1!A:A))),INDIRECT("Sheet1!2:"&ROWS(Sheet1!A:A))),,"‡")&INDIRECT("Sheet1!2:"&ROWS(Sheet1!A:A)),Sheet1!2:2<>"")),"~",1,0),"Select Col1, COUNT(Col1) WHERE Col2 Is Not Null AND Col2 <> '‡' GROUP BY Col1 PIVOT Col2 FORMAT Col1 'mmm dd'"));{"District Totals",IF(MMULT(SEQUENCE(1,ROWS(Sheet1!A2:A),1,0),IF(FILTER(INDIRECT("Sheet1!2:"&ROWS(Sheet1!A:A)),Sheet1!2:2<>"")<>"",1,0))=1,FILTER(Sheet1!2:2,Sheet1!2:2<>""),MMULT(SEQUENCE(1,ROWS(Sheet1!A2:A),1,0),IF(FILTER(INDIRECT("Sheet1!2:"&ROWS(Sheet1!A:A)),Sheet1!2:2<>"")<>"",1,0)))}})
Try this out
=ArrayFormula({query(split(flatten(text(A1:E1,"mmm/d")&"❄️"&A2:E),"❄️"),"select Col2, count(Col2) where not Col2 matches '|Snowday' group by Col2 pivot Col1");{"District Total",transpose(MMult(transpose(N(filter(A2:E,not(RegexMatch(A2:E2,"Snowday")))<>"")),sequence(rows(A2:E))^0))}})
I want to rebuild my table of Col1 (key), Col2, Col3, Col4, Col5 to table of Col1 (Key), and Col2 (which include all accordance values from Columns 2,3,4,5...), please check example:
https://docs.google.com/spreadsheets/d/1je3uc1DHitzzsFK6Xag_ld531p7ozeheO4PwpCfZjQA
If I use the following plain text as range in Query:
=QUERY({A2:A7\B2:B7;A2:A7\C2:C7;A2:A7\D2:D7;A2:A7\E2:E7},"Select * where Col2 >0")
it works well. But I can't make that range to be dynamic inside query, I tried to put range (based on formulas or even as a text) in some cell and then pull it as follows
=QUERY(A1,"Select * where Col >0")
or
=QUERY({indirect(A1)},"Select * where Col2 >0")
but it doesn't work.
You can use this formula where you will need to change only one range in case there will be more columns:
=SORT(SPLIT(QUERY(FLATTEN(IF(B3:E = "",, A3:A & "♥" & B3:E)), "WHERE Col1 IS NOT NULL"), "♥"))
Or you could place A3:A and B3:E ranges in some cells as parameters and use INDIRECT you you need that:
=SORT(SPLIT(QUERY(FLATTEN(IF(INDIRECT(N4) = "",, INDIRECT(N3) & "♥" & INDIRECT(N4))), "WHERE Col1 IS NOT NULL"), "♥"))
When a raw data set may expand, I always recommend that the raw data set remain by itself in a sheet. For now, let's assume that you have only your existing raw data set in the 'Sample' sheet.
In a new blank sheet, place this formula:
=ArrayFormula(QUERY(SPLIT(FLATTEN(FILTER(Sample!A3:A,Sample!A3:A<>"")&"|"&FILTER(FILTER(INDIRECT("Sample!B3:"&ROWS(Sample!A:A)),Sample!A3:A<>""),Sample!B2:2<>"")),"|"),"Select * WHERE Col2 Is Not Null",0))
This formula is similar to what #kishkin offered. However, the FILTER inclusions automatically expand to include all parts of the raw data where there is something in Column A and a header added in B3:B. That is, this formula requires zero maintenance if set up as I've described.
So I have data in the following format:
I want to summarize this into another sheet like this:
So I can use TRANSPOSE(SPLIT(A1:A)) to do this individually
I can use ARRAYFORMULA(SPLIT(A:A)) to run this over a range, however I cant seem to use both of them together to the same effect
To summarize
Column A contains comma seperated values for car names
I want to get all the car names as rows in another sheet and add counts against them
You can get the whole mini-report with this one formula:
=ArrayFormula(QUERY(FLATTEN(TRIM(SPLIT(A2:A,","))),"Select Col1, COUNT(Col1) Where Col1 > '#' GROUP BY Col1 LABEL COUNT(Col1) ''"))
SPLIT will split every entry in A2:A at the comma (if there is one) and will result in an error for every blank row (dealt with later).
TRIM removes the spaces after any comma-splits.
FLATTEN creates one column from all results (words, blanks and errors).
QUERY(...,"Select Col1, COUNT(Col1) Where Col1 > '#' GROUP BY Col1 LABEL COUNT(Col1) ''") starts with everything in that FLATTEN column and a COUNT of each GROUPed by the name ruling out anything WHERE the value in Col1 is not > the '#' symbol (which will rule out all blanks and errors). Finally, LABEL will get rid of the superfluous header of "count" that would otherwise be generated for the COUNT(Col1) column of the QUERY.
Alternatively, you could rule out the errors first with a FILTER and then use the WHERE part of the QUERY to rule out blanks:
=ArrayFormula(QUERY(FLATTEN(TRIM(SPLIT(FILTER(A2:A,A2:A<>""),","))),"Select Col1, COUNT(Col1) WHERE Col1 Is Not Null GROUP BY Col1 LABEL COUNT(Col1) ''"))
Managed to solve it from this question
https://webapps.stackexchange.com/questions/114791/google-sheets-split-and-unique
=unique(transpose(arrayformula(trim(split(join(",",A1:A),",")))))
The most efficient is probably QUERY. Unlike JOIN, this has no limit on the number of characters in the result.
Basic approach
=ARRAYFORMULA(TRANSPOSE(SPLIT(
QUERY(A:A&",",,9^9),
", "
)))
Unique
=ARRAYFORMULA(UNIQUE(TRANSPOSE(SPLIT(
QUERY(A:A&",",,9^9),
", "
))))
Counting the number in a group
=ARRAYFORMULA(QUERY(
TRANSPOSE(SPLIT(QUERY(A:A&",",,9^9),", ")),
"select Col1, count(Col1) group by Col1 label count(Col1)''"
))
When I use the format that I left below, returns in error, I know that something is missing adjusting in relation to WHERE AND... But I could not fit to supply the error.
I would like some help knowing what I missed.
"select Col1,Col2,Col3 where Col1 is not null Order by Col1, Col2 AND Col1 > date'"&TEXT(today()-1,"yyyy/mm/dd")&"'"
The date and time for Col1 and Col2 are like this:
With this formula, I hope I can filter the imported data only for those that have the date of the current day or tomorrow and that today's games are higher than the current time.
Link to spreadsheet:
https://docs.google.com/spreadsheets/d/15T4UPVtEHv43DLomKcTmdaPuGWMsBUoO7bLvlOhab4k/edit?usp=sharing
Formula set in Página1 G2
it may look like you can combine parameters as you like but there is a specific order that needs to be followed:
respectively joining operator like and needs to be proceeded by parent parameter where:
Try:
"select Col1, Col2, Col3 where Col1 >='"&TEXT(TODAY()-1,"yyyy-mm-dd")&"' and Col1 is not null Order by Col1, Col2"
This returns 78 records
I have 2 spreadsheets (sheet1 + sheet2)
Sheet2 Pulls in data from Sheet1 using the IMPORTRANGE function which works fine, except that there are a few rows that have missing information in 1-2 columns and for the purpose of what I am trying to do I need to just remove these rows.
Can anyone point me in the right direction? Not sure if I need to add something to the IMPORTRANGE function or create a new spreadsheet and use a different function or do I have to manually delete these rows?
Cheers
The QUERY function provides this ability. The "IS NOT NULL" argument works with numbers, and "!=" is for Strings (anything that's not just numbers).
=QUERY({IMPORTRANGE("YourKey","SheetName!A:B");
IMPORTRANGE("YourKey,"SheetName!A:B");},
"SELECT Col1, Col2 WHERE Col2 IS NOT NULL")
Or
=QUERY({IMPORTRANGE("YourKey","SheetName!A:B");
IMPORTRANGE("YourKey,"SheetName!A:B");},
"SELECT Col1, Col2 WHERE Col2 != ''")