Import Range from spreadsheet but exclude rows with missing data - google-sheets

I have 2 spreadsheets (sheet1 + sheet2)
Sheet2 Pulls in data from Sheet1 using the IMPORTRANGE function which works fine, except that there are a few rows that have missing information in 1-2 columns and for the purpose of what I am trying to do I need to just remove these rows.
Can anyone point me in the right direction? Not sure if I need to add something to the IMPORTRANGE function or create a new spreadsheet and use a different function or do I have to manually delete these rows?
Cheers

The QUERY function provides this ability. The "IS NOT NULL" argument works with numbers, and "!=" is for Strings (anything that's not just numbers).
=QUERY({IMPORTRANGE("YourKey","SheetName!A:B");
IMPORTRANGE("YourKey,"SheetName!A:B");},
"SELECT Col1, Col2 WHERE Col2 IS NOT NULL")
Or
=QUERY({IMPORTRANGE("YourKey","SheetName!A:B");
IMPORTRANGE("YourKey,"SheetName!A:B");},
"SELECT Col1, Col2 WHERE Col2 != ''")

Related

Google Sheets - Trying to use IMPORTRANGE to get student enrollments by school by date

I made columns that simply lists all the schools for each student under a date column header. I only need to count the schools - I don't want to see any student data. I have a form that needs to be submitted to the state and there is a cell with the date we are submitting. I want to use an IMPORTRANGE to go to the other spreadsheet and find the column where the date matches the form date and pull back the count of schools in that column. I tried using INDEX/MATCH, etc., and while I have those working on the form for other data already on my spreadsheet - I can't seem to get them to work on the 'outside' spreadsheet. Any help is appreciated! I'm attaching an example spreadsheet with more explanation. I hope it's clear. Thank you.
Example Document
ztiaa has provided a good solution. I will suggest another which pivots the data differently. I feel that having the dates go across columns is going to quickly become unwieldy. So this approach leaves the dates running down the right side with the schools running as column headers. My solution also includes exceptions such as snow days, as those will likely be important to see at a glance as well. Just be careful to use the same notation for such exceptions. For instance, always use "snowday" (not, eg., "snow day") or "prep day" (not, e.g., "prep").
In Sheet1, use IMPORTRANGE to bring in the data exactly as you have it listed in your sample sheet (i.e., dates at top, school names or exceptions under each date header).
Then, in a new sheet, place the following formula in cell A1:
=ArrayFormula({QUERY(SPLIT(FLATTEN(FILTER(Sheet1!1:1&"~"&INDIRECT("Sheet1!2:"&ROWS(Sheet1!A:A)),Sheet1!1:1<>"")),"~",1,0),"Select Col1, COUNT(Col1) WHERE Col2 Is Not Null GROUP BY Col1 PIVOT Col2 FORMAT Col1 'mmm dd'"),TRANSPOSE({"District Totals",MMULT(SEQUENCE(1,ROWS(Sheet1!A2:A),1,0),IF(FILTER(INDIRECT("Sheet1!2:"&ROWS(Sheet1!A:A)),Sheet1!1:1<>"")="",0,1))})})
This assumes that the first sheet is, in fact, named Sheet1. If it is not, you'll need to change every instance of that sheet name in the formula.
It will probably further aid visual ease if you freeze Row 1 and Column 1 (View > Freeze > 1 row / 1 column).
ADDENDUM (after seeing realistic copy of OP's sheet)
=ArrayFormula({TRANSPOSE(QUERY(SPLIT(FLATTEN(FILTER(Sheet1!1:1&"~"&INDIRECT("Sheet1!2:"&ROWS(Sheet1!A:A)),Sheet1!2:2<>"")),"~",1,0),"Select Col1, COUNT(Col1) WHERE Col2 Is Not Null GROUP BY Col1 PIVOT Col2 FORMAT Col1 'mmm dd'"));{"District Totals",MMULT(SEQUENCE(1,ROWS(Sheet1!A2:A),1,0),IF(FILTER(INDIRECT("Sheet1!2:"&ROWS(Sheet1!A:A)),Sheet1!2:2<>"")<>"",1,0))}})
If you prefer to see the name of the exception (e.g., "snowday") rather than the count of 1 for each exception in the "District totals" line:
=ArrayFormula({TRANSPOSE(QUERY(SPLIT(FLATTEN(FILTER(Sheet1!1:1&"~"&INDIRECT("Sheet1!2:"&ROWS(Sheet1!A:A)),Sheet1!2:2<>"")),"~",1,0),"Select Col1, COUNT(Col1) WHERE Col2 Is Not Null GROUP BY Col1 PIVOT Col2 FORMAT Col1 'mmm dd'"));{"District Totals",IF(MMULT(SEQUENCE(1,ROWS(Sheet1!A2:A),1,0),IF(FILTER(INDIRECT("Sheet1!2:"&ROWS(Sheet1!A:A)),Sheet1!2:2<>"")<>"",1,0))=1,FILTER(Sheet1!2:2,Sheet1!2:2<>""),MMULT(SEQUENCE(1,ROWS(Sheet1!A2:A),1,0),IF(FILTER(INDIRECT("Sheet1!2:"&ROWS(Sheet1!A:A)),Sheet1!2:2<>"")<>"",1,0)))}})
If you want exceptions (e.g., "snowday") to always appear at the bottom instead of in alphabetical order within the school-name list:
=ArrayFormula({TRANSPOSE(QUERY(SPLIT(FLATTEN(FILTER(Sheet1!1:1&"~"&IF(REGEXMATCH(UPPER(INDIRECT("Sheet1!2:"&ROWS(Sheet1!A:A))),INDIRECT("Sheet1!2:"&ROWS(Sheet1!A:A))),,"‡")&INDIRECT("Sheet1!2:"&ROWS(Sheet1!A:A)),Sheet1!2:2<>"")),"~",1,0),"Select Col1, COUNT(Col1) WHERE Col2 Is Not Null AND Col2 <> '‡' GROUP BY Col1 PIVOT Col2 FORMAT Col1 'mmm dd'"));{"District Totals",IF(MMULT(SEQUENCE(1,ROWS(Sheet1!A2:A),1,0),IF(FILTER(INDIRECT("Sheet1!2:"&ROWS(Sheet1!A:A)),Sheet1!2:2<>"")<>"",1,0))=1,FILTER(Sheet1!2:2,Sheet1!2:2<>""),MMULT(SEQUENCE(1,ROWS(Sheet1!A2:A),1,0),IF(FILTER(INDIRECT("Sheet1!2:"&ROWS(Sheet1!A:A)),Sheet1!2:2<>"")<>"",1,0)))}})
Try this out
=ArrayFormula({query(split(flatten(text(A1:E1,"mmm/d")&"❄️"&A2:E),"❄️"),"select Col2, count(Col2) where not Col2 matches '|Snowday' group by Col2 pivot Col1");{"District Total",transpose(MMult(transpose(N(filter(A2:E,not(RegexMatch(A2:E2,"Snowday")))<>"")),sequence(rows(A2:E))^0))}})

Efficient way to collate/aggregate specific data in Google Sheets

I'm looking for an efficient way to gather and aggregate some date in Google Sheets. I've been looking at the query function, pivot tables, and Index + Match formulas, but so far I've not found a way that brings me to the result I'm looking for. I have a set of data which looks more or less as follows.
The fields with an X represent irrelevant data which I don't want to show up in my end result. They only serve to illustrate that there are columns of data that I don't want in between the columns of data that I do want. The data in those columns is of varying types and of varying values per type, they are not actually fields with an "X" in it. Only the fields with numbers are of interest along with the related names at the top and left of those. The intent is to create a list that looks more or less like this.
I've highlighted those yellow fields because that data has been aggregated. For example, in the original file field D3 shows a relation between Laura and Pete with the number 1, and field L3 also shows a relation between Laura and Pete, so the number in that field is to be added to the number in the other field resulting in an aggregated total of 2 for that particular combination.
I would really appreciate any suggestions that can help me get to an elegant and efficient solution for this. The only solutions I can come up with would involve multiple "in-between" sheets and there just has to be a better way.
UPDATE:
Solved by applying the solution in player0's answer. I just had to switch around the order of Col1 and Col2 in the formula to get the table sorted the way I needed it. Formula looks like below now. Many thanks to both player0 and Erik Tyler for their efforts.
=INDEX(QUERY(SPLIT(FLATTEN(A2:A&"×"&D1:N1&"×"&D2:N), "×"),
"select Col2,Col1,sum(Col3)
where Col2 is not null
and Col3 is not null
group by Col2,Col1
label sum(Col3)''", ))
try:
=INDEX(QUERY(SPLIT(FLATTEN(A2:A&"×"&D1:N1&"×"&D2:N), "×"),
"where Col3 is not null and Col2 is not null", ))
update:
=INDEX(QUERY(SPLIT(FLATTEN(A2:A&"×"&D1:N1&"×"&D2:N), "×"),
"select Col1,Col2,sum(Col3)
where Col3 is not null
and Col2 is not null
group by Col1,Col2
label sum(Col3)''", ))
Given your current data set (which only appears to extend to Col N), place the following somewhere to the right of Col N:
=ArrayFormula(SPLIT(TRANSPOSE(QUERY(TRANSPOSE(QUERY(SPLIT(QUERY(FLATTEN(FILTER(IF(NOT(ISNUMBER(D2:N)),,D1:N1&"~ "&A2:A&"|"&D2:N),A2:A<>"")),"Select * WHERE Col1 Is Not Null"),"|"),"Select Col1, SUM(Col2) GROUP BY Col1 LABEL SUM(Col2) ''")&"~ "),,2)),"~ ",0,1))
It would be better if this were placed in a different sheet from the original data. Supposing that your original data sheet is named Sheet1, place the following version of the above formula into a new sheet:
=ArrayFormula(SPLIT(TRANSPOSE(QUERY(TRANSPOSE(QUERY(SPLIT(QUERY(FLATTEN(FILTER(IF(NOT(ISNUMBER(INDIRECT("Sheet1!D2:"&ROWS(Sheet1!A:A)))),,Sheet1!D1:1&"~ "&Sheet1!A2:A&"|"&INDIRECT("Sheet1!D2:"&ROWS(Sheet1!A2:A))),Sheet1!A2:A<>"")),"Select * WHERE Col1 Is Not Null"),"|"),"Select Col1, SUM(Col2) GROUP BY Col1 LABEL SUM(Col2) ''")&"~ "),,2)),"~ ",0,1))
This separate-sheet approach and formula allows for the original data to extend indefinitely past Col N.

Query Importrange Issue

I am not sure why this functionality stopped working, but I am sure it has to do with inconsistent back end data or how the query condition "CONTAINS" needs to be changed. The IMPORTRANGE portion works just fine, but will not always pull data into the front end sheet. The query portion looks like this
SELECT Col3, Col2, Col1 WHERE Col2 CONTAINS "&'Job Number'!A1&" ORDER BY Col3 ASC,1
Column 2 contains job numbers that are xxxxx with another 3 digit code appended to end of it. It will only populate temporarily if I manually go into the sheet and edit the IMPORTRANGE range values. If I close the spreadsheet and open it again it will not populate. Does the data in Column 2 need to be a consistent datatype throughout the column or it will break the query?
Unfortunately we do not have a test sheet, so we can not know what the expected results would be.
Nevertheless, your formula syntax is not correct.
The correct syntax would be:
SELECT Col3, Col2, Col1 WHERE Col2 CONTAINS '"&"JOB"&A1&"' ORDER BY Col3 ASC,1
Please notice the syntax '"&"JOB"&A1&"'
(If you still have issues please share a test sheet and let us know.)

Filter IMPORTHTML data

When I import data, it comes in this format (image 1), with blank spaces. I would like to know if there is any way to adjust so that these blanks disappear, the two models expected (image 2 and 3) if there was any way to reach them would be important to me.
Remembering that all dates have / and all times have :
I tried to filter from QUERY, but when trying to "Select Col1, Col2, Col4 Where Col2 is not null" the dates disappear and only the times remain, I tried via REGEXMATCH to separate the dates from the times using / and : but also I was not successful.
I also tried it via IMPORTXML, but some data ends up not being imported correctly on some pages of the site, for IMPORTHTML these errors do not happen. The XML's I used were:
"//tr[#class='no-date-repetition-new' and ..//td[#class='team team-a']] | //tr[#class='no-date-repetition-new live-now' and ..//td[#class='team team-a']]"
"//td[#class='team team-a']/a | //td[#class='team team-a strong']/a"
The current formula is as follows:
=IMPORTHTML("https://int.soccerway.com/national/austria/1-liga/20192020/regular-season/r54328/","table",1)
IMPORTHTML Original:
Expected formats:
---
Rather than filtering what you need is to restructure the imported data.
Anyway, I think that the easier solution to get the final result is to use multiple IMPORTXML formulas.
URL
A1: https://int.soccerway.com/national/austria/1-liga/20192020/regular-season/r54328/
Headers
A2: //table[contains(#class,'matches')]/thead/tr/th
Day
A3: //td[contains(#class,'date')]/parent::tr
Teams and Score
A4: //td[contains(#class,'team-a')]/parent::tr
A6: =transpose(IMPORTXML($A$1,A2))
A7: =IMPORTXML($A$1,A3)
B7: =IMPORTXML(A1,A4)
You might want to replace the formula on A6 by static values in order to place them properly.
You can join 2 queries together (one next to the other) in a single formula, to get your results
={QUERY(IMPORTHTML("https://int.soccerway.com/national/austria/1-liga/20192020/regular-season/r54328/","table",1),
"select Col1 where Col2 is null and not Col1 contains '*'",1),
QUERY(IMPORTHTML("https://int.soccerway.com/national/austria/1-liga/20192020/regular-season/r54328/","table",1),
"select Col1, Col2, Col3, Col4 where Col2 is not null label Col1 'Time'",1)}
How the formula works:
As you notice the data part of both queries is the same in both of them. What is actually different is "what we ask for from the query"
In the first one we use "select Col1 where Col2 is null and not Col1 contains '*'"
In the second one "select Col1, Col2, Col3, Col4 where Col2 is not null label Col1 'Time'"
We create an array by joining them together as in ={1stQUERY,2ndQUERY}

Google sheets - Collecting unique values from several sheets?

I'm in the need of collecting unique values from a specific range in several sheets. Is it possible to combine functions in order to do this?
All sheets look the same regarding column structure.
As of now, my function collects from one sheet and it looks like this:
=unique(filter('Sheet1'!C4:C1000,'Sheet1'!C4:C1000<>""))
This collects unique values from Sheet1 from C4 to C1000 and excludes empty cells. This works awesomely, but I have more sheets that I'd like to merge values from. Any idea?
Basic idea is to combine data, first with help of {}:
= {sheet1C4:C1000;sheet2C4:C1000;sheet3C4:C1000}
The next step is to get rid of empty cells. To do it only once, use query:
= query({sheet1C4:C1000;sheet2C4:C1000;sheet3C4:C1000},
"select Col1 where Col1 <> ''")
And then grab uniques/ The final formula will look like:
= unique (query({sheet1C4:C1000;sheet2C4:C1000;sheet3C4:C1000},
"select Col1 where Col1 <> ''"))
By the way, query string may be shortened to this "where Col1 <> ''" will also work

Resources