Create a query to remove rows with a 0 value - google-sheets

I'm sure this is probably a simple answer, but I can't work out where I'm going wrong. I have a data table that I've copied to a new tab from (ForMaster!A511:G574). It brings in 7 columns of data, the last 2 columns containing numerical values.
Sample Doc
https://docs.google.com/spreadsheets/d/1zcIHvSM1V_rVH8uiRE1ZhQHkptJLLlw9gnumSurZOps/edit?usp=sharing
I've been trying to set up a query that would look at columns F & G and remove rows where there is a zero in both columns. Ultimately, I want rows that have a $value in either to remain. This is a live doc, so if a row initially has a zero, I'd like it to be visible if a value is added at a future date.
I've tried using
=QUERY({ForMaster!A511:G574},"select * where Col6 >=0 or Col7 >=0"), but it doesn't eliminate any rows.
Please help.

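The comparison should be strictly greater than 0 so that rows with a zero in both columns are eliminated. Every non-negative value satisfies >=0, which is why your formula keeps all rows. Because the range is wrapped in {}, the columns must be referenced as Col6 and Col7 rather than F and G:
=QUERY({ForMaster!A511:G574},"select * where Col6>0 or Col7>0")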
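The difference between the two comparisons can be checked outside Sheets. Below is a minimal Python sketch with made-up sample rows (they are not taken from the linked sheet), assuming the values in columns F and G are non-negative numbers:

```python
# Hypothetical rows: (label, value_in_F, value_in_G)
rows = [("a", 0, 0), ("b", 5, 0), ("c", 0, 12), ("d", 3, 4)]

def keep_ge(row):
    # Mirrors "Col6 >= 0 or Col7 >= 0": true for every non-negative row
    return row[1] >= 0 or row[2] >= 0

def keep_gt(row):
    # Mirrors "Col6 > 0 or Col7 > 0": false only when both values are 0
    return row[1] > 0 or row[2] > 0

kept_ge = [r[0] for r in rows if keep_ge(r)]
kept_gt = [r[0] for r in rows if keep_gt(r)]
print(kept_ge)
print(kept_gt)
```

With >= nothing is filtered out, while > drops only the row where both values are zero.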

try:
=QUERY(ForMaster!A511:G574; "where F+G <> 0"; 1)

Related

How to pull data from every third column using QUERY function

I am using the formula below, but the same formula needs to be applied to every third column, i.e. starting from D3:D, then G3:G, J3:J, and so on. What is the best way to pull the data from every third column? (The data is on the second sheet, called Sitemaps.)
Please advise and help, many many thanks much appreciated!
=query({
'Sitemaps'!D3:D1000},
"Select * where Col1 is not null ")
Adding the sheet link maybe that will be more helpful to understand the situation, "AllURLs" needs to pull all links from Sitemaps into one list
https://docs.google.com/spreadsheets/d/1AWGfA7cHmF3Q2kiX1xkQcoec6H5EPiHUXaiWENMzZkA/edit?usp=sharing
use:
=QUERY({INDIRECT("Sitemaps!"&
ADDRESS(3, (COLUMN($D1)-1)*COLUMN(A1)+1)&":"&
ADDRESS(1000, (COLUMN($D1)-1)*COLUMN(A1)+1))},
"where Col1 is not null")
and drag to the right
update:
use in B3:
=INDEX(IFERROR(REGEXEXTRACT(C3:C,"^(?:https?:\/\/)?(?:www\.)?([^\/]+)")))
use in C3:
=QUERY(FLATTEN(FILTER(IFERROR(Sitemaps!D3:1000), MOD(COLUMN(Sitemaps!D1:1)-1, 3)=0)),
"where Col1 is not null")
Try this:
=FILTER(FILTER(Sitemaps!D3:J,MOD(COLUMN(Sitemaps!D3:J)-4,3)=0),Sitemaps!D3:D<>"")
Just replace :J with whichever column is further to the right in your data set.
This one formula should produce all results, assuming that any rows that have data in Column D also have data in that row of every other included column, and that rows that are null in Column D are also null in that row of every other included column.
MOD is the modulo function. It returns the remainder left after dividing one number by another. For instance, MOD(7,3) returns 1, because 3 goes into 7 twice (making 6) with 1 left over. That leftover portion is what MOD returns.
We can apply this to your column numbers, since the ones you want to retrieve are evenly spaced three apart. We just need to start at a baseline of zero. Since Column D has a column number of 4, we can "zero out" that baseline by subtracting 4 from every column number. Only those columns that then are evenly divisible by 3 (i.e., those that, after subtracting 4, have a modulus of 0) are returned.
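The same column arithmetic can be sketched in Python. This is only an illustration of the MOD logic, not something to paste into Sheets; the helper names `column_number` and `every_third` are made up for this example:

```python
def column_number(letters):
    """Convert a column label such as 'D' or 'AA' to its 1-based number."""
    n = 0
    for ch in letters.upper():
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n

def every_third(labels, start="D"):
    """Keep the columns whose distance from the start column is a multiple of 3."""
    base = column_number(start)  # 4 for column D, the baseline that gets "zeroed out"
    return [c for c in labels if (column_number(c) - base) % 3 == 0]

labels = ["D", "E", "F", "G", "H", "I", "J"]
selected = every_third(labels)
print(selected)
```

Subtracting the start column's number first is what lets the test be a plain "remainder equals 0" check.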

Why my ArrayFormula is giving error? How do I correct it? (I'm not looking for another Arrayformula as solutions!)

I wanted an ArrayFormula at C1 which gives the required result as shown.
Entry sheet:
(Column C is my required column)
Date Entered is the date when the Name is Assigned a group i.e. a, b, c, d, e, f
Criteria:
The value of count is purely on basis of Date Entered (if john is assigned a on lowest date(10-Jun) then count value is 1, if rose is assigned a on 2nd lowest date(17-Jun) then count value is 2).
The value of count does not change even when the data is sorted in any manner because Date Entered column values is always permanent & does not change.
A new entry's date could be any date, not necessarily the highest one (if a new entry with the name Rydu is assigned a on 9-Jun, then its count value will become 1, john's (10-Jun) will become 2, and so on).
Example:
After I sort the data in any random order say like this:
Random ordered sheet:
(Count value remains permanent)
And when I do New entries in between (Row 4th & 14th) and after last row (Row 17th):
Random Ordered sheet:
(Doesn't matter where I do)
I already have an ArrayFormula which gives the required result:
={"AF Formula1"; ArrayFormula(IF(B2:B="", "", COUNTIFS(B$2:B, "="&B2:B, D$2:D, "<"&D2:D)+1))}
I'm not looking for another Arrayformula as solutions. What I want is to know what is wrong in my ArrayFormula? and how do I correct it?
I tried to figure my own ArrayFormula but it's not working:
I got Formula for each cell:
=RANK($D2,FILTER($D$2:$D, $B$2:$B=$B2),1)
I figured out Filter doesn't work with ArrayFormula so I had to take a different approach.
I took help from my previous question answer (Arrayformula at H3) which was similar since in both cases each cell FILTER formula returns more than 1 value. (It was actually answered by player0)
Using the same technique I came up with this Formula which works absolutely fine :
=RANK($D2, ARRAYFORMULA(TRANSPOSE(SPLIT(VLOOKUP($B2, SUBSTITUTE(TRIM(SPLIT(FLATTEN(QUERY(QUERY({$B:$B&"×", $D:$D}, "SELECT MAX(Col2) WHERE Col2 IS NOT NULL GROUP BY Col2 PIVOT Col1", 1),, 9^9)), "×")), " ", ","), 2, 0), ","))), 1)
Now when I tried converting it to ArrayFormula:
($D2 to $D2:$D & $B2 to $B2:$B)
=ARRAYFORMULA(RANK($D2:$D,TRANSPOSE(SPLIT(VLOOKUP($B2:$B, SUBSTITUTE(TRIM(SPLIT(FLATTEN(QUERY(QUERY({$B:$B&"×", $D:$D}, "SELECT MAX(Col2) WHERE Col2 IS NOT NULL GROUP BY Col2 PIVOT Col1", 1),, 9^9)), "×")), " ", ","), 2, 0), ",")), 1))
It gives me an error "Did not find value '' in VLOOKUP evaluation", I figured out that the problem is only in VLOOKUP when I change $B2 to $B2:$B.
I'm sure VLOOKUP works with ArrayFormula, I fail to understand where my formula is going wrong! Please help me correct my ArrayFormula.
Here is the editable sheet link
if I understand correctly, you are trying to "rank" B column based on D column dates in such way that dates are in theoretical ascending order so if you randomize your dataset, the "rank" of each entry would stay same and not change based on the randomness you introduce.
therefore the correct formula would be:
={"fx"; INDEX(IFNA(VLOOKUP(B2:B&D2:D,
{INDEX(SORT({B2:B&D2:D, D2:D}, 2, 1),,1),
IFERROR(1/(1/COUNTIFS(
INDEX(SORT(B2:D, 3, 1),,1),
INDEX(SORT(B2:D, 3, 1),,1), ROW(B2:B), "<="&ROW(B2:B))))}, 2, 0)))}
{"fx"; ...} array of 2 tables (header & actual table) under each other eg. ;
outer shorter INDEX or longer ARRAYFORMULA (doesnt matter which one) is needed coz we are processing an array
IFNA for removing possible #N/A errors from VLOOKUP function when VLOOKUP fails to find a match
we VLOOKUP joint B and D column B2:B&D2:D in our virtual table {} and returning second 2 column if there is an exact match 0
our virtual table {INDEX(SORT({B2:B&D2:D, D2:D}, 2, 1),,1), ...} we VLOOKUP from is constructed with 2 columns next to each other eg. ,
we are getting the first column by creating an array of 2 columns {B2:B&D2:D, D2:D} next to each other where we SORT this array by date/2nd column 2, in ascending order 1 but all we need after sorting is the 1st column so we use INDEX where we bring all rows ,, and the first column 1
now lets take a look on how we getting the 2nd column of our virtual table by using COUNTIFS which will mimic the "rank"
IFERROR(1/(1/ is used to remove all zero values from the output (all empty rows would have 0 in it as the "rank")
under COUNTIFS we put 2 pairs of arguments: "if column is equal to column" and "if row is larger or equal to next row increment it by 1" ROW(B2:B), "<="&ROW(B2:B))
for "if column is equal to column" we do this twice and use range B2:D and sort it by date/3rd column 3 in ascending order 1 and of this we again need only the 1st column so we INDEX it and return all rows ,, and first column 1
with this formula you can add, remove or randomize your dataset and you will always get the right value for each of your rows
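the counting rule itself (same group, earlier date, plus 1) can be sketched in Python to show why the result does not depend on row order. this is an illustration only, not a translation of the formula above; the year 2021 and the "amy" row are made-up assumptions, while the john, rose and Rydu dates come from the question:

```python
from datetime import date

def group_counts(entries):
    """entries: list of (name, group, date_entered).
    Returns the count for each entry, in input order."""
    counts = []
    for _, group, entered in entries:
        # Number of entries in the same group with a strictly earlier date
        earlier = sum(1 for _, g, d in entries if g == group and d < entered)
        counts.append(earlier + 1)
    return counts

entries = [
    ("john", "a", date(2021, 6, 10)),
    ("rose", "a", date(2021, 6, 17)),
    ("amy", "b", date(2021, 6, 12)),   # hypothetical row in another group
]
before = group_counts(entries)

# Adding Rydu on 9-Jun pushes john and rose down by one
after = group_counts(entries + [("Rydu", "a", date(2021, 6, 9))])

# Re-ordering the rows does not change any entry's count
shuffled = group_counts(list(reversed(entries)))
print(before, after, shuffled)
```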
as for why your formula doesnt work... to not get #N/A error for vlookup you would need to define the end row of the range but still, the result wont be as you would expect coz formula is not the right one for this job.
as mentioned there are functions that are not supported under AF like SUM,AND,OR and then there are also functions which work but in a different way like IFS or with some limitations like SPLIT,GOOGLEFINANCE,etc.
I have answered you on the tab in your shared sheet called My Practice thusly:
You cannot split a two column array as you have attempted to do in cell CI2. That is why your formula does not work. You can only split a ONE column array.
I understand you are trying to learn, but attempting to use complicated formulas like that is going to make it harder I'm afraid.

ARRAY formula to find last row to contain value in Google Sheets

I have a Google Sheet that is populated automatically via a Zapier integration. For each new row added, I need to evaluate a given cell (Shipper Name), find the last instance of that Shipper Name in the prior rows, and, if one exists, return the row number of that last entry.
Example Data Sheet
I am trying to create a formula that simply looks at name in new row and returns the number of the most recent row with that name.
Formula needs to run as an Array formula so that the data auto populates with each new row added to the Sheet.
I have tried to use this formula, but when refactored as Array formula, it doesn't populate new values for new rows, it just repeats the first value for all rows.
From Column J:
=sumproduct(max(row(A$1:A3)*(F4=F$1:F3)))
I need this formula refactored to be an Array formula that auto populates all the cells below it.
I have tried this version, but it doesn't work:
=ArrayFormula(IF(ISBLANK($A2:$A),"",sumproduct(max(row(A$1:A3)*($F4:$F=F$1:F3))))
A script (custom function maybe?) would be better.
Solution 1
Below is a formula you can place into the header (put it in J1 and remove everything below).
It works much faster than the second solution and has no N² size restriction. Also it works with empty shippers (& "♥" is for those empty ones): as long as A:A column has some value it will not be ignored.
={
"Row of Last Entry";
ARRAYFORMULA(
IF(
A2:A = "",
"",
VLOOKUP(
ROW(F2:F)
+ VLOOKUP(
F2:F & "♥",
{
UNIQUE(F2:F & "♥"),
SEQUENCE(ROWS(UNIQUE(F2:F)))
* POWER(10, INT(LOG10(ROWS(F:F))) + 1)
},
2,
0
),
SORT(
{
ROW(F2:F) + 1
+ VLOOKUP(
F2:F & "♥",
{
UNIQUE(F2:F & "♥"),
SEQUENCE(ROWS(UNIQUE(F2:F)))
* POWER(10, INT(LOG10(ROWS(F:F))) + 1)
},
2,
0
),
ROW(F2:F);
{
SEQUENCE(ROWS(UNIQUE(F2:F)))
* POWER(10, INT(LOG10(ROWS(F:F))) + 1),
SEQUENCE(ROWS(UNIQUE(F2:F)), 1, 0, 0)
}
},
1,
1
),
2,
1
)
)
)
}
Details on how it works
For every row we use VLOOKUP to search for a special number in a sorted virtual range to get the row number of the previous entry matching current.
A special number for a row is constructed like this: we get a sequential number for the current entry among unique entries and append to it current row number.
The right part (row number) of the resulting special numbers must be aligned between them. If the entry has sequential number 13 and the row number is 1234 and there are 100500 rows, then the number must be 13001234. 001234 is the aligned right part.
Alignment is done by multiplying a sequential number by 10 to the power of (INT(LOG10(total number of rows)) + 1), which gives us 13000000 (from the example above). This approach is used to avoid using LEN and TEXT: working with numbers is faster than working with strings.
Virtual range has almost the same special numbers in the first column and original row numbers in the second.
Almost the same special numbers: they just increased by 1, so VLOOKUP will stop at most one step before the number corresponding to the current string.
Also virtual range has some special rows (added at the bottom before sorting) which have all 0's as the right part of their special numbers (1st column) and 0 for the row number (2nd column). That is done so VLOOKUP will find it for the first occurrence of the entry.
Virtual range is sorted, so we can set the is_sorted parameter of the outer VLOOKUP to 1: that returns the last match that is less than or equal to the number being looked for.
& "♥" are appended to the entries, so that empty entries also will be found by VLOOKUP.
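The special-number lookup described above can be sketched in Python, with `bisect_right` standing in for the sorted VLOOKUP. This is a minimal sketch of the idea rather than a translation of the formula: the function name `previous_rows` and the sample names are made up, and the scale is computed from the last row plus one (an assumption) so that the incremented keys can never spill into the next entry's block.

```python
from bisect import bisect_right

def previous_rows(shippers, first_row=2):
    """For each entry, return the sheet row of the previous entry
    with the same name, or 0 if there is none."""
    last_row = first_row + len(shippers) - 1
    # Power of ten larger than any (row + 1), used to align the row part
    scale = 10 ** len(str(last_row + 1))

    # Sequential number for each unique name, in order of first appearance
    seq = {}
    for name in shippers:
        seq.setdefault(name, len(seq) + 1)

    # Virtual range: one sentinel (row part 0 -> row 0) per unique name ...
    table = [(n * scale, 0) for n in seq.values()]
    # ... plus (special number + 1, row) for every data row
    for offset, name in enumerate(shippers):
        row = first_row + offset
        table.append((seq[name] * scale + row + 1, row))
    table.sort()
    keys = [k for k, _ in table]

    result = []
    for offset, name in enumerate(shippers):
        key = seq[name] * scale + first_row + offset
        # Last key <= the special number, like VLOOKUP with is_sorted = 1
        idx = bisect_right(keys, key) - 1
        result.append(table[idx][1])
    return result

rows = previous_rows(["Ann", "Bob", "Ann", "Ann", "Bob"])
print(rows)
```

Because every stored key is one larger than its row's own special number, the lookup for a row can never land on that row itself, only on an earlier row of the same name or on the sentinel.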
Solution 2 - slow and has restrictions
But for a small enough number of rows this formula works (put it in J1 and remove everything below):
={
"Row of Last Entry";
ARRAYFORMULA(
REGEXEXTRACT(
TRANSPOSE(QUERY(TRANSPOSE(
IF(
(FILTER(ROW(F2:F), F2:F <> "") > TRANSPOSE(FILTER(ROW(F2:F), F2:F <> "")))
* (FILTER(F2:F, F2:F <> "") = TRANSPOSE(FILTER(F2:F, F2:F <> ""))),
TRANSPOSE(FILTER(ROW(F2:F), F2:F <> "")),
""
)
), "", ROWS(FILTER(F2:F, F2:F <> "")))),
"(\d*)\s*$"
)
)
}
But there is a problem. The virtual range inside of the formula is of size N², where N is the number of rows. For current 1253 rows it works. But there is a limit after which it will throw an error of a range being too large.
That is the reason to use FILTER(...) and not just F2:F.
Here is a significantly simpler way to get at the information you're interested in. (I think.) I'm mostly guessing about what you want because your question wasn't really about what you want, but rather about how to get something that you think would help you get what you want. This is an example of an XY problem. I attempted to guess based on experience at what you're really after.
This editable sheet contains just 3 formulas. 2 on the raw data sheet and one in a new tab called "analysis."
The first formula on the Raw data tab extracts a properly formatted timestamp using a combination of MMULT and SPLIT functions and looks like this:
=ARRAYFORMULA({"Good Timestamp";IF(A2:A="",,MMULT(N(IFERROR(SPLIT(A2:A,"T"))),{1;1}))})
The second formula finds the previous timestamp for that Shipper and subtracts it from the current timestamp, thereby giving you the time between timestamps. However, it only does this if the gap is less than 200 minutes; if it is more than 200 minutes, it assumes that was a different shift for that shipper. It uses a combination of LOOKUP() and SUBSTITUTE() to make sure it's pulling the correct timestamps. You can change the 200 value to something more appropriate if that makes sense for your data. It looks like this:
=ARRAYFORMULA({"Minutes/Order";IF(A2:A="",,IF(IFERROR((G2:G-1*SUBSTITUTE(LOOKUP(F2:F&G2:G-0.00001,SORT(F2:F&G2:G)),F2:F,""))*24*60)>200,,IFERROR((G2:G-1*SUBSTITUTE(LOOKUP(F2:F&G2:G-0.00001,SORT(F2:F&G2:G)),F2:F,""))*(24*60))))})
The third formula, on the tab called analysis uses query to show the average minutes per order and the number of orders per hour that each shipper is processing. It looks like this:
=QUERY({'Sample Data'!F:I},"Select Col1,AVG(Col3),COUNT(Col3)/(SUM(Col3)/60) where Col3 is not null group by Col1 label COUNT(Col3)/(SUM(Col3)/60)'Orders/ hour',AVG(Col3)'Minutes/ Order'")
Hopefully I've guessed correctly at your real goals. Always do your best to explain what they are rather than asking for only a small portion that you think will help you get to the answer. You can end up overcomplicating your process without realizing it.

Is there a way to use an array formula to keep my formula (to add specific columns) when rows are inserted?

I'm setting up a spreadsheet that has specific columns summed in each row, but I need the formula to be included when a row is inserted.
The current formula also includes a statement to make a 0 value, if a check box is checked in the last column:
=IF(T2=FALSE, SUM(I2,K2,L2,M2,N2,O2), 0)
Is there a way I can do this using an array formula?
Here is a formula which will give a sum for columns I to O in each row, ignoring column J:
=ArrayFormula(if(I2:I="","",if(T2:T<>FALSE,0,I2:I+sumif(row(K2:O)+0*column(K2:O),row(K2:O),K2:O))))
but this assumes all rows that have data will have a number in column I.
If this isn't the case, you could go on to test columns individually like this:
=ArrayFormula(if((I2:I="")*(K2:K=""),"",if(T2:T<>FALSE,0,I2:I+sumif(row(K2:O)+0*column(K2:O),row(K2:O),K2:O))))
and so on up to column O if necessary, or maybe column T is always completed and you could test that - it depends how your data actually looks.
Note 1
row(K2:O)+0*column(K2:O)
is necessary to generate an array which has the same dimensions as K2:O, as required by SUMIF.
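What that SUMIF does can be sketched in Python: build a grid of row numbers with the same shape as the data block, then, for each row number, add up every cell whose grid entry matches it. This is an illustration with made-up numbers; `row_sums` is a hypothetical helper, and `block` stands in for K2:O.

```python
def row_sums(block, first_row=2):
    """block: list of equally long rows of numbers (stand-in for K2:O)."""
    n_rows, n_cols = len(block), len(block[0])
    # Same shape as the block, each cell holding its row number:
    # the equivalent of row(K2:O)+0*column(K2:O)
    row_ids = [[first_row + r] * n_cols for r in range(n_rows)]
    # One criterion per row: the equivalent of row(K2:O)
    criteria = [first_row + r for r in range(n_rows)]

    sums = []
    for crit in criteria:
        total = 0
        for r in range(n_rows):
            for c in range(n_cols):
                if row_ids[r][c] == crit:
                    total += block[r][c]
        sums.append(total)
    return sums

sums = row_sums([[1, 2, 3, 4, 5], [10, 0, 0, 0, 1]])
print(sums)
```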
Note 2
There's also the MMULT approach to getting the row sums as demonstrated here
={"AAA"; ARRAYFORMULA(IF(LEN(T2:T), IF(T2:T=FALSE, I2:I+K2:K+L2:L+M2:M+N2:N+O2:O, 0), ))}

Google Sheets - Query - Running Total below dynamic results

Testing Sheet:
Wondering if there is a witty way to add a Total to the last row +1 of a Query result.
See Sheet 'Lookup' for a static example of what I am asking for.
I don't know if there is a way to have a hidden column that calculates transposed only under the last row of a query, or if there is a smart way to work Query for this answer.
All great answers. Each one is very useful in its use case.
Макс Махров gets the answer with using a query statement.
Now I was not keen on having an extra sheet to hold the totals so I added a row at the top which I can simply hide and used this formula:
query({Orders!A:E;A1:E1},"select Col1, Col3, Col4 where Col2 = '"&C3&"' order by Col4",1)
Only problem I have is trying to figure out how to add TEXT to the bottom row, it seems to only want numerical input.
How do I fix this? What am I glitching?
Thanks !
Mars
The trick is to make second query and count totals for selected product.
Plan of actions:
add new sheet with query on it, something like this: =QUERY(Orders!A:E,"select B, 0, sum(D) where B like '"&Lookup!C2&"' Group by B",0)
Prepare an arrayformula which stacks the two data sets on the Lookup sheet: = ArrayFormula({Importrange(1);Importrange(2)}) Note that the number of columns must stay the same.
Edit query so it takes Col1, Col2, Col3... instead of A, B, C...
Make word 'total' visible instead of zero. Set number format: 0;0;total Set it for range B9:B on Lookup sheet
Make Conditional Formatting with formula =and($B4 =0,isnumber($B4)) for range A4:C on Lookup sheet.
That should complete the task.
Hope it Helps!
Your Example
Working example.
Here is one way:
Put TOTAL way down in row 1000
Select the range A3:C999. Select data > filter to create filters
Select C3, set the filter to hide all blanks
A second way is to limit the query result to show only the top 8 results:
Change your query to =query(Orders!A:E, "select A, C, D where B = '"&C2&"' order by D desc limit 8",1) It will reverse-order column D (largest first), and set row limit to 8.
Change the formula of your TOTAL to =sumif(Orders!B:B,C2,Orders!D:D)
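The shape of the result being aimed for (the filtered rows followed by a single total row) can be sketched in Python. The order data below is invented for illustration, and `lookup_with_total` is a hypothetical helper; the tuples stand in for columns A to E of the Orders sheet.

```python
def lookup_with_total(orders, product, limit=8):
    """orders: list of 5-tuples standing in for columns A..E.
    Returns up to `limit` (A, C, D) rows for the product, largest D first,
    followed by a total row summing D over all of that product's rows."""
    matching = [(o[0], o[2], o[3]) for o in orders if o[1] == product]
    top = sorted(matching, key=lambda r: r[2], reverse=True)[:limit]
    # Like SUMIF on the Orders sheet: the total covers every matching
    # row, not only the rows that are displayed
    total = sum(r[2] for r in matching)
    return top + [("TOTAL", "", total)]

# Hypothetical orders: (id, product, customer, amount, note)
orders = [
    (1, "apple", "X", 30, ""),
    (2, "pear", "Y", 12, ""),
    (3, "apple", "Z", 45, ""),
]
result = lookup_with_total(orders, "apple")
print(result)
```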
Try this formula in the column adjacent to your query:
=ArrayFormula({$C$4:offset($C$4,count($C$4:$C),0,1,1);sum($C$4:offset($C$4,count($C$4:$C),0,1,1))})
It duplicates your column of values (I haven't figured out a way around that yet) and then adds a total to the bottom of that column, and changes dynamically with the range from your query.
Here's a working version.
Interesting challenge! It got the old grey matter turning... ;)
Thanks,
Ben
