Google Sheet: How to determine where I should put arrayformula? - google-sheets

I understand some basic usages of Arrayformula, but when it comes to complex formulas, I often get confused and don't know where to put it.
Products:
ID | Name | init Stock | Current Stock
=====================================
23 | Bag | 24 | What arrayformula should I put in this cell?
43 | Book | 45 | =C3 + SUM(filter('Records'!C2:C,'Records'!A2:A = A3,'Records'!B2:B = "in")) - SUM(filter('Records'!C2:C,'Records'!A2:A = A3,'Records'!B2:B = "out")) //a normal formula
31 | Table | 42 | =ARRAYFORMULA(C2:C + SUM(filter('Records'!C2:C,'Records'!A2:A = A2:A,'Records'!B2:B = "in")) - SUM(filter('Records'!C2:C,'Records'!A2:A = A2:A,'Records'!B2:B = "out")) //This doesn't work
Records:
ID | in/out | quantity
======================
23 | in | 1
43 | in | 34
31 | out | 5
23 | out | 13
23 | in | 14
23 | in | 111
I am using the above tables to track product stock. When a new in/out record is added to the Records table, the value in Current Stock should change accordingly.
My attempt is shown in the table above, but it doesn't work: it returns an error saying that FILTER has mismatched range sizes. I guess I will have to wrap another ARRAYFORMULA around SUM and/or FILTER. This is where the confusion starts.
How do I determine where I should put another arrayformula?
As far as I understand, when inside an arrayformula, some functions that would originally take one value as parameter can take an array as parameter, but some others can't. How do I know which functions have this behavior?

I'm not enough of an expert to explain ARRAYFORMULA thoroughly, but it always gets tricky when you need to use it with formulas that already include ranges. I recommend looking into BYROW and BYCOL: they apply a formula to a whole range, row by row or column by column. Try this:
=BYROW(A2:C,LAMBDA(row,INDEX(row,1,3) + SUM(IFERROR(FILTER('Records'!C2:C,'Records'!A2:A = INDEX(row,1,1),'Records'!B2:B = "in"),0)) - SUM(IFERROR(FILTER('Records'!C2:C,'Records'!A2:A = INDEX(row,1,1),'Records'!B2:B = "out"),0))))
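To make the per-row logic concrete, here is a minimal Python sketch of what the Current Stock column is meant to compute, using the sample Products and Records data from the question. It only illustrates the arithmetic and is not a Sheets formula; the function name current_stock is hypothetical.

```python
# Sample data copied from the Products and Records tables in the question.
products = [(23, "Bag", 24), (43, "Book", 45), (31, "Table", 42)]
records = [
    (23, "in", 1), (43, "in", 34), (31, "out", 5),
    (23, "out", 13), (23, "in", 14), (23, "in", 111),
]


def current_stock(product_id, init_stock, records):
    """Initial stock plus all 'in' quantities minus all 'out' quantities."""
    total_in = sum(q for pid, kind, q in records
                   if pid == product_id and kind == "in")
    total_out = sum(q for pid, kind, q in records
                    if pid == product_id and kind == "out")
    return init_stock + total_in - total_out


# One result per product row: the sum is re-evaluated for every row,
# which is the role BYROW plays in the sheet.
stock_by_name = {name: current_stock(pid, init, records)
                 for pid, name, init in products}
print(stock_by_name)
```

The sum has to be evaluated once per product row. A single SUM wrapped in ARRAYFORMULA still collapses its input to one number, which is why the per-row iteration is needed.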

Related

How to split and sum in google sheets

I am trying to figure out how to split a range in Google Sheets by the "-" delimiter and add up the result. Basically, from the image below, I am trying to split on "-" and add up the ones (i.e. 1 + 1 + 1 + 1 = 4). However, the formula below adds up all of the numbers (i.e. 1 + 5 + 1 + 1 + 1 + 0 + 1 + 3 = 13), which is not what I want.
You are correctly splitting the values into two columns of data, but then summing the entire dataset. You need to restrict the sum to just the column you want (which appears to be the first column). INDEX is probably the best way to do this, as its third parameter specifies which column of a dataset to return.
Summing the first column:
=sum(index(split(B11:B14,"-"),,1))
Summing the second column:
=sum(index(split(B11:B14,"-"),,2))
Showing all columns (same as your ARRAYFORMULA split):
=index(split(B11:B14,"-"))
See sample sheet here.
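The same split-then-pick-a-column idea can be sketched in Python. The cell values below are hypothetical: the original image is not available, so they are inferred from the sums quoted in the question (4 for the first column, 13 overall). The helper sum_column is likewise only illustrative.

```python
# Hypothetical contents of B11:B14, inferred from the sums in the question.
cells = ["1-5", "1-1", "1-0", "1-3"]


def sum_column(cells, column, delimiter="-"):
    """Split each cell on the delimiter and sum one column.

    `column` is 1-based, mirroring the column argument of INDEX.
    """
    return sum(int(cell.split(delimiter)[column - 1]) for cell in cells)


first = sum_column(cells, 1)   # like =sum(index(split(B11:B14,"-"),,1))
second = sum_column(cells, 2)  # like =sum(index(split(B11:B14,"-"),,2))
print(first, second, first + second)
```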

Google Sheets Query Drop Down Filtered Returned Values Not Resetting

Using the following code, I've been able to query a single column properly from my dataset, where B2 and B3 are drop-downs that I use to filter and D1 is just a header (I have multiple queries side by side that use the same filters).
=IFERROR(QUERY('Input Sheet'!$A:$E, "select A where B = '"&$B$2&"' and C = '"&D$1&"' and D ='"&$B$3&"'",0),"")
I've tried looking it up, and when I see other people use the QUERY function, their result area resets when they change their drop-down filters. What I mean is, if the first query returns 4 rows and the second returns 1, only the first row of my results will update.
Updated to include sample data and video. Sorry I can't post directly into thread.
Dataset (date is excluded since it isn't relevant to my query)
Dropdowns
Query Table and Filters
Video
So if my first query returns 4 rows and my next query returns only 1 row, the last 3 rows will not update.
For example, let's say my first query returns the following:
10
19
32
41
Changing a filter will return
23
19
32
41
But in reality, the query should only return 23; the rest of the values are left over from the previous query. None of the videos I've watched have this problem, so none of them address my issue.
If I change my filters to something that should return nothing (no data entries) I get the following:
"" (Empty Cell, Null etc)
19
32
41
My data source is formatted like the below
A B C D
1 2 3 4
w x y z
Any help would be appreciated. Thanks.

Google Sheet: formula to loop through a range

It's not hard to do this with a custom function, but I'm wondering if there is a way to do it using a formula, because the data won't update automatically when using a custom function.
So I have a course list sheet, where each course has a price, and I'm using a Google Form to let users choose which courses they will take. Users are allowed to take multiple courses, so the number they will take is unknown.
Now in the response sheet, I have data like
Order ID | User ID | Courses | Total
===================================
1001 | 38 | courseA, courseC | What formula to put here?
1002 | 44 | courseB, courseC, courseD | What formula to put here?
1003 | 55 | courseE | What formula to put here?
and the course sheet is like
course | Price
==============
A | 23
B | 33
C | 44
D | 23
E | 55
I want to output the total for each order and am looking at using FILTER to do this. Firstly I can get a range of unknown length for the chosen courses
=SPLIT(courses, ",") // having named the Courses column as "courses"
Now I need to filter this range against the course sheet, but I'm not quite sure how to do it, or even whether it is possible. Any hint is appreciated.
try:
=ARRAYFORMULA(IF(A2:A="",,MMULT(IFERROR(
VLOOKUP(SPLIT(C2:C, ", "), {F1&F2:F, G2:G}, 2, 0))*1,
ROW(INDIRECT("1:"&COLUMNS(SPLIT(C2:C, ", "))))^0)))
demo spreadsheet
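As a rough illustration of what the formula above computes, here is a minimal Python sketch using the sample orders and prices from the question. It assumes, as the `F1&F2:F` part of the formula suggests, that the lookup key is the header text "course" joined to the course letter; the names order_total and price_by_name are hypothetical.

```python
# Sample data from the question's course sheet and response sheet.
prices = {"A": 23, "B": 33, "C": 44, "D": 23, "E": 55}
orders = {
    1001: "courseA, courseC",
    1002: "courseB, courseC, courseD",
    1003: "courseE",
}

# Build keys such as "courseA", mirroring F1&F2:F in the formula (an assumption).
price_by_name = {"course" + letter: price for letter, price in prices.items()}


def order_total(courses, price_by_name):
    """Split the comma-separated list, look up each price, and add them up.

    Unknown names count as 0, similar to IFERROR(...)*1 in the formula.
    """
    names = [name.strip() for name in courses.split(",")]
    return sum(price_by_name.get(name, 0) for name in names)


totals = {order_id: order_total(courses, price_by_name)
          for order_id, courses in orders.items()}
print(totals)
```

In the sheet, MMULT with a column of ones plays the role of the per-row sum here: it adds up the looked-up prices across each row.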
As I need time to digest @player0's answer, I am doing this in a more intuitive way for now.
I create 2 sheets to store intermediate values.
The first one is named "chosen_courses":
Order ID | User ID
1001 | =IFERROR(ARRAYFORMULA(TRIM(SPLIT(index(courses,Row(),1),","))),"")
1002 | =IFERROR(ARRAYFORMULA(TRIM(SPLIT(index(courses,Row(),1),","))),"")
1003 | =IFERROR(ARRAYFORMULA(TRIM(SPLIT(index(courses,Row(),1),","))),"")
In this sheet, every row is a horizontal list of the chosen courses. I then created another sheet:
total | course price
=IF(isblank(order_id),"",SUM(B2:2)) | =IFERROR(VLOOKUP('chosen_courses'!B2,{course_Names,course_price},2,false),"")
=IF(isblank(order_id),"",SUM(C2:2)) | =IFERROR(VLOOKUP('chosen_courses'!B2,{course_Names,course_price},2,false),"")
=IF(isblank(order_id),"",SUM(D2:2)) | =IFERROR(VLOOKUP('chosen_courses'!B2,{course_Names,course_price},2,false),"")
course_Names, order_id and course_price are named ranges.
This works well, at least for now.
But there is a problem:
I have 20 courses, so in the 2nd sheet there are 21 columns. I copied the formulas down 1000 rows, because that is the maximum number of rows you can reach using ctrl+shift+↓ and ctrl+D. Now, sometimes when I open the sheet, a progress bar appears while the formulas in this sheet are calculated, which can take around 2 minutes, even though I have only about 5 test orders in the sheet. I am afraid this will get worse when I have more data or when the sheet is opened on old computers.
Is it because I use some resource consuming functions? Can it be improved?

Formula to check if one cell is within a range between two cells

I'm trying to search through two columns with a given value. For example:
A (values) | B
==============
0-2 | 275
3-4 | 285
5-6 | 295
7-8 | 305
9-10 | 330
Now say I have 3 as a given value. I would like to compare it with the ranges in A, so logically it would fall under 3-4 and return 285.
I think Vlookup would take part ... maybe an if statement.
It may be simpler to change your A values (for example, to just the lower bound of each range) and use a formula like:
=vlookup(D1,A:B,2)
In that case, any value greater than 9 would also return 330 (unless, say, an IF clause precludes that).
vlookup without a fourth parameter makes approximate matches (as well as exact ones): when the first column of the lookup range is sorted ascending, it chooses the row with the largest value that is less than or equal to the search_key.
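The approximate-match behaviour described above can be sketched in Python with a binary search. This assumes column A has been changed to hold only the lower bound of each range (0, 3, 5, 7, 9), which is one reading of the suggestion; the function name approx_lookup is hypothetical.

```python
from bisect import bisect_right

# Assumed rewrite of column A: the lower bound of each range, sorted ascending.
lower_bounds = [0, 3, 5, 7, 9]
values = [275, 285, 295, 305, 330]


def approx_lookup(key, lower_bounds, values):
    """Return the value for the largest lower bound that is <= key."""
    i = bisect_right(lower_bounds, key) - 1
    if i < 0:
        # Nothing is <= key; a sheet lookup would report an error here too.
        raise LookupError("key is below the first lower bound")
    return values[i]


print(approx_lookup(3, lower_bounds, values))
print(approx_lookup(12, lower_bounds, values))
```

Anything above the last lower bound falls into the last row, which is why values greater than 9 also return 330 unless they are excluded separately.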
Does this formula work as you want:
=LOOKUP(3,ARRAYFORMULA(VALUE(LEFT(FILTER(A:A,LEN(A:A)),SEARCH("-",FILTER(A:A,LEN(A:A)))-1))),FILTER(B:B,LEN(B:B)))
In addition, if you use 'closed ranges' you can try something like:
=ArrayFormula(VLOOKUP("3", {REGEXEXTRACT(A2:A6, "(\d+)-"), B2:B6}, 2, 1))

Apache Pig: Join records by shifting

I have records of type:
time | url
==========
34 google.com
42 cnn.com
54 yahoo.com
64 fb.com
I want to add another column to these records time_diff which basically takes the difference of the time of the current record with the previous record. Output should look like:
time | url | time_diff
======================
34 google.com -- <can drop this row>
42 cnn.com 08
54 yahoo.com 12
64 fb.com 10
If I can somehow add another column (same as time) shifting the time by one such that 42 is aligned with 34, 54 is aligned with 42 and so on, then I can take the difference between these columns to calculate time_diff column.
I can project the time column to a new variable T and if I can drop the first record in the original data, then I can join it with T to obtain the desired result.
I appreciate any help. Thanks!
See this question, for example. You'll need to get your tuples in a bag (using GROUP ... ALL in your case), and then in a nested FOREACH, ORDER them and call a UDF to rank them. After you have this rank, you can FLATTEN the bag back out into a set of tuples again, and you'll have three fields: time, url, and rank. Once you have this, create a fourth column which is rank-1, do a self-join on those latter two columns, and you'll have what you need to compute the time_diff.
Since multiple records can have the same time, it would be a good idea to also sort on url so that you are guaranteed the same result every time.
I think you can use the "lead" function of PiggyBank. Something like the following might work.
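-- Assumes piggybank.jar is registered and that 'T' holds (time, url) rows.
DEFINE Over org.apache.pig.piggybank.evaluation.Over();
DEFINE Stitch org.apache.pig.piggybank.evaluation.Stitch();
A = LOAD 'T' AS (time:int, url:chararray);
B = GROUP A ALL;
C = FOREACH B {
    -- Sort inside the group so Over sees the records in time order.
    C1 = ORDER A BY time;
    GENERATE FLATTEN(Stitch(C1, Over(C1.time, 'lead')));
}
-- $2 is the column appended by Over, after time ($0) and url ($1).
D = FOREACH C
    GENERATE stitched::time AS time,
             stitched::url AS url,
             stitched::time - $2 AS time_diff;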
https://pig.apache.org/docs/r0.12.0/api/org/apache/pig/piggybank/evaluation/Over.html
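For reference, the shift-and-subtract idea from the question can be sketched outside Pig in a few lines of Python, using the sample records given there. This only illustrates the intended result and is not a Pig implementation; the function name with_time_diff is hypothetical.

```python
# Sample records from the question: (time, url).
records = [(34, "google.com"), (42, "cnn.com"), (54, "yahoo.com"), (64, "fb.com")]


def with_time_diff(records):
    """Pair each record with the previous one and subtract the times.

    Sorting by (time, url) keeps the order deterministic when times tie.
    The first record has no predecessor, so it is dropped.
    """
    ordered = sorted(records)
    return [
        (time, url, time - prev_time)
        for (prev_time, _), (time, url) in zip(ordered, ordered[1:])
    ]


for row in with_time_diff(records):
    print(row)
```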
