I'm trying to build out a spreadsheet formula that will allow me to take one list of numbers and evenly distribute them into another list of numbers. Attaching an example below.
I'm sure there's a way to automate this, but after extensive research online I can't figure out the right combination of existing formulas to make it work. I would greatly appreciate any resources or tips to point me in the right direction. I'm currently using Google Sheets.
Example spreadsheet: https://docs.google.com/spreadsheets/d/1JgFKXGJ2-eQEXAGtqu64p_Zw7gY2fVpEtuCFDvMb9YE/edit#gid=1179066278
List 1:
1000
500
List #2:
300
600
200
100
200
100
Desired result:
List 1 value -> List 2 values that add up to List 1 value
1000 -> 300, 600, 100
500 -> 200, 100, 200
try this partial solution:
=ARRAYFORMULA(TRANSPOSE(QUERY({ FILTER(A:A, A:A<=C3, A:A<>""), MMULT(
TRANSPOSE((ROW(INDIRECT("B1:B"&COUNTA(FILTER(A:A, A:A<=C3, A:A<>""))))<=
TRANSPOSE( ROW(INDIRECT("B1:B"&COUNTA(FILTER(A:A, A:A<=C3, A:A<>""))))))*
FILTER(A:A, A:A<=C3, A:A<>"")), SIGN(FILTER(A:A, A:A<=C3, A:A<>"")))},
"select Col1 where Col2 <="&C3)))
side note: it returns the values whose running total either hits the target exactly (as you requested) or, when no exact match exists, stays below it (one value short)
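To make the behaviour of the formula above easier to follow, here is a minimal Python sketch of the same logic, not a Sheets formula. It assumes column A holds List 2 and C3 holds one List 1 value; the function name and sample data are made up for illustration.

```python
from itertools import accumulate

def values_within_target(values, target):
    """Keep the values whose running total stays at or below the target."""
    # FILTER(A:A, A:A<=C3, A:A<>""): drop values larger than the target
    candidates = [v for v in values if v <= target]
    # MMULT(...) part: running total of the remaining values
    running = list(accumulate(candidates))
    # QUERY(... "select Col1 where Col2 <= C3")
    return [v for v, total in zip(candidates, running) if total <= target]

list_2 = [300, 600, 200, 100, 200, 100]
print(values_within_target(list_2, 1000))  # [300, 600]
```

With the sample lists the result for 1000 is 300, 600 (total 900), which is the "one value short" case from the side note: the formula walks the list in order and does not search for a combination that hits the target exactly.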
I often use query functions to pull the top 5 (or bottom 5) values of a column. My basic formulas usually look like this:
=QUERY(A2:N32, "SELECT A,N ORDER BY N DESC LIMIT 5")
but this time, it's grabbing the 1st row (N103SY, 34.7) as part of the query even though 34.7 does not come close to being within the top 5 values of all 31 possible values. The output IS correct starting with (N136SY, 62.0), so why the extra row at the top when it's not a part of the query?
N103SY 34.7
N136SY 62.0
N139SY 43.6
N127SY 43.3
N124SY 43.2
N119SY 41.0
Open doc (editable)...
https://docs.google.com/spreadsheets/d/1Oq1GvbsHdxpPM1wZ2HAXSjzYeA7dmT1Blq-raLilvbQ/edit#gid=735538815
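QUERY guesses the number of header rows when the third argument is omitted, and here it treats the first row as a header. Pass an empty third argument so that no row is treated as a header: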
=QUERY(A2:N32, "SELECT A,N ORDER BY N DESC LIMIT 5", )
or upgrade to:
=SORTN({A2:A32, N2:N32}, 5, 0, 2, 0)
I use a Google Spreadsheet to keep track of the accounts payable per vendor. There is a sheet per vendor in the Spreadsheet. A simplified sheet looks like this:
When I receive a new invoice, an entry for the amount is made in the Credit column and when I release a payment, an entry for the amount is made in the Debit column. I keep track of the running total in the AC Payable column. I achieve this by using a formula in each cell of the AC Payable column (the example below is from cell E4):
=IF(
ISNUMBER(INDIRECT(ADDRESS(ROW()-1,COLUMN()))),
INDIRECT(ADDRESS(ROW()-1,COLUMN()))+C4-D4,
C4-D4
)
The logic is simple. The running total for row n is calculated by:
AC Payable(n - 1) + Credit(n) - Debit(n)
This setup works fine, except I have to drag the formula into newly added rows. Is there a way to achieve this by using ARRAYFORMULA?
PS: I have found a solution using:
= ARRAYFORMULA(
SUMIF(
ROW(C3:C),
"<="&ROW(C3:C),
C3:C)
-
SUMIF(
ROW(D3:D),
"<="&ROW(D3:D),
D3:D
)
)
I feel this is suboptimal (the original sheet dates back to 2018 and has a lot of rows) because, for every row, it recalculates the totals of the Credit and Debit columns up to the current row and then subtracts the Debit total from the Credit total.
I am expecting a solution that would take advantage of the running total available in the previous row and not redo the whole calculation per row.
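To illustrate the difference between the two approaches described above, here is a small Python sketch (not a Sheets formula). The column values and function names are hypothetical; the first function reuses the previous row's total, the second redoes both sums for every row the way the SUMIF version does.

```python
from itertools import accumulate

credits = [100, 0, 250, 0]  # hypothetical Credit column
debits = [0, 40, 0, 60]     # hypothetical Debit column

def running_incremental(credits, debits):
    """AC Payable(n) = AC Payable(n - 1) + Credit(n) - Debit(n)."""
    return list(accumulate(c - d for c, d in zip(credits, debits)))

def running_resummed(credits, debits):
    """Re-sum both columns from the top for every row (SUMIF style)."""
    return [sum(credits[:i + 1]) - sum(debits[:i + 1])
            for i in range(len(credits))]

print(running_incremental(credits, debits))  # [100, 60, 310, 250]
print(running_resummed(credits, debits))     # [100, 60, 310, 250]
```

Both give the same totals; the second one does work that grows with the square of the row count, which is the cost the question is trying to avoid.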
solution for up to 1581 rows:
=ARRAYFORMULA(QUERY(QUERY(MMULT(TRANSPOSE((SEQUENCE(COUNTA(A3:A)*2)<=
SEQUENCE(1, COUNTA(A3:A)*2))*FLATTEN(INDIRECT("C3:D"&COUNTA(A3:A)+ROW(A3)-1)*{1, -1})),
SEQUENCE(COUNTA(A3:A)*2, 1, 1, 0)), "offset 1", ), "skipping 2", ))
skills:
it's fast
it's smart
gets slower the more rows you add
dies after 1581 rows
it's based on standard MMULT Running/Cumulative Total/Sum formula:
=ARRAYFORMULA(MMULT(TRANSPOSE((ROW(B1:B6)
<=TRANSPOSE(ROW(B1:B6)))*B1:B6), SIGN(B1:B6)))
but with a modification twist, because you got 2 columns to total
instead of ROW(B1:B6) we use a sequence of count of real data multiplied by two (because you got 2 columns):
SEQUENCE(COUNTA(A3:A)*2)
instead of TRANSPOSE(ROW(B1:B6)) we use again:
SEQUENCE(1, COUNTA(A3:A)*2)
combination of these pieces:
=ARRAYFORMULA(TRANSPOSE((SEQUENCE(COUNTA(A3:A)*2)<=SEQUENCE(1, COUNTA(A3:A)*2))))
will produce a matrix like:
That is why it dies with lots of rows: you might think that with only 1500 rows in two columns the formula works on just 1500*2=3000 virtual cells, but in fact the MMULT formula processes (1500*2)*(1500*2)=9000000 virtual cells. Still, it's worth noting that this MMULT fx is great when deployed on a small scale.
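As a rough illustration of the matrix described above, this Python sketch (not a Sheets formula) builds the same 0/1 mask and counts its cells. The function names are invented for the example.

```python
def mask_matrix(n):
    """Cell (r, c) is 1 when c <= r, mirroring
    TRANSPOSE(SEQUENCE(n) <= SEQUENCE(1, n))."""
    return [[1 if c <= r else 0 for c in range(1, n + 1)]
            for r in range(1, n + 1)]

def cell_count(data_rows, columns=2):
    """Number of virtual cells in the square mask."""
    side = data_rows * columns
    return side * side

for row in mask_matrix(3):
    print(row)
# [1, 0, 0]
# [1, 1, 0]
# [1, 1, 1]

print(cell_count(3))     # 36
print(cell_count(1500))  # 9000000
```

Each row of the mask selects "everything up to and including this position", so multiplying by the values and summing across gives the running total, at the price of a square matrix.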
next, instead of *B1:B6 we use:
*FLATTEN(INDIRECT("C3:D"&COUNTA(A3:A)+ROW(A3)-1)*{1, -1}))
i.e., with INDIRECT we take only the "valid" part of C3:D, which in your example sheet is just C3:D5. We multiply column C by 1 and column D by -1 to simulate subtraction, and then we FLATTEN both columns into one single column. The +ROW(A3)-1 part is just an offset because your data starts at row 3.
and the last part of standard RT fx - SIGN(B1:B6) is replaced with one column full of ones:
SEQUENCE(COUNTA(A3:A)*2, 1, 1, 0)
then we offset the output by 1 with the inner QUERY, because we are interested in the totals after subtraction, and finally we use skipping 2, which keeps only every second value. Again, we are interested in the totals after subtraction of the D column.
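The flatten, running total, offset and skipping steps described above can be sketched in Python as follows (an illustration of the logic, not a Sheets formula; the sample values and function name are hypothetical).

```python
from itertools import accumulate

def ac_payable(credits, debits):
    # FLATTEN(C3:D * {1, -1}): credit, -debit, credit, -debit, ...
    flat = [x for c, d in zip(credits, debits) for x in (c, -d)]
    # Running total over the flattened column (the MMULT part)
    totals = list(accumulate(flat))
    # "offset 1" drops the first value, "skipping 2" keeps every second
    # one, which leaves the totals taken right after each debit
    return totals[1::2]

print(ac_payable([100, 0, 250], [0, 40, 60]))  # [100, 60, 250]
```

The values that get dropped are the intermediate totals after a credit but before the debit of the same row.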
solution for more than 1581 rows:
=ARRAYFORMULA(
SUMIF(SEQUENCE(COUNTA(A3:A)), "<="&SEQUENCE(COUNTA(A3:A)), INDIRECT("C3:C"&COUNTA(A3:A)))-
SUMIF(SEQUENCE(COUNTA(A3:A)), "<="&SEQUENCE(COUNTA(A3:A)), INDIRECT("D3:D"&COUNTA(A3:A))))
skills:
supports more rows
looks less smart
sadly the third argument of SUMIF always needs to be a range
gets slower with more rows
it will get sick if you feed it with 10000 rows
it may kill off your sheet with 11000+ rows
Here's a modification of Ben Collins' running total formula:
=ARRAYFORMULA(
IF(ISBLANK(A2:A),,
MMULT(TRANSPOSE((ROW(C2:C)<=TRANSPOSE(ROW(C2:C)))*C2:C),SIGN(C2:C))-
MMULT(TRANSPOSE((ROW(D2:D)<=TRANSPOSE(ROW(D2:D)))*D2:D),SIGN(D2:D))))
yet another alternative to MMULT:
=INDEX(QUERY(FLATTEN(QUERY(QUERY(TRANSPOSE(QUERY(QUERY(TRANSPOSE(
(SEQUENCE(COUNTA(A3:A)*2)<=SEQUENCE(1, COUNTA(A3:A)*2))*
FLATTEN(INDIRECT("C3:D"&COUNTA(A3:A)+ROW(A3)-1)*{1, -1})),
"offset 1", ), "skipping 2", )), "select "&QUERY(
"sum(Col"&SEQUENCE(COUNTA(A3:A))&"),",, 9^9)&"' '"),
"offset 1", )), "where Col1 is not null", ))
but again, the <= comparison matrix runs into the 10M-cell limit, so you can't use more than 1581 rows in your case, or 3162 rows in the standard cumulative sum case
(1581 rows * 2 columns) raised to the 2nd power < 10 million cells
(1581*2)^2 = 9998244
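The row limits quoted above follow from simple arithmetic, sketched here in Python. The 10 million figure is taken from the answer as given; the function name is made up.

```python
import math

CELL_LIMIT = 10_000_000  # limit quoted in the answer, taken on trust

def max_rows(columns, limit=CELL_LIMIT):
    """Largest row count whose (rows * columns)^2 mask stays below the limit."""
    side = math.isqrt(limit - 1)  # largest side with side * side < limit
    return side // columns

print(max_rows(1))     # 3162
print(max_rows(2))     # 1581
print((1581 * 2) ** 2) # 9998244
```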
I play an air combat simulator, Digital Combat Simulator (DCS), with a group for fun. I was asked to create a Google spreadsheet to keep track of a bunch of statistics and answer questions like: What kills us the most? What weapon are we the most accurate with? What weapon are we the least accurate with? And so on.
What I have and tried so far.
My issue right now is getting the spreadsheet to count all the misses entered for a certain person (in this case, Schmidt in aircraft 103), figure out which type of miss occurs the most (in his case, "Energy Defeated"), and then display it in the column labelled "missile Defeat mode". This is specific to a person, so that's an additional condition that I am having trouble programming for.
Thank you for your time and help.
https://docs.google.com/spreadsheets/d/1Cz1o06slDFuOCYnp8qlzF4icpscgEblHuJff7HAOXnY/edit?usp=sharing
here is a dummy spreadsheet if you want to look into the details.
Because you are matching the Pilot column, you need to use QUERY statements to gather the dataset, then ARRAY_CONSTRAIN to limit to what you want to display.
Formula
=iferror(
ARRAY_CONSTRAIN(
query(
query(
{$A$21:$A$33,$L$21:$L$33},
"select Col2, count(Col2) where Col1='"&A4&"'
group by Col2 order by count(Col2) desc limit 1"
,)
,"offset 1",)
,1,1)
,"")
Breaking this down:
The first thing that happens is you run a query against the Pilot and Outcome columns (A21:A33,L21:L33) based on the Pilot (A4) to find the outcomes (Col2), count each, then sort them in descending order so that the most common is listed first. Limit the return to just the top reason. This will produce a 2x2 array of results.
| | count |
| ---------------- | ----- |
| Sensor Defeated | 2 |
So we run the second query to remove the first row:
| Sensor Defeated | 2 |
Then, Array_Constrain lets us lop off the second column (containing the count of times that outcome was found), leaving just the most-found outcome:
Sensor Defeated
It's all wrapped up in a nice iferror statement so you can have blanks if the pilot isn't found in the query range.
Just copy that formula into each cell in G2 - G16 and you're good to go.
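For readers who find the nested QUERY hard to follow, here is a hedged Python sketch of the same steps (filter by pilot, count each outcome, keep the most frequent one, return blank if the pilot is missing). The rows and names are invented sample data, not taken from the sheet.

```python
from collections import Counter

def most_common_outcome(rows, pilot):
    """rows is a list of (pilot, outcome) pairs, like columns A and L."""
    # where Col1 = pilot, group by Col2 with count(Col2)
    counts = Counter(outcome for name, outcome in rows
                     if name == pilot and outcome)
    if not counts:
        return ""  # the IFERROR(..., "") fallback
    # order by count desc limit 1, then keep only the label
    return counts.most_common(1)[0][0]

rows = [
    ("Schmidt", "Energy Defeated"),
    ("Schmidt", "Sensor Defeated"),
    ("Schmidt", "Energy Defeated"),
    ("Other", "Sensor Defeated"),
]
print(most_common_outcome(rows, "Schmidt"))  # Energy Defeated
print(repr(most_common_outcome(rows, "Nobody")))  # ''
```

When two outcomes are tied, this sketch returns the one seen first; the formula may break ties differently.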
Example
Here is the formula in action on a copy of your sheet.
I am using the formula below, but the same formula needs to be applied to every third column, i.e. starting from D3:D, then G3:G, J3:J, and so on. What is the best way to pull the data from every third column? (The data is on the second sheet, called Sitemaps.)
Please advise and help, many many thanks much appreciated!
=query({
'Sitemaps'!D3:D1000},
"Select * where Col1 is not null ")
I'm adding the sheet link in case it helps to understand the situation. "AllURLs" needs to pull all links from Sitemaps into one list:
https://docs.google.com/spreadsheets/d/1AWGfA7cHmF3Q2kiX1xkQcoec6H5EPiHUXaiWENMzZkA/edit?usp=sharing
use:
=QUERY({INDIRECT("Sitemaps!"&
ADDRESS(3, (COLUMN($D1)-1)*COLUMN(A1)+1)&":"&
ADDRESS(1000, (COLUMN($D1)-1)*COLUMN(A1)+1))},
"where Col1 is not null")
and drag to the right
update:
use in B3:
=INDEX(IFERROR(REGEXEXTRACT(C3:C,"^(?:https?:\/\/)?(?:www\.)?([^\/]+)")))
use in C3:
=QUERY(FLATTEN(FILTER(IFERROR(Sitemaps!D3:1000), MOD(COLUMN(Sitemaps!D1:1)-1, 3)=0)),
"where Col1 is not null")
Try this:
=FILTER(FILTER(Sitemaps!D3:J,MOD(COLUMN(Sitemaps!D3:J)-4,3)=0),Sitemaps!D3:D<>"")
Just replace :J with whichever column is further to the right in your data set.
This one formula should produce all results, assuming that any rows that have data in Column D also have data in that row of every other included column, and that rows that are null in Column D are also null in that row of every other included column.
MOD is the modulo function. It returns whatever is left after dividing one number by another. For instance, MOD(7,3) returns 1, because 3 goes into 7 twice (making 6) with 1 left over. The leftover portion is the remainder.
We can apply this to your column numbers, since the ones you want to retrieve are evenly spaced three apart. We just need to start at a baseline of zero. Since Column D has a column number of 4, we can "zero out" that baseline by subtracting 4 from every column number. Only those columns that are then evenly divisible by 3 (i.e., those that, after subtracting 4, have a remainder of 0) are returned.
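The column selection described above can be checked with a short Python sketch (illustration only; the function names are hypothetical and the letter helper only covers columns A to Z).

```python
def every_third_from(start, last, step=3):
    """Column numbers where (column - start) leaves no remainder when divided by step."""
    return [c for c in range(start, last + 1) if (c - start) % step == 0]

def column_letter(n):
    """Column number to letter; valid for 1..26 only."""
    return chr(ord("A") + n - 1)

print(7 % 3)  # 1
selected = every_third_from(4, 10)  # columns D..J
print(selected)                              # [4, 7, 10]
print([column_letter(c) for c in selected])  # ['D', 'G', 'J']
```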
I'm in a bit of a pickle with a sheet I've been working on and am looking for some clarification. For some reason the old account I've had for years is gone :(, I apologize.
I'm trying to get a min value for each row I've added data to. There are 2 columns where I need to convert the data first, and then I need to find the lowest value across all 3 columns for each row.
I've tried multiple things and only entering the formula I've created for each row works perfectly.
For Instance:
=MIN( IF(E124 > 0, E124*$E$6), IF(F124 > 0, F124*$F$6), IF(G124 > 0, G124) )
I've tried to use other examples, however, I am not familiar with QUERY. Trying to do simple calculations (adding) within the formula is confusing. Example I've tried using:
=QUERY(TRANSPOSE(QUERY(TRANSPOSE(A1:C),
"select "&REGEXREPLACE(JOIN( , ARRAYFORMULA(IF(LEN(A1:A&B1:B&C1:C),
"min(Col"&ROW(A1:A)-ROW(A1)+1&"),", ""))), ".\z", "")&"")),
"select Col2")
That has multiple problems when I add it. I want to ignore empty cells and text such as headers. It will not write over text, and it does not execute (it gives me an error about overwriting values).
I've also tried writing an =arrayformula, but it does not handle calculating the min value. It does do the calculations for the rows.
=ArrayFormula(IF(ISBLANK({E8:E;F8:8;G8:8}), "", added my formula here))
Below is something I've worked on for hours. I believe the problem is the way the ranges are selected inside the MIN function:
=arrayformula(IF(LEN(E8:G)<>0, MIN( IF(E8:E > 0, E8:E*$E$6), IF(F8:F > 0, F8:F*$F$6), IF(G8:G > 0, G124) ),)
If there's a way to do this, I'd really appreciate some help
LINK: a viewable version of a sample I made as my actual sheet is over 500 lines long.
https://docs.google.com/spreadsheets/d/133LJHY3s45ZyxWq0PWew1KikbyNE4MTt-wOeWHBrZY0/edit?usp=sharing
try (which works for all rows till bottom):
=ARRAYFORMULA(TEXT(SUBSTITUTE(QUERY(TRANSPOSE(QUERY(TRANSPOSE(
IF(E3:G<>"", {E3:E*M5, F3:F*N5, G3:G}, 999^99)),
"select "&TEXTJOIN(",", 1,
"min(Col"&ROW(A3:A)-ROW(A3)+1&")")&"")),
"select Col2", 0), 999^99, ), "$#,###.00"))
In cell O3 give this a try
=ArrayFormula(TO_DOLLARS(index(transpose(query(transpose(E3:G18*{M5, N5, 1}),"select "&join("),","max(Col"&row(indirect("A3:A18"))-2)&")")),,2)))
and see if that delivers the expected output?
In case you have more than one value per row in columns E:G, you could try
=ArrayFormula(TO_DOLLARS(index(transpose(query(transpose(if(ISNUMBER(E3:G18), E3:G18, 99^99)*{M5, N5, 1}),"select "&join("),","min(Col"&row(indirect("A3:A18"))-2)&")")),,2)))
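Both min-based formulas above rely on the same trick: blanks are replaced with a huge placeholder so they can never win the minimum, and the placeholder is hidden again afterwards. A minimal Python sketch of that idea follows (not a Sheets formula). The sample rows and multipliers are invented, and infinity stands in for 999^99.

```python
SENTINEL = float("inf")  # stands in for the 999^99 placeholder

def row_minimums(rows, multipliers):
    """Per-row minimum of the converted values, ignoring blanks (None)."""
    result = []
    for row in rows:
        converted = [v * m if v is not None else SENTINEL
                     for v, m in zip(row, multipliers)]
        lowest = min(converted)
        # A row with no values keeps only the placeholder: show it as blank
        result.append("" if lowest == SENTINEL else lowest)
    return result

rows = [(10, None, 25), (None, None, None), (4, 2, None)]
print(row_minimums(rows, (2, 3, 1)))  # [20, '', 6]
```

The multipliers play the role of the conversion cells (M5, N5, and 1 for the column that needs no conversion).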