I have some data in the following way
Category  [Range 1_min]  [Range 1_max]  [Range 2_min]  [Range 2_max]  ...
A         120            130            ...
B         100            119            131            140            ...
I want to be able to quickly query a number and have it return the category it belongs to, for example 135 belongs to B and 121 belongs to A.
I already have a script that does this, but since there are 1000+ categories, it takes a long time to run. Is there a faster way of doing this?
Thanks.
You can use LOOKUP:
=ArrayFormula(LOOKUP(2,1/((G2>=B2:B)*(G2<=C2:C)+(G2>=D2:D)*(G2<=E2:E)),A2:A))
Addition:
For more ranges you can add MMULT (not sure it's easier):
=ArrayFormula(LOOKUP(1,5/(MMULT(--(K2>={B2:B,D2:D,F2:F,H2:H}),ROW(A1:A4)^0)*MMULT(--(K2<={C2:C,E2:E,G2:G,I2:I}),ROW(A1:A4)^0)),A2:A))
Some conditions:
change the first argument of LOOKUP to 1
for the second LOOKUP argument, change the numerator to 5 (number of columns to compare + 1)
for the second MMULT argument, ROW(A1:A4), match the row count to the number of columns being compared (i.e. for 4 cols -> ROW(A1:A4), for 6 cols -> ROW(A1:A6), etc.)
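Since the question mentions an existing script, here is a minimal sketch of how the same lookup could be done quickly outside of a formula: sort all ranges once by their lower bound, then use a binary search per query. It assumes the ranges never overlap; the `categories` dictionary, `build_index` and `find_category` are hypothetical names used only for illustration.

```python
from bisect import bisect_right

# Hypothetical sample data in the question's layout: category -> list of (min, max) ranges.
categories = {
    "A": [(120, 130)],
    "B": [(100, 119), (131, 140)],
}

def build_index(categories):
    """Flatten all ranges into one list sorted by lower bound (assumes no overlaps)."""
    intervals = sorted(
        (low, high, name)
        for name, ranges in categories.items()
        for low, high in ranges
    )
    starts = [low for low, _, _ in intervals]
    return starts, intervals

def find_category(value, starts, intervals):
    """Return the category whose range contains value, or None if no range does."""
    pos = bisect_right(starts, value) - 1  # last range starting at or before value
    if pos < 0:
        return None
    low, high, name = intervals[pos]
    return name if value <= high else None

starts, intervals = build_index(categories)
print(find_category(135, starts, intervals))  # B
print(find_category(121, starts, intervals))  # A
```

The index is built once; each query is then a binary search instead of a scan over every category.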
I am trying to calculate the weighted average price of the available stock, i.e. 888 items. We operate FIFO, so I need to start summing from the most recent date backwards. How do I select only those cells that sum up to the available stock balance (888) and then use SUMPRODUCT with the price?
Date Items Received Price
9/1/2022 254 $25.00
8/25/2022 242 $25.00
8/18/2022 230 $65.00
8/11/2022 218 $77.00
8/4/2022 206 $45.00
7/28/2022 194 $77.00
7/21/2022 182 $89.00
7/14/2022 737 $74.00
7/7/2022 1292 $86.00
6/30/2022 1847 $87.00
Query, Arrayformula & SUMproduct
You tagged both Excel and Google sheets. They're different. In Excel (Office 365) you can do this using:
=LET(stock,888,
     data,B2:C11,
     items,INDEX(data,,1),
     price,INDEX(data,,2),
     cumulative,SCAN(0,items,LAMBDA(a,b,a+b)),
     r,XMATCH(stock,cumulative,1),
     correction,INDEX(items,r)+stock-INDEX(cumulative,r),
     SUMPRODUCT(
         IFERROR(
             VSTACK(
                 TAKE(items,r-1),
                 correction),
             correction),
         TAKE(price,r)))
stock is the number to sum up to.
data is the range containing both the items and prices.
SCAN is used to get the cumulative sum of all items row-by-row.
XMATCH is used to find the first row (r) in the cumulative sum where the value is greater than or equal to the stock value.
correction adjusts the items in row r to the amount needed to make the cumulative sum up to row r equal to the stock value (items in row r + stock - cumulative sum in row r).
I then take the items in the rows before r, stack the calculated correction value below them, and use the result in a SUMPRODUCT with the prices up to row r.
If r is the first row, TAKE(items,r-1) throws an error; in that case IFERROR makes sure the corrected value is used on its own, without stacking it on previous items values.
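To make the steps above easier to follow, here is a rough Python sketch of the same logic, using the items and prices from the question. The function name `stock_value` is hypothetical, rows are 0-based here (unlike the formula), and the sketch assumes the stock never exceeds the total of all items.

```python
from itertools import accumulate

def stock_value(items, prices, stock):
    """Sum quantity * price over the first rows whose quantities add up to stock."""
    cumulative = list(accumulate(items))  # running total, like SCAN
    # First row whose running total reaches the stock, like XMATCH(stock, cumulative, 1).
    r = next(i for i, total in enumerate(cumulative) if total >= stock)
    # Part of row r that is still needed to reach the stock exactly.
    correction = items[r] + stock - cumulative[r]
    used = items[:r] + [correction]
    # zip stops at the shorter list, so only the prices up to row r are used.
    return sum(q * p for q, p in zip(used, prices))

items = [254, 242, 230, 218, 206, 194, 182, 737, 1292, 1847]
prices = [25, 25, 65, 77, 45, 77, 89, 74, 86, 87]

total = stock_value(items, prices, 888)
print(total)        # 39824
print(total / 888)  # weighted average price
```

With a stock of 888, the first three rows are used in full (726 items) and only 162 of the 218 items in the fourth row are counted.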
Edit: since you mentioned FIFO you'd probably be interested to calculate from the bottom up. In this case you could use:
=LET(stock,888,
     data,SORT(A2:C11,1,1),
     items,INDEX(data,,2),
     price,INDEX(data,,3),
     cumulative,SCAN(0,items,LAMBDA(a,b,a+b)),
     r,XMATCH(stock,cumulative,1),
     correction,INDEX(items,r)+stock-INDEX(cumulative,r),
     SUMPRODUCT(
         IFERROR(
             VSTACK(
                 TAKE(items,r-1),
                 correction),
             correction),
         TAKE(price,r)))
It works the same way; it just includes an extra column (the dates) in the data so that it can be sorted from old (first in) to new.
And it's unclear whether you wanted this SUMPRODUCT or the average of it, but the average is simply a matter of adding /stock to the last argument of LET.
I have a table with a few thousand rows and columns; it looks sort of like this:
ID Distance1 Distance2
1 102 101
2 101 100
3 100 99
4 99 98
5 98 97
...
I would like to select all values/distances in columns B and C that are less than 100 and replace them with the value in column A (their ID number).
All distances above 100 I want to delete. The real table has several thousand columns. How can I do this?
I have tried search and replace, and conditional formatting with a new rule using INDEX + MATCH, but I encounter errors.
Assuming ID is in A1 of Sheet1, copy the headings row into A1 of a new sheet and enter in B2 of that sheet:
=IF(AND(Sheet1!B2<100,Sheet1!B2>0),Sheet1!$A2,"")
Copy across and down to suit, then select the new sheet, Copy, Paste Special, Values over the top.
The above treats 100 as more than 100 and assumes no 0 or lesser values.
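If the table is ever processed outside the spreadsheet, the same rule can be sketched in Python. This is only an illustration with the sample rows from the question; `replace_with_id` is a hypothetical helper, and like the formula it blanks anything that is 100 or more (or 0 or less).

```python
def replace_with_id(rows, limit=100):
    """rows: lists of [id, distance, distance, ...]; returns new rows."""
    result = []
    for row in rows:
        row_id, distances = row[0], row[1:]
        # Keep the ID where 0 < distance < limit, otherwise leave the cell empty.
        result.append([row_id] + [row_id if 0 < d < limit else "" for d in distances])
    return result

rows = [[1, 102, 101], [2, 101, 100], [3, 100, 99], [4, 99, 98], [5, 98, 97]]
for new_row in replace_with_id(rows):
    print(new_row)
```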
I have a Google Sheet with two columns of data. A is monotonically increasing with many duplicates (based on a coarse timestamp), while B is essentially random. There are many empty rows at the bottom waiting for future data. It resembles the following:
A B
1 5 43
2 5 77
3 13 8
4 21 34
5 27 68
6 27 90
7
8
9
10
I'm trying to write a few formulae which examine all of the (non-empty) values in a column except for the last one. For example, I would like to find the maximum value of B excluding the latest value, so the result should be 77 from B2 instead of 90 from B6.
If the values in the range were strictly increasing and unique, I could filter the values of A into C, excluding any values equal to the maximum value (only the last entry), and then take the MAX(..) of that range. However, my data does not have that property; the final value could be duplicated and the duplicates would be inappropriately ignored.
C D E
1 =FILTER(A:A, A:A < MAX(A:A)) =MAX(C:C) This produces A4's 21 instead of A5's 27.
A similar approach would work if we had a third column of incrementing indices to use:
A B C D E
1 5 43 9 =MAX(FILTER(C:C, A:A <> "")) Value of index in last populated row.
2 5 77 10 =MAX(FILTER(A:A, C:C < D1)) Maximum value from a row with lower index.
3 13 8 11
4 21 34 12
5 27 68 13
6 27 90 14
7 15
8 16
9 17
10 18
But I'm looking for a solution that doesn't require modifying the original spreadsheet, because that's not always possible. I can't just create a new IndexSheet with nothing but an index column and join it in like this instead...
A B C
1 5 43 =MAX(FILTER(IndexSheet!A:A, A:A <> ""))
2 5 77 =MAX(FILTER(A:A, IndexSheet!A:A < C1))
...
...because that requires that the IndexSheet have the same number of rows as the data sheet, and would break as more data is added.
Without modifying the original data sheet, or relying on properties of the data (beyond values being numeric and rows being empty or full), is there any way to perform an aggregate calculation on a range while excluding the last value?
You can use the INDIRECT and ADDRESS functions to create a dynamic range that excludes the last row:
=max(indirect("A1:"&Address(count(A:A)-1,1)))
The COUNT function gives the number of non-empty (numeric) cells in column A. You subtract 1 to exclude the last row.
You use that number to build an address using "A1:"&ADDRESS(row number, column number), which in your example should be A1:$A$5.
Pass this string to INDIRECT, i.e. INDIRECT("A1:$A$5"), to turn it into a reference to your cells, and pass that reference to the MAX function to determine the maximum in that range.
From another sheet try:
=MAX(Sheet1!B1:indirect("Sheet1!B"&count(Sheet1!B:B)-1))
We can use the FILTER() and ROW() functions to accomplish this:
D
1 =MAX(FILTER(Data!A:A,
ROW(Data!A:A) < MAX(FILTER(ROW(Data!A:A),
Data!A:A <> ""))))
We use FILTER(ROW(Data!A:A), Data!A:A <> "") to get an array of the row numbers of non-empty rows, and MAX(...) to take the last of them. We then exclude the last row by keeping only values from rows with lower row numbers, using FILTER(Data!A:A, ROW(Data!A:A) < ...). Applying MAX(...) to this filtered array gives the result we were looking for.
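The idea of dropping the last populated row before aggregating can also be sketched in Python, using the sample columns from the question. `max_excluding_last` is a hypothetical helper, and empty cells are represented as empty strings purely for illustration.

```python
def max_excluding_last(column):
    """Maximum of the non-empty values, ignoring the last non-empty one."""
    values = [v for v in column if v != ""]  # drop the empty rows at the bottom
    if len(values) < 2:
        return None  # nothing left once the last value is excluded
    return max(values[:-1])

col_a = [5, 5, 13, 21, 27, 27, "", "", "", ""]
col_b = [43, 77, 8, 34, 68, 90, "", "", "", ""]

print(max_excluding_last(col_b))  # 77
print(max_excluding_last(col_a))  # 27
```

Because the exclusion is by position rather than by value, the duplicated 27 in column A is still found.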
I am using weights when running the data with SPSS custom tables.
Thus it is expected that the column or row values may not add up to the row total, column total or table total due to rounding of decimals.
sample table result:
variable 2
category 1 category 2 Total
variable 1 category 1 45 52 97
category 2 60 56 115
Total 105 107 211
Is there a way to force SPSS to output the correct row, column, or table totals?
expected table output:
variable 2
category 1 category 2 Total
variable 1 category 1 45 52 97
category 2 60 56 116
Total 105 108 213
If you are using the CROSSTABS procedure to produce these figures, then you should do so using the ASIS option.
To be clear: the total displayed by CTABLES is mathematically correct. However, if you want to display as the total the sum of the displayed values in the rows, instead, the only way to do this is by using the STATS TABLE CALC extension command to recompute the totals using the rounded values.
Here is how to do that.
First, you need to create a Python module named customcalc.py with the following contents
def custom(datacells, ncells, roworcol):
    '''Calculate sum of formatted values'''
    total = sum(float(datacells.GetValueAt(roworcol,i)) for i in range(ncells))
    return(total)
This file should be saved in the python\lib\site-packages directory under your Statistics installation or anywhere else that Python can find it.
Then, after your CTABLES command, run this syntax
STATS TABLE CALC SUBTYPE="customtable" PROCESS=PRECEDING
/TARGET custommodule="customcalc"
FORMULA="customcalc.custom(datacells, ncells, roworcol)" DIMENSION=COLUMNS LEVEL = -2 LOCATION="Total"
LABEL="Rounded Count".
That custom function adds up the formatted values in each row instead of the full precision values. If you have suppressed the default statistic name, Count, so that "Total" is the innermost label, use LEVEL=-1 instead of LEVEL=-2 in the syntax above.
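To see why the two kinds of total differ, here is a small standalone Python illustration. The weighted counts 59.6 and 55.7 are hypothetical values chosen so that they display as 60 and 56, as in the second row of the sample table; they are not taken from the actual data.

```python
# Hypothetical weighted cell counts (assumed values, for illustration only).
cells = [59.6, 55.7]

full_precision_total = sum(cells)              # about 115.3
displayed_cells = [round(c) for c in cells]    # what the table shows: 60 and 56
displayed_total = round(full_precision_total)  # total rounded from full precision: 115
sum_of_displayed = sum(displayed_cells)        # total of the rounded cells: 116

print(displayed_cells, displayed_total, sum_of_displayed)
```

The first total is the mathematically correct one; the second is what the custom function above produces by summing the formatted values.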
I'm trying to search through two columns with a given value. For example:
A (values)  B
0-2         275
3-4         285
5-6         295
7-8         305
9-10        330
Now say I have 3 as a given value. I would like to compare it with the ranges in A, so logically it would fall under 3-4 and return 285.
I think VLOOKUP would play a part ... maybe an IF statement.
It may be simpler to change your A values and use a formula like:
=vlookup(D1,A:B,2)
In which case any value greater than 9 would also return 330 (unless, say, an IF clause precludes that).
VLOOKUP without a fourth parameter makes inexact matches (as well as exact ones) and, when the first column of the lookup range is sorted ascending, will choose the match for the highest value that is less than or equal to the search_key.
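The approximate-match behaviour described above can be sketched in Python with a binary search. This assumes the A values have been changed to the lower bound of each range (0, 3, 5, 7, 9), which is one way of reading the suggestion; `approximate_lookup` is a hypothetical name.

```python
from bisect import bisect_right

lower_bounds = [0, 3, 5, 7, 9]        # assumed replacement for column A, sorted ascending
results = [275, 285, 295, 305, 330]   # column B

def approximate_lookup(key, keys, values):
    """Return the value for the largest key that is less than or equal to the search key."""
    pos = bisect_right(keys, key) - 1
    if pos < 0:
        return None  # key is below the first entry; VLOOKUP would give #N/A here
    return values[pos]

print(approximate_lookup(3, lower_bounds, results))   # 285
print(approximate_lookup(42, lower_bounds, results))  # 330
```

As noted, anything above 9 still maps to 330 unless an extra check on the upper bound is added.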
Does this formula work as you want?
=LOOKUP(3,ARRAYFORMULA(VALUE(LEFT(FILTER(A:A,LEN(A:A)),SEARCH("-",FILTER(A:A,LEN(A:A)))-1))),FILTER(B:B,LEN(B:B)))
In addition, if you use 'closed ranges' you can try something like:
=ArrayFormula(VLOOKUP("3", {REGEXEXTRACT(A2:A6, "(\d+)-"), B2:B6}, 2, 1))