Shuffle a 2D list by columns in Dart

I am trying to shuffle a multidimensional list in Dart by columns after creating it. The idea is that when list 1 is reordered, the values at the same indexes in list 2 keep their correspondence. For example, for index 27 of both lists, index 0 is 6 | 6, which corresponds to index 27 of list 1, and it is 12.
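One way to keep the two lists in correspondence is to shuffle a list of column indexes once and then read every row through that same order. The sketch below shows the idea in Python rather than Dart; the function name `shuffle_columns` and the sample data are made up for illustration. In Dart the same approach should carry over by shuffling a list of indexes with `List.shuffle` and rebuilding each row from it.

```python
import random


def shuffle_columns(rows, rng=None):
    """Return a copy of `rows` with the same column permutation applied to every row."""
    rng = rng or random.Random()
    if not rows:
        return []
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all rows must have the same length")
    # Shuffle the column positions once, then reuse that order for each row.
    order = list(range(width))
    rng.shuffle(order)
    return [[row[i] for i in order] for row in rows]


# Hypothetical data: list2[i] belongs with list1[i].
list1 = [1, 2, 3, 4, 5]
list2 = ["a", "b", "c", "d", "e"]
shuffled = shuffle_columns([list1, list2], random.Random(0))
```

Because both rows go through the same index order, each value of list 2 still sits under the value of list 1 it started with.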

Related

Product co-purchasing or bundles given a product ID

I have Amazon co-purchasing data in two columns whose values are product IDs. I would like to select values in the ranges 100-299, 300-399, 400-999, and all other values, and group them. I want to create bundles (co-purchases) between products in one group and products in another, e.g. 100-299 with 300-399, or 400-999 with 100-299. The original data has two columns, FromNode and ToNode. Below are a few lines of the original data. Some values (product IDs) appear in both columns.
FromNode ToNode
0 1
0 2
0 3
0 4
0 5
1 0
1 2
1 4
1 5
1 15
2 0
2 11
2 13
2 14
3 65
3 66
3 67
I am using
df[df[['FromNode', 'ToNode']].isin([100,101,102...299]).any(1)]
to pick the values in the range, but it seems I have to list every value in the isin argument. Is there an efficient way to pass just the range 100-299 to isin to fetch the values? Should I just combine both columns into one and use iloc to select the values? Any tips will help.
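A minimal sketch of the grouping step, written in plain Python so that it runs without pandas. The bucket labels, the helper `bucket_of`, and most of the sample edges are hypothetical; only the ranges come from the question. In pandas, `isin` accepts any iterable, so `range(100, 300)` can be passed in place of the hand-written list.

```python
from collections import defaultdict

# Ranges from the question; range(100, 300) covers 100..299 inclusive.
BUCKETS = {
    "100-299": range(100, 300),
    "300-399": range(300, 400),
    "400-999": range(400, 1000),
}


def bucket_of(product_id):
    """Return the label of the range that contains product_id, or 'other'."""
    for label, ids in BUCKETS.items():
        if product_id in ids:  # membership in a range is a constant-time check
            return label
    return "other"


# Hypothetical (FromNode, ToNode) pairs, including a few in the ranges of interest.
edges = [(0, 1), (1, 15), (120, 310), (450, 250), (305, 101), (3, 65)]

# Group each co-purchase by the pair of buckets its two products fall into.
pairs_by_bucket = defaultdict(list)
for from_node, to_node in edges:
    key = (bucket_of(from_node), bucket_of(to_node))
    pairs_by_bucket[key].append((from_node, to_node))
```

Looking up `pairs_by_bucket[("100-299", "300-399")]` then yields the co-purchases that link those two groups.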

ARRAYFORMULA with repetition

I have two columns of data, and would like to distribute the elements of one of these columns over several rows. I can easily calculate the index of the element I need, but cannot figure out how to access the element.
Formula for index: =ARRAYFORMULA(IF(A:A,CEILING(ROW(A:A)/3+1),""))

A   B    Desired output   Index
1   11   22               2
2   22   22               2
3   33   22               2
4   44   33               3
5        33               3
6        33               3
7        44               4
How can I modify my formula for the index so that it yields the item of column B at the calculated index?
I tried =ARRAYFORMULA(IF(A:A, INDEX(B:B, CEILING(ROW(A:A)/3+1), 1), "")) but that only repeats the first element (22) 7 times.
Use Vlookup instead of Index:
=ARRAYFORMULA(IF(A:A,vlookup(CEILING(ROW(A:A)/3+1),A:B,2),""))
EDIT
It isn't necessary to use a key column, you could use something like this:
=ARRAYFORMULA(vlookup(CEILING(sequence(counta(B:B)*3)/3+1),{row(B:B),B:B},2))
assuming you wanted to generate three rows for each non-blank row in column B not counting the first one.
Or if you want to be different, use a concatenate/split approach:
=ArrayFormula(flatten(split(rept(filter(B:B,B:B<>"",row(B:B)>1)&"|",3),"|")))
(all the above assume you want to ignore the first row in col B and start with 22).
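To see what the index calculation is doing, here is a small Python sketch of the same lookup outside the spreadsheet. The helper name `index_for_row` is hypothetical, and it uses integer arithmetic that matches CEILING(ROW()/3+1) for 1-based row numbers.

```python
def index_for_row(row, repeat=3):
    """Mirror CEILING(ROW()/3+1) for 1-based rows using integer arithmetic."""
    return (row + repeat - 1) // repeat + 1


# Columns A and B from the question.
column_a = [1, 2, 3, 4, 5, 6, 7]
column_b = [11, 22, 33, 44]

# For each populated row of A, pick the element of B at the computed 1-based index.
output = [column_b[index_for_row(row) - 1] for row in range(1, len(column_a) + 1)]
```

Each element of B after the first is repeated three times until column A runs out of rows, which matches the desired output column.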

Excluding the last value in a range from an aggregate calculation in Google Sheets

I have a Google Sheet with two columns of data. A is monotonically increasing with many duplicates (based on a coarse timestamp), while B is essentially random. There are many empty rows at the bottom waiting for future data. It resembles the following:
     A    B
 1   5    43
 2   5    77
 3   13   8
 4   21   34
 5   27   68
 6   27   90
 7
 8
 9
10
I'm trying to write a few formulae which examine all of the (non-empty) values in a column except for the last one. For example, I would like to find the maximum value of B excluding the latest value, so the result should be 77 from B2 instead of 90 from B6.
If the values in the range were strictly increasing and unique, I could filter the values of A into C, excluding any values equal to the maximum value (only the last entry), and then take the MAX(..) of that range. However, my data does not have that property; the final value could be duplicated and the duplicates would be inappropriately ignored.
     C                               D           E
 1   =FILTER(A:A, A:A < MAX(A:A))    =MAX(C:C)   This produces A4's 21 instead of A5's 27.
A similar approach would work if we had a third column of incrementing indices to use:
     A    B    C    D                              E
 1   5    43   9    =MAX(FILTER(C:C, A:A <> ""))   Value of index in last populated row.
 2   5    77   10   =MAX(FILTER(A:A, C:C < D1))    Maximum value from a row with lower index.
 3   13   8    11
 4   21   34   12
 5   27   68   13
 6   27   90   14
 7             15
 8             16
 9             17
10             18
But I'm looking for a solution that doesn't require modifying the original spreadsheet, because that's not always possible. I can't just create a new IndexSheet with nothing but an index column and join it in like this instead...
     A    B    C
 1   5    43   =MAX(FILTER(IndexSheet!A:A, A:A <> ""))
 2   5    77   =MAX(FILTER(A:A, IndexSheet!A:A < C1))
...
...because that requires that the IndexSheet have the same number of rows as the data sheet, and would break as more data is added.
Without modifying the original data sheet, or relying on properties of the data (beyond values being numeric and rows being empty or full), is there any way to perform an aggregate calculation on a range while excluding the last value?
You can use the INDIRECT and ADDRESS functions to build a dynamic range that excludes the last row:
=max(indirect("A1:"&Address(count(A:A)-1,1)))
COUNT returns the number of numeric cells in column A, which equals the last populated row number when the data starts in row 1 and has no gaps. Subtract 1 to exclude the last row.
Use that number to build an address with "A1:"&ADDRESS(row, column), which in your example gives A1:$A$5.
Pass this string to INDIRECT, as in indirect("A1:$A$5"), to turn it into a reference, and pass that reference to MAX to find the maximum in that range.
From another sheet try:
=MAX(Sheet1!B1:indirect("Sheet1!B"&count(Sheet1!B:B)-1))
We can use the FILTER() and ROW() functions to accomplish this:
D
1 =MAX(FILTER(Data!A:A,
ROW(Data!A:A) < MAX(FILTER(ROW(Data!A:A),
Data!A:A <> ""))))
We use FILTER(ROW(Data!A:A), Data!A:A <> "") to get an array of the row numbers of non-empty rows, and MAX(...) to take the last of those row numbers. We then exclude the last row by keeping only values from lower row numbers with FILTER(Data!A:A, ROW(Data!A:A) < ...). Applying MAX(...) to this filtered array gives the result we were looking for.
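The FILTER and ROW approach can be sketched outside the spreadsheet as follows. This is a minimal Python illustration, assuming empty cells are represented as `None`; the function name `max_excluding_last` is hypothetical.

```python
def max_excluding_last(values):
    """Max of the non-empty values that come before the last non-empty one."""
    # Positions of populated cells, like FILTER(ROW(A:A), A:A <> "").
    populated = [i for i, v in enumerate(values) if v is not None]
    if len(populated) < 2:
        return None  # nothing is left once the last value is dropped
    last = populated[-1]  # like MAX(...) over the populated row numbers
    # Keep only values from earlier rows, then aggregate.
    return max(values[i] for i in populated if i < last)


# Columns A and B from the question, with trailing empty rows.
column_a = [5, 5, 13, 21, 27, 27, None, None, None, None]
column_b = [43, 77, 8, 34, 68, 90, None, None, None, None]
```

Selecting by position rather than by value is what keeps a duplicated final value, such as the two 27s in column A, from being dropped along with the last row.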

SPSS: Inconsistent totals due to rounding of numbers

I am using weights when running the data with SPSS custom tables.
Thus it is expected that the column or row values may not add up to the row total, column total, or table total due to rounding of decimals.
Sample table result:
                         variable 2
                         category 1   category 2   Total
variable 1   category 1  45           52           97
             category 2  60           56           115
             Total       105          107          211
Is there a way to force SPSS to output the correct row, column, or table totals?
Expected table output:
                         variable 2
                         category 1   category 2   Total
variable 1   category 1  45           52           97
             category 2  60           56           116
             Total       105          108          213
If you are using the CROSSTABS procedure to produce these figures, then you should use the ASIS option.
To be clear: the total displayed by CTABLES is mathematically correct. However, if you instead want the total to be the sum of the displayed (rounded) values in the rows, the only way to do this is to use the STATS TABLE CALC extension command to recompute the totals from the rounded values.
Here is how to do that.
First, you need to create a Python module named customcalc.py with the following contents:
def custom(datacells, ncells, roworcol):
    '''Calculate sum of formatted values'''
    # Read each cell as displayed in the table and add the values up.
    total = sum(float(datacells.GetValueAt(roworcol, i)) for i in range(ncells))
    return total
This file should be saved in the python\lib\site-packages directory under your Statistics installation or anywhere else that Python can find it.
Then, after your CTABLES command, run this syntax
STATS TABLE CALC SUBTYPE="customtable" PROCESS=PRECEDING
/TARGET custommodule="customcalc"
FORMULA="customcalc.custom(datacells, ncells, roworcol)" DIMENSION=COLUMNS LEVEL = -2 LOCATION="Total"
LABEL="Rounded Count".
That custom function adds up the formatted values in each row instead of the full-precision values. If you have suppressed the default statistic name, Count, so that "Total" is the innermost label, use LEVEL=-1 instead of the LEVEL=-2 shown above.
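The difference between the two totals can be reproduced with a few lines of Python. The weighted counts below are hypothetical values chosen for illustration, not figures taken from the table in the question.

```python
# Hypothetical weighted cell counts for one table row.
weighted_cells = [59.6, 55.8]

# What the table displays: each cell rounded on its own.
displayed_cells = [round(v) for v in weighted_cells]

# Total computed from full-precision values, then rounded for display.
full_precision_total = round(sum(weighted_cells))

# Total recomputed from the displayed values, as the custom function does.
sum_of_displayed = sum(displayed_cells)
```

Here the cells display as 60 and 56, the full-precision total rounds to 115, and the sum of the displayed cells is 116, the same kind of one-unit gap seen in the sample table.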

How do I sum horizontally across a row based on 1st column value?

I've used the search but haven't found much on this. Essentially I would like to do a SUMIF style action on a dataset but it only grabs the first adjacent value. My table would be something like:
KT 4 5 9
AM 3 7 8
IA 2 5 12
In the rows below I would have:
KT | =Sumif(A1:E3,A8,B1:E3) which returns 4
AM | =Sumif(A1:E3,A9,B1:E3) which returns 3
IA | =Sumif(A1:E3,A10,B1:E3) which returns 2
Now I know I could just add a column with a total and use vlookup(array, value, index), but that is not what I want to do (although I may do so if this is too big a pain).
Any thoughts/ideas? Demo here
Try using INDEX and MATCH to get VLOOKUP-like behavior:
=SUM(INDEX($B$1:$E$3, MATCH(A8, $A$1:$A$3, 0), 0))
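INDEX($B$1:$E$3, MATCH(A8, $A$1:$A$3, 0), 0) returns the entire row of $B$1:$E$3 at the position where $A$1:$A$3 matches A8. A column argument of 0 tells INDEX to return every column in that row, and SUM then adds those values.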
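The INDEX/MATCH lookup amounts to "find the row whose first cell equals the key, then sum the rest of that row". A minimal Python sketch of that logic, using the table from the question and a hypothetical helper named `sum_row_for`:

```python
# The table from the question: a label followed by the values to sum.
table = [
    ["KT", 4, 5, 9],
    ["AM", 3, 7, 8],
    ["IA", 2, 5, 12],
]


def sum_row_for(key, rows):
    """Sum every value in the first row whose label equals key."""
    for label, *values in rows:
        if label == key:  # exact match, like MATCH(..., 0)
            return sum(values)
    raise KeyError(key)
```

For example, `sum_row_for("KT", table)` adds 4, 5 and 9, rather than stopping at the first adjacent value as the SUMIF attempt did.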
