How to read decimals in SPSS syntax?

I want to read the following data in SPSS:
ID Age Sex GPA
----------------
1 17 M 5
2 16 F 5
3 17 F 4.75
4 18 M 5
5 19 M 4.5
My attempt:
DATA LIST / ID 1 AGE 2-3 SEX 4(A) GPA 5-8.
BEGIN DATA
117M5
216F5
317F4.75
418M5
519M4.5
END DATA.
LIST.
But the output is
ID AGE SEX GPA
---------------
1 17 M 5
2 16 F 5
3 17 F 5
4 18 M 5
5 19 M 5
How can I get the decimals?

Your data is being read as expected; the GPA variable simply has a display format with no decimal places. Set the format to show two decimals (a width of 4 leaves room for the digit, the decimal point, and the two decimals):
FORMATS GPA (F4.2).

Alternatively, you can read GPA with two implied decimal places and enter the values without a decimal point:
DATA LIST / ID 1 AGE 2-3 SEX 4(A) GPA 5-7(F,2).
BEGIN DATA
117M500
317F475
END DATA.
LIST.
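
To make the column layout concrete, here is a rough Python sketch of what a fixed-column specification like the one above describes: ID in column 1, AGE in columns 2-3, SEX in column 4, and GPA in columns 5-7 with two implied decimals. The helper name parse_record is hypothetical, and the rule that an explicit decimal point in the field takes precedence over the implied decimals is my understanding of the F format, not something stated in this thread.

```python
def parse_record(line):
    """Parse one fixed-column record (hypothetical helper, not SPSS itself)."""
    gpa_field = line[4:7]  # columns 5-7
    if "." in gpa_field:
        # Assumption: an explicit decimal point is taken literally.
        gpa = float(gpa_field)
    else:
        # Two implied decimals: "475" is read as 4.75.
        gpa = int(gpa_field) / 100
    return {
        "ID": int(line[0:1]),    # column 1
        "AGE": int(line[1:3]),   # columns 2-3
        "SEX": line[3:4],        # column 4, read as a string
        "GPA": gpa,
    }


rows = [parse_record(record) for record in ["117M500", "317F475"]]
for row in rows:
    print(row)
```

The point of the sketch is only that the stored value already carries the decimals; whether they are displayed is a separate formatting question.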

Google Sheets: Convert Horizontal Transaction Data into Chronological Statement + Combining Columns of Data

On a sheet named, "Performance," I have data concerning stock trades in a row like so:
   A       B                C                  D       E        F           G         H          I            J
1  TICKER  TRADE OPEN DATE  TRADE CLOSED DATE  SHARES  AVG BUY  INVESTMENT  AVG SALE  PROCEEDS   PROFIT/LOSS  ROIC:
2  ABC     01/05/22         03/31/22           107     $14.22   -$1,521.54  $15.00    $1,605.00  $83.46       5.49%
3  BCA     01/05/22         03/31/22           344     $14.52   -$4,994.88  $15.00    $5,160.00  $165.12      3.31%
4  CAB     01/05/22         03/31/22           526     $12.55   -$6,601.30  $13.00    $6,838.00  $236.70      3.59%
... and so forth ...
Within the same workbook but on a separate sheet named, "Contributions/Withdrawals," I have a list of contributions and withdrawals like so:
A B
1 DATE AMOUNT
2 01/05/22 $700.00
3 02/05/22 $700.00
4 03/05/22 $400.00
5 03/15/22 -$7,000.00
... and so forth ...
I need to convert the first table of trade transactions into a vertical column format exactly like what is in the Contributions/Withdrawals table. (Note that each trade transaction actually represents two transactions, one for opening with its own date, and one for closing with its date.) Finally, I need to stack both tables of transactions in date order to make a combined chronological list of transactions so that I can run an XIRR formula on it.
The resulting table on a sheet named, "Cash Flows," needs to look like this:
A B
1 DATE AMOUNT
2 01/05/22 -$1,521.54
3 01/05/22 -$4,994.88
4 01/05/22 -$6,601.30
5 01/05/22 $700.00
6 02/05/22 $700.00
7 03/05/22 $700.00
8 03/10/22 $400.00
9 03/15/22 -$7,000.00
10 03/31/22 $1,605.00
11 03/31/22 $5,160.00
12 03/31/22 $6,838.00
Using the following in cell A2 and B2...
A2 =SORT({Performance!$B$2:$B;Performance!$C$2:$C;'Contributions/Withdrawals'!$A$2:$A})
B2 =SORT({Performance!$F$2:$F;Performance!$H$2:$H;'Contributions/Withdrawals'!$B$2:$B})
...almost gets me there, but the transactions are not lining up with the correct dates. Google Sheets is ordering the amounts from smallest to largest. What I end up with is this:
A B
1 DATE AMOUNT
2 01/05/22 -$7,000.00
3 01/05/22 -$6,602.72
4 01/05/22 -$6,602.39
5 01/05/22 -$6,601.30
6 01/05/22 -$6,596.40
7 01/05/22 -$6,587.10
8 01/05/22 -$4,994.88
9 01/05/22 -$3,315.26
10 01/05/22 -$3,284.91
11 01/05/22 -$1,521.54
12 02/05/22 $400.00
13 03/05/22 $700.00
14 03/10/22 $700.00
15 03/15/22 $700.00
16 03/31/22 $1,605.00
17 03/31/22 $3,249.00
18 03/31/22 $3,731.00
19 03/31/22 $5,160.00
20 03/31/22 $6,348.00
21 03/31/22 $6,532.00
22 03/31/22 $6,786.00
23 03/31/22 $6,838.00
Any help would be appreciated. Thanks!
You are very close. Join the date ranges and the amount ranges into a single two-column array, so that each amount stays on the same row as its date, and then sort by the first column:
=SORT({Performance!$B$2:$B;Performance!$C$2:$C;'Contributions/Withdrawals'!$A$2:$A,Performance!$F$2:$F;Performance!$H$2:$H;'Contributions/Withdrawals'!$B$2:$B})
(Depending on your locale settings, you may need to replace the single comma in that formula with a backslash.)
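
The difference between sorting the two columns separately and sorting them as pairs can be sketched in Python. The rows are taken from the sample tables in the question; the variable names are illustrative only, and the sketch sorts by date alone, which may order same-day rows differently from Google Sheets.

```python
from datetime import date

# (date, amount) pairs: trade opens, trade closes, then contributions/withdrawals.
opens = [(date(2022, 1, 5), -1521.54), (date(2022, 1, 5), -4994.88),
         (date(2022, 1, 5), -6601.30)]
closes = [(date(2022, 3, 31), 1605.00), (date(2022, 3, 31), 5160.00),
          (date(2022, 3, 31), 6838.00)]
contributions = [(date(2022, 1, 5), 700.00), (date(2022, 2, 5), 700.00),
                 (date(2022, 3, 5), 400.00), (date(2022, 3, 15), -7000.00)]

stacked = opens + closes + contributions

# Sorting each column on its own breaks the pairing (the symptom in the question).
independent = list(zip(sorted(d for d, _ in stacked),
                       sorted(a for _, a in stacked)))

# Sorting whole rows by date keeps every amount with its own date.
cash_flows = sorted(stacked, key=lambda row: row[0])

print(independent[0])  # earliest date paired with the smallest amount
print(cash_flows[0])   # earliest date paired with its real amount
```

Python's sort is stable, so rows that share a date keep the order in which they were stacked.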

Sum the values for each group of matching names and show the total on the first data row of each group

For each group of rows that share the same name in column "B", I want to sum that group's values in column "D" and show the total in column "E", exactly as in the EXPECT OUTPUT column that I filled in by hand. The total should appear on the first row of the group that has data in columns "C" and "D" (the first row after a blank row), which is one row below the row where the name first appears in column "B". Is this possible?
What I have tried: I believe I solved this once before, but I have forgotten how. The formula looked something like this:
=FILTER(IF(IFERROR(MATCH($B$3:$B;$B:$B;0);0)=ROW($B$3:$B);SUMIF($B$3:$B;$B$3:$B;$D$3:$D);"");$B$3:$B<>"0")
I don't know whether this is right, so please see the table below, which shows the output I expect. Feel free to edit the sample Google Sheet I attached to work out a solution. Thanks in advance!
    A        B            C            D        E
1
2   NUMB     ID - MEMBER  ID - CODE    VALUES   EXPECT OUTPUT
3
4   4        JYFI7
5            JYFI7        J3573        3        6
6            JYFI7        IYR          1
7            JYFI7        F498S        2
8
9   3        DFJ9F11
10           DFJ9F11      C684J        7        8
11           DFJ9F11      J58          1
12
13  2        H684K
14           H684K        JF585        2        2
15
16  1        FJSR
17           FJSR         4684         7        16
18           FJSR         834          1
19           FJSR         49           2
20           FJSR         9835         6
Here's a possible solution:
=ARRAYFORMULA(LAMBDA(cusum,IF(SCAN(,cusum,
LAMBDA(acc,cur,if(cur="",,acc+1)))=1,cusum,))
(SORT(SCAN(,SORT(D3:D,ROW(D3:D),0),
LAMBDA(acc,cur,if(cur="",,acc+cur))),ROW(D3:D),0)))
You can find it in tab 'z' cell F3.
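
As I read it, the formula totals each unbroken run of values in column D and shows the total only on the first row of the run. A minimal Python sketch of that logic, using the D3:D20 values from the question (None stands for a blank cell; run_totals is a hypothetical name):

```python
def run_totals(values):
    """Put the sum of each blank-separated run on the run's first row."""
    totals = [None] * len(values)
    start = None  # index of the first row of the current run
    for i, value in enumerate(values):
        if value is None:
            start = None          # a blank cell ends the current run
            continue
        if start is None:
            start = i             # first non-blank cell starts a new run
            totals[start] = 0
        totals[start] += value
    return totals


# Column D, rows 3 to 20 of the sample sheet.
column_d = [None, None, 3, 1, 2, None, None, 7, 1,
            None, None, 2, None, None, 7, 1, 2, 6]
totals = run_totals(column_d)
print(totals)
```

The totals land on rows 5, 10, 14, and 17, matching the EXPECT OUTPUT column.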

Clustering to achieve heterogeneous groups

I want to group 100 users based on a categorical variable (which can be low, medium, or high). The group size should be 3. I want to get the maximal heterogeneity within groups, assuming that users are distributed equally. I wonder if I can use some clustering algorithm to group based on the dissimilarity? Any suggestions?
I don't believe you need a clustering algorithm to group the data based upon a categorical variable.
Based on your question, I think this should work.
# Code
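from sklearn.model_selection import train_test_split
# data is a pandas DataFrame whose categorical variable is in the 'lab' column
# Split off one third of the rows, then split the remainder in half
group1, group23 = train_test_split(data, test_size=2/3., stratify=data['lab'])
group2, group3 = train_test_split(group23, test_size=1/2., stratify=group23['lab'])
The stratify argument keeps the proportion of each category the same in every split, so each group ends up with a mix of L, M, and H values.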
# Sample output
print(data)
val1 val2 lab
0 1 1 L
1 2 2 L
2 3 3 L
3 4 4 M
4 5 5 M
5 6 6 M
6 7 7 H
7 8 8 H
8 9 9 H
print(group1)
val1 val2 lab
4 5 5 M
1 2 2 L
6 7 7 H
print(group2)
val1 val2 lab
8 9 9 H
2 3 3 L
3 4 4 M
print(group3)
val1 val2 lab
0 1 1 L
7 8 8 H
5 6 6 M
train_test_split() Documentation
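
If the goal is many groups of three rather than three large groups, a different approach is to order the users by category and deal them out round-robin. This is a minimal standard-library sketch of that swapped-in technique, not the train_test_split method above; the function name, the (user_id, level) tuples, and the seed are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def heterogeneous_groups(users, group_size=3, seed=0):
    """Deal users into groups so that each group mixes categories.

    users: list of (user_id, level) pairs.
    """
    rng = random.Random(seed)
    by_level = defaultdict(list)
    for user_id, level in users:
        by_level[level].append((user_id, level))

    # Shuffle inside each category, then lay the categories end to end.
    ordered = []
    for level in sorted(by_level):
        bucket = by_level[level]
        rng.shuffle(bucket)
        ordered.extend(bucket)

    # Taking every n_groups-th user pulls each group's members from
    # different parts of the category-ordered list.
    n_groups = -(-len(ordered) // group_size)  # ceiling division
    return [ordered[i::n_groups] for i in range(n_groups)]


levels = ["low", "medium", "high"]
users = [(i, levels[i % 3]) for i in range(100)]
groups = heterogeneous_groups(users)
print(len(groups), groups[0])
```

Because 100 is not divisible by 3, two of the 34 groups end up with two members instead of three. With roughly equal category counts, as the question assumes, no group repeats a category.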

Sum data in column with criteria in row

I want a formula that sums values based on two criteria: a row label and a date range in the header row. An example is shown below:
   A            B      C         D      E
1               1-Apr  2-Apr     3-Apr  4-Apr
2  aa           1      4         7      10
3  bb           2      5         8      11
4  cc           3      6         9      12
5
6  Criteria 1          bb
7  Range start         2-Apr-16
8  Range End           4-Apr-16
9  Total sum           #VALUE!
Formulas I tried:
1. SUMIF(A2:A4,C6,INDEX(B2:E4,0,MATCH(C7,B1:E1,0)))
   Only returns one cell's value.
2. SUMIF(A2:A4,C6,INDEX(B2:E4,0,MATCH(">="&C7,B1:E1,0)))
   Shows an #N/A error.
3. SUMIFS(B2:E4,A2:A4,C6,B1:E1,">="&C7,B1:E1,"<="&C8)
   Shows a #VALUE! error.
Can anyone help me with the formula?
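I worked out a solution by evaluating the formula step by step. INDEX with MATCH picks the row for the criterion in C6; the first SUMIF adds every value in that row dated on or after the range start, and the second SUMIF subtracts every value dated after the range end: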
=SUMIF(B1:F1,">="&C7,INDEX(B2:F4,MATCH(C6,A2:A4,0),0)) -
SUMIF(B1:F1,">"&C8,INDEX(B2:F4,MATCH(C6,A2:A4,0),0))
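
The arithmetic behind that subtraction can be checked with a short Python sketch using the sample table from the question. The helper names are illustrative, and the year 2016 is taken from the range start and end cells.

```python
from datetime import date

header = [date(2016, 4, day) for day in (1, 2, 3, 4)]  # 1-Apr to 4-Apr
table = {
    "aa": [1, 4, 7, 10],
    "bb": [2, 5, 8, 11],
    "cc": [3, 6, 9, 12],
}


def sum_on_or_after(row, dates, start):
    """Counterpart of SUMIF(dates, ">=" & start, row)."""
    return sum(v for d, v in zip(dates, row) if d >= start)


def sum_after(row, dates, end):
    """Counterpart of SUMIF(dates, ">" & end, row)."""
    return sum(v for d, v in zip(dates, row) if d > end)


row = table["bb"]  # the row selected by INDEX/MATCH on the criterion
start, end = date(2016, 4, 2), date(2016, 4, 4)

total = sum_on_or_after(row, header, start) - sum_after(row, header, end)
direct = sum(v for d, v in zip(header, row) if start <= d <= end)
print(total, direct)
```

Both values come out as 24 (5 + 8 + 11), so the subtraction gives the same result as summing the date range directly.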

Using COUNTIFS on 3 different columns and then need to SUM a 4th column?

I have written the formula below, but I do not know how to change it so that it adds up the numbers in AB2:AB552. As written, it counts the matching rows whose value in that range is greater than 0, but I need the total of those numbers as my final result. Any help would be great.
=COUNTIFS(Cases!B2:B552,"1",Cases!G2:G552,"c*",Cases!X2:X552,"No",Cases!AB2:AB552,">0")
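Assuming you don't actually need the intermediate counts, the SUMIFS function should give you the final result. Its first argument is the range to sum, followed by the same range/criterion pairs you used in COUNTIFS:
=SUMIFS(Cases!AB2:AB552,Cases!B2:B552,1,Cases!G2:G552,"c*",Cases!X2:X552,"No",Cases!AB2:AB552,">0")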
Testing this with some limited data:
Row B G X AB
2 2 a No 10
3 1 c No 24
4 2 c No 4
5 1 c No 0
6 1 a Yes 9
7 2 c No 12
8 2 c No 6
9 2 b No 0
10 1 b No 0
11 1 a No 10
12 2 c No 6
13 1 c No 20
14 1 c No 4
15 1 b Yes 22
16 1 b Yes 22
The formula above returned 48, the sum of AB3, AB13, and AB14, which were the only rows matching all 4 criteria.
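
The same check can be reproduced outside the spreadsheet. This Python sketch applies the four criteria to the test rows above and contrasts counting the matches with summing them; the matches helper is hypothetical, and treating the text criteria as case-insensitive is my assumption about how the spreadsheet compares them.

```python
# (B, G, X, AB) for rows 2 to 16 of the test data.
rows = [
    (2, "a", "No", 10), (1, "c", "No", 24), (2, "c", "No", 4),
    (1, "c", "No", 0), (1, "a", "Yes", 9), (2, "c", "No", 12),
    (2, "c", "No", 6), (2, "b", "No", 0), (1, "b", "No", 0),
    (1, "a", "No", 10), (2, "c", "No", 6), (1, "c", "No", 20),
    (1, "c", "No", 4), (1, "b", "Yes", 22), (1, "b", "Yes", 22),
]


def matches(b, g, x, ab):
    """All four criteria: B is 1, G starts with "c", X is "No", AB > 0."""
    return (b == 1 and g.lower().startswith("c")
            and x.lower() == "no" and ab > 0)


count = sum(1 for row in rows if matches(*row))      # what COUNTIFS reports
total = sum(row[3] for row in rows if matches(*row))  # what SUMIFS reports
print(count, total)
```

Three rows match, and their AB values (24, 20, and 4) add up to 48.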
