Obtaining the quantity and proportion in SPSS 21 - spss

I have the data in a sav file
CODE | QUANTITY
------|----------
A | 1
B | 4
C | 1
F | 3
B | 3
D | 12
D | 5
I need to obtain the quantity of codes which have a quantity <= 3 and to obtain the proportion in a percentage with respect to the total number and present a result like this
<= 3 | PERCENTAGE
------|----------
4 | 57 %
All of this using SPSS syntax.

I would first convert the quantity value to a 0-1 variable, and then aggregate by code to the mean. This produces a nice second dataset to make a table. Example below.
data list free / Code (A1) Quantity (F2.0).
begin data
A 1
B 4
C 1
F 3
B 3
D 12
D 5
end data.
*convert to 0-1.
compute QuantityB3 = (Quantity LE 3).
*Aggregate.
DATASET DECLARE AggQuant.
AGGREGATE
/OUTFILE='AggQuant'
/BREAK=Code
/QuantityB3 = MEAN(QuantityB3).

I dont know how you migrate your question here, I dont have reputation here to add screen shoots that's help you allot. Anyhow the procedure of your desire output is given below.
Goto Transform->Count Values within cases a dialogue box open, write the name of new variable say "New" in Target Variable: go to define values a new dialogue box is open then check the radio button Range, LOWEST through value: put in below box 3 and then press add and press continue and press ok. A new variable is created with the name of "New". Now go to Analyze -> Descriptive Statistics-> Frequencies, new dialogue box will be open send "New" variable into Variable(s): press Statistics in new dialogue box check Percentile(s): write 100 in box and press Add and then continue and ok. You get the desire results.

Related

Does GNU FORTH have an editor?

Chapter 3 of Starting FORTH says,
Now that you've made a block "current", you can list it by simply typing the word L. Unlike LIST, L does not want to be proceeded by a block number; instead it lists the current block.
When I run 180 LIST, I get
Screen 180 not modified
0
...
15
ok
But when I run L, I get an error
:30: Undefined word
>>>L<<<
Backtrace:
$7F0876E99A68 throw
$7F0876EAFDE0 no.extensions
$7F0876E99D28 interpreter-notfound1
What am I doing wrong?
Yes, gForth supports an internal (BLOCK) editor. Start gforth
type: use blocked.fb (a demo page)
type: 1 load
type editor
words will show the editor words,
s b n bx nx qx dl il f y r d i t 'par 'line 'rest c a m ok
type 0 l to list screen 0 which describes the editor,
Screen 0 not modified
0 \\ some comments on this simple editor 29aug95py
1 m marks current position a goes to marked position
2 c moves cursor by n chars t goes to line n and inserts
3 i inserts d deletes marked area
4 r replaces marked area f search and mark
5 il insert a line dl delete a line
6 qx gives a quick index nx gives next index
7 bx gives previous index
8 n goes to next screen b goes to previous screen
9 l goes to screen n v goes to current screen
10 s searches until screen n y yank deleted string
11
12 Syntax and implementation style a la PolyFORTH
13 If you don't like it, write a block editor mode for Emacs!
14
15
ok
Creating your own block file
To create your own new block file myblocks.fb
type: use blocked.fb
type: 1 load
type editor
Then
type use myblocks.fb
1 load will show BLOCK #1 (lines 0 till 15. 16 Lines of 64 characters each)
1 t will highlight line 1
Type i this is text to [i]nsert into line 1
After the current BLOCK is edited type flush in order to write BLOCK #1 to the file myblocks.fb
For more information see, gForth Blocks
It turns out these are "Editor Commands" the book says,
For Those Whose EDITOR Doesn't Follow These Rules
The FORTH-79 Standard does not specify editor commands. Your system may use a different editor; if so, check your systems documentation
I don't believe gforth supports an internal editor at all. So L, T, I, P, F, E, D, R are all presumably unsupported.
gforth is well integrated with emacs. In my xemacs here, by default any file called *.fs is considered FORTH source. "C-h m", as usual, gives the available commands.
No, GNU Forth doesn't have an internal editor; I use Vim :)

How to reference a particular row for an existing variable in SPSS syntax?

I have 2 variables, one for raw p-values and another for adjusted p-values. I need to compute a new variable based on the values of these two variables. What I need to do isn't too complicated, but I have a hard time doing it in SPSS because I can't figure out how I can reference a particular row for an existing variable in SPSS syntax.
The first column lists raw p-values in ascending order. The next column lists adjusted p-values, but these adjusted p-values are still incomplete. I need to compare two adjacent p-values in the adjusted p-values column (e.g., row 1 and 2, row 2 and 3, row 3 and 4, and so forth), and take the p-values whichever is smaller in each of these comparisons and enter those p-values into the following column as values for a new variable.
However, that's not the end of the story. One more condition has to be met. That is, the new p-values have to be in the same order as the raw p-values. However, I cannot ensure this if I start the comparisons from the top row. You can see that (i') is greater than (h') and (g'), and (d') is greater than (c'), (b'), and (a') in the example below (picture).
In order to solve this issue, I would need to start the comparison of the adjusted p-values from the bottom. In addition, I would need to compare the adjusted p-values to the new p-values of one row below. One exception is that I can simply use the value of (a) as the value of (a') since the value of (a) should always be the greatest of all the p-values as a rule. Then, for (b') , I need to compare (b) and (a') and enter whichever is smaller as (b'). For (c'), I need to compare (c) and (b') and enter whichever is smaller as (c'), and so forth. By doing this way, (d') would be 0.911 and (i') would be 0.017.
Sorry for this long post, but I would really appreciate if I can get some help to do this task in SPSS.
Thank you in advance for your help.
Raw p-values | Adjusted p-values (Temporal)| New p-values (Final)
-------------|-----------------------------|---------------------
0.002 | 0.030 (i) | 0.025 (i')
0.003 | 0.025 (h) | 0.017 (h')
0.004 | 0.017 (g) | 0.017 (g')
0.005 | 0.028 (f) | 0.028 (f')
0.023 | 0.068 (e) | 0.068 (e')
0.450 | 1.061 (d) | 1.061 (d')
0.544 | 1.145 (c) | 0.911 (c')
0.850 | 0.911 (b) | 0.911 (b')
0.974 | 0.974 (a) | 0.974 (a')
Another tool that may be convenient is the SHIFT VALUES command. It can move one or more columns of data either forward or backward.
I wonder whether the purpose of this has to do with adjusting p values for multiple testing corrections as with Benjamin-Hochberg FDR or others similar. If that is the case, you might find the STATS PADJUST (Analyze > Descriptives > Calculate adjusted p values) extension command useful. It offers six adjustment methods. You can install it from the Utilities (pre-V24) or Extensions (V24+) menu.
To get you started, here are a few tools that can help you with this task:
The LAG function
you can compare values in this line and the previous one, for example, the following will compare the Pval in each line to the one in the previous one, and put the smaller of the two in the NewPval:
compute NewPVal=min(Pval, lag(Pval)).
If you want to do the same process only start from the bottom, you can easily sort your data in reverse order and do the same.
CREATE + LEAD
if you want to make comparisons to the next line instead of the previous line, you should first create a "lead" variable and then compare to it.
for example, the following syntax will create a new variable that for each line contains the value of Pval in the next line, and then chooses the smaller of the two for the NewPval:
create /LeadPval=LEAD(Pval 1).
compute NewPVal=min(Pval, LeadPval).
Using case numbers
You can use case numbers (line numbers) in calculations and in conditions. For example, the following syntax will let you make different calculations in the first line and the following ones:
if $casenum=1 NewPval=Pval.
if $casenum>1 NewPVal=min(Pval, lag(Pval)).

Creating numbered variable names using the foreach command

I have a list of variables for which I want to create a list of numbered variables. The intent is to use these with the reshape command to create a stacked data set. How do I keep them in order? For instance, with this code
local ct = 1
foreach x in q61 q77 q99 q121 q143 q165 q187 q209 q231 q253 q275 q297 q306 q315 q324 q333 q342 q351 q360 q369 q378 q387 q396 q405 q414 q423 {
gen runs`ct' = `x'
local ct = `ct' + 1
}
when I use the reshape command it generates an order as
runs1 runs10 runs11 ... runs2 runs22 ...
rather than the desired
runs01 runs02 runs03 ... runs26
Preserving the order is necessary in this analysis. I'm trying to add a leading zero to all ct values less than 10 when assigning variable names.
Generating a series of identifiers with leading zeros is a documented and solved problem: see e.g. here.
local j = 1
foreach v in q61 q77 q99 q121 q143 q165 q187 q209 q231 q253 q275 q297 q306 q315 q324 q333 q342 q351 q360 q369 q378 q387 q396 q405 q414 q423 {
local J : di %02.0f `j'
rename `v' runs`J'
local ++j
}
Note that I used rename rather than generate. If you are going to reshape the variables afterwards, the labour of copying the contents is unnecessary. Indeed the default float type for numeric variables used by generate could in some circumstances result in loss of precision.
I note that there may also be a solution with rename groups.
All that said, it's hard to follow your complaint about what reshape does (or does not) do. If you have a series of variables like runs* the most obvious reshape is a reshape long and for example
clear
set obs 1
gen id = _n
foreach v in q61 q77 q99 q121 q143 {
gen `v' = 42
}
reshape long q, i(id) j(which)
list
+-----------------+
| id which q |
|-----------------|
1. | 1 61 42 |
2. | 1 77 42 |
3. | 1 99 42 |
4. | 1 121 42 |
5. | 1 143 42 |
+-----------------+
works fine for me; the column order information is preserved and no use of rename was needed at all. If I want to map the suffixes to 1 up, I can just use egen, group().
So, that's hard to discuss without a reproducible example. See
https://stackoverflow.com/help/mcve for how to post good code examples.

Generating means of a variable using dummy variables & foreach in Stata

My dataset includes TWO main variables X and Y.
Variable X represents distinct codes (e.g. 001X01, 001X02, etc) for multiple computer items with different brands.
Variable Y represents the tax charged for each code of variable X (e.g. 15 = 15% for 001X01) at a store.
I've created categories for these computer items using dummy variables (e.g. HD dummy variable for Hard-Drives, takes value of 1 when variable X represents a HD, etc). I have a list of over 40 variables (two of them representing X and Y, and the rest is a bunch of dummy variables for the different categories I've created for computer items).
I would like to display the averages of all these categories using a loop in Stata, but I'm not sure how to do this.
For example the code:
mean Y if HD == 1
Mean estimation Number of obs = 5
--------------------------------------------------------------
| Mean Std. Err. [95% Conf. Interval]
-------------+------------------------------------------------
Tax | 7.1 2.537716 1.154172 15.24583
gives me the mean Tax for the category representing Hard Drives. How can I use a loop in Stata to automatically display all the mean Taxes charged for each category? I would do it by hand without a problem, but I want to repeat this process for multiple years, so I would like to use a loop for each year in order to come up with this output.
My goal is to create a separate Excel file with each of the computer categories I've created (38 total) and the average tax for each category by year.
Why bother with the loop and creating the indicator variables? If I understand correctly, your initial dataset allows the use of a simple collapse:
clear all
set more off
input ///
code tax str10 categ
1 0.15 "hd"
2 0.25 "pend"
3 0.23 "mouse"
4 0.29 "pend"
5 0.16 "pend"
6 0.50 "hd"
7 0.54 "monitor"
8 0.22 "monitor"
9 0.21 "mouse"
10 0.76 "mouse"
end
list
collapse (mean) tax, by(categ)
list
To take to Excel you can try export excel or put excel.
Run help collapse and help export for details.
Edit
Because you insist, below is an example that gives the same result using loops.
I assume the same data input as before. Some testing using this example database
with expand 1000000, shows that speed is virtually the same. But almost surely,
you (including your future you) and your readers will prefer collapse.
It is much clearer, cleaner and concise. It is even prettier.
levelsof categ, local(parts)
gen mtax = .
quietly {
foreach part of local parts {
summarize tax if categ == "`part'", meanonly
replace mtax = r(mean) if categ == "`part'"
}
}
bysort categ: keep if _n == 1
keep categ mtax
Stata has features that make it quite different from other languages. Once you
start getting a hold of it, you will find that many things done with loops elsewhere,
can be made loop-less in Stata. In many cases, the latter style will be preferred.
See corresponding help files using help <command> and if you are not familiarized with saved results (e.g. r(mean)), type help return.
A supplement to Roberto's excellent answer: After collapse, you will need a loop to export the results to excel.
levelsof categ, local(levels)
foreach x of local levels {
export excel `x', replace
}
I prefer to use numerical codes for variables such as your category variable. I then assign them value labels. Here's a version of Roberto's code which does this and which, for closer correspondence to your problem, adds a "year" variable
input code tax categ year
1 0.15 1 1999
2 0.25 2 2000
3 0.23 3 2013
4 0.29 1 2010
5 0.16 2 2000
6 0.50 1 2011
7 0.54 4 2000
8 0.22 4 2003
9 0.21 3 2004
10 0.76 3 2005
end
#delim ;
label define catl
1 hd
2 pend
3 mouse
4 monitor
;
#delim cr
label values categ catl
collapse (mean) tax, by(categ year)
levelsof categ, local(levels)
foreach x of local levels {
export excel `:label (categ) `x'', replace
}
The #delim ; command makes it possible to easily list each code on a separate line. The"label" function in the export statement is an extended macro function to insert a value label into the file name.

How to reference a specific row in the "compute variable" dialog of SPSS?

This is my problem:
I have a table of measured fluorescence values depending on a drug concentration. I can't use the values directly, because even in absence of drug, there is some small fluorescence. Consequently, I have to substract the value from the drug=0 measurement from all other values.
I figured I could calculate a new variable (normalized fluorescence), but how do I reference to the fluorescence value in the drug=0 row? In excel, I'd use sth like $34$2 to reference to that field, but how to do that in SPSS? Entering the value "hard-coded" seems a bit unflexible, and I wanna know how to do it by reference :). Hours of googling and reading in books have yielded no answer so far.
Thanks :)
Edit:
An example would be
Drug conc. | Fluorescence
0 | 0.1 <- this value is to be substracted from all fluo values
1 | 1.1
2 | 2.1
3 | 3.1
4 | 4.1
Constant in SPSS can be represented as a variable with the same value for all rows. So you have to make a new variable with value of Fluorescence when drug=0. Example:
data list free
/drug (f8) Fluorescence (f8.1).
begin data
0 0.1
1 1.1
2 2.1
3 3.1
4 4.1
end data.
sort cases drug.
do if drug = 0.
comp const = Fluorescence.
else.
comp const = lag(const).
end if.
exe.
comp Fluorescence2 = Fluorescence - const.
form const Fluorescence2 (f8.1).
exe.

Resources