How to handle multiple answers in SPSS - spss

I have a questionnaire that I am trying to analyze in SPSS. One question is:
What are the benefits of the implementation?
1. Fast
2. Time saving
3. Reduce cost
How do I handle this in SPSS?
I have created a variable called Benefits and added these options as its values. Now, in the Data View, how do I choose multiple answers?
Thanks

The usual approach is to treat this as a multiple response question. It could be a multiple dichotomy set, where you have a yes/no variable for each possible response. Alternatively, it could be a multiple category set, where you record the first k responses in k variables. The Custom Tables option provides the most flexibility here, but there is also a MULT RESPONSE procedure in the Base module if you don't have Custom Tables. The Chart Builder can also handle multiple response sets.
HTH,
Jon Peck
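To make the two layouts concrete, here is a minimal JavaScript sketch (not SPSS syntax). The respondents are invented, the numeric codes 1-3 are an assumption based on the list in the question, and the names `categorySet`, `toDichotomies` and `countResponses` are hypothetical helpers for illustration only.

```javascript
// Labels from the question; the numeric codes 1-3 are an assumption.
const labels = { 1: "Fast", 2: "Time Saving", 3: "Reduce Cost" };

// Multiple category layout: each respondent lists the codes they chose.
// These three respondents are invented for illustration.
const categorySet = [
  [1, 3],
  [2],
  [1, 2, 3],
];

// Convert to the multiple dichotomy layout: one 0/1 variable per option.
function toDichotomies(responses) {
  return responses.map(chosen => {
    const row = {};
    for (const code of Object.keys(labels)) {
      row[labels[code]] = chosen.includes(Number(code)) ? 1 : 0;
    }
    return row;
  });
}

// Count how many respondents picked each option.
function countResponses(rows) {
  const counts = {};
  for (const row of rows) {
    for (const [name, flag] of Object.entries(row)) {
      counts[name] = (counts[name] || 0) + flag;
    }
  }
  return counts;
}

const dichotomies = toDichotomies(categorySet);
const counts = countResponses(dichotomies);
console.log(dichotomies);
console.log(counts);
```

Either layout carries the same information; the dichotomy form is simply easier to count column by column.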

This may not be the "sophisticated" way, but...
If I understand your question correctly: 1. you want to know how to enter these values into SPSS, and 2. you want to know how to work with the data.
With 3 yes/no options you actually have 8 groups (2 x 2 x 2): 1. fast AND time saving AND reduce cost; 2. fast AND time saving, NO reduce cost; 3. fast, NO time saving, NO reduce cost... and so on to group 8: NO fast, NO time saving, NO reduce cost.
1: SPSS is similar to Excel. In the "Variable View" (tab at the bottom left) you create a new variable, e.g. "Benefits". Then you click on "Values" and enter "1" for "fast/time saving/reduce cost", "2" for "fast/time saving", "3" for...
Then you go to the "Data View" and enter the corresponding values.
2: At the end you will have 8 groups and can perform an ANOVA test. Or you can recode the "Benefits" variable into a set of dummy variables, such as "fast", where every group that contains the answer "fast" is coded as "1" and the rest as "0", and so on.
I hope it helps!
Best,
Eugene
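As a rough illustration of the combined-code idea, the JavaScript sketch below enumerates every yes/no combination of three options (2^3 = 8 of them) and maps a group number back to dummy variables. The group numbering (group 1 = all three selected, last group = none selected) and the names `options`, `allGroups` and `dummiesForGroup` are assumptions made for this example, not something SPSS defines.

```javascript
const options = ["fast", "timeSaving", "reduceCost"];

// Enumerate every yes/no combination, starting with "all selected".
function allGroups(names) {
  const n = names.length;
  const groups = [];
  for (let mask = 2 ** n - 1; mask >= 0; mask--) {
    const group = {};
    names.forEach((name, i) => {
      // The first option is stored in the highest bit of the mask.
      group[name] = (mask >> (n - 1 - i)) & 1;
    });
    groups.push(group);
  }
  return groups;
}

const groups = allGroups(options);

// Recode a 1-based group number into its 0/1 dummy variables.
function dummiesForGroup(groupNumber) {
  return groups[groupNumber - 1];
}

console.log(groups.length);
console.log(dummiesForGroup(2));
```

Storing the three dummy variables directly avoids having to maintain the group-number lookup at all.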

Related

How to column sum in Airtable like excel?

I am designing a base in Airtable and have run into an issue.
I need a running column total, like in Excel.
For example:
Column1 Sum(Column1)
1 1
2 3
4 7
6 13
Like this. What is the solution?
Thank you.
Airtable is a relational database, not a spreadsheet. The behavior you're looking for isn't straightforward to reproduce because Airtable isn't designed as though row order matters: the order of rows is nothing more than a temporary sort.
I know you were probably hoping for a better answer than another question, but I simply have to ask: why do you need to do this inside Airtable? Why not just use one of the dozens of readily available and entirely free spreadsheet/table solutions?
Generally speaking and especially if you don't need to manipulate that data afterward (a big if), Airtable already does the calculation you want on the fly and has it stored as part of its metadata.
The row-by-row addition would be doable using a combination of one autonumber field, a linked record (to another table actually polling for values) and a rollup returning data. I did this once or twice back in the day and it was always an overengineered mess, even when I only had to deal with small integers like the ones in your example.
There's Google Drive sync beta ongoing at Airtable right now. Just get that and do the calculations you need elsewhere?
The alternative is the Scripting app, but that might prove crippling in terms of how it could affect your automations quota. And recalculating fields by "hand" is... not sophisticated, to put it mildly.
But hey, don't take my word for it. Curiosity and nostalgia got the better of me, so I gave this futile effort another go on your behalf. Here's my best take at such an overengineered mess of a field-wide sum function: it's wildly annoying to use, but at least it doesn't take all day to update records, even when presented with hundreds or thousands of inputs.
So, yeah... right tool for the job and all that: this ain't it, chief, but be my guest. You can clone the base from the Universe and everything will be ready for testing. Just keep creating new fields or deleting the existing ones, then hit the "Run Script" button of the only app hooked into the base to see it recalculate the sum.
Dumping the code here as well, in case anyone wants to set up a new testing environment manually:
let table = base.getTable('Table 1');
// Only Column1 is needed to compute the running total
let query = await table.selectRecordsAsync({fields: ['Column1']});
let sum = 0;
// Build one update per record, carrying the running total forward
let updates = query.records.map(record => {
    // Treat empty cells as 0 so the total never becomes NaN
    sum += record.getCellValue('Column1') || 0;
    return {id: record.id, fields: {'Sum': sum}};
});
// Airtable limits us to 50 record updates per request, hence the splicing
while (updates.length > 0) {
    await table.updateRecordsAsync(updates.splice(0, 50));
}
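For anyone who wants to check the logic without an Airtable base, here is a small stand-alone JavaScript sketch of the two ideas in the script: the running total and the batching into groups of 50. It does not call any Airtable API; `runningTotals` and `inBatches` are hypothetical helper names, and the input is the example from the question.

```javascript
// Compute running totals for a list of numbers; empty (null) cells count as 0.
function runningTotals(values) {
  let sum = 0;
  return values.map(v => (sum += v || 0));
}

// Split an array into batches of at most `size` items without mutating it.
function inBatches(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

const totals = runningTotals([1, 2, 4, 6]);
console.log(totals); // [1, 3, 7, 13]

// 120 placeholder updates end up in batches of 50, 50 and 20.
const batches = inBatches(Array.from({ length: 120 }, (_, i) => i), 50);
console.log(batches.map(b => b.length)); // [50, 50, 20]
```

Note that the running total depends on the order in which the records are visited, so the result is only meaningful if that order is the one you expect.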
There appears to be a feature built-in to Airtable for this now. In the desktop app, at least, take a look down at the bottom of each column below all rows. There is a context menu for the column, in which Sum is one of several options.

Creating Dynamic Sheet Cell Reference List for pulling numbers to SUM

I've been working on building a data analysis sheet, which is quite verbose at the moment and a bit more complicated than it should be as I've been trying to figure this out. Please note, I work doing student data in a school.
Basically, I have two sets of input data:
1. Data imported from a CSV file that includes test data and codes for Common Core Standards and the questions tied to those standards as a whole class summary
2. Data imported from a CSV file that includes individual scores by question
I am looking to construct 2 views:
1. A view that collates and displays data of individual standards per student that includes a dropdown to change the standard, allowing a teacher to see class performance by standard in a broad view. The drop-down is populated dynamically from the input data (so staff could eventually dump data and go directly to reports)
2. A view that collates and displays data of individual students broken down by performance on each standard, allowing a teacher to see the broader spectrum for each student. The student drop-down is populated from source data set 2.
I have been able to build the first view, but am struggling with the second. I've been able to separate the question codes and develop strings of cell references to the scoring data, including a dynamic reference to the row the selected student's score data appears on in the second source set from above.
I tried to pass through an indirect() formula into a sum() so as to process for a mean evaluation, and have encountered errors. I think SUM() doesn't process comma-separated cell reference lists from Indirect() [or in general] or there is something that I am missing to help parse it. Here is the formula I have tried:
=Sum(vlookup(D7,CCCodeManip!$A:$C,3,false))
CCCodeManip!C:C includes the created text (based on the dynamic standards and question codes, etc), here's an example of what would be found there:
'M-ADI'!M17, 'M-ADI'!N17, 'M-ADI'!O17, 'M-ADI'!P17, 'M-ADI'!Q17, 'M-ADI'!R17, 'M-ADI'!J17
I need these to be dynamic so that teachers can input different sets of standards, question, and student data and the sheet automatically collates and reports it in uniform ways (with an upward bound of 20 standards as I currently have it built)
Here is a link to the sheet I built, with names and ID anonymized. There's a CRAP TON of sub-tabs, and that's really just being able to split apart and re-combine data neatly without things error-ing out due to data overlapping, aside from a few different attempts and different approaches to parse the cell reference strings.
The first two tabs are the current status of the data views. I plan to hide a bunch of the functional stuff that is there to help pull data accurately.
The 3rd and 4th tab are the source data sets. 5th is a modified version of source data that allows me to reference things better, and I've tried to arrange the sheets most relevant towards the front of the set.
https://docs.google.com/spreadsheets/d/1fR_2n60lenxkvjZSzp2VDGyTUO6l-3wzwaV4P-IQ_5Y/edit?usp=sharing
Does anyone have a different approach? I am aware that I might be as far as I can go with this and should perhaps consider scripts. My coding experience is a bit out of date and my strength is more with formulas, but I can dig into things with some direction, if anyone can help.
Ok so I noticed something.
It seems the failure is in the indirect reference:
=indirect(CCCodeManip!C3)
The string I am trying to parse via indirect is going to be generated into something like this, dynamic from reference to other data:
'M-ADI'!M17, 'M-ADI'!N17, 'M-ADI'!O17, 'M-ADI'!P17, 'M-ADI'!Q17, 'M-ADI'!R17, 'M-ADI'!J17
The indirect returns the error that the above string is not a cell reference with the #REF code.
Can someone give me a clue as to what is causing this? I am going to dig into the docs on Indirect() from google and will post anything that I find.
Perhaps it is that INDIRECT() can't handle lists, only specific references and arrays, which may require me to build a sheet to do the SUM formula on for each question set (?)
So I think I figured it out, but I ended up parsing the data differently: basically doing the sum based on individual cell references and a separate SUM formula, bypassing the need to do it all at once. It just makes my sheets a lot dirtier! I am eventually going to see if code could do it better if I need to, but this is closed for now.
Basically, I did individual cell references to recall scores in a row, then used a separate SUM formula, and created references / structures to be able to pull those sum() results. Achieves the same end, but with extra crap on the sheet.
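If the script route is ever revisited, the core of it would be splitting the generated text into single references and summing them one by one, since the string as a whole is not one valid reference. The JavaScript sketch below only shows that parsing step. `splitReferences` and `sumReferences` are hypothetical names, `lookup` is a stand-in for however a cell value would actually be read (not shown here), and the cell values are invented.

```javascript
// Split "'M-ADI'!M17, 'M-ADI'!N17" into individual reference strings.
// Assumes sheet names contain no commas.
function splitReferences(refList) {
  return refList
    .split(",")
    .map(ref => ref.trim())
    .filter(ref => ref.length > 0);
}

// Sum the values behind each reference. `lookup` is a hypothetical
// function that returns the value of one cell reference.
function sumReferences(refList, lookup) {
  return splitReferences(refList)
    .map(ref => Number(lookup(ref)) || 0)
    .reduce((total, value) => total + value, 0);
}

// Made-up cell values for demonstration only.
const fakeCells = { "'M-ADI'!M17": 1, "'M-ADI'!N17": 0, "'M-ADI'!O17": 1 };
const refs = "'M-ADI'!M17, 'M-ADI'!N17, 'M-ADI'!O17";
const total = sumReferences(refs, ref => fakeCells[ref]);
console.log(total);
```

Blank or non-numeric cells are treated as 0 here, which is a choice made for this sketch rather than anything the sheet requires.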

Multiple response crosstabs/frequencies based on categorical variable in SPSS

I've just started using SPSS after using R for about five years (I'm not happy about it, but you do what your boss tells you). I'm just trying to do a simple count based on a categorical variable.
I have a data set where I know a person's year of birth. I've recoded into a new variable so that I have their generation as a categorical variable, named Generation. I also have a question that allows for multiple responses. I want a frequency of how many times each response was collected.
I've created a multiple response variable (analyze>multiple response > Define variable sets). However, when I go to create crosstabs, the Generation variable isn't an option to select. I've tried googling, but the videos I have watched have the row variables as numeric.
Here is a google sheet that shows what I have and what I'm looking to achieve:
https://docs.google.com/spreadsheets/d/1oIMrhYv33ZQwPz3llX9mfxulsxsnZF9zaRf9Gh37tj8/edit#gid=0
Is it possible to do this?
First of all, to double check, when you say you go to crosstabs, is this Analyze > Multiple Response > Crosstabs (and not Analyze > Descriptive Statistics > Crosstabs)?
Second, with multiple response data, you are much better off working with Custom Tables. Start by defining the set with Analyze > Custom Tables > Multiple Response Sets. If you save your data file, those definitions are saved with it (unlike the Mult Response Procedure).
Then you can just use Custom Tables to tabulate mult response data pretty much as if it were a regular variable, but you have more choices about appropriate statistics, tests of significance etc. No need in the CTABLES code to explicitly list the set members.
Try CUSTOM TABLES, although this is an add-on module that you need a licence for:
CTABLES /TABLE Generation[c] by (1_a+ 1_b + 1_c)[s][sum f8.0 'Count'].
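For readers coming from a programming background, the table this aims for is just a count of each response within each generation. A minimal JavaScript sketch of that tabulation follows. It is not SPSS syntax, the respondents are invented, and the short variable names `a`, `b`, `c` stand in for the 0/1 response variables.

```javascript
// Invented respondents: a generation plus three 0/1 response variables.
const respondents = [
  { generation: "Boomer", a: 1, b: 0, c: 1 },
  { generation: "Millennial", a: 1, b: 1, c: 0 },
  { generation: "Millennial", a: 0, b: 1, c: 0 },
  { generation: "Gen X", a: 0, b: 0, c: 1 },
];
const responseVars = ["a", "b", "c"];

// Sum each 0/1 response variable within each level of the grouping variable.
function crosstab(rows, groupVar, vars) {
  const table = {};
  for (const row of rows) {
    const group = row[groupVar];
    if (!table[group]) {
      table[group] = Object.fromEntries(vars.map(v => [v, 0]));
    }
    for (const v of vars) {
      table[group][v] += row[v];
    }
  }
  return table;
}

const table = crosstab(respondents, "generation", responseVars);
console.log(table);
```

Because the responses are coded 0/1, summing them per group gives the same numbers as counting the "yes" answers.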

SPSS Frequency Plot Complication

I am having a hard time generating precisely the frequency table I am looking for using SPSS.
The data in question: cases (n = ~800) with categorical variables DX_n (n = 1-15), each containing ICD9 codes, many of which are the same code. I would like to create a frequency table that groups the DX_n variables such that I can view frequency of every diagnosis in this sample of cases.
The next step is to test the hypothesis that the clustering of diagnoses in this sample is different than that of another. If you have any advice as to how to test this, that would be really appreciated as well!
Thanks!
Edit: My attempts:
1) Analyze -> Descriptive Statistics -> Frequencies; then add variables DX_n (1-15) and display frequency charts. The output is frequencies of each ICD9 code per DX_n variable (so 15 tables are generated - I'm hoping to just have one grouped table).
2) I tried adjusting the output format to organize by variable and also to compare variables but neither option gives the output I'm looking for.
I think what you are looking for is CTABLES. It can do parallel columns of frequencies, and it includes a column proportions test that can show whether the distributions differ.
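Conceptually, the single grouped table is a count of each code pooled over all the DX_n variables. The JavaScript sketch below shows that pooling on invented cases with only three diagnosis slots. Counting a code once per case even if it appears in several slots is an assumption made here, and `pooledFrequencies` is a hypothetical name.

```javascript
// Invented cases with three diagnosis slots (the real data has DX_1 to DX_15).
const cases = [
  { DX_1: "250.00", DX_2: "401.9", DX_3: null },
  { DX_1: "401.9", DX_2: null, DX_3: null },
  { DX_1: "428.0", DX_2: "250.00", DX_3: "401.9" },
];

// Pool all diagnosis slots into one frequency table, sorted by count.
function pooledFrequencies(rows) {
  const counts = new Map();
  for (const row of rows) {
    // A Set means a code repeated within one case is counted once for that case.
    const codes = new Set(
      Object.values(row).filter(code => code !== null && code !== "")
    );
    for (const code of codes) {
      counts.set(code, (counts.get(code) || 0) + 1);
    }
  }
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

const frequencies = pooledFrequencies(cases);
console.log(frequencies);
```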
Thank you, JKP! You set me on exactly the right track. I'm not sure how I overlooked that menu. Just to clarify in case anyone else comes along needing to figure this out:
Group diagnosis variables into a multiple response set using Analyze > Custom Tables > Multiple Response Sets. Code the variables as categories.
http:// i.imgur.com/ipE9suf.png
Create a custom table with your new multiple response set as a row and the subsets to compare as columns. I set summary statistics to compute from rows and added the column n% column (sorted descending).
http:// i.imgur.com/hptIkfh.png
Under test statistics, include a column proportions z-test as JKP suggested.
http:// i.imgur.com/LYI6ZRl.png
Behold, your results:
http:// i.imgur.com/LgkBA8X.png
Thanks again, and best of luck to anyone else who runs across this.
-GCH
p.s. Sorry everyone, I was going to post images but don't have enough reputation points yet. Images detailing the steps in the GUI can be found at the obfuscated links above.
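For anyone curious what a column proportions comparison boils down to, here is a rough JavaScript sketch of a generic pooled two-proportion z statistic. This is the textbook formula, not necessarily exactly what CTABLES computes (any multiple-comparison adjustment is left out), and the counts in the example call are invented.

```javascript
// Pooled two-proportion z statistic:
// x1 of n1 cases in sample 1 have the diagnosis, x2 of n2 in sample 2.
function twoProportionZ(x1, n1, x2, n2) {
  const p1 = x1 / n1;
  const p2 = x2 / n2;
  // Pooled proportion under the hypothesis that both samples share one rate.
  const pooled = (x1 + x2) / (n1 + n2);
  const standardError = Math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2));
  return (p1 - p2) / standardError;
}

// Invented counts: 120 of 800 cases versus 90 of 800 cases.
const z = twoProportionZ(120, 800, 90, 800);
console.log(z);
```

A large absolute value of z suggests the two samples differ in how often that diagnosis occurs; with many diagnoses tested at once, some correction for multiple comparisons would normally be needed.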

Automatically updating Data Validation lists based on user input

I have a very large data set (about 16k rows). I have 10 higher level blocks and within each block I have 4 categories (10 rows for each) which use Data Validation lists to show items available in each category. The lists should automatically update based on user input. What I need your help with is that I want to use the same data set for each block and preferably a least calculation/size intensive approach. I have put together a sample file that outlines the issue with examples.
Sample File
Thank you for your help in advance.
Okay, I've found something, but it can be quite time consuming to do.
Select each range of cells. For instance, for the first one, select B3:B18 and right click on the selection. Find "Name a Range..." and give it the name "_FIN_CNY". Repeat for all the other ranges, changing the name where necessary.
Select the first range of cells to get the data validation, and click on "Data validation", pick the option "Allow: List" (you already have it) and then in the source, put the formula:
=INDIRECT($G$4&"_CNY")
$G$4 is where the user will input. This changes as you change blocks.
_CNY is the category. Change it to _CNY2 for the second category.
Click "OK" and this should be it. Repeat for the other categories.
I have put an updated file on dropbox where you can see I already did it for the data of _FIN for categories CNY, CNY2 and INT and did the one for _GER as well. You'll notice the category of INT for _GER doesn't work, that's because the Named Range _GER_INT doesn't exist yet.
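The lookup that the INDIRECT formula performs can be pictured as plain string concatenation followed by a name lookup. The JavaScript sketch below mimics that with an ordinary object. The range contents are invented, and `namedRanges` and `validationList` are hypothetical names used only for this illustration.

```javascript
// Hypothetical named ranges with invented contents.
const namedRanges = {
  _FIN_CNY: ["item A", "item B"],
  _FIN_CNY2: ["item C"],
  _GER_CNY: ["item D"],
};

// Mirrors =INDIRECT($G$4&"_CNY"): join the block and the category suffix,
// then look the resulting name up.
function validationList(block, categorySuffix) {
  const name = block + categorySuffix;
  if (!(name in namedRanges)) {
    // Comparable to the list not working when the named range does not exist.
    return null;
  }
  return namedRanges[name];
}

console.log(validationList("_FIN", "_CNY"));
console.log(validationList("_GER", "_INT"));
```

The second call returns nothing for the same reason the _GER / INT list in the sample file does not work: no range with that combined name has been defined yet.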
