[a]                 [b]                 [c]
Chrome              Chrome              Chrome
Chrome              Internet Explorer   Chrome
Chrome              Chrome              Chrome
Firefox             Firefox             Chrome
Internet Explorer   Chrome              Chrome
Safari              Safari              Chrome
I'm new to SPSS, so sorry if this is basic. I'm trying to produce a graphical representation (line graph) of the change in frequency for each option from a to b, and then across a, b, and c.
I figure, for each variable I need to calculate the % for each option and then plot that.
Any help would be greatly appreciated.
The short answer to generate what I believe you want is to reshape your data from wide to long, and then produce the summary chart. Example below:
*Making fake data that looks like yours.
input program.
loop #i = 1 to 1000.
compute caseid = #i.
compute A = TRUNC(RV.UNIFORM(1,4)).
compute B = TRUNC(RV.UNIFORM(1,4)).
compute C = TRUNC(RV.UNIFORM(1,4)).
end case.
end loop.
end file.
end input program.
dataset name Sim.
value labels A B C
1 'Chrome'
2 'Firefox'
3 'IE'.
*Reshape Wide to long.
VARSTOCASES
/MAKE Browser from A B C
/INDEX Period.
*Now make the summary chart.
GGRAPH
/GRAPHDATASET NAME="graphdataset" VARIABLES=Period COUNT()[name="COUNT"] Browser
/GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
SOURCE: s=userSource(id("graphdataset"))
DATA: Period=col(source(s), name("Period"), unit.category())
DATA: COUNT=col(source(s), name("COUNT"))
DATA: Browser=col(source(s), name("Browser"), unit.category())
GUIDE: axis(dim(1), label("Period"))
GUIDE: axis(dim(2), label("Count"))
GUIDE: legend(aesthetic(aesthetic.color.interior), label("Browser"))
SCALE: cat(dim(1))
SCALE: linear(dim(2))
SCALE: cat(aesthetic(aesthetic.color.interior), include("1.00", "2.00","3.00"))
ELEMENT: line(position(Period*COUNT), color.interior(Browser), missing.wings())
END GPL.
Which produces this chart:
If you have repeated measures data (i.e. the same person's browser over multiple time periods), you have more structure in the data that can be charted. One option is area charts conditioned on the initial state. Below is an example; with some post-hoc editing of the chart, it produces this:
*Carry the Period 1 browser forward so every period is conditioned on the initial state.
do if Period = 1.
compute initial_browser = Browser.
else if Period > 1.
compute initial_browser = lag(initial_browser).
end if.
value labels initial_browser
1 'Chrome'
2 'Firefox'
3 'IE'.
GGRAPH
/GRAPHDATASET NAME="graphdataset" VARIABLES=Period COUNT()[name=
"COUNT"] initial_browser Browser
/GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
SOURCE: s=userSource(id("graphdataset"))
DATA: Period=col(source(s), name("Period"), unit.category())
DATA: COUNT=col(source(s), name("COUNT"))
DATA: initial_browser=col(source(s), name("initial_browser"),unit.category())
DATA: Browser=col(source(s), name("Browser"), unit.category())
GUIDE: axis(dim(1), label("Period"))
GUIDE: axis(dim(2), label("Count"))
GUIDE: axis(dim(4), label("Initial Browser"), opposite())
GUIDE: legend(aesthetic(aesthetic.color.interior), label("Browser"))
SCALE: cat(dim(1))
SCALE: linear(dim(2), include(0))
SCALE: cat(dim(4))
SCALE: cat(aesthetic(aesthetic.color.interior), include("1.00", "2.00",
"3.00"))
ELEMENT: area.stack(position(Period*COUNT*1*initial_browser),
color.interior(Browser), missing.wings())
END GPL.
There are a lot of other charting possibilities if this is the case.
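To make the wide-to-long reshape and the per-period percentages concrete, here is a minimal sketch in Python rather than SPSS syntax, using the six sample rows from the question. The names `wide`, `long_rows`, and `percent_by_period` are illustrative only and are not part of any SPSS API.

```python
from collections import Counter

# The six sample rows from the question, one tuple per respondent (a, b, c).
wide = [
    ("Chrome", "Chrome", "Chrome"),
    ("Chrome", "Internet Explorer", "Chrome"),
    ("Chrome", "Chrome", "Chrome"),
    ("Firefox", "Firefox", "Chrome"),
    ("Internet Explorer", "Chrome", "Chrome"),
    ("Safari", "Safari", "Chrome"),
]
periods = ("a", "b", "c")

# Wide to long: one (case, period, browser) record per cell,
# which is the shape VARSTOCASES produces.
long_rows = [
    (case_id, period, browser)
    for case_id, row in enumerate(wide, start=1)
    for period, browser in zip(periods, row)
]


def percent_by_period(rows):
    """Share of respondents (in percent) per browser within each period."""
    counts = {}
    for _, period, browser in rows:
        counts.setdefault(period, Counter())[browser] += 1
    return {
        period: {b: 100.0 * n / sum(c.values()) for b, n in c.items()}
        for period, c in counts.items()
    }


shares = percent_by_period(long_rows)
for period in periods:
    print(period, shares[period])
```

Each period's dictionary is one x position on the line chart, and each browser is one line. Plotting counts (as in the GGRAPH example) or these percentages gives the same shape here, since every period has the same number of respondents.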
With SPSS v.25, I'm trying to add (via syntax) descriptive text to a histogram's descriptive stats. So in the area of the graph that shows the Mean, Std. Dev., and N, is there a way to add text (again, via syntax)? Here's where I'm at:
GGRAPH
/GRAPHDATASET NAME="graphdataset" VARIABLES=TS_Pre_Raw_Sq_21 TS_Post_Raw_Sq_22 MISSING=LISTWISE
REPORTMISSING=NO
/GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
SOURCE: s=userSource(id("graphdataset"))
DATA: TS_Pre_Raw_Sq_21=col(source(s), name("TS_Pre_Raw_Sq_21"))
DATA: TS_Post_Raw_Sq_22=col(source(s), name("TS_Post_Raw_Sq_22"))
GUIDE: axis(dim(1), label("Team STEPPS Pre-Test Raw Score 2018 Fall"))
GUIDE: axis(dim(2), label("Frequency"))
GUIDE: text.title(label("Team STEPPS Analysis"))
GUIDE: text.subtitle(label("Insert Term Here; e.g, Fall 2018"))
GUIDE: text.subsubtitle(label("Insert Cohort"))
GUIDE: text.footnote(label("Pre Test: Green Post Test: Blue"))
GUIDE: legend(aesthetic(aesthetic.color), label("Gender"))
ELEMENT: interval(position(summary.count(bin.rect(TS_Pre_Raw_Sq_21))),
shape.interior(shape.square), color(color.green), transparency.interior(transparency."0.6"))
ELEMENT: interval(position(summary.count(bin.rect(TS_Post_Raw_Sq_22))),
shape.interior(shape.square), color(color.blue), transparency.interior(transparency."0.8"))
ELEMENT: line(position(density.normal(TS_Pre_Raw_Sq_21)))
ELEMENT: line(position(density.normal(TS_Post_Raw_Sq_22)))
END GPL.
And this is what I get:
So I have forced distributions from two variables into the graph, but the descriptive stats aren't labeled; I'd like to be able to show which stats correspond to which distribution.
Thanks in advance!
I created a line diagram with multiple lines with the Chart Builder in SPSS.
Within the Chart Editor I changed the line style from "color" to "dash". I saved the style as a template to apply it to further similar line charts. However the template doesn't seem to be applied, the lines are still colored and not dashed.
Is there a way to tell SPSS in the Syntax to apply a dashed line style from template?
Yes, you have to tell SPSS inside the GPL statement that you want to use a dashed style.
So let's assume you created the following chart from the 'breakfast.sav' sample file:
GGRAPH
/GRAPHDATASET NAME="graphdataset" VARIABLES=BT COUNT()[name="COUNT"]
gender[LEVEL=NOMINAL] MISSING=LISTWISE REPORTMISSING=NO
/GRAPHSPEC SOURCE=INLINE TEMPLATE = "$HOME/SPSS/linediagram.sgt".
BEGIN GPL
SOURCE: s=userSource(id("graphdataset"))
DATA: BT=col(source(s), name("BT"), unit.category())
DATA: COUNT=col(source(s), name("COUNT"))
DATA: gender=col(source(s), name("gender"), unit.category())
GUIDE: axis(dim(1), label("Buttered toast"))
GUIDE: axis(dim(2), label("Percent"))
GUIDE: legend(aesthetic(aesthetic.color.interior), label("Gender"))
SCALE: linear(dim(2), include(0))
SCALE: cat(aesthetic(aesthetic.color.interior), include("1", "2"))
ELEMENT: line(position(summary.percent(BT*COUNT,
base.aesthetic(aesthetic(aesthetic.color.interior)))),
color.interior(gender), missing.wings())
END GPL.
Now within the ELEMENT statement you need to change both color.interior functions into shape.interior. So the statement would look like this.
ELEMENT: line(position(summary.percent(BT*COUNT,
base.aesthetic(aesthetic(aesthetic.shape.interior)))),
shape.interior(gender), missing.wings())
This turns the colored lines into black dashed lines.
If you want colored and dashed lines, just add the shape.interior(gender) function to the existing ELEMENT statement:
ELEMENT: line(position(summary.percent(BT*COUNT,
base.aesthetic(aesthetic(aesthetic.color.interior)))),
color.interior(gender), shape.interior(gender), missing.wings())
I thought the point was to add these settings. But if you don't want them, just delete the aesthetic, color, and shape function references.
I have a graph that works outside of a loop, but when it is included in a loop, I get the message "running inline gpl" and the error message "GPL error: id('graphdataset') not a quoted string: 'graphdataset'." Is there a special way to run graphs in a loop that I am missing?
DEFINE !ess1 (inum=!charend ('/')
/ iname=!charend ('/')
/ iname2=!charend ('/')
/ g1=!charend ('/')
/ g2=!charend ('/')
/ g3=!charend ('/')
/ g4=!charend ('/')).
RECODE INST
(!inum=1)
( !g1 = 2)
(!g2= 3)
(!g3=4)
(!g4=5)
into cgroup.
MISSING VALUES cgroup(-9).
variable labels cgroup 'Comparison Group'.
value labels cgroup 1 !iname2 2 'Thing1' 3 'Thing2' 4 'Thing3' 5 'Thing4'.
EXECUTE.
USE ALL.
VARIABLE LEVEL ALL (NOMINAL).
CTABLES
/VLABELS VARIABLES=satisf cgroup DISPLAY=DEFAULT
/TABLE cgroup [ROWPCT.COUNT PCT40.1] BY satisf
/SLABELS VISIBLE=NO
/CATEGORIES VARIABLES=satisf cgroup ORDER=A KEY=VALUE EMPTY=INCLUDE TOTAL=YES LABEL="Overall" POSITION=AFTER
MISSING=EXCLUDE
/TITLES
TITLE= 'Overall, how satisfied have you been with this example syntax?'.
RENAME VARIABLES (sinstql sdiscus sadvising sadmresp ssoclife scampcom slivecom=Var1 Var2 Var3 Var4 Var5 Var6 Var7).
varstocases
/make Likert From Var1 to Var7
/index Question (Likert).
*I need to make a variable to panel by.
compute panel = 0.
if Likert > 2 panel = 1.
*Aggregate N per question.
AGGREGATE OUTFILE=* MODE=ADDVARIABLES
/BREAK Question
/TotalPerQ = N.
*Use trans to make a percent.
GGRAPH
/GRAPHDATASET NAME="graphdataset" VARIABLES=Question MEAN(TotalPerQ)[name="MeanTotalPerQ"] COUNT()[name="COUNT"] Likert panel
MISSING=LISTWISE REPORTMISSING=NO
/GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
SOURCE: s=userSource(id("graphdataset"))
COORD: transpose(mirror(rect(dim(1,2))))
DATA: Question=col(source(s), name("Question"), unit.category())
DATA: COUNT=col(source(s), name("COUNT"))
DATA: MeanTotalPerQ=col(source(s), name("MeanTotalPerQ"))
DATA: Likert=col(source(s), name("Likert"), unit.category())
DATA: panel=col(source(s), name("panel"), unit.category())
TRANS: Perc = eval((COUNT/MeanTotalPerQ)*100)
GUIDE: axis(dim(1), label("Satisfaction"))
GUIDE: axis(dim(3), null(), gap(0px))
GUIDE: legend(aesthetic(aesthetic.color.interior), null())
SCALE: linear(dim(2), include(0))
SCALE: cat(aesthetic(aesthetic.color.interior), sort.values("1","2","4","3"), map(("1", color.red), ("2", color.lightpink), ("3", color.lightgreen), ("4", color.green)))
ELEMENT: interval.stack(position(Question*Perc*panel), color.interior(Likert),shape.interior(shape.square),transparency.exterior(transparency."1"))
END GPL.
******************************.
dataset close *.
get file= 'FilePath.sav'.
DELETE VARIABLES cgroup.
OUTPUT EXPORT
/CONTENTS EXPORT=VISIBLE LAYERS=PRINTSETTING MODELVIEWS=PRINTSETTING
/PDF DOCUMENTFILE=!Quote(!Concat('Y:\Surveys\ESS\2015\COFHE ESS 2015 Comparison Report ',!iname,'.pdf'))
EMBEDBOOKMARKS=YES EMBEDFONTS=YES.
OUTPUT SAVE
OUTFILE=!Quote(!Concat('Y:\Surveys\ESS\2015\COFHE ESS 2015 Comparison Report ',!iname,'.spv')).
OUTPUT CLOSE *.
OUTPUT NEW.
!ENDDEFINE.
!ess1 inum=1/iname=Name1/ iname2='Name1'/g1= 2,3,4,5,6,7,8,9,10,11,12,13 /g2= 21,22,23,24,25,26,27,29/g3=31,32,33,34,35,36,37,38/g4=41,42,43,44,45/.
!ess1 inum=2 /iname=Name2 /iname2='Name2'/g1= 1,3,4,5,6,7,8,9,10,11,12,13 /g2= 21,22,23,24,25,26,27,29/g3=31,32,33,34,35,36,37,38/g4=41,42,43,44,45/.
!ess1 inum=3 /iname=Name3 /iname2='Name3'/g1= 1,2,4,5,6,7,8,9,10,11,12,13 /g2= 21,22,23,24,25,26,27,29/g3=31,32,33,34,35,36,37,38/g4=41,42,43,44,45/.
Macros are not supported with GPL, because the GPL syntax doesn't follow standard SPSS Statistics syntax and macro expansion would be unreliable. Sometimes it would work, but Python programmability is the appropriate mechanism for this.
A demonstration on how to do this can be found here
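As a rough illustration of the Python route (a sketch under stated assumptions, not the linked demonstration), the idea is to build the GGRAPH and GPL block as an ordinary string once per institution and hand it to SPSS. The helper names `build_syntax` and `run_all` are hypothetical, the chart is deliberately simplified to a plain bar chart of `Question`, and the only SPSS call assumed is `spss.Submit` from the SPSS Python integration plug-in.

```python
# Hypothetical template: a simplified chart with the institution name in the title.
# No curly braces appear in the GPL itself, so str.format is safe to use here.
GGRAPH_TEMPLATE = """\
GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=Question COUNT()[name="COUNT"]
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: Question=col(source(s), name("Question"), unit.category())
  DATA: COUNT=col(source(s), name("COUNT"))
  GUIDE: text.title(label("Comparison report: {name}"))
  ELEMENT: interval(position(Question*COUNT))
END GPL.
"""


def build_syntax(name):
    """Return the chart syntax for one institution (illustrative helper)."""
    return GGRAPH_TEMPLATE.format(name=name)


def run_all(names, submit):
    """Call submit(syntax) once per institution name."""
    for name in names:
        submit(build_syntax(name))


# Outside SPSS, print the generated syntax to inspect it.
run_all(["Name1", "Name2", "Name3"], print)

# Inside SPSS this would sit between BEGIN PROGRAM PYTHON3. and END PROGRAM.:
#   import spss
#   run_all(["Name1", "Name2", "Name3"], spss.Submit)
```

Because the quoted `"graphdataset"` string is written out by Python rather than expanded by the macro processor, the GPL block reaches SPSS unchanged. The RECODE, CTABLES, and OUTPUT EXPORT steps from the macro could be templated the same way; names containing double quotes would need escaping before being placed in the title.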
I'm working through the charting and comparing prices tutorial on tryfsharp.org, and the Chart.Combine function from the FSharp.Charting library will not work, although other charts, such as Chart.Line, do work. Code below.
// Helper function returns dates & closing prices from 2012
let recentPrices symbol =
    let data = stockData symbol (DateTime(2012,1,1)) DateTime.Now
    [ for row in data.Data -> row.Date.DayOfYear, row.Close ]
Chart.Line(recentPrices "AAPL", Name="Apple") //These two guys work when I try to plot them.
Chart.Line(recentPrices "MSFT", Name="Microsoft")
Chart.Combine( // This guy will not plot. Syntax found here: http://fsharp.github.io/FSharp.Charting/PointAndLineCharts.html
    [ Chart.Line(recentPrices "AAPL", Name="Apple")
      Chart.Line(recentPrices "MSFT", Name="Microsoft")])
I'd suggest replacing your data generator function with something simpler and getting the plot to work with this mock first. For example, the following script:
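#load @"<your path here>\FSharp.Charting.fsx"
open System
open FSharp.Charting
// Random must be instantiated; a bare System.Random is a type name, not a value.
let rand = System.Random()
// Mock generator: one random "price" for the first day of each month of 2012.
let recentPricesMock symbol =
    [for i in 1..12 -> DateTime(2012,i,1),rand.Next(100)]
Chart.Combine (
    [ Chart.Line(recentPricesMock "AAPL", Name="Apple")
      Chart.Line(recentPricesMock "MSFT", Name="Microsoft")])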
should plot the combined mock chart without any problems, as it does on my local box. From here you can drill down to the cause of the original problem by comparing your recentPrices with recentPricesMock.
EDIT: after getting the full problematic source code, I can point out two problems there that, as I expected, lie in your choice of data rather than in charting per se:
First, your definition of recentPrices converts dates into a sequential day of year (row.Date.DayOfYear), so the transition from 2012 into 2013 messes up your data and, consequently, the charts. If you want to preserve your current functionality, then it makes sense to redefine recentPrices as below:
let recentPrices symbol =
    let data = stockData symbol (DateTime(2012,1,1)) DateTime.Now
    [ for row in data.Data -> row.Date, row.Close ]
Second, you chose a pair of stocks that does not scale well when combined on a single chart (AAPL in the high hundreds of dollars, MSFT in the low tens), which compounds the repeated data points from the first problem. After changing AAPL to YHOO in your code, in addition to the recentPrices change described above,
Chart.Combine ([
    Chart.Line(recentPrices "YHOO", Name="Yahoo")
    Chart.Line(recentPrices "MSFT", Name="Microsoft")
    ])
yields a beautiful smooth chart combo:
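The first problem (day-of-year values repeating once the data crosses into a new year) is easy to check numerically. The sketch below is in Python rather than F#, and it assumes one data point per calendar day from 2012-01-01 to 2013-03-01; real price data would only include trading days and a different end date, so the exact counts are illustrative.

```python
from datetime import date, timedelta


def daily_dates(start, end):
    """Every calendar date from start to end, inclusive."""
    return [start + timedelta(days=n) for n in range((end - start).days + 1)]


dates = daily_dates(date(2012, 1, 1), date(2013, 3, 1))

# x values as in the original recentPrices: day of year only.
day_of_year = [d.timetuple().tm_yday for d in dates]

# Number of points whose x value was already used by an earlier date.
collisions = len(day_of_year) - len(set(day_of_year))

print(len(dates), "dates,", len(set(day_of_year)),
      "distinct day-of-year values,", collisions, "collisions")
```

Every collision is a second y value at an x position that is already taken, which is why the line doubles back on itself. Using the full date as the x value keeps all points distinct.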
I am working on a data set that is made up of multiple response questions. I would like to run a frequency count against all the variables and merge the graphs so that the result displays the percentage of people who checked each box. I cannot figure out how to get SPSS to do multiple counts and merge the output graphs. Does anyone have some insight?
The data set is set up like this:
q1 q2 q3 q4 q5
1 - 1 1 1
1 1 1 1 1
1 1 - 1 1
1 - - 1 -
So the graph I am trying to output will have one bar per variable, like this:
q1==== 100%
q2== 50%
q3== 50%
q4==== 100%
q5=== 75%
I have tried merging the responses into one variable, but that results in misaligned data. Can this be achieved through recoding?
To illustrate Jon's and Lanelor's excellent advice, to start with your data;
data list fixed / q1 TO q5 1-5.
begin data
1 111
11111
11 11
1  1
end data.
dataset name mr.
I would typically not keep this as missing data, but recode to zero where a value is absent (this changes how cases are treated in charts - so it does make a difference);
recode q1 TO q5 (SYSMIS = 0).
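As a quick check of what the recode buys you, here is a small sketch in Python rather than SPSS syntax, using the four sample rows from the question with `None` standing in for the missing cells. Once absent answers are coded 0, the percentage who checked a box is simply the mean of the 0/1 column times 100. The variable names are illustrative.

```python
# Sample rows from the question: 1 = box checked, None = no answer recorded.
answers = [
    (1, None, 1, 1, 1),
    (1, 1, 1, 1, 1),
    (1, 1, None, 1, 1),
    (1, None, None, 1, None),
]

# Equivalent of: recode q1 TO q5 (SYSMIS = 0).
recoded = [[0 if value is None else value for value in row] for row in answers]

# With 0/1 coding, percent checked = 100 * column mean, with all cases in the base.
percent_checked = {
    f"q{i + 1}": 100.0 * sum(column) / len(column)
    for i, column in enumerate(zip(*recoded))
}

for question, pct in percent_checked.items():
    print(question, pct)
```

The results match the percentages listed in the question (100, 50, 50, 100, 75). Without the recode, a case with a missing value drops out of the denominator for that variable, and every non-missing column would show 100%.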
Then you can define a multiple response set and include it in graphs built through the Chart Builder.
* Define Multiple Response Sets.
MRSETS
/MDGROUP NAME=$qs CATEGORYLABELS=VARLABELS VARIABLES=q1 q2 q3 q4 q5 VALUE=1
/DISPLAY NAME=[$qs].
*Make the chart - can use chart builder GGRAPH to include multiple response sets.
GGRAPH
/GRAPHDATASET NAME="graphdataset" VARIABLES=$qs[name="qs"] COUNT()[name=
"COUNT"] MISSING=LISTWISE REPORTMISSING=NO
/GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
SOURCE: s=userSource(id("graphdataset"))
DATA: qs=col(source(s), name("qs"), unit.category())
DATA: COUNT=col(source(s), name("COUNT"))
GUIDE: axis(dim(1), label("$qs"))
GUIDE: axis(dim(2), label("Count"))
SCALE: cat(dim(1), include("q1", "q2", "q3", "q4", "q5"))
SCALE: linear(dim(2), include(0))
ELEMENT: interval(position(qs*COUNT), shape.interior(shape.square))
END GPL.
Similarly, if creating the table suggested by Lanelor;
MULT RESPONSE GROUPS=$q1toq5 (q1 q2 q3 q4 q5 (1))
/FREQUENCIES=$q1toq5.
You can select the desired statistics within the table, and then right-click and produce a chart from those selections (and after the screen shot it includes the chart it produces on my machine with my personal chart template);
GGRAPH and the MRSETS commands are more powerful and allow more customization over the plots, but the suggestion by Lanelor is fine for some quick EDA.
Instead of MULT RESPONSE, use Data > Define Multiple Response Sets. Then you can use the mult response variable in the Chart Builder and, if you have the Custom Tables option, you can use it in constructing tables as well. The set definitions defined this way cannot be used in the MULT RESPONSE procedure, however.
From the menu: Analyze->Multiple Response->Define Variable Set->Move to "Selected" q1 to q5, check dichotomy type and enter what number to be counted (in the example, that is 1). Choose a name and confirm. Then Analyze->Multiple Response->Frequencies-> /name of the created set/.
If you have to repeat for many variables look up the syntax coding in SPSS, like:
MULT RESPONSE GROUPS=$q1toq5 (q1 q2 q3 q4 q5 (1))
/FREQUENCIES=$q1toq5.