Overlaying graphs

John F Hall

I ran a training workshop, The Beginners’ [Clods’] Guide to Survey Analysis Using SPSS, for ASSESS (SPSS users in Europe) at York University (UK) on 31st Oct.  The session lasted 2¾ hours (SPSS in 165 minutes!) and the emphasis was on using syntax for basic operations (file building and data management) and on tabulation with %% for analysis.  The underlying logic for analysis was to see what happens to a dependent variable when tabulated against an independent variable, then again when controlling for a third test variable.

There wasn’t time to cover any graphics as such (except barchart and histogram from FREQUENCIES) so I was writing some supplementary notes for the researchers and PhD students who attended, and started playing around with Graphs >> Chart Builder.

I can produce a clustered barchart for an ordinal variable “sexism” with boys and girls displayed side by side in alternating colours, but is there a way to overlay the line/area charts from histograms for the scale variable height (in metres) of men and women?

Using the GUI produces two separate graphs.  The nearest I can get so far is a population pyramid, or two line graphs on the same chart using sex as a row grouping; at least then the height axis is the same for both men and women.  Separate histograms from FREQUENCIES are displayed on different scales and can’t easily be compared.

John F Hall (Mr)
[Retired academic survey researcher]

Email:   [hidden email]
Website: www.surveyresearch.weebly.com
SPSS start page:  www.surveyresearch.weebly.com/1-survey-analysis-workshop

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

Re: Overlaying graphs

Jignesh Sutar

Re: Overlaying graphs

John F Hall

Jignesh

Thanks for this.  The stacked histogram is pretty close (I’d already found that after I mailed the list) but I really need the bars to be clustered, not stacked, as in a clustered barchart.  From the SPSS help pages, it looks as if there is no way of overlaying two line charts, but I’d be very surprised if there isn’t one.  I’ll try copying them to Word and see if I can do a workaround using Snip and changing the transparencies and/or text-wrapping settings.

ASSESS 2014 at York went well, but I’ve sent a separate mail to the list as it’s well off topic.

John

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Jignesh Sutar
Sent: 06 November 2014 08:53
To: [hidden email]
Subject: Re: Overlaying graphs

Hi John,

Hope the workshop went well.

You can explore some example graphs from this link:

http://pic.dhe.ibm.com/infocenter/spssstat/v21r0m0/index.jsp?topic=%2Fcom.ibm.spss.statistics.help%2Fvizml_examples_overview.htm

Is this close to what you are trying to achieve?

http://pic.dhe.ibm.com/infocenter/spssstat/v21r0m0/index.jsp?topic=%2Fcom.ibm.spss.statistics.help%2Fgpl_examples_barcharts_histogram_stack.htm

<http://spssx-discussion.1045642.n5.nabble.com/file/n5727814/histogramstack.png>

--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Overlaying-graphs-tp5727813p5727814.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

 


Re: Overlaying graphs

Albert-Jan Roskam-2
In reply to this post by John F Hall
Clustered bar graph in spss/R:
http://postimg.org/image/ghi832szj/IMHO

* sample data.
set rng=mt mtindex=43210.
input program.
+numeric id (n9) gender (f1) year (f4).
+loop id = 1 to 100.
+    compute gender = trunc(rv.uniform(0, 2)).
+    compute year = trunc(rv.uniform(1990, 2000)).
+end case.
+end loop.
+end file.
end input program.
if ( missing(id) ) id = lag(id).
execute.
 
* code for clustered bar graph.
begin program r.
# Install ggplot2 on first use, then load it.
if (!requireNamespace("ggplot2", quietly = TRUE)) install.packages("ggplot2")
library(ggplot2)
# Read the active SPSS dataset into a data frame.
df <- spssdata.GetDataFromSPSS()
p <- ggplot(df)
# geom_bar counts cases per year; position="dodge" puts the genders side by side.
p <- p + geom_bar(aes(factor(year), fill=factor(gender)), position="dodge")
p <- p + xlab("Year") + ylab("Count") + ggtitle("Cool graph\n")
p <- p + scale_fill_discrete(name="Gender", breaks=c("0", "1"), labels=c("male","female"))
# Save to a temporary PNG and send it to the SPSS Viewer.
graph <- file.path(tempdir(), "graph.png")
ggsave(filename=graph, plot=p, dpi=100)
spssRGraphics.Submit(graph)
end program.



Regards,
Albert-Jan


Re: Overlaying graphs

John F Hall
Albert-Jan

Showing that to students would have emptied the room!  Double-Dutch to me as
well.  However, I'd like to run the code on the two sample data sets.  If I
have a *.sav file open, will it run on that automatically or do I have to do
something else first?

John


Re: Overlaying graphs

Andy W
In reply to this post by Albert-Jan Roskam-2
The example clustered bar chart Albert gives can be easily replicated within the chart builder GUI. But that isn't what I imagined John meant when he was talking about the histograms of height by gender.

Here is some example code to accomplish that, John, with some notes about how I would present it. The GGRAPH code is pretty complicated no matter how you slice it (there are just so many options it can be dizzying), so I would start with boilerplate code you can create within the GUI. Here I start with a histogram of heights for the entire dataset. Then I add in Gender in the second chart, and then make some aesthetic changes in the third chart, which is displayed on Nabble at the end of this post.

To replicate this with your own data you would need to replace the variables "Height" and "Gender" in all places with their respective variable names in your data file (so Find-and-Replace should do just fine).

******************************************************************************************.
*Making fake data.
SET SEED 10.
INPUT PROGRAM.
LOOP #i = 1 TO 1000.
  COMPUTE Gender = RV.BERNOULLI(0.5).
  COMPUTE Height = RV.NORMAL(65,3) + Gender*2.5.
  END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.
DATASET NAME GenHeight.
FORMATS Gender (F1.0) Height (F2.0).
VALUE LABELS Gender 0 'Female' 1 'Male'.

*Default histogram for Height (no Gender), pasted from the GUI.
GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=Height MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: Height=col(source(s), name("Height"))
  GUIDE: axis(dim(1), label("Height"))
  GUIDE: axis(dim(2), label("Frequency"))
  ELEMENT: interval(position(summary.count(bin.rect(Height))), shape.interior(shape.square))
END GPL.


*Now add in Gender stuff.
*A in GRAPHDATASET line add in "Gender".
*B add in "DATA: Gender" in inline GPL.
*C  change the ELEMENT to line, and then add in ", color(Gender)"  before the last parenthesis.
GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=Height Gender MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: Height=col(source(s), name("Height"))
  DATA: Gender=col(source(s), name("Gender"), unit.category())
  GUIDE: axis(dim(1), label("Height"))
  GUIDE: axis(dim(2), label("Frequency"))
  ELEMENT: line(position(summary.count(bin.rect(Height*1))), color(Gender))
END GPL.


*Now make it look a little nicer.

*Overlapping areas with semi-transparency can be a little nicer.
*Not sure offhand if smooth.step should be center, left, or right?.
*But this makes the relationship to a histogram more apparent.

*Steps - these are all edits within the ELEMENT statement.
*A in ELEMENT statement change "line" to "area".
*B add "smooth.step.center()" around the "summary.count()" mess.
*C add "transparency.interior(transparency."?")" before the end.
*  range is between [0,1] with 1 being fully transparent.
GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=Height Gender MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: Height=col(source(s), name("Height"))
  DATA: Gender=col(source(s), name("Gender"), unit.category())
  GUIDE: axis(dim(1), label("Height"))
  GUIDE: axis(dim(2), label("Frequency"))
  ELEMENT: area(position(smooth.step.center(summary.count(bin.rect(Height*1)))), color(Gender),
           transparency.interior(transparency."0.5"))
END GPL.

*If you want the areas of the histograms to be normalized it is a bit more difficult and you need to.
*Weight the data, see
http://andrewpwheeler.wordpress.com/2012/04/29/comparing-continuous-distributions-of-unequal-size-groups-in-spss/
*For some examples.
******************************************************************************************.
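
As a rough illustration of the find-and-replace step, here is how the third chart might look for a survey file. The file path and the variable names height and sex are made up for the example (they are not taken from John's data), so substitute your own; the GPL itself is just the area chart above with the names swapped.

```spss
* Hypothetical file and variable names: replace with your own.
GET FILE='C:\data\survey.sav'.
DATASET NAME Survey.

GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=height sex MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: height=col(source(s), name("height"))
  DATA: sex=col(source(s), name("sex"), unit.category())
  GUIDE: axis(dim(1), label("Height (metres)"))
  GUIDE: axis(dim(2), label("Frequency"))
  ELEMENT: area(position(smooth.step.center(summary.count(bin.rect(height*1)))), color(sex),
           transparency.interior(transparency."0.5"))
END GPL.
```

GGRAPH works on the active dataset, so if the .sav file is already open the GET FILE and DATASET NAME lines can be dropped.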

<http://spssx-discussion.1045642.n5.nabble.com/file/n5727819/Histo_Overlap.png>

Andy W
apwheele@gmail.com
http://andrewpwheeler.wordpress.com/

Re: Overlaying graphs

John F Hall
Andy

This is brilliant.  

The first one could perhaps be changed to show %% rather than count (there
were a lot more women than men in the sample).  The second
(semi-transparent) one is the sort of thing I needed to show what happens
when you partition the overall distribution of height in metres into
conditional distributions for sex.

I'll forward the graphs to the class.

You're a star!  Many thanks.

John

PS  I'll send you (off-list) my original efforts


Re: Overlaying graphs

Andy W
I thought off-hand you couldn't do this type of histogram with percents, but some experimentation proved me wrong. Here is an example with that same data for making the histograms percent based. It might wear out the parenthesis keys on the students' keyboards though! (I often edit them in another text program, like Notepad++, which will highlight matching parentheses.)

**********************************.
*Making the histograms percent based.
*A add in "summary.percent.count" instead of just "summary.count".
*B change the label in GUIDE statement from "Frequency" to "Percentage".
*C add in the awful line "base.aesthetic(aesthetic(aesthetic.color.interior))"
   after the "(Height*1)" part.

GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=Height Gender MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: Height=col(source(s), name("Height"))
  DATA: Gender=col(source(s), name("Gender"), unit.category())
  GUIDE: axis(dim(1), label("Height"))
  GUIDE: axis(dim(2), label("Percentage"))
  ELEMENT: area(position(smooth.step.center(summary.percent.count(bin.rect(Height*1),
           base.aesthetic(aesthetic(aesthetic.color.interior))))),
           color(Gender), transparency.interior(transparency."0.5"))
END GPL.
**********************************.
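
A quick way to sanity-check the percent version is to tabulate binned heights by Gender with column percentages, which should be the same within-group base that the base.aesthetic part asks for. This is only a sketch: HeightBin is a made-up helper variable, and its one-unit bins will not necessarily line up with the bins GPL picks automatically, so expect the shapes to agree rather than the exact figures.

```spss
* HeightBin is a hypothetical helper variable: Height cut into one-unit bins.
COMPUTE HeightBin = TRUNC(Height).
FORMATS HeightBin (F3.0).
* Column percentages sum to 100 within each Gender.
CROSSTABS
  /TABLES=HeightBin BY Gender
  /CELLS=COUNT COLUMN.
```

Reading down each Gender column gives the conditional distribution that the percent-based chart draws.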

Another way to get from there to here is to start with the stacked histogram like Jignesh showed, and then edit them to not stack.
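
For comparison, the stacked-histogram route might look something like the sketch below, using the same fake data. It is untested: it assumes the stacked chart uses interval.stack, as in the IBM example Jignesh linked, and that dropping the .stack collision modifier makes the two sets of bars overlap instead of piling up.

```spss
* Untested sketch: overlapping (not stacked) histogram bars by Gender.
* The stacked version would have "interval.stack" in place of "interval" below.
GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=Height Gender MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: Height=col(source(s), name("Height"))
  DATA: Gender=col(source(s), name("Gender"), unit.category())
  GUIDE: axis(dim(1), label("Height"))
  GUIDE: axis(dim(2), label("Frequency"))
  ELEMENT: interval(position(summary.count(bin.rect(Height))), color.interior(Gender),
           transparency.interior(transparency."0.5"))
END GPL.
```

The transparency is there so that the bars drawn behind stay visible.
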
Andy W
apwheele@gmail.com
http://andrewpwheeler.wordpress.com/

Re: Overlaying graphs

Jon K Peck
In reply to this post by John F Hall
Taking a slightly different tack, the syntax below uses the STATS SUBGROUP PLOTS extension command, which is part of the Python Essentials, for a similar purpose.  It produces separate but aligned charts for each subgroup (defined by gender here) with the background distribution being the entire population.  This has the advantage of generalizing to more than two groups and allows simultaneous comparison of multiple variables.  I used a kernel smoother on the distributions here, but the command also supports straight area and bar charts or histograms.

This extension command works by generating and executing GPL code.  The code below was generated using Graphs > Compare Subgroups.




STATS SUBGROUP PLOTS SUBGROUP=Gender VARIABLES=Height
/OPTIONS XSIZE=3 YSIZE=3 YSCALE=90 ALLDATACOLOR=silver SUBGROUPCOLOR=blue TRANSPARENCY=50
ALLDATAPATTERN=solid SUBGROUPPATTERN=solid BINCOUNT=20 SMOOTHPROP=.05
MISSING=VARIABLEWISE HISTOGRAM=KERNEL.


Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621




From:        John F Hall <[hidden email]>
To:        [hidden email]
Date:        11/06/2014 07:18 AM
Subject:        Re: [SPSSX-L] Overlaying graphs
Sent by:        "SPSSX(r) Discussion" <[hidden email]>




Andy

This is brilliant.  

The first one could perhaps be changed to show %% rather than count (there
were a lot more women than men in the sample).  The second
(semi-transparent) one is the sort of thing I needed to show what happens
when you partition the overall distribution of height in metres into
conditional distributions for sex.

I'll forward the graphs to the class.

You're a star!  Many thanks.

John

John F Hall (Mr)
[Retired academic survey researcher]

Email:   [hidden email]  
Website:
www.surveyresearch.weebly.com  
SPSS start page:  
www.surveyresearch.weebly.com/1-survey-analysis-workshop

PS  I'll send you (off-list) my original efforts

-----Original Message-----
From: SPSSX(r) Discussion [
[hidden email]] On Behalf Of
Andy W
Sent: 06 November 2014 14:12
To: [hidden email]
Subject: Re: Overlaying graphs

The example clustered bar chart Albert gives can be easily replicated within
the chart builder GUI. But that isn't what I imagined John meant when he was
talking about the histograms of height by gender.

Here is some example code to accomplish that John, with some notes about how
I would present it. The GGRAPH code is pretty complicated no matter how you
slice it (there are just so many options it can be dizzying). So I would
start with boilerplate code you can create within the GUI, here I start with
a histogram of heights for the entire dataset. Then I add in Gender in the
second chart, and then make some aesthetic changes in the third chart, which
is displayed on NABBLE at the end of this post.

To replicate this with your own data you would need to replace the variables
"Height" and "Gender" in all places with their respective variable names in
your data file (so Find-and-Replace should do just fine).

****************************************************************************
**************.
*Making fake data.
SET SEED 10.
INPUT PROGRAM.
LOOP #i = 1 TO 1000.
 COMPUTE Gender = RV.BERNOULLI(0.5).
 COMPUTE Height = RV.NORMAL(65,3) + Gender*2.5.
 END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.
DATASET NAME GenHeight.
FORMATS Gender (F1.0) Height (F2.0).
VALUE LABELS Gender 0 'Female' 1 'Male'.

*Default histogram for Height, no gender pasted from the GUI.
GGRAPH
 /GRAPHDATASET NAME="graphdataset" VARIABLES=Height MISSING=LISTWISE
REPORTMISSING=NO
 /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
 SOURCE: s=userSource(id("graphdataset"))
 DATA: Height=col(source(s), name("Height"))
 GUIDE: axis(dim(1), label("Height"))
 GUIDE: axis(dim(2), label("Frequency"))
 ELEMENT: interval(position(summary.count(bin.rect(Height))),
shape.interior(shape.square))
END GPL.


*Now add in Gender stuff.
*A in GRAPHDATASET line add in "Gender".
*B add in "DATA: Gender" in inline GPL.
*C  change the ELEMENT to line, and then add in ", color(Gender)"  before
the last parenthesis.
GGRAPH
 /GRAPHDATASET NAME="graphdataset" VARIABLES=Height Gender MISSING=LISTWISE
REPORTMISSING=NO
 /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
 SOURCE: s=userSource(id("graphdataset"))
 DATA: Height=col(source(s), name("Height"))
 DATA: Gender=col(source(s), name("Gender"), unit.category())
 GUIDE: axis(dim(1), label("Height"))
 GUIDE: axis(dim(2), label("Frequency"))
 ELEMENT: line(position(summary.count(bin.rect(Height*1))), color(Gender))
END GPL.


*Now to make it look a little nicer.

*Overlapping areas with semi-transparency can be a little nicer.
*Not sure offhand whether smooth.step should be center, left, or right.
*But this makes the relationship to a histogram more apparent.

*Steps - these are all edits within the ELEMENT statement.
*A in the ELEMENT statement change "line" to "area".
*B add "smooth.step.center()" around the "summary.count()" mess.
*C add "transparency.interior(transparency."?")" before the end.
*  range is between [0,1] with 1 being fully transparent.
GGRAPH
 /GRAPHDATASET NAME="graphdataset" VARIABLES=Height Gender MISSING=LISTWISE REPORTMISSING=NO
 /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
 SOURCE: s=userSource(id("graphdataset"))
 DATA: Height=col(source(s), name("Height"))
 DATA: Gender=col(source(s), name("Gender"), unit.category())
 GUIDE: axis(dim(1), label("Height"))
 GUIDE: axis(dim(2), label("Frequency"))
 ELEMENT: area(position(smooth.step.center(summary.count(bin.rect(Height*1)))), color(Gender),
          transparency.interior(transparency."0.5"))
END GPL.

*If you want the areas of the histograms to be normalized, it is a bit more difficult:
 you need to weight the data. For some examples see
 http://andrewpwheeler.wordpress.com/2012/04/29/comparing-continuous-distributions-of-unequal-size-groups-in-spss/ .
******************************************************************************************.
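
A minimal sketch of the weighting idea mentioned in the comment above. It is not taken from the linked post and has not been run here. Each case gets a weight of 100 divided by its group size, so the weighted counts for each gender sum to 100 and the two areas become comparable. GroupN and HistWeight are hypothetical names introduced for illustration, and the sketch assumes GGRAPH honours fractional case weights when counting.

```spss
*Hypothetical names: GroupN and HistWeight are not from the original post.
*Attach the (unweighted) group size to every case.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=Gender
  /GroupN=N.
*Each group then sums to 100 when weighted.
COMPUTE HistWeight = 100/GroupN.
WEIGHT BY HistWeight.

GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=Height Gender MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: Height=col(source(s), name("Height"))
  DATA: Gender=col(source(s), name("Gender"), unit.category())
  GUIDE: axis(dim(1), label("Height"))
  GUIDE: axis(dim(2), label("Percent within gender"))
  ELEMENT: area(position(smooth.step.center(summary.count(bin.rect(Height*1)))), color(Gender),
           transparency.interior(transparency."0.5"))
END GPL.

*Switch weighting off again so later procedures are unaffected.
WEIGHT OFF.
```

The only change to the chart itself is the axis label; the normalization comes entirely from the weight, which is why WEIGHT OFF follows the chart.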

<http://spssx-discussion.1045642.n5.nabble.com/file/n5727819/Histo_Overlap.png>



-----
Andy W
[hidden email]
http://andrewpwheeler.wordpress.com/
--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Overlaying-graphs-tp5727813p5727819.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.


Re: Overlaying graphs

John F Hall

Jon

I substituted sex for gender in your syntax:

STATS SUBGROUP PLOTS SUBGROUP=sex VARIABLES=Height
/OPTIONS XSIZE=3 YSIZE=3 YSCALE=90 ALLDATACOLOR=silver SUBGROUPCOLOR=blue TRANSPARENCY=50
ALLDATAPATTERN=solid SUBGROUPPATTERN=solid BINCOUNT=20 SMOOTHPROP=.05
MISSING=VARIABLEWISE HISTOGRAM=KERNEL.

. . ran it (several times):

Chart Information

Settings                     Value
Subgroups Defined by         sex
Missing Value Treatment      variable by variable
Color for Entire Sample      silver
Color for Subgroups          blue
Pattern for Entire Sample    solid
Pattern for Subgroups        solid

Settings for the charts that follow

. . but keep getting a message

Warnings

Variable not in this dictionary

 

The single line:

STATS SUBGROUP PLOTS SUBGROUP=sex VARIABLES=height .

works.  So does the default syntax pasted from the GUI when I was playing around yesterday:

STATS SUBGROUP PLOTS SUBGROUP=sex VARIABLES=height
/OPTIONS XSIZE=1.75 YSIZE=1.75 YSCALE=90 ALLDATACOLOR=whitesmoke SUBGROUPCOLOR=blue TRANSPARENCY=50
ALLDATAPATTERN=solid SUBGROUPPATTERN=solid BINCOUNT=20 SMOOTHPROP=.05
MISSING=VARIABLEWISE HISTOGRAM=AREA.

What am I doing wrong?

John

 

 

From: Jon K Peck [mailto:[hidden email]]
Sent: 06 November 2014 15:39
To: John F Hall
Cc: [hidden email]
Subject: Re: [SPSSX-L] Overlaying graphs

 

Taking a slightly different tack, the syntax below uses the STATS SUBGROUP PLOTS extension command, which is part of the Python Essentials, for a similar purpose.  It produces separate but aligned charts for each subgroup (defined by gender here) with the background distribution being the entire population.  This has the advantage of generalizing to more than two groups and allows simultaneous comparison of multiple variables.  I used a kernel smoother on the distributions here, but the command also supports straight area and bar charts or histograms.

This extension command works by generating and executing GPL code.  The code below was generated using Graphs > Compare Subgroups.

STATS SUBGROUP PLOTS SUBGROUP=Gender VARIABLES=Height
/OPTIONS XSIZE=3 YSIZE=3 YSCALE=90 ALLDATACOLOR=silver SUBGROUPCOLOR=blue TRANSPARENCY=50
ALLDATAPATTERN=solid SUBGROUPPATTERN=solid BINCOUNT=20 SMOOTHPROP=.05
MISSING=VARIABLEWISE HISTOGRAM=KERNEL.
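
For comparison, a sketch of the straight area variant mentioned above. It assumes the only change needed is the HISTOGRAM keyword; AREA is the value that appears in GUI-pasted syntax elsewhere in this thread. It has not been re-run here, and it needs the STATS SUBGROUP PLOTS extension to be installed.

```spss
*Same command as above with the kernel smoother swapped for plain areas.
*Assumption: no other option has to change when HISTOGRAM changes.
STATS SUBGROUP PLOTS SUBGROUP=Gender VARIABLES=Height
  /OPTIONS XSIZE=3 YSIZE=3 YSCALE=90 ALLDATACOLOR=silver SUBGROUPCOLOR=blue TRANSPARENCY=50
  ALLDATAPATTERN=solid SUBGROUPPATTERN=solid BINCOUNT=20 SMOOTHPROP=.05
  MISSING=VARIABLEWISE HISTOGRAM=AREA.
```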


Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621




From:        John F Hall <[hidden email]>
To:        [hidden email]
Date:        11/06/2014 07:18 AM
Subject:        Re: [SPSSX-L] Overlaying graphs
Sent by:        "SPSSX(r) Discussion" <[hidden email]>





Andy

This is brilliant.  

The first one could perhaps be changed to show %% rather than count (there
were a lot more women than men in the sample).  The second
(semi-transparent) one is the sort of thing I needed to show what happens
when you partition the overall distribution of height in metres into
conditional distributions for sex.

I'll forward the graphs to the class.

You're a star!  Many thanks.

John

John F Hall (Mr)
[Retired academic survey researcher]

Email:   [hidden email]
Website: www.surveyresearch.weebly.com
SPSS start page:  www.surveyresearch.weebly.com/1-survey-analysis-workshop

PS  I'll send you (off-list) my original efforts



Re: Overlaying graphs

John McConnell-2
In reply to this post by John F Hall

Hi John

I don’t have a direct answer … but hopefully a pointer to where the answer probably lies.

As you may know, the philosophy which underlies the graphs in SPSS is the “Grammar of Graphics”, which was invented by Leland Wilkinson:

http://en.wikipedia.org/wiki/Leland_Wilkinson

http://www.amazon.co.uk/The-Grammar-Graphics-Statistics-Computing/dp/0387245448

The principle of the GG is to look at graphical construction from a more elemental level than the traditional chart type (bar, pie, line, etc.).

Of course SPSS has simplified the power of GG into a set of options available in the UI of the Chart Builder. But underneath is the powerful GPL language, which is much more flexible.

The Guide to GPL in SPSS is here:

ftp://public.dhe.ibm.com/software/analytics/spss/documentation/statistics/20.0/en/client/Manuals/GPL_Reference_Guide_for_IBM_SPSS_Statistics.pdf
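
As an illustration of that elemental view, here is a small sketch assuming a scale variable named Height in the active dataset (the name is borrowed from the example data used elsewhere in this thread). The SOURCE, DATA and GUIDE statements are identical in both charts; only the ELEMENT statement changes, turning bars into a line over the same bins.

```spss
*Chart 1: the binned counts drawn as bars (a histogram).
GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=Height MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: Height=col(source(s), name("Height"))
  GUIDE: axis(dim(1), label("Height"))
  GUIDE: axis(dim(2), label("Frequency"))
  ELEMENT: interval(position(summary.count(bin.rect(Height))))
END GPL.

*Chart 2: the same binned counts drawn as a line instead.
GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=Height MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: Height=col(source(s), name("Height"))
  GUIDE: axis(dim(1), label("Height"))
  GUIDE: axis(dim(2), label("Frequency"))
  ELEMENT: line(position(summary.count(bin.rect(Height*1))))
END GPL.
```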

 

HTH

John

 

John McConnell | analytical-people | Gable House, 18-24 Turnham Green Terrace, London, UK, W4 1QP| t. +44 (0)845 680 1871

 

 

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of John F Hall
Sent: 06 November 2014 11:14
To: [hidden email]
Subject: Re: Overlaying graphs

 

Jignesh

 

Thanks for this.  The stacked histogram is pretty close (I’d already found that after I mailed the list) but I really need the bars to be clustered, not stacked, as in a clustered barchart.  From the SPSS help pages, it looks as if there is no way of overlaying two line charts, but I’d be very surprised if there isn’t one.  I’ll try copying them to Word and see if I can do a workaround using Snip and changing the transparencies and/or text-wrapping settings.

 

ASSESS 2014 at York went well, but I’ve sent a separate mail to the list as it’s well off topic.

 

John

 

John F Hall (Mr)
[Retired academic survey researcher]

Email:   [hidden email]
Website: www.surveyresearch.weebly.com
SPSS start page:  www.surveyresearch.weebly.com/1-survey-analysis-workshop

-----Original Message-----
From: SPSSX(r) Discussion [[hidden email]] On Behalf Of Jignesh Sutar
Sent: 06 November 2014 08:53
To: [hidden email]
Subject: Re: Overlaying graphs

 

Hi John,

Hope the workshop went well.

You can explore some example graphs from this link:

http://pic.dhe.ibm.com/infocenter/spssstat/v21r0m0/index.jsp?topic=%2Fcom.ibm.spss.statistics.help%2Fvizml_examples_overview.htm

Is this close to what you are trying to achieve?

http://pic.dhe.ibm.com/infocenter/spssstat/v21r0m0/index.jsp?topic=%2Fcom.ibm.spss.statistics.help%2Fgpl_examples_barcharts_histogram_stack.htm

<http://spssx-discussion.1045642.n5.nabble.com/file/n5727814/histogramstack.png>

 

 

 

--

View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Overlaying-graphs-tp5727813p5727814.html

Sent from the SPSSX Discussion mailing list archive at Nabble.com.

 


Re: Overlaying graphs

ViAnn Beadle
In reply to this post by John F Hall
OK, I am a little late to the party, but here is a line chart representing
the bins with some very bold lines which is very similar to Andy's. I used
employee data.sav:

GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=salary gender MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: salary=col(source(s), name("salary"))
  DATA: gender=col(source(s), name("gender"), unit.category())
  GUIDE: axis(dim(1), label("Current Salary"))
  GUIDE: axis(dim(2), label("Percent"))
  ELEMENT: line(position(summary.percent.count(bin.rect(salary*gender))), color(gender),
           size(size."15px"))
END GPL.




Re: Overlaying graphs

John F Hall
ViAnn

Thanks for this.  I changed salary to height and gender to sex in your
syntax:

*ViAnn Beadle.
GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=height sex MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: height=col(source(s), name("height"))
  DATA: sex=col(source(s), name("sex"), unit.category())
  GUIDE: axis(dim(1), label("Height in metres"))
  GUIDE: axis(dim(2), label("Percent"))
  ELEMENT: line(position(summary.percent.count(bin.rect(height*sex))), color(sex),
           size(size."15px"))
END GPL.

The result is a bit like the first overlay provided by Andy, which used count
rather than %%, but it makes for a better comparison.  Not sure about the bold
lines, though: simple lines look cleaner. I modified your syntax to omit the size
element and it worked, so I've learned something already!
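
A sketch of what that modification presumably looks like, assuming the only change is dropping the size() aesthetic from the ELEMENT statement so the lines fall back to their default width. This is a reconstruction for illustration, not the exact syntax that was run.

```spss
*Same chart as above, minus size(size."15px") in the ELEMENT statement.
GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=height sex MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: height=col(source(s), name("height"))
  DATA: sex=col(source(s), name("sex"), unit.category())
  GUIDE: axis(dim(1), label("Height in metres"))
  GUIDE: axis(dim(2), label("Percent"))
  ELEMENT: line(position(summary.percent.count(bin.rect(height*sex))), color(sex))
END GPL.
```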

Andy's second one is probably the best so far as it has semi-transparent
bars. I'll send the output off-list to you and the others.

Messages on this thread have been coming in thick and fast.  I accidentally
unsubscribed a few weeks ago and wondered why there was no traffic.  Nice to
be back.

Thanks to everyone for their solutions.

John

