Dear SPSSX Members,
I was wondering if you would recommend SPSSX as a tool for cumulative probability plots? I’m after a software package that will enable me to plot the cumulative probability percentage of a given variable from a large data set against the value of the same variable. This plot enables me to rapidly visualise distinct populations within large sets of geochemical data.
I’m hoping to be able to tag each data point with an id so that I can cross reference the point within the plot with the database.
Does SPSSX have the flexibility to do this work or am I stuck with Excel?
Kind regards,
Clint
|
Clint,
Yes, SPSS (now at version 19, no longer 'X') will do cumulative probability
plots. One option is the EXAMINE command, although as far as I know it does
not support tagging. The other option is constructing graphs with GPL. I have
used GPL, but not for your type of plot. GPL is a graphing language and seems
to be extremely flexible and powerful. The downside is that constructing a
graph means writing a set of commands. I'm sure others on the list who are
more familiar with GPL can give a fuller comment.
At the same time, there are other programs on the market that seem to be
'graphing-primary'. Two I've seen promotional material for are JMP and
DataDesk. Given what you are working with, I would guess there are also more
content-focused graphing/data programs on the market.
Gene Maguin

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of ClintR
Sent: Thursday, June 23, 2011 12:47 AM
To: [hidden email]
Subject: Cumulative Probability Plots
|
In reply to this post by ClintR
Cumulative distributions are often most
usefully viewed as a q-q plot against either a theoretical distribution
or another variable or dataset. q-q plots against theoretical distributions
are built in to SPSS Statistics, and q-q plots against empirical distributions
can be obtained via the SPSSINC QQPLOT2 extension command available from
the SPSS Community website (www.ibm.com/developerworks/spssdevcentral).
If you want a cumulative histogram, you can use GGRAPH. The easiest way to do this is to paste a histogram specification from the Chart Builder and then slightly modify the generated GPL code. Here is an example.

GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=salary MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: salary=col(source(s), name("salary"))
  GUIDE: axis(dim(1), label("Current Salary"))
  GUIDE: axis(dim(2), label("Frequency"))
  ELEMENT: area(position(summary.count.cumulative(bin.rect(salary, binWidth(100)))), missing.wings())
END GPL.

The changes from the pasted code are to change summary.count to summary.count.cumulative and to add the binWidth function to the bin.rect specification. I set the bin width to 100 here, but setting the width is optional.

HTH,
Jon Peck
Senior Software Engineer, IBM
[hidden email]
new phone: 720-342-5621
|
In reply to this post by ClintR
Or, another variation without any aggregation.
GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=salary id MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: salary=col(source(s), name("salary"))
  DATA: id=col(source(s), name("id"))
  GUIDE: axis(dim(1), label("Current Salary"))
  GUIDE: axis(dim(2), label("Frequency"))
  ELEMENT: point(position(summary.count.cumulative(salary)))
END GPL.

Jon Peck
Senior Software Engineer, IBM
|
In reply to this post by ClintR
One more version, this time with a case
id label.
GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=salary id MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: salary=col(source(s), name("salary"))
  DATA: id=col(source(s), name("id"))
  GUIDE: axis(dim(1), label("Current Salary"))
  GUIDE: axis(dim(2), label("Frequency"))
  ELEMENT: point(position(summary.count.cumulative(salary)),label(summary.first(id)))
END GPL.

Jon Peck
Senior Software Engineer, IBM
|
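A side note for anyone who wants to check the numbers outside SPSS: the GPL examples in this thread plot a cumulative count, while the original question asks for a cumulative percentage with an id attached to each point. The conversion is simply the cumulative count divided by the number of cases. The following is a minimal Python sketch of that calculation, not something taken from the replies above; the function name cumulative_percent and the sample ids and values are hypothetical, and cases with tied values are assumed to share the same cumulative percentage.

```python
from bisect import bisect_right


def cumulative_percent(records):
    """Return (case_id, value, cumulative_percent) rows sorted by value.

    records is an iterable of (case_id, value) pairs. The cumulative
    percent of a case is the share of all cases whose value is less than
    or equal to that case's value, so tied values get the same percent.
    """
    records = list(records)
    values = sorted(value for _, value in records)
    n = len(values)
    rows = (
        # bisect_right counts how many sorted values are <= this value.
        (case_id, value, 100.0 * bisect_right(values, value) / n)
        for case_id, value in records
    )
    return sorted(rows, key=lambda row: row[1])


if __name__ == "__main__":
    # Hypothetical sample ids and assay values, for illustration only.
    samples = [("S1", 12.0), ("S2", 3.5), ("S3", 7.25), ("S4", 3.5)]
    for case_id, value, pct in cumulative_percent(samples):
        print(case_id, value, pct)
```

Each output row keeps its case id next to the plotted coordinates, which is the cross-referencing the original question asks about: the value goes on the horizontal axis and the cumulative percent on the vertical axis.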