I have data on the % of clients who met a criterion (format f8.2) for approx. 5000 clinics. The clinics are nested in about 100 Agencies and the Agencies are nested in 7 Regions. Each Clinic ‘belongs’ to only 1 Agency and each Agency is a member of only 1 Region.

Currently the Clinic and Agency IDs are strings, but could easily be changed to numeric values if that makes any difference.

In order to show graphically the variation in % of clients who meet a criterion both within and among Agencies, I want to do violin plots separately for each Region, that is, 7 separate charts, showing the data for each Agency in the Region.

I have some syntax from a colleague for producing violin plots but it’s based on using R as a stand-alone product rather than as an extension of SPSS. Since I’m a newbie at R, I don’t know what to modify in the syntax for use as an SPSS extension.

Any help would be appreciated. Any suggestions for learning R, especially as an SPSS extension, would also be appreciated.

Thanks.
Pat
|
What is a violin plot? Any examples online?
|
A violin plot is a combination of a box plot and a kernel density plot. It is essentially a prettier version of a box plot, where the width at each value is set by the local density. For skewed distributions, you get shapes that look a bit like "violins", hence the name.

Here are links to some examples:
http://en.wikipedia.org/wiki/Violin_plot
http://www.statmethods.net/graphs/boxplot.html
http://www.r-bloggers.com/example-8-11-violin-plots/
http://www2.warwick.ac.uk/fac/sci/moac/degrees/modules/ch923/r_introduction/boxplot/

Pat

From: ViAnn Beadle [mailto:[hidden email]]
What is a violin plot? Any examples online?
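
The description above (a box plot combined with a kernel density plot) can be sketched in base R without any add-on package. This is only an illustration of the idea, not the code of any violin plot package: `draw_violin` and its arguments are hypothetical names, and the data are the exam scores used elsewhere in this thread.

```r
# Sketch only: one violin drawn with base R graphics.
# draw_violin() is a hypothetical helper, not a function from any package.
Exam_Score <- c(100, 98, 97, 56, 78, 86, 45,
                93, 59, 74, 82, 96, 91, 86,
                59, 67, 83, 85, 81, 80, 78,
                82, 95, 88, 95)

draw_violin <- function(x, at = 1, max_half_width = 0.4) {
  x <- x[!is.na(x)]
  d <- density(x)                            # kernel density estimate
  half <- d$y / max(d$y) * max_half_width    # density scaled to a half-width
  # Mirror the density around the vertical line at `at` to get the outline.
  polygon(c(at - half, rev(at + half)), c(d$x, rev(d$x)),
          col = "grey85", border = "grey40")
  # The box plot part: interquartile range as a thick line, median as a dot.
  q <- quantile(x, c(0.25, 0.5, 0.75))
  segments(at, q[1], at, q[3], lwd = 4)
  points(at, q[2], pch = 21, bg = "white")
  invisible(list(density = d, quartiles = q))
}

pdf(NULL)   # null device so the sketch runs anywhere; remove this line to see the plot
plot.new()
plot.window(xlim = c(0.5, 1.5), ylim = range(density(Exam_Score)$x))
axis(2)
box()
res <- draw_violin(Exam_Score)
title(main = "Violin Plot of Exam Scores")
invisible(dev.off())
```

The width of the shape at each score is proportional to the estimated density there, which is why a skewed sample produces the lopsided "violin" outline.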
|
In reply to this post by Cleland, Patricia (EDU)
Everyone:

Attached is an R script on how to generate a violin plot.

They are certainly interesting, but too often they are not understood by the typical reader – at least not in higher education.

Let me know if you wish to receive the graphical images generated by this script. I’ll shy away from including them in this message but I’ll be glad to send them separately if you do not use R.

Best wishes.

Tom

Exam_Score <- c(100, 98, 97, 56, 78, 86, 45,
                93, 59, 74, 82, 96, 91, 86,
                59, 67, 83, 85, 81, 80, 78,
                82, 95, 88, 95)

summary(Exam_Score)

par(ask=TRUE)  # Freeze the screen.
boxplot(Exam_Score, horizontal=TRUE)

# You need to use the external violinmplot package and then
# the violinmplot() function found in this external package.
install.packages("violinmplot")
library(violinmplot)

# Note: It is good R practice to use
# package_name:::function_name() syntax
# when using a function from an external
# package, for future documentation
# purposes. This example is a bit different
# since the function name is violinmplot
# and this function shares the same name
# used for the package.

# However, an oddity of the violinmplot
# package is that it does not use a
# namespace so you only key the function
# name and not the package name.

par(ask=TRUE)  # Freeze the screen.
violinmplot(Exam_Score,
            main="Violin Plot of Exam Scores")

# Dr. Thomas W. MacFarland
# Feb-09-11

-----
Thomas W. MacFarland, Ed.D.
|
In reply to this post by Cleland, Patricia (EDU)
It is very easy to run an R program inside Statistics with the output appearing automatically in the Viewer. Here is an example, assuming that you have installed the R violinmplot package. I've assumed that you use the regular SPSS techniques to select the cases in a region. criterion is the percentage variable and agency, well, the agency. Be careful to match the case of the actual variable names and the functions and parameters below, since everything in R is case sensitive.

begin program r.
library(violinmplot)
dta = spssdata.GetDataFromSPSS("criterion agency", missingValueToNA=TRUE)
violinmplot(criterion~agency, data=dta, horizontal=FALSE)
end program.

There are some chapters on using R in Statistics in the Programming and Data Management book, downloadable from the SPSS Community (www.ibm.com/developerworks/spssdevcentral), that could help you get started.

HTH,
Jon Peck
Senior Software Engineer, IBM
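
As a rough sketch of the "7 separate charts" part of the question, the region split can also be done on the R side rather than with SPSS case selection. Everything here is an assumption for illustration: the data are simulated, the column names `region`, `agency` and `criterion` are placeholders, and `plot_region` is a hypothetical helper that draws base R violins instead of calling violinmplot. Inside a `begin program r.` block the data frame would presumably come from a call such as `spssdata.GetDataFromSPSS("criterion agency region", missingValueToNA=TRUE)`, with `region` being whatever the region variable is actually called.

```r
# Simulated stand-in for the clinic data; column names are assumptions.
set.seed(42)
clinics <- data.frame(
  region    = rep(c("North", "South"), each = 60),
  agency    = rep(c("A01", "A02", "A03", "B01", "B02", "B03"), each = 20),
  criterion = round(runif(120, 0, 100), 2),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one chart for one region, one violin per agency.
# String agency IDs are fine; factor() turns them into groups.
# Note: density() needs at least 2 non-missing values per agency.
plot_region <- function(d, title) {
  d <- d[!is.na(d$criterion), ]
  groups <- split(d$criterion, factor(d$agency))
  # Percentages are bounded, so evaluate the density on 0..100 only.
  dens <- lapply(groups, density, from = 0, to = 100)
  plot.new()
  plot.window(xlim = c(0.5, length(groups) + 0.5), ylim = c(0, 100))
  for (i in seq_along(dens)) {
    half <- dens[[i]]$y / max(dens[[i]]$y) * 0.4
    polygon(c(i - half, rev(i + half)), c(dens[[i]]$x, rev(dens[[i]]$x)),
            col = "grey85", border = "grey40")
    points(i, median(groups[[i]]), pch = 19)   # mark the agency median
  }
  axis(1, at = seq_along(groups), labels = names(groups))
  axis(2)
  box()
  title(main = title, ylab = "% of clients meeting criterion")
  invisible(names(groups))
}

pdf(NULL)   # null device so the sketch runs anywhere; remove to see the charts
by_region <- split(clinics, clinics$region)
# One chart per region; the result records which agencies were drawn on each.
drawn <- Map(function(d, r) plot_region(d, paste("Region:", r)),
             by_region, names(by_region))
invisible(dev.off())
```

With the real data this would produce one chart per region, each showing only the agencies that belong to that region.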
|
Hi Patricia,

I can recommend "R for SAS and SPSS Users" by Robert A. Muenchen (Springer). You can look up SPSS keywords and see the associated R script. www.statmethods.net is also a good site. There are many, many more PDFs about learning R out there. Be careful not to drown in all this information. Much of it is very, VERY esoteric.

Also, if you're going to use external libraries, be aware that many operations can be done in at least 5 ways. I'd try to stick with just one, even if it takes a couple of milliseconds longer to calculate the solution. I mainly use the following packages: Hmisc, foreign, ggplot2, RODBC, stringr, R.utils, plyr, reshape.

Cheers!!
Albert-Jan

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From: Jon K Peck <[hidden email]>
To: [hidden email]
Sent: Wed, February 9, 2011 11:24:01 PM
Subject: Re: [SPSSX-L] need help creating violin plots
|
In reply to this post by Thomas MacFarland
Hello,

Re: your remark about package_name:::function_name()

I always use double, not triple colons. This also works in your example:

violinmplot::violinmplot(Exam_Score, main="Violin Plot of Exam Scores")

Triple colons are used to access internal objects of a package that are not exported, that is, objects that are not designed to be accessed from outside the package. See also: http://stat.ethz.ch/R-manual/R-devel/library/base/html/ns-dblcolon.html

Albert-Jan
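
The difference between double and triple colons can be checked with a package that ships with R. The sketch below uses the stats namespace rather than violinmplot, so it runs without installing anything. It relies on `pkg::name` being equivalent to `getExportedValue("pkg", "name")` and `pkg:::name` being equivalent to a `get()` on the package namespace; the variable names are just illustrative.

```r
# Objects a package exports (reachable with ::) versus everything defined
# in its namespace (reachable with :::).
ns       <- asNamespace("stats")
exported <- getNamespaceExports("stats")
internal <- setdiff(ls(ns), exported)     # defined but not exported

x <- c(2, 4, 4, 5)
sd_value <- stats::sd(x)                  # sd is exported, so :: works

# Take one internal object and try both access routes.
hidden <- internal[1]
# What stats::<hidden> would do: fails because the object is not exported.
via_double <- tryCatch(getExportedValue("stats", hidden),
                       error = function(e) e)
# What stats:::<hidden> would do: looks directly in the namespace.
via_triple <- get(hidden, envir = ns, inherits = FALSE)
```

Here `via_double` ends up holding an error condition while `via_triple` holds the internal object, which is the reason `::` is the safer default for everyday use.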