Hi Everyone,
I almost always analyze data in "wide" form (one participant per line), and am very familiar with plotting bar graphs with standard errors using the GUI "Legacy Dialogs".
But I now have a "long form" very large data set where one subject's data spans 1000+ lines. Could anyone point me to which graphing menus or functions I should read up on?
Right now, if I just use Legacy Dialogs for bar graphs as usual, it calculates statistics like the mean as if every line were a participant, rather than averaging within each participant. I've found the "Chart Builder" quite confusing, but maybe there's an option in there? Thanks a lot, Joseph
|
Hi Joseph, If you want to work with the averages within each participant, use AGGREGATE to create a new dataset of the participant averages (break by participant ID, making
new variables with the MEAN function – this will give you one row per participant), and then use Chart Builder to get the graph that you want. Hope this helps. Cheers, Kylie.
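A minimal sketch of this approach, assuming the long file has variables named id (participant), cond (condition), and score (outcome). These names are hypothetical and are not taken from Joseph's data. AGGREGATE writes one row per participant and condition, and the legacy GRAPH command then plots condition means with standard-error bars computed across participants rather than across raw rows:

```spss
* Hypothetical variable names: id, cond, score.
DATASET DECLARE persmeans.
AGGREGATE
  /OUTFILE='persmeans'
  /BREAK=id cond
  /score_mean=MEAN(score).
DATASET ACTIVATE persmeans.
* Each row is now one participant within one condition.
GRAPH
  /BAR(SIMPLE)=MEAN(score_mean) BY cond
  /INTERVAL SE(1).
```

Adding cond to /BREAK is only needed if you want separate bars per condition; with /BREAK=id alone you get a single mean per participant.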
|
In reply to this post by joseph.williams
Hi Joseph,
This can very easily be done using the R plugin. R is much better at data manipulation and graphing than base SPSS:
begin program r.
# Read the active dataset into a data frame; the first column is assumed to hold the participant ID.
vars <- spssdata.GetDataFromSPSS()
# One horizontal bar per row: the mean across all remaining columns.
barplot(rowMeans(vars[-1], na.rm=TRUE), names.arg=vars[[1]], horiz=TRUE)
end program.
-- ------------------- dr F.H.G. (Frans) Marcelissen DigiPsy (www.DigiPsy.nl) Pomperschans 26 5595 AV Leende tel: 040 2065030/06 2325 06 53 skype adres: frans.marcelissen email: [hidden email] |
In reply to this post by joseph.williams
I would also like a correlation matrix, which is simple in wide form but less so in long form. Many procedures for long-form data do have an output option for correlation or covariance matrices, but there are no specific procedures. I do not think graphs are possible at present. A request for future versions? We can't be the only ones. Best, Diana
Professor Diana Kornbrot email: d.e.kornbrot@... web: http://dianakornbrot.wordpress.com/ http://go.herts.ac.uk/diana_kornbrot Work Department of Psychology School of Life and Medical Sciences University of Hertfordshire College Lane, Hatfield, Hertfordshire AL10 9AB, UK voice: +44 (0) 170 728 4626 Home 19 Elmhurst Avenue London N2 0LT, UK voice: +44 (0) 208 444 2081 mobile: +44 (0) 740 318 1612 |
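On Diana's correlation-matrix point, one possible workaround is to restructure the long file to wide form first and then run CORRELATIONS. A minimal sketch, assuming a long file that contains only the hypothetical variables id, wave (values 1 to 10), and score; none of these names come from the posts above:

```spss
* Hypothetical variable names: id, wave, score.
* Integer display format for wave so its values make clean name suffixes (an assumption worth checking).
FORMATS wave (F2.0).
SORT CASES BY id wave.
CASESTOVARS
  /ID=id
  /INDEX=wave
  /SEPARATOR='_'
  /GROUPBY=VARIABLE.
* Check the generated names (expected here: score_1 to score_10) in Variable View before running this.
CORRELATIONS
  /VARIABLES=score_1 TO score_10.
```

CASESTOVARS restructures the active dataset in place, so it is safer to run this on a copy of the long file.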
In reply to this post by Kylie
Best, Diana |
Administrator
|
In reply to this post by joseph.williams
I suspect any answer really depends on what you want to graph.
"But I now have a "long form" very large data set where one subject's data spans 1000+ lines." Very large is a relative concept. One million rows of data is modest by some standards; some people work with SPSS on over 100,000,000 rows. In your case you will likely want to AGGREGATE the data. The specifics of the aggregation depend entirely on your goals.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
In reply to this post by joseph.williams
In addition to what others have said about AGGREGATE, which sounds like the immediate, practical solution, I urge you to look at the GPL syntax reference,
another FM (although not quite as fine as the Command Syntax Reference), which should have been installed along with everything else. It contains numerous examples with syntax. However, GPL involves a conceptual and linguistic shift that takes time to
get used to. Gene Maguin |
The GPL syntax reference is installed with
other Help materials. It's in the Reference section of the help along
with the CSR and other materials. In Statistics 20, the GPL reference
is listed with the CSR and option help.
Jon Peck (no "h") aka Kim Senior Software Engineer, IBM [hidden email] phone: 720-342-5621
From: "Maguin, Eugene" <[hidden email]> To: [hidden email], Date: 11/26/2013 06:56 AM Subject: Re: [SPSSX-L] Plotting Graphs & Visualizations when data is in "long form" Sent by: "SPSSX(r) Discussion" <[hidden email]> |
In reply to this post by joseph.williams
It helps to be more specific (both you and Diana) about what you want. Below are a bunch of examples of different "caterpillar" charts of means and standard errors for a set of fake data with 100 people, 10 waves, with half in an experimental condition and a random categorical grouping variable.
The advice about having SPSS generate the necessary dataset of your specific summaries (through aggregate or split file and regression or OMS or whatever) generically extends to any situation. If you are more specific about what you want the end result to be, it is easier to give advice about how to go about it.

************************************************************.
set seed 10.
input program.
loop #j = 1 to 100.
compute #cate = TRUNC(RV.UNIFORM(1,11)).
loop #i = 1 to 10.
compute pers = #j.
compute wave = #i.
compute exp = (#j > 50).
compute cate = #cate.
end case.
end loop.
end loop.
end file.
end input program.
dataset name long.

*outcome variable.
compute out = RV.NORMAL(pers/10,wave) + RV.UNIFORM(-5,5) + 5*exp*(wave-6)*RV.NORMAL(0,1) - cate.
variable levels wave (ordinal) exp (nominal) pers (nominal) cate (nominal) out (scale).
formats pers wave exp cate (F2.0) out (F2.0).
value label exp 0 'Control' 1 'Treatment'.
exe.

*Standard Error Graph of out by person - generated this through GUI.
GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=pers MEANSE(out, 2)[name="MEANSE_out_2"
    LOW="MEANSE_out_2_LOW" HIGH="MEANSE_out_2_HIGH"] MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: pers=col(source(s), name("pers"), unit.category())
  DATA: MEAN_out=col(source(s), name("MEANSE_out_2"))
  DATA: LOW=col(source(s), name("MEANSE_out_2_LOW"))
  DATA: HIGH=col(source(s), name("MEANSE_out_2_HIGH"))
  GUIDE: axis(dim(1), label("pers"))
  GUIDE: axis(dim(2), label("Mean out"))
  GUIDE: text.footnote(label("Error Bars: +/- 2 SE"))
  SCALE: cat(dim(1))
  SCALE: linear(dim(2), include(0))
  ELEMENT: point(position(pers*MEAN_out))
  ELEMENT: interval(position(region.spread.range(pers*(LOW+HIGH))), shape.interior(shape.ibeam))
END GPL.

*Standard Error Graph of out by wave - generated through GUI.
GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=wave MEANSE(out, 2)[name="MEANSE_out_2"
    LOW="MEANSE_out_2_LOW" HIGH="MEANSE_out_2_HIGH"] MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: wave=col(source(s), name("wave"), unit.category())
  DATA: MEAN_out=col(source(s), name("MEANSE_out_2"))
  DATA: LOW=col(source(s), name("MEANSE_out_2_LOW"))
  DATA: HIGH=col(source(s), name("MEANSE_out_2_HIGH"))
  GUIDE: axis(dim(1), label("wave"))
  GUIDE: axis(dim(2), label("Mean out"))
  GUIDE: text.footnote(label("Error Bars: +/- 2 SE"))
  SCALE: cat(dim(1))
  SCALE: linear(dim(2), include(0))
  ELEMENT: point(position(wave*MEAN_out))
  ELEMENT: interval(position(region.spread.range(wave*(LOW+HIGH))), shape.interior(shape.ibeam))
END GPL.

*SE graph of out by person with categorical group colored and experimental groups in different panels.
*Added false person variable for the graphs (location on the x axis is arbitrary) and then manually added in color statement for element.
*Took out point because bar is symmetric - also took out x axis label since it is arbitrary.
compute pers2 = pers - (50*exp).
GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=pers2[LEVEL=NOMINAL] MEANSE(out, 2)[name="MEANSE_out_2"
    LOW="MEANSE_out_2_LOW" HIGH="MEANSE_out_2_HIGH"] exp cate
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: pers2=col(source(s), name("pers2"), unit.category())
  DATA: MEAN_out=col(source(s), name("MEANSE_out_2"))
  DATA: LOW=col(source(s), name("MEANSE_out_2_LOW"))
  DATA: HIGH=col(source(s), name("MEANSE_out_2_HIGH"))
  DATA: exp=col(source(s), name("exp"), unit.category())
  DATA: cate=col(source(s), name("cate"), unit.category())
  GUIDE: axis(dim(1), null())
  GUIDE: axis(dim(2), label("Mean out"))
  GUIDE: axis(dim(4), label("exp"), opposite())
  GUIDE: text.footnote(label("Error Bars: +/- 2 SE"))
  SCALE: cat(dim(1))
  SCALE: linear(dim(2), include(0))
  SCALE: cat(dim(4), include("0", "1"))
  ELEMENT: interval(position(region.spread.range(pers2*(LOW+HIGH)*1*exp)), shape.interior(shape.ibeam), color(cate))
END GPL.

*Lets do some more panels by group and by treatment.
*Again making a false person variable for location on the x axis.
sort cases by cate exp pers wave.
DO IF $casenum = 1 or cate <> lag(cate) or exp <> lag(exp).
COMPUTE pers3 = 1.
ELSE IF cate = LAG(cate) AND exp = LAG(exp) AND pers = LAG(pers).
COMPUTE pers3 = lag(pers3).
ELSE.
COMPUTE pers3 = lag(pers3) + 1.
END IF.

*Only change to prior graph is added cate to faceting structure in the element statement and used pers3 instead of pers2 & took out axis label for outcome.
GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=pers3[LEVEL=NOMINAL] MEANSE(out, 2)[name="MEANSE_out_2"
    LOW="MEANSE_out_2_LOW" HIGH="MEANSE_out_2_HIGH"] exp cate
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: pers3=col(source(s), name("pers3"), unit.category())
  DATA: MEAN_out=col(source(s), name("MEANSE_out_2"))
  DATA: LOW=col(source(s), name("MEANSE_out_2_LOW"))
  DATA: HIGH=col(source(s), name("MEANSE_out_2_HIGH"))
  DATA: exp=col(source(s), name("exp"), unit.category())
  DATA: cate=col(source(s), name("cate"), unit.category())
  GUIDE: axis(dim(1), null(), label("Person"))
  GUIDE: axis(dim(2))
  GUIDE: axis(dim(3), opposite())
  GUIDE: axis(dim(4), opposite())
  GUIDE: text.footnote(label("Error Bars: +/- 2 SE for persons"))
  SCALE: cat(dim(1))
  SCALE: linear(dim(2))
  ELEMENT: interval(position(region.spread.range(pers3*(LOW+HIGH)*exp*cate)), shape.interior(shape.ibeam))
END GPL.
*This shows group 3 is confounded - no individuals in the control panel.
************************************************************.
|
In reply to this post by Frans Marcelissen-5
I'm not sure that that is the plot anyone
would want, but it is simple to do it in native Statistics code.
compute means = mean(V1 to V10).
GRAPH /BAR(SIMPLE)=VALUE(means).
Jon Peck (no "h") aka Kim Senior Software Engineer, IBM [hidden email] phone: 720-342-5621 |