|
Hello,
I am doing survey analysis and for each of the variables in my survey I am interested in both the total count, total weighted percentage, as well as the count and weight broken down by gender and geographical region. My survey has hundreds of variables. I feel there must be a better way of disaggregating each of these by gender and region than by manually running crosstabs and copying the counts and percentages into an excel spreadsheet. What I would ideally like is a table structured like this: Variable Name | Overall Count | Overall % | Men count | Men % | Women count | Women % | Geographic region #1 count | Geographic region #2 % | etc. Does anyone have any advice? I don't have the budget to buy SPSS 'tables' and I have been unable to find any help on-line. Thanks and kind regards, Megan ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
Megan,
Before I offer a suggestion I'd like to make sure I understand your desired output because either I don't or you have an error in your description. Let's say you had a question like 'Steven Harper should be reelected? (Yes/No). Your crosstab would be a 2 by 2 plus marginal totals. Exactly what numbers from the crosstab do you want to be in your table? You have three choices: 'No' counts and percents for males, females and total sample; 'Yes' counts and percents for the same, and total column counts and percents for a gender by item crosstab. Also, what happens for DK/NA/Refused responses? So I'm wondering if you aren't looking for a final result that looks like this. Variable Name | Value/Value label | Overall Count | Overall % | Men count | Men % | Women count | Women % | Geographic region #1 count | Geographic region #2 % | etc. Thanks, Gene Maguin >>I am doing survey analysis and for each of the variables in my survey I am interested in both the total count, total weighted percentage, as well as the count and weight broken down by gender and geographical region. My survey has hundreds of variables. I feel there must be a better way of disaggregating each of these by gender and region than by manually running crosstabs and copying the counts and percentages into an excel spreadsheet. What I would ideally like is a table structured like this: Variable Name | Overall Count | Overall % | Men count | Men % | Women count | Women % | Geographic region #1 count | Geographic region #2 % | etc. Does anyone have any advice? I don't have the budget to buy SPSS 'tables' and I have been unable to find any help on-line. Thanks and kind regards, Megan ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
Megan,
Ok, that is very helpful. Thanks. So far as I know, there are two methods to get to what you want. Both involve complexity. This is going to be lengthy so bear with me. I made up a little dataset to use and it embodies the following expectations: 1) you have missing data; 2) all DVs are yes/no response format. data list / id sex region dv1 dv2 dv3(f2.0,5(1x,f1.0)). begin data 01 1 1 9 2 1 02 9 1 2 1 1 03 1 2 2 9 2 04 2 9 2 1 2 05 1 3 1 2 1 06 2 3 1 1 9 07 1 3 2 2 2 08 2 2 1 1 1 09 9 1 2 9 2 10 2 9 2 1 2 11 1 2 9 2 1 12 9 3 1 1 1 13 1 9 2 9 2 14 2 1 2 1 1 15 1 1 1 2 9 end data. execute. * dv1 to dv2: 1=yes; 2=no. sex: 1=male; 2=female. missing values sex region dv1 dv2 dv3(9). * METHOD 1. procedure output outfile='c:\mixedtests\meganf1.txt'. crosstabs variables=sex dv1 dv2 dv3(1,2)/tables=sex by dv1 dv2 dv3/ cells=count col/write all. procedure output outfile='c:\mixedtests\meganf2.txt'. crosstabs variables=region(1,3) dv1 dv2 dv3(1,2)/ tables=region by dv1 dv2 dv3/cells=count col/write all. * METHOD 2. oms /select tables/if subtypes='Crosstabulation'/ destination format=sav outfile='c:\mixedtests\meganf1.sav'. crosstabs sex by dv1 dv2 dv3/cells=count col. omsend. oms /select tables/if subtypes='Crosstabulation'/ destination format=sav outfile='c:\mixedtests\meganf2.sav'. crosstabs region by dv1 dv2 dv3/cells=count col. omsend. oms /select tables/if subtypes='Crosstabulation'/ destination format=sav outfile='c:\mixedtests\meganf2.sav'. crosstabs region by dv1 dv2 dv3/cells=count row. omsend. Commentary. Overall, my preference is for method 1 because I think it is easier to work with. However, look at the output for method 2. It will seem confusing especially if you have never used OMS before. The key (unfortunate) thing is that each table adds columns to the file. So, 200 tables, maybe 400 or more columns. I think it's not a big deal; just an additional step to collapse columms. Method 1, in contrast, adds rows. I'm going to procede with method 1. Method 1 commentary. This method uses the write subcommand of the crosstabs command (read up on the crosstabs documentation in the syntax reference). Note also that I used the Integer mode rather than the more commonly used general model. The output file for this is a text file (see syntax ref for further documentation) and looks like this for the region variable. There are as many rows as there are DVs times the range of values for region times 2 (for yes and no). Tableid corresponds to the list order of the 'column' variables on the generating crosstabs command. That the variables are not named here is unfortunate but trivial. The values in the rowval and colval are the values of the row and column variables, respectively. Tag TableID count rowval colval 1 1 1 1 1 1 1 1 2 1 1 1 3 3 1 1 1 3 1 2 1 1 1 2 2 1 1 1 3 2 1 2 2 1 1 1 2 1 2 1 1 2 2 3 1 1 2 2 1 2 1 2 1 2 2 1 2 2 3 2 1 3 3 1 1 1 3 2 2 1 1 3 2 3 1 1 3 1 1 2 1 3 1 2 2 1 3 1 3 2 The rest of the syntax is this. data list file='c:\mixedtests\meganf1.txt'/ tag 4 tableid 8 count 9-16 rowid 24 colid 32. execute. casestovars /id=tableid/drop=tag. compute gendertotaln=sum(count.1 to count.4). compute gendertotalyes=sum(count.1,count.2). compute gendertotalpcnt=100*gendertotalyes/gendertotaln. compute malepcnt=100*count.1/gendertotalyes. compute femalepcnt=100*count.2/gendertotalyes. execute. format gendertotaln gendertotalyes(f2.0) gendertotalpcnt malepcnt femalepcnt(f5.2). list count.1 to count.4 rowid.1 to rowid.4 colid.1 to colid.4 gendertotaln gendertotalyes gendertotalpcnt malepcnt femalepcnt. save outfile='c:\mixedtests\meganf1a.sav'/keep=tableid count.1 count.2 gendertotalyes malepcnt femalepcnt gendertotalpcnt/ rename=(count.1 count.2=malen femalen). **************************************************************************** ******. data list file='c:\mixedtests\meganf2.txt'/ tag 4 tableid 8 count 9-16 rowid 24 colid 32. execute. casestovars /id=tableid/drop=tag. compute regiontotaln=sum(count.1 to count.6). compute regiontotalyes=sum(count.1,count.2,count.3). compute regiontotalpcnt=100*regiontotalyes/regiontotaln. compute region1pcnt=100*count.1/regiontotalyes. compute region2pcnt=100*count.2/regiontotalyes. compute region3pcnt=100*count.3/regiontotalyes. execute. format regiontotaln regiontotalyes(f2.0) regiontotalpcnt region1pcnt region2pcnt region3pcnt(f5.2). list count.1 to count.6 rowid.1 to rowid.6 colid.1 to colid.6 regiontotaln regiontotalyes regiontotalpcnt region1pcnt region2pcnt region3pcnt. save outfile='c:\mixedtests\meganf2a.sav'/keep=tableid count.1 count.2 count.3 regiontotalyes region1pcnt region2pcnt region3pcnt regiontotalpcnt/ rename=(count.1 count.2 count.3=region1n region2n region3n). **************************************************************************** ******. match files file='c:\mixedtests\meganf1a.sav'/ file='c:\mixedtests\meganf2a.sav'/by tableid. execute. value labels tableid 1 'dv1' 2 'dv2' 3 'dv3'. summarize tables=tableid gendertotalyes gendertotalpcnt malen malepcnt femalen femalepcnt regiontotalyes regiontotalpcnt region1n region1pcnt region2n region2pcnt region3n region3pcnt/format=list. Commentary. The basic operation is to read each file written out from the crossstabs, perform the required computations to compute totals and percents, and save a new file keeping the relevant variables. The two new files are put together by tableid. I used Summarize to output the results thinking it would be more accessible for you. Of course, the Summarize command output can be exported and worked over in word. I think I have every thing correct but check the results carefully and let me know if you have problems. Gene Maguin -----Original Message----- From: Megan Misovic [mailto:[hidden email]] Sent: Friday, August 27, 2010 6:18 PM To: 'Gene Maguin' Subject: RE: Combining the output from multiple crosstabs Hi Gene, You are absolutely right - I am interested only in the 'yes' responses. In my case I am analysing a health survey, so in most cases I am only interested if the person has a certain health condition. So, for example I would like to be able to report: Condition | Provincial Total Count | Provincial Total Percentage | Men count Diabetes | 56 | 3% | 34| High Blood Pressure | 246 | 5% | 176 Etc. So- the 'provincial total count' would be the total number of 'yes' cases in the data (this is important because we cannot report small numbers), the 'provincial total percentage' would be the percent of all respondents who had the condition, the 'men count' would be the number of men with the condition and so on. Does this help describe a little better what I'm looking for? I started doing all this by hand and I realized that I would be foolish to try to do this with brute force - there must be a way to automate it! Thanks for your help Gene! Megan -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Gene Maguin Sent: August 27, 2010 3:05 PM To: [hidden email] Subject: Re: Combining the output from multiple crosstabs Megan, Before I offer a suggestion I'd like to make sure I understand your desired output because either I don't or you have an error in your description. Let's say you had a question like 'Steven Harper should be reelected? (Yes/No). Your crosstab would be a 2 by 2 plus marginal totals. Exactly what numbers from the crosstab do you want to be in your table? You have three choices: 'No' counts and percents for males, females and total sample; 'Yes' counts and percents for the same, and total column counts and percents for a gender by item crosstab. Also, what happens for DK/NA/Refused responses? So I'm wondering if you aren't looking for a final result that looks like this. Variable Name | Value/Value label | Overall Count | Overall % | Men count | Men % | Women count | Women % | Geographic region #1 count | Geographic region #2 % | etc. Thanks, Gene Maguin >>I am doing survey analysis and for each of the variables in my survey I am interested in both the total count, total weighted percentage, as well as the count and weight broken down by gender and geographical region. My survey has hundreds of variables. I feel there must be a better way of disaggregating each of these by gender and region than by manually running crosstabs and copying the counts and percentages into an excel spreadsheet. What I would ideally like is a table structured like this: Variable Name | Overall Count | Overall % | Men count | Men % | Women count | Women % | Geographic region #1 count | Geographic region #2 % | etc. Does anyone have any advice? I don't have the budget to buy SPSS 'tables' and I have been unable to find any help on-line. Thanks and kind regards, Megan ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
Administrator
|
Here's a different approach that gives what Megan asks for, I think. I'll use Gene's data to illustrate. data list / id sex region dv1 dv2 dv3(f2.0,5(1x,f1.0)). begin data 01 1 1 9 2 1 02 9 1 2 1 1 03 1 2 2 9 2 04 2 9 2 1 2 05 1 3 1 2 1 06 2 3 1 1 9 07 1 3 2 2 2 08 2 2 1 1 1 09 9 1 2 9 2 10 2 9 2 1 2 11 1 2 9 2 1 12 9 3 1 1 1 13 1 9 2 9 2 14 2 1 2 1 1 15 1 1 1 2 9 end data. * I prefer 1=Yes, 0=No coding, because then SUMS give the count of Yes responses. recode sex dv1 to dv3 (2=0) (else=copy). exe. * dv1 to dv2: 1=yes; 0=no. sex: 1=male; 0=female. missing values sex region dv1 dv2 dv3(9). * Megan said, "provincial total count' would be the total number of 'yes' cases in the data". * This suggests that the data are from one province only, so no need to handle multiple provinces. * Use AGGREGATE to get Provincial Total Count and Provincial Total Percentage . AGGREGATE /OUTFILE= "C:\temp\province.sav" /BREAK= /PTC1 = SUM(dv1) /PTC2 = SUM(dv2) /PTC3 = SUM(dv3) /PTpercent1 = PGT(dv1 0) /PTpercent2 = PGT(dv2 0) /PTpercent3 = PGT(dv3 0) . * Now get the total count for men only . temporary. select if (sex EQ 1). AGGREGATE /OUTFILE= "C:\temp\men.sav" /BREAK= /MenTC1 = SUM(dv1) /MenTC2 = SUM(dv2) /MenTC3 = SUM(dv3) get file = "C:\temp\province.sav" . match files file = * / file = "C:\temp\men.sav" . exe. VARSTOCASES /MAKE ProvTotCount FROM PTC1 PTC2 PTC3 /MAKE ProvTotPercent FROM PTpercent1 PTpercent2 PTpercent3 /MAKE MenTotCount FROM MenTC1 MenTC2 MenTC3 /INDEX=Condition(3) /KEEP= /NULL=KEEP. formats ProvTotCount MenTotCount (f5.0) / ProvTotPercent (f5.2). variable labels condition "Condition" ProvTotCount "Provincial Total Count" ProvTotPercent "Provincial Total Percent" MenTotCount "Total Count for Men" . value labels Condition 1 "Condition 1 label" 2 "Condition 2 label" 3 "Condition 3 label" . Report /FORMAT=CHWRAP(ON) PREVIEW(OFF) CHALIGN(BOTTOM) UNDERSCORE(ON) ONEBREAKCOLUMN(OFF) CHDSPACE(1) SUMSPACE(0) AUTOMATIC LIST BRKSPACE(0) PAGE(1) MISSING'.' LENGTH(1, 39) ALIGN(LEFT) TSPACE(1) FTSPACE(1) MARGINS(1,55) /TITLE=LEFT 'Title Here' RIGHT 'Page )PAGE' /VARIABLES Condition (LABELS) (LEFT) (OFFSET(0))(20) ProvTotCount (VALUES) (RIGHT) (OFFSET(0)) (10) ProvTotPercent (VALUES) (RIGHT) (OFFSET(0)) (10) MenTotCount (VALUES) (RIGHT) (OFFSET(0)) (8) . HTH.
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
| Free forum by Nabble | Edit this page |
