SPSSX Discussion - Re: Survey Analysis Questions

Re: Survey Analysis Questions

Posted by ariel barak on Dec 09, 2011; 10:46pm
URL: http://spssx-discussion.165.s1.nabble.com/Survey-Analysis-Questions-tp5062418p5063255.html

Steve,

I have demographic data for all youth in the program, who their provider was, as well as whether they were in the program for more or less than 365 days. I ran Chi-square tests on sex, race, program provider, program length for those who responded and the total mailing list (minus the bad addresses) and the Pearson Chi-Square values are .258, .298, .582, and .165 respectively for the categories above. I take these values to mean that the population that responded to the survey is not statistically different from those that did not. I apologize for not including this information initially.
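As a rough illustration of the responder versus non-responder comparison, here is a minimal Python sketch of a Pearson chi-square test on a 2x2 table. The counts are made up for illustration (they are not from the survey), the function name `chi_square_2x2` is hypothetical, and the p-value shortcut via `erfc` only holds for 1 degree of freedom.

```python
from math import erfc, sqrt

def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2x2 table of counts.

    Returns (chi2, p). The p-value uses the df = 1 identity
    P(X > x) = erfc(sqrt(x / 2)).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, erfc(sqrt(chi2 / 2))

# Hypothetical counts: rows = responders, non-responders; columns = sex 1, sex 2.
counts = [[40, 46], [80, 74]]
chi2, p = chi_square_2x2(counts)
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```

A large p-value here would be read the same way as above: no evidence that responders and non-responders differ on that variable. A variable with more than two categories (such as race) needs a general chi-square distribution function, which this sketch does not include.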

I have looked at the distribution of the scale scores and believe that one would consider them very skewed. I will look at the Mann Whitney U test for sex, program provider (only 2), and program length (flagged 0 or 1 for <365 days or 365+) and the Kruskal-Wallis test for comparing scores of race (broken into 4).
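For anyone who wants to see what the Mann-Whitney U test does with two groups, here is a minimal Python sketch. It is not SPSS output: the function names are hypothetical, the scores are invented, and the p-value uses the tie-corrected normal approximation without a continuity correction, so it may differ slightly from what a statistics package reports.

```python
from math import erfc, sqrt

def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def mann_whitney_u(group_a, group_b):
    """Return (U, two-sided p) using the normal approximation."""
    pooled = list(group_a) + list(group_b)
    n1, n2 = len(group_a), len(group_b)
    n = n1 + n2
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    # Tie correction for the variance of U.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0  # every value identical: no evidence of a difference
    z = (u - n1 * n2 / 2) / sqrt(var_u)
    return u, erfc(abs(z) / sqrt(2))

# Invented domain averages for two groups (e.g. sex = 1 and sex = 2).
scores_group1 = [4.0, 4.5, 5.0, 3.75, 4.25]
scores_group2 = [3.5, 4.0, 2.5, 3.0, 4.75]
u, p = mann_whitney_u(scores_group1, scores_group2)
print(f"U = {u}, p = {p:.3f}")
```

With groups as small as these the normal approximation is crude; an exact test is preferable for small samples.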

For what it's worth, from eyeballing them, the Spearman's rho and Pearson correlations for the domains don't appear to be very different.

I'm not sure what you mean in the quote below:

"Before you use multiple regression, we need to have a discussion regarding the correlations between predictors and coding your multi-category nominal and ordinal variables."

Could you please clarify?

Ultimately, I intend to have data similar to what's below, and would like to know if it is possible to test which domain is most closely correlated with team meeting attendance by parents.

Client  Sex  Race  Program  ParentAttendance  Domain1Avg  Domain2Avg
100     1    1     1        26.3%             4.0         3.67
101     2    2     2        10.6%             2.5         2.33
102     1    1     1        78.7%             3.5         4.00
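One way to compare the domains on a layout like this is to correlate each domain average with parent attendance and see which correlation is largest in absolute value. The Python sketch below does that with Spearman's rho on the three example rows above. Three rows only show the mechanics, not a meaningful result, and the function and variable names are hypothetical.

```python
from math import sqrt

def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# The three example rows from the table above.
attendance = [26.3, 10.6, 78.7]
domains = {
    "Domain1Avg": [4.0, 2.5, 3.5],
    "Domain2Avg": [3.67, 2.33, 4.00],
}
rho = {name: spearman(attendance, scores) for name, scores in domains.items()}
strongest = max(rho, key=lambda name: abs(rho[name]))
print(rho, strongest)
```

Picking the largest rho this way does not show that it is significantly larger than the others; that takes a formal comparison of dependent correlations.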

Thanks for all your help with this...it is very much appreciated.

-Ariel

On Fri, Dec 9, 2011 at 3:15 PM, StatisticsDoc <[hidden email]> wrote:

Ariel,

It would be best to focus on the skew of the scale scores. The sum of the items is often less skewed than the individual items are. How skewed are the data? Parametric statistics are reasonably robust to violations of normality.
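To put a number on "how skewed", one option is the moment-based skewness coefficient. The Python sketch below is a minimal version with invented scores and a hypothetical function name. Statistical packages often apply a small-sample adjustment, so their reported value can differ a little from this one.

```python
def sample_skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5.

    Negative values mean a longer tail toward low scores (scores bunched
    at the high end); positive values mean the reverse.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        raise ValueError("skewness is undefined when all values are identical")
    return m3 / m2 ** 1.5

# Invented scale scores bunched at the top of a 1-5 scale.
scores = [5, 5, 5, 4, 1]
print(round(sample_skewness(scores), 3))
```

Scores that pile up at agree/strongly agree give a negative coefficient, as in this example.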

If your scale scores are extremely skewed, do not use Chi-square to compare scores between groups. Your scale scores are not categorical and need not be treated as such by dividing them into categories. You would want to use the Mann Whitney U test to compare scores between two groups (e.g. sex), the Kruskal-Wallis test for comparing scores between three or more levels of a categorical variable, and the Spearman correlation for correlations between the scales and other scales. These statistics can be computed using the non-parametric stats commands in SPSS (NPAR TESTS, NONPAR CORR).
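To show what the Kruskal-Wallis test computes for three or more groups, here is a minimal Python sketch of the tie-corrected H statistic. The function names and scores are hypothetical. The sketch returns H and its degrees of freedom only; the p-value comes from a chi-square distribution with k - 1 degrees of freedom, which is left to a table or a statistics package.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """Return (H, df) for a list of groups, with the standard tie correction."""
    pooled = [v for group in groups for v in group]
    n = len(pooled)
    ranks = average_ranks(pooled)
    weighted = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12 / (n * (n + 1)) * weighted - 3 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n).
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    df = len(groups) - 1
    if correction == 0:
        return 0.0, df  # every value identical: no differences to detect
    return h / correction, df

# Invented domain averages for four groups (e.g. four race categories).
groups = [
    [4.0, 4.5, 5.0],
    [3.5, 4.0, 4.25],
    [2.5, 3.0, 3.75],
    [4.75, 5.0, 4.5],
]
h, df = kruskal_wallis_h(groups)
print(f"H = {h:.3f}, df = {df}")
```

For four groups (df = 3), H would need to exceed the chi-square critical value of 7.815 to be significant at the .05 level.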

When your data depart markedly from normality, non-parametric statistics provide a more accurate test. You would not use them if your data were close to normally distributed, as they would be less powerful than parametric stats (i.e., less likely to detect significant associations in the data). SPSS does not provide non-parametric multiple regression as a standard feature, but there may be a script or other extension program out there in the user community that you can use (the most amazing things can be found there).

Hopefully, the departures from normality are small, and you can use the more familiar family of parametric statistics: t-tests, one way ANOVA, multiple regression. Before you use multiple regression, we need to have a discussion regarding the correlations between predictors and coding your multi-category nominal and ordinal variables.

Advice concerning the non-random nature of the sample is well taken. I am going to proceed for the sake of discussion as though your sample was random, although of course sampling bias is a great problem. Do you have any data on the non-responders that would enable you to compare them with the responders?

Best

Steve

From: SPSSX(r) Discussion [hidden email] On Behalf Of Ariel Barak
Sent: Friday, December 09, 2011 1:45 PM

To: [hidden email]
Subject: Re: Survey Analysis Questions

Stephen --

Thanks for the quick response. Cronbach's alpha scores on the 5 domains are .807, .907, .956, .959 & .925. Not sure whether you intended for me to examine the questions themselves or the domain averages for normality, but since the questions are so consistent within their domains, I'm not sure it matters. I have looked at the domain scores and they seem very much left-skewed (bunched at agree/strongly agree).
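For reference, Cronbach's alpha for one domain can be computed from the item variances and the variance of the summed score. The Python sketch below is a minimal version that assumes complete cases (no missing items); the function name and the responses are invented for illustration.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Cronbach's alpha for one domain.

    item_scores: one list per respondent, each holding that respondent's
    scores on the k items of the domain (complete cases only).
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(item_scores[0])
    item_variances = [variance([row[i] for row in item_scores]) for i in range(k)]
    total_variance = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Invented responses: 5 respondents, 4 Likert items (1-5) in one domain.
responses = [
    [4, 5, 4, 4],
    [5, 5, 5, 4],
    [2, 3, 2, 3],
    [4, 4, 5, 5],
    [3, 3, 3, 2],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

Values close to 1 mean the items in a domain move together, which is why the item-level and domain-level distributions tend to look alike when alpha is high.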

I assume this means that I should be using a chi-square instead of t-test. Please advise me if this is incorrect. This solves the first piece of the analysis I'd like to conduct.

For each child for whom I have a survey, I will also have the # of team meetings attended by a parent and the total # of team meetings, so I will have the % of team meetings attended by a parent.

1) How can I test to see which of the 5 domains is the best predictor of whether parents attend team meetings for all respondents?

2) How can I break the respondents by group (race, sex, program) to see what domain is the best predictor of whether parents attend team meetings?

Dave -- you are correct, this is not a random sample. The entire population of those enrolled in the program received a survey (n=268) and I received 86 responses. I dropped 28 youth due to bad addresses for a response rate of 86/240 ~ 35.8%. What do you suggest in analyzing the differences between groups?

Thanks for your help!

-Ariel

On Fri, Dec 9, 2011 at 11:20 AM, <[hidden email]> wrote:

Ariel,
I would start by computing the internal consistency of the computed scores, and then examining the distribution of scores to determine the degree to which they follow a normal distribution. The first will give a conservative estimate of how reliable the scores are in your sample. The second will provide guidance about whether parametric or non-parametric analyses are appropriate.
Best,
Stephen Brand
www.StatisticsDoc.com

From: Ariel Barak <[hidden email]>
Sender: "SPSSX(r) Discussion" <[hidden email]>
Date: Fri, 9 Dec 2011 10:58:55 -0600

To: <[hidden email]>
ReplyTo: Ariel Barak <[hidden email]>

Subject: Survey Analysis Questions

Hello All,

Background: I have 86 satisfaction surveys from families who have children in a program that provides case management services. The 20 Likert questions (Strongly Disagree, Disagree, Undecided, Agree, Strongly Agree) are scored 1 through 5 and are broken into 5 domains. Per the instructions from one of the survey creators, I have only calculated domain scores (averages) if 2/3 or more of the questions for a given domain have been answered. Most respondents answered all of the questions.
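The scoring rule described above can be sketched in Python as follows. This is a minimal illustration, not the survey creators' own procedure: the function name is hypothetical, unanswered questions are assumed to be coded as `None`, and the four-item domain is invented.

```python
def domain_score(responses):
    """Average of the answered items, or None if fewer than 2/3 were answered.

    responses: the 1-5 item scores for one domain, with None for an
    unanswered question (an assumed coding for this sketch).
    """
    answered = [r for r in responses if r is not None]
    # Integer comparison avoids floating-point trouble with the 2/3 cutoff.
    if 3 * len(answered) < 2 * len(responses):
        return None
    return sum(answered) / len(answered)

print(domain_score([4, 5, None, 4]))     # 3 of 4 answered: score is computed
print(domain_score([4, None, None, 5]))  # 2 of 4 answered: below the cutoff
```

Averaging only the answered items keeps the score on the 1-5 scale regardless of how many questions were skipped.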

I would like to test whether there are statistically significant differences in the domain scores based on the following factors: Race, Sex, Length of Program Service (less than 365 days, 365+ days), program provider (there are two).

Second, I also have data related to whether parent(s) attended team meetings via a database, regardless of whether or not they responded to the survey. For those who completed the survey, I would like to test for whether there are correlations between the domains mentioned above and the percent of team meetings attended by a family member. I would also like to know if it is possible to state which domains have the strongest correlations to team meeting attendance by the different factors above: race, sex, etc.

Could someone provide me with some help on what the appropriate statistical test(s) are and what the syntax would look like?

Any assistance would be greatly appreciated.

Thanks,
Ariel