Hello All, Background: I have 86 satisfaction surveys from families who
have children in a program that provides case management services. The 20
Likert questions (Strongly Disagree, Disagree, Undecided, Agree, Strongly
Agree) are scored 1 through 5 and are broken into 5 domains. Per the
instructions from one of the survey creators, I have only calculated domain
scores (averages) if 2/3 or more of the questions for a given domain have been
answered; most respondents answered all of the questions.
I would like to test whether there are statistically significant differences in the domain scores based on the following factors: Race, Sex, Length of Program Service (less than 365 days, 365+ days), program provider (there are two).
Second, I also have data related to whether parent(s) attended team meetings via a database, regardless of whether or not they responded to the survey. For those who completed the survey, I would like to test for whether there are correlations between the domains mentioned above and the percent of team meetings attended by a family member. I would also like to know if it is possible to state which domains have the strongest correlations to team meeting attendance by the different factors above: race, sex, etc. Could someone provide me with some help on what the appropriate statistical test(s) are and what the syntax would look like? Any assistance would be greatly appreciated. Thanks, Ariel |
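The 2/3 scoring rule described in the post could be sketched as follows. This is only an illustration in Python rather than SPSS syntax; the function name `domain_score`, the use of `None` for an unanswered item, and the example responses are assumptions, not part of the survey instructions.

```python
from fractions import Fraction
from statistics import mean


def domain_score(responses, min_fraction=Fraction(2, 3)):
    """Average the answered items of one domain for one respondent.

    `responses` holds the 1-5 item scores, with None for an unanswered
    item (a hypothetical coding). Returns None when fewer than
    `min_fraction` of the items were answered.
    """
    if not responses:
        return None
    answered = [r for r in responses if r is not None]
    # Exact fractions avoid floating-point trouble at the 2/3 boundary.
    if Fraction(len(answered), len(responses)) < min_fraction:
        return None
    return mean(answered)


# Hypothetical four-item domain.
print(domain_score([4, 5, None, 4]))     # 3 of 4 answered: scored
print(domain_score([4, None, None, 5]))  # 2 of 4 answered: not scored
```

In SPSS itself the same idea is usually written with the MEAN.n function, for example MEAN.3(q1, q2, q3, q4) to require at least 3 valid answers out of 4 (the variable names here are placeholders).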
Ariel,
I would start by computing the internal consistency of the computed scores, and then examining the distribution of scores to determine the degree to which they follow a normal distribution. The first will give a conservative estimate of how reliable the scores are in your sample. The second will provide guidance about whether parametric or non-parametric analysis should be used. Best, Stephen Brand www.StatisticsDoc.com
From: Ariel Barak <[hidden email]>
Sender: "SPSSX(r) Discussion" <[hidden email]>
Date: Fri, 9 Dec 2011 10:58:55 -0600 To: <[hidden email]> ReplyTo: Ariel Barak <[hidden email]>
Subject: Survey Analysis Questions
[snip, previous] |
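Stephen's first suggestion, checking internal consistency, is commonly done with Cronbach's alpha. A minimal sketch of that calculation, assuming complete-case data; the function name `cronbach_alpha` and the example responses are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for complete-case item responses.

    `rows` is a list of respondents, each a list of k item scores with
    no missing values (respondents with gaps are assumed to have been
    dropped beforehand).
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Sample variance of each item, and of the summed score.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores do not vary")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: 5 respondents x 4 items of one domain.
domain_items = [
    [4, 5, 4, 4],
    [5, 5, 5, 4],
    [3, 3, 4, 3],
    [2, 2, 1, 2],
    [5, 4, 5, 5],
]
print(round(cronbach_alpha(domain_items), 3))
```

In SPSS the RELIABILITY command with /MODEL=ALPHA reports the same statistic, one run per domain.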
While agreeing in part with Stephen, it sounds like you do not have
a random sample. Significance testing pertains to random samples,
asking about the probability of getting a result as strong as you
observe if another random sample were taken. If your data are the
entire universe to which you are generalizing (e.g., all your
clients), then barring measurement error, even very small results
are true results and significance levels do not apply. If you have a
non-random sample, you would like to know the probability of a
result of given strength if you took another sample, but this is
unknowable. Significance levels for non-random data will be in
error to an unknown degree.
In an experimental setting, if subjects are randomly assigned to groups, statistical inference may be made but, as Knapp (2009) notes, "the inference is to all possible randomizations for the given sample, not to the population from which the sample was [non-randomly] drawn." That is, if one uses significance testing for non-random subjects who are randomly assigned, one is asking about the probability of getting a result of the given strength if one randomizes again.
Dave
On 12/9/2011 12:20 PM, Statisticsdoc Consulting wrote: Ariel, -- |
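The "randomize again" logic Dave describes can be illustrated with a Monte Carlo randomization test. This is a sketch only: it applies where subjects really were randomly assigned to two groups, which is not the case for factors such as sex or race in the survey, and the function name, the number of shuffles, and the example scores are all hypothetical.

```python
import random


def randomization_test(group_a, group_b, n_shuffles=5000, seed=0):
    """Two-sided randomization test for a difference in group means.

    Re-randomizes the group labels `n_shuffles` times and returns the
    share of re-randomizations whose absolute mean difference is at
    least as large as the observed one (with the usual +1 correction).
    """
    rng = random.Random(seed)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    def abs_mean_diff(values):
        first, second = values[:n_a], values[n_a:]
        return abs(sum(first) / len(first) - sum(second) / len(second))

    observed = abs_mean_diff(pooled)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        if abs_mean_diff(pooled) >= observed - 1e-12:  # float tolerance
            hits += 1
    return (hits + 1) / (n_shuffles + 1)


# Hypothetical domain averages for two randomly assigned groups.
group_1 = [5, 5, 4, 5, 4, 5, 5, 4]
group_2 = [2, 1, 2, 3, 1, 2, 2, 3]
print(randomization_test(group_1, group_2))
```

The returned proportion answers the question in the post: how often a difference this large would appear if the same subjects were randomized again. It says nothing about a wider population.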
In reply to this post by statisticsdoc
Stephen --
Thanks for the quick response. Cronbach's alpha scores on the 5 domains are .807, .907, .956, .959 & .925. Not sure whether you intended for me to examine the questions themselves or the domain averages for normality, but since they are so consistent with their domain, I'm not sure it matters. I have looked at the domain scores and they seem very much right-skewed (agree/strongly agree). I assume this means that I should be using a chi-square instead of t-test. Please advise me if this is incorrect. This solves the first piece of the analysis I'd like to conduct.
For each child for whom I have a survey, I will also have the # of team meetings attended by a parent and the total # of team meetings, so I will have the % of team meetings attended by a child.
1) How can I test to see which of the 5 domains is the best predictor of whether parents attend team meetings for all respondents?
2) How can I break the respondents by group (race, sex, program) to see what domain is the best predictor of whether parents attend team meetings?
Dave -- you are correct, this is not a random sample. The entire population of those enrolled in the program received a survey (n=268) and I received 86 responses. I dropped 28 youth due to bad addresses for a response rate of 86/240 ~ 35.8%. What do you suggest in analyzing the differences between groups?
Thanks for your help! -Ariel
On Fri, Dec 9, 2011 at 11:20 AM, <[hidden email]> wrote:
|
In reply to this post by G David Garson
Hi Dave. Is this the Knapp article you are referring to?
www.tomswebpage.net/images/sigconf.doc Thanks, Bruce
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." |
In reply to this post by ariel barak
I'm not Stephen, but see my comments inserted, below.
Date: Fri, 9 Dec 2011 12:45:22 -0600 From: [hidden email] Subject: Re: Survey Analysis Questions To: [hidden email]
Stephen --
"Thanks for the quick response. Cronbach's alpha scores on the 5 domains are .807, .907, .956, .959 & .925. Not sure whether you intended for me to examine the questions themselves or the domain averages for normality, but since they are so consistent with their domain, I'm not sure it matters. I have looked at the domain scores and they seem very much right-skewed (agree/strongly agree)."
- The "skew" gets its name from the side with the long tail. For the order you stated, your scores bunch up on the right, so this is "left-skewed".
"I assume this means that I should be using a chi-square instead of t-test. Please advise me if this is incorrect. This solves the first piece of the analysis I'd like to conduct."
Non-parametric comparisons are usually tested with something other than a chi-squared test. However, I do not think that many people would agree with Stephen, where he advocated (rank-based) nonparametric tests for totals of Likert-type items.
If you score the "Totals" as item-average scores, you have an immediate set of anchor-labels for interpreting the means that you observe for various groups. That is the biggest gain. These items do fail to meet the Likert ideal, because they do not have averages near the midpoint. There is very little loss of power or validity for the tests, when the variance is relatively restricted by being this sort of sum. However, here are several alternatives or extensions:
a) Since there are few responses of 1 and 2, create new scores of "Objecters" by counting the number of responses of 1 and 2 - if you want to see whether the LOW extremes are particularly important.
b) To create a nicer "interval" basis for the items that are totalled, rescore the items as 2-5 or 3-5, and obtain totals from those.
c) To preserve the original scoring while creating a nicer interval basis for the Total, you could subtract each Total from its maximum-possible (for left-skewed), and take the square root. The scoring now will run in the opposite direction, so it will be useful to apply the actual labels, as transformed, to keep track of what you are observing.
"For each child for whom I have a survey, I will also have the # of team meetings attended by a parent and the total # of team meetings, so I will have the % of team meetings attended by a child."
Unless there are always a lot of meetings, that % will not be very attractive as a score. But you do need to figure out what will make a useful criterion. What is logical? What will your audience see as meaningful?
"1) How can I test to see which of the 5 domains is the best predictor of whether parents attend team meetings for all respondents?"
Please use the term "best correlate" and please consider why and how a correlation can result from "common causation". If, as you say, this is a "satisfaction survey", it is pretty hard to construe the satisfaction as actually "predicting" the attendance for the previous year. You can look at scatterplots of scores, and simple correlations, if you are considering two continuous scores. Your display may be more important than your test -- these are observational data, so you do need to tell a convincing story about how much association you see, and why it matters.
"2) How can I break the respondents by group (race, sex, program) to see what domain is the best predictor of whether parents attend team meetings?"
Again, "correlation" and not "prediction". Even though you don't have the timing problem for race and sex, "Correlation is not causation." What is the "average attendance score" for males vs. females?
"Dave -- you are correct, this is not a random sample. The entire population of those enrolled in the program received a survey (n=268) and I received 86 responses. I dropped 28 youth due to bad addresses for a response rate of 86/240 ~ 35.8%. What do you suggest in analyzing the differences between groups?"
Since you have the data on hand, you certainly should do your basic comparisons of 86 Responders vs. 28 Dropped-responders vs. 154 non-responders, for whatever data (Attendance, only?) in the complete file.
[snip, previous]
-- Rich Ulrich |
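Rich's point about skew direction and his alternatives (a) through (c) could be sketched like this. The helper names and the example scores are hypothetical, and reading "rescore the items as 2-5 or 3-5" as collapsing the lowest answers upward is an assumption about what was meant.

```python
import math
from statistics import mean


def count_objections(items):
    """Option (a): number of answered items scored 1 or 2."""
    return sum(1 for x in items if x is not None and x <= 2)


def rescored_average(items, floor=3):
    """Option (b): collapse low answers up to `floor` (3 gives a 3-5
    range, 2 gives 2-5), then average the answered items. Collapsing
    upward is an assumed reading of the suggestion."""
    answered = [max(x, floor) for x in items if x is not None]
    return mean(answered) if answered else None


def reflected_sqrt(score, maximum=5):
    """Option (c): subtract the score from its maximum possible value
    and take the square root. Higher values now mean lower satisfaction."""
    return math.sqrt(maximum - score)


def skewness(values):
    """Moment-based skewness; negative when the long tail is on the left."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    if m2 == 0:
        raise ValueError("values do not vary")
    return m3 / m2 ** 1.5


# Hypothetical domain averages that bunch up near 5 (long tail on the left).
scores = [5, 5, 5, 5, 4, 4, 3, 1]
print(skewness(scores))                               # negative: left-skewed
print(skewness([reflected_sqrt(s) for s in scores]))  # reflected scale
```

Because option (c) reverses the direction of the scale, any group means reported from it need the transformed labels alongside, as the post notes.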
In reply to this post by ariel barak
Steve,
I have demographic data for all youth in the program, who their provider was, as well as whether they were in the program for more or less than 365 days. I ran Chi-square tests on sex, race, program provider, program length for those who responded and the total mailing list (minus the bad addresses) and the Pearson Chi-Square values are .258, .298, .582, and .165 respectively for the categories above. I take these values to mean that the population that responded to the survey is not statistically different from those that did not. I apologize for not including this information initially.
I have looked at the distribution of the scale scores and believe that one would consider them very skewed. I will look at the Mann Whitney U test for sex, program provider (only 2), and program length (flagged 0 or 1 for <365 days or 365+) and the Kruskal-Wallis test for comparing scores of race (broken into 4). For what it's worth, by my eyeball the Spearman's rho values don't appear to be very different from the Pearson correlations for the domains.
I'm not sure what you mean in the quote below: "Before you use multiple regression, we need to have a discussion regarding the correlations between predictors and coding your multi-category nominal and ordinal variables." Could you please clarify?
Ultimately, I intend to have data similar to what's below, and would like to know if it is possible to test which domain is most closely correlated with team meeting attendance by parents.
Client  Sex  Race  Program  ParentAttendance  Domain1Avg  Domain2Avg
100     1    1     1        26.3%             4.0         3.67
101     2    2     2        10.6%             2.5         2.33
102     1    1     1        78.7%             3.5         4.00
Thanks for all your help with this...it is very much appreciated. -Ariel
On Fri, Dec 9, 2011 at 3:15 PM, StatisticsDoc <[hidden email]> wrote:
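The comparison of Pearson and Spearman correlations mentioned in the post could be sketched as below, using the three example rows from the data layout. Three rows are far too few for any real conclusion; the helper names are hypothetical, and the point is only to show the mechanics.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1..end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho is Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# The three example rows from the post (illustration only).
attendance = [26.3, 10.6, 78.7]
domains = {
    "Domain1Avg": [4.0, 2.5, 3.5],
    "Domain2Avg": [3.67, 2.33, 4.00],
}
for name, scores in domains.items():
    print(name,
          round(pearson_r(scores, attendance), 3),
          round(spearman_rho(scores, attendance), 3))
```

Sorting the domains by the size of these coefficients shows which has the strongest observed association, but ranking alone does not establish that one correlation is reliably larger than another. In SPSS the corresponding commands are CORRELATIONS (Pearson) and NONPAR CORR (Spearman).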
|
In reply to this post by Rich Ulrich
Hi Rich,
Thanks for responding to the topic. I'll try breaking general issues apart by asterisks.
******
I appreciate the 3 options you provided. With competing ideas on what I should be doing, I want to be sure that I understand your perspective. You believe that I could use parametric tests and that I don't need to use one of the 3 options you provided...right?
******
You wrote "These items do fail to meet the Likert ideal, because they do not have averages near the midpoint." What effect does this have on what I should be doing?
******
I wrote - "For each child for whom I have a survey, I will also have the # of team meetings attended by a parent and the total # of team meetings, so I will have the % of team meetings attended by a child."
You wrote - "Unless there are always a lot of meetings, that % will not be very attractive as a score. But you do need to figure out what will make a useful criterion. What is logical? What will your audience see as meaningful?"
The team meetings are to occur on a monthly basis and some youth may only be in the program for a few months, so you're right to bring up that concern. I could limit this piece of the analysis to those who had more than 6 team meetings. Any other suggestions on how I should handle this? I could create a variable based on 25% increments but am not sure if this would be appropriate.
******
Regarding domains, they are:
Access (location was convenient, services available at convenient times)
Participation (I helped choose services, I helped choose trt. goals)
Cultural Sensitivity (Staff treated me with respect, Staff spoke with me in a way I understood)
Appropriateness (I'm satisfied with the service my child received)
Outcomes (My child is better at handling daily life, doing better at school, etc.)
It seems reasonable that if parents felt the location was inconvenient or staff were disrespectful, they would not attend team meetings. I will look at the scatterplots of the different domain scores by the % of team meetings attended by parents and look for correlations.
Thanks to everyone for their feedback. -Ariel
On Fri, Dec 9, 2011 at 4:13 PM, Rich Ulrich <[hidden email]> wrote:
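The two ideas floated in this post, restricting the analysis to families with more than 6 team meetings and grouping attendance into 25% increments, could be sketched as follows. The function name, the returned field names, and the band coding are hypothetical; the raw counts are kept next to the percentage so that different denominators stay visible.

```python
def attendance_summary(attended, scheduled, more_than=6):
    """Summarize one family's team-meeting attendance.

    Returns None unless more than `more_than` meetings were scheduled,
    because a percentage based on a handful of meetings is unstable.
    The cutoff of 6 and the 25% bands are illustrative choices.
    """
    if not 0 <= attended <= scheduled:
        raise ValueError("attended must be between 0 and scheduled")
    if scheduled <= more_than:
        return None
    percent = 100.0 * attended / scheduled
    # Bands: 0 = under 25%, 1 = 25 to under 50%,
    #        2 = 50 to under 75%, 3 = 75 to 100%.
    band = min(int(percent // 25), 3)
    return {
        "attended": attended,
        "scheduled": scheduled,
        "percent": percent,
        "band": band,
    }


# Hypothetical families.
print(attendance_summary(6, 12))
print(attendance_summary(6, 6))  # only 6 meetings scheduled: excluded
```

Whether a cutoff or banding is appropriate is a substantive choice for the program staff; the code only makes the rule explicit.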
|
In reply to this post by ariel barak
I've numbered your asterisks, and I'll comment.
1) Right, pretty much. Especially for data where the actual means are so useful, I certainly would avoid the rank tests as my main presentation. If you use the skewed data instead of a modification, it is possible to mollify the nervous nellies by running those tests, too, so you can add, "Non-parametric tests give the same results."
2) No effect on analyses. Something to know when you are talking about them.
3) (How to handle counts of meetings.) I would want to have on hand the distribution of observed counts and durations: 6 meetings out of 12 may be different from 6 meetings out of 6. Then I would want to ask the people running the meetings how they might "rank" the participation in some sense. What is "good participation" when you consider the purpose and the clientele?
4) I don't see a question. I can comment -- I once was told that one of the big factors for poor attendance for psychiatric outpatients was inadequate bus service.
-- Rich Ulrich
Date: Fri, 9 Dec 2011 17:40:17 -0600 From: [hidden email] Subject: Re: Survey Analysis Questions To: [hidden email]
Hi Rich,
Thanks for responding to the topic. I'll try breaking general issues apart by asterisks.
****** (1)
I appreciate the 3 options you provided. With competing ideas on what I should be doing, I want to be sure that I understand your perspective. You believe that I could use parametric tests and that I don't need to use one of the 3 options you provided...right?
****** (2)
You wrote "These items do fail to meet the Likert ideal, because they do not have averages near the midpoint." What effect does this have on what I should be doing?
****** (3)
I wrote - "For each child for whom I have a survey, I will also have the # of team meetings attended by a parent and the total # of team meetings, so I will have the % of team meetings attended by a child."
You wrote - "Unless there are always a lot of meetings, that % will not be very attractive as a score. But you do need to figure out what will make a useful criterion. What is logical? What will your audience see as meaningful?"
The team meetings are to occur on a monthly basis and some youth may only be in the program for a few months, so you're right to bring up that concern. I could limit this piece of the analysis to those who had more than 6 team meetings. Any other suggestions on how I should handle this? I could create a variable based on 25% increments but am not sure if this would be appropriate.
****** (4)
Regarding domains, they are:
Access (location was convenient, services available at convenient times)
Participation (I helped choose services, I helped choose trt. goals)
Cultural Sensitivity (Staff treated me with respect, Staff spoke with me in a way I understood)
Appropriateness (I'm satisfied with the service my child received)
Outcomes (My child is better at handling daily life, doing better at school, etc.)
It seems reasonable that if parents felt the location was inconvenient or staff were disrespectful, they would not attend team meetings. I will look at the scatterplots of the different domain scores by the % of team meetings attended by parents and look for correlations.
[snip, previous] |
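Rich's suggestion in point 1, running a rank test alongside the parametric comparison, could look like the following for a two-group factor. This is a sketch of the Mann-Whitney U statistic with a normal-approximation p-value, using average ranks and the tie correction but no continuity correction; the function name and the example scores are hypothetical.

```python
import math
from collections import Counter


def mann_whitney_u(group_a, group_b):
    """Return (U for group_a, two-sided normal-approximation p-value)."""
    if not group_a or not group_b:
        raise ValueError("both groups must be non-empty")
    pooled = list(group_a) + list(group_b)
    n_a, n_b, n = len(group_a), len(group_b), len(pooled)

    # Average rank for each distinct value in the pooled sample.
    counts = Counter(pooled)
    rank_of = {}
    below = 0
    for value in sorted(counts):
        ties = counts[value]
        rank_of[value] = below + (ties + 1) / 2
        below += ties

    rank_sum_a = sum(rank_of[v] for v in group_a)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2

    # Variance of U under the null, corrected for ties.
    tie_term = sum(t ** 3 - t for t in counts.values()) / (n * (n - 1))
    var_u = n_a * n_b / 12 * max(0.0, (n + 1) - tie_term)
    if var_u == 0:
        return u_a, 1.0  # every value is tied, so nothing to compare
    z = (u_a - n_a * n_b / 2) / math.sqrt(var_u)
    return u_a, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical domain averages for the two program providers.
provider_1 = [4.5, 5.0, 4.0, 4.75, 3.5, 5.0]
provider_2 = [4.0, 3.25, 4.5, 2.5, 3.75, 4.0]
print(mann_whitney_u(provider_1, provider_2))
```

With heavily tied Likert-type averages the tie correction matters, and the normal approximation is less trustworthy for very small groups. In SPSS the NPAR TESTS command provides this test (M-W subcommand) and Kruskal-Wallis (K-W subcommand) for factors with more than two groups.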