SPSSX Discussion

Survey Analysis Questions

ariel barak

Survey Analysis Questions

Hello All,

Background: I have 86 satisfaction surveys from families who have children in a program that provides case management services. The 20 likert questions (Strongly Disagree, Disagree, Undecided, Agree, Strongly Agree) are scored 1 through 5 and are broken into 5 domains. Per the instructions from one of the survey creators, I have only calculated domain scores (averages) if 2/3 or more of the questions for a given domain have been answered and most respondents answered all of the questions.
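The two-thirds rule described above can be sketched as follows. This is a minimal Python illustration rather than SPSS syntax; the function name `domain_score` and the use of `None` for an unanswered item are assumptions made for the example, not part of the survey instructions.

```python
from typing import Optional, Sequence


def domain_score(items: Sequence[Optional[int]]) -> Optional[float]:
    """Mean of the answered items, or None when fewer than 2/3 were answered."""
    answered = [x for x in items if x is not None]
    # Integer comparison avoids floating-point trouble at exactly 2/3.
    if not items or len(answered) * 3 < len(items) * 2:
        return None
    return sum(answered) / len(answered)


# Hypothetical respondent: 3 of 4 items answered, so a score is computed.
print(domain_score([4, 5, None, 4]))
# Hypothetical respondent: 2 of 4 items answered, so no score is computed.
print(domain_score([4, None, None, 5]))
```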

I would like to test whether there are statistically significant differences in the domain scores based on the following factors: Race, Sex, Length of Program Service (less than 365 days, 365+ days), program provider (there are two).

Second, I also have data related to whether parent(s) attended team meetings via a database, regardless of whether or not they responded to the survey. For those who completed the survey, I would like to test for whether there are correlations between the domains mentioned above and the percent of team meetings attended by a family member. I would also like to know if it is possible to state which domains have the strongest correlations to team meeting attendance by the different factors above: race, sex, etc.

Could someone provide me with some help on what the appropriate statistical test(s) are and what the syntax would look like?

Any assistance would be greatly appreciated.

Thanks,

Ariel

statisticsdoc

Re: Survey Analysis Questions

Ariel,
I would start by computing the internal consistency of the computed scores, and then examining the distribution of scores to determine the degree to which they follow a normal distribution. The first will give a conservative estimate of how reliable the scores are in your sample. The second will provide guidance about whether parametric or non-parametric analysis should be used.
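For readers who want to see what the internal-consistency step computes, here is a minimal sketch of Cronbach's alpha in Python. It assumes complete cases only (no missing items), and the response data are invented for illustration; in SPSS the same figure would normally come from the reliability procedure rather than hand-written code.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(rows[0])
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical 4-item domain answered by 5 respondents.
responses = [
    [4, 5, 4, 5],
    [3, 3, 4, 3],
    [5, 5, 5, 5],
    [2, 3, 2, 3],
    [4, 4, 5, 4],
]
print(round(cronbach_alpha(responses), 3))
```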
Best,
Stephen Brand

www.StatisticsDoc.com

From: Ariel Barak <[hidden email]>

Sender: "SPSSX(r) Discussion" <[hidden email]>

Date: Fri, 9 Dec 2011 10:58:55 -0600

To: <[hidden email]>

ReplyTo: Ariel Barak <[hidden email]>

Subject: Survey Analysis Questions

Hello All,

Could someone provide me with some help on what the appropriate statistical test(s) are and what the syntax would look like?

Any assistance would be greatly appreciated.

Thanks,

Ariel

G David Garson

Re: Survey Analysis Questions

While agreeing in part with Stephen, it sounds like you do not have a random sample. Significance testing pertains to random samples, asking about the probability of getting a result as strong as you observe if another random sample were taken. If your data are the entire universe to which you are generalizing (e.g., all your clients), then barring measurement error, even very small results are true results and significance levels do not apply. If you have a non-random sample, you would like to know the probability of a result of given strength if you took another sample, but this is unknowable. Significance levels for non-random data will be in error to an unknown degree.

In an experimental setting, if subjects are randomly assigned to groups, statistical inference may be made but, as Knapp (2009) notes, "the inference is to all possible randomizations for the given sample, not to the population from which the sample was [non-randomly] drawn." That is, if one uses significance testing for non-random subjects who are randomly assigned, one is asking about the probability of getting a result of the given strength if one randomizes again.

Dave

ariel barak

Re: Survey Analysis Questions

In reply to this post by statisticsdoc

Stephen --

Thanks for the quick response. Cronbach's alpha scores on the 5 domains are .807, .907, .956, .959 & .925. Not sure whether you intended for me to examine the questions themselves or the domain averages for normality, but since they are so consistent with their domain, I'm not sure it matters. I have looked at the domain scores and they seem very much right-skewed (agree/strongly agree).

I assume this means that I should be using a chi-square instead of t-test. Please advise me if this is incorrect. This solves the first piece of the analysis I'd like to conduct.

For each child for whom I have a survey, I will also have the # of team meetings attended by a parent and the total # of team meetings, so I will have the % of team meetings attended by a child.

1) How can I test to see which of the 5 domains is the best predictor of whether parents attend team meetings for all respondents?

2) How can I break the respondents by group (race, sex, program) to see what domain is the best predictor of whether parents attend team meetings?

Dave -- you are correct, this is not a random sample. The entire population of those enrolled in the program received a survey (n=268) and I received 86 responses. I dropped 28 youth due to bad addresses for a response rate of 86/240 ~ 35.8%. What do you suggest in analyzing the differences between groups?
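The response-rate arithmetic above can be checked with a short sketch. The function and parameter names are made up for the example; the three counts are the ones stated in the post.

```python
def response_rate(mailed: int, undeliverable: int, returned: int) -> float:
    """Returned surveys as a fraction of the surveys that could be delivered."""
    deliverable = mailed - undeliverable
    return returned / deliverable


# 268 mailed, 28 bad addresses, 86 returned: 86 / 240.
rate = response_rate(mailed=268, undeliverable=28, returned=86)
print(f"{rate:.1%}")
```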

Thanks for your help!

-Ariel

Bruce Weaver

Re: Survey Analysis Questions

Administrator

In reply to this post by G David Garson

Hi Dave. Is this the Knapp article you are referring to?

www.tomswebpage.net/images/sigconf.doc

Thanks,
Bruce

--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING:
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Rich Ulrich

Re: Survey Analysis Questions

In reply to this post by ariel barak

I'm not Stephen, but see my comments inserted, below.

Date: Fri, 9 Dec 2011 12:45:22 -0600
From: [hidden email]
Subject: Re: Survey Analysis Questions
To: [hidden email]

Stephen --

"Thanks for the quick response. Cronbach's alpha scores on the 5 domains are .807, .907, .956, .959 & .925. Not sure whether you intended for me to examine the questions themselves or the domain averages for normality, but since they are so consistent with their domain, I'm not sure it matters. I have looked at the domain scores and they seem very much right-skewed (agree/strongly agree). "

- The "skew" gets its name from the side with the long tail. For the
order you stated, your scores bunch up on the right, so this is "left-skewed".
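A quick way to confirm the direction is the sign of the skewness coefficient: negative means the long tail is on the left. The sketch below uses the plain moment-based coefficient and invented domain averages that bunch up near 5; neither the function name nor the data come from the thread.

```python
from statistics import mean, pstdev


def skewness(xs):
    """Moment-based skewness: mean of cubed z-scores (population SD)."""
    m, s = mean(xs), pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)


# Hypothetical domain averages: most near "Strongly Agree", a few low.
scores = [5, 5, 5, 4.75, 4.5, 4.5, 4, 3.5, 2]

# Negative skewness: the long tail points toward the low (left) end.
print(skewness(scores) < 0)
```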

"I assume this means that I should be using a chi-square instead of t-test. Please advise me if this is incorrect. This solves the first piece of the analysis I'd like to conduct."

Non-parametric comparisons usually rely on something other than a
chi-squared test. However, I do not think that many people would
agree with Stephen, where he advocated (rank-based) nonparametric
tests for totals of Likert-type items. If you score the "Totals" as
item-average scores, you have an immediate set of anchor labels for
interpreting the means that you observe for various groups. That is
the biggest gain.

These items do fail to meet the Likert ideal, because they do not
have averages near the midpoint.

There is very little loss of power or validity for the tests, when the
variance is relatively restricted by being this sort of sum. However,
here are several alternatives or extensions:
a) Since there are few responses of 1 and 2, create new scores of
"Objecters" by counting the number of responses of 1 and 2 - if you
want to see whether the LOW extremes are particularly important.
b) To create a nicer "interval" basis for the items that are totalled,
rescore the items as 2-5 or 3-5, and obtain totals from those.
c) To preserve the original scoring while creating a nicer interval
basis for the Total, you could subtract each Total from its maximum-
possible (for left-skewed), and take the square root. The scoring
now will run in the opposite direction, so it will be useful to apply
the actual labels, as transformed, to keep track of what you are
observing.
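The three options can be sketched in a few lines of Python. The function names are hypothetical, a maximum of 5 is assumed because the domain scores here are item averages on a 1-5 scale, and option (b) is read as folding the rare low responses up into the lowest retained category, which is one interpretation of "rescore the items as 2-5 or 3-5".

```python
import math


def objector_count(items):
    """Option (a): how many items were answered 1 or 2 (None = unanswered)."""
    return sum(1 for x in items if x is not None and x <= 2)


def collapse_low(item, floor=2):
    """Option (b), as read here: responses below `floor` are scored as `floor`."""
    return max(item, floor)


def reflect_sqrt(score, max_score=5.0):
    """Option (c): sqrt(max - score). Higher values now mean LOWER satisfaction."""
    return math.sqrt(max_score - score)


print(objector_count([1, 2, 3, None, 5]))
print([collapse_low(x, floor=3) for x in [1, 2, 3, 4, 5]])
print(reflect_sqrt(4.0))
```

Because option (c) reverses the direction of the scale, any table or plot built on the transformed score needs relabeled anchors, as noted above.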

"For each child for whom I have a survey, I will also have the # of team meetings attended by a parent and the total # of team meetings, so I will have the % of team meetings attended by a child."

Unless there are always a lot of meetings, that % will not be very
attractive as a score. But you do need to figure out what will make
a useful criterion. What is logical? What will your audience see as
meaningful?

"1) How can I test to see which of the 5 domains is the best predictor of whether parents attend team meetings for all respondents?"

Please use the term "best correlate" and please consider why and
how a correlation can result from "common causation". If, as you
say, this is a "satisfaction survey", it is pretty hard to construe the
satisfaction as actually "predicting" the attendance for the previous
year.

You can look at scatterplots of scores, and simple correlations, if you
are considering two continuous scores. Your display may be more
important than your test -- These are observational data, so you do
need to tell a convincing story about how much association you see,
and why it matters.

"2) How can I break the respondents by group (race, sex, program) to see what domain is the best predictor of whether parents attend team meetings?"

Again, "correlation" and not "prediction". Even though you don't have
the timing problem for race and sex, "Correlation is not causation."
What is the "average attendance score" for males vs. females?

"Dave -- you are correct, this is not a random sample. The entire population of those enrolled in the program received a survey (n=268) and I received 86 responses. I dropped 28 youth due to bad addresses for a response rate of 86/240 ~ 35.8%. What do you suggest in analyzing the differences between groups?"

Since you have the data on hand, you certainly should do your basic
comparisons of 86 Responders vs. 28 Dropped-responders vs. 154
non-responders, for whatever data (Attendance, only?) in the complete file.

[snip, previous]

--
Rich Ulrich

ariel barak

Re: Survey Analysis Questions

In reply to this post by ariel barak

Steve,

I have demographic data for all youth in the program, who their provider was, as well as whether they were in the program for more or less than 365 days. I ran Chi-square tests on sex, race, program provider, program length for those who responded and the total mailing list (minus the bad addresses) and the Pearson Chi-Square values are .258, .298, .582, and .165 respectively for the categories above. I take these values to mean that the population that responded to the survey is not statistically different from those that did not. I apologize for not including this information initially.
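As an illustration of the kind of comparison described above, here is a minimal Pearson chi-square for a 2x2 table (for example, responder status by sex). The counts are invented, the function name is hypothetical, and no continuity correction is applied; the p-value formula used is specific to one degree of freedom.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value (df = 1) for a 2x2 count table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    chi2 = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # For df = 1, P(chi-square > x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: rows = responders (86) and non-responders (154),
# columns = the two sexes.
chi2, p = chi_square_2x2([[50, 36], [80, 74]])
print(round(chi2, 3), round(p, 3))
```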

I have looked at the distribution of the scale scores and believe that one would consider them very skewed. I will look at the Mann Whitney U test for sex, program provider (only 2), and program length (flagged 0 or 1 for <365 days or 365+) and the Kruskal-Wallis test for comparing scores of race (broken into 4).

For what it's worth, when I eyeball them, the Spearman's rho values and the Pearson correlations for the domains don't appear to be very different.
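To make the Spearman versus Pearson comparison concrete: Spearman's rho is simply the Pearson correlation computed on ranks, with tied values sharing the average of their ranks. The sketch below is plain Python with hypothetical function names and invented data.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def midranks(values):
    """Ranks starting at 1; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson correlation of the mid-ranks."""
    return pearson_r(midranks(xs), midranks(ys))


# Hypothetical domain averages and attendance percentages.
domain = [4.0, 2.5, 3.5, 5.0, 4.5, 4.5]
attendance = [26.3, 10.6, 78.7, 90.0, 55.0, 60.0]
print(round(pearson_r(domain, attendance), 3), round(spearman_rho(domain, attendance), 3))
```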

I'm not sure what you mean in the quote below:

"Before you use multiple regression, we need to have a discussion regarding the correlations between predictors and coding your multi-category nominal and ordinal variables."

Could you please clarify?

Ultimately, I intend to have data similar to what's below, and would like to know if it is possible to test which domain is most closely correlated with team meeting attendance by parents.

Client  Sex  Race  Program  ParentAttendance  Domain1Avg  Domain2Avg
100     1    1     1        26.3%             4.0         3.67
101     2    2     2        10.6%             2.5         2.33
102     1    1     1        78.7%             3.5         4.00
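One way to approach "which domain is most closely correlated" with data laid out like this is to correlate each domain column with attendance and order the domains by the size of the correlation. The sketch below uses only the three example rows above, so the numbers carry no statistical weight; the helper names are hypothetical, and with real data the correlations could be compared descriptively or with a formal test for dependent correlations.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# The three example rows from the layout above.
rows = [
    {"Client": 100, "ParentAttendance": 26.3, "Domain1Avg": 4.0, "Domain2Avg": 3.67},
    {"Client": 101, "ParentAttendance": 10.6, "Domain1Avg": 2.5, "Domain2Avg": 2.33},
    {"Client": 102, "ParentAttendance": 78.7, "Domain1Avg": 3.5, "Domain2Avg": 4.00},
]

attendance = [r["ParentAttendance"] for r in rows]
correlations = {
    name: pearson_r([r[name] for r in rows], attendance)
    for name in ("Domain1Avg", "Domain2Avg")
}

# Domains ordered from strongest to weakest absolute correlation.
ranking = sorted(correlations.items(), key=lambda kv: abs(kv[1]), reverse=True)
for name, r in ranking:
    print(name, round(r, 3))
```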

Thanks for all your help with this...it is very much appreciated.

-Ariel

On Fri, Dec 9, 2011 at 3:15 PM, StatisticsDoc <[hidden email]> wrote:

Ariel,

It would be best to focus on the skew of the scale scores. The sum of the items is often less skewed than the individual items are. How skewed are the data? Parametric statistics are reasonably robust to violations of normality.

If your scale scores are extremely skewed, do not use Chi-square to compare scores between groups. Your scale scores are not categorical and need not be treated as such by dividing them into categories. You would want to use the Mann-Whitney U test to compare scores between two groups (e.g. sex), the Kruskal-Wallis test for comparing scores between three or more levels of a categorical variable, and the Spearman correlation for correlations between the scales and other scales. These statistics can be computed using the non-parametric stats commands in SPSS (NPAR TESTS, NONPAR CORR).
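To show what the Mann-Whitney U test does with two groups of scale scores, here is a minimal sketch. It ranks the pooled scores, computes U for each group, and uses a normal approximation for the two-sided p-value with no tie or continuity correction, so the p-value is only approximate (SPSS applies its own corrections). The function names and the data are assumptions for illustration.

```python
import math


def midranks(values):
    """Ranks starting at 1; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U for group_a, U for group_b, approximate two-sided p-value)."""
    n1, n2 = len(group_a), len(group_b)
    ranks = midranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n1])
    u_a = rank_sum_a - n1 * (n1 + 1) / 2
    u_b = n1 * n2 - u_a
    # Normal approximation, ignoring ties and continuity correction.
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u_a - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u_a, u_b, p_value


# Hypothetical domain averages for two groups (e.g. the two providers).
provider_1 = [4.5, 5.0, 4.75, 3.5, 5.0, 4.25]
provider_2 = [4.0, 3.25, 4.5, 2.5, 3.75]
print(mann_whitney_u(provider_1, provider_2))
```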

When your data depart markedly from normality, non-parametric statistics provide a more accurate test. You would not use them if your data were close to normally distributed, as they would be less powerful than parametric stats (i.e., less likely to detect significant associations in the data). SPSS does not provide non-parametric multiple regression as a standard feature, but there may be a script or other extension program out there in the user community that you can use (the most amazing things can be found there).

Hopefully, the departures from normality are small, and you can use the more familiar family of parametric statistics: t-tests, one way ANOVA, multiple regression. Before you use multiple regression, we need to have a discussion regarding the correlations between predictors and coding your multi-category nominal and ordinal variables.

Advice concerning the non-random nature of the sample is well taken. I am going to proceed for the sake of discussion as though your sample were random, although of course sampling bias is a great problem. Do you have any data on the non-responders that would enable you to compare them with the responders?

Best

Steve

ariel barak

Re: Survey Analysis Questions

In reply to this post by Rich Ulrich

Hi Rich,

Thanks for responding to the topic. I'll try breaking general issues apart by asterisks.

******
I appreciate the 3 options you provided. With competing ideas on what I should be doing, I want to be sure that I understand your perspective. You believe that I could use parametric tests and that I don't need to use one of the 3 options you provided...right?

******
You wrote "These items do fail to meet the Likert ideal, because they do not
have averages near the midpoint."

What effect does this have on what I should be doing?

******
I wrote - For each child for whom I have a survey, I will also have the # of team meetings attended by a parent and the total # of team meetings, so I will have the % of team meetings attended by a child.

You wrote - "Unless there are always a lot of meetings, that % will not be very
attractive as a score. But you do need to figure out what will make
a useful criterion. What is logical? What will your audience see as
meaningful?"

The team meetings are to occur on a monthly basis and some youth may only be in the program for a few months, so you're right to bring up that concern. I could limit this piece of the analysis to those who had more than 6 team meetings. Any other suggestions on how I should handle this?

I could create a variable based on 25% increments but am not sure if this would be appropriate.

******
Regarding domains, they are:
Access (location was convenient, services available at convenient times)
Participation (I helped choose services, I helped choose trt. goals)
Cultural Sensitivity (Staff treated me with respect, Staff spoke with me in a way I understood)
Appropriateness (I'm satisfied with the service my child received)
Outcomes (My child is better at handling daily life, doing better at school, etc.)

It seems reasonable that if parents felt the location was inconvenient or staff were disrespectful, that they would not attend team meetings. I will look at the scatterplots of the different domain scores by the % of team meetings attended by parents and look for correlations.

Thanks to everyone for their feedback.

-Ariel

Rich Ulrich

Re: Survey Analysis Questions

In reply to this post by ariel barak

I've numbered your asterisks, and I'll comment.

1) Right, pretty much. Especially for data where the actual means
are so useful, I certainly would avoid the rank tests as my main
presentation. If you use the skewed data instead of a modification,
it is possible to mollify the nervous nellies by running those tests, too,
so you can add, "Non-parametric tests give the same results."

2) No effect on analyses. Something to know when you are talking
about them.

3) (How to handle counts of meetings.) I would want to have on
hand the distribution of observed counts and durations: 6 meetings
out of 12 may be different from 6 meetings out of 6. Then I would
want to ask the people running the meetings how they might "rank"
the participation in some sense. What is "good participation" when
you consider the purpose and the clientele?

4) I don't see a question. I can comment -- I once was told that
one of the big factors for poor attendance for psychiatric outpatients
was inadequate bus service.
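
For point 3, one way to get the distribution of observed counts on hand is to tabulate each (attended, scheduled) pair before reducing anything to a percentage. A Python sketch follows; the function name `attendance_table` and the pairs are invented for illustration.

```python
from collections import Counter

def attendance_table(pairs):
    """Count how many families share each (attended, scheduled) combination."""
    for attended, scheduled in pairs:
        if scheduled <= 0 or not 0 <= attended <= scheduled:
            raise ValueError(f"implausible pair: {attended} of {scheduled}")
    return Counter(pairs)

# Hypothetical (meetings attended, meetings scheduled) pairs.
pairs = [(6, 12), (6, 6), (6, 12), (3, 4)]
table = attendance_table(pairs)
for (attended, scheduled), n in sorted(table.items()):
    print(f"{attended} of {scheduled}: {n} families")
```

Keeping both numbers visible shows that 6 of 12 and 6 of 6 are distinct patterns even though the attended count is the same.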

--
Rich Ulrich

Date: Fri, 9 Dec 2011 17:40:17 -0600
From: [hidden email]
Subject: Re: Survey Analysis Questions
To: [hidden email]

Hi Rich,

Thanks for responding to the topic. I'll try breaking general issues apart by asterisks.

****** (1)
I appreciate the 3 options you provided. With competing ideas on what I should be doing, I want to be sure that I understand your perspective. You believe that I could use parametric tests and that I don't need to use one of the 3 options you provided...right?

****** (2)
You wrote "These items do fail to meet the Likert ideal, because they do not
have averages near the midpoint."

What effect does this have on what I should be doing?

****** (3)
I wrote - For each child for whom I have a survey, I will also have the # of team meetings attended by a parent and the total # of team meetings, so I will have the % of team meetings attended by a child.

You wrote - "Unless there are always a lot of meetings, that % will not be very
attractive as a score. But you do need to figure out what will make
a useful criterion. What is logical? What will your audience see as
meaningful?"

The team meetings are to occur on a monthly basis and some youth may only be in the program for a few months, so you're right to bring up that concern. I could limit this piece of the analysis to those who had more than 6 team meetings. Any other suggestions on how I should handle this?

I could create a variable based on 25% increments but am not sure if this would be appropriate.
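
To make the two ideas above concrete (restricting to youth with more than 6 team meetings, and 25% increments), here is a Python sketch. The function name `attendance_band`, the band labels, and the exact cut points are assumptions for illustration, not a recommendation that banding is the right choice.

```python
def attendance_band(attended, scheduled, min_meetings=7):
    """Return a 25%-increment band label, or None when too few meetings were scheduled.

    min_meetings=7 encodes "more than 6 team meetings"; both the threshold
    and the band edges are assumptions.
    """
    if scheduled < min_meetings:
        return None
    pct = 100 * attended / scheduled
    if pct < 25:
        return "0-24%"
    if pct < 50:
        return "25-49%"
    if pct < 75:
        return "50-74%"
    return "75-100%"

# Hypothetical cases: (attended, scheduled).
for attended, scheduled in [(6, 12), (3, 4), (12, 12), (1, 8)]:
    print(attended, scheduled, attendance_band(attended, scheduled))
```

Cases below the threshold come back as `None`, so they can be counted and reported separately rather than silently dropped.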

****** (4)
Regarding domains, they are:
Access (location was convenient, services available at convenient times)
Participation (I helped choose services, I helped choose treatment goals)
Cultural Sensitivity (Staff treated me with respect, Staff spoke with me in a way I understood)
Appropriateness (I'm satisfied with the service my child received)
Outcomes (My child is better at handling daily life, doing better at school, etc.)

It seems reasonable that if parents felt the location was inconvenient or staff were disrespectful, that they would not attend team meetings. I will look at the scatterplots of the different domain scores by the % of team meetings attended by parents and look for correlations.

[snip, previous]