Dear All,
I have market share data for 83 cases (products) by 15 sources of information (variables). The whole matrix is populated with share information; the 15 sources are the possible places where a respondent would have heard about the product before using it.

My task is simply to determine whether the shares differ depending on the source of information. Does anyone have an idea on how to approach this?

The data look something like this ("." marks a missing value):

          info source1  info source2  info source3  info source4  info source5
prod 1            9.67          6.04          2.14          5.10          6.00
prod 2            3.00          6.67             .          0.00          6.25
prod 3           31.17         30.16          0.00         30.00         29.27
prod 4            3.75          0.74          0.00          1.00          3.75
prod 5           25.00         28.33             .          5.00         15.00
prod 6            8.38          2.87          3.14          2.05          2.00
prod 7               .          2.50          0.00          0.00         10.00
prod 8           22.25         17.87         10.04         17.40         18.92
prod 9            6.00          6.83          2.83          1.52          3.67
prod 10           6.33          2.74          3.80          2.73          3.18
prod 11              .          2.00             .             .             .
prod 12              .          0.00          0.00          0.00          0.00

Thanks,
Sibusiso.
Hi Sibusiso
I'm not really sure what you are asking. Are you interested in finding out whether the sources differ (they don't give the same mean value across all the products), or whether they are consistent (if product no. 3 has the highest value in source 1, then sources 2, 3... should also give it higher values than the rest of the products)?

The first question calls for a repeated measures ANOVA (or a Friedman test if the data are not normally distributed). The second question would be answered by Kendall's W test, the coefficient of concordance (warning: I'm NOT talking about Kendall's tau correlation coefficient). In both cases you have a problem: those scattered missing data will lower your sample size.

--
Regards,
Dr. Marta García-Granero, PhD
mailto:[hidden email]
Statistician

---
"It is unwise to use a statistical procedure whose use one does not understand. SPSS syntax guide cannot supply this knowledge, and it is certainly no substitute for the basic understanding of statistics and statistical thinking that is essential for the wise choice of methods and the correct interpretation of their results".
(Adapted from WinPepi manual - I'm sure Joe Abrahmson will not mind)
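As an illustration of the Friedman test mentioned above, here is a minimal Python sketch (Python is used only for illustration; the thread itself works in SPSS). It assumes listwise deletion, so only products with all five sources of the posted excerpt are used. The names `SHARES`, `average_ranks` and `friedman` are made up for this example. The statistic is tie-corrected and would be compared against a chi-square distribution with k - 1 degrees of freedom; no p-value is computed here.

```python
# Shares for the first five sources, copied from the excerpt in the original
# post. None marks a missing cell ("." in the post).
SHARES = [
    [9.67, 6.04, 2.14, 5.10, 6.00],       # prod 1
    [3.00, 6.67, None, 0.00, 6.25],       # prod 2
    [31.17, 30.16, 0.00, 30.00, 29.27],   # prod 3
    [3.75, 0.74, 0.00, 1.00, 3.75],       # prod 4
    [25.00, 28.33, None, 5.00, 15.00],    # prod 5
    [8.38, 2.87, 3.14, 2.05, 2.00],       # prod 6
    [None, 2.50, 0.00, 0.00, 10.00],      # prod 7
    [22.25, 17.87, 10.04, 17.40, 18.92],  # prod 8
    [6.00, 6.83, 2.83, 1.52, 3.67],       # prod 9
    [6.33, 2.74, 3.80, 2.73, 3.18],       # prod 10
    [None, 2.00, None, None, None],       # prod 11
    [None, 0.00, 0.00, 0.00, 0.00],       # prod 12
]


def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def friedman(rows):
    """Return (chi_square, df, kendalls_w) for complete rows.

    Each row is a block (here a product); each column is a treatment
    (here an information source).
    """
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    tie_term = 0
    for row in rows:
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
        for value in set(row):
            t = row.count(value)
            tie_term += t * (t * t - 1)
    correction = 1.0 - tie_term / (n * k * (k * k - 1))
    if correction == 0:
        raise ValueError("every row is completely tied; the statistic is undefined")
    chi2 = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    chi2 /= correction
    return chi2, k - 1, chi2 / (n * (k - 1))


# Listwise deletion: keep only products with no missing source.
complete = [row for row in SHARES if None not in row]
chi2, df, w = friedman(complete)
print(f"{len(complete)} complete products: chi-square = {chi2:.2f}, df = {df}, W = {w:.2f}")
```

Calling `friedman` on the transposed complete table (sources as rows, products as columns) would return the Kendall's W for the consistency question, since W is the same statistic rescaled by n(k - 1).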
Marta,
Thanks for your response. What I am trying to find out is which source of information consistently yields a higher score (share) across different products. That is, if people heard about a product from info source 1, are their scores higher than if they heard about it from info source 2, and so on? I hope this is much clearer.

Sibusiso.
Marta,
Here is the test I ended up running:

GLM
  w5_mean_1.1 w5_mean_1.2 w5_mean_1.3 w5_mean_1.5 w5_mean_1.6 w5_mean_1.7
  w5_mean_1.8 w5_mean_1.9 w5_mean_1.10 w5_mean_1.11 w5_mean_1.12 w5_mean_1.13
  w5_mean_1.14
  /WSFACTOR = Source 13 Polynomial
  /METHOD = SSTYPE(3)
  /EMMEANS = TABLES(Source)
  /PRINT = DESCRIPTIVE ETASQ OPOWER PARAMETER TEST(MMATRIX) LOF GEF
  /PLOT = RESIDUALS
  /CRITERIA = ALPHA(.05)
  /WSDESIGN = Source .

My justification is that each of my products provides data for all the 14 sources of information, so I am treating this as a one-way within-subjects ANOVA.

Thanks for your insight,
Sibusiso.
Hi Sibusiso
Did you lose a lot of sample size after listwise deletion? If you want to recover the partial data you have for any product with missing sources, you can use VARSTOCASES to restructure your dataset from wide to long and then run a two-factor UNIANOVA with no interaction term. The only condition for this last approach (besides normality, of course) is that Mauchly's sphericity test (you got it as part of your GLM analysis) is non-significant.

If you need a bit of assistance to transform your dataset and run the UNIANOVA, tell me and I'll send a worked sample dataset.

Regards,
Marta
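To make the gain from the wide-to-long restructuring concrete, here is a small Python sketch (for illustration only; it is not SPSS code and not part of Marta's method). It uses four products from the posted excerpt, and the names `WIDE`, `SOURCES` and `wide_to_long` are invented for the example. Unlike VARSTOCASES with /NULL = KEEP, this sketch simply skips missing cells, which is an assumption made to keep the count easy to read.

```python
SOURCES = ["source1", "source2", "source3", "source4", "source5"]

# Four products from the excerpt; None marks a missing share.
WIDE = {
    "prod 1": [9.67, 6.04, 2.14, 5.10, 6.00],
    "prod 2": [3.00, 6.67, None, 0.00, 6.25],
    "prod 3": [31.17, 30.16, 0.00, 30.00, 29.27],
    "prod 7": [None, 2.50, 0.00, 0.00, 10.00],
}


def wide_to_long(wide, sources):
    """Return one (product, source, share) record per observed cell."""
    return [
        (product, source, share)
        for product, shares in wide.items()
        for source, share in zip(sources, shares)
        if share is not None
    ]


long_rows = wide_to_long(WIDE, SOURCES)

# Listwise deletion keeps a product only if every source is present.
complete_products = [p for p, shares in WIDE.items() if None not in shares]

print(len(long_rows), "observed cells available in long format")
print(len(complete_products) * len(SOURCES), "cells left after listwise deletion")
```

In long format a product with one missing source still contributes its other shares, whereas the wide repeated measures layout drops the whole product.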
Marta,
Thanks for your help. I have been using "series means" to replace the missing values. I am not sure how accurate that is, but I am willing to try the method you are proposing. Please send a worked sample dataset.

Thanks a million,
Sibusiso.
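One well-known side effect of series-mean replacement can be checked in a few lines. The Python sketch below (illustrative only; `SOURCE1` and `impute_series_mean` are names made up here) fills the missing cells of the "info source1" column from the posted excerpt with the mean of the observed values. The mean stays the same, but the standard deviation shrinks, because every filled cell sits exactly on the mean.

```python
from statistics import mean, stdev

# "info source1" column from the posted excerpt; None marks a missing share.
SOURCE1 = [9.67, 3.00, 31.17, 3.75, 25.00, 8.38, None, 22.25, 6.00, 6.33, None, None]


def impute_series_mean(values):
    """Replace each missing value with the mean of the observed values."""
    observed_values = [v for v in values if v is not None]
    fill = mean(observed_values)
    return [fill if v is None else v for v in values]


observed = [v for v in SOURCE1 if v is not None]
filled = impute_series_mean(SOURCE1)

print(f"observed: n = {len(observed)}, mean = {mean(observed):.2f}, sd = {stdev(observed):.2f}")
print(f"imputed:  n = {len(filled)}, mean = {mean(filled):.2f}, sd = {stdev(filled):.2f}")
```

A smaller standard deviation combined with a larger apparent n makes any subsequent test look more precise than the observed data justify.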
Hi Sibusiso
SM> I have been using "series means" to replace the missing
SM> values.

Houmm... There are several caveats to that approach. Please read this excellent work on the topic:

http://www.uvm.edu/~dhowell/StatPages/More_Stuff/Missing_Data/Missing.html

There is also a program called NORM that works with SPSS and generates a large collection of datasets with different replacements of the missing data (multiple imputation):

http://www.stat.psu.edu/~jls/misoftwa.html
http://www.stat.psu.edu/~jls/mifaq.html

SM> I am not sure how accurate that is, but I am willing to
SM> try the method you are proposing.
SM> Please send a worked sample data set.

Here it goes (ignore the error message when the dataset is defined; it's due to the missing data).

DATA LIST LIST/control zone1 zone2 zone3 (4 F8.1).
BEGIN DATA
15.0 17.9 16.5 16.7
23.5 26.5 35.4 34.1
20.1 45.2 22.6 .
26.1 39.1 33.4 30.6
26.5 35.2 37.6 30.1
19.4 35.1 30.4 24.6
16.4 31.8 . 20.1
21.1 21.4 20.8 18.4
19.8 33.1 29.4 24.3
17.4 31.1 28.4 29.6
END DATA.

VAR LABEL control 'Cu levels in Control skin'
  /zone1 'Cu levels in Zone 1 burned skin'
  /zone2 'Cu levels in Zone 2 burned skin'
  /zone3 'Cu levels in Zone 3 burned skin'.

* Restructure from wide to long: one row per rat and zone.
VARSTOCASES /ID = rats
  /MAKE copper FROM control TO zone3
  /INDEX = zones 'Burned skin zones'(4)
  /KEEP =
  /NULL = KEEP.

VAR LAB copper 'Copper levels'.
VAL LAB zones 1'Control' 2'Zone 1' 3'Zone 2' 4'Zone 3'.

* Two-factor ANOVA (zones fixed, rats random) with no interaction term.
UNIANOVA
  copper BY zones rats
  /RANDOM = rats
  /METHOD = SSTYPE(4)
  /INTERCEPT = EXCLUDE
  /EMMEANS = TABLES(zones) COMPARE ADJ(BONFERRONI)
  /PLOT = RESIDUALS
  /CRITERIA = ALPHA(.05)
  /DESIGN = zones rats .

Regards,
Marta
Marta,
When I ran the GLM (WSFACTOR) in wide format I had 83 cases, and that number drops to 23 usable ones because of missing values. My data were actually in long format before I used CASESTOVARS to transform them to wide format, so I have both formats. I will now proceed and run the second method you suggested!

Thank you,
Sibusiso.
This would be better as a mixed models analysis.
Paul,
It is my understanding that linear mixed models are related to the GLM Univariate and GLM Repeated Measures procedures, that is, that they can be used interchangeably (after having taken care of correlation issues). So, Paul, I was wondering why you thought a mixed model analysis would be better in this case?

Thanks,
Sibusiso.