|
Dear Members,
I hope you won't mind my returning to an old question. I have data in the following format:

Group A: x1 n1, which gives p1 = x1/n1
Group B: x2 n2, which gives p2 = x2/n2
Group C: x3 n3, which gives p3 = x3/n3
.
.
.
Group Z: x20 n20, which gives p20 = x20/n20

I want to compare the proportions for all possible pairs of groups to check for significant differences. The code you provided compares proportions by column or by row, which is not my objective. I tried CTABLES but was not able to get the desired result. I would be grateful for your support and help.

Regards
Ramzan Afzal
|
Hi, CTABLES is the better option, keeping the group variable in the columns. Try generating your desired combinations as new variables kept in separate rows before running CTABLES; otherwise you can run the CTABLES syntax several times. This should work.
On Sun, Jun 28, 2009 at 6:40 PM, Ramzan Afzal <[hidden email]> wrote:
-- Samir Paul 382 Main Street, TORONTO ON M4C4X8 CANADA Phone: ( 001 ) 416 686 9958 |
|
Hi Ramzan
I don't fully grasp what you are trying to achieve. Do you want to compare the first group against all the others, then the second against all the rest, and so on? What is the format of the dataset (please post an example)? Do you have aggregated or raw data?

Marta

--
For miscellaneous SPSS related statistical stuff, visit: http://gjyp.nl/marta/

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L. For a list of commands to manage subscriptions, send the command INFO REFCARD.
|
Hi Ramzan
Please remember to address your messages to the list, not to me privately, since the response might benefit other people too, and others might contribute better solutions.

Ramzan Afzal wrote:
> Yes - I want to compare the first group against the others, then the
> second, and so on. I have the aggregated data (x and n) for all groups,
> and I have 16 groups.
>
> The format of the data is as mentioned below:
> Group x n
> A 20 100
> B 14 200
> C 30 145
> D 54 90

HTH,
Marta GG

* SYNTAX *.
* Sample dataset *.
DATA LIST LIST/Group(A1) x n (2 F8).
BEGIN DATA
A 20 100
B 14 200
C 30 145
D 54 90
END DATA.

MATRIX.
PRINT /TITLE=' "ONE GROUP VS ALL THE OTHERS" COMPARISONS (ASYMPTOTIC)'.
GET Group /VAR=Group.
GET data /VAR= x n.
COMPUTE Crosstab={data(:,1),data(:,2)-data(:,1)}.
PRINT CrossTab /RNAMES=group /CLABEL='Yes','No' /TITLE='Input data: Contingency table'.
RELEASE data.
COMPUTE n=MSUM(crosstab).
COMPUTE K=NROW(CrossTab).
COMPUTE Report=MAKE(K,2,0).
* All comparisons *.
LOOP I=1 TO K.
. COMPUTE a=CrossTab(I,1).
. COMPUTE b=CrossTab(I,2).
. COMPUTE c=CSUM(CrossTab(:,1)) - a.
. COMPUTE d=CSUM(CrossTab(:,2)) - b.
. COMPUTE Gnames={Group(I),'Others'}.
PRINT {a,b,a+b;c,d,c+d;a+c,b+d,n} /FORMAT='F8' /RNAMES= Gnames /TITLE='Comparison:'.
* Chi-square (uncorrected) for 2x2 contingency table *.
. COMPUTE ChiSq=n*((a*d-b*c)**2)/((a+b)*(c+d)*(a+c)*(b+d)).
. COMPUTE ChiSig=1-CHICDF(ChiSq,1).
. COMPUTE Report(I,1)=ChiSq.
. COMPUTE Report(I,2)=ChiSig.
END LOOP.
PRINT Report /FORMAT='F8.4' /RNAMES=Group /CLABELS='Chi-Sq','Sig.' /TITLE='Results for every group compared vs the rest'.
END MATRIX.
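For readers without SPSS at hand, the "one group vs. all the others" comparison above can be cross-checked with a minimal Python sketch. It is not part of the original syntax; it assumes the same uncorrected 2x2 chi-square formula used in the MATRIX code, and the function name `one_vs_rest` is made up for illustration. The p-value uses the identity that the upper tail of a chi-square with 1 degree of freedom equals erfc(sqrt(chi2/2)).

```python
import math

# First sample dataset from the thread: group -> (favorable cases x, total cases n).
data = {"A": (20, 100), "B": (14, 200), "C": (30, 145), "D": (54, 90)}

def one_vs_rest(data):
    """Uncorrected 2x2 chi-square of each group against all other groups pooled.

    Hypothetical helper for illustration; mirrors the formula in the MATRIX code.
    """
    total_x = sum(x for x, n in data.values())
    total_n = sum(n for x, n in data.values())
    results = {}
    for group, (x, n) in data.items():
        a, b = x, n - x                    # this group: yes / no
        c = total_x - a                    # other groups pooled: yes
        d = (total_n - total_x) - b        # other groups pooled: no
        chi_sq = total_n * (a * d - b * c) ** 2 / (
            (a + b) * (c + d) * (a + c) * (b + d))
        p_value = math.erfc(math.sqrt(chi_sq / 2))   # upper tail, 1 df
        results[group] = (chi_sq, p_value)
    return results

results = one_vs_rest(data)
for group, (chi_sq, p_value) in results.items():
    print(f"{group} vs others: chi-square={chi_sq:.4f}, p={p_value:.4f}")
```

Each row corresponds to one pass of the LOOP in the MATRIX code, so the two outputs should agree up to rounding.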
|
Hi Marta,
Thanks again for the solution. It works, but it compares Group A with all the others pooled (B+C+D). My problem is that I want the comparison between pairs of groups: A & B, then A & C, A & D, B & C, B & D and finally C & D.
CTABLES does this (proportion test), but it sums x and n as a total and calculates percentages based on that total. In my case that is logically wrong, because I am dividing x by n for each group and then comparing the difference between proportions for all possible combinations of groups.
Any help will be much appreciated, whether by amending the code or by using CTABLES.
2009/6/29 Marta García-Granero <[hidden email]> Hi Ramzan |
|
Hi Ramzan
You say CTABLES doesn't do what you want, but it does. Check the output, because I think you must be misinterpreting it, or perhaps specifying the comparisons incorrectly. CTABLES performs pairwise comparisons (that's the name, BTW, for what you want) of group percentages, with Bonferroni correction to control the type I error. You can also check the Marascuilo procedure code I sent to the list several weeks ago, because it also performs all pairwise comparisons, but with a Scheffé-like correction instead of Bonferroni.

This is the syntax to run CTABLES for your data:

* Sample dataset *.
DATA LIST LIST/Group(A1) x n (2 F8).
BEGIN DATA
A 20 100
B 14 200
C 30 145
D 54 90
END DATA.

* Data must be restructured first *.
COMPUTE y = n - x .
EXECUTE .
DELETE VARIABLES n.
VARSTOCASES /MAKE frequency FROM x y
 /INDEX = Outcome(2)
 /KEEP = Group
 /NULL = KEEP.
VALUE LABEL outcome 1'Yes' 2'No'.
WEIGHT BY frequency.

* Pairwise comparisons *.
CTABLES
 /VLABELS VARIABLES=Group Outcome DISPLAY=DEFAULT
 /TABLE Outcome [COUNT F40.0, COLPCT.COUNT PCT40.1] BY Group
 /SLABELS POSITION=ROW
 /CATEGORIES VARIABLES=Group Outcome ORDER=A KEY=VALUE EMPTY=INCLUDE
 /COMPARETEST TYPE=PROP ALPHA=0.05 ADJUST=BONFERRONI ORIGIN=COLUMN
 INCLUDEMRSETS=YES CATEGORIES=ALLVISIBLE.

The results are OK (to my knowledge), unless I am missing something you have not explained yet.

Marta
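As a rough cross-check of what "pairwise comparisons with Bonferroni correction" means, here is a minimal Python sketch. It assumes the textbook pooled two-proportion z-test and a plain Bonferroni multiplier; the thread does not spell out the exact computation CTABLES performs, so small differences from the SPSS output are possible. The function name `pairwise_z_tests` is hypothetical.

```python
import math
from itertools import combinations

# First sample dataset from the thread: group -> (favorable cases x, total cases n).
data = {"A": (20, 100), "B": (14, 200), "C": (30, 145), "D": (54, 90)}

def pairwise_z_tests(data, alpha=0.05):
    """Pooled two-proportion z-test for every pair of groups, Bonferroni-adjusted.

    Assumption: textbook formulas, not necessarily identical to CTABLES internals.
    """
    pairs = list(combinations(data, 2))
    results = {}
    for g1, g2 in pairs:
        (x1, n1), (x2, n2) = data[g1], data[g2]
        p1, p2 = x1 / n1, x2 / n2
        pooled = (x1 + x2) / (n1 + n2)
        se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
        z = (p1 - p2) / se
        p_raw = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal p-value
        p_adj = min(1.0, p_raw * len(pairs))       # Bonferroni over all pairs
        results[(g1, g2)] = (z, p_adj, p_adj < alpha)
    return results

pairwise = pairwise_z_tests(data)
for (g1, g2), (z, p_adj, significant) in pairwise.items():
    print(f"{g1} vs {g2}: z={z:.3f}, adjusted p={p_adj:.4f}, significant={significant}")
```

With four groups there are six pairs (AB, AC, AD, BC, BD, CD), and each proportion is x/n within its own group, never x divided by a grand total.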
|
|
Hi Marta,
This code is excellent and the results are fine. The only thing I am not getting is the comparison of the required proportions. Let me explain:
Group A: x1=20 and n1=80. I want p1=20/80, not 20/100 and 80/100.
Group B: x2=40 and n2=50. I want p2=40/50, not 40/90 and 50/90.

Hence I want to compare the difference between p1 and p2, i.e. between 20/80 and 40/50 only. At the moment I am getting results for 20/100 and 80/100, which I do not need. I need to compare 20/80 and 40/50.
Otherwise your code is excellent, and many thanks for your patience, time and technical support.
Regards
Ramzan
On Mon, Jun 29, 2009 at 2:00 PM, Ramzan Afzal <[hidden email]> wrote:
|
|
Hi
Ramzan, those are NOT proportions, but odds. If x=20 and n=100 (according to the sample dataset you sent before), then the proportion is 20/100=20%, and that's what CTABLES is comparing. The ratio 20/(100-20)=20/80=0.25 is an odds, not a proportion. Therefore you DON'T want a test to compare proportions, but odds. You should have clarified that when you asked the first time.

This is the sample dataset you provided:

Group x n
A 20 100
B 14 200
C 30 145
D 54 90

Proportions are 20%, 7% ... Odds are 0.25, 0.075 ...

Logistic regression will provide only a partial answer to your question, but perhaps the matrix code could be adapted. Please clarify what you want; I'm not going to start writing code that doesn't answer the correct question (I'm rather busy at the moment).

Marta
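The distinction between a proportion and an odds can be made concrete with a few lines of Python. This is only an illustrative sketch on the first sample dataset; the helper names `proportion` and `odds` are made up here.

```python
# First sample dataset from the thread: group -> (favorable cases x, total cases n).
data = {"A": (20, 100), "B": (14, 200), "C": (30, 145), "D": (54, 90)}

def proportion(x, n):
    """Favorable cases divided by ALL cases (numerator is part of the denominator)."""
    return x / n

def odds(x, n):
    """Favorable cases divided by UNFAVORABLE cases (numerator is not in the denominator)."""
    return x / (n - x)

for group, (x, n) in data.items():
    print(f"{group}: proportion={proportion(x, n):.3f}, odds={odds(x, n):.3f}")
```

For group A this gives a proportion of 0.20 (20/100) but an odds of 0.25 (20/80), which is exactly the discrepancy discussed in this message.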
|
Thanks again for your valuable time and effort. I formulated the problem just to give you an idea of what I wanted to do: basically p1=x1/n1, which is a proportion, i.e. the number of favorable cases divided by the total number of cases. Anyway, if these are odds, then I want to compare odds. I would be grateful for modified code.
Thanks again for your highly skilled technical support.
Regards
Ramzan
|
|
Ramzan
I think we should focus a bit on WHY you want to compare odds instead of proportions. I'm afraid that too many translations to and from English (I believe it is the native tongue of neither of us) may be causing some misunderstanding between us. Right now I have to leave. I'll be back in a few hours, and then we'll discuss your data layout and what you are trying to achieve, before I send you code that might be the correct answer to a very wrong question.

Be back in a few hours

Marta GG
|
I'm back
Ramzan Afzal wrote:
> As I said, I want to compare proportions between different
> groups. So, let's say I have 4 groups, i.e. A, B, C and D. Each group
> has (x, n and p):
>
> x is the number of favorable cases
> n is the total number of cases in a particular group
> p is the proportion
>
> So we have four proportions from four groups: p1, p2, p3 and p4. We want
> to check the difference between the proportions for each pair of groups;
> the possible combinations of groups are AB, AC, AD, BC, BD and CD. It
> is the simple application of the z test for the difference between two
> proportions to all possible combinations of groups. My only problem is
> that I want to compute all possible differences between the groups
> in a single run.

MGG answers: That's EXACTLY what CTABLES is doing with your data. Where on earth is the problem? You had the correct answer from the beginning.

> Now, again illustrating by the example already provided:

MGG answers: This is NOT the example you provided the first time (just to clarify everything). See below and you will see the example you actually gave me, the one I used to write the original code. Besides, I find it strange that adding x+n gives a very round number in most groups (200 for A and B, 100 for D); this is a bit confusing, and suggests that n is not the total number of cases, but only the cases that are not "favorable cases". I will consider that just a coincidence and work on your example assuming that n really is the total number of cases, since the proportions listed agree with that. Are those real data, or were they just prepared as an example?

> Group x n p
> A 20 180 11.11%
> B 15 185 8.11%
> C 45 145 31.03%
> D 24 76 31.58%
>
> Hence we are comparing (11.11 and 8.11) then (11.11,8.11) and so on......

Now, with a clear example and definition of your goal, I can affirm that CTABLES does exactly what you wanted: it computes all the Z tests in one go and presents the significant results in a table (a bit confusing to interpret, I admit), after adjusting the p-values with the Bonferroni method. Period. End of the discussion. USE CTABLES, THE OUTPUT IS OK.

> You said that these are odds so I said ok, though I still believe that
> these are proportions for groups, where p=x/n for each group. This is
> all I can explain.

MGG answers: I said they are odds because you offered an example where x=20 and n=100, and then, afterwards, you said you did not want to compute 20/100, but 20/80 (perhaps you modified your example without telling me, and I got confused?). Therefore I interpreted that you wanted to work with a particular type of ratio, where the numerator is NOT included in the denominator: an odds.

Going back to your FIRST example (not the one you have offered above):

> This is the sample dataset you provided:
>
> Group x n
> A 20 100
> B 14 200
> C 30 145
> D 54 90

X: Favorable cases
Y: Not favorable cases
N: Total cases (X+Y)
P: Proportion (X/[X+Y]=X/N)
O: Odds (X/Y)

Group   X    Y    N     P      O
A      20   80  100  0.20  0.25
B      14  186  200  0.07  0.075
C      30  115  145  0.21  0.26
D      54   36   90  0.60  1.50

As you can see, there is quite a difference between a proportion and an odds.

With your second dataset and your more detailed explanation, it is now clear you want to compute and compare proportions ----> USE CTABLES

Using your second dataset as a correct sample of your data, this is the CTABLES and Marascuilo procedure syntax. You can check that both methods compare the proportions you wanted: 11.11%, 8.11%, 31.03% and 31.58%. The only difference is that CTABLES uses the Bonferroni adjustment for p-values, and the Marascuilo procedure uses a more conservative one (Scheffé-like).

* Your SECOND dataset (copied exactly from your mail) *.
DATA LIST LIST/Group(A1) x n (2 F8).
BEGIN DATA
A 20 180
B 15 185
C 45 145
D 24 76
END DATA.

* Data STILL need restructuring *.
COMPUTE y = n - x .
EXECUTE .
DELETE VARIABLES n.
VARSTOCASES /MAKE frequency FROM x y
 /INDEX = Outcome(2)
 /KEEP = Group
 /NULL = KEEP.
VALUE LABEL outcome 1'Favorable cases' 2'Non favorable'.
WEIGHT BY frequency.

************ CTABLES ************ .

* Replace "Outcome" & "Group" by your variable names if different *.
CTABLES
 /VLABELS VARIABLES=Group Outcome DISPLAY=DEFAULT
 /TABLE Outcome [COUNT F40.0, COLPCT.COUNT PCT40.1] BY Group
 /SLABELS POSITION=ROW
 /CATEGORIES VARIABLES=Group Outcome ORDER=A KEY=VALUE EMPTY=INCLUDE
 /COMPARETEST TYPE=PROP ALPHA=0.05 ADJUST=BONFERRONI ORIGIN=COLUMN
 INCLUDEMRSETS=YES CATEGORIES=ALLVISIBLE.

************ MARASCUILO PROCEDURE ************ .

* Don't change anything here *.
DATASET NAME Data.
DATASET DECLARE Results1 WINDOW=HIDDEN.
DATASET DECLARE Results2 WINDOW=HIDDEN.
DATASET DECLARE Contingency.
OMS
 /SELECT TABLES
 /IF COMMANDS = ["Crosstabs"] SUBTYPES = ["Crosstabulation"]
 /DESTINATION FORMAT = SAV OUTFILE = Contingency.
OMS
 /SELECT TABLES
 /IF COMMANDS = ["Crosstabs"] SUBTYPES = ["Case Processing Summary"]
 /DESTINATION VIEWER = NO.

* Replace "Outcome" & "Group" by your variable names if different *.
CROSSTABS
 /TABLES=Outcome BY group
 /FORMAT= AVALUE TABLES
 /STATISTIC=CHISQ
 /CELLS= COUNT COLUMN
 /COUNT ROUND CELL .

* Don't change anything from here (fully automatic) *.
OMSEND.
DATASET ACTIVATE Contingency.
COMPUTE Id=$casenum/2.
EXE.
SELECT IF (Id NE TRUNC(Id)).
EXE.
DELETE VARIABLES Command_ TO Var3 total Id.

PRESERVE.
SET MXLOOPS=200 .
MATRIX.
PRINT /TITLE='MARASCUILO PROCEDURE FOR MULTIPLE PROPORTIONS'.
GET Data /VAR=ALL /NAMES=vnames.
PRINT Data /CNAMES=vnames /RLABELS='Row 1','Row 2','Total' /TITLE='Input data'.
COMPUTE K=NCOL(Data).
COMPUTE P=Data(1,:)&/Data(3,:).
COMPUTE PLabels={'P(1)','P(2)','P(3)','P(4)','P(5)','P(6)','P(7)','P(8)','P(9)','P(10)','P(11)',
 'P(12)','P(13)','P(14)','P(15)','P(16)','P(17)','P(18)','P(19)','P(20)'}.
PRINT {100*P} /FORMAT='F8.3' /CNAMES=PLabels /TITLE='Proportions (%) to be compared'.
COMPUTE N=Data(3,:).
* Critical Chi-square values for up to k=20 *.
COMPUTE Chi2={ 3.8415, 5.9915, 7.8147, 9.4877,11.0705,12.5916,14.0671,15.5073,16.9190,18.3070,
 19.6751,21.0261,22.3620,23.6848,24.9958,26.2962,27.5871,28.8693,30.1435,31.4104}.
COMPUTE Chi2Val=Chi2(K-1).
COMPUTE NComp=K*(K-1)/2.
COMPUTE Rij= MAKE(NComp,1,0).
COMPUTE ABSDiff=MAKE(NComp,1,0).
COMPUTE Labels= MAKE(Ncomp,3," ").
COMPUTE Sig= MAKE(NComp,1,'(ns)').
COMPUTE Index=1.
LOOP i=1 TO K-1.
. LOOP j=i+1 TO K.
. COMPUTE ABSDiff(Index)=ABS(P(i)-P(j)).
. COMPUTE Rij(Index)=SQRT(Chi2Val)*SQRT(P(i)*(1-P(i))/N(i)+P(j)*(1-P(j))/N(j)).
. COMPUTE Labels(Index,:)={PLabels(i),"-",PLabels(j)}.
. DO IF (ABSDiff(Index) GT Rij(Index)).
. COMPUTE Sig(Index)='(*)'.
. END IF.
. COMPUTE Index=Index+1.
. END LOOP.
END LOOP.
PRINT {ABSDiff,Rij} /FORMAT='F8.3' /CLABELS='|Pi-Pj|','Cr. Range' /TITLE='Absolute differences and their ranges'.
PRINT {Labels,Sig} /FORMAT='A4' /TITLE='Contrasts and their significance'.
SAVE {ABSDiff,Rij}/OUTFILE=Results1 /VARIABLES=Diffs,Rij.
SAVE {Labels,Sig} /OUTFILE=Results2 /VARIABLES=Labels1 TO Labels3,Sig /STRINGS=Labels1 TO Labels3,Sig.
END MATRIX.
RESTORE.

* Cuter (pivot tables) report *.
DATASET ACTIVATE Results2.
DATASET CLOSE Contingency.
STRING Comp(A9).
COMPUTE Comp=CONCAT(RTRIM(Labels1),RTRIM(Labels2),RTRIM(Labels3)).
MATCH FILES /FILE=* /FILE='Results1'.
DATASET CLOSE Results1.
VAR LABEL Comp '|P(i)-P(j)|' Diffs'Absolute Difference' Rij'Critical Range' Sig 'Significance'.
FORMAT Diffs Rij (F8.3).
OMS
 /SELECT TABLES
 /IF COMMANDS = ["Summarize"] SUBTYPES = ["Case Processing Summary"]
 /DESTINATION VIEWER = NO.
SUMMARIZE
 /TABLES=Comp Diffs Rij Sig
 /FORMAT=LIST NOCASENUM NOTOTAL
 /TITLE='Multiple comparisons: Marascuilo procedure'
 /MISSING=VARIABLE
 /CELLS=NONE.
OMSEND.
DATASET ACTIVATE Data.
DATASET CLOSE Results2.

MGG
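The core of the Marascuilo step can also be sketched in a few lines of Python, as a way to sanity-check the SPSS output on the second dataset. This is a minimal sketch, not a port of the full syntax: it skips the OMS and pivot-table reporting, the function name `marascuilo` is hypothetical, and the critical chi-square value for alpha=0.05 with k-1=3 degrees of freedom (7.8147) is copied from the table in the MATRIX code rather than computed.

```python
import math
from itertools import combinations

# Second sample dataset from the thread: group -> (favorable cases x, total cases n).
data = {"A": (20, 180), "B": (15, 185), "C": (45, 145), "D": (24, 76)}

# Critical chi-square at alpha=0.05 for df = k - 1 = 3, taken from the table
# in the MATRIX code above (assumption: same alpha as that table).
CHI2_CRIT_DF3 = 7.8147

def marascuilo(data, chi2_crit):
    """Compare |p_i - p_j| with its critical range for every pair of groups."""
    rows = {}
    for g1, g2 in combinations(data, 2):
        (x1, n1), (x2, n2) = data[g1], data[g2]
        p1, p2 = x1 / n1, x2 / n2
        abs_diff = abs(p1 - p2)
        crit_range = math.sqrt(chi2_crit) * math.sqrt(
            p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
        rows[(g1, g2)] = (abs_diff, crit_range, abs_diff > crit_range)
    return rows

rows = marascuilo(data, CHI2_CRIT_DF3)
for (g1, g2), (abs_diff, crit_range, significant) in rows.items():
    flag = "(*)" if significant else "(ns)"
    print(f"{g1}-{g2}: |diff|={abs_diff:.3f}, critical range={crit_range:.3f} {flag}")
```

A pair is flagged "(*)" when the absolute difference exceeds its critical range, matching the DO IF test inside the MATRIX loop.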
|
Hi Marta,
Many thanks for the solution; it worked perfectly. I obtained the required results in a single run of the syntax you created. The syntax is really nice and powerful. Thanks for your support, concern and patience as well.
Best wishes & regards
Ramzan
2009/6/29 Marta García-Granero <[hidden email]> I'm back |
