GGRAPH/GPL: plotting CI of percents for a MR set


Kirill Orlov
Please help. I can't find a way to add error bars showing 95% confidence intervals to a bar plot of percents when the variable is a multiple response (MR) set.

Example. This is a dichotomous MR set:

b1 b2 b3 b4

 1  1  1  0
 1  0  0  0
 1  0  0  0
 1  1  1  0
 1  0  0  0
 1  1  0  0
 1  0  0  0
 1  0  1  0
 0  1  0  0
 0  1  0  0
 0  1  0  0
 0  1  0  0
 1  1  0  0
 1  1  0  0
 0  1  0  0
 0  1  0  0
 0  1  1  0
 0  1  0  0
 0  1  0  0
 1  1  1  0
 0  0  1  0
 1  1  1  0
 0  0  1  0
 0  0  1  0
 0  1  1  0
 0  0  1  0
 1  0  1  1
 0  0  1  1
 0  0  0  1
 0  0  0  1

Percents of the 4 responses and their confidence intervals:

CTABLES
  /VLABELS VARIABLES= $b DISPLAY=DEFAULT
  /TABLE $b [COLPCT.COUNT PCT40.1, COLPCT.COUNT.LCL PCT40.1, COLPCT.COUNT.UCL PCT40.1,
    COLPCT.RESPONSES.COUNT PCT40.1]
  /CATEGORIES VARIABLES= $b  EMPTY=INCLUDE
  /CRITERIA CILEVEL=95.

            Column N %   95.0% Lower CL    95.0% Upper CL    Column Response %
                         for Column N %    for Column N %    (Base: Count)

$b    b1      43.3%          26.9%             61.0%              43.3%
      b2      56.7%          39.0%             73.1%              56.7%
      b3      43.3%          26.9%             61.0%              43.3%
      b4      13.3%           4.7%             28.7%              13.3%

Please note the following. In general, CTABLES does not compute CIs for percents of MR sets (this is stated in the CTABLES section of the Command Syntax Reference).
However, when a response cannot be duplicated within a case/respondent, which is always true for a dichotomous MR set, the "Column Response % (Base: Count)" percents [numerator: responses, denominator: cases] coincide with the usual "Column N %" percents [numerator and denominator: cases].
This is shown in the table above, where the two types of percents, both with base = case count, are identical.
This equality makes it possible to obtain confidence intervals for the MR set, because CIs for percents are available for the "Column N %" percent type.

So, we have CIs for the percents of an MR set, via CTABLES.

HOW TO PLOT THESE SAME CIs as error bars in GGRAPH? Say, a simple bar plot showing the 4 percents as four bars. I want to add the above CIs as error bars to the bars.

Two subquestions: (1) how to do it from the dataset (i.e. casewise data); (2) how to do it from the table above (i.e. having the above table as the 'aggregated dataset').

Any suggestions?
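
The CTABLES call above refers to a multiple dichotomy set named $b, but the post does not show how the set was declared. A minimal sketch of one way to define it, assuming the four variables b1 to b4 with 1 as the counted value (the label text is purely illustrative):

```spss
* Hypothetical definition of the set $b (not shown in the original post).
MRSETS
  /MDGROUP NAME=$b LABEL='Responses b1 to b4' VARIABLES=b1 b2 b3 b4 VALUE=1
  /DISPLAY NAME=[$b].
```

The set has to exist in the active dataset before CTABLES is run; /DISPLAY only lists the definition as a check.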


===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

Re: GGRAPH/GPL: plotting CI of percents for a MR set

Andy W
Untested, but from your already aggregated table you could do something like:

**********************************************.
GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=Variable Perc Low High
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: Variable=col(source(s), name("Variable"), unit.category())
  DATA: Perc=col(source(s), name("Perc"))
  DATA: Low=col(source(s), name("Low"))
  DATA: High=col(source(s), name("High"))
  GUIDE: axis(dim(1))
  GUIDE: axis(dim(2), label("Percent and 95% Confidence Interval"))
  ELEMENT: edge(position(region.spread.range(Variable*(Low + High))))
  ELEMENT: point(position(Variable*Perc))
END GPL.
**********************************************.

Don't draw bars with CIs (dynamite plots); use points for the estimates
instead. See
https://andrewpwheeler.wordpress.com/2012/02/20/avoid-dynamite-plots-visualizing-dot-plots-with-super-imposed-confidence-intervals-in-spss-and-r/.

I also like making the points have a white border, so something like:

ELEMENT: point(position(Variable*Perc), color.interior(color.black),
color.exterior(color.white))

That's my Tufte-style minimalist advice, anyway. I bet you could get the CIs
from the original data, but I am not sure which confidence intervals you
would get (they can probably go below 0 and above 100). So I prefer working
with the aggregate table.



-----
Andy W
[hidden email]
http://andrewpwheeler.wordpress.com/

Re: GGRAPH/GPL: plotting CI of percents for a MR set

Andy W
In reply to this post by Kirill Orlov
Here is a full worked example. The second part, with the original data, is
what the Chart Builder spits out when given the original data. I manipulate
one of the variables to show that it will produce error bars below 0, since
it is just the normal-based approximation.

****************************************************************************************************************.
*Aggregate data.
DATA LIST FREE / Variable (A2) Perc Low High (3F3.1).
BEGIN DATA
b1    43.3           26.9               61.0
b2    56.7           39.0               73.1
b3    43.3           26.9               61.0
b4    13.3            4.7               28.7
END DATA.
DATASET NAME Agg.
EXECUTE.

GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=Variable Perc Low High
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: Variable=col(source(s), name("Variable"), unit.category())
  DATA: Perc=col(source(s), name("Perc"))
  DATA: Low=col(source(s), name("Low"))
  DATA: High=col(source(s), name("High"))
  GUIDE: axis(dim(1))
  GUIDE: axis(dim(2), label("Percent and 95% Confidence Interval"))
  ELEMENT: edge(position(region.spread.range(Variable*(Low + High))))
  ELEMENT: point(position(Variable*Perc))
END GPL.

DATASET CLOSE ALL.

*With the full original binary data.
DATA LIST FREE / b1 to b4 (4F1.0).
BEGIN DATA
 1  1  1  0
 1  0  0  0
 1  0  0  0
 1  1  1  0
 1  0  0  0
 1  1  0  0
 1  0  0  0
 1  0  1  0
 0  1  0  0
 0  1  0  0
 0  1  0  0
 0  1  0  0
 1  1  0  0
 1  1  0  0
 0  1  0  0
 0  1  0  0
 0  1  1  0
 0  1  0  0
 0  1  0  0
 1  1  1  0
 0  0  1  0
 1  1  1  0
 0  0  1  0
 0  0  1  0
 0  1  1  0
 0  0  1  0
 1  0  1  1
 0  0  1  1
 0  0  0  1
 0  0  0  1
END DATA.
DATASET NAME Orig.
EXECUTE.

*Making one with very low proportion, will error bar go below 0?.
DO IF $casenum < 3.
  COMPUTE b4 = 1.
ELSE.
  COMPUTE b4 = 0.
END IF.
EXECUTE.

DATASET ACTIVATE Orig.
FORMATS b1 TO b4 (F3.2).
* Chart Builder.
GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=MEANCI(b1, 95) MEANCI(b2, 95) MEANCI(b3, 95)
    MEANCI(b4, 95) MISSING=LISTWISE REPORTMISSING=NO
    TRANSFORM=VARSTOCASES(SUMMARY="#SUMMARY" INDEX="#INDEX" LOW="#LOW" HIGH="#HIGH")
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: SUMMARY=col(source(s), name("#SUMMARY"))
  DATA: INDEX=col(source(s), name("#INDEX"), unit.category())
  DATA: LOW=col(source(s), name("#LOW"))
  DATA: HIGH=col(source(s), name("#HIGH"))
  GUIDE: axis(dim(2), label("Mean"))
  GUIDE: text.title(label("Simple Error Bar Mean of b1, Mean of b2, Mean of b3, Mean of b4 by ",
    "INDEX"))
  GUIDE: text.footnote(label("Error Bars: 95% CI"))
  SCALE: cat(dim(1), include("0", "1", "2", "3"))
  SCALE: linear(dim(2), include(0))
  ELEMENT: point(position(INDEX*SUMMARY))
  ELEMENT: interval(position(region.spread.range(INDEX*(LOW+HIGH))), shape.interior(shape.ibeam))
END GPL.
****************************************************************************************************************.
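
To see why the Chart Builder error bar for the modified b4 dips below zero, the arithmetic can be redone by hand. The sketch below assumes 2 positive responses out of 30 cases (what the DO IF block above produces) and compares a plain normal-approximation (Wald) interval with the Beta-based Jeffreys interval discussed later in the thread. The variable names are made up for illustration, and MEANCI itself may well use a t quantile and the sample standard deviation, which would make its interval slightly wider still.

```spss
* Wald versus Jeffreys 95% limits for 2 successes in 30 trials.
DATA LIST FREE / Responses TotN.
BEGIN DATA
2 30
END DATA.
COMPUTE p        = Responses / TotN.
COMPUTE se       = SQRT(p * (1 - p) / TotN).
COMPUTE z        = IDF.NORMAL(0.975, 0, 1).
* Symmetric normal-approximation limits, not bounded by 0 and 1.
COMPUTE WaldLow  = p - z * se.
COMPUTE WaldHigh = p + z * se.
* Beta-based Jeffreys limits, always inside the unit interval.
COMPUTE JeffLow  = IDF.BETA(0.025, Responses + 0.5, TotN - Responses + 0.5).
COMPUTE JeffHigh = IDF.BETA(0.975, Responses + 0.5, TotN - Responses + 0.5).
FORMATS p TO JeffHigh (F8.4).
LIST.
```

With these inputs the Wald lower limit comes out negative (about -0.02), while the Jeffreys lower limit stays above zero.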




Re: GGRAPH/GPL: plotting CI of percents for a MR set

PRogman
In reply to this post by Kirill Orlov
I am not aware of how confidence intervals are calculated in GGRAPH. I think
they need to be intervals for binomial proportions, and I believe CTABLES
does this for MR variables. I see two ways: 1) capture the CTABLES output
with OMS, massage the table a bit and plot it, or 2) do the transformations
and calculations yourself before plotting. Following path 2 and using the
Jeffreys interval for binomial proportions reproduces the CTABLES output.
I frequently make use of variable renaming ([NAME=...]) in the GGRAPH header,
which allows me to reuse the GPL section (sometimes it even gets easier to
read).
HTH, PR

DATASET CLOSE ALL.

DATA LIST LIST /
b1 b2 b3 b4 (4F8.0).
BEGIN DATA
 1  1  1  0
 1  0  0  0
 1  0  0  0
 1  1  1  0
 1  0  0  0
 1  1  0  0
 1  0  0  0
 1  0  1  0
 0  1  0  0
 0  1  0  0
 0  1  0  0
 0  1  0  0
 1  1  0  0
 1  1  0  0
 0  1  0  0
 0  1  0  0
 0  1  1  0
 0  1  0  0
 0  1  0  0
 1  1  1  0
 0  0  1  0
 1  1  1  0
 0  0  1  0
 0  0  1  0
 0  1  1  0
 0  0  1  0
 1  0  1  1
 0  0  1  1
 0  0  0  1
 0  0  0  1
END DATA.
DATASET NAME Demo.
EXECUTE.

DATASET DECLARE Demo2 WINDOW=MINIMIZED.
AGGREGATE
  /OUTFILE=Demo2
  /BREAK=
  /b1=SUM(b1)
  /b2=SUM(b2)
  /b3=SUM(b3)
  /b4=SUM(b4)
  /TotN=N
.
DATASET ACTIVATE Demo2 WINDOW=ASIS.

VARSTOCASES
  /MAKE Responses FROM b1 b2 b3 b4
  /INDEX=b(4)
  /KEEP=TotN
  /NULL=KEEP
.
VALUE LABELS b
  1'b1'  
  2'b2'
  3'b3'
  4'b4'
.
COMPUTE Frac       = Responses/TotN.
COMPUTE #CILevel   = 95.
COMPUTE #AlphaHalf = (100-#CILevel)/100/2.
*Jeffreys interval for binomial proportions*.
COMPUTE CI95L      = MAX(IDF.BETA(  #AlphaHalf, Responses+0.5, TotN-Responses+0.5), 0).
COMPUTE CI95U      = MIN(IDF.BETA(1-#AlphaHalf, Responses+0.5, TotN-Responses+0.5), 1).
FORMATS CI95L CI95U (F8.2).

*Only the variable names may need editing*.
GGRAPH
  /GRAPHDATASET
   NAME="graphdataset"
   VARIABLES = b     [NAME ="C"]
               Frac  [NAME ="Y"]
               CI95U [NAME ="UCL"]
               CI95L [NAME ="LCL"]
   MISSING=LISTWISE
   REPORTMISSING=NO
  /GRAPHSPEC
   SOURCE=INLINE.
BEGIN GPL
  SOURCE:  s   = userSource(id("graphdataset"))
  DATA:    C   = col(source(s), name("C"), unit.category())
  DATA:    Y   = col(source(s), name("Y"))
  DATA:    UCL = col(source(s), name("UCL"))
  DATA:    LCL = col(source(s), name("LCL"))
  GUIDE:   axis(dim(1), label("b (95% confidence interval)"))
  GUIDE:   axis(dim(2)
               ,delta(0.1)
               ,label("Proportion responses")
               )
  SCALE:   linear(dim(2)
                 ,include(0, 1)
                 ,max(1)
                 )
  ELEMENT: bar(position(C*Y)
              )
  ELEMENT: interval(position(region.spread.range(C*(LCL+UCL)))
                   ,shape(shape.ibeam)
                   )
END GPL.

**************************************************.
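
Route 1 (capturing the CTABLES table with OMS) is mentioned above but not shown. A rough sketch of what it might look like follows, assuming the casewise data are active and the set $b is already defined. The dataset name CtabCI is made up, and the command and subtype identifiers ('CTables', 'Custom Table') are written from memory, so they should be checked against the OMS identifiers list in your SPSS version.

```spss
* Sketch of route 1: send the CTABLES pivot table to a dataset via OMS.
DATASET DECLARE CtabCI.
OMS
  /SELECT TABLES
  /IF COMMANDS=['CTables'] SUBTYPES=['Custom Table']
  /DESTINATION FORMAT=SAV OUTFILE='CtabCI' VIEWER=NO.
CTABLES
  /TABLE $b [COLPCT.COUNT PCT40.1, COLPCT.COUNT.LCL PCT40.1, COLPCT.COUNT.UCL PCT40.1]
  /CRITERIA CILEVEL=95.
OMSEND.
DATASET ACTIVATE CtabCI.
* Inspect the generated variable names before renaming them for GGRAPH.
DISPLAY DICTIONARY.
```

The variable names OMS generates depend on the table's column labels, so the captured columns will most likely need renaming (or the [NAME=...] trick in the GGRAPH header) before the GPL block above can be reused.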



Re: GGRAPH/GPL: plotting CI of percents for a MR set

Bruce Weaver
Administrator
Just noting that I too found (via Stata) that the CIs Kirill showed are
Jeffreys intervals.  


. ci proportion b*, jeffreys

                                                         ----- Jeffreys -----
    Variable |        Obs  Proportion    Std. Err.       [95% Conf. Interval]
-------------+---------------------------------------------------------------
          b1 |         30    .4333333     .090472        .2689192    .6099397
          b2 |         30    .5666667     .090472        .3900603    .7310808
          b3 |         30    .4333333     .090472        .2689192    .6099397
          b4 |         30    .1333333    .0620633        .0467275    .2865289

. /* From Kirill's post:
> b1    43.3           26.9               61.0
> b2    56.7           39.0               73.1
> b3    43.3           26.9               61.0
> b4    13.3            4.7               28.7
> */









-----
--
Bruce Weaver
[hidden email]
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

NOTE: My Hotmail account is not monitored regularly.
To send me an e-mail, please use the address shown above.


Re: GGRAPH/GPL: plotting CI of percents for a MR set

Kirill Orlov
Bruce, yes, I think so. The SPSS CTABLES algorithms documentation states that
the CIs for proportions/percents are computed from the Beta distribution.

Bruce, PRogman,
Thanks for your responses so far; I'll study them and give feedback.


In reply to Bruce Weaver's message of 04.12.2019 1:08.


Re: GGRAPH/GPL: plotting CI of percents for a MR set

Bruce Weaver
Administrator
I quite like the Wilson score CI for proportions.  Here's an excerpt from one
of my old syntax files that shows how to compute both it and the Jeffreys
interval, in case you're interested.  


* The data used here are from Table I in Newcombe (1998), Statistics
   in Medicine, Vol 17, 857-872.

DATA LIST LIST /x(f8.0) n(f8.0) confid(f5.3) .
BEGIN DATA.
81 263 .95
15 148 .95
0   20 .95
1   29 .95
81 263 .90
15 148 .90
0   20 .90
1   29 .90
81 263 .99
15 148 .99
0   20 .99
1   29 .99
END DATA.

compute alpha = 1 - confid.
compute p = x/n.
compute q = 1-p.
compute z = probit(1-alpha/2).

*  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .

* Wilson score method (Method 3 in Newcombe, 1998) .
* Code adapted from Robert Newcombe's code posted here:
    http://archive.uwcm.ac.uk/uwcm/ms/Robert2.html .

COMPUTE #x1 = 2*n*p+z**2 .
COMPUTE #x2 = z*(z**2+4*n*p*(1-p))**0.5 .
COMPUTE #x3 = 2*(n+z**2) .
COMPUTE lower4 = (#x1 - #x2) / #x3 .
COMPUTE upper4 = (#x1 + #x2) / #x3 .

*  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .

* Jeffreys method shown on the IBM-SPSS website at
* http://www-01.ibm.com/support/docview.wss?uid=swg21474963 .

compute lower5 = idf.beta(alpha/2,x+.5,n-x+.5).
compute upper5 = idf.beta(1-alpha/2,x+.5,n-x+.5).

*  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .
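
As an illustration, the same Wilson algebra can be applied to the thread's own example. The sketch below assumes the counts of 1s per item in the 30-case data (13, 17, 13 and 4, matching the percents in the CTABLES output); the variable names Item, Responses, TotN, WilsonL and WilsonU are chosen here for illustration only.

```spss
* Counts of 1s per item in the 30-case example from this thread.
DATA LIST FREE / Item (A2) Responses TotN.
BEGIN DATA
b1 13 30
b2 17 30
b3 13 30
b4 4 30
END DATA.
COMPUTE p       = Responses / TotN.
COMPUTE z       = IDF.NORMAL(0.975, 0, 1).
* Wilson score limits, same algebra as in the excerpt above.
COMPUTE #x1     = 2*TotN*p + z**2.
COMPUTE #x2     = z * SQRT(z**2 + 4*TotN*p*(1 - p)).
COMPUTE #x3     = 2*(TotN + z**2).
COMPUTE WilsonL = (#x1 - #x2) / #x3.
COMPUTE WilsonU = (#x1 + #x2) / #x3.
FORMATS p z WilsonL WilsonU (F8.4).
LIST.
```

WilsonL and WilsonU are proportions on the 0 to 1 scale, so they could be handed to a GGRAPH like PRogman's in place of CI95L and CI95U by renaming them to LCL and UCL in the GRAPHDATASET header.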




Re: GGRAPH/GPL: plotting CI of percents for a MR set

Kirill Orlov
In reply to this post by Andy W
Andy, thank you.
However, the "aggregated" route, which lets you plot the true CIs borrowed
from CTABLES, is not convenient: you have to go through OMS and so on.
I was looking for a way that works from the original data. But the Chart
Builder approach is inadequate, since it does not plot a CI for the
proportion of a value (say, the value 1 in binary data) that is asymmetric,
e.g. based on the Beta distribution, as CTABLES returns. Chart Builder and
GPL seem to produce error bars/CIs only for means of scale variables (which,
for binary variables, gives the normal-approximation CI for the proportion).
That is too meagre an option; I consider it a defect.


In reply to Andy W's message of 03.12.2019 4:06.
