Syntax for SAMPLE

CG

I must draw fixed count random samples from different groups and I am looking for more efficient code. For example:

DO IF GROUPCODE=7.
SAMPLE 5 FROM 32.
(write to outfile)
END IF.

DO IF GROUPCODE=9.
SAMPLE 10 FROM 93.
(write to outfile)
END IF.

I have tried using DO REPEAT, but the SAMPLE command doesn’t let me use a function or macro variable in place of the numbers.  I look to your wisdom and tutelage – any help appreciated.


Re: Syntax for SAMPLE

David Marso
Administrator
I would go at it quite differently.
I am assuming you know the desired sample count for each group?  In that case there is no need to know N per group.
Say:
GROUPCODE DesiredCount
1   10
2    5
3    ..
...
7    5
..
9   10

Create a small SPSS data file containing these two variables and values, sorted by GROUPCODE, and
save it as "<somevaliddirpath>COUNTS.SAV",
<somevaliddirpath> being something like "C:\Documents\SPSSData\SomeProject\" .
Then grab your data file:
GET FILE "masterfileblahblah.sav".
COMPUTE @SCRAMBLER=UNIFORM(1).
SORT CASES BY GROUPCODE @SCRAMBLER.
MATCH FILES
         / FILE *
         / TABLE "<somevaliddirpath>COUNTS.SAV"
         / BY GROUPCODE.
IF $CASENUM=1 OR LAG(GROUPCODE) NE GROUPCODE  GPCounter=1.
IF MISSING(GPCounter) GPCounter=LAG(GPCounter)+1.
COMPUTE Use_This_Case=GPCounter LE DesiredCount .
FILTER BY Use_This_Case.
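
For readers who want to check the logic outside SPSS, here is a minimal Python sketch of the same idea: give every case a random key, sort by group and key, number the cases within each group, and keep those whose number does not exceed the desired count. It is an analogue, not SPSS syntax; the function name sample_per_group, the tuple layout of the cases and the seed argument are assumptions made for illustration.

```python
import random

def sample_per_group(cases, desired_counts, seed=None):
    """Keep a fixed number of randomly chosen cases from each group.

    cases: list of (groupcode, payload) tuples (hypothetical layout).
    desired_counts: dict mapping groupcode -> number of cases to keep.
    """
    rng = random.Random(seed)
    # Attach a random key to every case (the @SCRAMBLER step).
    keyed = [(group, rng.random(), payload) for group, payload in cases]
    # Sort by group, then by the random key within each group.
    keyed.sort(key=lambda row: (row[0], row[1]))

    kept = []
    counter = 0
    previous_group = object()  # sentinel that never equals a real group code
    for group, _, payload in keyed:
        # Restart the running counter whenever the group changes.
        counter = 1 if group != previous_group else counter + 1
        previous_group = group
        # Groups absent from the lookup table contribute no cases.
        if counter <= desired_counts.get(group, 0):
            kept.append((group, payload))
    return kept

# Made-up data matching the group sizes in the question: 32 cases in group 7, 93 in group 9.
cases = [(7, i) for i in range(32)] + [(9, i) for i in range(93)]
sample = sample_per_group(cases, {7: 5, 9: 10}, seed=1)
```

Because the order within each group is random, taking the first few cases of each group is a simple random sample without replacement from that group, and the group sizes never need to be supplied.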




Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Re: Syntax for SAMPLE

John F Hall
In reply to this post by CG

Totally untested, but what happens if you try something like:

Do repeat x = 7, 9
  /y = 5, 10
  /z = 32, 93
  /a = 1, 2
  /b = b1, b2 .
Do if groupcode = x .
Sample y from z .
End if .
Compute b = a .
End repeat .

This gives variables b1 and b2 which can be used as filters for writing.  I’m sure regulars such as Bruce, David, ViAnn or Albert-Jan will come up with something.

 

 

John F Hall
[hidden email]
www.surveyresearch.weebly.com



Re: Syntax for SAMPLE

David Marso
Administrator
It makes *NO* sense at all to use SAMPLE within a DO REPEAT.
Think about that for a moment ;-)

When you do something like
DO REPEAT X= a b c d / Y =  e f g h / Z= ae be cg dh.
COMPUTE Z=X/Y.
END REPEAT.

what do you end up with?
4 new variables on *EACH* case...
i.e. DO REPEAT handles TRANSFORMATIONS and applies to each case.
How would SAMPLE fit into a DO REPEAT?
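
To make the point concrete, here is a rough Python analogue of that DO REPEAT block (not SPSS syntax; the function name do_repeat_divide and the sample values are made up for illustration). The stand-in lists are walked in parallel and the same transformation is applied to one case at a time, so the loop can only add variables to a case. It has no way to decide which cases stay in the file, which is what SAMPLE does.

```python
def do_repeat_divide(case, xs, ys, zs):
    """Mimic DO REPEAT X=... / Y=... / Z=... with COMPUTE Z=X/Y on a single case."""
    result = dict(case)
    for x, y, z in zip(xs, ys, zs):
        # Each pass substitutes one triple of variable names into the same transformation.
        result[z] = result[x] / result[y]
    return result

# One hypothetical case with variables a-h.
case = {"a": 8.0, "b": 9.0, "c": 10.0, "d": 12.0,
        "e": 2.0, "f": 3.0, "g": 5.0, "h": 4.0}
new_case = do_repeat_divide(case,
                            ["a", "b", "c", "d"],
                            ["e", "f", "g", "h"],
                            ["ae", "be", "cg", "dh"])
```

The case comes back with four extra variables (ae, be, cg, dh) and nothing else changes.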

Re: Syntax for SAMPLE

John F Hall
David

It was just a thought, and I was only trying to help!

I once had a similar problem when trying to introduce students to
inferential statistics.  I'd have, say, 24 students and a survey with 1800
cases for a lab session.  Each student was asked to sample n from N with a
different SET SEED starting point.  This was in the days of 16 VDUs
connected to a remote VAX mainframe (12 working if you were lucky, and severe
time constraints if we didn't want to get locked in the building, or wanted to
get away before the Arsenal match finished up the road), not modern PCs and
distance learning.  The idea was to get 3 samples each to yield 72
statistics (mean lifesat, % happy or whatever), which we then plotted (on the
chalk-board), hopefully to demonstrate a sampling distribution that was
approximately normal.  Sometimes it worked, sometimes it didn't (usually
because sampling 30 from 3000 isn't as stable as 300 from 3000), but the
students learned a lot from the attempt and understood what we were trying
to do, especially that the sampling distribution of the mean was
approximately normal even if the variable itself was not (e.g. age).

Now, how about some syntax to do what I need for a new tutorial for the
website?

Assume data set is BSA89.sav, N is 3000 and I want 100 samples of size 300:
then to save mean lifesat, mean age and % very happy (code 3 on happy) as
variables in a separate file with 100 cases (one for each sample).

Have a nice weekend.

John

[hidden email]
www.surveyresearch.weebly.com









=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD


Re: Syntax for SAMPLE

David Marso
Administrator
"Assume data set is BSA89.sav, N is 3000 and I want 100 samples of size 300:
then to save mean lifesat, mean age and % very happy (code 3 on happy) as
variables in a separate file with 100 cases (one for each sample)."

Here you go John,
enjoy ;-)
---
*** OK here we go **.
**First of all oversample **.
LOOP SAMPLE=1 TO 100.
+  DO IF UNIFORM(1) < .12 .
+    XSAVE OUTFILE "Samples.sav" / KEEP caseid lifesat age happy sample.
+  END IF.
END LOOP.
*ONE OF THE ONLY Places you NEED an EXECUTE *.
EXECUTE.
GET FILE "samples.sav".
FREQ SAMPLE.
* Using same ideas as what I posted to Cindy Gregory *.
COMPUTE SCRAMBLE=UNIFORM(1).
SORT CASES BY SAMPLE SCRAMBLE.
IF $CASENUM=1 OR LAG(SAMPLE) NE SAMPLE  GPCount=1.
IF MISSING( GPCount) GPCount=LAG(GPCount)+1.

*ONE OF THE ONLY Places you NEED an EXECUTE *.
EXECUTE.
SELECT IF GPCount LE 300.
FREQ SAMPLE.
AGGREGATE OUTFILE *
        / BREAK SAMPLE
        / MLIFESAT MEANAGE = MEAN(lifesat age)
        / PctVHapp=PIN(happy,3,3).
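
For anyone who wants to follow the logic outside SPSS, here is a hedged Python sketch of the same oversample-then-trim approach: for each sample, let every case in with a probability slightly above the target fraction, shuffle the resulting pool, keep the first 300, then compute one summary row per sample. It is an analogue, not SPSS syntax. The function names, the dictionary layout of a case and the synthetic data are assumptions; the rate of 0.15 (rather than .12) is chosen here only to leave extra margin so that no pool falls short.

```python
import random

def draw_samples(cases, n_samples, sample_size, oversample_rate, seed=None):
    """Draw n_samples simple random samples (without replacement) of sample_size cases."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        # Bernoulli pass: each case enters the pool with probability oversample_rate.
        pool = [case for case in cases if rng.random() < oversample_rate]
        if len(pool) < sample_size:
            raise ValueError("pool too small; raise oversample_rate and rerun")
        # Random scramble, then drop the surplus cases.
        rng.shuffle(pool)
        samples.append(pool[:sample_size])
    return samples

def summarize(sample):
    """One row per sample: mean lifesat, mean age, percent with happy == 3."""
    n = len(sample)
    return {
        "mean_lifesat": sum(c["lifesat"] for c in sample) / n,
        "mean_age": sum(c["age"] for c in sample) / n,
        "pct_very_happy": 100.0 * sum(c["happy"] == 3 for c in sample) / n,
    }

# Synthetic stand-in for the survey file (3000 made-up cases, not real data).
data_rng = random.Random(0)
cases = [{"caseid": i,
          "lifesat": data_rng.randint(1, 10),
          "age": data_rng.randint(18, 90),
          "happy": data_rng.randint(1, 3)} for i in range(3000)]

samples = draw_samples(cases, n_samples=100, sample_size=300,
                       oversample_rate=0.15, seed=1)
summaries = [summarize(s) for s in samples]
```

Each case can appear at most once within a sample, so this is sampling without replacement; the same case may of course turn up in several different samples.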
 


Re: Syntax for SAMPLE

Bruce Weaver
Administrator
In reply to this post by John F Hall
Hi John.  I've sent you something off-list that you may be able to cobble into the kind of demo/tutorial you want.  It uses some code written by David Marso, but posted by his then colleague David Nichols.  Here's the link:

   http://groups.google.com/group/sci.stat.consult/msg/710ea4ab83ddf24a?dmode=source

HTH.


--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Re: Syntax for SAMPLE

David Marso
Administrator

GACK Bruce!
That old thing???
It is from 1996 right about the time I bailed from SPSS TekSport and transferred to the consulting group.
Almost 15 years to the day... How time flies!!!!!!!!!!!!!!!!
The SPSS-X Archives are truncated at 1996 and everything before that has been bit-bucketed.
Hey, ever wonder what ever happened to all the gems which I posted between 1992 and 1996?
There were some pretty twisted Rube-Goldbergesque monstrosities I inflicted upon the world in those days.  Self modifying SPSS code, unintelligible one liners, programs to read data configured like they came from the brain of H.P. Lovecraft or some other dark place (almost lost my mind doing that job).

I used to have those all on my old dead PC (HD is fine but the box is dead, need to pull data some day).
*BUT* I think I like what I posted earlier today much more than that old one.
Note also that what you reference does BOOTSTRAP sampling (with replacement).
The current code samples without replacement (slightly oversample beyond the desired ratio, XSAVE the cases which are 'sampled', then do the random scramble and nuke the extra cases at the end).
Thanks for the blast from the past (I think-- cringe--)



Re: Syntax for SAMPLE

Bruce Weaver
Administrator
When I found that old post, I was looking for a quick & dirty way of generating bootstrap samples (with replacement), and as I recall, there was not much else out there.  Anyway, it worked out quite well for what I was doing at the time.  ;-)




Re: Syntax for SAMPLE

Richard Ristow
In reply to this post by CG
At 12:03 PM 7/15/2011, Gregory, Cindy, PED wrote:

>I must draw fixed count  random samples from different groups and I
>am looking for more efficient code. For example:
>
>DO IF GROUPCODE=7.
>SAMPLE 5 FROM 32.
>(write to outfile)
>END IF.
>
>DO IF GROUPCODE=9.
>SAMPLE  10 from 93.
>(write to outfile)
>END IF.

I'd skip SAMPLE, and write straight 'k/n' logic. First, sort the data
by GROUPCODE. Then (untested)

AGGREGATE OUTFILE=* MODE=ADDVARIABLES
   /BREAK=GROUPCODE
   /GroupSize 'Number of members in group' = NU.

Then, if you don't mind writing the desired sample sizes into your
code, something like this (still untested):


DO IF   $CASENUM EQ 1 OR GROUPCODE NE LAG(GROUPCODE)      .
*  #n is initially the group size, and then the number    .
*     of group members not yet tested.                    .
*  #k is initially the desired sample size from the group,.
*     and then the number of members of that sample still .
*     to be selected.                                     .
.  COMPUTE #n = GroupSize.
*  The RECODE statement gives the group sample sizes;     .
*  this is an example of "data in code".                  .
.  RECODE  GROUPCODE
          (1 = ???)
          (2 = ???)
          (7 =   5)
          (9 =  10) INTO #k /* desired sample from group */.
END IF.

*  If desired, specify which random-number generator to use, .
*  and give a starting seed.                                 .

NUMERIC   Take_It (F2).
VAR LABEL Take_It 'Indicator, for case being in the sample'.

COMPUTE   Take_It = RV.BERNOULLI(#k/#n).
COMPUTE   #k      = #k - Take_It.
COMPUTE   #n      = #n - 1.

EXECUTE /* (not sure whether this is needed) */.

SELECT IF Take_It.

If you don't like specifying sample sizes in the RECODE statement,
you can set up a file named SAMPLES, like this:

     GROUPCODE  SampleSize.
         1        ???
         2        ???
         7         5
         9        10

Then, after the AGGREGATE,

MATCH FILES
    /FILE =*
    /TABLE=SAMPLES
    /BY    GROUPCODE.

DO IF   $CASENUM EQ 1 OR GROUPCODE NE LAG(GROUPCODE)      .
*  #n is initially the group size, and then the number    .
*     of group members not yet tested.                    .
*  #k is initially the desired sample size from the group,.
*     and then the number of members of that sample still .
*     to be selected.                                     .
.  COMPUTE #n = GroupSize.
.  COMPUTE #k = SampleSize.
END IF.

and continue as before.
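
The 'k/n' logic above can also be sketched in Python for anyone who wants to convince themselves that it returns exactly the requested number of cases. This is an analogue, not SPSS syntax; the function name selection_sample and the seed argument are assumptions. The variables n and k play the roles of #n and #k: each case is taken with probability k/n, where k is the number still to be selected and n the number of cases not yet examined.

```python
import random

def selection_sample(items, k, seed=None):
    """Return exactly k items chosen at random, preserving the original order."""
    rng = random.Random(seed)
    n = len(items)  # cases not yet examined (#n)
    if not 0 <= k <= n:
        raise ValueError("k must be between 0 and the group size")
    chosen = []
    for item in items:
        # Take this case with probability k/n, like RV.BERNOULLI(#k/#n).
        if rng.random() < k / n:
            chosen.append(item)
            k -= 1  # one fewer case still to select
        n -= 1      # one fewer case left to examine
    return chosen

# Group sizes and sample sizes from the original question.
group7 = selection_sample(list(range(32)), 5, seed=1)
group9 = selection_sample(list(range(93)), 10, seed=1)
```

Once k reaches zero the probability is 0 and nothing more is taken; if the cases remaining equal the number still needed, the probability is 1 and every remaining case is taken. That is why the count always comes out exact, in a single pass and without sorting on a random key.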


Re: Syntax for SAMPLE

David Marso
Administrator
In reply to this post by Bruce Weaver
If one just wants, say, a mean, then something like this works without creating a large case-level file.
* SIMULATE SOME DATA.
NEW FILE.
INPUT PROGRAM.
LOOP #ID=1 TO 1000.
+  COMPUTE VAR= RV.Normal(5,10).
+  END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.
EXE.
DESC VAR.
* FLIP turns the 1000 cases into variables on a single case (referenced below as VAR001 TO VAR1000).
FLIP.

VECTOR V=VAR001 TO VAR1000.
LOOP SAMPLE=1 TO 500.
+  COMPUTE MEAN=0.
* Draw 100 values with replacement and accumulate their sum.
+  LOOP #PULL=1 TO 100.
+    COMPUTE #IDX=TRUNC(UNIFORM(1000)+1).
+    COMPUTE MEAN=MEAN+V(#IDX).
+  END LOOP.
+  COMPUTE MEAN=MEAN/100.
+  XSAVE OUTFILE "BOOTTEMP.SAV" / KEEP SAMPLE MEAN.
END LOOP.
EXECUTE.
GET FILE "BOOTTEMP.SAV" .
DESC MEAN.
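
As a cross-check on the logic, here is a minimal Python sketch of the same bootstrap: simulate 1000 values, then 500 times draw 100 values with replacement and store only the mean of each draw. It is an analogue, not SPSS syntax; the function name bootstrap_means and the seeds are assumptions chosen for illustration.

```python
import random
import statistics

def bootstrap_means(values, n_samples, sample_size, seed=None):
    """Return the mean of each of n_samples resamples drawn with replacement."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_samples):
        # Draw with replacement: the same value may be picked more than once.
        draws = [values[rng.randrange(len(values))] for _ in range(sample_size)]
        means.append(sum(draws) / sample_size)
    return means

# Simulated data, mirroring RV.Normal(5,10) with 1000 cases.
data_rng = random.Random(0)
values = [data_rng.gauss(5, 10) for _ in range(1000)]

means = bootstrap_means(values, n_samples=500, sample_size=100, seed=1)
spread_of_means = statistics.stdev(means)
spread_of_values = statistics.stdev(values)
```

Only one number per resample is kept, so the output has 500 rows however large the resamples are, and the spread of the means should come out much smaller than the spread of the raw values.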

Bruce Weaver wrote
When I found that old post, I was looking for a quick & dirty way of generating bootstrap samples (with replacement), and as I recall, there was not much else out there.  Anyway, it worked out quite well for what I was doing at the time.  ;-)



David Marso wrote
GACK Bruce!
That old thing???
It is from 1996 right about the time I bailed from SPSS TekSport and transferred to the consulting group.
Almost 15 years to the day... How time flies!!!!!!!!!!!!!!!!
The SPSS-X Archives are truncated at 1996 and everything before that has been bit-bucketted.
Hey, ever wonder what ever happened to all the gems which I posted between 1992 and 1996?
There were some pretty twisted Rube-Goldbergesque monstrosities I inflicted upon the world in those days.  Self modifying SPSS code, unintelligible one liners, programs to read data configured like they came from the brain of H.P. Lovecraft or some other dark place (almost lost my mind doing that job).

I used to have those all on my old dead PC (HD is fine but the box is dead, need to pull data some day).
*BUT* I think I like what I posted earlier today much more than that old one.
Note also that what you reference does BOOTSTRAP sampling (with replacement).
The current does sampling without replacement (slightly oversample beyond the desired ratio, XSAVE cases which are 'sampled', then do the random scramble and nuke the extra cases at the end).
Thanks for the blast from the past (I think-- cringe--)


Bruce Weaver wrote
Hi John.  I've sent you something off-list that you may be able to cobble into the kind of demo/tutorial you want.  It uses some code written by David Marso, but posted by his then colleague David Nichols.  Here's the link:

   http://groups.google.com/group/sci.stat.consult/msg/710ea4ab83ddf24a?dmode=source

HTH.


John F Hall wrote
David

It was just a thought, and I was only trying to help!

I once had a similar problem when trying to introduce students to
inferential statistics.  I'd have say 24 students and a survey with 1800
cases for a lab session.  Each student was asked to sample n from N with a
different SET SEED starting point.  This was in the days of 16 VDU's
connected to a remote Vax mainframe (12 working if you were lucky and severe
time constraints if we didn't want to get locked in the building, or get
away before the Arsenal match finished up the road) not modern PCs and
distance learning.  The idea was to get 3 samples each to yield 72
statistics (mean lifesat, % happy or whatever) which we then plotted (on the
chalk-board) hopefully to demonstrate a distribution with an approximately
normal distribution.  Sometimes it worked, sometimes it didn't (usually
because sampling 30 from 3000 isn't as stable as 300 from 3000) but the
students learned a lot from the attempt and understood what we were trying
to do, especially that the sampling distribution of the mean was
approximately normal even if the variable itself was not (eg age)

Now, how about some syntax to do what I need for a new tutorial for the
website?

Assume data set is BSA89.sav, N is 3000 and I want 100 samples of size 300:
then to save mean lifesat, mean age and % very happy (code 3 on happy) as
variables in a separate file with 100 cases (one for each sample).

Have a nice weekend.

John

[hidden email]
www.surveyresearch.weebly.com








-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
David Marso
Sent: 15 July 2011 19:33
To: [hidden email]
Subject: Re: Syntax for SAMPLE

It makes *NO* sense at all to use SAMPLE within a DO REPEAT.
Think about that for a moment ;-)

When you do something like
DO REPEAT X= a b c d / Y =  e f g h / Z= ae bf cg dh.
COMPUTE Z=X/Y.
END REPEAT.

what do you end up with?
4 new variables on *EACH* case...
i.e. DO REPEAT handles TRANSFORMATIONS and applies to each case.
How would SAMPLE fit into a DO REPEAT?

--
View this message in context:
http://spssx-discussion.1045642.n5.nabble.com/Syntax-for-SAMPLE-tp4591268p4591568.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Re: Syntax for SAMPLE

John F Hall
In reply to this post by Bruce Weaver
David, Bruce

What a great exchange Cindy started.

Thanks for all these suggestions.  Luckily I got 4 hours veggie patch
weeding done yesterday and it's raining all day today and tomorrow, so I'll
have a shot at producing a draft demonstration.  Technically I suppose the
method I used in class was sampling with replacement as students were asked
to set the seed to a very high number (their date of birth in yymmdd form):
on one occasion three students got the same samples because they all had the
same date of birth!

In the 1970s and even the 1990s, these were mostly graduates in sociology
and related subjects with little or no training in statistics or experience
of computers, some barely numerate and some who could not type.  They would
run a mile from anything with an equation in it.  However, by using
painstaking (and time-consuming) step-by-step plotting of these and similar
results we could build up an equation in a way students (and I) understood,
not just as a mathematical expression, but also as an incredibly powerful
tool.

I used to have another example for regression and correlation using imagery
of a rigid pole and elastic bands, but that's another story.

John

[hidden email]
www.surveyresearch.weebly.com






-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Bruce Weaver
Sent: 15 July 2011 23:06
To: [hidden email]
Subject: Re: Syntax for SAMPLE

When I found that old post, I was looking for a quick & dirty way of
generating bootstrap samples (with replacement), and as I recall, there was
not much else out there.  Anyway, it worked out quite well for what I was
doing at the time.  ;-)




David Marso wrote:

>
> GACK Bruce!
> That old thing???
> It is from 1996 right about the time I bailed from SPSS TekSport and
> transferred to the consulting group.
> Almost 15 years to the day... How time flies!!!!!!!!!!!!!!!!
> The SPSS-X Archives are truncated at 1996 and everything before that has
> been bit-bucketted.
> Hey, ever wonder what ever happened to all the gems which I posted between
> 1992 and 1996?
> There were some pretty twisted Rube-Goldbergesque monstrosities I
> inflicted upon the world in those days.  Self modifying SPSS code,
> unintelligible one liners, programs to read data configured like they came
> from the brain of H.P. Lovecraft or some other dark place (almost lost my
> mind doing that job).
>
> I used to have those all on my old dead PC (HD is fine but the box is
> dead, need to pull data some day).
> *BUT* I think I like what I posted earlier today much more than that old
> one.
> Note also that what you reference does BOOTSTRAP sampling (with
> replacement).
> The current does sampling without replacement (slightly oversample beyond
> the desired ratio, XSAVE cases which are 'sampled', then do the random
> scramble and nuke the extra cases at the end).
> Thanks for the blast from the past (I think-- cringe--)


-----
--
Bruce Weaver
[hidden email]
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

NOTE: My Hotmail account is not monitored regularly.
To send me an e-mail, please use the address shown above.

--
View this message in context:
http://spssx-discussion.1045642.n5.nabble.com/Syntax-for-SAMPLE-tp4591268p4592216.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.


Re: Syntax for SAMPLE

John F Hall
In reply to this post by David Marso
David

Tried your syntax, but had to modify it a bit.

*Marso sample .
* data set 'ql4gb1975.sav' .
* variables: var544 (happy), var545 (lifesat), age .
compute happy = var544.
compute lifesat = var545 .
compute caseid = serial .
*** OK here we go **.

SPSS doesn't like SCRAMBLE and I can't find it anywhere in the Syntax
Reference Guide, so I dumped it.  The new file has four variables, but only
one case.  I was looking to create a data file with up to n cases
corresponding to the number of samples drawn.  I can then run SPSS to
demonstrate the distributions of sample statistics with:

Freq MLIFESAT MEANAGE PctVHapp /for not /his .

Also the variables SAMPLE (all with value 101) CASEID ( 1 to 300) and
GPCOUNT (1 to 300) are appended to the existing data set, which I don't
particularly want, and the number of cases has dropped from 932 to 300,
which I definitely don't want! Some of this complex syntax is new to me, but
I did use Algol intensively in the 1960s (to manage and analyse survey data)
so I can follow the logic.  I haven't tried Bruce's syntax yet, or David
Nichols' earlier version, but if I can get something that works, it will be
a valuable learning aid.  Gives me something to do instead of gardening in
the rain or catching up on 300 or so films, dramas and documentaries
recorded from TV!

John

[hidden email]
www.surveyresearch.weebly.com

PS  If you haven't already seen it, get Bruce to send you the slideshow.






-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
David Marso
Sent: 15 July 2011 21:45
To: [hidden email]
Subject: Re: Syntax for SAMPLE

"Assume data set is BSA89.sav, N is 3000 and I want 100 samples of size 300:
then to save mean lifesat, mean age and % very happy (code 3 on happy) as
variables in a separate file with 100 cases (one for each sample)."

Here you go John,
enjoy ;-)
---
*** OK here we go **.
**First of all oversample **.
LOOP SAMPLE=1 TO 100.
+  DO IF UNIFORM(1) < .12 .
+    XSAVE OUTFILE "Samples.sav" / KEEP caseid lifesat age happy sample.
+  END IF.
END LOOP.
*ONE OF THE ONLY Places you NEED an EXECUTE *.
EXECUTE.
GET FILE "samples.sav".
FREQ SAMPLE.
* Using same ideas as what I posted to Cindy Gregory *
COMPUTE SCRAMBLE=UNIFORM(1).
SORT CASES BY SAMPLE SCRAMBLE.
IF $CASENUM=1 OR LAG(SAMPLE) NE SAMPLE  GPCount=1.
IF MISSING( GPCount) GPCount=LAG(GPCount)+1.

*ONE OF THE ONLY Places you NEED an EXECUTE *.
EXECUTE.
SELECT IF GPCount LE 300.
FREQ SAMPLE.
AGGREGATE OUTFILE *
        / BREAK SAMPLE
        / MLIFESAT MEANAGE = MEAN(lifesat age)
        / PctVHapp=PIN(happy,3,3).
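
For anyone who wants to check the logic outside SPSS, the oversample / scramble / trim / aggregate steps above can be sketched in Python. This is only an illustration under assumptions: the population is a made-up list of 3000 ages, the function name draw_samples is hypothetical, and the 0.12 selection rate and sample size of 300 are simply taken from the syntax above.

```python
import random
from statistics import mean


def draw_samples(population, n_samples=100, size=300, rate=0.12, seed=2011):
    """Oversample each case with probability `rate`, shuffle, keep at most `size`."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        # Counterpart of DO IF UNIFORM(1) < .12 inside the LOOP: Bernoulli selection.
        picked = [i for i in range(len(population)) if rng.random() < rate]
        # Counterpart of SORT CASES BY SAMPLE SCRAMBLE: random order within a sample.
        rng.shuffle(picked)
        # Counterpart of SELECT IF GPCount LE 300: drop the surplus cases.
        samples.append(picked[:size])
    return samples


# Hypothetical population: 3000 ages, purely for illustration.
pop_rng = random.Random(1)
ages = [pop_rng.randint(18, 90) for _ in range(3000)]

samples = draw_samples(ages)
# Counterpart of AGGREGATE ... / BREAK SAMPLE: one summary value per sample.
sample_means = [mean(ages[i] for i in s) for s in samples]
print(len(sample_means), round(min(sample_means), 2), round(max(sample_means), 2))
```

A sample can come out short if fewer than 300 cases pass the 0.12 filter, which is one reason the FREQ SAMPLE check in the syntax is worth keeping.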




--
View this message in context:
http://spssx-discussion.1045642.n5.nabble.com/Syntax-for-SAMPLE-tp4591268p4591971.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.


odds ratio and 95% CI in Meta analysis

E. Bernardo
Can we do a meta analysis when the available data for each study are odds ratio and 95% confidence interval?  There are 17 studies for the past 5 years.

Thank you for any input.

Eins


Re: Syntax for SAMPLE

David Marso
Administrator
In reply to this post by John F Hall
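I tested that code, but I added the comment afterwards and forgot its terminating period, so SPSS treated the COMPUTE on the next line as part of the comment and SCRAMBLE was never created:
* Using same ideas as what I posted to Cindy Gregory *
COMPUTE SCRAMBLE=UNIFORM(1).

With the period in place it reads:
* Using same ideas as what I posted to Cindy Gregory *.
COMPUTE SCRAMBLE=UNIFORM(1).

In all likelihood the rest of the code failed as a result.
Not sure why it would append.  Anyhow, go back to my original and don't modify anything other than putting a period at the end of that comment!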

John F Hall wrote
David

Tried your syntax, but had to modify it a bit.

*Marso sample .
* data set 'ql4gb1975.sav' .
* variables: var544 (happy), var545 (lifesat), age .
compute happy = var544.
compute lifesat = var545 .
compute caseid = serial .
*** OK here we go **.

SPSS doesn't like SCRAMBLE and I can't find it anywhere in the Syntax
Reference Guide, so I dumped it.  
<SNIP>

Also the variables SAMPLE (all with value 101) CASEID ( 1 to 300) and
GPCOUNT (1 to 300) are appended to the existing data set,
<SNIP>

John



Re: odds ratio and 95% CI in Meta analysis

Bruce Weaver
Administrator
In reply to this post by E. Bernardo
What you need for each study is a measure of effect size (Y) and its standard error.  In this case,

Y = ln(OR) -- where 'ln' = natural log

Then let U = ln(upper limit of the 95% CI for the OR)

U = Y + 1.96*SE(Y)
1.96*SE(Y) = U-Y
SE(Y) = (U-Y)/1.96

Some meta-analysis programs (in the past, at least) wanted the variance of Y rather than the standard error.  If that is the case, just square the standard error.
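
The arithmetic above can be sketched in a few lines of Python. This is just an illustration: the function names are hypothetical, the example study (OR = 2.0, 95% CI 1.0 to 4.0) is made up, and the second function is an alternative that assumes the interval is symmetric on the log scale and uses both limits.

```python
import math

Z95 = 1.96  # normal quantile used for a 95% interval


def log_or_and_se(odds_ratio, upper_limit):
    """Return (Y, SE) with Y = ln(OR) and SE(Y) = (ln(upper) - Y) / 1.96."""
    y = math.log(odds_ratio)
    u = math.log(upper_limit)
    return y, (u - y) / Z95


def se_from_both_limits(lower_limit, upper_limit):
    """Alternative: use the full width of the interval on the log scale."""
    return (math.log(upper_limit) - math.log(lower_limit)) / (2 * Z95)


# Made-up study: OR = 2.0 with 95% CI 1.0 to 4.0.
y, se = log_or_and_se(2.0, 4.0)
variance = se ** 2  # for programs that want the variance rather than the SE
print(round(y, 4), round(se, 4), round(variance, 4))
```

If the published limits are heavily rounded, the two ways of getting the SE will differ slightly; using both limits tends to be a little less sensitive to rounding.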

HTH.


Eins Bernardo wrote
Can we do a meta analysis when the available data for each study are odds ratio and 95% confidence interval?  There are 17 studies for the past 5 years.
Thank you for any input.
Eins
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Re: odds ratio and 95% CI in Meta analysis

E. Bernardo
I thought my data were hopeless.  Now I am enlightened.  Thank you for your input, Bruce.
Given the data (Y = ln(OR) and SE(Y)), please suggest a method and software.

Eins

--- On Sat, 7/16/11, Bruce Weaver <[hidden email]> wrote:

From: Bruce Weaver <[hidden email]>
Subject: Re: odds ratio and 95% CI in Meta analysis
To: [hidden email]
Date: Saturday, 16 July, 2011, 7:22 PM

What you need for each study is a measure of effect size (Y) and its standard
error.  In this case,

Y = ln(OR) -- where 'ln' = natural log

Then let U = ln(upper limit of the 95% CI for the OR)

U = Y + 1.96*SE(Y)
1.96*SE(Y) = U-Y
SE(Y) = (U-Y)/1.96

Some meta-analysis programs (in the past, at least) wanted the variance of Y rather
than the standard error.  If that is the case, just square the standard
error.

HTH.



Eins Bernardo wrote:
>
> Can we do a meta analysis when the available data for each study are odds
> ratio and 95% confidence interval?  There are 17 studies for the past 5
> years.
> Thank you for any input.
> Eins
>


-----
--
Bruce Weaver
bweaver@...
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

NOTE: My Hotmail account is not monitored regularly.
To send me an e-mail, please use the address shown above.

--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Syntax-for-SAMPLE-tp4591268p4594715.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.


Re: odds ratio and 95% CI in Meta analysis

Marta Garcia-Granero
On 17/07/2011 2:23, Eins Bernardo wrote:
I thought my data were hopeless.  Now I am enlightened.  Thank you for your input, Bruce.
Given the data (Y = ln(OR) and SE(Y)), please suggest a method and software.
Hi Eins:

Once your data are organized as OR & SE(logOR), as you were advised, try the syntax below (rather old, it doesn't use the capability of handling several datasets at the same time).

HTH,
Marta GG
(I will not be back from holidays until July 26).


* Data from: Wald et al, BMJ 2002;325:1202 Homocysteine and cardiovascular
* disease: evidence on causality from a meta-analysis
* (Fig 2, prospective studies on ischemic heart disease).

DATA LIST LIST/trial (F2) year (A5) study (A11) orr (F8.2) selog (F8.3).
BEGIN DATA
1    1997     Evans          0.89    0.120
2    2001     Knekt(ND)      0.97    0.242
3    1994     Alfthan        1.00    0.146
4    2001     Fallon         1.13    0.090
5    1999     Whincup        1.15    0.067
6    1998     Stehouwer      1.19    0.191
7    1998     Folsom         1.21    0.268
8    1999     Ridker         1.24    0.112
9    1998     Wald           1.26    0.064
10    1992     Stampfer       1.29    0.135
11    1999     Kark           1.33    0.093
12    1999     Bots           1.34    0.108
13    1995     Arnesen        1.42    0.160
14    2001     Vollset        1.51    0.121
15    1997     Nygard         1.55    0.163
16    2001     Knekt(D)       1.73    0.235
END DATA.

CACHE.
EXECUTE.

SUMMARIZE
  /TABLES=year TO selog
  /FORMAT=VALIDLIST NOCASENUM TOTAL
  /TITLE='Input data'
  /MISSING=VARIABLE
  /CELLS=NONE.

************** FIXED EFFECTS MODEL ***************.
MATRIX.
PRINT  /TITLE=' META-ANALYSIS: SUMMARY RATIO FROM PROCESSED DATA (FIXED-EFFECT MODEL)'.
PRINT /TITLE="INVERSE VARIANCE (WOOLF'S METHOD)".
GET trial /VAR=study.
GET orr   /VAR=orr.
GET selog /VAR=selog.
* General calculations & report *.
COMPUTE k=NROW(orr).
PRINT k
 /FORMAT="F8.0"
 /TITLE="Number of trials analysed (K)".
COMPUTE wi=(1/selog)&**2.
COMPUTE percwi=100*wi&/MSUM(wi).
COMPUTE cilow=EXP(LN(orr)-1.96&*selog).
COMPUTE ciup=EXP(LN(orr)+1.96&*selog).
PRINT {orr,selog,cilow,ciup,percwi}
 /FORMAT="F8.2"
 /RNAMES=trial
 /CLABELS="OR","SE(LNOR)","Lower","Upper", "Weight%"
 /TITLE='                                  95% CI'.
* Summary OR *.
COMPUTE num=MSUM(wi&*LN(orr)).
COMPUTE den=MSUM(wi).
COMPUTE woolforr=EXP(num/den).
COMPUTE sewoolf=1/SQRT(den).
COMPUTE cilowor=EXP((num/den)-1.96*sewoolf).
COMPUTE ciupwor=EXP((num/den)+1.96*sewoolf).
PRINT {woolforr,sewoolf,cilowor,ciupwor,100}
 /FORMAT="F8.2"
 /RLABELS=" Overall"
 /CLABELS="OR","SE(LNOR)","Lower","Upper", "Weight%"
 /TITLE='SUMMARY                           95% CI'.
COMPUTE chival=wi&*((LN(orr)-LN(woolforr))&**2).
COMPUTE het_chi=MSUM(chival).
COMPUTE het_sig=1-CHICDF(het_chi,k-1).
COMPUTE a_chi=(LN(woolforr)/sewoolf)**2.
COMPUTE a_sig=1-CHICDF(a_chi,1).
* Report *.
PRINT {a_chi,a_sig}
 /FORMAT="f8.4"
 /CLABELS="Chi^2","Sig."
 /TITLE="Association Chi-square statistic (df=1) - H0: No association".
PRINT {het_chi,het_sig}
 /FORMAT="f8.4"
 /CLABELS="Chi^2","Sig."
 /TITLE="Cochran Q heterogeneity test (df=K-1) - H0: Homogeneity".
DO IF het_chi GT (k-1).
- DO IF het_sig GT 0.10.
-  PRINT /TITLE="WARNING: Q p>0.10, but some heterogeneity exists!".
- END IF.
- COMPUTE h=SQRT(het_chi/(k-1)).
- COMPUTE isqr=100*(het_chi-(k-1))/het_chi.
- DO IF het_chi GT k.
-  COMPUTE eeh=LN(h)/(SQRT(2*het_chi)-SQRT(2*k-3)).
- ELSE IF het_chi LE k.
-  COMPUTE eeh=SQRT((1-(1/(3*(k-2)**2)))/(2*(k-2))).
- END IF.
- COMPUTE lowh=h*EXP(-1.96*eeh).
- COMPUTE upph=h*EXP(1.96*eeh).
- COMPUTE lowisqr=100*(lowh**2-1)/(lowh**2).
- DO IF lowisqr LT 0.
-  COMPUTE lowisqr=0.
- END IF.
- COMPUTE uppisqr=100*(upph**2-1)/(upph**2).
- DO IF uppisqr GT 100.
-  COMPUTE uppisqr=100.
- END IF.
- PRINT {isqr,lowisqr,uppisqr}
 /FORMAT="F8.1"
 /CLABELS='I^2(%)','Low95 CI','Upp95 CI'
 /TITLE='Heterogeneity statistic: 25%(low), 50%(moderate), 75%(high)'.
END IF.
* Exporting data for forest-plot *.
COMPUTE data1={woolforr,sewoolf}.
COMPUTE namevec1={"orr","selog"}.
SAVE data1 /OUTFILE='c:\temp\extrarow.sav' /NAMES=namevec1.
COMPUTE data2={cilow,ciup,percwi;cilowor,ciupwor,100}.
COMPUTE namevec2={"loworr","highorr","wi"}.
SAVE data2 /OUTFILE='C:\temp\extracols.sav' /NAMES=namevec2.
END MATRIX.

* Adding extra statistics & data to current file *.

ADD FILES /FILE=*
 /FILE='C:\temp\extrarow.sav'.
IF (MISSING(trial)) study = 'Total' .
IF (MISSING(trial)) trial = $casenum .
MATCH FILES /FILE=*
 /FILE='C:\temp\extracols.sav'.
EXECUTE.

*  Forest plot with individual and aggregated OR *.

VAR LABEL loworr 'Lower 95%CI' /highorr 'Upper 95%CI' /orr 'OR'.
GRAPH /HILO(SIMPLE)=VALUE( highorr loworr orr ) BY study
 /TITLE='Fixed Effects Model'.
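
As a cross-check on the MATRIX output, the fixed-effect (inverse variance) part of the calculation above can be sketched in Python. This is only an illustration: the function name fixed_effect and the dictionary keys are hypothetical, and the numbers are the 16 studies from the DATA LIST block above.

```python
import math

# OR and SE(ln OR) for the 16 studies listed in the DATA LIST block above.
ORS = [0.89, 0.97, 1.00, 1.13, 1.15, 1.19, 1.21, 1.24,
       1.26, 1.29, 1.33, 1.34, 1.42, 1.51, 1.55, 1.73]
SES = [0.120, 0.242, 0.146, 0.090, 0.067, 0.191, 0.268, 0.112,
       0.064, 0.135, 0.093, 0.108, 0.160, 0.121, 0.163, 0.235]


def fixed_effect(ors, ses, z=1.96):
    """Inverse-variance pooled OR with its 95% CI and Cochran's Q."""
    y = [math.log(o) for o in ors]
    w = [1.0 / s ** 2 for s in ses]              # same as wi=(1/selog)&**2
    pooled = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    se = 1.0 / math.sqrt(sum(w))                 # same as sewoolf
    q = sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w, y))
    return {
        "or": math.exp(pooled),
        "se": se,
        "low": math.exp(pooled - z * se),
        "high": math.exp(pooled + z * se),
        "q": q,
        "weights_pct": [100.0 * wi / sum(w) for wi in w],
    }


fe = fixed_effect(ORS, SES)
print(round(fe["or"], 2), round(fe["low"], 2), round(fe["high"], 2))
```

The pooling is done on the log scale and only exponentiated at the end, exactly as in the COMPUTE woolforr / cilowor / ciupwor lines.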

This is the code for a random effects model:

CACHE.
EXECUTE.

SUMMARIZE
  /TABLES=year TO selog
  /FORMAT=VALIDLIST NOCASENUM TOTAL
  /TITLE='Input data'
  /MISSING=VARIABLE
  /CELLS=NONE.

************* RANDOM EFFECTS MODEL ***************.
MATRIX.
PRINT  /TITLE=" META-ANALYSIS: ODDS-RATIO FROM PROCESSED DATA".
PRINT /TITLE=" RANDOM EFFECTS MODEL: DERSIMONIAN-LAIRD'S METHOD".
GET trial /VAR=study.
GET orr   /VAR=orr.
GET selog /VAR=selog.
* General calculations & report *.
COMPUTE k=NROW(orr).
PRINT k
 /FORMAT="F8.0"
 /TITLE="Number of trials analysed (K)".
COMPUTE wi=(1/selog)&**2.
COMPUTE cilow=EXP(LN(orr)-1.96&*selog).
COMPUTE ciup=EXP(LN(orr)+1.96&*selog).
* Individual weights & heterogeneity report *.
COMPUTE dp=MSUM(wi&*LN(orr))/MSUM(wi).
COMPUTE het_chi=MSUM(wi&*(LN(orr)-dp)&**2).
COMPUTE het_sig=1-CHICDF(het_chi,k-1).
PRINT /TITLE='Heterogeneity before taking Tau-square into consideration'.
print {het_chi,het_sig}
 /format="F8.4"
 /clabels="Chi^2","Sig."
 /title="Cochran Q heterogeneity test (df=K-1) - H0: Homogeneity".
DO IF het_chi GT (k-1).
- DO IF het_sig GT 0.10.
-  PRINT /TITLE="WARNING: Q p>0.10, but some heterogeneity exists!".
- END IF.
- COMPUTE h=SQRT(het_chi/(k-1)).
- COMPUTE isqr=100*(het_chi-(k-1))/het_chi.
- DO IF het_chi GT k.
-  COMPUTE eeh=LN(h)/(SQRT(2*het_chi)-SQRT(2*k-3)).
- ELSE IF het_chi LE k.
-  COMPUTE eeh=SQRT((1-(1/(3*(k-2)**2)))/(2*(k-2))).
- END IF.
- COMPUTE lowh=h*EXP(-1.96*eeh).
- COMPUTE upph=h*EXP(1.96*eeh).
- COMPUTE lowisqr=100*(lowh**2-1)/(lowh**2).
- DO IF lowisqr LT 0.
-  COMPUTE lowisqr=0.
- END IF.
- COMPUTE uppisqr=100*(upph**2-1)/(upph**2).
- DO IF uppisqr GT 100.
-  COMPUTE uppisqr=100.
- END IF.
- PRINT {isqr,lowisqr,uppisqr}
 /FORMAT="F8.1"
 /CLABELS='I^2(%)','Low95 CI','Upp95 CI'
 /TITLE='Heterogeneity statistic: 25%(low), 50%(moderate), 75%(high)'.
END IF.
COMPUTE tau=(het_chi-(k-1))/(MSUM(wi)-(MSUM(wi&**2))/MSUM(wi)).
DO IF tau GT 0.
- PRINT tau /FORMAT='F8.3'
 /TITLE='Tau-square (between trials variance)'.
- COMPUTE vartilda=tau+selog&**2.
- COMPUTE wi=1/vartilda.
ELSE.
- PRINT /TITLE='Tau-square=0 (random- & fixed-effect models will yield'
 +' identical results)'.
END IF.
COMPUTE percwi=100*wi&/MSUM(wi).
PRINT /TITLE="INDIVIDUAL OR & TAU-SQUARE MODIFIED WEIGHTS".
* Summary OR *.
COMPUTE num=MSUM(wi&*LN(orr)).
COMPUTE den=MSUM(wi).
COMPUTE delaiorr=EXP(num/den).
COMPUTE sedelai=1/SQRT(den).
COMPUTE cilowdl=EXP((num/den)-1.96*sedelai).
COMPUTE ciupdl=EXP((num/den)+1.96*sedelai).
* Reports *.
PRINT {orr,selog,cilow,ciup,percwi}
 /FORMAT="F8.2"
 /RNAMES=trial
 /CLABELS="OR","SE(LNOR)","Lower","Upper","Weight%"
 /TITLE='                                  95% CI'.
DO IF tau GT 0.
- PRINT /TITLE='Dersimonian-Laird statistic:'.
END IF.
PRINT {delaiorr,sedelai,cilowdl,ciupdl,100}
 /FORMAT="F8.2"
 /RLABELS=" Overall"
 /CLABELS="OR","SE(LNOR)","Lower","Upper","Weight%"
 /TITLE='SUMMARY                           95% CI'.
COMPUTE a_chi=((LN(delaiorr))/sedelai)**2.
COMPUTE a_sig=1-CHICDF(a_chi,1).
PRINT {a_chi,a_sig}
 /FORMAT="F8.4"
 /CLABELS="Chi^2","Sig."
 /TITLE="Association Chi-square statistic (df=1) - H0: No association".
* Exporting data for forest-plot *.
COMPUTE data1={delaiorr,sedelai}.
COMPUTE namevec1={"orr","selog"}.
SAVE data1 /OUTFILE='c:\temp\extrarow.sav' /NAMES=namevec1.
COMPUTE data2={cilow,ciup,percwi;cilowdl,ciupdl,100}.
COMPUTE namevec2={"loworr","highorr","wi"}.
SAVE data2 /OUTFILE='C:\temp\extracols.sav' /NAMES=namevec2.
END MATRIX.

* Adding extra statistics & data to current file *.

ADD FILES /FILE=*
 /FILE='C:\temp\extrarow.sav'.
IF (MISSING(trial)) study = 'Total' .
IF (MISSING(trial)) trial = $casenum .
MATCH FILES /FILE=*
 /FILE='C:\temp\extracols.sav'.
EXECUTE.

*  Forest plot with individual and aggregated OR *.

VAR LABEL loworr 'Lower 95%CI' /highorr 'Upper 95%CI' /orr 'OR'.
GRAPH /HILO(SIMPLE)=VALUE( highorr loworr orr ) BY study
 /TITLE='Random Effects Model'.
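
The DerSimonian-Laird steps above can also be sketched in Python as a rough cross-check. Again this is only an illustration: the function name and dictionary keys are hypothetical, the data are the 16 studies from the DATA LIST block, and a negative tau-square is truncated to zero here, which gives the same weights as the DO IF tau GT 0 branch in the MATRIX code.

```python
import math

# OR and SE(ln OR) for the 16 studies listed in the DATA LIST block above.
ORS = [0.89, 0.97, 1.00, 1.13, 1.15, 1.19, 1.21, 1.24,
       1.26, 1.29, 1.33, 1.34, 1.42, 1.51, 1.55, 1.73]
SES = [0.120, 0.242, 0.146, 0.090, 0.067, 0.191, 0.268, 0.112,
       0.064, 0.135, 0.093, 0.108, 0.160, 0.121, 0.163, 0.235]


def dersimonian_laird(ors, ses, z=1.96):
    """Random-effects pooled OR using the DerSimonian-Laird tau-square."""
    y = [math.log(o) for o in ors]
    w = [1.0 / s ** 2 for s in ses]
    k = len(y)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))   # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)        # between-trials variance
    i2 = max(0.0, 100.0 * (q - (k - 1)) / q) if q > 0 else 0.0
    w_star = [1.0 / (s ** 2 + tau2) for s in ses]             # modified weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    se = 1.0 / math.sqrt(sum(w_star))
    return {
        "or": math.exp(pooled),
        "se": se,
        "low": math.exp(pooled - z * se),
        "high": math.exp(pooled + z * se),
        "tau2": tau2,
        "i2": i2,
    }


dl = dersimonian_laird(ORS, SES)
print(round(dl["or"], 2), round(dl["low"], 2), round(dl["high"], 2),
      round(dl["tau2"], 4), round(dl["i2"], 1))
```

Because tau-square is added to every study's variance, the random-effects interval can never be narrower than the fixed-effect one, and the small studies get relatively more weight.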





Re: odds ratio and 95% CI in Meta analysis

Martin Holt
I remember doing something like this in Stata, using "metan". This also generates the Forest plot.
 HTH,
 
Martin Holt
Medical Statistician
From: Marta García-Granero <[hidden email]>
To: [hidden email]
Sent: Sunday, 17 July 2011, 8:54
Subject: Re: odds ratio and 95% CI in Meta analysis
On 17/07/2011 2:23, Eins Bernardo wrote:
I thought my data were hopeless.  Now I am enlightened.  Thank you for your input, Bruce.
Given the data (Y = ln(OR) and SE(Y)), please suggest a method and software.
I Eins: Once your data are organized as OR & SE(logOR), as you were advised, try the syntax below (rather old, it doesn't use the capability of handling several datasets at the same time). HTH, Marta GG (I will not be back from holidays until July 26). * Data from: Wald et al, BMJ 2002;325:1202 Homocysteine and cardiovascular * disease: evidence on causality from a meta-analysis * (Fig 2, prospective studies on ischemic heart disease)'. DATA LIST LIST/trial (F2) year (A5) study (A11) orr (F8.2) selog (F8.3). BEGIN DATA 1    1997     Evans          0.89    0.120 2    2001     Knekt(ND)      0.97    0.242 3    1994     Alfthan        1.00    0.146 4    2001     Fallon         1.13    0.090 5    1999     Whincup        1.15    0.067 6    1998     Stehouwer      1.19    0.191 7    1998     Folsom         1.21    0.268 8    1999     Ridker         1.24    0.112 9    1998     Wald           1.26    0.064 10    1992     Stampfer       1.29    0.135 11    1999     Kark           1.33    0.093 12    1999     Bots           1.34    0.108 13    1995     Arnesen        1.42    0.160 14    2001     Vollset        1.51    0.121 15    1997     Nygard         1.55    0.163 16    2001     Knekt(D)       1.73    0.235 END DATA. CACHE. EXECUTE. SUMMARIZE   /TABLES=year TO selog   /FORMAT=VALIDLIST NOCASENUM TOTAL   /TITLE='Input data'   /MISSING=VARIABLE   /CELLS=NONE. ************** FIXED EFFECTS MODEL ***************. MATRIX. PRINT  /TITLE ' META-ANALYSIS: SUMMARY RATIO FROM PROCESSED DATA (FIXED-EFFECT MODEL)'. PRINT /TITLE="INVERSE VARIANCE (WOOLF'S METHOD)". GET trial /VAR=study. GET orr   /VAR=orr. GET selog /VAR=selog. * General calculations & report *. COMPUTE k=NROW(orr). PRINT k  /FORMAT="F8.0"  /TITLE="Number of trials analysed (K)". COMPUTE wi=(1/selog)&**2. COMPUTE percwi=100*wi&/MSUM(wi). COMPUTE cilow=EXP(LN(orr)-1.96&*selog). COMPUTE ciup=EXP(LN(orr)+1.96&*selog). 
PRINT {orr,selog,cilow,ciup,percwi}  /FORMAT="F8.2"  /RNAMES=trial  /CLABELS="OR","SE(LNOR)","Lower","Upper", "Weight%"  /TITLE='                                  95% CI'. * Summary OR *. COMPUTE num=MSUM(wi&*LN(orr)). COMPUTE den=MSUM(wi). COMPUTE woolforr=EXP(num/den). COMPUTE sewoolf=1/SQRT(den). COMPUTE cilowor=EXP((num/den)-1.96*sewoolf). COMPUTE ciupwor=EXP((num/den)+1.96*sewoolf). PRINT {woolforr,sewoolf,cilowor,ciupwor,100}  /FORMAT="F8.2"  /RLABELS=" Overall"  /CLABELS="OR","SE(LNOR)","Lower","Upper", "Weight%"  /TITLE='SUMMARY                           95% CI'. COMPUTE chival=wi&*((LN(orr)-LN(woolforr))&**2). COMPUTE het_chi=MSUM(chival). COMPUTE het_sig=1-CHICDF(het_chi,k-1). COMPUTE a_chi=(LN(woolforr)/sewoolf)**2. COMPUTE a_sig=1-CHICDF(a_chi,1). * Report *. PRINT {a_chi,a_sig}  /FORMAT="f8.4"  /CLABELS="Chi^2","Sig."  /TITLE="Association Chi-square statistic (df=1) - H0: No association". PRINT {het_chi,het_sig}  /FORMAT="f8.4"  /CLABELS="Chi^2","Sig."  /TITLE="Cochran Q heterogeneity test (df=K-1) - H0: Homogeneity". DO IF het_chi GT (k-1). - DO IF het_sig GT 0.10. -  PRINT /TITLE="WARNING: Q p>0.10, but some heterogeneity exists!". - END IF. - COMPUTE h=SQRT(het_chi/(k-1)). - COMPUTE isqr=100*(het_chi-(k-1))/het_chi. - DO IF het_chi GT k. -  COMPUTE eeh=LN(h)/(SQRT(2*het_chi)-SQRT(2*k-3)). - ELSE IF het_chi LE k. -  COMPUTE eeh=SQRT((1-(1/(3*(k-2)**2)))/(2*(k-2))). - END IF. - COMPUTE lowh=h*EXP(-1.96*eeh). - COMPUTE upph=h*EXP(1.96*eeh). - COMPUTE lowisqr=100*(lowh**2-1)/(lowh**2). - DO IF lowisqr LT 0. -  COMPUTE lowisqr=0. - END IF. - COMPUTE uppisqr=100*(upph**2-1)/(upph**2). - DO IF uppisqr GT 100. -  COMPUTE uppisqr=100. - END IF. - PRINT {isqr,lowisqr,uppisqr}  /FORMAT="F8.1"  /CLABELS='I^2(%)','Low95 CI','Upp95 CI'  /TITLE='Heterogeneity statistic: 25%(low), 50%(moderate), 75%(high)'. END IF. * Exporting data for forest-plot *. COMPUTE data1={woolforr,sewoolf}. COMPUTE namevec1={"orr","selog"}. 
SAVE data1 /OUTFILE='c:\temp\extrarow.sav' /NAMES=namevec1.
COMPUTE data2={cilow,ciup,percwi;cilowor,ciupwor,100}.
COMPUTE namevec2={"loworr","highorr","wi"}.
SAVE data2 /OUTFILE='C:\temp\extracols.sav' /NAMES=namevec2.
END MATRIX.

* Adding extra statistics & data to current file *.
ADD FILES /FILE=*
  /FILE='C:\temp\extrarow.sav'.
IF (MISSING(trial)) study = 'Total' .
IF (MISSING(trial)) trial = $casenum .
MATCH FILES /FILE=*
  /FILE='C:\temp\extracols.sav'.
EXECUTE.

*  Forest plot with individual and aggregated OR *.
VAR LABEL loworr 'Lower 95%CI' /highorr 'Upper 95%CI' /orr 'OR'.
GRAPH /HILO(SIMPLE)=VALUE( highorr loworr orr ) BY study
  /TITLE='Fixed Effects Model'.

This is the code for a random effects model:

CACHE.
EXECUTE.
SUMMARIZE
  /TABLES=year TO selog
  /FORMAT=VALIDLIST NOCASENUM TOTAL
  /TITLE='Input data'
  /MISSING=VARIABLE
  /CELLS=NONE.

************* RANDOM EFFECTS MODEL ***************.
MATRIX.
PRINT
  /TITLE=" META-ANALYSIS: ODDS-RATIO FROM PROCESSED DATA".
PRINT /TITLE=" RANDOM EFFECTS MODEL: DERSIMONIAN-LAIRD'S METHOD".
GET trial /VAR=study.
GET orr   /VAR=orr.
GET selog /VAR=selog.
* General calculations & report *.
COMPUTE k=NROW(orr).
PRINT k
  /FORMAT="F8.0"
  /TITLE="Number of trials analysed (K)".
COMPUTE wi=(1/selog)&**2.
COMPUTE cilow=EXP(LN(orr)-1.96&*selog).
COMPUTE ciup=EXP(LN(orr)+1.96&*selog).
* Individual weights & heterogeneity report *.
COMPUTE dp=MSUM(wi&*LN(orr))/MSUM(wi).
COMPUTE het_chi=MSUM(wi&*(LN(orr)-dp)&**2).
COMPUTE het_sig=1-CHICDF(het_chi,k-1).
PRINT /TITLE='Heterogeneity before taking Tau-square into consideration'.
PRINT {het_chi,het_sig}
  /FORMAT="F8.4"
  /CLABELS="Chi^2","Sig."
  /TITLE="Cochran Q heterogeneity test (df=K-1) - H0: Homogeneity".
DO IF het_chi GT (k-1).
- DO IF het_sig GT 0.10.
-  PRINT /TITLE="WARNING: Q p>0.10, but some heterogeneity exists!".
- END IF.
- COMPUTE h=SQRT(het_chi/(k-1)).
- COMPUTE isqr=100*(het_chi-(k-1))/het_chi.
- DO IF het_chi GT k.
-  COMPUTE eeh=LN(h)/(SQRT(2*het_chi)-SQRT(2*k-3)).
- ELSE IF het_chi LE k.
-  COMPUTE eeh=SQRT((1-(1/(3*(k-2)**2)))/(2*(k-2))).
- END IF.
- COMPUTE lowh=h*EXP(-1.96*eeh).
- COMPUTE upph=h*EXP(1.96*eeh).
- COMPUTE lowisqr=100*(lowh**2-1)/(lowh**2).
- DO IF lowisqr LT 0.
-  COMPUTE lowisqr=0.
- END IF.
- COMPUTE uppisqr=100*(upph**2-1)/(upph**2).
- DO IF uppisqr GT 100.
-  COMPUTE uppisqr=100.
- END IF.
- PRINT {isqr,lowisqr,uppisqr}
  /FORMAT="F8.1"
  /CLABELS='I^2(%)','Low95 CI','Upp95 CI'
  /TITLE='Heterogeneity statistic: 25%(low), 50%(moderate), 75%(high)'.
END IF.
COMPUTE tau=(het_chi-(k-1))/(MSUM(wi)-(MSUM(wi&**2))/MSUM(wi)).
DO IF tau GT 0.
- PRINT tau
  /FORMAT='F8.3'
  /TITLE='Tau-square (between trials variance)'.
- COMPUTE vartilda=tau+selog&**2.
- COMPUTE wi=1/vartilda.
ELSE.
- PRINT /TITLE='Tau-square=0 (random- & fixed-effect models will yield'
  +' identical results)'.
END IF.
COMPUTE percwi=100*wi&/MSUM(wi).
PRINT /TITLE="INDIVIDUAL RR & TAU-SQUARE MODIFIED WEIGHTS".
* Summary OR *.
COMPUTE num=MSUM(wi&*LN(orr)).
COMPUTE den=MSUM(wi).
COMPUTE delaiorr=EXP(num/den).
COMPUTE sedelai=1/SQRT(den).
COMPUTE cilowdl=EXP((num/den)-1.96*sedelai).
COMPUTE ciupdl=EXP((num/den)+1.96*sedelai).
* Reports *.
PRINT {orr,selog,cilow,ciup,percwi}
  /FORMAT="F8.2"
  /RNAMES=trial
  /CLABELS="OR","SE(LNOR)","Lower","Upper","Weight%"
  /TITLE='                                  95% CI'.
* The tau-square estimate is stored in "tau" (no "tau2" is ever computed).
DO IF tau GT 0.
- PRINT /TITLE='Dersimonian-Laird statistic:'.
END IF.
PRINT {delaiorr,sedelai,cilowdl,ciupdl,100}
  /FORMAT="F8.2"
  /RLABELS=" Overall"
  /CLABELS="OR","SE(LNOR)","Lower","Upper","Weight%"
  /TITLE='SUMMARY                           95% CI'.
COMPUTE a_chi=((LN(delaiorr))/sedelai)**2.
COMPUTE a_sig=1-CHICDF(a_chi,1).
PRINT {a_chi,a_sig}
  /FORMAT="F8.4"
  /CLABELS="Chi^2","Sig."
  /TITLE="Association Chi-square statistic (df=1) - H0: No association".
* Exporting data for forest-plot *.
COMPUTE data1={delaiorr,sedelai}.
COMPUTE namevec1={"orr","selog"}.
SAVE data1 /OUTFILE='c:\temp\extrarow.sav' /NAMES=namevec1.
COMPUTE data2={cilow,ciup,percwi;cilowdl,ciupdl,100}.
COMPUTE namevec2={"loworr","highorr","wi"}.
SAVE data2 /OUTFILE='C:\temp\extracols.sav' /NAMES=namevec2.
END MATRIX.

* Adding extra statistics & data to current file *.
ADD FILES /FILE=*
  /FILE='C:\temp\extrarow.sav'.
IF (MISSING(trial)) study = 'Total' .
IF (MISSING(trial)) trial = $casenum .
MATCH FILES /FILE=*
  /FILE='C:\temp\extracols.sav'.
EXECUTE.

*  Forest plot with individual and aggregated OR *.
VAR LABEL loworr 'Lower 95%CI' /highorr 'Upper 95%CI' /orr 'OR'.
GRAPH /HILO(SIMPLE)=VALUE( highorr loworr orr ) BY study
  /TITLE='Random Effects Model'.
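
For anyone who wants to cross-check the MATRIX output outside SPSS, the following is a minimal Python sketch of the same calculations: inverse-variance (Woolf) pooling, Cochran's Q, I^2, and the DerSimonian-Laird tau-square. It is not part of the posted syntax. The function names pool, meta and ci95 are made up for illustration, and the data are the 16 (OR, SE) pairs from the DATA LIST block. It leaves out the confidence interval for I^2 and the forest plot.

```python
import math

# (OR, SE of ln OR) pairs copied from the BEGIN DATA block in the post.
STUDIES = [
    (0.89, 0.120), (0.97, 0.242), (1.00, 0.146), (1.13, 0.090),
    (1.15, 0.067), (1.19, 0.191), (1.21, 0.268), (1.24, 0.112),
    (1.26, 0.064), (1.29, 0.135), (1.33, 0.093), (1.34, 0.108),
    (1.42, 0.160), (1.51, 0.121), (1.55, 0.163), (1.73, 0.235),
]


def pool(log_or, weights):
    """Return the weighted mean of ln(OR) and its standard error."""
    den = sum(weights)
    mean = sum(w * y for w, y in zip(weights, log_or)) / den
    return mean, 1.0 / math.sqrt(den)


def ci95(log_mean, se):
    """95% confidence limits on the OR scale (normal approximation)."""
    return math.exp(log_mean - 1.96 * se), math.exp(log_mean + 1.96 * se)


def meta(studies):
    """Fixed-effect and DerSimonian-Laird summaries for (OR, SE) pairs."""
    k = len(studies)
    y = [math.log(o) for o, _ in studies]
    var = [se ** 2 for _, se in studies]

    # Fixed-effect model: weights are 1 / SE^2.
    w = [1.0 / v for v in var]
    fixed_mean, fixed_se = pool(y, w)

    # Cochran's Q and I^2 (I^2 is reported as 0 when Q <= k - 1).
    q = sum(wi * (yi - fixed_mean) ** 2 for wi, yi in zip(w, y))
    i2 = 100.0 * (q - (k - 1)) / q if q > (k - 1) else 0.0

    # DerSimonian-Laird between-study variance, truncated at zero.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects model: weights are 1 / (SE^2 + tau^2).
    w_re = [1.0 / (v + tau2) for v in var]
    rand_mean, rand_se = pool(y, w_re)

    return {
        "k": k, "q": q, "i2": i2, "tau2": tau2,
        "fixed_or": math.exp(fixed_mean), "fixed_se": fixed_se,
        "fixed_ci": ci95(fixed_mean, fixed_se),
        "random_or": math.exp(rand_mean), "random_se": rand_se,
        "random_ci": ci95(rand_mean, rand_se),
    }


if __name__ == "__main__":
    for name, value in meta(STUDIES).items():
        print(name, value)
```

Truncating tau-square at zero matches the DO IF tau GT 0 branch in the syntax: when the estimate is not positive, the random-effects weights stay equal to the fixed-effect weights.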


Re: odds ratio and 95% CI in Meta analysis

Bruce Weaver
Administrator
Meta-analysis via SPSS is very much a roll-your-own thing at the moment.  A general meta-analysis procedure (including forest & funnel plots etc) would be a very nice addition.  I wonder what the chances are?  Given how long we waited for GENLIN (and then GENLIN-MIXED, or whatever it's called), I'll not hold my breath.  



Martin Holt wrote
I remember doing something like this in Stata, using "metan". This also generates the Forest plot.

 HTH,

Martin Holt
Medical Statistician

From: Marta García-Granero <[hidden email]>
To: [hidden email]
Sent: Sunday, 17 July 2011, 8:54
Subject: Re: odds ratio and 95% CI in Meta analysis

  On 17/07/2011 2:23, Eins Bernardo wrote:
I thought my data were hopeless.  Now I am enlightened.  Thank you for your input, Bruce.
>Given the data (Y = ln OR and SE(Y)), please suggest a method and software.
Hi Eins:  Once your data are organized as OR & SE(logOR), as you were advised, try the syntax below (rather old, it doesn't use the capability of handling several datasets at the same time).  HTH, Marta GG (I will not be back from holidays until July 26).

--- snip Marta's syntax for performing meta-analysis ---
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).