SPSSX Discussion

Need help with aggregate macro with multiple file saves

Ken Chui

Need help with aggregate macro with multiple file saves

Hello all,

I am trying to modify the following macro so that it aggregates by
different break variables (status, chemo) and saves a separate file for
each. Right now this version can only produce "status.sav" and
"chemo.sav". How can I use the second argument, FILEN, so that the
output file is named "a01.sav" for 'status' and "a02.sav" for 'chemo'?
Thanks for any advice/help.

-Ken

DEFINE AGG1 (GP = !ENCLOSE('[',']') / FILEN = !ENCLOSE('[',']')) .
!DO !I !IN (!GP)
GET
FILE='C:\Program Files\SPSS\AML survival.sav'.
AGGREGATE
/OUTFILE=!QUOTE(!CONCAT ('C:\temp\', !I, '.sav'))
/BREAK=!I
/time_mean = MEAN(time).
!DOEND .
!ENDDEFINE .

AGG1 GP = [status chemo] FILEN = [a01 a02] .

Raynald Levesque

Re: Need help with aggregate macro with multiple file saves

Hi

This is one way (of course "cleaner" solutions are possible using Python):

*////////////////////////////.
DEFINE !agg1 (gp = !ENCLOSE('[',']') / filen= !ENCLOSE('[',']')) .

!LET !cnt1=!BLANKS(0)
!DO !I !IN (!gp)
!LET !cnt1=!CONCAT(!cnt1,!BLANKS(1))

!LET !cnt2=!BLANKS(0)
!DO !name !IN (!filen)
!LET !cnt2=!CONCAT(!cnt2,!BLANKS(1))
!IF (!cnt2 = !cnt1) !THEN
!LET !filename=!name
!BREAK
!IFEND
!DOEND

GET FILE='C:\Program Files\SPSS\AML survival.sav'.
AGGREGATE
/OUTFILE=!QUOTE(!CONCAT ('C:\temp\', !filename, '.sav'))
/BREAK=!I
/time_mean = MEAN(time).
!DOEND
*////////////////////////////.

!ENDDEFINE .

SET MPRINT=YES.
!agg1 gp = [status chemo] filen = [a01 a02] .

This is the expanded syntax produced by the macro:
202 0 M> GET FILE='C:\Program Files\SPSS\AML survival.sav'.
203 0 M> AGGREGATE /OUTFILE= 'C:\temp\a01.sav' /BREAK= status /time_mean =
MEAN(time).
204 0 M>
205 0 M>
206 0 M> GET FILE='C:\Program Files\SPSS\AML survival.sav'.
207 0 M> AGGREGATE /OUTFILE= 'C:\temp\a02.sav' /BREAK= chemo /time_mean =
MEAN(time).
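
The Python route mentioned above could look roughly like the following. This is only a sketch: `build_aggregate_syntax` is a made-up helper name, and it only builds the command text by pairing each break variable with its file name. Inside SPSS with the Python integration plug-in installed, the resulting string would typically be handed to `spss.Submit` from a BEGIN PROGRAM block; that step is assumed here and left out so the snippet runs in plain Python.

```python
def build_aggregate_syntax(break_vars, file_names,
                           source=r"C:\Program Files\SPSS\AML survival.sav",
                           out_dir="C:\\temp\\"):
    """Return one GET + AGGREGATE command pair per (break variable, file name).

    Hypothetical helper: the paths and the time_mean summary are copied
    from the macro in this thread.
    """
    if len(break_vars) != len(file_names):
        raise ValueError("need exactly one file name per break variable")
    commands = []
    # zip() pairs the lists positionally, replacing the blank-counting trick
    for var, name in zip(break_vars, file_names):
        commands.append(
            f"GET FILE='{source}'.\n"
            f"AGGREGATE\n"
            f"  /OUTFILE='{out_dir}{name}.sav'\n"
            f"  /BREAK={var}\n"
            f"  /time_mean = MEAN(time)."
        )
    return "\n".join(commands)


syntax = build_aggregate_syntax(["status", "chemo"], ["a01", "a02"])
print(syntax)
```

The positional pairing is the whole point: the macro version has to count blanks to find the n-th file name, while `zip` does the same matching directly and the length check catches mismatched lists.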

Cheers!

Raynald Levesque [hidden email]
Visit my SPSS site: http://www.spsstools.net

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Ken
Chui
Sent: August 15, 2006 11:07 AM
To: [hidden email]
Subject: Need help with aggregate macro with multiple file saves

Gary Rosin

Loss function for log-likelihood nonlinear regression of proportion via Inverse Logit

I have aggregated data for the proportion of persons passing a
high-stakes exam. I'd like to use a logit transformation of the
proportion to do a multinomial regression against predictors.
I want more information than the probit/logit procedures in SPSS provide.

I'm trying nonlinear regression, but would like to use
log-likelihood as the loss function instead of least-squares.

How do I specify the loss function in nonlinear analysis?

---
Prof. Gary S. Rosin [hidden email]
South Texas College of Law
1303 San Jacinto Voice: (713) 646-1854
Houston, TX 77002-7000 Fax: 646-1766

Hector Maletta

Re: Loss function for log-likelihood nonlinear regression of proportion via Inverse Logit

If you predict the binary exam outcome (pass or fail) via logistic
regression, you are predicting the logit (i.e. the natural logarithm of
[p/(1-p)]) as a linear function of the predictors. The log likelihood is
produced as a matter of course. To do so you would need individual,
not aggregated, data, unless you have the proportion of people passing the
exam for every combination of values of the predictors. In this latter case
you would need a file structured as follows: one row per combination and
one column per variable. The variables would be: one column for every
predictor, one column for the proportion passing the exam (p), and one
column for the frequency (n) of each combination of values of the
predictors. To make things easier for SPSS, you may duplicate every case,
assigning to one copy (coded as pass) the frequency pn, and to the other
copy (coded as fail) the frequency (1-p)n, where n is the number of cases
with that combination. Finally, weight by the frequency variable, and apply
logistic regression.
All this is for logit. For probit the answer would be different, but I do
not think you seriously need probit.
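
The restructuring described above can be sketched in a few lines of Python. The function name, the `passed` and `freq` column names, and the two sample rows are all invented for illustration; `n` is taken to be the number of cases in each combination of predictor values.

```python
def expand_to_weighted_cases(rows):
    """Turn one aggregated row per predictor combination into two weighted rows.

    Each input row is a dict holding the predictor values plus
    'p' (proportion passing) and 'n' (cases in that combination).
    """
    expanded = []
    for row in rows:
        predictors = {k: v for k, v in row.items() if k not in ("p", "n")}
        # copy 1: the cases that passed, weighted by p * n
        expanded.append({**predictors, "passed": 1, "freq": row["p"] * row["n"]})
        # copy 2: the cases that failed, weighted by (1 - p) * n
        expanded.append({**predictors, "passed": 0, "freq": (1 - row["p"]) * row["n"]})
    return expanded


# made-up aggregated data: one predictor x1, two combinations
aggregated = [
    {"x1": 0, "p": 0.75, "n": 40},
    {"x1": 1, "p": 0.5, "n": 20},
]
cases = expand_to_weighted_cases(aggregated)
for case in cases:
    print(case)
```

In SPSS terms, `freq` would be the variable named on WEIGHT BY and `passed` the dependent variable of the logistic regression.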
I hope this answers your question.
Hector

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Gary
Rosin
Sent: Tuesday, August 15, 2006 10:31 PM
To: [hidden email]
Subject: Loss function for log-likelihood nonlinear regression of proportion
via Inverse Logit

Gary Rosin

Re: Loss function for log-likelihood nonlinear regression of proportion via Inverse Logit

At 09:16 PM 8/15/2006, Hector Maletta wrote:
>If you predict the binary exam outcome (pass or fail) via logistic
>regression, you are predicting the logit (i.e. the natural logarithm of
>[p/(1-p)]) as a linear function of the predictors. The log likelihood is
>produced as a matter of course. But for doing so you would need
>individual, not aggregated data ... . ***

Thanks, but I want to do nonlinear regression using the model
Pass Rate (PR) =
exp(b0 + b1*x1+ ... + bn*xn)/(1+exp(b0 + b1*x1+ ... + bn*xn))
What I need is the formula for the log-likelihood loss function, so that
I can use that instead of least-squares.

Gary

Hector Maletta

Re: Loss function for log-likelihood nonlinear regression of proportion via Inverse Logit

The nonlinear expression for the pass rate mentioned in
your message is exactly the definition of the probability of passing the
exam in logistic regression. Calling Z the linear combination
b0+b1X1+b2X2+...+bnXn, that probability is exp(Z) / (1 + exp(Z)), and the b
coefficients are the log-odds coefficients computed by logistic
regression.
Hector.

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Gary
Rosin
Sent: Tuesday, August 15, 2006 11:29 PM
To: [hidden email]
Subject: Re: Loss function for log-likelihood nonlinear regression of
proportion via Inverse Logit

Gary Rosin

Re: Loss function for log-likelihood nonlinear regression of proportion via Inverse Logit

At 06:25 AM 8/16/2006, Hector Maletta wrote:
>The nonlinear expression for the pass rate mentioned in
>your message is exactly the definition of the probability of passing the
>exam in logistic regression. Calling Z the linear combination
>b0+b1X1+b2X2+...+bnXn, that probability is exp(Z) / (1 + exp(Z)), and the b
>coefficients are the log-odds coefficients computed by logistic
>regression.

Yes. In Nonlinear Regression that expression is what I use as the
"Model". The default loss function is least-squares. I'd like to use
log-likelihood instead, but don't know the expression for that loss
function.
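
For reference, the loss being asked about is usually the negative binomial log-likelihood. For group i with observed pass rate p_i, group size n_i and predicted rate PR_i, each group contributes -n_i * [p_i*ln(PR_i) + (1-p_i)*ln(1-PR_i)], up to a constant that does not depend on the b coefficients. The Python sketch below evaluates that sum; the function names and the sample numbers are invented, and it assumes the group sizes n_i are available alongside the proportions.

```python
import math


def inverse_logit(z):
    """exp(z) / (1 + exp(z)), written in the algebraically equivalent form."""
    return 1.0 / (1.0 + math.exp(-z))


def neg_log_likelihood(b, rows):
    """Negative binomial log-likelihood (constant term dropped).

    b    : [b0, b1, ..., bk] coefficients
    rows : iterable of (x, p, n) with x a list of k predictor values,
           p the observed pass rate and n the number of examinees.
    """
    total = 0.0
    for x, p, n in rows:
        z = b[0] + sum(bj * xj for bj, xj in zip(b[1:], x))
        pr = inverse_logit(z)  # model-predicted pass rate
        total -= n * (p * math.log(pr) + (1 - p) * math.log(1 - pr))
    return total


# made-up data: one group, one predictor
rows = [([1.0], 0.75, 40)]
print(neg_log_likelihood([0.0, 0.0], rows))          # prediction 0.5
print(neg_log_likelihood([math.log(3), 0.0], rows))  # prediction 0.75
```

Minimising this quantity over b is equivalent to the weighted logistic regression described earlier in the thread, so the two approaches should agree on the coefficients.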

Gary

---

Prof. Gary S. Rosin Internet: [hidden email]
South Texas College of Law
1303 San Jacinto Voice: (713) 646-1854
Houston, TX 77002-7000 Fax: (713) 646-1766

Ken Chui

Re: Need help with aggregate macro with multiple file saves

Yay~!! Thank you!

It works, and I'll look at the syntax in detail. I have never used !BLANKS
and !BREAK before.

Thanks a lot.

-Ken