SPSSX Discussion

calculate group membership probabilities for a-posteriori additions in DFA?

Ian Martin-2

calculate group membership probabilities for a-posteriori additions in DFA?

Hi All,

I've generated the discriminant function between 2 groups, and now want to add in additional cases not used in the original DFA. I understand how to generate the DF score for these new cases, based on their values of the variables used in DFA and the table of Canonical Discriminant Function Coefficients.

But since the groups overlap, and because the prior probabilities are not equal, I would like to calculate the probabilities of membership in both Group 1 and Group 2 for these new cases but haven't figured out how to use the output to do so.

Can anyone point me in the right direction?

Thanks,
Ian Martin
ps here is a sample syntax

DISCRIMINANT
/GROUPS=condition_diag_over20(1 2)
/VARIABLES=v1 v2 v3 v4 v5 v6
/ANALYSIS ALL
/SAVE=CLASS SCORES PROBS
/PRIORS SIZE
/STATISTICS=MEAN STDDEV UNIVF BOXM COEFF RAW TABLE CROSSVALID
/CLASSIFY=NONMISSING POOLED.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Jon K Peck

Re: calculate group membership probabilities for a-posteriori additions in DFA?

Save your model from discriminant as an xml file. Then run code like this.
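* Load the saved model; MISSING=SUBSTITUTE applies the model's missing-value substitution.
MODEL HANDLE NAME=discrim FILE='c:\temp\discrim.xml'
/OPTIONS MISSING=SUBSTITUTE.
* Predicted group, and the probability of that predicted group.
COMPUTE PredictedValue=APPLYMODEL(discrim, 'PREDICT').
COMPUTE PredictedProbability=APPLYMODEL(discrim, 'PROBABILITY').
* Probability of each specific group (the groups are coded 1 and 2 in the original post).
COMPUTE SelectedProbability1=APPLYMODEL(discrim, 'PROBABILITY', 1).
COMPUTE SelectedProbability2=APPLYMODEL(discrim, 'PROBABILITY', 2).
EXEC.
MODEL CLOSE NAME=discrim.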

You need the EXEC, because the model cannot be closed until the transformations have been executed.

Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621

From: Ian Martin <[hidden email]>
To: [hidden email],
Date: 06/06/2014 02:02 PM
Subject: [SPSSX-L] calculate group membership probabilities for a-posteriori additions in DFA?
Sent by: "SPSSX(r) Discussion" <[hidden email]>


Art Kendall

Re: calculate group membership probabilities for a-posteriori additions in DFA?

In reply to this post by Ian Martin-2

You say the groups overlap. Do you mean in group membership or on scores on a variable other than the group membership variable?

The easiest way to score the new cases IF you have the original data, is to ADD CASES and give them a value on the group membership variable that is not one of the original values. "3" would work.
Then rerun your syntax.

Art Kendall
Social Research Consultants

Ian Martin-2

Re: calculate group membership probabilities for a-posteriori additions in DFA?

On Jun 6, 2014, at 4:42 PM, Art Kendall <[hidden email]> wrote:

> You say the groups overlap. Do you mean in group membership or on scores on a
> variable other than the group membership variable?
>
> The easiest way to score the new cases IF you have the original data, is to
> ADD CASES and give them a value on the group membership variable that is
> not one of the original values. "3" would work.
> Then rerun your syntax.
>

Art & Jon,

I mean the groups overlap on value of the DF score (Art's question). Jon, thank you for the effort on syntax, but I was looking for a different solution:

I want to provide a scoring and classification system that can be used outside of SPSS.

Otherwise, I guess either your ADD CASES or Jon's syntax would work. Sorry, I should have been clearer about the objective. Ideally, somebody could plug numbers for v1-v6 into a spreadsheet or even hand calculator, and get the DF score for that single new case and the groups' classification probabilities. As I said, it's easy to do that for the DF score (in Excel or whatever, just by lifting the Canonical DF Coeffs. from SPSS), but I need the group classification probabilities too.

I have been assuming I can somehow generate the classification probabilities outside of SPSS by using the SPSS output table "Classification Function Coefficients" and the table "Prior Probabilities for Groups". Hope so, anyway!

My apologies for not indicating that I wanted to do this outside of SPSS.

regards,
Ian
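
A minimal sketch of the out-of-SPSS calculation Ian describes, written in Python rather than a spreadsheet. It assumes pooled covariance (as with /CLASSIFY=NONMISSING POOLED) and assumes the constants in the "Classification Function Coefficients" table already include the log of each group's prior, which is how the SPSS algorithms documentation appears to define them. Under those assumptions the posterior probabilities are the exponentiated classification scores, normalized to sum to 1. The function name and every coefficient value below are hypothetical placeholders, so check the result against the PROBS that SPSS saves for a few of the original cases before relying on it.

```python
import math

def posterior_probs(x, coefs, constants):
    """Posterior group probabilities from classification function scores.

    x         -- predictor values for one new case (v1..v6 in the post)
    coefs     -- one list of coefficients per group
    constants -- one constant per group (ASSUMED to include log prior)
    """
    # Classification score for each group: constant + sum(coef * value).
    scores = [c0 + sum(b * xi for b, xi in zip(bs, x))
              for bs, c0 in zip(coefs, constants)]
    # Subtract the largest score before exponentiating to avoid overflow.
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical coefficients for two groups and six variables;
# replace with the values from your own SPSS output table.
coefs = [
    [0.52, 1.10, -0.31, 0.08, 0.77, 0.15],   # group 1
    [0.61, 0.94, -0.12, 0.21, 0.55, 0.30],   # group 2
]
constants = [-12.4, -13.9]                    # hypothetical
new_case = [3.2, 5.1, 1.4, 0.6, 2.2, 4.0]     # hypothetical v1..v6

p1, p2 = posterior_probs(new_case, coefs, constants)
print(f"P(group 1) = {p1:.4f}, P(group 2) = {p2:.4f}")
```

With two groups the spreadsheet equivalent is a single formula of the form =EXP(s1)/(EXP(s1)+EXP(s2)), where s1 and s2 are the two classification scores.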


David Marso

Re: calculate group membership probabilities for a-posteriori additions in DFA?

Administrator

Click
Help>Algorithms ...
Locate the formulas, write an implementation in Excel...
Have fun with the latter...

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Rich Ulrich

Re: calculate group membership probabilities for a-posteriori additions in DFA?

In reply to this post by Ian Martin-2

I wonder, are you having a problem with "prior probabilities are not equal"?

If so, I have to ask: was this a well-informed, intentional decision, or was it a
casual choice based on, "well, this looks like something better"? If it was the latter ... don't do it.

- because it is almost always wrong to use anything but the default of
"equal priors". I once did use priors saying that one group was two times "more
probable" because I did want 95% of the errors to be in the other direction.

What the Prior does is this: it effectively multiplies the probability densities by the weights.
Think of the two overlapping normal curves for the two samples; now, multiply one of
them by the ratio of the priors; now, notice how far it shifts the cutoff score (assuming
that not all cases are put into the same group). The equation is shown at -
http://www-01.ibm.com/support/knowledgecenter/SSLVMB_20.0.0/com.ibm.spss.statistics.help/alg_discriminant_classification.htm?lang=en

--
Rich Ulrich
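
The "multiply the curves by the priors" picture can be sketched for the two-group case as follows. This is an illustration under stated assumptions, not a transcription of the SPSS algorithm: each group's discriminant scores are taken to be normal with unit variance around that group's centroid (the values in the "Functions at Group Centroids" table), and the function names, centroids and priors are all hypothetical.

```python
import math

def cutoff(c1, c2, p1, p2):
    """DF score at which the prior-weighted densities of the two groups are equal."""
    # Midpoint of the centroids, shifted by the log ratio of the priors.
    return (c1 + c2) / 2 + math.log(p2 / p1) / (c1 - c2)

def posterior_group1(d, c1, c2, p1, p2):
    """P(group 1 | DF score d) under unit-variance normal curves."""
    # Log of prior * density for each group; the shared normalizing constant cancels.
    a1 = math.log(p1) - 0.5 * (d - c1) ** 2
    a2 = math.log(p2) - 0.5 * (d - c2) ** 2
    return 1.0 / (1.0 + math.exp(a2 - a1))

# Hypothetical centroids at -1 and +1.
print(cutoff(-1.0, 1.0, 0.5, 0.5))   # equal priors: cutoff sits at the midpoint
print(cutoff(-1.0, 1.0, 0.8, 0.2))   # unequal priors: cutoff moves toward group 2
print(posterior_group1(0.0, -1.0, 1.0, 0.8, 0.2))
```

In this sketch the cutoff moves away from the centroid of the group with the larger prior, so more of the score range is assigned to that group.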

Rich Ulrich

Re: calculate group membership probabilities for a-posteriori additions in DFA?

The usual and desired result in the research that I know is to have a
similar fraction of errors in each direction. That would be: sensitivity
versus (negative) specificity, or False Positives versus False Negatives.

I have never figured out whether Priors=size is justified by giving the
"fewest errors" or if it is just a totally bad idea. In terms of "fewest
errors", I will point out that for grossly unequal group sizes, you may, indeed,
get the "fewest errors" by placing every single case into the larger group;
and that is what the option will give you.

Setting Priors to something other than equal will provide a shifted cutoff,
but I suggest trying out various cutoffs in order to find what works for you,
if you are looking for some particular combination of sensitivity and specificity.

--
Rich Ulrich
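
Trying out various cutoffs, as suggested above, might look like the sketch below. It assumes you have the saved DF scores and the known group codes for the original cases, that higher scores point toward group 2, and that both groups are represented; the function name and the data are made up for illustration.

```python
def sens_spec(scores, groups, cutoff, positive=2):
    """Classify score > cutoff as `positive`; return (sensitivity, specificity)."""
    tp = fn = tn = fp = 0
    for score, group in zip(scores, groups):
        predicted_positive = score > cutoff
        if group == positive:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    # Assumes at least one case in each group, otherwise these divide by zero.
    return tp / (tp + fn), tn / (tn + fp)

# Made-up DF scores and true group codes (1 or 2).
scores = [-1.8, -1.1, -0.4, 0.2, 0.3, 0.9, 1.5, 2.1]
groups = [1, 1, 1, 1, 2, 1, 2, 2]

for c in (-0.5, 0.0, 0.5, 1.0):
    sens, spec = sens_spec(scores, groups, c)
    print(f"cutoff {c:+.1f}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

Scanning the printed table for the cutoff that gives an acceptable balance is the manual version of shifting the priors.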

From: [hidden email]
Subject: Re: calculate group membership probabilities for a-posteriori additions in DFA?
Date: Sat, 7 Jun 2014 09:21:38 -0400
To: [hidden email]; [hidden email]

Rich,

Thanks for your input and the link.

I chose to set prior probabilities based on the sample sizes of the 2 groups, which were quite different. I can see that pulling priors out of a hat based on some hoped-for effect might be biased, but in my case it seemed less biased to use group sizes to set prior probabilities than to use equal prior probs. Is that unreasonable?

I’ll have a closer look at the algorithms later, thanks.

Ian
