SPSSX Discussion

Calculation of discriminant function scores from raw data


Ian Martin-3

Jan 14, 2020; 6:45pm

Calculation of discriminant function scores from raw data

I’m interested in taking some data cases not used in the original DFA and “projecting” these into the DF space, using the coefficients from the SPSS output.

In this support doc:

www.ibm.com/support/pages/which-coefficients-are-used-computing-discriminant-scores-spss

it states that:

"The raw or unstandardized canonical function coefficients are used to compute the saved or printed discriminant function scores. The scores are computed by applying the regression-like equation of the constant plus each coefficient times the raw value of the appropriate variable, and summing."

I took the SPSS output table of Canonical Discriminant Function Coefficients (Unstandardized Coefficients) into Excel, along with the unstandardized original variables and tried to compute the scores of some observations used in the DFA, to make sure I was using the correct method before projecting in the new observations. Basically just the regression equation:

DFscore = v1*c1 + v2*c2 + v3*c3 + … + constant
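
For readers who want to check the arithmetic outside Excel, here is a minimal sketch of the regression-style equation above in plain Python. The function name and every number are hypothetical, chosen for illustration only; they do not come from any SPSS output.

```python
def discriminant_score(raw_values, coefficients, constant):
    """Return constant + sum(coefficient * raw value) for one case, one function."""
    if len(raw_values) != len(coefficients):
        raise ValueError("need exactly one coefficient per variable")
    return constant + sum(v * c for v, c in zip(raw_values, coefficients))


# Hypothetical unstandardized coefficients and constant (illustration only).
coefficients = [0.5, -0.25, 2.0]
constant = -1.5

# One hypothetical case, with raw values in the same variable order.
case = [2.0, 4.0, 0.5]

score = discriminant_score(case, coefficients, constant)
print(score)  # -0.5
```

If a check like this still disagrees with the saved scores, it may be worth confirming that the coefficients were copied at full stored precision rather than as displayed in the output table, and that the variables are in the same order as the coefficients.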

The scores I got were close to those saved from the analysis, but not the same. They differed enough that I don’t think we are talking about rounding errors or something.

Any advice or comment appreciated.

Ian Martin

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

Jon Peck

Jan 14, 2020; 6:54pm

Re: Calculation of discriminant function scores from raw data

If you are doing this within Statistics, the easiest way would be to save the discriminant model by exporting it to an XML file and then using the Utilities > Scoring Wizard to do the predictions.


Jon K Peck
[hidden email]

Kirill Orlov

Jan 14, 2020; 7:04pm

Re: Calculation of discriminant function scores from raw data

In reply to this post by Ian Martin-3

If you have the "old" (training) dataset, you can merge it with the new data points and rerun the analysis, this time using the Selection Variable field so that only the training cases are used to derive the functions.


Art Kendall

Jan 14, 2020; 9:05pm

Re: Calculation of discriminant function scores from raw data

In reply to this post by Ian Martin-3

Do you have the original dataset?
If so, append the new cases to the old set. Create a value in the DFA grouping variable that is not in the original set of values.

Then, in the classification phase, treat that value as ungrouped.

It is often useful to keep the original grouping variable and look at the assigned values vs. the original values vs. the assigned values of the ungrouped cases.

Are the discriminating variables items from a summative scale?

Does the output from the first DFA suggest that items could be grouped into sets for a summative scale score?

-----
Art Kendall
Social Research Consultants
--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/


Rich Ulrich

Jan 14, 2020; 9:48pm

Re: Calculation of discriminant function scores from raw data

In reply to this post by Ian Martin-3

What I recall from 15 years ago is that I never got the Unstandardized Coefficients to work. What worked - what gave numbers the same as the DF procedure - was to use the standardized set and apply them to the z-scored version of a set of data.

For my external data (I had multiple sets, too), the "z-scoring" for new data used DO REPEAT with the mean and SD of the original dataset. If I recall correctly, I wrote out the z-scores by themselves and used MATRIX to read and score all multiple factors at once, using matrix multiplication.

- I think I had to experiment to get the matrix multiplication to work, possibly because I was confused by accidentally leaning on my math education and Fortran conventions, which themselves differ, instead of focusing on SPSS MATRIX. (Is it var(row, column) or var(column, row)?)
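
The z-score route described above can be sketched in plain Python instead of SPSS MATRIX. This is an illustration under stated assumptions, not a reconstruction of the original syntax: the function names and all numbers are hypothetical, and the coefficients are assumed to be stored with one row per variable and one column per function, so that cases (n x p) times coefficients (p x k) gives scores (n x k). The post does not settle which SD (total-sample or pooled within-groups) reproduces the saved scores, so the `sds` argument should be whatever scaling the standardized coefficients were derived with.

```python
def z_score(case, means, sds):
    """Standardize one new case with the ORIGINAL (training) means and SDs."""
    return [(x - m) / s for x, m, s in zip(case, means, sds)]


def score_cases(cases, means, sds, std_coefs):
    """Score every case on every function.

    cases:     n rows, p variables each
    std_coefs: p rows (variables) by k columns (functions)  [assumed layout]
    returns:   n rows, k scores each
    """
    n_funcs = len(std_coefs[0])
    scores = []
    for case in cases:
        z = z_score(case, means, sds)
        # Row of z-scores times the coefficient matrix, written out explicitly.
        scores.append(
            [sum(z[i] * std_coefs[i][k] for i in range(len(z))) for k in range(n_funcs)]
        )
    return scores


# Hypothetical training statistics and standardized coefficients.
means = [10.0, 20.0]
sds = [2.0, 4.0]
std_coefs = [
    [1.0, 0.5],    # variable 1: function 1, function 2
    [-1.0, 0.25],  # variable 2: function 1, function 2
]

# Two hypothetical new cases; the second sits exactly at the training means.
new_cases = [[12.0, 16.0], [10.0, 20.0]]

scores = score_cases(new_cases, means, sds, std_coefs)
print(scores)  # [[2.0, 0.25], [0.0, 0.0]]
```

Writing the product as explicit sums sidesteps the row/column question: the only convention to keep straight is that the first index of `std_coefs` is the variable and the second is the function.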

Rich Ulrich


Kirill Orlov

Jan 15, 2020; 10:48am

Re: Calculation of discriminant function scores from raw data

My "MATRIX - END MATRIX" function !DISCRIM extracts discriminants the same way the SPSS DISCRIMINANT command does; the two are equivalent.

It reads:

*NAME5 - discriminant scores, they are DATA*NAME2, i.e. computed with the help of normalized eigenvectors
*(if OUT nonpositive) or with the help of discriminant coefficients (if OUT positive).
*If you need to supply the vector of constants to the unstandardized coefficients (to obtain un-centered
*discriminants) then it is equal: -csum(mdiag(MEAN)*NAME2), where MEAN is the vector of means of the original
*variables and NAME2 is the unstandardized canonical discriminant coefficients.
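
The constant formula quoted above, -csum(mdiag(MEAN)*NAME2), amounts to taking, for each function, minus the sum over variables of mean times coefficient. Here is a sketch of that relationship in plain Python. It is not the !DISCRIM macro itself: the function names and numbers are hypothetical, and the coefficients are assumed to be laid out with one row per variable and one column per function.

```python
def constants_from_means(means, unstd_coefs):
    """Per function: -(sum over variables of mean * unstandardized coefficient)."""
    n_funcs = len(unstd_coefs[0])
    return [
        -sum(m * row[k] for m, row in zip(means, unstd_coefs))
        for k in range(n_funcs)
    ]


def score(case, unstd_coefs, constants):
    """Per function: constant + sum(raw value * unstandardized coefficient)."""
    return [
        constants[k] + sum(x * row[k] for x, row in zip(case, unstd_coefs))
        for k in range(len(constants))
    ]


# Hypothetical means of the original variables and unstandardized coefficients.
means = [10.0, 20.0]
unstd_coefs = [
    [0.5, 0.25],    # variable 1: function 1, function 2
    [-0.125, 0.5],  # variable 2: function 1, function 2
]

constants = constants_from_means(means, unstd_coefs)
print(constants)  # [-2.5, -12.5]

# Scoring raw values with the constants matches scoring mean-centered values
# with no constant at all.
case = [12.0, 16.0]
uncentered = score(case, unstd_coefs, constants)
centered = score([x - m for x, m in zip(case, means)], unstd_coefs, [0.0, 0.0])
print(uncentered, centered)  # [1.5, -1.5] [1.5, -1.5]
```

One consequence that is easy to verify against real output: a case sitting exactly at the variable means should score zero on every function.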
