Calculation of discriminant function scores from raw data


Calculation of discriminant function scores from raw data

Ian Martin-3

I’m interested in taking some data cases not used in the original DFA and “projecting” these into the DF space, using the coefficients from the SPSS output.  

In this support doc:

it states that:
"The raw or unstandardized canonical function coefficients are used to compute the saved or printed discriminant function scores. The scores are computed by applying the regression-like equation of the constant plus each coefficient times the raw value of the appropriate variable, and summing."

I took the SPSS output table of Canonical Discriminant Function Coefficients (Unstandardized Coefficients) into Excel, along with the unstandardized original variables and tried to compute the scores of some observations used in the DFA, to make sure I was using the correct method before projecting in the new observations.   Basically just the regression equation:
DFscore = v1*c1 + v2*c2 + v3*c3 + ... + constant

The scores I got were close to those saved from the analysis, but not the same.  They differed enough that I don’t think we are talking about rounding errors or something. 
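
A minimal sketch of the calculation described above, in plain Python so that it can be compared with the Excel version. The coefficients, constants, and case values are made-up placeholders, not SPSS output, and `df_scores` is a hypothetical helper name. It handles several functions at once by keeping one coefficient per function for each variable.

```python
# Hypothetical values: replace them with the unstandardized "Canonical
# Discriminant Function Coefficients" table from your own output.
COEFFICIENTS = {            # variable -> one coefficient per function
    "v1": [0.50, -0.20],
    "v2": [0.25, 0.40],
    "v3": [-0.10, 0.30],
}
CONSTANTS = [-1.50, 0.75]   # the "(Constant)" row, one value per function


def df_scores(case, coefficients=COEFFICIENTS, constants=CONSTANTS):
    """Return one score per function: constant + sum(coefficient * raw value)."""
    scores = list(constants)            # start each function at its constant
    for variable, coefs in coefficients.items():
        for k, coef in enumerate(coefs):
            scores[k] += coef * case[variable]
    return scores


case = {"v1": 2.0, "v2": 4.0, "v3": 10.0}   # raw (unstandardized) values
print(df_scores(case))
```

If the hand-computed scores differ from the saved ones by the same amount for every case, the constant is the likely suspect; if the gap varies from case to case, the coefficients or the input values are.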

Any advice or comment appreciated.

Ian Martin


===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

Re: Calculation of discriminant function scores from raw data

Jon Peck
If you are doing this within Statistics, the easiest way would be to save the discriminant model by exporting it to an XML file and then using the Utilities > Scoring Wizard to do the predictions.

(In reply to Ian Martin's post of Tue, Jan 14, 2020 at 11:45 AM.)


--
Jon K Peck
[hidden email]


Re: Calculation of discriminant function scores from raw data

Kirill Orlov
In reply to this post by Ian Martin-3
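If you have the "old" (training) dataset, you can merge it with the new data points and rerun the analysis, this time using the Selection Variable field so that only the training cases are used to estimate the functions while the new cases are still scored.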



Re: Calculation of discriminant function scores from raw data

Art Kendall
In reply to this post by Ian Martin-3
Do you have the original dataset?
If so, append the new cases to the old set. Create a value in the DFA
grouping variable that is not in the original set of values.

Then, in the classification phase, treat that value as ungrouped.

It is often useful to keep the original grouping variable and look at the
assigned values vs the original values
vs the assigned values of the ungrouped cases.

Are the discriminating variables items from a summative scale?

Does the output from the first DFA suggest items could be grouped into sets
for a summative scale score?



-----
Art Kendall
Social Research Consultants

Re: Calculation of discriminant function scores from raw data

Rich Ulrich
In reply to this post by Ian Martin-3
What I recall from 15 years ago is that I never got the Unstandardized
Coefficients to work.  What worked - what gave numbers the same as
the DF procedure - was to use the standardized set and apply them to
the z-scored version of a set of data. 

For my external data (I had multiple sets, too), the "z-scoring" for new data
used DO REPEAT with the mean and SD of the original dataset. If I recall
correctly, I wrote out the z-scores by themselves and used MATRIX to
read them and score all of the functions at once, using matrix multiplication.
 
- I think I had to experiment to get the matrix multiplication to work,
possibly because I was leaning on my math education and on Fortran
conventions, which themselves differ, instead of focusing on SPSS MATRIX.
(Is it var(row, column) or var(column, row)?)
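
A rough sketch of the z-score route described above, in plain Python rather than SPSS MATRIX. All numbers are invented, and `z_scores` and `matmul` are hypothetical helper names. One assumption to verify before relying on it: the SD used for z-scoring has to be the same one the standardized coefficients were scaled by. As far as I know SPSS scales them by the pooled within-groups SD, so the total-sample SD may not reproduce the saved scores.

```python
# Hypothetical numbers for illustration; none of them come from the thread.
MEANS = [10.0, 20.0]          # per-variable means from the ORIGINAL dataset
SDS = [2.0, 5.0]              # per-variable SDs from the ORIGINAL dataset
STD_COEFS = [                 # rows = variables, columns = functions
    [0.8, -0.3],
    [0.4, 0.9],
]


def z_scores(rows, means=MEANS, sds=SDS):
    """Standardize each new case with the original sample's mean and SD."""
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def matmul(a, b):
    """(n x p) times (p x k) -> (n x k); every index is (row, column)."""
    assert all(len(row) == len(b) for row in a), "inner dimensions must agree"
    return [[sum(a_ij * b[j][k] for j, a_ij in enumerate(row))
             for k in range(len(b[0]))] for row in a]


new_cases = [[12.0, 15.0], [10.0, 20.0]]     # rows = cases, columns = variables
scores = matmul(z_scores(new_cases), STD_COEFS)
print(scores)                                 # one row per case, one column per function
```

The layout here is cases by variables times variables by functions, with the row index first, which is also how SPSS MATRIX addresses elements. A case sitting exactly at the original means scores zero on every function, which makes a quick sanity check.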

--
Rich Ulrich


Re: Calculation of discriminant function scores from raw data

Kirill Orlov
My "MATRIX - END MATRIX" function !DISCRIM extracts discriminants the same way the SPSS DISCRIMINANT command does; the two are equivalent.

It reads:

*NAME5 - discriminant scores, they are DATA*NAME2, i.e. computed with the help of normalized eigenvectors
*(if OUT nonpositive) or with the help of discriminant coefficients (if OUT positive).
*If you need to supply the vector of constants to the unstandardized coefficients (to obtain un-centered
*discriminants) then it is equal: -csum(mdiag(MEAN)*NAME2), where MEAN is the vector of means of the original
*variables and NAME2 is the unstandardized canonical discriminant coefficients.
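
The constant formula quoted above, -csum(mdiag(MEAN)*NAME2), can be sketched in plain Python as follows. The means and coefficients are invented, and `constants` and `score` are hypothetical helper names; only MEAN and NAME2 follow the macro's naming. The sketch shows that each function's constant is minus the sum of mean times coefficient, so a case at the variable means scores zero on every function.

```python
MEAN = [10.0, 20.0, 30.0]     # hypothetical means of the original variables
NAME2 = [                     # hypothetical unstandardized coefficients,
    [0.50, -0.20],            # rows = variables, columns = functions
    [0.25, 0.40],
    [-0.10, 0.30],
]


def constants(mean, coefs):
    """Mirror -csum(mdiag(MEAN)*NAME2): minus the column sums of mean_j * coef_jk."""
    n_funcs = len(coefs[0])
    return [-sum(m * row[k] for m, row in zip(mean, coefs))
            for k in range(n_funcs)]


def score(case, coefs, const):
    """Constant plus coefficient times raw value, summed, for each function."""
    return [c + sum(x * row[k] for x, row in zip(case, coefs))
            for k, c in enumerate(const)]


const = constants(MEAN, NAME2)
print(const)                        # one constant per function
print(score(MEAN, NAME2, const))    # the mean case lands at the origin
```

Comparing constants derived this way with the "(Constant)" row in the output is one way to check that the means and coefficients being used in a hand calculation match those of the analysis sample.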




(In reply to Rich Ulrich's post of 15.01.2020 0:48.)