Hi All,
I have run a series of discriminant function analyses (DFA) in SPSS to compare the utility of different variables in classification (in my case, whether a bird is male or female) and to derive a user-friendly function. I had four variables (measurements of bird size), so I ran: 1) four analyses with a single variable each, 2) a full model with all four, and 3) a stepwise model (which produced a model with two variables). I used leave-one-out cross-validation to 'test' model performance.

- Single variables had classification results of 66.2%, 66.9%, 83.1% and 83.1% (the last two coincidentally had the same value; they didn't agree on the classifications of all cases)
- The full model classified 87.7% of cases correctly
- The stepwise model classified 88.3% of cases correctly (it included a variable, call it A, with 83.1% classification on its own, and a variable, call it B, with 66.9% classification)

My main question is: how can a model with a subset of the variables perform better than a model with all variables? I've been told "It is impossible, in any sensible world, for a model based on only two parameters to perform better [than a model with more parameters]. By analogy, if a multiple regression - a technique closely related to LDA - came up with a higher r-squared using a subset of variables than obtained by using all of them, you would immediately go looking for the error. We do have an error here." (Note this reviewer refers to DFA as Linear Discriminant Analysis for some reason - as if there aren't enough terms for the same statistical tests!)

Second to that: am I correct in interpreting that the inclusion of variable B (which correctly classified only 66.9% of cases) in the stepwise model with variable A, instead of the other better-performing variable (call it C), is because variable A correctly classified more of the cases incorrectly classified by B than expected by chance, whereas variable C and the other variables did not? Also, why is variable A selected rather than variable C when they had the same classification accuracy alone? Is it because variables C and B share more correct classifications in common than B does with A?

Any enlightenment will be greatly appreciated so I can prepare a response to the reviewer's comment.

Thanks,
Dean

PS: I have re-run the analyses to confirm the results and I get the same classification percentages. (I'm using a single SPSS worksheet and simply replacing variables in the 'Independents' list in the 'Discriminant Analysis' window, or adding them all to the independents list for the full model and leaving 'Enter Independents Together' checked, or adding them all and selecting 'Use Stepwise Method'. I double-checked the stepwise result by running an analysis with the two variables the model had selected and choosing 'Enter Independents Together' - same result, as expected. All cases have values for all variables, i.e. the sample size remains unchanged.)

****************************************
Dean Portelli
PhD candidate
School of Biological, Earth and Environmental Sciences
University of New South Wales
Sydney AUSTRALIA 2052
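A minimal sketch of this kind of comparison outside SPSS, for anyone who wants to see how a two-variable subset can beat the full model under leave-one-out scoring. It uses scikit-learn's LDA; the file name and the column names (wing, tarsus, culmen, mass) are hypothetical placeholders, not the actual measurements or the actual stepwise selection:

import pandas as pd
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import LeaveOneOut, cross_val_score

birds = pd.read_csv("birds.csv")                    # assumed data file, sex coded 0/1
y = birds["sex"].values
all_vars = ["wing", "tarsus", "culmen", "mass"]     # stand-ins for the four measurements
subset   = ["wing", "mass"]                         # e.g. the two stepwise-selected variables

loo = LeaveOneOut()
for label, cols in [("full model (4 vars)", all_vars), ("subset (2 vars)", subset)]:
    acc = cross_val_score(LinearDiscriminantAnalysis(), birds[cols].values, y, cv=loo).mean()
    print(f"{label}: {acc:.1%} correct under leave-one-out")

# Under leave-one-out, each case is classified by a model fitted *without* it,
# so the extra (noisier) coefficients of the full model can cost accuracy --
# a smaller model can legitimately come out ahead.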
I will mention first that leave-one-out validation is better than no validation; but stepwise methods can call for much more extensive validation than that, if you want robust results that extend reliably to other samples.

As to your "main question" -- DFA is mathematically a version of regression on a 0/1 criterion, where R^2 is the criterion. This criterion is always "improved" by including more variables, so the assumption in your question is wrong. What is improved is the sum of the squared deviations from predicting 0 or 1. (And, relevant to a comment below, an instance that is predicted as "less than 0" or "greater than 1.0" starts adding more to the error term, instead of being treated as fine success in prediction.)

The rate of "correct classifications" is an ancillary statistic, which can be varied (for instance) by changing the cut-point used for classes. It is not a criterion for the equation. In fact, when the two groups are massively different in Ns, it is likely that you get more "correct classifications" by labeling all cases as <big group> than by doing any analysis at all. "90 versus 10" yields "90% right" for the no-analysis option. On the other hand, a "balanced" solution with 20% errors in each group will only have "80% right".

Many people prefer Logistic Regression over DFA for all their modeling. The weaknesses of LR are for small Ns, and prediction that is "too perfect". I also like the statistics and presentation of means that I get with DFA. However, the DFA is a model that is less precisely appropriate, especially for instances with very good discrimination, as you seem to have.

As to coefficients: Both DFA and LR provide "partial regression coefficients"; those show a unique contribution beyond what is contributed by other variables in the equation. When predictors are highly correlated, it is possible that their *difference* may also be predictive, in which case their two coefficients will be unusually large and have opposite signs (see: suppressor variables).

--
Rich Ulrich
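To make the "90 versus 10" point concrete, a tiny back-of-the-envelope calculation with made-up group sizes:

n_big, n_small = 90, 10                                   # hypothetical group sizes

# "No analysis": call every case a member of the big group.
pcc_no_analysis = n_big / (n_big + n_small)               # 90% "correct"

# A genuine rule that classifies 80% correctly *within each* group.
pcc_balanced = (0.80 * n_big + 0.80 * n_small) / (n_big + n_small)   # 80% correct

print(f"label everything as the majority: {pcc_no_analysis:.0%} correct")
print(f"balanced rule, 80% per group:     {pcc_balanced:.0%} correct")

# The balanced rule is clearly the better predictor, yet its overall
# percent-correct is lower -- percent classified correctly is not what
# the discriminant equation optimizes.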
Hi Rich,

Thanks for the prompt response! I have tried to digest what you have said, but I think I'm a little lost (which is quite normal).

RICH: "As to your 'main question' -- DFA is mathematically a version of regression on a 0/1 criterion, where R^2 is the criterion. This criterion is always 'improved' by including more variables, so the assumption in your question is wrong. What is improved is the sum of the squared deviations from predicting 0 or 1."

I understand that any increase in explanatory variables/parameters in a model would increase the coefficient of determination, R^2 (i.e. a greater proportion of variance 'explained' collectively by the variables), but do I understand you correctly that R^2 isn't predictably related to classification accuracy? Therefore, the assumption that more variables will increase classification accuracy is incorrect.

On a related matter, I have reported r (the canonical correlation coefficient) for each of the discriminant analyses, but I'm not sure how to interpret this value. Is it analogous to r from a simple OLS regression? Should I report R^2 instead?

RICH: "The rate of 'correct classifications' is an ancillary statistic, which can be varied (for instance) by changing the cut-point used for classes. It is not a criterion for the equation."

I'm afraid I don't understand this at all, but it seems important to understanding how the classification accuracy is influenced. I think I'll gradually build an adequate understanding to defend my analysis.

Many thanks,
Dean
In reply to this post by Rich Ulrich
Hi Rich,

Firstly, I'm not sure what has happened with the threads of this post - there are duplicate copies of my first email and your first reply in the thread on the SPSSX Discussion archives. I only realised after I sent the recent email to you about posting that your reply was tacked onto the bottom of your email. Your reply to my most recent email, which I understand has not gone to the group, appears to have been snipped, so it may not be there in full?

Thanks very much for the confirmation and clarification. I'm confident I understand the salient points to interpreting R^2:
- R^2 in DFA is a pseudo-R^2 that is calculated differently to least-squares regression, as predictions from DFA are either 1 or 0
- R^2 and classification are not directly related
- Model 'performance' is assessed by comparing predicted group memberships with actual membership, rather than through interpretation of the R^2 (which is what I thought initially).

In which case, does the R^2 value provide any additional pertinent information about the analysis?

I have inserted my queries amongst your blue text:

"Look at the Predicted Values: DF is predicting to 0/1 scores which are *not* probabilities."
- I get this bit, i.e. the 'fitted value' from the linear model (DFA) is either a 0 or 1 (group membership). Is R^2 calculated the same way in MANOVA?

"It upsets non-statisticians to see predicted values outside the range, because they were comfortable thinking of them as p."
- Not sure what 'range' refers to.

"Logistic Regression predicts to log(p/(1-p)) -- which does include 'probability' on an infinite scale."
- I'm lost here; I don't understand how a regression 'predicts to' probabilities. Going back to basics, my understanding is that a fitted model predicts a value of the dependent variable for each case based on the independent variables. The discrepancy between these predicted and actual values is the basis upon which model performance is assessed (i.e. sums of squares).

Cheers,
Dean
- I did reply to the group + private address, so you may yet receive another copy of mine. Your reply that I see right now is addressed only to me, so this reply is not going to the group.
> Hi Rich,
> Thanks for the prompt response! I have tried to digest what you have said, but I think I'm a little lost (which is quite normal).

As to your "main question" -- DFA is mathematically a version of regression on a 0/1 criterion, where R^2 is the criterion. This criterion is always "improved" by including more variables, so the assumption in your question is wrong. What is improved is the sum of the squared deviations from predicting 0 or 1.

Look at the Predicted Values: DFA is predicting to 0/1 scores which are *not* probabilities. It upsets non-statisticians to see predicted values outside the range, because they were comfortable thinking of them as p. Logistic Regression predicts to log(p/(1-p)) -- which does include "probability" on an infinite scale.

Yes, R^2 is used for "least squares" statistics like DFA. "Maximum Likelihood" Logistic Regression does not have an immediate counterpart. There are at least three versions of pseudo-R^2 in occasional use, but none of them can convert a "best" prediction at plus or minus infinity to a sum of squares for deviations, to be translated to R^2.

> I understand that any increase in explanatory variables/parameters in a model would increase the coefficient of determination, R^2 (i.e. a greater proportion of variance 'explained' collectively by the variables), but do I understand you correctly that R^2 isn't predictably related to classification accuracy? Therefore, the assumption that more variables will increase classification accuracy is incorrect.

Right.

> On a related matter, I have reported r (the canonical correlation coefficient) for each of the discriminant analyses, but I'm not sure how to interpret this value. Is it analogous to r from a simple OLS regression? Should I report R^2 instead?

Well, yeah, I think so, but the statistic reported with the test is Wilks's lambda. I think that it equals (1-R^2) for the two-group case, if I recall correctly.

The rate of "correct classifications" is an ancillary statistic, which can be varied (for instance) by changing the cut-point used for classes. It is not a criterion for the equation.

> I'm afraid I don't understand this at all, but it seems important to understanding how the classification accuracy is influenced.

This is a version of regression. There is a prediction equation. There is a default cut-off for dividing groups on a predicted score. You can change the cut-off used by the program by means of the "prior probability" option (I think that is what it is called). If you change the priors to 2:1, you will put 90%+ of the cases into one group if they started out 50-50. The easier way to look at the effect of cut-offs is to sort the cases in order of predicted score, and look at the cumulative count of correct and incorrect predictions.

> I think I'll gradually have an adequate understanding to defend my analysis.

<snip, previous...>
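The "sort the cases by predicted score" exercise can be mocked up in a few lines. A sketch only: the scores and group labels are assumed to come from some fitted two-group model (e.g. scikit-learn's decision_function), not from SPSS output:

import numpy as np

def counts_by_cutoff(scores, y):
    """For each candidate cut-off, count correct and incorrect classifications."""
    order = np.argsort(scores)
    s = np.asarray(scores, dtype=float)[order]
    truth = np.asarray(y, dtype=int)[order]
    for cut in s:
        pred = (s > cut).astype(int)            # above the cut-off -> group 1
        correct = int((pred == truth).sum())
        print(f"cut-off {cut:+.3f}: {correct} correct, {len(s) - correct} incorrect")

# Example use, with hypothetical inputs:
#   scores = LinearDiscriminantAnalysis().fit(X, y).decision_function(X)
#   counts_by_cutoff(scores, y)

# Edging the cut-off up or down by a whisker picks up or drops a case or two,
# which is why percent-correct can wobble without the underlying equation changing.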
Dean, and the List --
I gave a private reply to Dean which (a) he possibly did not see all of, and (b) he misinterpreted much of in what he summarizes. My response had included a few statements about Logistic Regression, since that is often a preferred alternative to DFA, especially when prediction is strong (as his is). I will correct his "salient points", and then try to contrast DFA and LR.

Dean's salient points -

1) R^2 in DFA is exactly the same as it would be for OLS regression with a 0/1 outcome, since the two are expressing exactly the same model. DFA presents correlations and coefficients using a "within group" basis for standardization, so it is mainly the tests that show up as exactly the same. R^2 is the same for all least-squares procedures, including DFA, ANOVA, MANCOVA.

2) Well, a better R^2 correlates well with better classification when group sizes are equal. But they are certainly not the same thing.

3) Predicted versus actual group membership (using default cut-off scores to classify; ignoring group Ns) is *not* a very good measure of "model performance". For DFA (where R^2 is available), Wilks's Lambda (1-R^2) is the primary measure, and you can look at the p-value. For LR, various "pseudo-R^2"s have been suggested, but none works well; you are stuck with the overall chi-squared, or its p-value. The "percent classified correctly" lets you compare some alternate models informally, but it is not a criterion being optimized, and it often could be "improved" as a number by merely adjusting the cut-off score slightly, since there is always a cut-off to separate two groups.

Both DFA (or regression) and LR are methods to derive a linear prediction equation. They extract weights to apply to the predictors, and the result is a SCORE for each case. Then a cut-off line is applied, which places individuals into one of the two groups. This line is, in some arbitrary sense, "in the middle". It is not adjusted to pick up one or two more correct classifications by edging higher or lower. This could be the main reason that Dean saw changes in "percent" which were not consistent with the improvement in the overall R^2 when another variable was added.

Also, as a standard, Percent Classified Correctly (PCC) tends to fail horribly when group sizes are drastically different. For instance, when 90% of cases are Group 1, you can achieve 90% PCC by calling every case "Group 1". But this is not effective prediction, whether you do it by hand or by a computer program. A less arbitrary, more general standard of performance is obtained when you require that the line be drawn to produce equal error rates for each group. (This can be thought of as Sensitivity and Specificity, if you know those terms.) Thus, for the "90%" example, if you achieve 80% accuracy for *each* group, you have done some effective prediction -- however, clearly, you have now reduced your PCC from 90% to 80%.
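A quick numeric check of point (1), and of the two-group identity mentioned earlier (Wilks's Lambda = 1 - R^2). The data here are simulated "size" measurements, not Dean's birds:

import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 150
sex = rng.integers(0, 2, n)                                          # 0/1 group membership
X = rng.normal(size=(n, 4)) + sex[:, None] * [1.0, 0.3, 0.8, 0.1]    # fake size measurements

r2 = LinearRegression().fit(X, sex).score(X, sex)                    # R^2 of the 0/1 regression
print(f"R^2 = {r2:.3f}, implied Wilks' lambda = {1 - r2:.3f}")

# In the two-group case this same R^2 shows up as the squared canonical
# correlation in the DFA output -- so the reviewer's regression analogy does
# hold for R^2, just not for the percent classified correctly.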
Predicted Scores. For this topic, it is easier to refer to the 0/1 regression expression of DFA than to some other DFA formulation. What you get as predicted scores for each case usually ranges between the extremes of 0 and 1. This leads to the complacent (or naive) interpretation of those scores as being a "probability of group membership" (though they are never, really, that). So long as adding another variable moves the "prediction" closer to 0 or 1, you get an improvement in R^2 -- since the criterion is the sum of the squared deviations. This will be the case so long as prediction is pretty mediocre.

Unfortunately for our convenience, a set of several "good" predictors can result in predicted scores that are greater than 1 or less than 0. First, this makes it obvious that the scores are not really "probabilities of group membership." Second, the "well-predicted" cases, the ones that are most extreme, start *adding* to the squared deviations as they get further from 0 or 1. That is not desirable, since it decreases the R^2. (Example of how this works: consider two rare predictors, uncorrelated, each of which "practically insures" membership in Group 1, so they both should have large regression coefficients. Now, for the rare*rare combination case, we are sure that it is Group 1. But the added regression score is far *beyond* 1.0, and therefore is penalized... or, rather, results in biasing the coefficients towards smaller size.)

Logistic Regression does not have the problem of "over-prediction." Thus, LR is a correct theoretical construal of the prediction problem in this aspect where DFA fails (for high R^2). In place of "least squares", LR draws on another method of statistical estimation: it uses the "likelihood function" and "maximum likelihood estimates" (MLE) to obtain tests on the difference between log-likelihoods for fitted models. MLE provides great flexibility in models, as illustrated in LR by setting up the explicit scale of prediction as the "logit", which is log(p/(1-p)). This has no problem with over-prediction, since it is infinite at both extremes. Also, the predicted logit can be translated back to a predicted "probability of group membership", which is what a lot of people want.

What LR does not have is a convenient way of comparing models on the absolute scale that R^2 seems to provide. There are several suggestions for pseudo-R^2 for LR, but none has won out. There are other shortfalls of the LR procedures of today as a total replacement for DFA. DFA has ancillary statistics that are nice, including diagnostics that are standard or available. LR can blow up or give unstable coefficient values without warning you, especially when its predicted separation nears 100%. (People are working on these problems.)
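The over-prediction contrast can be seen directly by fitting both models to the same made-up, well-separated data; nothing here comes from the actual analysis:

import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(1)
n = 200
X = rng.normal(size=(n, 2))
y = (X[:, 0] + X[:, 1] + 0.5 * rng.normal(size=n) > 0).astype(int)   # well-separated groups

ols_pred = LinearRegression().fit(X, y).predict(X)                   # can fall outside [0, 1]
lr_prob = LogisticRegression().fit(X, y).predict_proba(X)[:, 1]      # always within (0, 1)

print("OLS 'probabilities' outside [0, 1]:", int(((ols_pred < 0) | (ols_pred > 1)).sum()))
print("logistic probabilities outside [0, 1]:", int(((lr_prob < 0) | (lr_prob > 1)).sum()))

# The clearest cases get 0/1-regression predictions beyond 0 or 1 and are then
# penalized by least squares -- the bias toward smaller coefficients described
# above. The logistic fit works on the logit scale, so there is no such penalty.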