Hello,
Is there any agreement about the expected classification accuracy for logistic regression in the social sciences? My study predicts a dichotomous variable from personal characteristics and the environment (categorical, ordinal and numerical variables).

Another question: how does SPSS calculate the percentage correct when I run a logistic regression? It reports 70% correct, but when I calculate the predictions manually with the coefficients and the logistic equation, p = 1 / (1 + e^(-(b0 + b1*X1 + ... + bk*Xk))), my result gives only 35% accuracy.

Thanks!
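For concreteness, a minimal Python sketch of the manual calculation I mean (the coefficients and case values below are made up purely for illustration, not from my model):

import numpy as np

# Hypothetical coefficients, as they would appear in the B column of SPSS's
# "Variables in the Equation" table; the values are invented for this example.
b0 = -1.2                        # constant
b  = np.array([0.8, -0.5, 0.3])  # slopes for three predictors

x = np.array([1.0, 2.0, 0.0])    # one case's predictor values (also invented)

eta = b0 + np.dot(b, x)          # linear predictor: b0 + b1*x1 + ... + bk*xk
p   = 1.0 / (1.0 + np.exp(-eta)) # logistic transformation gives the probability
predicted_group = 1 if p > 0.5 else 0   # classify with the default 0.5 cutoff
print(f"p = {p:.3f}, predicted group = {predicted_group}")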
Getting "35% accuracy" suggests that you should look at
the predictions being in the opposite direction: With just two outcomes, that must be 65% correct. Comparing 65% accuracy with 70% accuracy suggests that, maybe, you are drawing the line in the wrong place. 1) Can't you compare your hand calculations to what the program will print? 2) Isn't the procedure documented in the manual or on-line? -- Rich Ulrich > Date: Thu, 11 Jul 2013 08:14:52 -0400 > From: [hidden email] > Subject: Logistic regression in social sciences > To: [hidden email] > > Hello, > > There is some agreement about the expected value for logistic regression in > social sciences? > My study case predict a dichotomous variable, according to personal > characteristics and the environment(categorical, ordinal and numerical > variables). > > Another question is, how SPSS calculates the percentage of success when I > run a logistic regression? It says that 70% is correct, but when calculating > manually using the coefficients and equation (1/1-e (- (coefficients))), my > result gives only 35% accuracy. > > Thanks! > > ===================== |
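To illustrate the "opposite direction" point above, a quick Python sketch (the arrays are invented stand-ins for the observed votes and the hand-computed probabilities; with two outcomes, the accuracy under one coding and under the flipped coding always sum to 100%):

import numpy as np

observed = np.array([1, 0, 1, 1, 0, 0, 1, 0])           # observed 0/1 votes
p_hat    = np.array([0.7, 0.4, 0.6, 0.8, 0.3, 0.6, 0.9, 0.2])  # manual probabilities

pred = (p_hat > 0.5).astype(int)                          # 0.5 cutoff

acc          = np.mean(pred == observed)                  # accuracy as computed
acc_reversed = np.mean((1 - pred) == observed)            # accuracy if 0/1 is flipped

print(f"accuracy = {acc:.0%}, reversed coding = {acc_reversed:.0%}")
# So 35% "correct" under one coding is 65% correct with the outcome coded the other way.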
In reply to this post by Lari Ono
My outputs are 0 (negative vote) or 1 (positive vote).

The 39% accuracy of the manual simulation is the sum of the hit percentages for votes 1 and 0. But the 70% obtained with SPSS is also the sum of the hit percentages for votes 1 and 0, so I found it strange. Both used the 0.5 cutoff value (> 0.5 positive vote, < 0.5 negative vote).

My model has 21 variables and I used 10 records per variable to create the sample, i.e., 210 records. I tested the model on a population of about 12,000 records. I will take a look at the SPSS algorithms to confirm. Thanks.

Below are my data:

        Total    Predicted    Not predicted
  N     12065    4807         7258
  %              39.8%        60.2%
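For comparison, here is a sketch of how an overall "percentage correct" like the one in SPSS's classification table can be computed with a 0.5 cutoff: cases whose predicted group matches the observed group, summed over both categories and divided by the total N. The arrays are invented stand-ins for the real votes and probabilities.

import numpy as np

def classification_table(observed, p_hat, cutoff=0.5):
    """Crosstab of observed vs. predicted group plus overall percent correct,
    mirroring the layout of a logistic regression classification table."""
    pred = (p_hat > cutoff).astype(int)
    table = np.zeros((2, 2), dtype=int)
    for obs, prd in zip(observed, pred):
        table[obs, prd] += 1
    overall_pct_correct = 100.0 * np.trace(table) / table.sum()
    return table, overall_pct_correct

observed = np.array([1, 0, 1, 1, 0, 0, 1, 0])   # illustrative data only
p_hat    = np.array([0.7, 0.4, 0.6, 0.8, 0.3, 0.6, 0.9, 0.2])

table, pct = classification_table(observed, p_hat)
print(table)                      # rows = observed 0/1, columns = predicted 0/1
print(f"{pct:.1f}% correct overall")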
In reply to this post by Lari Ono
My data: http://we.tl/s8U1WsxD3X
In reply to this post by Lari Ono
Hi!
Rich, thanks for the help! About the rule "10 cases in the *smaller* group for every 10 degrees of freedom of predictors" - would it work like this?

Variable "Rank": A, B, C, D, E. Data (Vote;Rank):

  1;A  0;A  1;B  1;C  0;D
  1;A  0;A  1;E  1;B  1;C
  0;D  1;A  0;A  1;B  1;C  0;D
  ...

Would I collect random records until I reach 10 occurrences of Rank "E" (the category with the fewest occurrences)? Or does a simple random sample, given these conditions, solve the problem? I will calculate a new sample and post the result!
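If useful, a small sketch of checking the Rank counts in a simple random sample before fitting (pandas is assumed; the file name and column name are hypothetical stand-ins for my data):

import pandas as pd

df = pd.read_csv("votes.csv")               # full data set with Vote and Rank columns

sample = df.sample(n=500, random_state=1)   # simple random sample of 500 records
counts = sample["Rank"].value_counts()      # occurrences of each Rank category
print(counts)
print("smallest Rank category:", counts.min())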
[re-post, using Nabble]
Sorry! How did I write this: "10 cases in the *smaller* group for every 10 degrees of freedom of predictors"? A TYPO! Or an editing error and a lack of good proof-reading. That should be about "10 cases in the smaller group for every d.f. of predictors." Again, SORRY!

You need a larger sample than the one you have taken. So if you have 25 d.f. among your predictors, you want a random sample that has at least 250 cases in the smaller of your outcome groups. If the outcomes split about 50-50, that is 500 or so in total; if they split 20-80, that is 1250 or so. If you are estimating the split from the whole sample, you might try 600, or 1400, to be pretty sure of meeting the "rule."

People doing cross-validation usually take multiple samples, using round numbers. One popular strategy, starting with a large enough N, is to divide the original sample into 10 equal-sized subsamples.

--
Rich Ulrich
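A sketch of that arithmetic, and of the 10-fold split mentioned above (the d.f. count and split proportion are the ones from the post; nothing here is SPSS-specific):

import numpy as np

# Rule of thumb: about 10 cases in the *smaller* outcome group per d.f. of predictors.
df_predictors = 25        # degrees of freedom among the predictors
p_smaller     = 0.20      # proportion in the smaller outcome group (a 20-80 split)

min_smaller_group = 10 * df_predictors                           # 250 cases in the smaller group
min_total         = int(np.ceil(min_smaller_group / p_smaller))  # about 1250 in total
print(min_smaller_group, min_total)

# 10-fold split for cross-validation: shuffle the row indices, then cut them
# into 10 roughly equal-sized pieces.
n = 12065
rng = np.random.default_rng(1)
folds = np.array_split(rng.permutation(n), 10)
print([len(f) for f in folds])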