Logistic regression in social sciences

Logistic regression in social sciences

Lari Ono
Hello,

Is there any consensus on the values to expect from a logistic regression in
the social sciences?
My case study predicts a dichotomous variable from personal
characteristics and the environment (categorical, ordinal, and numerical
variables).

My other question is: how does SPSS calculate the percentage of correct
predictions when I run a logistic regression? It reports 70% correct, but
when I calculate it manually using the coefficients and the equation
(1 / (1 + e^(-(coefficients)))), my result gives only 35% accuracy.

Thanks!

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Logistic regression in social sciences

Rich Ulrich
Getting "35% accuracy" suggests that you should check whether
your predictions run in the opposite direction:  with just
two outcomes, reversing them must give 65% correct.

Comparing 65% accuracy with 70% accuracy suggests that,
maybe, you are drawing the cutoff line in the wrong place.

1) Can't you compare your hand calculations to what the
program will print?
2) Isn't the procedure documented in the manual or on-line?
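
For anyone repeating the hand calculation, here is a minimal Python sketch (not SPSS code) of how a percent-correct figure can be computed from logistic coefficients with a 0.5 cutoff. The intercept, coefficients, and data rows are made-up values for illustration, and the function names are hypothetical. It also shows why an accuracy below 50% with two outcomes points to reversed predictions: flipping every prediction gives one minus the accuracy.

```python
import math


def predicted_probability(intercept, coefs, x):
    """Logistic function applied to the linear predictor z = b0 + sum(b_i * x_i)."""
    z = intercept + sum(b * v for b, v in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-z))


def percent_correct(probs, outcomes, cutoff=0.5):
    """Share of records whose predicted class (p > cutoff) matches the 0/1 outcome."""
    hits = sum((p > cutoff) == bool(y) for p, y in zip(probs, outcomes))
    return hits / len(outcomes)


# Made-up coefficients and records, for illustration only.
intercept = -1.0
coefs = [0.8, -0.5]
rows = [[2.0, 1.0], [0.0, 1.0], [3.0, 0.0], [1.0, 2.0]]
votes = [1, 0, 0, 0]

probs = [predicted_probability(intercept, coefs, r) for r in rows]
acc = percent_correct(probs, votes)

# Reversing the direction of every prediction turns accuracy a into 1 - a
# (as long as no probability sits exactly on the cutoff).
flipped = percent_correct([1.0 - p for p in probs], votes)

print(acc, flipped)
```

When comparing against the program's output, it may be worth checking which outcome code the program treats as the predicted event and how it codes categorical predictors, since either could reverse or shift a hand calculation.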

--
Rich Ulrich

> Date: Thu, 11 Jul 2013 08:14:52 -0400

> From: [hidden email]
> Subject: Logistic regression in social sciences
> To: [hidden email]

Re: Logistic regression in social sciences

Lari Ono
In reply to this post by Lari Ono
My outcomes are 0 (no vote) or 1 (yes vote).

The 39% accuracy of the manual calculation is the combined hit rate over
votes 1 and 0.

But the 70% obtained with SPSS is also the combined hit rate over
votes 1 and 0, so I found the difference strange.

Both used the 0.5 cutoff value (> 0.5 is a yes vote and < 0.5 is a no vote).

My model has 21 variables and I used 10 records per variable to create the
sample, i.e., 210 records.

I tested the model on a population of 12,000 records.

I will take a look at the SPSS algorithms to confirm.


Thanks.
Below are my data.

----
             Total   Predicted           Not predicted
Count        12065   4807                7258
Proportion           0.398425196850394   0.601574803149606
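
As a quick arithmetic check on the table above, the two proportions can be recomputed from the counts. This is only a sketch in Python using the three counts given in the post; the variable names are made up. The "not predicted" share is exactly one minus the "predicted" share, which is the figure that reversed predictions would produce.

```python
# Counts taken from the table in the post.
total = 12065
predicted = 4807
not_predicted = 7258

# The two groups should account for every record.
assert predicted + not_predicted == total

share_predicted = predicted / total
share_not_predicted = not_predicted / total

print(f"{share_predicted:.4f} {share_not_predicted:.4f}")  # prints "0.3984 0.6016"
```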

Re: Logistic regression in social sciences

Lari Ono
In reply to this post by Lari Ono
My data: http://we.tl/s8U1WsxD3X

Re: Logistic regression in social sciences

Lari Ono
In reply to this post by Lari Ono
Hi!

Rich, thanks for your help!
Would the rule "10 cases in the *smaller* group for every 10 degrees of
freedom of predictors" work like this?

Variable "Rank" A, B, C, D, E;

Data:

Vote; Rank
1;A
0;A
1;B
1;C
0;D
1;A
0;A
1;E
1;B
1;C
0;D
1;A
0;A
1;B
1;C
0;D
...

Would I collect random data until I reach 10 occurrences of Rank "E" (which
has the fewest occurrences)?

Would a simple random sample that meets these conditions solve this problem?

I will calculate a new sample and post the result!

Re: Logistic regression in social sciences

Rich Ulrich
[re-post, using Nabble]

Sorry!  How did I come to say this: "10 cases in the *smaller*
group for every 10 degrees of freedom of predictors"?

TYPO!  Or an editing error and a lack of good proof-reading.

 - That should be about "10 cases for every d.f. of predictors."
Again, SORRY!  You need a larger sample than the one you have
taken.

So if you have 25 d.f.  among predictors, you want to take
a random sample that has at least 250 cases in the smaller
of your outcome groups.  If outcomes split about 50-50, that
is 500 or so for the total.  If outcomes split 20-80, that is 1250
or so.
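
The arithmetic of the rule can be sketched in Python as follows. This is only an illustration: the function names are made up, and the d.f. count assumes each numeric predictor uses 1 d.f. and a categorical predictor with k levels uses k - 1 (the usual dummy-coding count), which may differ from how a given model is actually specified.

```python
def predictor_df(numeric_count, categorical_levels):
    """Total d.f. among predictors: 1 per numeric, k - 1 per k-level categorical."""
    return numeric_count + sum(k - 1 for k in categorical_levels)


def minimum_total_n(df, smaller_group_percent, cases_per_df=10):
    """Total N needed so the smaller outcome group holds cases_per_df * df cases.

    smaller_group_percent is the expected share of the smaller outcome group,
    given as a whole-number percentage between 1 and 50.
    """
    if not 0 < smaller_group_percent <= 50:
        raise ValueError("smaller_group_percent must be in 1..50")
    needed_in_smaller_group = cases_per_df * df
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-needed_in_smaller_group * 100 // smaller_group_percent)


print(minimum_total_n(25, 50))  # 50-50 split
print(minimum_total_n(25, 20))  # 20-80 split
print(predictor_df(0, [5]))     # a single 5-level variable such as Rank A..E
```

With 25 d.f., this gives 500 for a 50-50 split and 1250 for a 20-80 split, matching the figures above.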

If you are estimating from the whole sample, you might try
600, or 1400, to be pretty sure of meeting the "rule."   People
doing cross-validations usually take multiple samples, using
round numbers.   One popular strategy for cross-validations,
starting with large enough N, is to divide the original into 10
equal-sized samples.  
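
A minimal sketch of that ten-way split, in Python with only the standard library. The function name and the fixed seed are assumptions for illustration; the record count of 12065 is the population size reported earlier in the thread. Each record lands in exactly one sample, and the sample sizes differ by at most one.

```python
import random


def ten_fold_indices(n_records, n_folds=10, seed=0):
    """Shuffle record indices and deal them into n_folds near-equal samples."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)  # seeded so the split is repeatable
    # Taking every n_folds-th index gives folds whose sizes differ by at most 1.
    return [indices[i::n_folds] for i in range(n_folds)]


folds = ten_fold_indices(12065)
sizes = [len(fold) for fold in folds]
print(sizes)
```

In a cross-validation, each sample would take a turn as the test set while the model is estimated on the other nine.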

--
Rich Ulrich

