why R-square so low?

why R-square so low?

vando
I really need help from all of you.
I am a bit stressed out because of the SPSS output for my final task.

Here is the case: it's a multiple regression with n = 35.
There is one dependent variable, Y, and a set of predictors, X.
X consists of 5 independent variables: age, gender, marital status, work time (years), and education.
The dummy variables are gender (m/f), education (bachelor/master), and marital status (married/not).

I ran it in SPSS via Analyze - Regression - Linear and got a disappointing output.
In theory, age, gender, marital status, work time (years), and education all have a significant influence on Y.

BUT my R-square with all the independent variables together is only 0.141.
I also tested the independent variables one by one: the biggest R-square is only 0.11, and the others are around 0.06, 0.08, etc.

How could this happen? Am I doing something wrong? Please help me.

:')

Re: why R-square so low?

Vik Rubenfeld
In reply to this post by vando
You might try doing a quick crosstab of variable Y by each variable x to see if in fact a strong correlation is present. It will be easily visible in crosstabs if present.

Best,


-Vik


=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: why R-square so low?

Rich Ulrich
In reply to this post by vando
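Your achieved R^2 is about what you would expect by chance for 5
predictors and an N of 35: 5/35 = 1/7 = 0.14 (more exactly, 5/(35 - 1) = 0.147,
the expected R^2 when the predictors are unrelated to the outcome).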
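
As a rough check on the "by chance" figure, here is a minimal Python sketch (not SPSS output, and not the poster's data). It assumes a normally distributed outcome and purely random predictors, and the function names are made up for illustration. It simulates the average R^2 for 5 unrelated predictors and N = 35, then computes the adjusted R^2 and the overall F statistic implied by the reported R^2 of 0.141.

```python
import random


def r_squared(y, predictors):
    """R^2 from least squares of y on the predictors plus an intercept."""
    def center(v):
        mean = sum(v) / len(v)
        return [x - mean for x in v]

    def dot(a, b):
        return sum(p * q for p, q in zip(a, b))

    yc = center(y)
    basis = []  # mutually orthogonal vectors spanning the centered predictors
    for column in predictors:
        v = center(column)
        for b in basis:
            coef = dot(v, b) / dot(b, b)
            v = [vi - coef * bi for vi, bi in zip(v, b)]
        if dot(v, v) > 1e-12:  # ignore a column that adds nothing new
            basis.append(v)
    explained = sum(dot(yc, b) ** 2 / dot(b, b) for b in basis)
    return explained / dot(yc, yc)


def simulate_null_r2(n=35, k=5, reps=2000, seed=0):
    """Average R^2 when y and all k predictors are independent noise."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        y = [rng.gauss(0, 1) for _ in range(n)]
        x = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(k)]
        total += r_squared(y, x)
    return total / reps


def adjusted_r2(r2, n, k):
    """Adjusted R^2, which penalizes for the number of predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def f_statistic(r2, n, k):
    """Overall F statistic of the regression, on (k, n - k - 1) df."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


if __name__ == "__main__":
    print("simulated mean R^2 under no relationship:", round(simulate_null_r2(), 3))
    print("theoretical k/(n-1):", round(5 / 34, 3))
    print("adjusted R^2 for 0.141:", round(adjusted_r2(0.141, 35, 5), 4))
    print("overall F for 0.141:", round(f_statistic(0.141, 35, 5), 3))
```

Under these assumptions the adjusted R^2 comes out slightly below zero and the overall F is a little under 1, which fits the reading that the sample shows no more fit than unrelated predictors would.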

Two possibilities: (1) this sample does not show the relationship at all, or (2) you
have unexpected/unusual scores that outweigh the rest.

For (2), you should be able to look at Frequencies to see that your codes are what
you expected, and that no Missing codes are being analyzed as if they were scores.

For (1), you might consider whether there is a wide enough *range* of the outcome
in your particular sample. (Check similarly for the predictors, though that is less likely
to be the problem here.)

--
Rich Ulrich

> Date: Wed, 2 Jan 2013 09:27:40 -0800

> From: [hidden email]
> Subject: why R-square so low?
> To: [hidden email]

Re: why R-square so low?

Rich Ulrich
Okay, what do you see when you look at Frequencies on the variables?
 - There is not much to say about the dichotomies, unless one of the categories has only a couple of cases.

"Age" shows something a bit interesting, and more interesting when you think of
what "age" typically means.  With an age range of 20 to 60, I would expect (for
most outcomes or covariates) that the strong differences would be between the
folks over 45 and the ones under 30.  Your sample has exactly one case over 45,
who is 54.  Your sample has exactly 4 cases under 30, at 23, 25, 27, and 29.

For some other variable, I would wonder how much to worry about the "outlier" of
54.  For age ... I wonder whether the sample has been selected so that there is
very little variation on whatever-it-is that matters.  The three youngest cases do
have the smallest values (the values of 1 and 2) on the variable that goes to 10.

By the way -- I once had a professor who asked, "What is 'wrong' with this set of data?"
... and he showed us 30 numbers, all written as integers except for 3 that ended in
".5".  The answer was, "The  .5 numbers make us think that the numbers are measured
to the nearest 1/2; but since there are only 3 of them, we most likely are looking at
numbers to the nearest integer, where there were three that seemed too hard to judge."

Somewhat in the same vein --
I did a stem-and-leaf plot to see the distribution of your Outcome variable... and I happened
to notice (assuming I wrote these down right) that of the 22 values over 160, 21 of them end
in one of the digits 0 through 5. The one exception is the highest value in the sample, 198.

Is something strange going on?  In my previous experience, from years ago when we
read data in from fixed columns that were on "computer cards," the presence of "odd"
patterns in the final digits usually signified that we were reading the wrong columns
for a variable.
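
A final-digit check like this one is easy to sketch in Python. The helper names and the sample values below are hypothetical. The only figures taken from the thread are the counts (21 of 22 values ending in 0 through 5). The probability assumes that final digits would otherwise be uniform and independent, and since the pattern was spotted after looking at the data, it is a descriptive hint rather than a formal test.

```python
from collections import Counter
from math import comb


def last_digit_counts(values):
    """Counts of final digits 0..9 for integer-valued scores."""
    counts = Counter(abs(int(v)) % 10 for v in values)
    return [counts.get(d, 0) for d in range(10)]


def binomial_upper_tail(n, at_least, p):
    """P(X >= at_least) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(at_least, n + 1))


if __name__ == "__main__":
    # Hypothetical scores, only to show the shape of the output.
    print(last_digit_counts([160, 171, 182, 198]))
    # Six of the ten digits are 0..5, so p = 0.6 if final digits were uniform.
    print("P(at least 21 of 22 end in 0..5):", binomial_upper_tail(22, 21, 0.6))
```

Under the uniform-digit assumption that probability is on the order of 2 in 10,000, which is why the pattern looks worth a second look at how the variable was entered or read.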

--
Rich Ulrich




Date: Thu, 3 Jan 2013 17:22:05 -0800
From: [hidden email]
Subject: Re: why R-square so low?
To: [hidden email]

Dear Mr. Ulrich,

Thank you for your reply. I am so glad to read it.

I really hope that case (1) is not what is happening.

However, I still don't understand what I have to do to test case (2). I have attached my data for your review. I hope you can help me once more. Your help means a lot to me.

Thank you Sir.

Best Regards,


Vanny Efendi

On Fri, Jan 4, 2013 at 1:39 AM, Rich Ulrich-2 [via SPSSX Discussion] <[hidden email]> wrote: