SPSS vs Statistica

SPSS vs Statistica

Gonzales, Dana L
I wonder if anyone on the list can offer an opinion on the differences between SPSS and Statistica. I have a learner who wants to use Statistica, stating that it is a much better option than SPSS because it is more accurate. I can't seem to locate any information to support this view. Any assistance would be greatly appreciated.

Dana


Dana Barber Gonzales, Ph.D
Assistant Professor
Department of Dietetics and Nutrition
University of Arkansas for Medical Sciences
4301 West Markham St.
Slot # 627
Little Rock, AR 72205
Phone # 501-686-6166
Fax # 501-686-5716

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Re: SPSS vs Statistica

Steve Simon, P.Mean Consulting
Gonzales, Dana L wrote:
> I wonder if anyone on the list can provide an opinion on the
> differences of SPSS and Statistica. I have a learner who wants to use
> Statistica stating that it is a much better option than SPSS because
> it is more accurate. I can't seem to locate any information to
> support this view. Any assistance would be greatly appreciated.

I'm not sure you'll get unbiased opinions from people who subscribe to
this list. But perhaps some of the opinions offered might still be
helpful. Here are my thoughts:

More accurate is a surprising choice of words. If this person means
numerical accuracy, that's pretty much a non-issue for most problems as
all professional computer programs have ditched single precision and
avoided the lousy algorithms. It's best not to mention Microsoft Excel
at this point.

That doesn't mean that you can't trip up one of these programs with a
tricky data set, but in general, accuracy is not a serious concern.
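
To illustrate the kind of "lousy algorithm" meant here, the following Python sketch (an illustration only, not code from SPSS, Statistica or Excel) compares the one-pass textbook variance formula with a two-pass computation on deliberately tricky data. The function names and the data are made up for this example.

```python
def naive_variance(xs):
    """One-pass textbook formula: (sum of squares - (sum)^2 / n) / (n - 1)."""
    n = len(xs)
    total = 0.0
    total_sq = 0.0
    for x in xs:
        total += x
        total_sq += x * x  # squares near 1e18 lose their low-order digits
    return (total_sq - total * total / n) / (n - 1)


def two_pass_variance(xs):
    """Compute the mean first, then sum the squared deviations from it."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


# Illustrative data: a small spread (4, 7, 13, 16) on top of a large offset.
# The exact sample variance is 30.
data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]

print("naive one-pass:", naive_variance(data))
print("two-pass      :", two_pass_variance(data))
```

In double precision the two-pass version returns the exact answer, while the one-pass formula is far off because it subtracts two nearly equal, already rounded numbers.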

Several years ago, Statistica had a very aggressive advertising campaign
that cast aspersions on many of its competitors. I think that there was
also some pushback from people like Leland Wilkinson, who held the
claims to be unfair and unsupported. I wonder if this person has some
residual memory of this campaign.

If "accuracy" means something other than numerical accuracy, then you
need to define what this really means. I could make some guesses, but
that's dangerous.

I would discourage a comparison of two statistical packages based on
accuracy. The criteria that make more sense are: (1) ease of use, and
(2) availability of advanced procedures. I have no experience with
Statistica, but I'd be surprised, based on what I have read about the
program, if it were considered vastly superior in either category.
--
Steve Simon, Standard disclaimer.
Looking for new friends/connections on Facebook and LinkedIn
www.facebook.com/pmean, www.linkedin.com/in/pmean
or become a fan of my newsletter, The Monthly Mean
http://www.facebook.com/group.php?gid=302778306676

Effects of the Dichotomous variables on a Dichotomous Outcome

E. Bernardo
(Sorry for cross-posting)
 
I want to know which of the variables X1, X2, X3, X4 and X5 are significant predictors of the outcome variable Y where both Y and the X's are dichotomous (coded 0 for NO and 1 for YES).  The sample size is n=40.  The target variable Y has 32 for YES and 8 for the NO response.
 
Can I use logistic regression? Any comments are welcome.


Re: Effects of the Dichotomous variables on a Dichotomous Outcome

Marta Garcia-Granero
Eins Bernardo wrote:
> (Sorry for cross-posting)
>
> I want to know which of the variables X1, X2, X3, X4 and X5 are
> significant predictors of the outcome variable Y where both Y and the
> X's are dichotomous (coded 0 for NO and 1 for YES).  The sample size
> is n=40.  The target variable Y has 32 for YES and 8 for the NO response.
>
> Can I use logistic regression? Any comments are welcome.
>

Hi Eins:

If you are talking about 5 univariate logistic regression models (Y
with X1, Y with X2...), go ahead, or use contingency tables (exactly the
same). If you are planning to run a multivariate logistic regression
model (Y with X1 and X2 and ... X5), the answer is NO. Not with that
sample size (32 yes + 8 no).
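
The equivalence between a one-predictor logistic regression and a 2x2 contingency table can be checked numerically. The Python sketch below uses hypothetical counts (not from this thread, and chosen with no empty cell) and a hand-written Newton-Raphson fit; `fit_logistic` is an illustrative helper, not an SPSS or library function.

```python
import math

# Hypothetical counts (NOT from this thread):
# counts[x] = (cases with Y=1, cases with Y=0) among cases with X=x
counts = {0: (10, 10), 1: (15, 5)}


def fit_logistic(counts, iterations=25):
    """Newton-Raphson fit of logit P(Y=1) = b0 + b1*x on grouped binary data."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0               # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0       # information matrix X'WX
        for x, (yes, no) in counts.items():
            n = yes + no
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            resid = yes - n * p
            w = n * p * (1.0 - p)
            g0 += resid
            g1 += resid * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


b0, b1 = fit_logistic(counts)

# Cross-product odds ratio taken directly from the 2x2 table
(yes1, no1), (yes0, no0) = counts[1], counts[0]
odds_ratio = (yes1 * no0) / (no1 * yes0)

print("exp(b1) from logistic regression:", math.exp(b1))
print("odds ratio from the table       :", odds_ratio)
```

The slope of the fitted model is the log of the table's odds ratio, which is why the two approaches give the same estimate when no cell is empty.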

Marta

--
For miscellaneous SPSS related statistical stuff, visit:
http://gjyp.nl/marta/

Re: Effects of the Dichotomous variables on a Dichotomous Outcome

E. Bernardo
In reply to this post by E. Bernardo
What is your suggestion in terms of sample size and the proportion of yes
and no responses?  Does the sample size requirement follow the rule of 10
to 20 cases per independent variable? There was a comment on this list
that the proportion of yes responses to no responses is not an issue.

Re: Effects of the Dichotomous variables on a Dichotomous Outcome

E. Bernardo
In reply to this post by Marta Garcia-Granero
Marta, et al.
 
I analyzed the data with a contingency table and then tried simple logistic regression.  Like Marta, I assumed beforehand that the two analyses are equivalent.  However, I noticed that they produced different results.  The logistic regression revealed that X is not significantly associated with Y, whereas the Fisher exact test and the contingency coefficient revealed that X is significantly associated with Y.
 
The data are as follows:
 
X      Y      Count
yes    yes       13
yes    no         0
no     no         8
no     yes       19
 
Eins

Re: Effects of the Dichotomous variables on a Dichotomous Outcome

Jarrod Teo-2
Hi,
 
Eins Bernardo wrote:
> (Sorry for cross-posting)
>
> I want to know which of the variables X1, X2, X3, X4 and X5 are
> significant predictors of the outcome variable Y where both Y and the
> X's are dichotomous (coded 0 for NO and 1 for YES).  The sample size
> is n=40.  The target variable Y has 32 for YES and 8 for the NO response.
>
> Can I use logistic regression? Any comments are welcome.

You can use Logistic regression since the outcome is a yes-no response.
 
However, do note two things:
 
  1. You have only 40 cases in total, which is quite small.
  2. The "yes" responses greatly outnumber the "no" responses.
 
Is it possible for you to collect more data for the "no" response? Do take note of the usual checks in the Logistic Regression output: the Omnibus Tests, Nagelkerke R Square, the Hosmer and Lemeshow Test, etc.
 
If needed, and if at all possible, please do collect more data on the "no" responses.
 
Regards
Dorraj Oet
 

Re: Effects of the Dichotomous variables on a Dichotomous Outcome

Marta Garcia-Granero
In reply to this post by E. Bernardo
Eins Bernardo wrote:
> What is your suggestion in terms of sample size and proportion of the yes
> and no response?  Does the sample size requirement follow 10 to 20 cases
> per independent variable? There was a comment in this list that the
> proportion of a yes response to the no response is not an issue.

At least 10 to 20 "yes" or 10 to 20 "no" responses, whichever is less
frequent, per independent variable. You therefore need 50 to 100 "yes"
and 50 to 100 "no" responses, since you are studying 5 independent
predictors.

Rules of thumb are, of course, only approximate, not absolute dogmas.
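
As a quick check of this rule of thumb against the counts in the original post, here is a minimal Python sketch. The helper name `events_needed` is made up for this illustration, and the 10-to-20 range is the rule stated above, not a fixed standard.

```python
def events_needed(n_predictors, per_variable=(10, 20)):
    """Range of cases required in the LESS frequent outcome category."""
    low, high = per_variable
    return n_predictors * low, n_predictors * high


yes_count, no_count = 32, 8          # outcome counts from the original post
n_predictors = 5

limiting = min(yes_count, no_count)  # the rarer outcome drives the rule
low, high = events_needed(n_predictors)
print(f"rarer outcome has {limiting} cases; the rule asks for {low} to {high}")

# Predictors the current data would support at 10 cases of the rarer outcome each
print("predictors supported:", limiting // 10)
```

With only 8 "no" responses, the data fall short of the rule even for a single predictor.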

HTH,
Marta GG


--
For miscellaneous SPSS related statistical stuff, visit:
http://gjyp.nl/marta/

Re: Effects of the Dichotomous variables on a Dichotomous Outcome

Marta Garcia-Granero
In reply to this post by E. Bernardo
Eins Bernardo wrote:

> Marta, et al.
>
> I am using contingency table to analyze the data then try with Simple
> Logistic Regression.  Like Marta, prior to analysis, I have in mind
> that the two analyses are the same.  However, I noticed that they
> produced different results.  The logistic regression revealed that
> the X is not significantly associated with the Y.  The Fisher exact
> test and the contingency coefficient revealed that X is significantly
> associated with Y.
>
> The data are as follows:
>
>   X            Y     Count
> yes         yes     13
> yes         No        0
> No          No        8
> No         Yes      19
>

Did you use the Wald test in the logistic regression? It has low power.
You should use the likelihood-ratio (LR) test.
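
For a single binary predictor, the LR test of the logistic model coincides with the likelihood-ratio chi-square (G^2) of the 2x2 table, so it can be sketched by hand. The Python below is an illustration using the counts quoted above; the function names are made up for this example, and the p-value uses the 1-df chi-square tail written via the complementary error function.

```python
import math

# Observed 2x2 table from the post: rows are X (yes, no), columns are Y (yes, no)
observed = [[13, 0],
            [19, 8]]


def lr_chi_square(table):
    """Likelihood-ratio statistic G^2 = 2 * sum(O * ln(O / E)) for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    g2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            if obs > 0:  # an empty cell contributes nothing (0 * ln 0 is taken as 0)
                expected = row_totals[i] * col_totals[j] / n
                g2 += 2.0 * obs * math.log(obs / expected)
    return g2


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


g2 = lr_chi_square(observed)
p_value = chi2_sf_1df(g2)
print(f"G^2 = {g2:.3f}, p = {p_value:.4f}")
```

Unlike the Wald statistic, G^2 does not depend on the standard error of the coefficient, so the empty cell does not break it.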

Marta
Re: Effects of the Dichotomous variables on a Dichotomous Outcome

Johnny Amora
In reply to this post by E. Bernardo
I support the comments of others in this thread that your sample size is too small for 5 IVs.  Peduzzi et al. (1996) recommend that the minimum number of cases for logistic regression be n = 10k/p, where k is the number of IVs and p is the proportion of events. In your case, with 5 IVs and p = 32/40 = .80, you need a minimum of n = 10*5/.80 = 62.5, that is, at least 63 cases.
 
Long (1997) suggested that if the resulting n is less than 100, we should increase it to 100.
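
The arithmetic can be sketched in a few lines of Python. The function name `peduzzi_min_n` is made up for this illustration. The second call is an assumption that goes beyond the post: the formula is often stated with p as the proportion of the less frequent outcome, which here would be 8/40.

```python
import math


def peduzzi_min_n(k, p):
    """Minimum sample size n = 10 * k / p for k predictors and proportion p."""
    return 10 * k / p


k = 5
n_poster = peduzzi_min_n(k, 32 / 40)  # p = proportion of "yes", as in the post
n_rarer = peduzzi_min_n(k, 8 / 40)    # ASSUMPTION: p = proportion of the rarer outcome

# Long's (1997) suggestion as quoted above: use at least 100 cases
recommended = max(100, math.ceil(n_poster))

print("n with p = 32/40:", math.ceil(n_poster))
print("n with p =  8/40:", math.ceil(n_rarer))
print("after the floor of 100:", recommended)
```

Under the second reading the requirement is much larger, which is in line with the 10-to-20-per-variable rule mentioned earlier in the thread.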
 
Here are the references:
 
Peduzzi P, Concato J, Kemper E, Holford TR, Feinstein AR (1996) A simulation study of the number of events per variable in logistic regression analysis. Journal of Clinical Epidemiology 49:1373-1379.
 
Long JS (1997) Regression Models for Categorical and Limited Dependent Variables. Thousand Oaks, CA: Sage Publications.

Cheers,
 
Johnny


R in SPSS 16

Johnny Amora
In reply to this post by Marta Garcia-Granero
I am still using SPSS 16 for Windows (not yet PASW :)).  I want to know what can be done within SPSS 16 with the R plug-in.  I tried to find out on the SPSS website, but I was not able to locate the right page.  Any help is welcome.
 
Johnny


Re: Effects of the Dichotomous variables on a Dichotomous Outcome

Marta Garcia-Granero
In reply to this post by E. Bernardo
Eins Bernardo wrote:

>
> I am using contingency table to analyze the data then try with Simple
> Logistic Regression.  Like Marta, prior to analysis, I have in mind
> that the two analyses are the same.  However, I noticed that they
> produced different results.  The logistic regression revealed that
> the X is not significantly associated with the Y.  The Fisher exact
> test and the contingency coefficient revealed that X is significantly
> associated with Y.
>
> The data are as follows:
>
>   X            Y     Count
> yes         yes     13
> yes         No        0
> No          No        8
> No         Yes      19
>

You CAN'T use logistic regression when you have a zero in one cell. You
get an indeterminate OR, (13*8)/(19*0), and SE(log(OR)) =
SQRT(1/13+1/8+1/19+1/0), which tends to infinity. That's why you get a
non-significant Wald test, although you do get a very significant LR
test (and you also get significant results when using contingency
table analysis).

This example clearly shows that your sample size is definitely too low.
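
The contingency-table side of this can be reproduced by hand. The Python sketch below is an illustration, not SPSS output: it shows that the sample odds ratio is unbounded because of the empty cell, and computes a two-sided Fisher exact p-value by summing the probabilities of all tables with the same margins that are no more likely than the observed one. The function name is made up for this example.

```python
import math

# Cell counts from the post:
# a = X yes / Y yes, b = X yes / Y no, c = X no / Y yes, d = X no / Y no
a, b, c, d = 13, 0, 19, 8

try:
    odds_ratio = (a * d) / (b * c)
except ZeroDivisionError:
    odds_ratio = math.inf  # the sample odds ratio is unbounded when b == 0


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for a 2x2 table with fixed margins."""
    row1, row2, col1 = a + b, c + d, a + c
    total = math.comb(row1 + row2, col1)

    def prob(k):  # hypergeometric probability that the top-left cell equals k
        return math.comb(row1, k) * math.comb(row2, col1 - k) / total

    p_obs = prob(a)
    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    # small tolerance so that ties with the observed table are included
    return sum(prob(k) for k in range(k_min, k_max + 1)
               if prob(k) <= p_obs * (1 + 1e-9))


p_fisher = fisher_exact_two_sided(a, b, c, d)
print("odds ratio:", odds_ratio)
print(f"Fisher exact two-sided p = {p_fisher:.4f}")
```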

You can get an approximate estimate of the OR by adding 0.5 to each cell
frequency:

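* Read the 2x2 table as one row per cell, with the cell frequency in Count.
DATA LIST LIST/ X Y (2 A3) Count (F8).
BEGIN DATA
yes         yes     13
yes         no       0
no          no       8
no          yes     19
END DATA.
* Add 0.5 to every cell so that no cell is empty, then weight cases by it.
COMPUTE Count=Count+0.5.
WEIGHT BY Count.

* Simple logistic regression of Y on X.
LOGISTIC REGRESSION VARIABLES  Y
  /METHOD = ENTER X
  /CONTRAST (X)=Indicator(1).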

Anyway, the power of the Wald test is still too low, and the result is
non-significant.
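
The adjusted estimate can also be worked out by hand. The Python sketch below is an illustration rather than the SPSS output itself: it adds 0.5 to each cell, then computes the odds ratio, the standard error of its log from the formula given above, and a two-sided Wald p-value from the normal distribution.

```python
import math

cells = [13, 0, 19, 8]                     # a, b, c, d as in the post
a, b, c, d = (x + 0.5 for x in cells)      # add 0.5 to every cell

odds_ratio = (a * d) / (b * c)
se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
z = math.log(odds_ratio) / se_log_or
p_wald = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value

print(f"adjusted OR = {odds_ratio:.2f}")
print(f"SE(log OR)  = {se_log_or:.3f}")
print(f"Wald z = {z:.2f}, p = {p_wald:.3f}")
```

The 1/0.5 term dominates the standard error, which is why the Wald test stays above the usual 0.05 threshold even though the adjusted odds ratio is large.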

--
For miscellaneous SPSS related statistical stuff, visit:
http://gjyp.nl/marta/

Re: R in SPSS 16

Jon K Peck
In reply to this post by Johnny Amora

If you install the R plug-in for V16, the documentation is installed with it and can be accessed from the Help menu.  Although V17 and V18 have many API enhancements, V16 provides access to the active SPSS data and the ability to create SPSS output pivot tables from R.  The R-based extension commands on Developer Central, however, require a later SPSS version.

HTH,
Jon Peck
SPSS, an IBM Company
[hidden email]
312-651-3435



Re: SPSS vs Statistica

J P-6
In reply to this post by Steve Simon, P.Mean Consulting

I work in an SPSS / Statistica office, and IMHO Statistica is not the software of choice for any serious analyst. As the previous poster pointed out, precision is a moot point. I've used SPSS for over 15 years, and while there is the occasional bug, I am not aware of any major issue regarding precision of results with any major software package, SPSS included.

 

Statistica may be easy to use if you define easy to use as point and click, but the seemingly endless array of point-and-click menus becomes a liability after a point: the sheer number of options that need to be clicked becomes very confusing and makes it difficult to replicate a procedure or analysis. This leads to the primary shortcoming of Statistica: there is no syntax option, and thus no way to build audit trails to document complex analyses. I've never worked in a shop that relied on point-and-click software, because it is imperative to document and replicate work. Not to mention that syntax greatly simplifies repetitive tasks and often enables a degree of customization not available through menus. Also, I have found that Statistica does not play nicely with other software: moving data into or out of Statistica often results in scrambled variable names, labels, value labels, variable types, etc. Finally, I have found data management in Statistica (merging files, subsetting cases, etc.) to be unusually difficult relative to SPSS (or SAS or SQL).

 

Statistica is probably OK for teaching statistics, because you can eliminate the layer of confusion that can result from teaching syntax and statistics simultaneously. But for professional work I would choose almost anything but Statistica. Obviously this is all based on my personal experience; perhaps others have had better ones.

 

