SPSSX Discussion - Re: Inflated N's in Spearman's Rho

Posted by Rick Oliver-3 on Apr 04, 2014; 9:15pm
URL: http://spssx-discussion.165.s1.nabble.com/Inflated-N-s-in-Spearman-s-Rho-tp5725316p5725317.html

If you are using the WEIGHT command to weight cases, the issue is fractional weights. From the documentation on the WEIGHT command:

Weight values do not need to be integer, and some procedures, such as FREQUENCIES, CROSSTABS, and CTABLES, will use fractional values on the WEIGHT variable. However, most procedures treat the WEIGHT variable as a replication weight and will simply round fractional weights to the nearest integer. Some procedures ignore the WEIGHT variable completely, and this limitation is noted in the procedure-specific documentation. Procedures in the Complex Samples add-on module can use inverse sampling weights specified on the CSPLAN command.

If you are using NONPAR CORR to compute Spearman's Rho, it is rounding the weights to the nearest integer, which is why the reported N no longer matches the sum of the fractional weights. The WEIGHT command is not the ideal solution for fractional weights. I think the weight options in the Complex Samples add-on module are more robust.
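To illustrate how rounding alone can shift the N, here is a minimal Python sketch. It is not SPSS syntax and not the actual NONPAR CORR implementation. The weights are made up for the example, and the round-half-up rule is an assumption about what "nearest integer" means.

```python
import math

def round_half_up(w):
    # Assumed rounding rule: nearest integer, halves rounded up.
    return math.floor(w + 0.5)

def effective_n(weights, rounded):
    # Sum of case weights, either as given or after rounding each one
    # (rounding mimics treating weights as replication counts).
    if rounded:
        return sum(round_half_up(w) for w in weights)
    return sum(weights)

# Hypothetical fractional weights; not taken from any real dataset.
weights = [0.25, 0.6, 0.75, 1.5, 1.6, 2.5, 5.48, 0.4, 0.9, 1.02]

print(f"N with fractional weights: {effective_n(weights, rounded=False):.2f}")
print(f"N with rounded weights:    {effective_n(weights, rounded=True)}")
```

With these made-up weights the fractional sum is 15, but the rounded sum is 16. Note also that any case with a weight below 0.5 rounds to 0 and so drops out entirely, while other cases are counted more heavily than intended.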

Rick Oliver
Senior Information Developer
IBM Business Analytics (SPSS)
E-mail: [hidden email]

From: Adam Troy <[hidden email]>
To: [hidden email],
Date: 04/04/2014 03:51 PM
Subject: Inflated N's in Spearman's Rho
Sent by: "SPSSX(r) Discussion" <[hidden email]>

Hi all,

Quick question (I did look for an answer and could not find one). I have a dataset from a nationally representative sample with a column for the weighting of each case which weights the sample to 1,000 total (original sample was 1,003). Weights range from .25 to 5.48 for each case.

When I was running Spearman's correlations (Spearman's rho) with weight cases on, I noticed that the N's were being inflated above 1,000 in all cases. For example, in pairs where the pairwise N was 975, the N for the Spearman's rho was 1041. This was not the case when I ran chi-square tests on the same variables with weights applied, which should yield the exact same p value for a comparison of two binary variables. The results also differed from identical weighted Spearman correlation results in SAS.

Any ideas why SPSS would inflate these N's unnecessarily for weighted Spearman correlations? I will be encountering a lot of weighted samples and would like to be able to run all these analyses in SPSS. I'm currently using SPSS 21.

Thanks again,

Adam