|
Hi all,
I'm going to preface all of this with an apology. I'm sure this question has been asked a million times before and everyone knows the answer; however, it's driving me crazy.

I've created a weighted composite from 11 variables (we'll call the composite comp1, which ranges from 1-10). I then create a binary variable (binvar) that assigns each case in the dataset a 0 or a 1, depending on whether or not the case reaches a certain minimum on comp1 (such as below).

    DO IF (comp1>=4.5) .
    COMPUTE binvar=1 .
    ELSE .
    COMPUTE binvar=0 .
    END IF .
    EXECUTE .

Nothing complicated. Binvar will be my filter variable for subsequent analyses.

Now for the problem. Let's say that there are 5 people in the dataset with a value of 4.5 on comp1. Binvar is being assigned a 0 for some of these people, and a 1 for others. In other words, even though they all are equal to 4.5, SPSS views them differently. Further, when I run frequencies on comp1, 4.5 appears twice, with different counts next to it. Why is it doing this?

Thanks,
- Matt |
|
It is likely due to the 2nd digit post-decimal. Those that are exactly 4.50 and greater get a 1. Those that are 4.45-4.49 display as 4.5 at one decimal place but get a 0. You may want to try an expression like DO IF (RND(comp1*10)/10>=4.5). (totally untested!) Alternatively, you could expand the current format of comp1 to show two decimal places (x.xx) instead of one (x.x).
Melissa
-----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Matthew Reeder Sent: Wednesday, May 30, 2007 8:22 AM To: [hidden email] Subject: [SPSSX-L] Rounding Issues (v.14) |
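The behaviour Matt describes can be reproduced outside SPSS. The following is a minimal Python sketch, not SPSS syntax; it assumes that Python floats and SPSS numeric values are both 64-bit IEEE 754 doubles. The two-item weights and scores are invented for illustration and are not Matt's data. It compares a direct test against the cutoff with a test on the value rounded to one decimal.

```python
# Hypothetical two-item composite; weights and scores are illustrative only.
WEIGHTS = (0.7, 0.3)

def composite(scores, weights=WEIGHTS):
    """Weighted sum computed in ordinary double-precision arithmetic."""
    total = 0.0
    for w, s in zip(weights, scores):
        total += w * s
    return total

def flag_raw(value, cutoff=4.5):
    """Mirror of DO IF (comp1 >= 4.5): compare the stored value directly."""
    return 1 if value >= cutoff else 0

def flag_rounded(value, cutoff=4.5, decimals=1):
    """Round to the displayed precision first, then compare."""
    return 1 if round(value, decimals) >= cutoff else 0

case_a = composite((6, 1))      # 4.2 + 0.3 on paper
case_b = composite((4.5, 4.5))  # 3.15 + 1.35 on paper

# Columns: name, value at 7 decimals, raw flag, rounded flag.
for name, value in (("case_a", case_a), ("case_b", case_b)):
    print(name, f"{value:.7f}", flag_raw(value), flag_rounded(value))
```

Both cases print as 4.5000000, yet only case_b passes the raw test. Note that rounding first also moves the effective cutoff down to roughly 4.45, so a genuine 4.49 would be flagged as well; whether that is acceptable depends on the analysis.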
|
Hi Melissa,
I think I considered this. I went into Data View and increased the number of decimals shown on comp1 to 7 (x.xxxxxxx). In all but one instance, the value shows as 4.5000000 (one case snuck in at 4.4900000). SPSS still discriminates among those equaling 4.5 when computing the binary variable, however. Though it seems unlikely, perhaps the difference is occurring at a decimal place beyond the seventh (maybe rounding in the DO IF statement, as you suggest, will work here).

As a check, I calculated the composite manually for these individuals. In all cases (sans the 4.49 case), the scores come out to 4.5 even.

- Matt |
|
You're treading on uneven ground whenever you try to compare decimal numbers exactly when they are stored in a binary floating-point format. It's probably best to rescale your composite so that the scores are whole numbers before doing comparisons like this. Just how much precision do you need? Perhaps multiply it by 10 and round that result.
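The rescaling idea can be sketched in Python (used here only as a stand-in for SPSS, on the assumption that both store numbers as 64-bit doubles). The helper name `to_tenths` and the sample values are hypothetical. Once the scores are whole numbers of tenths, the comparison is an exact integer comparison.

```python
def to_tenths(score: float) -> int:
    """Rescale a one-decimal score to a whole number of tenths."""
    return round(score * 10)

CUTOFF_TENTHS = 45  # 4.5 expressed in tenths

almost = 0.7 * 6 + 0.3  # evaluates to just under 4.5 in double precision
exact = 4.5

# The raw values differ, but their integer tenths are identical.
print(almost < exact)
print(to_tenths(almost), to_tenths(exact))
print(to_tenths(almost) >= CUTOFF_TENTHS)
```

Python's `round` sends exact halves to the nearest even integer, which may differ from the rounding rule in other packages; that only matters for values that sit exactly on a half after scaling.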
-----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Matthew Reeder Sent: Wednesday, May 30, 2007 9:23 AM To: [hidden email] Subject: Re: Rounding Issues (v.14) |
|
In reply to this post by Matthew Reeder
Hi Matthew
Having recently needed to round several variables in a similar manner, I found it easiest to do as ViAnn suggested, and then rescale as needed. This can be done in one step:

    (RND(Var001*1000))/1000

Hope this helps,
Lin |
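For anyone porting this one-step rounding to another language, here is a hedged Python sketch. It assumes that RND sends exact halves away from zero, which is worth checking against the documentation for your SPSS version; Python's built-in `round` sends exact halves to the nearest even integer instead. The names `rnd` and `round_to` are hypothetical helpers, not library functions.

```python
from decimal import Decimal, ROUND_HALF_UP

def rnd(x: float) -> int:
    """Round to the nearest integer, with exact halves going away from zero."""
    return int(Decimal(x).quantize(Decimal(1), rounding=ROUND_HALF_UP))

def round_to(x: float, places: int = 3) -> float:
    """Analogue of (RND(x*1000))/1000 when places is 3."""
    scale = 10 ** places
    return rnd(x * scale) / scale

almost = 0.7 * 6 + 0.3  # just under 4.5 in double precision

print(round_to(almost) == 4.5)   # rounded values compare equal
print(rnd(2.5), round(2.5))      # the two rounding rules differ on ties
```

After rounding and rescaling, two scores that agree to three decimals end up as the same stored double, so a later test against 4.5 treats them identically.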
|
In reply to this post by Matthew Reeder
If you really want to see what values you have, change the format of the variable to display the hexadecimal representation.
    FORMATS comp1 (RBHEX16).

Then you can see exactly what values you have computed. You won't recognize the values, but you can see whether two cases really have the same value or not.

Floating point comparisons are fine for an inequality test, but by its nature that test will be sensitive to the very last bit of the value. Exact equality tests are iffy if there are fractional values. That is the nature of floating point arithmetic hardware.

HTH,
Jon Peck |
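The same kind of inspection can be sketched in Python, assuming the values are 64-bit IEEE 754 doubles as in SPSS. The helper `double_hex` is hypothetical, and the byte order SPSS prints under RBHEX16 may differ from the big-endian order chosen here. The point is only that two values with the same one-decimal display can have different bit patterns.

```python
import struct

def double_hex(x: float) -> str:
    """Return the 8 bytes of a double as 16 hex digits (big-endian order)."""
    return struct.pack(">d", x).hex()

exact = 4.5
almost = 0.7 * 6 + 0.3  # displays as 4.5 at one decimal, but is stored one step lower

# Columns: one-decimal display, raw bytes in hex, Python's own hex notation.
for value in (exact, almost):
    print(f"{value:.1f}", double_hex(value), value.hex())
```

If the two hex strings differ in any digit, the values are different numbers, and a `>=` test against 4.5 can split them.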
|
In reply to this post by Matthew Reeder
At 09:22 AM 5/30/2007, Matthew Reeder wrote:
>I've created a weighted composite from 11 variables (we'll call the
>composite comp1, which ranges from 1-10). I then create a binary
>variable (binvar) that assigns each case in the dataset a 0 or a 1,
>depending on whether or not the case reaches a certain minimum on
>comp1 (such as below).
>
> DO IF (comp1>=4.5) .
> COMPUTE binvar=1 .
> ELSE .
> COMPUTE binvar=0 .
> END IF .
> EXECUTE .
>
>Nothing complicated. Binvar will be my filter variable for subsequent
>analyses.

As you've found, so far, so good. By the way, a good replacement for the above syntax is

    RECODE comp1 (4.5 THRU HI = 1) (ELSE = 0) INTO binvar.

>Let's say that there are 5 people in the dataset with a value of 4.5
>on comp1. Binvar is being assigned a 0 for some of these people, and a
>1 for others. In other words, even though they all are equal to 4.5,
>SPSS views them differently.

You've had the answer: because the values *display as* 4.5 does not mean they are *equal to* 4.5.

At 11:03 AM 5/30/2007, Melissa Ives wrote:
>It is likely due to the 2nd digit post-decimal.

Actually, the minimum difference cannot be guaranteed to appear in a modest fixed number of decimal places. SPSS numbers are represented with 53 bits of precision, which is about 16 decimal digits. But even that doesn't characterize the representation: the representable numbers are spaced about as closely as 16-digit decimal numbers, but they aren't the same set of numbers.

>Further, when I run frequencies on comp1, 4.5 appears twice, with
>different counts next to it. Why is it doing this?

For, of course, the same reason: the *numbers* are different, though their *display forms* are the same, in the format (F<something>.1) that you are using.

You've had a couple of suggestions (ViAnn Beadle, Jon Peck) for producing display forms that will show all differences.

To write a little differently, but close to ViAnn's: if, as I assume, your "weighted composite" is a weighted average, then guarantee the result is integral by (a) using only integer weights, and (b) taking the *sum* rather than the *average*, using those weights.

But you probably won't like the result. Depending on your weights, you may have to multiply them by a large number to convert them to integers while maintaining their relative magnitudes; and while you will see the exact values of the weighted sums, those values may be integers with a lot of digits.

Normally, when you've taken a weighted average like that, it's best to treat it as a continuous quantity, whose magnitude is important to the appropriate precision, but whose exact values are not relevant. It's rarely illuminating to take FREQUENCIES for such a quantity. It can be useful to use RECODE to classify the values into ranges, and take FREQUENCIES of the result; you'll have to decide about that.

If you're particularly interested in what's happening near the cutpoint value of 4.5, I'd try something like this (not tested):

a.) Use the code you already have, to calculate 'comp1'.

b.) Assuming you have an ID variable called CaseNum, and your 11 data variables are Datum1 to Datum11, inspect the data by

    TEMPORARY /* If desired */.
    NUMERIC Delta (E10.3).
    VAR LABEL Delta 'Difference from 4.5'.
    COMPUTE Delta = comp1 - 4.5.
    SELECT IF ABS(Delta) LE 0.2 /* Or other threshold */.
    LIST VARIABLES= CaseNum comp1 Delta Datum1 TO Datum11. |
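The inspection step can also be tried in Python as a rough sketch, assuming 64-bit doubles as in SPSS. The case numbers and scores below are made up for illustration, and `near_cutoff` is a hypothetical helper. It lists each case within 0.2 of the cutpoint along with its difference from 4.5 in scientific notation, so that differences far beyond the seventh decimal become visible.

```python
import sys

CUTOFF = 4.5
THRESHOLD = 0.2  # illustrative window around the cutpoint

# Hypothetical cases: (case number, composite score).
cases = [
    (1, 0.7 * 6 + 0.3),  # just under 4.5 after rounding error
    (2, 4.5),
    (3, 4.49),
    (4, 7.25),
]

def near_cutoff(rows, cutoff=CUTOFF, threshold=THRESHOLD):
    """Return (case number, score, delta) for scores within threshold of cutoff."""
    return [(n, s, s - cutoff) for n, s in rows if abs(s - cutoff) <= threshold]

print("bits of precision:", sys.float_info.mant_dig)
for num, score, delta in near_cutoff(cases):
    print(f"{num:>3} {score:.1f} {delta: .3e}")
```

Cases 1 and 2 both show 4.5 in the score column, but only case 2 has a delta of exactly zero; case 1 is off by a single step in the last stored bit.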
