Rounding Issues (v.14)

Matthew Reeder
Hi all,

  I'm going to preface all of this with an apology. I'm sure this question has been asked a million times before and everyone knows the answer; however, it's driving me crazy.

  I've created a weighted composite from 11 variables (we'll call the composite comp1, which ranges from 1-10). I then create a binary variable (binvar) that assigns each case in the dataset a 0 or a 1, depending on whether or not the case reaches a certain minimum on comp1 (such as below).

  DO IF (comp1>=4.5) .
  COMPUTE binvar=1 .
  ELSE .
  COMPUTE binvar=0 .
  END IF .
  EXECUTE .

  Nothing complicated. Binvar will be my filter variable for subsequent analyses. Now for the problem. Let's say that there are 5 people in the dataset with a value of 4.5 on comp1. Binvar is being assigned a 0 for some of these people, and a 1 for others. In other words, even though they all are equal to 4.5, SPSS views them differently.

  Further, when I run frequencies on comp1, 4.5 appears twice, with different counts next to it. Why is it doing this?



  Thanks,

  - Matt

Re: Rounding Issues (v.14)

Melissa Ives
It is likely due to the 2nd digit post-decimal. Those that are exactly
4.50 and greater get a 1.  Those that are 4.45-4.49 get a 0.  You may
want to try an expression like DO IF (rnd(comp1,1)>=4.5) (totally
untested!).  Alternatively, you could expand the current format of comp1
to 2.2 instead of 2.1.

Melissa

Re: Rounding Issues (v.14)

Matthew Reeder
Hi Melissa,

  I think I considered this. I went into the Data View and increased the number of decimals shown on comp1 to 7 (x.xxxxxxx). In all but one instance, the value shows as 4.5000000 (one case snuck in at 4.4900000). SPSS still discriminates among those equaling 4.5 when computing the binary variable, however. Though it seems unlikely, perhaps the difference is occurring at a decimal place beyond the seventh (maybe rounding in the DO IF statement, as you suggest, will work here).

  As a check, I calculated the composite manually for these individuals. In all cases (sans the 4.49 case), the scores come out to 4.5 even.


  - Matt

Re: Rounding Issues (v.14)

Beadle, ViAnn
You're treading on uneven ground whenever you try to exactly compare decimal numbers stored in floating point binary formats. It's probably best to change the magnitude of your scale to avoid decimal scores before doing comparisons like this. Just how much precision do you need? Perhaps multiply the score by 10 and round that result.
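
As a rough illustration of the scale-and-round idea, here is a minimal sketch in Python rather than SPSS syntax (both store numbers as IEEE double-precision values, so the behaviour near the cutoff is the same). The function name meets_cutoff and the choice of two decimals are assumptions made for the example, not anything from this thread, and math.nextafter needs Python 3.9 or later.

```python
import math

def meets_cutoff(score: float, cutoff: float = 4.5, decimals: int = 2) -> bool:
    """Compare score and cutoff as whole numbers after scaling by 10**decimals."""
    scale = 10 ** decimals
    return round(score * scale) >= round(cutoff * scale)

# The largest double that is still strictly below 4.5; it displays as 4.5
# at any ordinary number of decimals.
just_below = math.nextafter(4.5, 0.0)

print(just_below >= 4.5)         # False: the raw comparison sees the last bit
print(meets_cutoff(just_below))  # True: scaled and rounded, it is 450 >= 450
print(meets_cutoff(4.49))        # False: 449 < 450
```

Note that with decimals=1 a score of 4.49 would round up to 45 and pass the cutoff, so the scale should match the precision you actually care about.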

Re: Rounding Issues (v.14)

Lin Cassidy
In reply to this post by Matthew Reeder
Hi Matthew

Having recently needed to round several variables in a similar manner, I
found it easiest to do as ViAnn suggested, and then rescale as needed.

This can be done in one step:   (RND(Var001*1000))/1000
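
To show why this one-step rounding also cleans up the FREQUENCIES output, here is a small Python sketch of the same scale, round, and rescale pattern. The helper name snap and the sample scores are made up for illustration, and Python's round() may treat exact ties differently from RND, which does not matter for values this close to 4.5.

```python
import math
from collections import Counter

def snap(value: float, decimals: int = 3) -> float:
    """Round to `decimals` places by scaling, rounding, and rescaling."""
    scale = 10 ** decimals
    return round(value * scale) / scale

# Five scores that all display as 4.5, two of them one bit below it.
just_below = math.nextafter(4.5, 0.0)
scores = [4.5, just_below, 4.5, just_below, 4.5]

raw_counts = Counter(scores)
snapped_counts = Counter(snap(s) for s in scores)

print(len(raw_counts))   # 2 distinct stored values, so "4.5" is listed twice
print(snapped_counts)    # a single value, 4.5, counted 5 times
```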

Hope this helps,

Lin

Re: Rounding Issues (v.14)

Peck, Jon
In reply to this post by Matthew Reeder
If you really want to see what values you have, change the format of the variable to display the hexadecimal representation.

format comp1(rbhex16).

Then you can see exactly what values you have computed.  You won't recognize the values, but you can see whether two cases really have the same value or not.

Floating point comparisons are fine for an inequality test, but by its nature such a test will be sensitive to the very last bit of the value.  Exact equality tests are iffy if there are fractional values.  That is the nature of floating point arithmetic hardware.
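
For anyone who wants to see what such a hexadecimal view reveals without opening SPSS, here is a minimal Python sketch that prints the raw bytes of two doubles. The helper raw_hex is a name invented for this example, and the byte order SPSS displays may differ from the big-endian order assumed here.

```python
import math
import struct

def raw_hex(value: float) -> str:
    """Return the 8 bytes of an IEEE 754 double as 16 hex digits (big-endian)."""
    return struct.pack(">d", value).hex()

exact = 4.5
just_below = math.nextafter(4.5, 0.0)  # one bit below 4.5; needs Python 3.9+

print(raw_hex(exact))       # 4012000000000000
print(raw_hex(just_below))  # 4011ffffffffffff
```

The two strings are unrecognizable as "4.5", but they make it obvious that the stored values are not the same.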

HTH,
Jon Peck

Re: Rounding Issues (v.14)

Richard Ristow
In reply to this post by Matthew Reeder
At 09:22 AM 5/30/2007, Matthew Reeder wrote:

>I've created a weighted composite from 11 variables (we'll call the
>composite comp1, which ranges from 1-10). I then create a binary
>variable (binvar) that assigns each case in the dataset a 0 or a 1,
>depending on whether or not the case reaches a certain minimum on
>comp1 (such as below).
>
>   DO IF (comp1>=4.5) .
>   COMPUTE binvar=1 .
>   ELSE .
>   COMPUTE binvar=0 .
>   END IF .
>   EXECUTE .
>
>Nothing complicated. Binvar will be my filter variable for subsequent
>analyses.

As you've found, so far, so good. By the way, a good replacement for
the above syntax is

RECODE comp1
   (4.5 THRU HI = 1)
   (ELSE        = 0) INTO binvar.

>Let's say that there are 5 people in the dataset with a value of 4.5
>on comp1. Binvar is being assigned a 0 for some of these people, and a
>1 for others. In other words, even though they all are equal to 4.5,
>SPSS views them differently.

You've had the answer: that the values *display as* 4.5 does not
mean they are *equal to* 4.5.

At 11:03 AM 5/30/2007, Melissa Ives wrote:

>It is likely due to the 2nd digit post-decimal.

Actually, the minimum difference cannot be guaranteed to appear in a
modest fixed number of decimal places. SPSS numbers are represented
with 53 bits of precision, which is about 16 decimal digits. But even
that doesn't characterize the representation: the representable numbers
are spaced about as closely as 16-digit decimal numbers, but they aren't
the same set of numbers.
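
A quick way to see this outside SPSS is the Python sketch below, which assumes only that both programs use standard IEEE doubles. It shows a value that prints as 4.5000000 at seven decimals, as in Matt's check, yet is not equal to 4.5 (math.nextafter and math.ulp need Python 3.9 or later).

```python
import math
import sys

# The closest double strictly below 4.5.
just_below = math.nextafter(4.5, 0.0)

print(sys.float_info.mant_dig)      # 53 bits of precision in a double
print(math.ulp(4.5) == 2.0 ** -50)  # True: spacing of doubles near 4.5
print(f"{just_below:.7f}")          # 4.5000000
print(f"{just_below:.15f}")         # 4.499999999999999
print(just_below == 4.5)            # False
```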

>Further, when I run frequencies on comp1, 4.5 appears twice, with
>different counts next to it. Why is it doing this?

For, of course, the same reason: the *numbers* are different, though
their *display forms* are the same, in the format (F<something>.1) that
you are using.

You've had a couple of suggestions (ViAnn Beadle, Jon Peck) for
producing display forms that will show all differences. To write a
little differently, but close to ViAnn's: if, as I assume, your
"weighted composite" is a weighted average, then guarantee the result
is integral by (a) using only integer weights, and (b) taking the *sum*
rather than *average*, using those weights.
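
The integer-weights idea can be sketched as follows, again in Python and with made-up item scores and weights. The function name and the way the cutoff 4.5 is written as the fraction 9/2 are assumptions for the example; the point is only that every step stays in exact integer arithmetic.

```python
def at_or_above_cutoff(items, weights, cutoff_num=9, cutoff_den=2):
    """True if the weighted average of items is >= cutoff_num / cutoff_den.

    items and weights must be integers, so every step below is exact.
    """
    weighted_sum = sum(w * x for w, x in zip(weights, items))
    total_weight = sum(weights)
    # weighted_sum / total_weight >= cutoff_num / cutoff_den, cross-multiplied
    # (valid because total_weight and cutoff_den are positive).
    return weighted_sum * cutoff_den >= cutoff_num * total_weight

print(at_or_above_cutoff([4, 5], [1, 1]))        # True: average is exactly 4.5
print(at_or_above_cutoff([4, 4, 5], [1, 1, 1]))  # False: average is 13/3
```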

But you probably won't like the result. Depending on your weights, you
may have to multiply them by a large number to convert them to integers
while maintaining their relative magnitudes; and while you will see the
exact values of the weighted sums, those values may be integers with a
lot of digits.

Normally, when you've taken a weighted average like that, it's best to
treat it as a continuous quantity, whose magnitude is important to
the appropriate precision, but whose exact values are not relevant.

It's rarely illuminating to take FREQUENCIES for such a quantity. It
can be useful to use RECODE to classify the values into ranges, and
take FREQUENCIES of the result; you'll have to decide about that.

If you're particularly interested in what's happening near the cutpoint
value of 4.5, I'd try something like this (not tested):

a.) Use the code you already have, to calculate 'comp1'.

b.) Assuming you have an ID variable called CaseID, and your 11 data
variables are Datum1 to Datum11, inspect the data by

TEMPORARY /* If desired */.
NUMERIC   Delta (E10.3).
VAR LABEL Delta 'Difference from 4.5'.

COMPUTE   Delta = comp1 - 4.5.

SELECT IF ABS(Delta) LE 0.2 /* Or other threshold */.

LIST VARIABLES= CaseID comp1 Delta Datum1 TO Datum11.