Dear list!
I've got an SPSS file where some variables have a comma as the decimal separator. This results in duplicate (or more) values in Frequencies. For example I get:

0.02   17
0.02   34
0.02   23
0.03   12
0.03   15

When I click on these cell values in the file, I see a number with 6 decimals that has obviously been rounded to 2 decimals. How do I transform these values so that I get a period as the decimal separator and the values rounded/truncated to just 2 decimals, without any hidden extra decimals? In the above example I would like to get:

0.02   74
0.03   27

I'm not quite sure whether these problems (decimal separator and duplicate frequency values) are interrelated or separate. Can anyone help?

best
Staffan Lindberg
Stockholm, Sweden
The decimal symbol has nothing to do with the Frequencies result (unless these are actually formatted strings). Frequencies are computed on the exact values, so values of .021 and .022, for example, would produce different rows in the table. The value column uses the variable's format (or value label) for display.
If you want the data binned in two-decimal groups, use the RND or TRUNC functions to change the values first, or use the Visual Binner to define other groups.

Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621
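A minimal sketch of Jon's suggestion above, assuming the variables are numeric with a period decimal; the six-decimal values and the names x and n are invented here just to stand in for the hidden extra decimals described in the original post:

* Round to two decimals first, then run FREQUENCIES on the rounded copy.
DATA LIST LIST / x n (F8.6 F6.0).
BEGIN DATA.
0.021357 17
0.021919 34
0.022413 23
0.031242 12
0.034716 15
END DATA.
WEIGHT BY n.
COMPUTE x2 = RND(x, 0.01).
FORMATS x2 (F6.2).
FREQUENCIES x2.

With these weights the rounded copy produces exactly the two rows asked for above: 0.02 with a count of 74 and 0.03 with a count of 27.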
In reply to this post by Staffan Lindberg
You have to make the frequency variable actually fixed (not just displayed) at 2 decimals; see RND(x, 0.01) or TRUNC(x, 0.01). Alternatively, use binning. The decimal symbol (comma vs. dot) is not the culprit. (BTW, I'm still waiting for the preceding zero on decimals between -1 and 1.)
/PR

NEW FILE.
PRESERVE.
SET DECIMAL=COMMA.
DATA LIST LIST / x n (F6.3 F6.0).
BEGIN DATA.
0,019 17
0,020 34
0,021 23
0,030 12
0,031 15
END DATA.
RESTORE.
WEIGHT BY n.
LIST.
FREQUENCIES x.
COMPUTE x1=RND(x, 0.01).
EXECUTE.
FORMATS x x1 (F6.2).
LIST.
FREQUENCIES x x1.
In reply to this post by Jon K Peck
Thanks Jon and everybody for your input. However, the problem was more complicated than I thought, and I did not describe it adequately.

1. The dataset has over 100,000 cases.
2. For the main variable (morphine concentration) I cannot change the comma decimal separator to a period with the help of RND or TRUNC. I also do not succeed in lumping together all the 0.01 values. I must be missing something here.
3. Only the value 0.01 has been duplicated (for some 200 cases).
4. I tried Visual Binning, but the distribution is extremely skewed. The minimum value is, as above, 0.01. The maximum is a few values over 5,000 (not registration errors); otherwise it is about 100. This makes it practically impossible to set the boundaries in Visual Binning.

Any comments?

best
Staffan Lindberg
Stockholm, Sweden
Administrator
If you have adequately described the problem this time around, then the solution is as simple as
COMPUTE new=TRUNC(old*100)/100.
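A small hedged addition to that line, assuming it has already been run against the active dataset and that old and new are the names used above: fixing the new variable's display format keeps any remaining decimals from being hidden, and the frequency table can then be run on it.

FORMATS new (F8.2).
FREQUENCIES new.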
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Administrator
or RND?
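For what it is worth, the two differ at the boundary of each bin. A hedged sketch on a single made-up value (the names x, t and r are only illustrative):

DATA LIST LIST / x (F8.4).
BEGIN DATA.
0.0199
END DATA.
COMPUTE t = TRUNC(x*100)/100.
COMPUTE r = RND(x, 0.01).
FORMATS t r (F6.2).
LIST.

Here TRUNC puts 0.0199 in the 0.01 bin while RND puts it in the 0.02 bin, so the choice depends on how the two-decimal groups should be defined.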
In reply to this post by Staffan Lindberg
I repeat: this has nothing to do with comma vs. dot decimal, assuming that these variables are numeric.
I am not sure whether there is really a problem here, anyway. You can increase the number of decimals in the display by changing the variable formats to, say, F10.4, but since these variables are apparently continuous, FREQUENCIES is probably not the best procedure to use here. If you do want the distribution at the two-decimal level, recompute the variables using RND to eliminate the value differences beyond that point. You might be better off generating histograms if you want to see the distributions.

Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621
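A hedged sketch of the three suggestions above (more display decimals, RND, histograms), assuming an active dataset with a numeric variable called x; the variable names are only illustrative:

* Show more decimals in the display instead of the rounded two-decimal format.
FORMATS x (F10.4).
* Two-decimal bins for a frequency table.
COMPUTE x2 = RND(x, 0.01).
FORMATS x2 (F6.2).
FREQUENCIES x2.
* A histogram of the original values.
GRAPH /HISTOGRAM=x.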