|
Hello everybody,
I'm trying to concatenate one string variable with, let's say, 100 other numerical variables.

My data looks like this:

STRING  N1   N2   N3   ...  N100
AAAA    100  345  456       502
AAAB    056  101  159       312
...

The resulting data should have this format:

STRING  C1       C2       C3       ...  C100
AAAA    AAAA100  AAAA345  AAAA456       AAAA502
AAAB    AAAB056  AAAB101  AAAB159       AAAB312
...

I know that I can use concat(var1,var2,.) but I don't know how to use it in combination with a loop.

Thanks in advance.
Bye, Lucas

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
Hi Lucas
Lucas Bremer wrote:
> I'm trying to concatenate one string variable with let's say with 100 other
> numerical variables.
> [...]
> I know that I can use concat(var1,var2,.) but I don't know how to use it in
> combination with a loop.

[...] (width 4+3; modify it if needed):

* Declare the 100 target string variables (A7 = 4 characters + 3 digits).
STRING C1 TO C100 (A7).
* Pair each target C with its numeric source N and concatenate.
DO REPEAT A=C1 TO C100 /B=N1 TO N100.
COMPUTE A=CONCAT(STRING,STRING(B,N3)).
END REPEAT.
EXE.
LIST STRING C1 TO C100.

Regards,
Marta García-Granero
--
For miscellaneous statistical stuff, visit: http://gjyp.nl/marta/ |
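For readers who want to check the expected result outside SPSS, the DO REPEAT loop in the reply above can be mirrored in a few lines of Python. This is only an illustrative sketch: the function name `concat_columns` and the dictionary layout are assumptions made for the example, and the zero-padding to three digits stands in for the N3 format.

```python
def concat_columns(key, values, width=3):
    """Prefix each numeric value, zero-padded to `width` digits, with `key`."""
    return [f"{key}{value:0{width}d}" for value in values]


# Sample rows taken from the question (only the four values shown there).
rows = {"AAAA": [100, 345, 456, 502], "AAAB": [56, 101, 159, 312]}

result = {key: concat_columns(key, values) for key, values in rows.items()}
for key, columns in result.items():
    print(key, *columns)
```

As in the SPSS version, the padding width has to match the widest number; a value with more than three digits would need a larger `width` (and a wider string format in SPSS).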
|
Hello everybody, I'm trying to detect outliers in my data. I have data series ranging from 6 months to 24 months. Can you recommend a method that helps detect outliers better? The data will later be used for regression estimation. Thanks in advance. Bye, Juris |
|
Juris,
I should start by saying that I am only in the second semester of my doctoral program and in my second statistics course, so if you use my comments, you may want to verify them (although I am sure someone on the list will correct me, too). My information is based on simple regression.
First, if you have outliers with a small n, you have a serious problem because the effects of the outliers are magnified. If you have a large n with few outliers, you may not have a problem at all. The important thing is that you don't want to just remove the outliers without careful analysis - this can be "ethically" wrong and academically dishonest, besides giving an inaccurate representation of your data. If, however, you find that the outliers are due to errors in data entry, misunderstanding of a survey question, or other recording errors, you may have grounds for removing them. You don't want to "take lightly" the removal of outliers from your model.
You will have to run a regression to get the residuals for these checks. Semistudentized or studentized deleted residuals will be easier to interpret because they give you a zero baseline and your deviations are measured in standard deviations. There are differences of opinion, but a rough rule of thumb is that residuals with an absolute value of 4 or more (I've also seen 3 used) are considered outliers.
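To make the rule of thumb concrete, here is a minimal Python sketch (not SPSS syntax) for simple regression. It assumes the usual textbook definition of a semistudentized residual, the raw residual divided by the square root of MSE, and the function names and made-up data are hypothetical.

```python
from math import sqrt
from statistics import mean


def semistudentized_residuals(x, y):
    """Fit y = b0 + b1*x by least squares and return e_i / sqrt(MSE)."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mse = sum(e ** 2 for e in residuals) / (n - 2)  # n - 2 degrees of freedom
    return [e / sqrt(mse) for e in residuals]


def flag_outliers(x, y, threshold=4.0):
    """Return the positions whose |semistudentized residual| >= threshold."""
    scaled = semistudentized_residuals(x, y)
    return [i for i, e in enumerate(scaled) if abs(e) >= threshold]


# Made-up data: a straight line with small alternating noise and one
# deliberately shifted observation at position 10.
x = list(range(1, 21))
y = [2 * xi + 1 + (0.5 if i % 2 == 0 else -0.5) for i, xi in enumerate(x)]
y[10] += 15

flagged = flag_outliers(x, y, threshold=3.0)  # the lower cutoff of 3
print(flagged)
```

One thing the sketch makes visible: since the squared residuals sum to (n - 2) times MSE, no semistudentized residual can exceed sqrt(n - 2) in absolute value, so with a small n a cutoff of 4 may never be reached at all.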
Graphical methods (these work primarily for simple regression - if you have multiple predictor variables, this gets harder to interpret):
If you have a large n, you can use a box plot of the residuals, a histogram of the residuals, a comparison of the actual frequencies of the residuals with the expected frequencies, or a normal probability plot of the residuals. The normal probability plot is the only one that will give you reliable results for a small n.
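The idea behind the normal probability plot can be sketched numerically in Python (again, not SPSS syntax). The sketch pairs each ordered residual with the standard normal score for its rank, using the (k - 0.375) / (n + 0.25) plotting position found in some regression textbooks; that choice, the function names, and the made-up residuals are all assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean


def normal_plot_points(residuals):
    """Pair each ordered residual with the normal score expected for its rank."""
    n = len(residuals)
    ordered = sorted(residuals)
    scores = [NormalDist().inv_cdf((k - 0.375) / (n + 0.25))
              for k in range(1, n + 1)]
    return list(zip(scores, ordered))


def pearson(pairs):
    """Plain Pearson correlation of the (score, residual) pairs."""
    xs, ys = zip(*pairs)
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in pairs)
    sxx = sum((a - x_bar) ** 2 for a in xs)
    syy = sum((b - y_bar) ** 2 for b in ys)
    return sxy / sqrt(sxx * syy)


# Made-up residuals: a well-behaved set, and the same set plus one extreme value.
clean = [-1.2, -0.8, -0.5, -0.3, -0.1, 0.0, 0.2, 0.4, 0.6, 0.9]
with_outlier = clean + [6.0]

r_clean = pearson(normal_plot_points(clean))
r_outlier = pearson(normal_plot_points(with_outlier))
print(round(r_clean, 3), round(r_outlier, 3))
```

Points that fall close to a straight line give a correlation near 1. A single extreme residual pulls the upper end of the plot away from the line and lowers the correlation, which is what you would look for by eye in the plot.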
The EXPLORE command in SPSS will give you boxplots that indicate your outliers with an O and if you have any extreme outliers, they'll be shown with an asterisk *.
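The flagging logic behind such a boxplot can be sketched in Python as follows. It assumes the common convention that values more than 1.5 box lengths beyond the box are outliers and values more than 3 box lengths beyond it are extremes; the quartiles here come from Python's `statistics.quantiles`, which may differ slightly from the hinges SPSS computes, and the function name and data are hypothetical.

```python
from statistics import quantiles


def boxplot_flags(values):
    """Map positions to 'O' (outlier) or '*' (extreme) by distance from the box."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    box_length = q3 - q1
    flags = {}
    for i, value in enumerate(values):
        distance = max(q1 - value, value - q3)  # how far outside the box
        if distance > 3 * box_length:
            flags[i] = "*"
        elif distance > 1.5 * box_length:
            flags[i] = "O"
    return flags


# Made-up residuals with one moderate and one extreme value at the end.
residuals = [-1.2, -0.8, -0.5, -0.3, -0.1, 0.0, 0.2, 0.4, 0.6, 0.9, 2.9, 6.0]
flags = boxplot_flags(residuals)
print(flags)
```

Because the fences are built from the quartiles rather than from the mean and standard deviation, a few extreme values do not inflate the yardstick they are measured against.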
Keep in mind that if you are doing regression, you have other things to check, too: linearity, homoscedasticity, independence of the error terms, whether important predictor variables have been omitted from the regression model, and normality. Do the normality check last because the corrections for other deviations can have large effects on normality. Non-normality can also have serious effects on your results, so you don't want to overlook it.
There is more you can do, but that will get you started and I have to go to class.
I hope this helps,
Fred
All the best,
Fred Weigel
Doctoral Student
[hidden email]
College of Business
427 Lowder Business Building
415 West Magnolia Avenue
Auburn University
Auburn, AL 36849
Phone: 334-844-6538
Fax: 334-844-5159

>>> Juris Breidaks <[hidden email]> 2/19/2009 03:14 >>>
Hello everybody, I'm trying to detect outliers in my data. I have data series ranging from 6 months to 24 months. Can you recommend a method that helps detect outliers better? The data will later be used for regression estimation. Thanks in advance. Bye, Juris |
