automatically concatenate one string with many variables


Lucas Bremer
Hello everybody,



I'm trying to concatenate one string variable with, let's say, 100 other
numeric variables.



My data looks like this:



STRING    N1     N2     N3     ...    N100
AAAA      100    345    456    ...    502
AAAB      056    101    159    ...    312

...



The resulting data should have this format



STRING    C1         C2         C3         ...    C100
AAAA      AAAA100    AAAA345    AAAA456    ...    AAAA502
AAAB      AAAB056    AAAB101    AAAB159    ...    AAAB312

...



I know that I can use concat(var1,var2,...) but I don't know how to use it in
combination with a loop.



Thanks in advance.



Bye, Lucas

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Re: automatically concatenate one string with many variables

Marta Garcia-Granero
Hi Lucas

Lucas Bremer wrote:

> I'm trying to concatenate one string variable with, let's say, 100 other
> numeric variables.
>
> My data looks like this:
>
> STRING    N1     N2     N3     ...    N100
> AAAA      100    345    456    ...    502
> AAAB      056    101    159    ...    312
>
> ...
>
> The resulting data should have this format
>
> STRING    C1         C2         C3         ...    C100
> AAAA      AAAA100    AAAA345    AAAA456    ...    AAAA502
> AAAB      AAAB056    AAAB101    AAAB159    ...    AAAB312
>
> ...
>
>
> I know that I can use concat(var1,var2,...) but I don't know how to use it in
> combination with a loop.
>
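Try this code. It assumes that the final string length is 7 characters
(4 + 3); adjust the A7 width and the N3 format if your values are longer:

* Create the 100 target string variables.
STRING C1 TO C100 (A7).
* Pair each Ck with Nk and prepend the string; N3 keeps leading zeros.
DO REPEAT A=C1 TO C100
         /B=N1 TO N100.
COMPUTE A=CONCAT(STRING,STRING(B,N3)).
END REPEAT.
EXECUTE.

LIST STRING C1 TO C100.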
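
For anyone who wants to check the loop before running it on all 100 variables, here is a minimal self-contained sketch. It uses only the two cases and the first three numeric columns shown in the question, and it reads the data inline with DATA LIST, which is an assumption about how the file is loaded. The RTRIM call is an addition that is not part of the code above; it only matters if some string values are shorter than the declared width.

```spss
* Toy data copied from the question, first three numeric columns only.
DATA LIST LIST / STRING (A4) N1 N2 N3.
BEGIN DATA
AAAA 100 345 456
AAAB 56 101 159
END DATA.

* One target string per numeric variable, 4 + 3 characters wide.
STRING C1 TO C3 (A7).

* RTRIM drops trailing blanks from the string variable before joining.
* The N3 format writes each number with three digits and leading zeros.
DO REPEAT A=C1 TO C3
         /B=N1 TO N3.
COMPUTE A=CONCAT(RTRIM(STRING),STRING(B,N3)).
END REPEAT.
EXECUTE.

LIST STRING C1 TO C3.
```

If this reproduces the C1 to C3 columns from the question, the same DO REPEAT can be widened to C1 TO C100 and N1 TO N100.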

Regards,
Marta García-Granero

--
For miscellaneous statistical stuff, visit:
http://gjyp.nl/marta/

How better can detect outliers...

Juris Breidaks
In reply to this post by Lucas Bremer

Hello everybody,


I'm trying to detect outliers in my data. I have data series ranging from 6 to 24 months. Can you recommend a method that detects outliers more reliably? The data will later be used for regression estimation.


Thanks in advance.



Bye,
Juris


Re: How better can detect outliers...

Fred Weigel-2
Juris,
 
I should start by stating that I am only in the second semester of my doctoral program and in my second statistics course, so you may want to confirm my comments before relying on them (although I am sure someone on the list will correct me, too).  My information is based on simple regression.
 
First, if you have outliers with a small n, you have a serious problem because the effects of the outliers are magnified.  If you have a large n with few outliers, you may not have a problem at all.  The important thing is not to remove outliers without careful analysis: doing so can be ethically wrong and academically dishonest, besides giving an inaccurate representation of your data.  If, however, you find that the outliers are due to errors in data entry, misunderstanding of a survey question, or other recording errors, you may have grounds for removing them.  Do not take the removal of outliers from your model lightly.
 
You will have to run a regression to get the residuals for these checks.  Semistudentized or deleted semistudentized residuals will be easier to interpret because they have a zero baseline and the deviations are measured in standard deviations.  There are differences of opinion, but a rough rule of thumb is that residuals with an absolute value of 4 or more (I've also seen 3) are considered outliers.
 
Graphical methods (these work primarily for simple regression; if you have multiple predictor variables, they get harder to interpret):
If you have a large n, you can use a box plot of the residuals, a histogram of the residuals, a comparison of the actual frequencies of the residuals with the expected frequencies, or a normal probability plot of the residuals.  The normal probability plot is the only one that will give you reliable results for a small n.
The EXPLORE command in SPSS will give you boxplots that indicate your outliers with an O and if you have any extreme outliers, they'll be shown with an asterisk *.
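
The residual checks described above could be run along these lines. This is only a sketch under stated assumptions: the data are made up for illustration (x is a predictor, y is the outcome, and one case is deliberately odd), the names zres, sdres and flag are hypothetical, and it assumes that "semistudentized" and "deleted semistudentized" correspond to what SPSS calls standardized (ZRESID) and studentized deleted (SDRESID) residuals. EXAMINE is the syntax name of the Explore procedure.

```spss
* Made-up illustration data with one deliberately odd case.
DATA LIST LIST / x y.
BEGIN DATA
1 2.1
2 3.9
3 6.2
4 7.8
5 10.1
6 21.5
7 13.8
8 16.1
9 18.0
10 19.9
11 22.2
12 23.8
END DATA.

* Fit the simple regression and save both kinds of residuals.
REGRESSION
  /DEPENDENT y
  /METHOD=ENTER x
  /SAVE ZRESID(zres) SDRESID(sdres).

* Flag cases at or beyond the rule-of-thumb cutoff of 3.
COMPUTE flag = (ABS(sdres) >= 3).
EXECUTE.

* Boxplot, histogram and normal probability plot of the residuals.
EXAMINE VARIABLES=zres
  /PLOT=BOXPLOT HISTOGRAM NPPLOT
  /STATISTICS=EXTREME(3).

LIST x y zres sdres flag.
```

Changing the 3 in the COMPUTE line to 4 gives the stricter cutoff. A flagged case is a candidate for closer inspection, not for automatic removal.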
 
Keep in mind that if you are doing regression, you have other things to check, too: linearity, homoscedasticity, independence of the error terms, whether important predictor variables have been omitted from the regression model, and normality.  Do the normality check last because the corrections for other deviations can have large effects on normality.  Non-normality can also have serious effects on your results, so you don't want to overlook this.
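
Several of these checks can be requested from the REGRESSION procedure itself. The sketch below is one possible way to do it, not the only one, and again uses made-up data with hypothetical variable names x and y. The scatterplot of standardized residuals against standardized predicted values is a common check for linearity and constant variance, the Durbin-Watson statistic addresses independence of errors in ordered data such as a monthly series, and the histogram and normal probability plot address normality.

```spss
* Made-up illustration data; x could be a month index.
DATA LIST LIST / x y.
BEGIN DATA
1 2.3
2 4.1
3 5.8
4 8.2
5 9.9
6 12.4
7 13.7
8 16.3
END DATA.

* Residual diagnostics requested in a single regression run.
REGRESSION
  /DEPENDENT y
  /METHOD=ENTER x
  /RESIDUALS=DURBIN HISTOGRAM(ZRESID) NORMPROB(ZRESID)
  /SCATTERPLOT=(*ZRESID,*ZPRED).
```

Omitted predictors cannot be detected by a single subcommand; one informal approach is to plot the saved residuals against each candidate variable that was left out of the model.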
 
There is more you can do, but that will get you started and I have to go to class.
 
I hope this helps,
Fred
 
 
 
 
 
All the best,
Fred Weigel
Doctoral Student
[hidden email]
College of Business
427 Lowder Business Building
415 West Magnolia Avenue
Auburn University
Auburn, AL  36849
Phone:  334-844-6538
Fax:  334-844-5159


>>> Juris Breidaks <[hidden email]> 2/19/2009 03:14 >>>

Hello everybody,


I'm trying to detect outliers in my data. I have data series ranging from 6 to 24 months. Can you recommend a method that detects outliers more reliably? The data will later be used for regression estimation.


Thanks in advance.



Bye,
Juris