How to recode missing values?

M'hammed Abdous
Dear all,

After merging four different datasets, I ended up with missing values for
some key variables (AA, BB & CC). I need to recode the missing values to
either the first or the last value of the variables. As shown in the
example below, I need to recode the values of AA to have 1 for all 001
cases and 2 for all 002 cases.


ID AA BB CC   DD

001 1 1 1   1
001 . . .   2
001 . . .   3
001 . . .   3
001 . . .   4
002 . . .   4
002 . . .   4
002 . . .   4
002 . . .   4
002 . . .   4
002 . . .   4
002 2 2 2   2
003 . . .   5
003 . . .   4
003 . . .   5
003 . . .   5


 Thank you in advance for your help.

 M'hammed.

Re: How to recode missing values?

Maguin, Eugene
M'hammed.

The trick is to sort your file so that the case with a valid value for
AA comes first within each ID, and then use the lag function to carry that
value of AA down to all subsequent cases with the same value of ID.

First, sort your file by ID and AA. The important thing is that the result
looks like the listing below. Notice the change in the order of the ID=2
cases. The default ascending sort order may do this. If not, switch AA to a
descending sort order while leaving ID in ascending order. If you don't
know how to do this, look in the syntax reference or the help file.

ID AA BB CC   DD
001 1 1 1   1
001 . . .   2
001 . . .   3
001 . . .   3
001 . . .   4
002 2 2 2   2
002 . . .   4
002 . . .   4
002 . . .   4
002 . . .   4
002 . . .   4
002 . . .   4
003 . . .   5
003 . . .   4
003 . . .   5
003 . . .   5

Then do this.
If (Id eq lag(id)) AA=lag(AA).
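
Put together, the two steps above might look like the sketch below. It
assumes that system-missing values sort below valid values, so a descending
sort on AA puts the valid case first within each ID (check the sorted file
before running the rest). It also assumes that BB and CC are valid on the
same case as AA, as in the example data. The MISSING() guard is an addition
to the one-line version above, so that an existing valid value is never
overwritten.

```spss
* Put the case with a valid AA first within each ID.
SORT CASES BY ID (A) AA (D).

* Carry the previous case's value down, only into missing cells.
IF (ID EQ LAG(ID) AND MISSING(AA)) AA = LAG(AA).
IF (ID EQ LAG(ID) AND MISSING(BB)) BB = LAG(BB).
IF (ID EQ LAG(ID) AND MISSING(CC)) CC = LAG(CC).
EXECUTE.
```

IDs that have no valid value on any case stay missing, because there is
nothing to carry down.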

What are you going to do about the ID=3 cases?


WARNING:
Can this situation occur? What do you want to do then?

003 2 2 2   2
003 . . .   4
003 . . .   4
003 . . .   4
003 5 5 6   2
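
One way to find out whether this situation occurs is to compare the
smallest and largest valid value of AA within each ID before filling
anything in. The following is only a sketch: the names AA_min, AA_max and
AA_conflict are made up for illustration, and it assumes an SPSS version in
which AGGREGATE supports MODE=ADDVARIABLES.

```spss
* Add the per-ID minimum and maximum of the valid AA values to every case.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=ID
  /AA_min = MIN(AA)
  /AA_max = MAX(AA).

* 1 = more than one distinct valid value within the ID, 0 = consistent.
* IDs with no valid AA at all get a system-missing flag.
COMPUTE AA_conflict = (AA_min NE AA_max).
EXECUTE.

* List the affected cases for inspection.
TEMPORARY.
SELECT IF (AA_conflict EQ 1).
LIST VARIABLES=ID AA BB CC DD.
```

The same check can be repeated for BB and CC.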


Gene Maguin

Re: How to recode missing values?

M'hammed Abdous
Gene,

Thank you very much for your time and help. Your suggestion worked very
well. I was able to add/map the files and sort them by AA in ascending
order. After that, I used the lag function to fill in the missing values
for cases with the same ID.

Actually, the original dataset cases are as follows:

School:   47 cases
Principal:  47 cases
Teacher:  256 cases
Student: 1096 for each school

After merging the four different parts, I ended up with missing values for
school, principal, and teacher, which created some problems with the
t-test, ANOVA, and regression analyses.
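
For reference, a merge of this kind can sometimes avoid the missing values
in the first place by treating the higher-level file as a lookup table.
The sketch below is only an illustration: the file names (students.sav,
schools.sav) and the key variable (school_id) are hypothetical and do not
come from the actual datasets, and both files must be sorted by the key.

```spss
* Hypothetical names: students.sav, schools.sav, school_id.
* schools.sav must already be sorted by school_id.
GET FILE='students.sav'.
SORT CASES BY school_id.

* /TABLE copies the school-level values onto every matching student case.
MATCH FILES
  /FILE=*
  /TABLE='schools.sav'
  /BY school_id.
EXECUTE.
```

The same pattern would apply to the principal and teacher files, each with
its own key.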

However, the new dataset creates another problem for descriptive analysis
(Frequencies, Means, etc.). I guess I will have to maintain two datasets
(with and without the new values), unless there's a more elegant way to
exclude the new values from the descriptive analysis.
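
One possible way to stay with a single dataset is to flag, before the
missing values are filled in, which cases carried an original value, and
then filter on that flag for the descriptive procedures. This is a sketch
only, and the flag name AA_orig is made up for illustration.

```spss
* Run this before the fill step.
* NVALID returns 1 if AA is valid on the case and 0 if it is missing.
COMPUTE AA_orig = NVALID(AA).
EXECUTE.

* The fill step from earlier in the thread goes here.

* Descriptives on the original values only.
FILTER BY AA_orig.
FREQUENCIES VARIABLES=AA.
FILTER OFF.
```

With the filter off, the t-test, ANOVA, and regression runs see the
filled-in values again.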

I really appreciate your time and help.

Thanks again, M'hammed.