SPSSX Discussion

Encounter data !

Khaleel Hussaini

Encounter data !

Hi! I have encounter data for individuals from a hospital database that
looks like this.

ID   Registration Date  Mothers Last Name  Mother's First Name  Dob        Race      Risk Factor
001  8-Aug-06           A                  A                    11-Dec-85  WHITE     TOBACCO USE
002  11-Aug-06          A                  B                    28-Aug-81  HISPANIC  ALCOHOL USE
002  11-Aug-06          A                  B                    28-Aug-81  HISPANIC  ALCOHOL USE
002  11-Aug-06          A                  B                    28-Aug-81  HISPANIC  DIABETES (GESTATIONAL OR REGULAR)
002  11-Aug-06          A                  B                    28-Aug-81  HISPANIC  DIABETES (GESTATIONAL OR REGULAR)

I want to not only identify duplicate cases using syntax (SPSS 10/12)
but also calculate the unique unduplicated cases in addition to estimating
the number of encounters. For example, ID 002 has four encounters, and I want
to compute a new variable that tells me the number of encounters. Thanks,
Khaleel.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Maguin, Eugene

Re: Encounter data !

Khaleel,

I'll assume that you have defined a Data list statement that correctly reads
your data in from the ascii file in which it currently resides such that a
spss data record consists of the seven variables listed in your posting (ID,
Registration Date, Mothers Last Name, Mother's First Name, Dob, Race, Risk
Factor). If this is not true, repost describing questions/issues about the
data read-in step. I'll assume that the names of those variables are the
names listed with spaces replaced by an underscore character ('_') to make
correct spss variable names. Since you are using 10/12 you will need to
truncate variable names to eight characters.
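
For concreteness, here is a minimal sketch of the kind of read-in step I am
assuming, using inline data in place of the ascii file. The eight-character
names (regdate, mlast, mfirst, risk) and the string widths are my own
assumptions, not names from your posting, so adjust them to match your file.

```spss
* Sketch only: variable names and string widths below are assumed.
Data list list
  /id (A3) regdate (DATE9) mlast (A10) mfirst (A10)
   dob (DATE9) race (A10) risk (A40).
Begin data
001 8-Aug-06 A A 11-Dec-85 WHITE "TOBACCO USE"
002 11-Aug-06 A B 28-Aug-81 HISPANIC "ALCOHOL USE"
002 11-Aug-06 A B 28-Aug-81 HISPANIC "ALCOHOL USE"
002 11-Aug-06 A B 28-Aug-81 HISPANIC "DIABETES (GESTATIONAL OR REGULAR)"
002 11-Aug-06 A B 28-Aug-81 HISPANIC "DIABETES (GESTATIONAL OR REGULAR)"
End data.
List.
```

The risk factor values are quoted because they contain blanks, which
freefield input would otherwise treat as separators.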

>>I want to not only identify duplicate cases using syntax (SPSS 10/12)
but also calculate the unique unduplicated cases in addition to estimating
the number of encounters. For example, ID 002 has four encounters, and I want
to compute a new variable that tells me the number of encounters.

What defines an 'unduplicated case'? a) ID number, or b) something else?

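If an 'unduplicated case' is defined by ID number only, use this syntax.
The lag comparison only works if records with the same id are adjacent, so
sort first.

Sort cases by id.
Compute dups=0.
If (id eq lag(id)) dups=lag(dups)+1.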

* To count unduplicated cases.
Temporary.
Select if (dups eq 0).
Frequencies id.
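
If the answer to (b) is 'something else', say ID plus risk factor, the same
counter should work once the file is sorted on, and compared on, every
variable that defines a case. A sketch, assuming the risk factor variable
has been named risk (my name, not one from your posting):

```spss
* Sketch: a duplicate is a repeat of the same id AND the same risk factor.
* The variable name risk is assumed.
Sort cases by id risk.
Compute dups2=0.
If (id eq lag(id) and risk eq lag(risk)) dups2=lag(dups2)+1.
Execute.
```

With the five records in your posting, dups2 would be 0 for three records
(the one for 001 and the first of each risk factor for 002) and 1 for the
two repeats.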

I don't understand the relationship between identifying duplicate cases,
calculating unique unduplicated cases and estimating the number of encounters
with respect to the output data file. Therefore, I'll assume that you want
to create a new dataset of unduplicated cases, duplication defined by ID,
consisting of the input variables plus the number of encounters. I use
Aggregate.

* Please note that this syntax retains the first nonmissing value of a
  variable within a break group. Specifically, the value of risk_factor
  is the first encountered value of risk_factor in the break group.
Aggregate outfile=*/break=id/encounters=nu/Registration_Date
 Mothers_Last_Name Mothers_First_Name Dob Race Risk_Factor=
 first(Registration_Date Mothers_Last_Name
 Mothers_First_Name Dob Race Risk_Factor).
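
If you would rather keep every encounter record and simply attach the count
to each one, an alternative to the Aggregate above (run it instead of, not
after, that command) is to aggregate to a separate file and match it back as
a table. This is only a sketch; the file name enc.sav and the variable name
encount are placeholders of mine.

```spss
* Sketch: keep all records and add the per-id encounter count.
* The names enc.sav and encount are placeholders.
Sort cases by id.
Aggregate outfile='enc.sav'/break=id/encount=n.
Match files file=*/table='enc.sav'/by id.
Execute.
```

With the records in your posting, encount would be 1 for ID 001 and 4 on
each of the four records for ID 002.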

Gene Maguin
