SPSSX Discussion

Getting potential Keyed variables to line up

Ben Lintschinger

Getting potential Keyed variables to line up

I am trying to merge files to add variables, but the files don't have
identical key variables. I was wondering if anyone knew a fairly
easy way to "sync up" the keys. The data set has about 10,000 cases,
so it would have to be a process that is time-efficient at that
volume. Preferably something that matches corresponding cases while
simultaneously informing me which values do not correspond
(especially nice would be then filling those in with a corresponding
value so the merge will be successful).

Here's a very simple example to give you a visual idea:

Dataset 1           Dataset 2           Merged Dataset (Hopefully)
ID    Variable 1    ID    Variable 2    ID    Variable 1  Variable 2
9     A             12    Q             9     A           -
12    B             100   R             12    B           Q
100   C             102   S             100   C           R
2210  D             2210  T             102   -           S
                                        2210  D           T

Thanks
Ben

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Marks, Jim

Re: Getting potential Keyed variables to line up

Ben:

Does this do what you want?

*sample datasets-- added a numeric variable for illustration.
DATA LIST FREE /id (f8.0) variable_A (A1) variable_1 (f8.0).
BEGIN DATA
9 A 50 12 B 51 100 C 52 2210 D 53
END DATA.

DATASET NAME dataset1 WINDOW = FRONT.

DATA LIST FREE /id (f8.0) variable_B (A1) variable_2 (f8.0).
BEGIN DATA
12 Q 99 100 R 98 102 S 97 2210 T 96
END DATA.

DATASET NAME dataset2 WINDOW = FRONT.

* merge datasets.
MATCH FILES /FILE=* /FILE='dataset1'
/BY id.

*flag cases with all variables populated.
COMPUTE full_match =
LENGTH(RTRIM(variable_A)) > 0 AND LENGTH(RTRIM(variable_B)) > 0
AND
NOT(SYSMIS(variable_1)) AND NOT(SYSMIS(variable_2)).

*flag cases with variable_A missing.
COMPUTE find_miss_A = LENGTH(RTRIM(variable_A)) = 0.

*find cases with variable_2 missing.
COMPUTE find_miss_2 = SYSMIS(variable_2).

*run transformations and tabulate the flags.
FREQUENCIES full_match find_miss_A find_miss_2.

Note that the code for full_match has examples for finding missing
values for string variables and numeric variables. It can be extended
for more variable pairs, but it will become cumbersome with too many
variables.
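
One way to avoid testing every variable pair is to let MATCH FILES record where each case came from, using the /IN subcommand. The following is a minimal sketch, not part of the code above: it assumes dataset1 and dataset2 exist as created earlier and that dataset2 is the active dataset. The names in_ds1, in_ds2 and match_status are hypothetical, chosen only for illustration. MATCH FILES with /BY expects both files sorted by the key, so the sketch sorts them first.

```spss
* Sort both files by the key before merging.
DATASET ACTIVATE dataset1.
SORT CASES BY id.
DATASET ACTIVATE dataset2.
SORT CASES BY id.

* in_ds1 and in_ds2 (hypothetical names) are 1 when the case
* was present in that file and 0 when it was not.
MATCH FILES /FILE=* /IN=in_ds2
  /FILE='dataset1' /IN=in_ds1
  /BY id.

* 1 = dataset1 only, 2 = dataset2 only, 3 = present in both.
COMPUTE match_status = in_ds1 + 2 * in_ds2.
VALUE LABELS match_status
  1 'dataset1 only' 2 'dataset2 only' 3 'both files'.
FREQUENCIES match_status.
```

Because the flags describe the cases rather than the individual variables, the same three lines of flagging work no matter how many variables each file contributes.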

Jim Marks
Director, Market Research
x1616

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Ben Lintschinger
Sent: Wednesday, February 10, 2010 6:58 PM
To: [hidden email]
Subject: Getting potential Keyed variables to line up

