keyed table merge not working

classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

keyed table merge not working

Christianne Lane
Hi all, I am having a problem I've not encountered before. I am merging two files. The first (table_1)
included IDs and randomization information and is my keyed table. The second (table_2) includes pre
and post-test data. There is some participants who dropped out before any testing, so they have
data in table_1 but not table_2. When I merge, these participants are removed from the working file.


This is the code I am using:

MATCH FILES /TABLE=*
  /FILE='/Users/cjl/Desktop/table_2'
  /BY  ID.
EXECUTE.

I am using SPSS version 16 for Mac, which has been more problematic than any version before it. (I
started with version 7 back in the old days.) Has anyone else encountered this problem? Is there a
fix? A work-around? Thanks!!!!

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: keyed table merge not working

ViAnn Beadle
The keyed table never contributes cases to the result--it provides data to
cases in the second table. What is your goal here. Do you want to analyze
cases with incomplete information? Also, do you have duplicate ids in your
second table?

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Christianne Lane
Sent: Monday, September 14, 2009 5:40 PM
To: [hidden email]
Subject: keyed table merge not working

Hi all, I am having a problem I've not encountered before. I am merging two
files. The first (table_1)
included IDs and randomization information and is my keyed table. The second
(table_2) includes pre
and post-test data. There is some participants who dropped out before any
testing, so they have
data in table_1 but not table_2. When I merge, these participants are
removed from the working file.


This is the code I am using:

MATCH FILES /TABLE=*
  /FILE='/Users/cjl/Desktop/table_2'
  /BY  ID.
EXECUTE.

I am using SPSS version 16 for Mac, which has been more problematic than any
version before it. (I
started with version 7 back in the old days.) Has anyone else encountered
this problem? Is there a
fix? A work-around? Thanks!!!!

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: keyed table merge not working

Christianne Lane
I am working on creating a "conplete" dataset for others in the group
to analyze pieces of.

Guess I will work around it at the query stage. Thanks so much!

Christi



On Sep 14, 2009, at 6:15 PM, ViAnn Beadle <[hidden email]> wrote:

> The keyed table never contributes cases to the result--it provides
> data to
> cases in the second table. What is your goal here. Do you want to
> analyze
> cases with incomplete information? Also, do you have duplicate ids
> in your
> second table?
>
> -----Original Message-----
> From: SPSSX(r) Discussion [mailto:[hidden email]] On
> Behalf Of
> Christianne Lane
> Sent: Monday, September 14, 2009 5:40 PM
> To: [hidden email]
> Subject: keyed table merge not working
>
> Hi all, I am having a problem I've not encountered before. I am
> merging two
> files. The first (table_1)
> included IDs and randomization information and is my keyed table.
> The second
> (table_2) includes pre
> and post-test data. There is some participants who dropped out
> before any
> testing, so they have
> data in table_1 but not table_2. When I merge, these participants are
> removed from the working file.
>
>
> This is the code I am using:
>
> MATCH FILES /TABLE=*
>  /FILE='/Users/cjl/Desktop/table_2'
>  /BY  ID.
> EXECUTE.
>
> I am using SPSS version 16 for Mac, which has been more problematic
> than any
> version before it. (I
> started with version 7 back in the old days.) Has anyone else
> encountered
> this problem? Is there a
> fix? A work-around? Thanks!!!!
>
> =====================
> To manage your subscription to SPSSX-L, send a message to
> [hidden email] (not to SPSSX-L), with no body text except
> the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD
>

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: keyed table merge not working

Maguin, Eugene
In reply to this post by Christianne Lane
Christianne,

I think you have the 'table' and 'file' files reversed. Do this

MATCH FILES /file=*
  /table='/Users/cjl/Desktop/table_2'
  /BY  ID.
EXECUTE.

Why? Table_1 contains everybody and you want to add pre and post data to
that file for every body that has that information.

Gene Maguin

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: keyed table merge not working

Ismail Noor

Hi All,

You need to make sure that none of the table/files has any duplicate cases in them. Then sort both files, same way, by the key variable (here is the ID) before merging.


Ismail K. Noor, Ph.D.,
[hidden email]
http://www.dennonoor.com
(313) 690-0755 (cell)

--- On Tue, 9/15/09, Gene Maguin <[hidden email]> wrote:

From: Gene Maguin <[hidden email]>
Subject: Re: keyed table merge not working
To: [hidden email]
Date: Tuesday, September 15, 2009, 8:46 AM

Christianne,

I think you have the 'table' and 'file' files reversed. Do this

MATCH FILES /file=*
  /table='/Users/cjl/Desktop/table_2'
  /BY  ID.
EXECUTE.

Why? Table_1 contains everybody and you want to add pre and post data to
that file for every body that has that information.

Gene Maguin

=====================
To manage your subscription to SPSSX-L, send a message to
LISTSERV@... (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Reply | Threaded
Open this post in threaded view
|

Re: keyed table merge not working

Richard Ristow
In reply to this post by Christianne Lane
At 07:39 PM 9/14/2009, Christianne Lane wrote:

I am merging two files. The first (table_1) included IDs and randomization information and is my keyed table. The second (table_2) includes pre and post-test data. Some participants dropped out before any testing, so they have data in table_1 but not table_2. When I merge, these participants are removed from the working file.

MATCH FILES /TABLE=*
  /FILE='/Users/cjl/Desktop/table_2'
  /BY  ID.

As ViAnn said, that's how MATCH FILES is meant to work. (An option to include one copy of each unmatched record in the TABLE files would be nice.)

.  If table_2, with pre- and post-test data, has both pre- and post- in the same record, so there's only one record per subject, replace TABLE by FILE and you're all set.

.  Or, here's a way to find and add those table_1 records that didn't match in table_2. It requires that table_1 be available after the MATCH FILES. It uses DATASET logic, but could be modified to use scratch files instead. Tested code:

If this is your table_1:
[table_1]
ID Group

01 Subject
02 Control
03 Control
04 Subject
05 Control
06 Control

Number of cases read:  6    Number of cases listed:  6

and this is the merge, losing some table_1 records:
[merged]
ID Group   Condition Measure

01 Subject Pre         15.5
01 Subject Post        13.1
02 Control Pre          4.7
02 Control Post         5.1
04 Subject Pre          7.9
04 Subject Post        11.2
05 Control Pre         16.6
05 Control Post        17.7

Number of cases read:  8    Number of cases listed:  8

Then:
*  ...  ................................................        ... .
*  ...  Add the unmerged table_1 records:                       ... .
*  ...  ................................................        ... .

*  ...  Create a file of all IDS that *are* matched             ... .

DATASET   DECLARE Matched_IDs.
AGGREGATE OUTFILE=Matched_IDS
   /BREAK   = ID
   /Records = NU.

*  ...  Select those table_1 records that aren't matched:       ... .

DATASET ACTIVATE table_1.
DATASET COPY     Unmatched_IDs.
DATASET ACTIVATE Unmatched_IDs.
MATCH FILES
   /FILE=*
   /FILE=Matched_IDS  /IN=Matched
   /BY  ID.

SELECT IF NOT Matched.
LIST.
|-----------------------------|---------------------------|
|Output Created               |16-SEP-2009 17:50:20       |
|-----------------------------|---------------------------|
[Unmatched_IDs]
ID Group   Records Matched

03 Control       .    0
06 Control       .    0

Number of cases read:  2    Number of cases listed:  2

*  ...  Finally, combine the matched and the unmatched:         ... .

DATASET ACTIVATE merged.
ADD FILES
   /FILE=*
   /FILE=Unmatched_IDs
   /BY   ID
   /DROP=Records Matched.

LIST.
|-----------------------------|---------------------------|
|Output Created               |16-SEP-2009 17:50:21       |
|-----------------------------|---------------------------|
[merged]
ID Group   Condition Measure

01 Subject Pre         15.5
01 Subject Post        13.1
02 Control Pre          4.7
02 Control Post         5.1
03 Control               .
04 Subject Pre          7.9
04 Subject Post        11.2
05 Control Pre         16.6
05 Control Post        17.7
06 Control               .

Number of cases read:  10    Number of cases listed:  10
================================================
APPENDIX: Test file, with test data and all code
================================================
*  C:\Documents and Settings\Richard\My Documents                   .
*    \Technical\spssx-l\Z-2009c\                                    .
*    2009-09-14 Lane - keyed table merge not working.SPS            .

*  In response to posting                                           .
*  Date:    Mon, 14 Sep 2009 19:39:45 -0400                         .
*  From:    Christianne Lane <[hidden email]>                        .
*  Subject: keyed table merge not working                           .
*  To:      [hidden email]                                .

*  "I am merging two files. The first (table_1) included IDs and    .
*  randomization information and is my keyed table. The second      .
*  (table_2) includes pre and post-test data. Some participants     .
*  dropped out before any testing, so they have data in table_1 but .
*  not table_2. When I merge, these participants are removed from   .
*  the working file."                                               .

DATA LIST FREE / ID (F) Group (A7).
BEGIN DATA
1 'Subject' 2 'Control' 3 'Control' 4 'Subject' 5 'Control'
6 'Control'
END DATA.
DATASET NAME table_1.
FORMATS ID (N2).
LIST.

DATA LIST FREE /ID (F) Condition (A4) Measure (F).
BEGIN DATA
1 'Pre' 15.5 1 'Post' 13.1 2 'Pre' 4.7  2 'Post' 5.1
4 'Pre'  7.9 4 'Post' 11.2 5 'Pre' 16.6 5 'Post' 17.7
END DATA.
DATASET NAME table_2.
FORMATS ID(N2)
        Measure (F5.1).
       
LIST.

*  ...  Copy to a separate dataset, for merging                 ... .

DATASET ACTIVATE table_1.
DATASET COPY     merged.
DATASET ACTIVATE merged.

*  ...  Merging code, as posted                                 ... .

MATCH FILES /TABLE=*
  /FILE=table_2
  /BY  ID.
 
LIST.

*  ...  ................................................        ... .
*  ...  Add the unmerged table_1 records:                       ... .
*  ...  ................................................        ... .

*  ...  Create a file of all IDS that *are* matched             ... .

DATASET   DECLARE Matched_IDs.
AGGREGATE OUTFILE=Matched_IDS
   /BREAK   = ID
   /Records = NU.

*  ...  Select those table_1 records that aren't matched:       ... .

DATASET ACTIVATE table_1.
DATASET COPY     Unmatched_IDs.
DATASET ACTIVATE Unmatched_IDs.

MATCH FILES
   /FILE=*
   /FILE=Matched_IDS  /IN=Matched
   /BY  ID.

SELECT IF NOT Matched.
LIST.

*  ...  Finally, combine the matched and the unmatched:         ... .

DATASET ACTIVATE merged.
ADD FILES
   /FILE=*
   /FILE=Unmatched_IDs
   /BY   ID
   /DROP=Records Matched.

LIST.

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: keyed table merge not working

Dennis Deck
In reply to this post by Christianne Lane

A simpler approach would be to first create a master file of all the IDs

and then merge the ID, pre, and post files :
(following assumes PREfile and POSTfile were already defined)

 

*** Create IDfile *** .

 

ADD

  /File=PREfile

  /File=POSTfile

    /Keep= ID .

DATASET NAME IDfile .

 

SORT CASES BY ID .

 

AGGREGATE  Outfile=*  . . . /Break= ID .

 

*alternatively you could find first record with an ID and drop any others .

*COMPUTE Keep = 1 .

*IF ($casenum>1 and ID = LAG(ID)) Keep = 0 .

*SELECT IF Keep = 1 .       

 

EXECUTE .

 

*** Create Combined *** .

 

MATCH FILES

   File= IDfile

  /Table= PREfile  /In=HasPre

  /Table=POSTfile /In=HasPost

     /BY ID .

DATASET NAME Combined .

 

COMPUTE PrePost = HasPre + HasPost*2 .

VALUE LABELS  PrePost   1 ‘Pre  2 ‘Post  3 ‘Pre and Post’ .

 

This eliminates steps, reduces chances for error, and is easier to adapt.

Simple variations are possible (eg, maybe you already have an IDfile and a few cases have neither pre nor post)

 

Dennis Deck

RMC Research


Reply | Threaded
Open this post in threaded view
|

Re: keyed table merge not working

Richard Ristow
At 07:41 PM 9/17/2009, Dennis Deck wrote:

A simpler approach would be to first create a master file of all the IDs and then merge the ID, pre, and post files :

It's a good idea; but it won't work in cases like this one, where one or both of the two files can have multiple records for the same ID:

MATCH FILES
   File= IDfile
  /Table= table_1  /In=In_1
  /Table= table_2  /In=In_2
     /BY ID .
DATASET NAME Combined .
File #3
     KEY:        1

>Error # 5131
>Duplicate key on a TABLE file.  Each case on a TABLE file in
>MATCH FILES must be uniquely identified by the BY variables.
>This command not executed.

=====================
APPENDIX: Code tested
=====================
This uses datasets 'table_1' and 'table_2' that were the test data in my previous posting.

ADD FILES
  /File=table_1
  /File=table_2
    /Keep= ID .
 
AGGREGATE  Outfile=*
     /Break= ID
     /NOccur=NU.

DATASET NAME IDfile .

MATCH FILES
   File= IDfile
  /Table= table_1  /In=In_1
  /Table= table_2  /In=In_2
     /BY ID .
DATASET NAME Combined .

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

FW: keyed table merge not working

Dennis Deck
In reply to this post by Christianne Lane

I was under the impression there were two files, pre and post, with one record per case each.
But a simple tweak handles it:.


I use something similar in an existing application where there are multiple updates collected over time.

All are stored in a single table.  I only want to analyze results for the first and last record so use an approach like:

 

  SORT CASES ID (A) Date (A) .   /* ascending */

  COMPUTE First = 1 .

  IF ($Casenum>1 and ID = LAG(ID)) First = 0 .

 

  SORT CASES ID (A) Date (D) .  /* descending */

  COMPUTE Last = 1 .  

   IF ($Casenum>1 and ID = LAG(ID)) Last = 0 .

 

  SELECT IF (First=1 or Last=1) .

 

Now we have a dataset with just the desired pre and post records.  
Next I create a merged data set joining First and Last records and renaming the variables (eg, XXX0 for pre and XXX1 for post).


However, if pre and post are in separate files already where a few cases have multiple records, then simply compute First/Last in the respective files, keep only the unique records, and then join them.

 

 

Dennis