effective record join

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

effective record join

wsu_wright
I have two files I wish to join/merge in which one file is an effective
term record table for a dimension.  File 1 is a repeating ID file for
every term a faculty member teaches; file 2 is a repeating ID file for
when they received rank.  While I commonly use match files for
join/merges, I'm having trouble with the logic for the effective term
records in file 2.  I haven't tried star join which may work but thought
I would exhaust other options before diving into that. No python please.

Examples of the data would be:

FILE 1 FILE 2
ID term ID term rank
123 200730     123 200510 1
123 201010     123 201130 2
123 201020     123 201710 3
123 201230     456 201010 1
123 201310     456 201230 2
123 201320
456 200910
456 200920
456 201120
456 201130
789 201030
789 201110


the merged file would be:

ID term rank
123 200730 1
123 201010 1
123 201020 1
123 201230 2
123 201310 2
123 201320 2
456 200910
456 200920
456 201120 1
789 201030
789 201110
456 201130 1

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: effective record join

Bruce Weaver
Administrator
A Google search on <SPSS many to many merge> turned up this old thread:

 
http://spssx-discussion.1045642.n5.nabble.com/Many-to-many-merge-in-SPSS-td1081978.html

Perhaps you'll find something useful there--e.g., David's post on
31-Jan-2014.  

HTH.


coxspss wrote

> I have two files I wish to join/merge in which one file is an effective
> term record table for a dimension.  File 1 is a repeating ID file for
> every term a faculty member teaches; file 2 is a repeating ID file for
> when they received rank.  While I commonly use match files for
> join/merges, I'm having trouble with the logic for the effective term
> records in file 2.  I haven't tried star join which may work but thought
> I would exhaust other options before diving into that. No python please.
>
> Examples of the data would be:
>
> FILE 1 FILE 2
> ID term ID term rank
> 123 200730     123 200510 1
> 123 201010     123 201130 2
> 123 201020     123 201710 3
> 123 201230     456 201010 1
> 123 201310     456 201230 2
> 123 201320
> 456 200910
> 456 200920
> 456 201120
> 456 201130
> 789 201030
> 789 201110
>
>
> the merged file would be:
>
> ID term rank
> 123 200730 1
> 123 201010 1
> 123 201020 1
> 123 201230 2
> 123 201310 2
> 123 201320 2
> 456 200910
> 456 200920
> 456 201120 1
> 789 201030
> 789 201110
> 456 201130 1
>
> =====================
> To manage your subscription to SPSSX-L, send a message to

> LISTSERV@.UGA

>  (not to SPSSX-L), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD





-----
--
Bruce Weaver
[hidden email]
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

NOTE: My Hotmail account is not monitored regularly.
To send me an e-mail, please use the address shown above.

--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
Reply | Threaded
Open this post in threaded view
|

Re: effective record join

David Marso-2
In reply to this post by wsu_wright
I suspect ADD FILES following up with conditional logic involving LAG would sort this.

ADD FILES /FILE = file1 / FILE = file2 /BY id term.
IF MISSING (rank) AND id = LAG(id) rank= LAG(rank).

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: effective record join

Rich Ulrich
In reply to this post by wsu_wright
When you match on both ID and Term, you don't necessarily get matches.

But if you create a file where you sort together the ID and Term, including Rank
as another variable, the /appropriate/ rank also belongs to the following lines of the same
ID with only the Term variable. To avoid ambiguity about $SYSMIS, I would
probably assign "Rank = 100" to a dummy variable in the Term file; combine files as
sorted by ID, Term and Rank; then, whenever Rank=100, use LAG( ) to pick up the
previous value from the preceding record, after assuring with LAG( ) that the IDs match.

Here is what the combined file looks like, before replacing 100 with the LAGged values
(assuming I didn't screw up while cobbling this by hand). "Rank= 100" obviously needs to
be fixed to whatever is appropriate for "no Rank assigned, yet".

FILE 1 +  FILE 2         
ID      term     rank
123 200510 1
123 200730 100
123 201010 100
123 201020 100
123 201130 2
123 201230 100
123 201310 100

123 201320 100                                 
123 201710 3   
456 200910 100                     
456 200920 100          

456 201010 1
456 201120 100
456 201130 100

456 201230 2
789 201030 100
 .
 .
--
Rich Ulrich

From: SPSSX(r) Discussion <[hidden email]> on behalf of [hidden email] <[hidden email]>
Sent: Wednesday, September 26, 2018 7:18 AM
To: [hidden email]
Subject: effective record join
 
I have two files I wish to join/merge in which one file is an effective
term record table for a dimension.  File 1 is a repeating ID file for
every term a faculty member teaches; file 2 is a repeating ID file for
when they received rank.  While I commonly use match files for
join/merges, I'm having trouble with the logic for the effective term
records in file 2.  I haven't tried star join which may work but thought
I would exhaust other options before diving into that. No python please.

Examples of the data would be:

FILE 1                  FILE 2         
ID      term            ID      term rank
123 200730     123 200510 1
123 201010     123 201130 2
123 201020     123 201710 3
123 201230     456 201010 1
123 201310     456 201230 2
123 201320                             
456 200910                             
456 200920                             
456 201120                             
456 201130                             
789 201030
789 201110


the merged file would be:

ID      term    rank
123     200730  1
123     201010  1
123     201020  1
123     201230  2
123     201310  2
123     201320  2
456     200910 
456     200920 
456 201120 1
789 201030
789 201110
456 201130 1

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: effective record join

David Marso-2
In reply to this post by wsu_wright
Actually a minor addition.

ADD FILES /FILE = file1 /IN=infile1 / FILE = file2 /BY id term.
IF MISSING (rank) AND id = LAG(id) rank= LAG(rank).
FREQUENCIES rank.
DELETE VARIABLES infile1.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: effective record join

wsu_wright
In reply to this post by wsu_wright
Thanks David, worked like a charm.  I've inserted your syntax into mine
below to make a few tweaks for my job.


DATASET CLOSE ALL.
GET FILE='e:\file2.sav'. /* rank award year.
DATASET NAME file2.
GET FILE='e:\file1.sav'. /* teaching terms.
ADD FILES FILE=* /IN=file1 /FILE=file2 /IN=file2 /BY id term.
DATASET CLOSE file2.
IF MISSING (rank) AND id = LAG(id) rank= LAG(rank).
EXE.
SELECT IF (file1=1). /* retain teaching history records only.
FREQUENCIES rank.
DELETE VARIABLES file1 file2.



On Thu, Sep 27, 2018 at 1:54 AM, David Marso wrote:

> Actually a minor addition.
>
> ADD FILES /FILE = file1 /IN=infile1 / FILE = file2 /BY id term.
> IF MISSING (rank) AND id = LAG(id) rank= LAG(rank).
> FREQUENCIES rank.
> DELETE VARIABLES infile1.
>
> =====================
> To manage your subscription to SPSSX-L, send a message to
> [hidden email] (not to SPSSX-L), with no body text except
> the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: effective record join

David Marso
Administrator
In reply to this post by David Marso-2
Needs

SELECT IF infile1 .

Prior to the FREQ command.
I shouldn't be posting at 3 AM ;-)
----------


David Marso-2 wrote
> Actually a minor addition.
>
> ADD FILES /FILE = file1 /IN=infile1 / FILE = file2 /BY id term.
> IF MISSING (rank) AND id = LAG(id) rank= LAG(rank).
> FREQUENCIES rank.
> DELETE VARIABLES infile1.
>
> =====================
> To manage your subscription to SPSSX-L, send a message to

> LISTSERV@.UGA

>  (not to SPSSX-L), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD





-----
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"