Want to reduce multiple SSN_SCR to one SSN_SCR

classic Classic list List threaded Threaded
15 messages Options
Reply | Threaded
Open this post in threaded view
|

Want to reduce multiple SSN_SCR to one SSN_SCR

Daniel Gerstenfield
I have more than 1 case for each SSN_SCR and I need to reduce it to one
SSN_SCR but retain information in the variables and create new variables.
I want a column for beginning rank (RANK), ending rank (RANK_Change), rank
decreases dichotomized into yes=1 and no=0 (demotions).  My data looks
like the following:

SSN_SCR  RANK
112110    01
112110    02
112110    03
112110    03
112110    04
112110    04
112111    03
112111    01
112111    02
112111    03
112111    04
112112    03
112112    03
112112    04
112112    05

I want the table to look like this
SSN_SCR  RANK  RANK_CHANGE DEMOTIONS
112110    01      04          0
112111    03      04          1
112112    03      05          0

I need help in coming up with a syntax to format my data in this manner
and delete the repeated SSN_SCR.  Please help.

Thank you,

Daniel Gerstenfield

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Want to reduce multiple SSN_SCR to one SSN_SCR

Robert Walker
Dan,

DATA LIST FREE /SSN_SCR RANK.
BEGIN DATA
112110    01
112110    02
112110    03
112110    03
112110    04
112110    04
112111    03
112111    01
112111    02
112111    03
112111    04
112112    03
112112    03
112112    04
112112    05
END DATA.

COMPUTE DEMOTE=0.
IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1.
AGGREGATE OUTFILE=* /BREAK=SSN_SCR /RANK1=FIRST(RANK) /RANK2=LAST(RANK) / DEMOTIONS=MAX(DEMOTE).

HTH,

Bob Walker
Surveys & Forecasts, LLC
www.safllc.com

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Daniel Gerstenfield
Sent: Monday, October 11, 2010 4:53 PM
To: [hidden email]
Subject: Want to reduce multiple SSN_SCR to one SSN_SCR

I have more than 1 case for each SSN_SCR and I need to reduce it to one SSN_SCR but retain information in the variables and create new variables.
I want a column for beginning rank (RANK), ending rank (RANK_Change), rank decreases dichotomized into yes=1 and no=0 (demotions).  My data looks like the following:

SSN_SCR  RANK
112110    01
112110    02
112110    03
112110    03
112110    04
112110    04
112111    03
112111    01
112111    02
112111    03
112111    04
112112    03
112112    03
112112    04
112112    05

I want the table to look like this
SSN_SCR  RANK  RANK_CHANGE DEMOTIONS
112110    01      04          0
112111    03      04          1
112112    03      05          0

I need help in coming up with a syntax to format my data in this manner and delete the repeated SSN_SCR.  Please help.

Thank you,

Daniel Gerstenfield

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Want to reduce multiple SSN_SCR to one SSN_SCR

Daniel Gerstenfield
I attempted to run the syntax but I have over 100,000 cases like this with
more fields than this.  I am a neophyte with syntax and require any other
commands such as exe.

Thank you for your help.

Dan Gerstenfield

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Bob
Walker
Sent: Monday, October 11, 2010 5:42 PM
To: [hidden email]
Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR

Dan,

DATA LIST FREE /SSN_SCR RANK.
BEGIN DATA
112110    01
112110    02
112110    03
112110    03
112110    04
112110    04
112111    03
112111    01
112111    02
112111    03
112111    04
112112    03
112112    03
112112    04
112112    05
END DATA.

COMPUTE DEMOTE=0.
IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1.
AGGREGATE OUTFILE=* /BREAK=SSN_SCR /RANK1=FIRST(RANK) /RANK2=LAST(RANK) /
DEMOTIONS=MAX(DEMOTE).

HTH,

Bob Walker
Surveys & Forecasts, LLC
www.safllc.com

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Daniel Gerstenfield
Sent: Monday, October 11, 2010 4:53 PM
To: [hidden email]
Subject: Want to reduce multiple SSN_SCR to one SSN_SCR

I have more than 1 case for each SSN_SCR and I need to reduce it to one
SSN_SCR but retain information in the variables and create new variables.
I want a column for beginning rank (RANK), ending rank (RANK_Change), rank
decreases dichotomized into yes=1 and no=0 (demotions).  My data looks like
the following:

SSN_SCR  RANK
112110    01
112110    02
112110    03
112110    03
112110    04
112110    04
112111    03
112111    01
112111    02
112111    03
112111    04
112112    03
112112    03
112112    04
112112    05

I want the table to look like this
SSN_SCR  RANK  RANK_CHANGE DEMOTIONS
112110    01      04          0
112111    03      04          1
112112    03      05          0

I need help in coming up with a syntax to format my data in this manner and
delete the repeated SSN_SCR.  Please help.

Thank you,

Daniel Gerstenfield

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command SIGNOFF SPSSX-L For a list of
commands to manage subscriptions, send the command INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Want to reduce multiple SSN_SCR to one SSN_SCR

Maguin, Eugene
Dan,

>>I attempted to run the syntax but I have over 100,000 cases like this with
more fields than this.

Ok. You ran Bob's syntax. What didn't work? Were there errors? Variables
computed incorrectly? Details are needed. The number of cases is almost
certainly irrelevant.

What do you mean by ' ... More fields that this'? How does that fact enter
into the computation that you requested help with?

Gene Maguin


COMPUTE DEMOTE=0.
IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1.
AGGREGATE OUTFILE=* /BREAK=SSN_SCR /RANK1=FIRST(RANK) /RANK2=LAST(RANK) /
DEMOTIONS=MAX(DEMOTE).

HTH,

Bob Walker
Surveys & Forecasts, LLC
www.safllc.com

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Daniel Gerstenfield
Sent: Monday, October 11, 2010 4:53 PM
To: [hidden email]
Subject: Want to reduce multiple SSN_SCR to one SSN_SCR

I have more than 1 case for each SSN_SCR and I need to reduce it to one
SSN_SCR but retain information in the variables and create new variables.
I want a column for beginning rank (RANK), ending rank (RANK_Change), rank
decreases dichotomized into yes=1 and no=0 (demotions).  My data looks like
the following:

SSN_SCR  RANK
112110    01
112110    02
112110    03
112110    03
112110    04
112110    04
112111    03
112111    01
112111    02
112111    03
112111    04
112112    03
112112    03
112112    04
112112    05

I want the table to look like this
SSN_SCR  RANK  RANK_CHANGE DEMOTIONS
112110    01      04          0
112111    03      04          1
112112    03      05          0

I need help in coming up with a syntax to format my data in this manner and
delete the repeated SSN_SCR.  Please help.

Thank you,

Daniel Gerstenfield

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command SIGNOFF SPSSX-L For a list of
commands to manage subscriptions, send the command INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Want to reduce multiple SSN_SCR to one SSN_SCR

Daniel Gerstenfield
When I ran the syntax placing the command exe. at the end to run it, I
received only 4 variables in the output "SSN_SCR," "Rank1," "Rank2," and
"Demotions."  I lost all the other variables.  What is the syntax that would
allow me to keep the other variables?

This is what I got

SSN_SCR      RANK1 RANK2 DEMOTIONS
505381.0         E07     E07    0.0
525462.0                        0.0
528122.0         E01     E04    0.0
530788.0         E01     E02    0.0
546009.0         E03     E05    0.0
549763.0                        0.0
560083.0         E01     E04    0.0
560763.0                        0.0
598286.0                        0.0
605081.0                        0.0
605365.0         E01     E04    0.0
607508.0         E04     E04    1.0
610245.0                        0.0
612805.0                        0.0
613129.0         E02     E02    0.0
614626.0                        0.0
617349.0         E03     E04    0.0
617803.0         E02     E04    0.0

This is what I wanted

SSN_SCR RANK RANK_EFF_DATE DOD_OCC        MOS        AGE  GENDER RACE ETHNIC
AFQT  DEP  MRTL  EDUC HOR_C HOR_S  G_MO   FILE_DATE  D_YR  D_MO  EDUC  SVS
RANK1 RANK2  DEMOTIONS
528122.0         E01     19970221        154200  73C1O           031     F
003      BG      53      01      M       21      US      FL      001
19970331        96.0    2.0     27.0    1.0
530788.0         E01     19970213        110200  31R1O           022     M
999      AL      52      00      N       99      US      WI      001
19970331        96.0    10.0    31.0    1.0

I hope that this is clearer

Thank you,

Dan

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Gene Maguin
Sent: Tuesday, October 12, 2010 11:29 AM
To: [hidden email]
Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR

Dan,

>>I attempted to run the syntax but I have over 100,000 cases like this with
more fields than this.

Ok. You ran Bob's syntax. What didn't work? Were there errors? Variables
computed incorrectly? Details are needed. The number of cases is almost
certainly irrelevant.

What do you mean by ' ... More fields that this'? How does that fact enter
into the computation that you requested help with?

Gene Maguin


COMPUTE DEMOTE=0.
IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1.
AGGREGATE OUTFILE=* /BREAK=SSN_SCR /RANK1=FIRST(RANK) /RANK2=LAST(RANK) /
DEMOTIONS=MAX(DEMOTE).

HTH,

Bob Walker
Surveys & Forecasts, LLC
www.safllc.com

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Daniel Gerstenfield
Sent: Monday, October 11, 2010 4:53 PM
To: [hidden email]
Subject: Want to reduce multiple SSN_SCR to one SSN_SCR

I have more than 1 case for each SSN_SCR and I need to reduce it to one
SSN_SCR but retain information in the variables and create new variables.
I want a column for beginning rank (RANK), ending rank (RANK_Change), rank
decreases dichotomized into yes=1 and no=0 (demotions).  My data looks like
the following:

SSN_SCR  RANK
112110    01
112110    02
112110    03
112110    03
112110    04
112110    04
112111    03
112111    01
112111    02
112111    03
112111    04
112112    03
112112    03
112112    04
112112    05

I want the table to look like this
SSN_SCR  RANK  RANK_CHANGE DEMOTIONS
112110    01      04          0
112111    03      04          1
112112    03      05          0

I need help in coming up with a syntax to format my data in this manner and
delete the repeated SSN_SCR.  Please help.

Thank you,

Daniel Gerstenfield

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command SIGNOFF SPSSX-L For a list of
commands to manage subscriptions, send the command INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Want to reduce multiple SSN_SCR to one SSN_SCR

Robert Walker
In reply to this post by Daniel Gerstenfield
Hi Dan,

Perhaps you could restate the problem? If you have more variables, simply add them to AGGREGATE. Start by looking up on that command.

Regards,

Bob Walker
Surveys & Forecasts, LLC
www.safllc.com

-----Original Message-----
From: Daniel W. Gerstenfield [mailto:[hidden email]]
Sent: Tuesday, October 12, 2010 11:13 AM
To: Bob Walker; [hidden email]
Subject: RE: Want to reduce multiple SSN_SCR to one SSN_SCR

I attempted to run the syntax but I have over 100,000 cases like this with more fields than this.  I am a neophyte with syntax and require any other commands such as exe.

Thank you for your help.

Dan Gerstenfield

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Bob Walker
Sent: Monday, October 11, 2010 5:42 PM
To: [hidden email]
Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR

Dan,

DATA LIST FREE /SSN_SCR RANK.
BEGIN DATA
112110    01
112110    02
112110    03
112110    03
112110    04
112110    04
112111    03
112111    01
112111    02
112111    03
112111    04
112112    03
112112    03
112112    04
112112    05
END DATA.

COMPUTE DEMOTE=0.
IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1.
AGGREGATE OUTFILE=* /BREAK=SSN_SCR /RANK1=FIRST(RANK) /RANK2=LAST(RANK) / DEMOTIONS=MAX(DEMOTE).

HTH,

Bob Walker
Surveys & Forecasts, LLC
www.safllc.com

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Daniel Gerstenfield
Sent: Monday, October 11, 2010 4:53 PM
To: [hidden email]
Subject: Want to reduce multiple SSN_SCR to one SSN_SCR

I have more than 1 case for each SSN_SCR and I need to reduce it to one SSN_SCR but retain information in the variables and create new variables.
I want a column for beginning rank (RANK), ending rank (RANK_Change), rank decreases dichotomized into yes=1 and no=0 (demotions).  My data looks like the following:

SSN_SCR  RANK
112110    01
112110    02
112110    03
112110    03
112110    04
112110    04
112111    03
112111    01
112111    02
112111    03
112111    04
112112    03
112112    03
112112    04
112112    05

I want the table to look like this
SSN_SCR  RANK  RANK_CHANGE DEMOTIONS
112110    01      04          0
112111    03      04          1
112112    03      05          0

I need help in coming up with a syntax to format my data in this manner and delete the repeated SSN_SCR.  Please help.

Thank you,

Daniel Gerstenfield

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Want to reduce multiple SSN_SCR to one SSN_SCR

Daniel Gerstenfield
Bob:  Thank you very much for sharing your knowledge.  I just posted this
but will do so again.

When I ran the syntax without the begin data and end data and placing the
command exe. at the end to run, I received only 4 variables in the output
"SSN_SCR," "Rank1," "Rank2," and "Demotions."  I lost all the other
variables.  What is the syntax that would allow me to keep the other
variables?

This is what I got

SSN_SCR      RANK1 RANK2 DEMOTIONS
505381.0         E07     E07    0.0
525462.0                        0.0
528122.0         E01     E04    0.0
530788.0         E01     E02    0.0
546009.0         E03     E05    0.0
549763.0                        0.0
560083.0         E01     E04    0.0
560763.0                        0.0
598286.0                        0.0
605081.0                        0.0
605365.0         E01     E04    0.0
607508.0         E04     E04    1.0
610245.0                        0.0
612805.0                        0.0
613129.0         E02     E02    0.0
614626.0                        0.0
617349.0         E03     E04    0.0
617803.0         E02     E04    0.0

This is what I wanted (the variables "Rank1," "Rank2," and "Demotions" at
the end have no data attached.

SSN_SCR RANK RANK_EFF_DATE DOD_OCC        MOS        AGE  GENDER RACE ETHNIC
AFQT  DEP  MRTL  EDUC HOR_C HOR_S  G_MO   FILE_DATE  D_YR  D_MO  EDUC  SVS
RANK1 RANK2  DEMOTIONS
528122.0         E01     19970221        154200  73C1O           031     F
003      BG      53      01      M       21      US      FL      001
19970331        96.0    2.0     27.0    1.0  E01    E04    0.0
530788.0         E01     19970213        110200  31R1O           022     M
999      AL      52      00      N       99      US      WI      001
19970331        96.0    10.0    31.0    1.0  E01    E02    0.0

I hope that my question is much clearer.

Thank you,

Dan Gerstenfield

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Bob
Walker
Sent: Tuesday, October 12, 2010 12:26 PM
To: [hidden email]
Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR

Hi Dan,

Perhaps you could restate the problem? If you have more variables, simply
add them to AGGREGATE. Start by looking up on that command.

Regards,

Bob Walker
Surveys & Forecasts, LLC
www.safllc.com

-----Original Message-----
From: Daniel W. Gerstenfield [mailto:[hidden email]]
Sent: Tuesday, October 12, 2010 11:13 AM
To: Bob Walker; [hidden email]
Subject: RE: Want to reduce multiple SSN_SCR to one SSN_SCR

I attempted to run the syntax but I have over 100,000 cases like this with
more fields than this.  I am a neophyte with syntax and require any other
commands such as exe.

Thank you for your help.

Dan Gerstenfield

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Bob
Walker
Sent: Monday, October 11, 2010 5:42 PM
To: [hidden email]
Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR

Dan,

DATA LIST FREE /SSN_SCR RANK.
BEGIN DATA
112110    01
112110    02
112110    03
112110    03
112110    04
112110    04
112111    03
112111    01
112111    02
112111    03
112111    04
112112    03
112112    03
112112    04
112112    05
END DATA.

COMPUTE DEMOTE=0.
IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1.
AGGREGATE OUTFILE=* /BREAK=SSN_SCR /RANK1=FIRST(RANK) /RANK2=LAST(RANK) /
DEMOTIONS=MAX(DEMOTE).

HTH,

Bob Walker
Surveys & Forecasts, LLC
www.safllc.com

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Daniel Gerstenfield
Sent: Monday, October 11, 2010 4:53 PM
To: [hidden email]
Subject: Want to reduce multiple SSN_SCR to one SSN_SCR

I have more than 1 case for each SSN_SCR and I need to reduce it to one
SSN_SCR but retain information in the variables and create new variables.
I want a column for beginning rank (RANK), ending rank (RANK_Change), rank
decreases dichotomized into yes=1 and no=0 (demotions).  My data looks like
the following:

SSN_SCR  RANK
112110    01
112110    02
112110    03
112110    03
112110    04
112110    04
112111    03
112111    01
112111    02
112111    03
112111    04
112112    03
112112    03
112112    04
112112    05

I want the table to look like this
SSN_SCR  RANK  RANK_CHANGE DEMOTIONS
112110    01      04          0
112111    03      04          1
112112    03      05          0

I need help in coming up with a syntax to format my data in this manner and
delete the repeated SSN_SCR.  Please help.

Thank you,

Daniel Gerstenfield

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command SIGNOFF SPSSX-L For a list of
commands to manage subscriptions, send the command INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command SIGNOFF SPSSX-L For a list of
commands to manage subscriptions, send the command INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Want to reduce multiple SSN_SCR to one SSN_SCR

Maguin, Eugene
In reply to this post by Daniel Gerstenfield
Yes, that was supposed to happen. You also wanted to keep other variables in
the dataset. But since that information and that requirement was omitted,
....

What your other variables are specifically doesn't matter. All of them will
either be constant, e.g., sex or race/ethnicity, for sets of records with
the same id variable value or changing, e.g., checking account balance. If
they are all constant, the aggregate statement can be modified to keep the
first or last value of each of the variables within each id set. If any are
changing, then that is a problem because you have two choices. 1) you can
use one of the aggregate functions (see the syntax reference for the
function list) to 'summarize' values of the variable. 2) you can change the
problem around by adding your new variables (rank1 rank2 demotions) to every
record in the original dataset.

Gene Maguin






-----Original Message-----
From: Daniel W. Gerstenfield [mailto:[hidden email]]
Sent: Tuesday, October 12, 2010 12:06 PM
To: 'Gene Maguin'; [hidden email]
Subject: RE: Want to reduce multiple SSN_SCR to one SSN_SCR

When I ran the syntax placing the command exe. at the end to run it, I
received only 4 variables in the output "SSN_SCR," "Rank1," "Rank2," and
"Demotions."  I lost all the other variables.  What is the syntax that would
allow me to keep the other variables?

This is what I got

SSN_SCR      RANK1 RANK2 DEMOTIONS
505381.0         E07     E07    0.0
525462.0                        0.0
528122.0         E01     E04    0.0
530788.0         E01     E02    0.0
546009.0         E03     E05    0.0
549763.0                        0.0
560083.0         E01     E04    0.0
560763.0                        0.0
598286.0                        0.0
605081.0                        0.0
605365.0         E01     E04    0.0
607508.0         E04     E04    1.0
610245.0                        0.0
612805.0                        0.0
613129.0         E02     E02    0.0
614626.0                        0.0
617349.0         E03     E04    0.0
617803.0         E02     E04    0.0

This is what I wanted

SSN_SCR RANK RANK_EFF_DATE DOD_OCC        MOS        AGE  GENDER RACE ETHNIC
AFQT  DEP  MRTL  EDUC HOR_C HOR_S  G_MO   FILE_DATE  D_YR  D_MO  EDUC  SVS
RANK1 RANK2  DEMOTIONS
528122.0         E01     19970221        154200  73C1O           031     F
003      BG      53      01      M       21      US      FL      001
19970331        96.0    2.0     27.0    1.0
530788.0         E01     19970213        110200  31R1O           022     M
999      AL      52      00      N       99      US      WI      001
19970331        96.0    10.0    31.0    1.0

I hope that this is clearer

Thank you,

Dan

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Gene Maguin
Sent: Tuesday, October 12, 2010 11:29 AM
To: [hidden email]
Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR

Dan,

>>I attempted to run the syntax but I have over 100,000 cases like this with
more fields than this.

Ok. You ran Bob's syntax. What didn't work? Were there errors? Variables
computed incorrectly? Details are needed. The number of cases is almost
certainly irrelevant.

What do you mean by ' ... More fields that this'? How does that fact enter
into the computation that you requested help with?

Gene Maguin


COMPUTE DEMOTE=0.
IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1.
AGGREGATE OUTFILE=* /BREAK=SSN_SCR /RANK1=FIRST(RANK) /RANK2=LAST(RANK) /
DEMOTIONS=MAX(DEMOTE).

HTH,

Bob Walker
Surveys & Forecasts, LLC
www.safllc.com

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Daniel Gerstenfield
Sent: Monday, October 11, 2010 4:53 PM
To: [hidden email]
Subject: Want to reduce multiple SSN_SCR to one SSN_SCR

I have more than 1 case for each SSN_SCR and I need to reduce it to one
SSN_SCR but retain information in the variables and create new variables.
I want a column for beginning rank (RANK), ending rank (RANK_Change), rank
decreases dichotomized into yes=1 and no=0 (demotions).  My data looks like
the following:

SSN_SCR  RANK
112110    01
112110    02
112110    03
112110    03
112110    04
112110    04
112111    03
112111    01
112111    02
112111    03
112111    04
112112    03
112112    03
112112    04
112112    05

I want the table to look like this
SSN_SCR  RANK  RANK_CHANGE DEMOTIONS
112110    01      04          0
112111    03      04          1
112112    03      05          0

I need help in coming up with a syntax to format my data in this manner and
delete the repeated SSN_SCR.  Please help.

Thank you,

Daniel Gerstenfield

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command SIGNOFF SPSSX-L For a list of
commands to manage subscriptions, send the command INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Want to reduce multiple SSN_SCR to one SSN_SCR

Robert Walker
In reply to this post by Daniel Gerstenfield
Hi Dan,

Gene is right... what exactly are we dealing with here.

- I'm assuming that you have just one master file, correct?
- Do you want to maintain the original file structure (multiple rows per SSN), or do you want to end up with one record (row) per SSN?
- Which variables change values from row to row, and which ones do not (i.e., are constant)?

SSN_SCR
RANK [We know that these values change, but what about the rest?]
RANK_EFF_DATE
DOD_OCC
MOS
AGE
GENDER
RACE
ETHNIC
AFQT
DEP
MRTL
EDUC
HOR_C
HOR_S
G_MO
FILE_DATE
D_YR
D_MO
EDUC
SVS

Bob Walker
Surveys & Forecasts, LLC
www.safllc.com


-----Original Message-----
From: Daniel W. Gerstenfield [mailto:[hidden email]]
Sent: Tuesday, October 12, 2010 12:47 PM
To: Bob Walker; [hidden email]
Subject: RE: Want to reduce multiple SSN_SCR to one SSN_SCR

Bob:  Thank you very much for sharing your knowledge.  I just posted this but will do so again.

When I ran the syntax without the begin data and end data and placing the command exe. at the end to run, I received only 4 variables in the output "SSN_SCR," "Rank1," "Rank2," and "Demotions."  I lost all the other variables.  What is the syntax that would allow me to keep the other variables?

This is what I got

SSN_SCR      RANK1 RANK2 DEMOTIONS
505381.0         E07     E07    0.0
525462.0                        0.0
528122.0         E01     E04    0.0
530788.0         E01     E02    0.0
546009.0         E03     E05    0.0
549763.0                        0.0
560083.0         E01     E04    0.0
560763.0                        0.0
598286.0                        0.0
605081.0                        0.0
605365.0         E01     E04    0.0
607508.0         E04     E04    1.0
610245.0                        0.0
612805.0                        0.0
613129.0         E02     E02    0.0
614626.0                        0.0
617349.0         E03     E04    0.0
617803.0         E02     E04    0.0

This is what I wanted (the variables "Rank1," "Rank2," and "Demotions" at the end have no data attached.

SSN_SCR RANK RANK_EFF_DATE DOD_OCC        MOS        AGE  GENDER RACE ETHNIC
AFQT  DEP  MRTL  EDUC HOR_C HOR_S  G_MO   FILE_DATE  D_YR  D_MO  EDUC  SVS
RANK1 RANK2  DEMOTIONS
528122.0         E01     19970221        154200  73C1O           031     F
003      BG      53      01      M       21      US      FL      001
19970331        96.0    2.0     27.0    1.0  E01    E04    0.0
530788.0         E01     19970213        110200  31R1O           022     M
999      AL      52      00      N       99      US      WI      001
19970331        96.0    10.0    31.0    1.0  E01    E02    0.0

I hope that my question is much clearer.

Thank you,

Dan Gerstenfield

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Bob Walker
Sent: Tuesday, October 12, 2010 12:26 PM
To: [hidden email]
Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR

Hi Dan,

Perhaps you could restate the problem? If you have more variables, simply add them to AGGREGATE. Start by looking up on that command.

Regards,

Bob Walker
Surveys & Forecasts, LLC
www.safllc.com

-----Original Message-----
From: Daniel W. Gerstenfield [mailto:[hidden email]]
Sent: Tuesday, October 12, 2010 11:13 AM
To: Bob Walker; [hidden email]
Subject: RE: Want to reduce multiple SSN_SCR to one SSN_SCR

I attempted to run the syntax but I have over 100,000 cases like this with more fields than this.  I am a neophyte with syntax and require any other commands such as exe.

Thank you for your help.

Dan Gerstenfield

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Bob Walker
Sent: Monday, October 11, 2010 5:42 PM
To: [hidden email]
Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR

Dan,

DATA LIST FREE /SSN_SCR RANK.
BEGIN DATA
112110    01
112110    02
112110    03
112110    03
112110    04
112110    04
112111    03
112111    01
112111    02
112111    03
112111    04
112112    03
112112    03
112112    04
112112    05
END DATA.

COMPUTE DEMOTE=0.
IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1.
AGGREGATE OUTFILE=* /BREAK=SSN_SCR /RANK1=FIRST(RANK) /RANK2=LAST(RANK) / DEMOTIONS=MAX(DEMOTE).

HTH,

Bob Walker
Surveys & Forecasts, LLC
www.safllc.com

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Daniel Gerstenfield
Sent: Monday, October 11, 2010 4:53 PM
To: [hidden email]
Subject: Want to reduce multiple SSN_SCR to one SSN_SCR

I have more than 1 case for each SSN_SCR and I need to reduce it to one SSN_SCR but retain information in the variables and create new variables.
I want a column for beginning rank (RANK), ending rank (RANK_Change), rank decreases dichotomized into yes=1 and no=0 (demotions).  My data looks like the following:

SSN_SCR  RANK
112110    01
112110    02
112110    03
112110    03
112110    04
112110    04
112111    03
112111    01
112111    02
112111    03
112111    04
112112    03
112112    03
112112    04
112112    05

I want the table to look like this
SSN_SCR  RANK  RANK_CHANGE DEMOTIONS
112110    01      04          0
112111    03      04          1
112112    03      05          0

I need help in coming up with a syntax to format my data in this manner and delete the repeated SSN_SCR.  Please help.

Thank you,

Daniel Gerstenfield

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Want to reduce multiple SSN_SCR to one SSN_SCR

Daniel Gerstenfield
Bob & Gene:

I am very glad that ya'll are here.

I have 1 master file. What I want to keep constant are the following
variables from the 1st record per SSN.

The variables to remain constant are:

SSN_SCR
RANK [We know that these values change, but what about the rest?]
RANK_EFF_DATE
DOD_OCC
MOS
AGE (Does change over time but am interested in age for 1st record per SSN)
GENDER
RACE
ETHNIC
AFQT
EDUC
HOR_CNTRY
HOR_STATE
GRADE_MONTHS
FILE_DATE
DOT_YR
DOT_MO
ACC_EDUC
SVS

The variables that would be added to those that could possibly change are:

DEPENDENTS (Would want to know if there is a change in the first record to
the last record by having a Dependents1=1st record and Dependents2=last
record)

MRTL_STAT (Would want to know if the person had a change in marital status
with the value showing a difference from the 1st SSN_SCR value compared to
the last SSN_SCR value per SSN_SCR.  MRTL_STAT1=1st MRTLSTAT.
MRTL_STAT2=Last MRTLSTAT.

Added to the data set would be:
Rank1=rank,
Rank2=last rank in set of ranks among SSN_SCR (the records per SSN show
changes in rank from 1st case to last case per SSN_SCR), and
Demotions is dichotomized as occurrence=1 and non-occurrence=0 where there
is a reduction in rank among the records per SSN_SCR.

I hope that I am more succinct.

Thank you,

Dan Gerstenfield

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Bob
Walker
Sent: Tuesday, October 12, 2010 1:51 PM
To: [hidden email]
Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR

Hi Dan,

Gene is right... what exactly are we dealing with here.

- I'm assuming that you have just one master file, correct?
- Do you want to maintain the original file structure (multiple rows per
SSN), or do you want to end up with one record (row) per SSN?
- Which variables change values from row to row, and which ones do not
(i.e., are constant)?

SSN_SCR
RANK [We know that these values change, but what about the rest?]
RANK_EFF_DATE
DOD_OCC
MOS
AGE
GENDER
RACE
ETHNIC
AFQT
DEP
MRTL
EDUC
HOR_C
HOR_S
G_MO
FILE_DATE
D_YR
D_MO
EDUC
SVS

Bob Walker
Surveys & Forecasts, LLC
www.safllc.com


-----Original Message-----
From: Daniel W. Gerstenfield [mailto:[hidden email]]
Sent: Tuesday, October 12, 2010 12:47 PM
To: Bob Walker; [hidden email]
Subject: RE: Want to reduce multiple SSN_SCR to one SSN_SCR

Bob:  Thank you very much for sharing your knowledge.  I just posted this
but will do so again.

When I ran the syntax without the begin data and end data and placing the
command exe. at the end to run, I received only 4 variables in the output
"SSN_SCR," "Rank1," "Rank2," and "Demotions."  I lost all the other
variables.  What is the syntax that would allow me to keep the other
variables?

This is what I got

SSN_SCR      RANK1 RANK2 DEMOTIONS
505381.0         E07     E07    0.0
525462.0                        0.0
528122.0         E01     E04    0.0
530788.0         E01     E02    0.0
546009.0         E03     E05    0.0
549763.0                        0.0
560083.0         E01     E04    0.0
560763.0                        0.0
598286.0                        0.0
605081.0                        0.0
605365.0         E01     E04    0.0
607508.0         E04     E04    1.0
610245.0                        0.0
612805.0                        0.0
613129.0         E02     E02    0.0
614626.0                        0.0
617349.0         E03     E04    0.0
617803.0         E02     E04    0.0

This is what I wanted (the variables "Rank1," "Rank2," and "Demotions" at
the end have no data attached.

SSN_SCR RANK RANK_EFF_DATE DOD_OCC        MOS        AGE  GENDER RACE ETHNIC
AFQT  DEP  MRTL  EDUC HOR_C HOR_S  G_MO   FILE_DATE  D_YR  D_MO  EDUC  SVS
RANK1 RANK2  DEMOTIONS
528122.0         E01     19970221        154200  73C1O           031     F
003      BG      53      01      M       21      US      FL      001
19970331        96.0    2.0     27.0    1.0  E01    E04    0.0
530788.0         E01     19970213        110200  31R1O           022     M
999      AL      52      00      N       99      US      WI      001
19970331        96.0    10.0    31.0    1.0  E01    E02    0.0

I hope that my question is much clearer.

Thank you,

Dan Gerstenfield

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Bob
Walker
Sent: Tuesday, October 12, 2010 12:26 PM
To: [hidden email]
Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR

Hi Dan,

Perhaps you could restate the problem? If you have more variables, simply
add them to AGGREGATE. Start by looking up on that command.

Regards,

Bob Walker
Surveys & Forecasts, LLC
www.safllc.com

-----Original Message-----
From: Daniel W. Gerstenfield [mailto:[hidden email]]
Sent: Tuesday, October 12, 2010 11:13 AM
To: Bob Walker; [hidden email]
Subject: RE: Want to reduce multiple SSN_SCR to one SSN_SCR

I attempted to run the syntax but I have over 100,000 cases like this with
more fields than this.  I am a neophyte with syntax and require any other
commands such as exe.

Thank you for your help.

Dan Gerstenfield

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Bob
Walker
Sent: Monday, October 11, 2010 5:42 PM
To: [hidden email]
Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR

Dan,

DATA LIST FREE /SSN_SCR RANK.
BEGIN DATA
112110    01
112110    02
112110    03
112110    03
112110    04
112110    04
112111    03
112111    01
112111    02
112111    03
112111    04
112112    03
112112    03
112112    04
112112    05
END DATA.

COMPUTE DEMOTE=0.
IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1.
AGGREGATE OUTFILE=* /BREAK=SSN_SCR /RANK1=FIRST(RANK) /RANK2=LAST(RANK) /
DEMOTIONS=MAX(DEMOTE).

HTH,

Bob Walker
Surveys & Forecasts, LLC
www.safllc.com

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Daniel Gerstenfield
Sent: Monday, October 11, 2010 4:53 PM
To: [hidden email]
Subject: Want to reduce multiple SSN_SCR to one SSN_SCR

I have more than 1 case for each SSN_SCR and I need to reduce it to one
SSN_SCR but retain information in the variables and create new variables.
I want a column for beginning rank (RANK), ending rank (RANK_Change), rank
decreases dichotomized into yes=1 and no=0 (demotions).  My data looks like
the following:

SSN_SCR  RANK
112110    01
112110    02
112110    03
112110    03
112110    04
112110    04
112111    03
112111    01
112111    02
112111    03
112111    04
112112    03
112112    03
112112    04
112112    05

I want the table to look like this
SSN_SCR  RANK  RANK_CHANGE DEMOTIONS
112110    01      04          0
112111    03      04          1
112112    03      05          0

I need help in coming up with a syntax to format my data in this manner and
delete the repeated SSN_SCR.  Please help.

Thank you,

Daniel Gerstenfield

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command SIGNOFF SPSSX-L For a list of
commands to manage subscriptions, send the command INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command SIGNOFF SPSSX-L For a list of
commands to manage subscriptions, send the command INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command SIGNOFF SPSSX-L For a list of
commands to manage subscriptions, send the command INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Want to reduce multiple SSN_SCR to one SSN_SCR

Robert Walker
Hi Dan,

You would need to do something like this (not tested)...

COMPUTE DEMOTE=0.
IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1.
AGGREGATE OUTFILE=* /BREAK=SSN_SCR
 /RANK_EFF_DATE=FIRST(RANK_EFF_DATE)
 /DOD_OCC=FIRST(DOD_OCC) /MOS=FIRST(MOS) /AGE=FIRST(AGE) /GENDER=FIRST(GENDER)
 /RACE=FIRST(RACE) /ETHNIC=FIRST(ETHNIC) /AFQT=FIRST(AFQT) /EDUC=FIRST(EDUC)
 /HOR_C=FIRST(HOR_C) /HOR_S=FIRST(HOR_S) /G_MO=FIRST(G_MO)
 /FILE_DATE=FIRST(FILE_DATE) /D_YR=FIRST(D_YR) /D_MO=FIRST(D_MO) /EDUC=FIRST(EDUC)
 /DEPENDENTS1=FIRST(DEP) /DEPENDENTS2=LAST(DEP)
 /MARITAL1=FIRST(MRTL) /MARITAL2=LAST(MRTL)
 /RANK1=FIRST(RANK) /RANK2=LAST(RANK) /DEMOTIONS=MAX(DEMOTE).

The last three lines pick up the first and last record values for DEP, MRTL, AND RANK. And DEMOTIONS works the same way as before.

FYI, you're not really doing any "aggregating" in any of this -- you are simply taking advantage of functions available in AGGREGATE. The FIRST and LAST functions retain the variable type and formatting, too.

HTH,

Bob Walker
Surveys & Forecasts, LLC
www.safllc.com

-----Original Message-----
From: Daniel W. Gerstenfield [mailto:[hidden email]]
Sent: Tuesday, October 12, 2010 6:10 PM
To: Bob Walker; [hidden email]
Subject: RE: Want to reduce multiple SSN_SCR to one SSN_SCR

Bob & Gene:

I am very glad that ya'll are here.

I have 1 master file. What I want to keep constant are the following variables from the 1st record per SSN.

The variables to remain constant are:

SSN_SCR
RANK [We know that these values change, but what about the rest?] RANK_EFF_DATE DOD_OCC MOS AGE (Does change over time but am interested in age for 1st record per SSN) GENDER RACE ETHNIC AFQT EDUC HOR_CNTRY HOR_STATE GRADE_MONTHS FILE_DATE DOT_YR DOT_MO ACC_EDUC SVS

The variables that would be added to those that could possibly change are:

DEPENDENTS (Would want to know if there is a change in the first record to the last record by having a Dependents1=1st record and Dependents2=last
record)

MRTL_STAT (Would want to know if the person had a change in marital status with the value showing a difference from the 1st SSN_SCR value compared to the last SSN_SCR value per SSN_SCR.  MRTL_STAT1=1st MRTLSTAT.
MRTL_STAT2=Last MRTLSTAT.

Added to the data set would be:
Rank1=rank,
Rank2=last rank in set of ranks among SSN_SCR (the records per SSN show changes in rank from 1st case to last case per SSN_SCR), and Demotions is dichotomized as occurrence=1 and non-occurrence=0 where there is a reduction in rank among the records per SSN_SCR.

I hope that I am more succinct.

Thank you,

Dan Gerstenfield

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Bob Walker
Sent: Tuesday, October 12, 2010 1:51 PM
To: [hidden email]
Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR

Hi Dan,

Gene is right... what exactly are we dealing with here.

- I'm assuming that you have just one master file, correct?
- Do you want to maintain the original file structure (multiple rows per SSN), or do you want to end up with one record (row) per SSN?
- Which variables change values from row to row, and which ones do not (i.e., are constant)?

SSN_SCR
RANK [We know that these values change, but what about the rest?] RANK_EFF_DATE DOD_OCC MOS AGE GENDER RACE ETHNIC AFQT DEP MRTL EDUC HOR_C HOR_S G_MO FILE_DATE D_YR D_MO EDUC SVS

Bob Walker
Surveys & Forecasts, LLC
www.safllc.com


-----Original Message-----
From: Daniel W. Gerstenfield [mailto:[hidden email]]
Sent: Tuesday, October 12, 2010 12:47 PM
To: Bob Walker; [hidden email]
Subject: RE: Want to reduce multiple SSN_SCR to one SSN_SCR

Bob:  Thank you very much for sharing your knowledge.  I just posted this but will do so again.

When I ran the syntax without the begin data and end data and placing the command exe. at the end to run, I received only 4 variables in the output "SSN_SCR," "Rank1," "Rank2," and "Demotions."  I lost all the other variables.  What is the syntax that would allow me to keep the other variables?

This is what I got

SSN_SCR      RANK1 RANK2 DEMOTIONS
505381.0         E07     E07    0.0
525462.0                        0.0
528122.0         E01     E04    0.0
530788.0         E01     E02    0.0
546009.0         E03     E05    0.0
549763.0                        0.0
560083.0         E01     E04    0.0
560763.0                        0.0
598286.0                        0.0
605081.0                        0.0
605365.0         E01     E04    0.0
607508.0         E04     E04    1.0
610245.0                        0.0
612805.0                        0.0
613129.0         E02     E02    0.0
614626.0                        0.0
617349.0         E03     E04    0.0
617803.0         E02     E04    0.0

This is what I wanted (the variables "Rank1," "Rank2," and "Demotions" at the end have no data attached.

SSN_SCR RANK RANK_EFF_DATE DOD_OCC        MOS        AGE  GENDER RACE ETHNIC
AFQT  DEP  MRTL  EDUC HOR_C HOR_S  G_MO   FILE_DATE  D_YR  D_MO  EDUC  SVS
RANK1 RANK2  DEMOTIONS
528122.0         E01     19970221        154200  73C1O           031     F
003      BG      53      01      M       21      US      FL      001
19970331        96.0    2.0     27.0    1.0  E01    E04    0.0
530788.0         E01     19970213        110200  31R1O           022     M
999      AL      52      00      N       99      US      WI      001
19970331        96.0    10.0    31.0    1.0  E01    E02    0.0

I hope that my question is much clearer.

Thank you,

Dan Gerstenfield

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Bob Walker
Sent: Tuesday, October 12, 2010 12:26 PM
To: [hidden email]
Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR

Hi Dan,

Perhaps you could restate the problem? If you have more variables, simply add them to AGGREGATE. Start by looking up on that command.

Regards,

Bob Walker
Surveys & Forecasts, LLC
www.safllc.com

-----Original Message-----
From: Daniel W. Gerstenfield [mailto:[hidden email]]
Sent: Tuesday, October 12, 2010 11:13 AM
To: Bob Walker; [hidden email]
Subject: RE: Want to reduce multiple SSN_SCR to one SSN_SCR

I attempted to run the syntax but I have over 100,000 cases like this with more fields than this.  I am a neophyte with syntax and require any other commands such as exe.

Thank you for your help.

Dan Gerstenfield

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Bob Walker
Sent: Monday, October 11, 2010 5:42 PM
To: [hidden email]
Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR

Dan,

DATA LIST FREE /SSN_SCR RANK.
BEGIN DATA
112110    01
112110    02
112110    03
112110    03
112110    04
112110    04
112111    03
112111    01
112111    02
112111    03
112111    04
112112    03
112112    03
112112    04
112112    05
END DATA.

COMPUTE DEMOTE=0.
IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1.
AGGREGATE OUTFILE=* /BREAK=SSN_SCR /RANK1=FIRST(RANK) /RANK2=LAST(RANK) / DEMOTIONS=MAX(DEMOTE).

HTH,

Bob Walker
Surveys & Forecasts, LLC
www.safllc.com

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Daniel Gerstenfield
Sent: Monday, October 11, 2010 4:53 PM
To: [hidden email]
Subject: Want to reduce multiple SSN_SCR to one SSN_SCR

I have more than 1 case for each SSN_SCR and I need to reduce it to one SSN_SCR but retain information in the variables and create new variables.
I want a column for beginning rank (RANK), ending rank (RANK_Change), rank decreases dichotomized into yes=1 and no=0 (demotions).  My data looks like the following:

SSN_SCR  RANK
112110    01
112110    02
112110    03
112110    03
112110    04
112110    04
112111    03
112111    01
112111    02
112111    03
112111    04
112112    03
112112    03
112112    04
112112    05

I want the table to look like this
SSN_SCR  RANK  RANK_CHANGE DEMOTIONS
112110    01      04          0
112111    03      04          1
112112    03      05          0

I need help in coming up with a syntax to format my data in this manner and delete the repeated SSN_SCR.  Please help.

Thank you,

Daniel Gerstenfield

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Want to reduce multiple SSN_SCR to one SSN_SCR

David Marso
Administrator
In reply to this post by Daniel Gerstenfield
Bouncing off of Bob's astute observation
">FYI, you're not really doing any "aggregating" in any of this -- you are
simply taking advantage of functions available in AGGREGATE. The FIRST and
LAST functions retain the variable type and formatting, too."

SO!!!...Might as well just do a simple MATCH FILES with FIRST and LAST
operators.  Blast everything else then lag the desired variables from the
first record.

Something like: (Untested - It's late here!- and OP should have the FM
available...OTOH, there are SCADS of examples of MATCH and LAG in the
archives (Google is your friend!!!).

* From Bob's code *.
COMPUTE DEMOTE=0.
IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1.
*----*.

MATCH FILES
    / FILE *
    / BY SSN_SCR
    / FIRST=TOP
    / LAST=BOT.
SELECT IF TOP OR BOT.
You now have only 2 records per case (perhaps 1 if you have a noob).
USE LAG function to drag the things you want from the first record to the
second.
DO IF BOT and NOT(TOP).
+  DO REPEAT V = List of desired variables from first record...If you want
them from both first and second record then those you need to LAG separately
into new unique variables.
+    COMPUTE V=LAG(V).
+  END REPEAT.
END IF.
FREQ something... Why waste a data pass on an EXECUTE?...
SELECT IF BOT.
Now time to drag out the Syntax Manual and review what people have posted,
LAG, AGGREGATE, MATCH (FIRST and LAST operators), DO REPEAT, DO IF.
HTH, David
----------------------
On Tue, 12 Oct 2010 20:45:15 -0400, Bob Walker <[hidden email]> wrote:

>Hi Dan,
>
>You would need to do something like this (not tested)...
>
>COMPUTE DEMOTE=0.
>IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1.
>AGGREGATE OUTFILE=* /BREAK=SSN_SCR
> /RANK_EFF_DATE=FIRST(RANK_EFF_DATE)
> /DOD_OCC=FIRST(DOD_OCC) /MOS=FIRST(MOS) /AGE=FIRST(AGE) /GENDER=FIRST(GENDER)
> /RACE=FIRST(RACE) /ETHNIC=FIRST(ETHNIC) /AFQT=FIRST(AFQT) /EDUC=FIRST(EDUC)
> /HOR_C=FIRST(HOR_C) /HOR_S=FIRST(HOR_S) /G_MO=FIRST(G_MO)
> /FILE_DATE=FIRST(FILE_DATE) /D_YR=FIRST(D_YR) /D_MO=FIRST(D_MO)
/EDUC=FIRST(EDUC)
> /DEPENDENTS1=FIRST(DEP) /DEPENDENTS2=LAST(DEP)
> /MARITAL1=FIRST(MRTL) /MARITAL2=LAST(MRTL)
> /RANK1=FIRST(RANK) /RANK2=LAST(RANK) /DEMOTIONS=MAX(DEMOTE).
>
>The last three lines pick up the first and last record values for DEP,
MRTL, AND RANK. And DEMOTIONS works the same way as before.
>
>FYI, you're not really doing any "aggregating" in any of this -- you are
simply taking advantage of functions available in AGGREGATE. The FIRST and
LAST functions retain the variable type and formatting, too.

>
>HTH,
>
>Bob Walker
>Surveys & Forecasts, LLC
>www.safllc.com
>
>-----Original Message-----
>From: Daniel W. Gerstenfield [mailto:[hidden email]]
>Sent: Tuesday, October 12, 2010 6:10 PM
>To: Bob Walker; [hidden email]
>Subject: RE: Want to reduce multiple SSN_SCR to one SSN_SCR
>
>Bob & Gene:
>
>I am very glad that ya'll are here.
>
>I have 1 master file. What I want to keep constant are the following
variables from the 1st record per SSN.
>
>The variables to remain constant are:
>
>SSN_SCR
>RANK [We know that these values change, but what about the rest?]
RANK_EFF_DATE DOD_OCC MOS AGE (Does change over time but am interested in
age for 1st record per SSN) GENDER RACE ETHNIC AFQT EDUC HOR_CNTRY HOR_STATE
GRADE_MONTHS FILE_DATE DOT_YR DOT_MO ACC_EDUC SVS
>
>The variables that would be added to those that could possibly change are:
>
>DEPENDENTS (Would want to know if there is a change in the first record to
the last record by having a Dependents1=1st record and Dependents2=last
>record)
>
>MRTL_STAT (Would want to know if the person had a change in marital status
with the value showing a difference from the 1st SSN_SCR value compared to
the last SSN_SCR value per SSN_SCR.  MRTL_STAT1=1st MRTLSTAT.
>MRTL_STAT2=Last MRTLSTAT.
>
>Added to the data set would be:
>Rank1=rank,
>Rank2=last rank in set of ranks among SSN_SCR (the records per SSN show
changes in rank from 1st case to last case per SSN_SCR), and Demotions is
dichotomized as occurrence=1 and non-occurrence=0 where there is a reduction
in rank among the records per SSN_SCR.
>
>I hope that I am more succinct.
>
>Thank you,
>
>Dan Gerstenfield
>
>-----Original Message-----
>From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Bob Walker

>Sent: Tuesday, October 12, 2010 1:51 PM
>To: [hidden email]
>Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR
>
>Hi Dan,
>
>Gene is right... what exactly are we dealing with here.
>
>- I'm assuming that you have just one master file, correct?
>- Do you want to maintain the original file structure (multiple rows per
SSN), or do you want to end up with one record (row) per SSN?
>- Which variables change values from row to row, and which ones do not
(i.e., are constant)?
>
>SSN_SCR
>RANK [We know that these values change, but what about the rest?]
RANK_EFF_DATE DOD_OCC MOS AGE GENDER RACE ETHNIC AFQT DEP MRTL EDUC HOR_C
HOR_S G_MO FILE_DATE D_YR D_MO EDUC SVS

>
>Bob Walker
>Surveys & Forecasts, LLC
>www.safllc.com
>
>
>-----Original Message-----
>From: Daniel W. Gerstenfield [mailto:[hidden email]]
>Sent: Tuesday, October 12, 2010 12:47 PM
>To: Bob Walker; [hidden email]
>Subject: RE: Want to reduce multiple SSN_SCR to one SSN_SCR
>
>Bob:  Thank you very much for sharing your knowledge.  I just posted this
but will do so again.
>
>When I ran the syntax without the begin data and end data and placing the
command exe. at the end to run, I received only 4 variables in the output
"SSN_SCR," "Rank1," "Rank2," and "Demotions."  I lost all the other
variables.  What is the syntax that would allow me to keep the other variables?

>
>This is what I got
>
>SSN_SCR      RANK1 RANK2 DEMOTIONS
>505381.0         E07     E07    0.0
>525462.0                        0.0
>528122.0         E01     E04    0.0
>530788.0         E01     E02    0.0
>546009.0         E03     E05    0.0
>549763.0                        0.0
>560083.0         E01     E04    0.0
>560763.0                        0.0
>598286.0                        0.0
>605081.0                        0.0
>605365.0         E01     E04    0.0
>607508.0         E04     E04    1.0
>610245.0                        0.0
>612805.0                        0.0
>613129.0         E02     E02    0.0
>614626.0                        0.0
>617349.0         E03     E04    0.0
>617803.0         E02     E04    0.0
>
>This is what I wanted (the variables "Rank1," "Rank2," and "Demotions" at
the end have no data attached.

>
>SSN_SCR RANK RANK_EFF_DATE DOD_OCC        MOS        AGE  GENDER RACE ETHNIC
>AFQT  DEP  MRTL  EDUC HOR_C HOR_S  G_MO   FILE_DATE  D_YR  D_MO  EDUC  SVS
>RANK1 RANK2  DEMOTIONS
>528122.0         E01     19970221        154200  73C1O           031     F
>003      BG      53      01      M       21      US      FL      001
>19970331        96.0    2.0     27.0    1.0  E01    E04    0.0
>530788.0         E01     19970213        110200  31R1O           022     M
>999      AL      52      00      N       99      US      WI      001
>19970331        96.0    10.0    31.0    1.0  E01    E02    0.0
>
>I hope that my question is much clearer.
>
>Thank you,
>
>Dan Gerstenfield
>
>-----Original Message-----
>From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Bob Walker
>Sent: Tuesday, October 12, 2010 12:26 PM
>To: [hidden email]
>Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR
>
>Hi Dan,
>
>Perhaps you could restate the problem? If you have more variables, simply
add them to AGGREGATE. Start by looking up on that command.

>
>Regards,
>
>Bob Walker
>Surveys & Forecasts, LLC
>www.safllc.com
>
>-----Original Message-----
>From: Daniel W. Gerstenfield [mailto:[hidden email]]
>Sent: Tuesday, October 12, 2010 11:13 AM
>To: Bob Walker; [hidden email]
>Subject: RE: Want to reduce multiple SSN_SCR to one SSN_SCR
>
>I attempted to run the syntax but I have over 100,000 cases like this with
more fields than this.  I am a neophyte with syntax and require any other
commands such as exe.
>
>Thank you for your help.
>
>Dan Gerstenfield
>
>-----Original Message-----
>From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Bob Walker

>Sent: Monday, October 11, 2010 5:42 PM
>To: [hidden email]
>Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR
>
>Dan,
>
>DATA LIST FREE /SSN_SCR RANK.
>BEGIN DATA
>112110    01
>112110    02
>112110    03
>112110    03
>112110    04
>112110    04
>112111    03
>112111    01
>112111    02
>112111    03
>112111    04
>112112    03
>112112    03
>112112    04
>112112    05
>END DATA.
>
>COMPUTE DEMOTE=0.
>IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1.
>AGGREGATE OUTFILE=* /BREAK=SSN_SCR /RANK1=FIRST(RANK) /RANK2=LAST(RANK) /
DEMOTIONS=MAX(DEMOTE).
>
>HTH,
>
>Bob Walker
>Surveys & Forecasts, LLC
>www.safllc.com
>
>-----Original Message-----
>From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Daniel Gerstenfield
>Sent: Monday, October 11, 2010 4:53 PM
>To: [hidden email]
>Subject: Want to reduce multiple SSN_SCR to one SSN_SCR
>
>I have more than 1 case for each SSN_SCR and I need to reduce it to one
SSN_SCR but retain information in the variables and create new variables.
>I want a column for beginning rank (RANK), ending rank (RANK_Change), rank
decreases dichotomized into yes=1 and no=0 (demotions).  My data looks like
the following:

>
>SSN_SCR  RANK
>112110    01
>112110    02
>112110    03
>112110    03
>112110    04
>112110    04
>112111    03
>112111    01
>112111    02
>112111    03
>112111    04
>112112    03
>112112    03
>112112    04
>112112    05
>
>I want the table to look like this
>SSN_SCR  RANK  RANK_CHANGE DEMOTIONS
>112110    01      04          0
>112111    03      04          1
>112112    03      05          0
>
>I need help in coming up with a syntax to format my data in this manner and
delete the repeated SSN_SCR.  Please help.
>
>Thank you,
>
>Daniel Gerstenfield
>
>=====================
>To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command SIGNOFF SPSSX-L For a list of
commands to manage subscriptions, send the command INFO REFCARD
>
>=====================
>To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command SIGNOFF SPSSX-L For a list of
commands to manage subscriptions, send the command INFO REFCARD
>
>=====================
>To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command SIGNOFF SPSSX-L For a list of
commands to manage subscriptions, send the command INFO REFCARD
>
>=====================
>To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command SIGNOFF SPSSX-L For a list of
commands to manage subscriptions, send the command INFO REFCARD
>
>=====================
>To manage your subscription to SPSSX-L, send a message to
>[hidden email] (not to SPSSX-L), with no body text except the
>command. To leave the list, send the command
>SIGNOFF SPSSX-L
>For a list of commands to manage subscriptions, send the command
>INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|

Re: Want to reduce multiple SSN_SCR to one SSN_SCR

Robert Walker
Hey David,

Good idea... using AGGREGATE is verbose, but Dan needs that one variable (DEMOTIONS) to capture reversals on *any* record within each SSN-SCR, so keeping just first/last records using MATCH FILES would miss this.

BTW, I see an error in my earlier example -- I duplicated /EDUC=FIRST(EDUC), so I think that the second occurrence should have been ACC_EDUC=FIRST(ACC_EDUC) or whatever that variable should have been.

Thanks!

Bob Walker
Surveys & Forecasts, LLC
www.safllc.com

-----Original Message-----
From: David Marso [mailto:[hidden email]]
Sent: Wednesday, October 13, 2010 1:23 AM
To: [hidden email]; Bob Walker
Cc: David Marso
Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR

Bouncing off of Bob's astute observation ">FYI, you're not really doing any "aggregating" in any of this -- you are simply taking advantage of functions available in AGGREGATE. The FIRST and LAST functions retain the variable type and formatting, too."

SO!!!...Might as well just do a simple MATCH FILES with FIRST and LAST operators.  Blast everything else then lag the desired variables from the first record.

Something like: (Untested - It's late here!- and OP should have the FM available...OTOH, there are SCADS of examples of MATCH and LAG in the archives (Google is your friend!!!).

* From Bob's code *.
COMPUTE DEMOTE=0.
IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1.
*----*.

MATCH FILES
    / FILE *
    / BY SSN_SCR
    / FIRST=TOP
    / LAST=BOT.
SELECT IF TOP OR BOT.
You now have only 2 records per case (perhaps 1 if you have a noob).
USE LAG function to drag the things you want from the first record to the second.
DO IF BOT and NOT(TOP).
+  DO REPEAT V = List of desired variables from first record...If you
+ want
them from both first and second record then those you need to LAG separately into new unique variables.
+    COMPUTE V=LAG(V).
+  END REPEAT.
END IF.
FREQ something... Why waste a data pass on an EXECUTE?...
SELECT IF BOT.
Now time to drag out the Syntax Manual and review what people have posted, LAG, AGGREGATE, MATCH (FIRST and LAST operators), DO REPEAT, DO IF.
HTH, David
----------------------
On Tue, 12 Oct 2010 20:45:15 -0400, Bob Walker <[hidden email]> wrote:

>Hi Dan,
>
>You would need to do something like this (not tested)...
>
>COMPUTE DEMOTE=0.
>IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1.
>AGGREGATE OUTFILE=* /BREAK=SSN_SCR
> /RANK_EFF_DATE=FIRST(RANK_EFF_DATE)
> /DOD_OCC=FIRST(DOD_OCC) /MOS=FIRST(MOS) /AGE=FIRST(AGE)
>/GENDER=FIRST(GENDER)
> /RACE=FIRST(RACE) /ETHNIC=FIRST(ETHNIC) /AFQT=FIRST(AFQT)
>/EDUC=FIRST(EDUC)
> /HOR_C=FIRST(HOR_C) /HOR_S=FIRST(HOR_S) /G_MO=FIRST(G_MO)
> /FILE_DATE=FIRST(FILE_DATE) /D_YR=FIRST(D_YR) /D_MO=FIRST(D_MO)
/EDUC=FIRST(EDUC)
> /DEPENDENTS1=FIRST(DEP) /DEPENDENTS2=LAST(DEP)
> /MARITAL1=FIRST(MRTL) /MARITAL2=LAST(MRTL)
> /RANK1=FIRST(RANK) /RANK2=LAST(RANK) /DEMOTIONS=MAX(DEMOTE).
>
>The last three lines pick up the first and last record values for DEP,
MRTL, AND RANK. And DEMOTIONS works the same way as before.
>
>FYI, you're not really doing any "aggregating" in any of this -- you
>are
simply taking advantage of functions available in AGGREGATE. The FIRST and LAST functions retain the variable type and formatting, too.

>
>HTH,
>
>Bob Walker
>Surveys & Forecasts, LLC
>www.safllc.com
>
>-----Original Message-----
>From: Daniel W. Gerstenfield [mailto:[hidden email]]
>Sent: Tuesday, October 12, 2010 6:10 PM
>To: Bob Walker; [hidden email]
>Subject: RE: Want to reduce multiple SSN_SCR to one SSN_SCR
>
>Bob & Gene:
>
>I am very glad that ya'll are here.
>
>I have 1 master file. What I want to keep constant are the following
variables from the 1st record per SSN.
>
>The variables to remain constant are:
>
>SSN_SCR
>RANK [We know that these values change, but what about the rest?]
RANK_EFF_DATE DOD_OCC MOS AGE (Does change over time but am interested in age for 1st record per SSN) GENDER RACE ETHNIC AFQT EDUC HOR_CNTRY HOR_STATE GRADE_MONTHS FILE_DATE DOT_YR DOT_MO ACC_EDUC SVS
>
>The variables that would be added to those that could possibly change are:
>
>DEPENDENTS (Would want to know if there is a change in the first record
>to
the last record by having a Dependents1=1st record and Dependents2=last
>record)
>
>MRTL_STAT (Would want to know if the person had a change in marital
>status
with the value showing a difference from the 1st SSN_SCR value compared to the last SSN_SCR value per SSN_SCR.  MRTL_STAT1=1st MRTLSTAT.
>MRTL_STAT2=Last MRTLSTAT.
>
>Added to the data set would be:
>Rank1=rank,
>Rank2=last rank in set of ranks among SSN_SCR (the records per SSN show
changes in rank from 1st case to last case per SSN_SCR), and Demotions is dichotomized as occurrence=1 and non-occurrence=0 where there is a reduction in rank among the records per SSN_SCR.

>
>I hope that I am more succinct.
>
>Thank you,
>
>Dan Gerstenfield
>
>-----Original Message-----
>From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf
>Of
Bob Walker

>Sent: Tuesday, October 12, 2010 1:51 PM
>To: [hidden email]
>Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR
>
>Hi Dan,
>
>Gene is right... what exactly are we dealing with here.
>
>- I'm assuming that you have just one master file, correct?
>- Do you want to maintain the original file structure (multiple rows
>per
SSN), or do you want to end up with one record (row) per SSN?
>- Which variables change values from row to row, and which ones do not
(i.e., are constant)?
>
>SSN_SCR
>RANK [We know that these values change, but what about the rest?]
RANK_EFF_DATE DOD_OCC MOS AGE GENDER RACE ETHNIC AFQT DEP MRTL EDUC HOR_C HOR_S G_MO FILE_DATE D_YR D_MO EDUC SVS

>
>Bob Walker
>Surveys & Forecasts, LLC
>www.safllc.com
>
>
>-----Original Message-----
>From: Daniel W. Gerstenfield [mailto:[hidden email]]
>Sent: Tuesday, October 12, 2010 12:47 PM
>To: Bob Walker; [hidden email]
>Subject: RE: Want to reduce multiple SSN_SCR to one SSN_SCR
>
>Bob:  Thank you very much for sharing your knowledge.  I just posted
>this
but will do so again.
>
>When I ran the syntax without the begin data and end data and placing
>the
command exe. at the end to run, I received only 4 variables in the output "SSN_SCR," "Rank1," "Rank2," and "Demotions."  I lost all the other variables.  What is the syntax that would allow me to keep the other variables?

>
>This is what I got
>
>SSN_SCR      RANK1 RANK2 DEMOTIONS
>505381.0         E07     E07    0.0
>525462.0                        0.0
>528122.0         E01     E04    0.0
>530788.0         E01     E02    0.0
>546009.0         E03     E05    0.0
>549763.0                        0.0
>560083.0         E01     E04    0.0
>560763.0                        0.0
>598286.0                        0.0
>605081.0                        0.0
>605365.0         E01     E04    0.0
>607508.0         E04     E04    1.0
>610245.0                        0.0
>612805.0                        0.0
>613129.0         E02     E02    0.0
>614626.0                        0.0
>617349.0         E03     E04    0.0
>617803.0         E02     E04    0.0
>
>This is what I wanted (the variables "Rank1," "Rank2," and "Demotions"
>at
the end have no data attached.

>
>SSN_SCR RANK RANK_EFF_DATE DOD_OCC        MOS        AGE  GENDER RACE ETHNIC
>AFQT  DEP  MRTL  EDUC HOR_C HOR_S  G_MO   FILE_DATE  D_YR  D_MO  EDUC  SVS
>RANK1 RANK2  DEMOTIONS
>528122.0         E01     19970221        154200  73C1O           031     F
>003      BG      53      01      M       21      US      FL      001
>19970331        96.0    2.0     27.0    1.0  E01    E04    0.0
>530788.0         E01     19970213        110200  31R1O           022     M
>999      AL      52      00      N       99      US      WI      001
>19970331        96.0    10.0    31.0    1.0  E01    E02    0.0
>
>I hope that my question is much clearer.
>
>Thank you,
>
>Dan Gerstenfield
>
>-----Original Message-----
>From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf
>Of
Bob Walker
>Sent: Tuesday, October 12, 2010 12:26 PM
>To: [hidden email]
>Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR
>
>Hi Dan,
>
>Perhaps you could restate the problem? If you have more variables,
>simply
add them to AGGREGATE. Start by looking up on that command.

>
>Regards,
>
>Bob Walker
>Surveys & Forecasts, LLC
>www.safllc.com
>
>-----Original Message-----
>From: Daniel W. Gerstenfield [mailto:[hidden email]]
>Sent: Tuesday, October 12, 2010 11:13 AM
>To: Bob Walker; [hidden email]
>Subject: RE: Want to reduce multiple SSN_SCR to one SSN_SCR
>
>I attempted to run the syntax but I have over 100,000 cases like this
>with
more fields than this.  I am a neophyte with syntax and require any other commands such as exe.
>
>Thank you for your help.
>
>Dan Gerstenfield
>
>-----Original Message-----
>From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf
>Of
Bob Walker

>Sent: Monday, October 11, 2010 5:42 PM
>To: [hidden email]
>Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR
>
>Dan,
>
>DATA LIST FREE /SSN_SCR RANK.
>BEGIN DATA
>112110    01
>112110    02
>112110    03
>112110    03
>112110    04
>112110    04
>112111    03
>112111    01
>112111    02
>112111    03
>112111    04
>112112    03
>112112    03
>112112    04
>112112    05
>END DATA.
>
>COMPUTE DEMOTE=0.
>IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1.
>AGGREGATE OUTFILE=* /BREAK=SSN_SCR /RANK1=FIRST(RANK) /RANK2=LAST(RANK)
>/
DEMOTIONS=MAX(DEMOTE).

>
>HTH,
>
>Bob Walker
>Surveys & Forecasts, LLC
>www.safllc.com
>
>-----Original Message-----
>From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf
>Of
Daniel Gerstenfield
>Sent: Monday, October 11, 2010 4:53 PM
>To: [hidden email]
>Subject: Want to reduce multiple SSN_SCR to one SSN_SCR
>
>I have more than 1 case for each SSN_SCR and I need to reduce it to one
SSN_SCR but retain information in the variables and create new variables.
>I want a column for beginning rank (RANK), ending rank (RANK_Change),
>rank
decreases dichotomized into yes=1 and no=0 (demotions).  My data looks like the following:

>
>SSN_SCR  RANK
>112110    01
>112110    02
>112110    03
>112110    03
>112110    04
>112110    04
>112111    03
>112111    01
>112111    02
>112111    03
>112111    04
>112112    03
>112112    03
>112112    04
>112112    05
>
>I want the table to look like this
>SSN_SCR  RANK  RANK_CHANGE DEMOTIONS
>112110    01      04          0
>112111    03      04          1
>112112    03      05          0
>
>I need help in coming up with a syntax to format my data in this manner
>and
delete the repeated SSN_SCR.  Please help.
>
>Thank you,
>
>Daniel Gerstenfield
>
>=====================
>To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
>
>=====================
>To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
>
>=====================
>To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
>
>=====================
>To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
>
>=====================
>To manage your subscription to SPSSX-L, send a message to
>[hidden email] (not to SPSSX-L), with no body text except
>the command. To leave the list, send the command SIGNOFF SPSSX-L For a
>list of commands to manage subscriptions, send the command INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Want to reduce multiple SSN_SCR to one SSN_SCR

David Marso
Administrator
In reply to this post by Daniel Gerstenfield
Good point Bob,
I need to stop posting after midnight ;-)
We can do it as follows: LEAVE is the silver bullet here.
IF SSN_SCR NE LAG(SSN_SCR)  DEMOTIONS=0.
DO IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK).
COMPUTE DEMOTIONS=1.
LEAVE DEMOTIONS.
END IF .
MATCH FILES / FILE * / BY SSN_SCR  / FIRST=TOP / LAST=BOT.
HTH, David
--
On Wed, 13 Oct 2010 11:14:54 -0400, Bob Walker <[hidden email]> wrote:

>Hey David,
>
>Good idea... using AGGREGATE is verbose, but Dan needs that one variable
(DEMOTIONS) to capture reversals on *any* record within each SSN-SCR, so
keeping just first/last records using MATCH FILES would miss this.
>
>BTW, I see an error in my earlier example -- I duplicated
/EDUC=FIRST(EDUC), so I think that the second occurrence should have been
ACC_EDUC=FIRST(ACC_EDUC) or whatever that variable should have been.

>
>Thanks!
>
>Bob Walker
>Surveys & Forecasts, LLC
>www.safllc.com
>
>-----Original Message-----
>From: David Marso [mailto:[hidden email]]
>Sent: Wednesday, October 13, 2010 1:23 AM
>To: [hidden email]; Bob Walker
>Cc: David Marso
>Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR
>
>Bouncing off of Bob's astute observation ">FYI, you're not really doing any
"aggregating" in any of this -- you are simply taking advantage of functions
available in AGGREGATE. The FIRST and LAST functions retain the variable
type and formatting, too."
>
>SO!!!...Might as well just do a simple MATCH FILES with FIRST and LAST
operators.  Blast everything else then lag the desired variables from the
first record.
>
>Something like: (Untested - It's late here!- and OP should have the FM
available...OTOH, there are SCADS of examples of MATCH and LAG in the
archives (Google is your friend!!!).

>
>* From Bob's code *.
>COMPUTE DEMOTE=0.
>IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1.
>*----*.
>
>MATCH FILES
>    / FILE *
>    / BY SSN_SCR
>    / FIRST=TOP
>    / LAST=BOT.
>SELECT IF TOP OR BOT.
>You now have only 2 records per case (perhaps 1 if you have a noob).
>USE LAG function to drag the things you want from the first record to the
second.
>DO IF BOT and NOT(TOP).
>+  DO REPEAT V = List of desired variables from first record...If you
>+ want
>them from both first and second record then those you need to LAG
separately into new unique variables.
>+    COMPUTE V=LAG(V).
>+  END REPEAT.
>END IF.
>FREQ something... Why waste a data pass on an EXECUTE?...
>SELECT IF BOT.
>Now time to drag out the Syntax Manual and review what people have posted,
LAG, AGGREGATE, MATCH (FIRST and LAST operators), DO REPEAT, DO IF.

>HTH, David
>----------------------
>On Tue, 12 Oct 2010 20:45:15 -0400, Bob Walker <[hidden email]> wrote:
>>Hi Dan,
>>
>>You would need to do something like this (not tested)...
>>
>>COMPUTE DEMOTE=0.
>>IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1.
>>AGGREGATE OUTFILE=* /BREAK=SSN_SCR
>> /RANK_EFF_DATE=FIRST(RANK_EFF_DATE)
>> /DOD_OCC=FIRST(DOD_OCC) /MOS=FIRST(MOS) /AGE=FIRST(AGE)
>>/GENDER=FIRST(GENDER)
>> /RACE=FIRST(RACE) /ETHNIC=FIRST(ETHNIC) /AFQT=FIRST(AFQT)
>>/EDUC=FIRST(EDUC)
>> /HOR_C=FIRST(HOR_C) /HOR_S=FIRST(HOR_S) /G_MO=FIRST(G_MO)
>> /FILE_DATE=FIRST(FILE_DATE) /D_YR=FIRST(D_YR) /D_MO=FIRST(D_MO)
>/EDUC=FIRST(EDUC)
>> /DEPENDENTS1=FIRST(DEP) /DEPENDENTS2=LAST(DEP)
>> /MARITAL1=FIRST(MRTL) /MARITAL2=LAST(MRTL)
>> /RANK1=FIRST(RANK) /RANK2=LAST(RANK) /DEMOTIONS=MAX(DEMOTE).
>>
>>The last three lines pick up the first and last record values for DEP,
>MRTL, AND RANK. And DEMOTIONS works the same way as before.
>>
>>FYI, you're not really doing any "aggregating" in any of this -- you
>>are
>simply taking advantage of functions available in AGGREGATE. The FIRST and
LAST functions retain the variable type and formatting, too.
>>
>>HTH,
>>
>>Bob Walker
>>Surveys & Forecasts, LLC
>>www.safllc.com

>><SNIP>

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|

Re: Want to reduce multiple SSN_SCR to one SSN_SCR

Daniel Gerstenfield
David & Bob:

I am matching what I did by hand with the syntax that ya'll have been
extremely helpful in providing.  I think it does what I want it to do.  I am
amazed at the power of this program and those that have the extensive
experience in working with it.  Ya'll make it look easy.  I am just on the
front end of this learning curve but look forward to becoming one that can
share the knowledge and help others like me.

Thank you,

Dan

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
David Marso
Sent: Wednesday, October 13, 2010 1:16 PM
To: [hidden email]
Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR

Good point Bob,
I need to stop posting after midnight ;-)
We can do it as follows: LEAVE is the silver bullet here.
IF SSN_SCR NE LAG(SSN_SCR)  DEMOTIONS=0.
DO IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK).
COMPUTE DEMOTIONS=1.
LEAVE DEMOTIONS.
END IF .
MATCH FILES / FILE * / BY SSN_SCR  / FIRST=TOP / LAST=BOT.
HTH, David
--
On Wed, 13 Oct 2010 11:14:54 -0400, Bob Walker <[hidden email]> wrote:

>Hey David,
>
>Good idea... using AGGREGATE is verbose, but Dan needs that one variable
(DEMOTIONS) to capture reversals on *any* record within each SSN-SCR, so
keeping just first/last records using MATCH FILES would miss this.
>
>BTW, I see an error in my earlier example -- I duplicated
/EDUC=FIRST(EDUC), so I think that the second occurrence should have been
ACC_EDUC=FIRST(ACC_EDUC) or whatever that variable should have been.

>
>Thanks!
>
>Bob Walker
>Surveys & Forecasts, LLC
>www.safllc.com
>
>-----Original Message-----
>From: David Marso [mailto:[hidden email]]
>Sent: Wednesday, October 13, 2010 1:23 AM
>To: [hidden email]; Bob Walker
>Cc: David Marso
>Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR
>
>Bouncing off of Bob's astute observation ">FYI, you're not really doing any
"aggregating" in any of this -- you are simply taking advantage of functions
available in AGGREGATE. The FIRST and LAST functions retain the variable
type and formatting, too."
>
>SO!!!...Might as well just do a simple MATCH FILES with FIRST and LAST
operators.  Blast everything else then lag the desired variables from the
first record.
>
>Something like: (Untested - It's late here!- and OP should have the FM
available...OTOH, there are SCADS of examples of MATCH and LAG in the
archives (Google is your friend!!!).

>
>* From Bob's code *.
>COMPUTE DEMOTE=0.
>IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1.
>*----*.
>
>MATCH FILES
>    / FILE *
>    / BY SSN_SCR
>    / FIRST=TOP
>    / LAST=BOT.
>SELECT IF TOP OR BOT.
>You now have only 2 records per case (perhaps 1 if you have a noob).
>USE LAG function to drag the things you want from the first record to the
second.
>DO IF BOT and NOT(TOP).
>+  DO REPEAT V = List of desired variables from first record...If you
>+ want
>them from both first and second record then those you need to LAG
separately into new unique variables.
>+    COMPUTE V=LAG(V).
>+  END REPEAT.
>END IF.
>FREQ something... Why waste a data pass on an EXECUTE?...
>SELECT IF BOT.
>Now time to drag out the Syntax Manual and review what people have posted,
LAG, AGGREGATE, MATCH (FIRST and LAST operators), DO REPEAT, DO IF.

>HTH, David
>----------------------
>On Tue, 12 Oct 2010 20:45:15 -0400, Bob Walker <[hidden email]> wrote:
>>Hi Dan,
>>
>>You would need to do something like this (not tested)...
>>
>>COMPUTE DEMOTE=0.
>>IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1.
>>AGGREGATE OUTFILE=* /BREAK=SSN_SCR
>> /RANK_EFF_DATE=FIRST(RANK_EFF_DATE)
>> /DOD_OCC=FIRST(DOD_OCC) /MOS=FIRST(MOS) /AGE=FIRST(AGE)
>>/GENDER=FIRST(GENDER)
>> /RACE=FIRST(RACE) /ETHNIC=FIRST(ETHNIC) /AFQT=FIRST(AFQT)
>>/EDUC=FIRST(EDUC)
>> /HOR_C=FIRST(HOR_C) /HOR_S=FIRST(HOR_S) /G_MO=FIRST(G_MO)
>> /FILE_DATE=FIRST(FILE_DATE) /D_YR=FIRST(D_YR) /D_MO=FIRST(D_MO)
>/EDUC=FIRST(EDUC)
>> /DEPENDENTS1=FIRST(DEP) /DEPENDENTS2=LAST(DEP)
>> /MARITAL1=FIRST(MRTL) /MARITAL2=LAST(MRTL)
>> /RANK1=FIRST(RANK) /RANK2=LAST(RANK) /DEMOTIONS=MAX(DEMOTE).
>>
>>The last three lines pick up the first and last record values for DEP,
>MRTL, AND RANK. And DEMOTIONS works the same way as before.
>>
>>FYI, you're not really doing any "aggregating" in any of this -- you
>>are
>simply taking advantage of functions available in AGGREGATE. The FIRST and
LAST functions retain the variable type and formatting, too.
>>
>>HTH,
>>
>>Bob Walker
>>Surveys & Forecasts, LLC
>>www.safllc.com

>><SNIP>

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD