|
I have more than 1 case for each SSN_SCR and I need to reduce it to one
SSN_SCR but retain information in the variables and create new variables. I want a column for beginning rank (RANK), ending rank (RANK_Change), rank decreases dichotomized into yes=1 and no=0 (demotions). My data looks like the following: SSN_SCR RANK 112110 01 112110 02 112110 03 112110 03 112110 04 112110 04 112111 03 112111 01 112111 02 112111 03 112111 04 112112 03 112112 03 112112 04 112112 05 I want the table to look like this SSN_SCR RANK RANK_CHANGE DEMOTIONS 112110 01 04 0 112111 03 04 1 112112 03 05 0 I need help in coming up with a syntax to format my data in this manner and delete the repeated SSN_SCR. Please help. Thank you, Daniel Gerstenfield ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
Dan,
DATA LIST FREE /SSN_SCR RANK. BEGIN DATA 112110 01 112110 02 112110 03 112110 03 112110 04 112110 04 112111 03 112111 01 112111 02 112111 03 112111 04 112112 03 112112 03 112112 04 112112 05 END DATA. COMPUTE DEMOTE=0. IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1. AGGREGATE OUTFILE=* /BREAK=SSN_SCR /RANK1=FIRST(RANK) /RANK2=LAST(RANK) / DEMOTIONS=MAX(DEMOTE). HTH, Bob Walker Surveys & Forecasts, LLC www.safllc.com -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Daniel Gerstenfield Sent: Monday, October 11, 2010 4:53 PM To: [hidden email] Subject: Want to reduce multiple SSN_SCR to one SSN_SCR I have more than 1 case for each SSN_SCR and I need to reduce it to one SSN_SCR but retain information in the variables and create new variables. I want a column for beginning rank (RANK), ending rank (RANK_Change), rank decreases dichotomized into yes=1 and no=0 (demotions). My data looks like the following: SSN_SCR RANK 112110 01 112110 02 112110 03 112110 03 112110 04 112110 04 112111 03 112111 01 112111 02 112111 03 112111 04 112112 03 112112 03 112112 04 112112 05 I want the table to look like this SSN_SCR RANK RANK_CHANGE DEMOTIONS 112110 01 04 0 112111 03 04 1 112112 03 05 0 I need help in coming up with a syntax to format my data in this manner and delete the repeated SSN_SCR. Please help. Thank you, Daniel Gerstenfield ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
I attempted to run the syntax but I have over 100,000 cases like this with
more fields than this. I am a neophyte with syntax and require any other commands such as exe. Thank you for your help. Dan Gerstenfield -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Bob Walker Sent: Monday, October 11, 2010 5:42 PM To: [hidden email] Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR Dan, DATA LIST FREE /SSN_SCR RANK. BEGIN DATA 112110 01 112110 02 112110 03 112110 03 112110 04 112110 04 112111 03 112111 01 112111 02 112111 03 112111 04 112112 03 112112 03 112112 04 112112 05 END DATA. COMPUTE DEMOTE=0. IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1. AGGREGATE OUTFILE=* /BREAK=SSN_SCR /RANK1=FIRST(RANK) /RANK2=LAST(RANK) / DEMOTIONS=MAX(DEMOTE). HTH, Bob Walker Surveys & Forecasts, LLC www.safllc.com -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Daniel Gerstenfield Sent: Monday, October 11, 2010 4:53 PM To: [hidden email] Subject: Want to reduce multiple SSN_SCR to one SSN_SCR I have more than 1 case for each SSN_SCR and I need to reduce it to one SSN_SCR but retain information in the variables and create new variables. I want a column for beginning rank (RANK), ending rank (RANK_Change), rank decreases dichotomized into yes=1 and no=0 (demotions). My data looks like the following: SSN_SCR RANK 112110 01 112110 02 112110 03 112110 03 112110 04 112110 04 112111 03 112111 01 112111 02 112111 03 112111 04 112112 03 112112 03 112112 04 112112 05 I want the table to look like this SSN_SCR RANK RANK_CHANGE DEMOTIONS 112110 01 04 0 112111 03 04 1 112112 03 05 0 I need help in coming up with a syntax to format my data in this manner and delete the repeated SSN_SCR. Please help. Thank you, Daniel Gerstenfield ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
Dan,
>>I attempted to run the syntax but I have over 100,000 cases like this with more fields than this. Ok. You ran Bob's syntax. What didn't work? Were there errors? Variables computed incorrectly? Details are needed. The number of cases is almost certainly irrelevant. What do you mean by ' ... More fields that this'? How does that fact enter into the computation that you requested help with? Gene Maguin COMPUTE DEMOTE=0. IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1. AGGREGATE OUTFILE=* /BREAK=SSN_SCR /RANK1=FIRST(RANK) /RANK2=LAST(RANK) / DEMOTIONS=MAX(DEMOTE). HTH, Bob Walker Surveys & Forecasts, LLC www.safllc.com -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Daniel Gerstenfield Sent: Monday, October 11, 2010 4:53 PM To: [hidden email] Subject: Want to reduce multiple SSN_SCR to one SSN_SCR I have more than 1 case for each SSN_SCR and I need to reduce it to one SSN_SCR but retain information in the variables and create new variables. I want a column for beginning rank (RANK), ending rank (RANK_Change), rank decreases dichotomized into yes=1 and no=0 (demotions). My data looks like the following: SSN_SCR RANK 112110 01 112110 02 112110 03 112110 03 112110 04 112110 04 112111 03 112111 01 112111 02 112111 03 112111 04 112112 03 112112 03 112112 04 112112 05 I want the table to look like this SSN_SCR RANK RANK_CHANGE DEMOTIONS 112110 01 04 0 112111 03 04 1 112112 03 05 0 I need help in coming up with a syntax to format my data in this manner and delete the repeated SSN_SCR. Please help. Thank you, Daniel Gerstenfield ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
When I ran the syntax placing the command exe. at the end to run it, I
received only 4 variables in the output "SSN_SCR," "Rank1," "Rank2," and "Demotions." I lost all the other variables. What is the syntax that would allow me to keep the other variables? This is what I got SSN_SCR RANK1 RANK2 DEMOTIONS 505381.0 E07 E07 0.0 525462.0 0.0 528122.0 E01 E04 0.0 530788.0 E01 E02 0.0 546009.0 E03 E05 0.0 549763.0 0.0 560083.0 E01 E04 0.0 560763.0 0.0 598286.0 0.0 605081.0 0.0 605365.0 E01 E04 0.0 607508.0 E04 E04 1.0 610245.0 0.0 612805.0 0.0 613129.0 E02 E02 0.0 614626.0 0.0 617349.0 E03 E04 0.0 617803.0 E02 E04 0.0 This is what I wanted SSN_SCR RANK RANK_EFF_DATE DOD_OCC MOS AGE GENDER RACE ETHNIC AFQT DEP MRTL EDUC HOR_C HOR_S G_MO FILE_DATE D_YR D_MO EDUC SVS RANK1 RANK2 DEMOTIONS 528122.0 E01 19970221 154200 73C1O 031 F 003 BG 53 01 M 21 US FL 001 19970331 96.0 2.0 27.0 1.0 530788.0 E01 19970213 110200 31R1O 022 M 999 AL 52 00 N 99 US WI 001 19970331 96.0 10.0 31.0 1.0 I hope that this is clearer Thank you, Dan -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Gene Maguin Sent: Tuesday, October 12, 2010 11:29 AM To: [hidden email] Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR Dan, >>I attempted to run the syntax but I have over 100,000 cases like this with more fields than this. Ok. You ran Bob's syntax. What didn't work? Were there errors? Variables computed incorrectly? Details are needed. The number of cases is almost certainly irrelevant. What do you mean by ' ... More fields that this'? How does that fact enter into the computation that you requested help with? Gene Maguin COMPUTE DEMOTE=0. IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1. AGGREGATE OUTFILE=* /BREAK=SSN_SCR /RANK1=FIRST(RANK) /RANK2=LAST(RANK) / DEMOTIONS=MAX(DEMOTE). HTH, Bob Walker Surveys & Forecasts, LLC www.safllc.com -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Daniel Gerstenfield Sent: Monday, October 11, 2010 4:53 PM To: [hidden email] Subject: Want to reduce multiple SSN_SCR to one SSN_SCR I have more than 1 case for each SSN_SCR and I need to reduce it to one SSN_SCR but retain information in the variables and create new variables. I want a column for beginning rank (RANK), ending rank (RANK_Change), rank decreases dichotomized into yes=1 and no=0 (demotions). My data looks like the following: SSN_SCR RANK 112110 01 112110 02 112110 03 112110 03 112110 04 112110 04 112111 03 112111 01 112111 02 112111 03 112111 04 112112 03 112112 03 112112 04 112112 05 I want the table to look like this SSN_SCR RANK RANK_CHANGE DEMOTIONS 112110 01 04 0 112111 03 04 1 112112 03 05 0 I need help in coming up with a syntax to format my data in this manner and delete the repeated SSN_SCR. Please help. Thank you, Daniel Gerstenfield ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
In reply to this post by Daniel Gerstenfield
Hi Dan,
Perhaps you could restate the problem? If you have more variables, simply add them to AGGREGATE. Start by looking up on that command. Regards, Bob Walker Surveys & Forecasts, LLC www.safllc.com -----Original Message----- From: Daniel W. Gerstenfield [mailto:[hidden email]] Sent: Tuesday, October 12, 2010 11:13 AM To: Bob Walker; [hidden email] Subject: RE: Want to reduce multiple SSN_SCR to one SSN_SCR I attempted to run the syntax but I have over 100,000 cases like this with more fields than this. I am a neophyte with syntax and require any other commands such as exe. Thank you for your help. Dan Gerstenfield -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Bob Walker Sent: Monday, October 11, 2010 5:42 PM To: [hidden email] Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR Dan, DATA LIST FREE /SSN_SCR RANK. BEGIN DATA 112110 01 112110 02 112110 03 112110 03 112110 04 112110 04 112111 03 112111 01 112111 02 112111 03 112111 04 112112 03 112112 03 112112 04 112112 05 END DATA. COMPUTE DEMOTE=0. IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1. AGGREGATE OUTFILE=* /BREAK=SSN_SCR /RANK1=FIRST(RANK) /RANK2=LAST(RANK) / DEMOTIONS=MAX(DEMOTE). HTH, Bob Walker Surveys & Forecasts, LLC www.safllc.com -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Daniel Gerstenfield Sent: Monday, October 11, 2010 4:53 PM To: [hidden email] Subject: Want to reduce multiple SSN_SCR to one SSN_SCR I have more than 1 case for each SSN_SCR and I need to reduce it to one SSN_SCR but retain information in the variables and create new variables. I want a column for beginning rank (RANK), ending rank (RANK_Change), rank decreases dichotomized into yes=1 and no=0 (demotions). My data looks like the following: SSN_SCR RANK 112110 01 112110 02 112110 03 112110 03 112110 04 112110 04 112111 03 112111 01 112111 02 112111 03 112111 04 112112 03 112112 03 112112 04 112112 05 I want the table to look like this SSN_SCR RANK RANK_CHANGE DEMOTIONS 112110 01 04 0 112111 03 04 1 112112 03 05 0 I need help in coming up with a syntax to format my data in this manner and delete the repeated SSN_SCR. Please help. Thank you, Daniel Gerstenfield ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
Bob: Thank you very much for sharing your knowledge. I just posted this
but will do so again. When I ran the syntax without the begin data and end data and placing the command exe. at the end to run, I received only 4 variables in the output "SSN_SCR," "Rank1," "Rank2," and "Demotions." I lost all the other variables. What is the syntax that would allow me to keep the other variables? This is what I got SSN_SCR RANK1 RANK2 DEMOTIONS 505381.0 E07 E07 0.0 525462.0 0.0 528122.0 E01 E04 0.0 530788.0 E01 E02 0.0 546009.0 E03 E05 0.0 549763.0 0.0 560083.0 E01 E04 0.0 560763.0 0.0 598286.0 0.0 605081.0 0.0 605365.0 E01 E04 0.0 607508.0 E04 E04 1.0 610245.0 0.0 612805.0 0.0 613129.0 E02 E02 0.0 614626.0 0.0 617349.0 E03 E04 0.0 617803.0 E02 E04 0.0 This is what I wanted (the variables "Rank1," "Rank2," and "Demotions" at the end have no data attached. SSN_SCR RANK RANK_EFF_DATE DOD_OCC MOS AGE GENDER RACE ETHNIC AFQT DEP MRTL EDUC HOR_C HOR_S G_MO FILE_DATE D_YR D_MO EDUC SVS RANK1 RANK2 DEMOTIONS 528122.0 E01 19970221 154200 73C1O 031 F 003 BG 53 01 M 21 US FL 001 19970331 96.0 2.0 27.0 1.0 E01 E04 0.0 530788.0 E01 19970213 110200 31R1O 022 M 999 AL 52 00 N 99 US WI 001 19970331 96.0 10.0 31.0 1.0 E01 E02 0.0 I hope that my question is much clearer. Thank you, Dan Gerstenfield -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Bob Walker Sent: Tuesday, October 12, 2010 12:26 PM To: [hidden email] Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR Hi Dan, Perhaps you could restate the problem? If you have more variables, simply add them to AGGREGATE. Start by looking up on that command. Regards, Bob Walker Surveys & Forecasts, LLC www.safllc.com -----Original Message----- From: Daniel W. Gerstenfield [mailto:[hidden email]] Sent: Tuesday, October 12, 2010 11:13 AM To: Bob Walker; [hidden email] Subject: RE: Want to reduce multiple SSN_SCR to one SSN_SCR I attempted to run the syntax but I have over 100,000 cases like this with more fields than this. I am a neophyte with syntax and require any other commands such as exe. Thank you for your help. Dan Gerstenfield -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Bob Walker Sent: Monday, October 11, 2010 5:42 PM To: [hidden email] Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR Dan, DATA LIST FREE /SSN_SCR RANK. BEGIN DATA 112110 01 112110 02 112110 03 112110 03 112110 04 112110 04 112111 03 112111 01 112111 02 112111 03 112111 04 112112 03 112112 03 112112 04 112112 05 END DATA. COMPUTE DEMOTE=0. IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1. AGGREGATE OUTFILE=* /BREAK=SSN_SCR /RANK1=FIRST(RANK) /RANK2=LAST(RANK) / DEMOTIONS=MAX(DEMOTE). HTH, Bob Walker Surveys & Forecasts, LLC www.safllc.com -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Daniel Gerstenfield Sent: Monday, October 11, 2010 4:53 PM To: [hidden email] Subject: Want to reduce multiple SSN_SCR to one SSN_SCR I have more than 1 case for each SSN_SCR and I need to reduce it to one SSN_SCR but retain information in the variables and create new variables. I want a column for beginning rank (RANK), ending rank (RANK_Change), rank decreases dichotomized into yes=1 and no=0 (demotions). My data looks like the following: SSN_SCR RANK 112110 01 112110 02 112110 03 112110 03 112110 04 112110 04 112111 03 112111 01 112111 02 112111 03 112111 04 112112 03 112112 03 112112 04 112112 05 I want the table to look like this SSN_SCR RANK RANK_CHANGE DEMOTIONS 112110 01 04 0 112111 03 04 1 112112 03 05 0 I need help in coming up with a syntax to format my data in this manner and delete the repeated SSN_SCR. Please help. Thank you, Daniel Gerstenfield ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
In reply to this post by Daniel Gerstenfield
Yes, that was supposed to happen. You also wanted to keep other variables in
the dataset. But since that information and that requirement was omitted, .... What your other variables are specifically doesn't matter. All of them will either be constant, e.g., sex or race/ethnicity, for sets of records with the same id variable value or changing, e.g., checking account balance. If they are all constant, the aggregate statement can be modified to keep the first or last value of each of the variables within each id set. If any are changing, then that is a problem because you have two choices. 1) you can use one of the aggregate functions (see the syntax reference for the function list) to 'summarize' values of the variable. 2) you can change the problem around by adding your new variables (rank1 rank2 demotions) to every record in the original dataset. Gene Maguin -----Original Message----- From: Daniel W. Gerstenfield [mailto:[hidden email]] Sent: Tuesday, October 12, 2010 12:06 PM To: 'Gene Maguin'; [hidden email] Subject: RE: Want to reduce multiple SSN_SCR to one SSN_SCR When I ran the syntax placing the command exe. at the end to run it, I received only 4 variables in the output "SSN_SCR," "Rank1," "Rank2," and "Demotions." I lost all the other variables. What is the syntax that would allow me to keep the other variables? This is what I got SSN_SCR RANK1 RANK2 DEMOTIONS 505381.0 E07 E07 0.0 525462.0 0.0 528122.0 E01 E04 0.0 530788.0 E01 E02 0.0 546009.0 E03 E05 0.0 549763.0 0.0 560083.0 E01 E04 0.0 560763.0 0.0 598286.0 0.0 605081.0 0.0 605365.0 E01 E04 0.0 607508.0 E04 E04 1.0 610245.0 0.0 612805.0 0.0 613129.0 E02 E02 0.0 614626.0 0.0 617349.0 E03 E04 0.0 617803.0 E02 E04 0.0 This is what I wanted SSN_SCR RANK RANK_EFF_DATE DOD_OCC MOS AGE GENDER RACE ETHNIC AFQT DEP MRTL EDUC HOR_C HOR_S G_MO FILE_DATE D_YR D_MO EDUC SVS RANK1 RANK2 DEMOTIONS 528122.0 E01 19970221 154200 73C1O 031 F 003 BG 53 01 M 21 US FL 001 19970331 96.0 2.0 27.0 1.0 530788.0 E01 19970213 110200 31R1O 022 M 999 AL 52 00 N 99 US WI 001 19970331 96.0 10.0 31.0 1.0 I hope that this is clearer Thank you, Dan -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Gene Maguin Sent: Tuesday, October 12, 2010 11:29 AM To: [hidden email] Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR Dan, >>I attempted to run the syntax but I have over 100,000 cases like this with more fields than this. Ok. You ran Bob's syntax. What didn't work? Were there errors? Variables computed incorrectly? Details are needed. The number of cases is almost certainly irrelevant. What do you mean by ' ... More fields that this'? How does that fact enter into the computation that you requested help with? Gene Maguin COMPUTE DEMOTE=0. IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1. AGGREGATE OUTFILE=* /BREAK=SSN_SCR /RANK1=FIRST(RANK) /RANK2=LAST(RANK) / DEMOTIONS=MAX(DEMOTE). HTH, Bob Walker Surveys & Forecasts, LLC www.safllc.com -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Daniel Gerstenfield Sent: Monday, October 11, 2010 4:53 PM To: [hidden email] Subject: Want to reduce multiple SSN_SCR to one SSN_SCR I have more than 1 case for each SSN_SCR and I need to reduce it to one SSN_SCR but retain information in the variables and create new variables. I want a column for beginning rank (RANK), ending rank (RANK_Change), rank decreases dichotomized into yes=1 and no=0 (demotions). My data looks like the following: SSN_SCR RANK 112110 01 112110 02 112110 03 112110 03 112110 04 112110 04 112111 03 112111 01 112111 02 112111 03 112111 04 112112 03 112112 03 112112 04 112112 05 I want the table to look like this SSN_SCR RANK RANK_CHANGE DEMOTIONS 112110 01 04 0 112111 03 04 1 112112 03 05 0 I need help in coming up with a syntax to format my data in this manner and delete the repeated SSN_SCR. Please help. Thank you, Daniel Gerstenfield ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
In reply to this post by Daniel Gerstenfield
Hi Dan,
Gene is right... what exactly are we dealing with here. - I'm assuming that you have just one master file, correct? - Do you want to maintain the original file structure (multiple rows per SSN), or do you want to end up with one record (row) per SSN? - Which variables change values from row to row, and which ones do not (i.e., are constant)? SSN_SCR RANK [We know that these values change, but what about the rest?] RANK_EFF_DATE DOD_OCC MOS AGE GENDER RACE ETHNIC AFQT DEP MRTL EDUC HOR_C HOR_S G_MO FILE_DATE D_YR D_MO EDUC SVS Bob Walker Surveys & Forecasts, LLC www.safllc.com -----Original Message----- From: Daniel W. Gerstenfield [mailto:[hidden email]] Sent: Tuesday, October 12, 2010 12:47 PM To: Bob Walker; [hidden email] Subject: RE: Want to reduce multiple SSN_SCR to one SSN_SCR Bob: Thank you very much for sharing your knowledge. I just posted this but will do so again. When I ran the syntax without the begin data and end data and placing the command exe. at the end to run, I received only 4 variables in the output "SSN_SCR," "Rank1," "Rank2," and "Demotions." I lost all the other variables. What is the syntax that would allow me to keep the other variables? This is what I got SSN_SCR RANK1 RANK2 DEMOTIONS 505381.0 E07 E07 0.0 525462.0 0.0 528122.0 E01 E04 0.0 530788.0 E01 E02 0.0 546009.0 E03 E05 0.0 549763.0 0.0 560083.0 E01 E04 0.0 560763.0 0.0 598286.0 0.0 605081.0 0.0 605365.0 E01 E04 0.0 607508.0 E04 E04 1.0 610245.0 0.0 612805.0 0.0 613129.0 E02 E02 0.0 614626.0 0.0 617349.0 E03 E04 0.0 617803.0 E02 E04 0.0 This is what I wanted (the variables "Rank1," "Rank2," and "Demotions" at the end have no data attached. SSN_SCR RANK RANK_EFF_DATE DOD_OCC MOS AGE GENDER RACE ETHNIC AFQT DEP MRTL EDUC HOR_C HOR_S G_MO FILE_DATE D_YR D_MO EDUC SVS RANK1 RANK2 DEMOTIONS 528122.0 E01 19970221 154200 73C1O 031 F 003 BG 53 01 M 21 US FL 001 19970331 96.0 2.0 27.0 1.0 E01 E04 0.0 530788.0 E01 19970213 110200 31R1O 022 M 999 AL 52 00 N 99 US WI 001 19970331 96.0 10.0 31.0 1.0 E01 E02 0.0 I hope that my question is much clearer. Thank you, Dan Gerstenfield -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Bob Walker Sent: Tuesday, October 12, 2010 12:26 PM To: [hidden email] Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR Hi Dan, Perhaps you could restate the problem? If you have more variables, simply add them to AGGREGATE. Start by looking up on that command. Regards, Bob Walker Surveys & Forecasts, LLC www.safllc.com -----Original Message----- From: Daniel W. Gerstenfield [mailto:[hidden email]] Sent: Tuesday, October 12, 2010 11:13 AM To: Bob Walker; [hidden email] Subject: RE: Want to reduce multiple SSN_SCR to one SSN_SCR I attempted to run the syntax but I have over 100,000 cases like this with more fields than this. I am a neophyte with syntax and require any other commands such as exe. Thank you for your help. Dan Gerstenfield -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Bob Walker Sent: Monday, October 11, 2010 5:42 PM To: [hidden email] Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR Dan, DATA LIST FREE /SSN_SCR RANK. BEGIN DATA 112110 01 112110 02 112110 03 112110 03 112110 04 112110 04 112111 03 112111 01 112111 02 112111 03 112111 04 112112 03 112112 03 112112 04 112112 05 END DATA. COMPUTE DEMOTE=0. IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1. AGGREGATE OUTFILE=* /BREAK=SSN_SCR /RANK1=FIRST(RANK) /RANK2=LAST(RANK) / DEMOTIONS=MAX(DEMOTE). HTH, Bob Walker Surveys & Forecasts, LLC www.safllc.com -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Daniel Gerstenfield Sent: Monday, October 11, 2010 4:53 PM To: [hidden email] Subject: Want to reduce multiple SSN_SCR to one SSN_SCR I have more than 1 case for each SSN_SCR and I need to reduce it to one SSN_SCR but retain information in the variables and create new variables. I want a column for beginning rank (RANK), ending rank (RANK_Change), rank decreases dichotomized into yes=1 and no=0 (demotions). My data looks like the following: SSN_SCR RANK 112110 01 112110 02 112110 03 112110 03 112110 04 112110 04 112111 03 112111 01 112111 02 112111 03 112111 04 112112 03 112112 03 112112 04 112112 05 I want the table to look like this SSN_SCR RANK RANK_CHANGE DEMOTIONS 112110 01 04 0 112111 03 04 1 112112 03 05 0 I need help in coming up with a syntax to format my data in this manner and delete the repeated SSN_SCR. Please help. Thank you, Daniel Gerstenfield ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
Bob & Gene:
I am very glad that ya'll are here. I have 1 master file. What I want to keep constant are the following variables from the 1st record per SSN. The variables to remain constant are: SSN_SCR RANK [We know that these values change, but what about the rest?] RANK_EFF_DATE DOD_OCC MOS AGE (Does change over time but am interested in age for 1st record per SSN) GENDER RACE ETHNIC AFQT EDUC HOR_CNTRY HOR_STATE GRADE_MONTHS FILE_DATE DOT_YR DOT_MO ACC_EDUC SVS The variables that would be added to those that could possibly change are: DEPENDENTS (Would want to know if there is a change in the first record to the last record by having a Dependents1=1st record and Dependents2=last record) MRTL_STAT (Would want to know if the person had a change in marital status with the value showing a difference from the 1st SSN_SCR value compared to the last SSN_SCR value per SSN_SCR. MRTL_STAT1=1st MRTLSTAT. MRTL_STAT2=Last MRTLSTAT. Added to the data set would be: Rank1=rank, Rank2=last rank in set of ranks among SSN_SCR (the records per SSN show changes in rank from 1st case to last case per SSN_SCR), and Demotions is dichotomized as occurrence=1 and non-occurrence=0 where there is a reduction in rank among the records per SSN_SCR. I hope that I am more succinct. Thank you, Dan Gerstenfield -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Bob Walker Sent: Tuesday, October 12, 2010 1:51 PM To: [hidden email] Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR Hi Dan, Gene is right... what exactly are we dealing with here. - I'm assuming that you have just one master file, correct? - Do you want to maintain the original file structure (multiple rows per SSN), or do you want to end up with one record (row) per SSN? - Which variables change values from row to row, and which ones do not (i.e., are constant)? SSN_SCR RANK [We know that these values change, but what about the rest?] RANK_EFF_DATE DOD_OCC MOS AGE GENDER RACE ETHNIC AFQT DEP MRTL EDUC HOR_C HOR_S G_MO FILE_DATE D_YR D_MO EDUC SVS Bob Walker Surveys & Forecasts, LLC www.safllc.com -----Original Message----- From: Daniel W. Gerstenfield [mailto:[hidden email]] Sent: Tuesday, October 12, 2010 12:47 PM To: Bob Walker; [hidden email] Subject: RE: Want to reduce multiple SSN_SCR to one SSN_SCR Bob: Thank you very much for sharing your knowledge. I just posted this but will do so again. When I ran the syntax without the begin data and end data and placing the command exe. at the end to run, I received only 4 variables in the output "SSN_SCR," "Rank1," "Rank2," and "Demotions." I lost all the other variables. What is the syntax that would allow me to keep the other variables? This is what I got SSN_SCR RANK1 RANK2 DEMOTIONS 505381.0 E07 E07 0.0 525462.0 0.0 528122.0 E01 E04 0.0 530788.0 E01 E02 0.0 546009.0 E03 E05 0.0 549763.0 0.0 560083.0 E01 E04 0.0 560763.0 0.0 598286.0 0.0 605081.0 0.0 605365.0 E01 E04 0.0 607508.0 E04 E04 1.0 610245.0 0.0 612805.0 0.0 613129.0 E02 E02 0.0 614626.0 0.0 617349.0 E03 E04 0.0 617803.0 E02 E04 0.0 This is what I wanted (the variables "Rank1," "Rank2," and "Demotions" at the end have no data attached. SSN_SCR RANK RANK_EFF_DATE DOD_OCC MOS AGE GENDER RACE ETHNIC AFQT DEP MRTL EDUC HOR_C HOR_S G_MO FILE_DATE D_YR D_MO EDUC SVS RANK1 RANK2 DEMOTIONS 528122.0 E01 19970221 154200 73C1O 031 F 003 BG 53 01 M 21 US FL 001 19970331 96.0 2.0 27.0 1.0 E01 E04 0.0 530788.0 E01 19970213 110200 31R1O 022 M 999 AL 52 00 N 99 US WI 001 19970331 96.0 10.0 31.0 1.0 E01 E02 0.0 I hope that my question is much clearer. Thank you, Dan Gerstenfield -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Bob Walker Sent: Tuesday, October 12, 2010 12:26 PM To: [hidden email] Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR Hi Dan, Perhaps you could restate the problem? If you have more variables, simply add them to AGGREGATE. Start by looking up on that command. Regards, Bob Walker Surveys & Forecasts, LLC www.safllc.com -----Original Message----- From: Daniel W. Gerstenfield [mailto:[hidden email]] Sent: Tuesday, October 12, 2010 11:13 AM To: Bob Walker; [hidden email] Subject: RE: Want to reduce multiple SSN_SCR to one SSN_SCR I attempted to run the syntax but I have over 100,000 cases like this with more fields than this. I am a neophyte with syntax and require any other commands such as exe. Thank you for your help. Dan Gerstenfield -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Bob Walker Sent: Monday, October 11, 2010 5:42 PM To: [hidden email] Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR Dan, DATA LIST FREE /SSN_SCR RANK. BEGIN DATA 112110 01 112110 02 112110 03 112110 03 112110 04 112110 04 112111 03 112111 01 112111 02 112111 03 112111 04 112112 03 112112 03 112112 04 112112 05 END DATA. COMPUTE DEMOTE=0. IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1. AGGREGATE OUTFILE=* /BREAK=SSN_SCR /RANK1=FIRST(RANK) /RANK2=LAST(RANK) / DEMOTIONS=MAX(DEMOTE). HTH, Bob Walker Surveys & Forecasts, LLC www.safllc.com -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Daniel Gerstenfield Sent: Monday, October 11, 2010 4:53 PM To: [hidden email] Subject: Want to reduce multiple SSN_SCR to one SSN_SCR I have more than 1 case for each SSN_SCR and I need to reduce it to one SSN_SCR but retain information in the variables and create new variables. I want a column for beginning rank (RANK), ending rank (RANK_Change), rank decreases dichotomized into yes=1 and no=0 (demotions). My data looks like the following: SSN_SCR RANK 112110 01 112110 02 112110 03 112110 03 112110 04 112110 04 112111 03 112111 01 112111 02 112111 03 112111 04 112112 03 112112 03 112112 04 112112 05 I want the table to look like this SSN_SCR RANK RANK_CHANGE DEMOTIONS 112110 01 04 0 112111 03 04 1 112112 03 05 0 I need help in coming up with a syntax to format my data in this manner and delete the repeated SSN_SCR. Please help. Thank you, Daniel Gerstenfield ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
Hi Dan,
You would need to do something like this (not tested)... COMPUTE DEMOTE=0. IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1. AGGREGATE OUTFILE=* /BREAK=SSN_SCR /RANK_EFF_DATE=FIRST(RANK_EFF_DATE) /DOD_OCC=FIRST(DOD_OCC) /MOS=FIRST(MOS) /AGE=FIRST(AGE) /GENDER=FIRST(GENDER) /RACE=FIRST(RACE) /ETHNIC=FIRST(ETHNIC) /AFQT=FIRST(AFQT) /EDUC=FIRST(EDUC) /HOR_C=FIRST(HOR_C) /HOR_S=FIRST(HOR_S) /G_MO=FIRST(G_MO) /FILE_DATE=FIRST(FILE_DATE) /D_YR=FIRST(D_YR) /D_MO=FIRST(D_MO) /EDUC=FIRST(EDUC) /DEPENDENTS1=FIRST(DEP) /DEPENDENTS2=LAST(DEP) /MARITAL1=FIRST(MRTL) /MARITAL2=LAST(MRTL) /RANK1=FIRST(RANK) /RANK2=LAST(RANK) /DEMOTIONS=MAX(DEMOTE). The last three lines pick up the first and last record values for DEP, MRTL, AND RANK. And DEMOTIONS works the same way as before. FYI, you're not really doing any "aggregating" in any of this -- you are simply taking advantage of functions available in AGGREGATE. The FIRST and LAST functions retain the variable type and formatting, too. HTH, Bob Walker Surveys & Forecasts, LLC www.safllc.com -----Original Message----- From: Daniel W. Gerstenfield [mailto:[hidden email]] Sent: Tuesday, October 12, 2010 6:10 PM To: Bob Walker; [hidden email] Subject: RE: Want to reduce multiple SSN_SCR to one SSN_SCR Bob & Gene: I am very glad that ya'll are here. I have 1 master file. What I want to keep constant are the following variables from the 1st record per SSN. The variables to remain constant are: SSN_SCR RANK [We know that these values change, but what about the rest?] RANK_EFF_DATE DOD_OCC MOS AGE (Does change over time but am interested in age for 1st record per SSN) GENDER RACE ETHNIC AFQT EDUC HOR_CNTRY HOR_STATE GRADE_MONTHS FILE_DATE DOT_YR DOT_MO ACC_EDUC SVS The variables that would be added to those that could possibly change are: DEPENDENTS (Would want to know if there is a change in the first record to the last record by having a Dependents1=1st record and Dependents2=last record) MRTL_STAT (Would want to know if the person had a change in marital status with the value showing a difference from the 1st SSN_SCR value compared to the last SSN_SCR value per SSN_SCR. MRTL_STAT1=1st MRTLSTAT. MRTL_STAT2=Last MRTLSTAT. Added to the data set would be: Rank1=rank, Rank2=last rank in set of ranks among SSN_SCR (the records per SSN show changes in rank from 1st case to last case per SSN_SCR), and Demotions is dichotomized as occurrence=1 and non-occurrence=0 where there is a reduction in rank among the records per SSN_SCR. I hope that I am more succinct. Thank you, Dan Gerstenfield -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Bob Walker Sent: Tuesday, October 12, 2010 1:51 PM To: [hidden email] Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR Hi Dan, Gene is right... what exactly are we dealing with here. - I'm assuming that you have just one master file, correct? - Do you want to maintain the original file structure (multiple rows per SSN), or do you want to end up with one record (row) per SSN? - Which variables change values from row to row, and which ones do not (i.e., are constant)? SSN_SCR RANK [We know that these values change, but what about the rest?] RANK_EFF_DATE DOD_OCC MOS AGE GENDER RACE ETHNIC AFQT DEP MRTL EDUC HOR_C HOR_S G_MO FILE_DATE D_YR D_MO EDUC SVS Bob Walker Surveys & Forecasts, LLC www.safllc.com -----Original Message----- From: Daniel W. Gerstenfield [mailto:[hidden email]] Sent: Tuesday, October 12, 2010 12:47 PM To: Bob Walker; [hidden email] Subject: RE: Want to reduce multiple SSN_SCR to one SSN_SCR Bob: Thank you very much for sharing your knowledge. I just posted this but will do so again. When I ran the syntax without the begin data and end data and placing the command exe. at the end to run, I received only 4 variables in the output "SSN_SCR," "Rank1," "Rank2," and "Demotions." I lost all the other variables. What is the syntax that would allow me to keep the other variables? This is what I got SSN_SCR RANK1 RANK2 DEMOTIONS 505381.0 E07 E07 0.0 525462.0 0.0 528122.0 E01 E04 0.0 530788.0 E01 E02 0.0 546009.0 E03 E05 0.0 549763.0 0.0 560083.0 E01 E04 0.0 560763.0 0.0 598286.0 0.0 605081.0 0.0 605365.0 E01 E04 0.0 607508.0 E04 E04 1.0 610245.0 0.0 612805.0 0.0 613129.0 E02 E02 0.0 614626.0 0.0 617349.0 E03 E04 0.0 617803.0 E02 E04 0.0 This is what I wanted (the variables "Rank1," "Rank2," and "Demotions" at the end have no data attached. SSN_SCR RANK RANK_EFF_DATE DOD_OCC MOS AGE GENDER RACE ETHNIC AFQT DEP MRTL EDUC HOR_C HOR_S G_MO FILE_DATE D_YR D_MO EDUC SVS RANK1 RANK2 DEMOTIONS 528122.0 E01 19970221 154200 73C1O 031 F 003 BG 53 01 M 21 US FL 001 19970331 96.0 2.0 27.0 1.0 E01 E04 0.0 530788.0 E01 19970213 110200 31R1O 022 M 999 AL 52 00 N 99 US WI 001 19970331 96.0 10.0 31.0 1.0 E01 E02 0.0 I hope that my question is much clearer. Thank you, Dan Gerstenfield -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Bob Walker Sent: Tuesday, October 12, 2010 12:26 PM To: [hidden email] Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR Hi Dan, Perhaps you could restate the problem? If you have more variables, simply add them to AGGREGATE. Start by looking up on that command. Regards, Bob Walker Surveys & Forecasts, LLC www.safllc.com -----Original Message----- From: Daniel W. Gerstenfield [mailto:[hidden email]] Sent: Tuesday, October 12, 2010 11:13 AM To: Bob Walker; [hidden email] Subject: RE: Want to reduce multiple SSN_SCR to one SSN_SCR I attempted to run the syntax but I have over 100,000 cases like this with more fields than this. I am a neophyte with syntax and require any other commands such as exe. Thank you for your help. Dan Gerstenfield -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Bob Walker Sent: Monday, October 11, 2010 5:42 PM To: [hidden email] Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR Dan, DATA LIST FREE /SSN_SCR RANK. BEGIN DATA 112110 01 112110 02 112110 03 112110 03 112110 04 112110 04 112111 03 112111 01 112111 02 112111 03 112111 04 112112 03 112112 03 112112 04 112112 05 END DATA. COMPUTE DEMOTE=0. IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1. AGGREGATE OUTFILE=* /BREAK=SSN_SCR /RANK1=FIRST(RANK) /RANK2=LAST(RANK) / DEMOTIONS=MAX(DEMOTE). HTH, Bob Walker Surveys & Forecasts, LLC www.safllc.com -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Daniel Gerstenfield Sent: Monday, October 11, 2010 4:53 PM To: [hidden email] Subject: Want to reduce multiple SSN_SCR to one SSN_SCR I have more than 1 case for each SSN_SCR and I need to reduce it to one SSN_SCR but retain information in the variables and create new variables. I want a column for beginning rank (RANK), ending rank (RANK_Change), rank decreases dichotomized into yes=1 and no=0 (demotions). My data looks like the following: SSN_SCR RANK 112110 01 112110 02 112110 03 112110 03 112110 04 112110 04 112111 03 112111 01 112111 02 112111 03 112111 04 112112 03 112112 03 112112 04 112112 05 I want the table to look like this SSN_SCR RANK RANK_CHANGE DEMOTIONS 112110 01 04 0 112111 03 04 1 112112 03 05 0 I need help in coming up with a syntax to format my data in this manner and delete the repeated SSN_SCR. Please help. Thank you, Daniel Gerstenfield ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
Administrator
|
In reply to this post by Daniel Gerstenfield
Bouncing off of Bob's astute observation
">FYI, you're not really doing any "aggregating" in any of this -- you are simply taking advantage of functions available in AGGREGATE. The FIRST and LAST functions retain the variable type and formatting, too." SO!!!...Might as well just do a simple MATCH FILES with FIRST and LAST operators. Blast everything else then lag the desired variables from the first record. Something like: (Untested - It's late here!- and OP should have the FM available...OTOH, there are SCADS of examples of MATCH and LAG in the archives (Google is your friend!!!). * From Bob's code *. COMPUTE DEMOTE=0. IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1. *----*. MATCH FILES / FILE * / BY SSN_SCR / FIRST=TOP / LAST=BOT. SELECT IF TOP OR BOT. You now have only 2 records per case (perhaps 1 if you have a noob). USE LAG function to drag the things you want from the first record to the second. DO IF BOT and NOT(TOP). + DO REPEAT V = List of desired variables from first record...If you want them from both first and second record then those you need to LAG separately into new unique variables. + COMPUTE V=LAG(V). + END REPEAT. END IF. FREQ something... Why waste a data pass on an EXECUTE?... SELECT IF BOT. Now time to drag out the Syntax Manual and review what people have posted, LAG, AGGREGATE, MATCH (FIRST and LAST operators), DO REPEAT, DO IF. HTH, David ---------------------- On Tue, 12 Oct 2010 20:45:15 -0400, Bob Walker <[hidden email]> wrote: >Hi Dan, > >You would need to do something like this (not tested)... > >COMPUTE DEMOTE=0. >IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1. >AGGREGATE OUTFILE=* /BREAK=SSN_SCR > /RANK_EFF_DATE=FIRST(RANK_EFF_DATE) > /DOD_OCC=FIRST(DOD_OCC) /MOS=FIRST(MOS) /AGE=FIRST(AGE) /GENDER=FIRST(GENDER) > /RACE=FIRST(RACE) /ETHNIC=FIRST(ETHNIC) /AFQT=FIRST(AFQT) /EDUC=FIRST(EDUC) > /HOR_C=FIRST(HOR_C) /HOR_S=FIRST(HOR_S) /G_MO=FIRST(G_MO) > /FILE_DATE=FIRST(FILE_DATE) /D_YR=FIRST(D_YR) /D_MO=FIRST(D_MO) > /DEPENDENTS1=FIRST(DEP) /DEPENDENTS2=LAST(DEP) > /MARITAL1=FIRST(MRTL) /MARITAL2=LAST(MRTL) > /RANK1=FIRST(RANK) /RANK2=LAST(RANK) /DEMOTIONS=MAX(DEMOTE). > >The last three lines pick up the first and last record values for DEP, MRTL, AND RANK. And DEMOTIONS works the same way as before. > >FYI, you're not really doing any "aggregating" in any of this -- you are simply taking advantage of functions available in AGGREGATE. The FIRST and LAST functions retain the variable type and formatting, too. > >HTH, > >Bob Walker >Surveys & Forecasts, LLC >www.safllc.com > >-----Original Message----- >From: Daniel W. Gerstenfield [mailto:[hidden email]] >Sent: Tuesday, October 12, 2010 6:10 PM >To: Bob Walker; [hidden email] >Subject: RE: Want to reduce multiple SSN_SCR to one SSN_SCR > >Bob & Gene: > >I am very glad that ya'll are here. > >I have 1 master file. What I want to keep constant are the following > >The variables to remain constant are: > >SSN_SCR >RANK [We know that these values change, but what about the rest?] RANK_EFF_DATE DOD_OCC MOS AGE (Does change over time but am interested in age for 1st record per SSN) GENDER RACE ETHNIC AFQT EDUC HOR_CNTRY HOR_STATE GRADE_MONTHS FILE_DATE DOT_YR DOT_MO ACC_EDUC SVS > >The variables that would be added to those that could possibly change are: > >DEPENDENTS (Would want to know if there is a change in the first record to the last record by having a Dependents1=1st record and Dependents2=last >record) > >MRTL_STAT (Would want to know if the person had a change in marital status with the value showing a difference from the 1st SSN_SCR value compared to the last SSN_SCR value per SSN_SCR. MRTL_STAT1=1st MRTLSTAT. >MRTL_STAT2=Last MRTLSTAT. > >Added to the data set would be: >Rank1=rank, >Rank2=last rank in set of ranks among SSN_SCR (the records per SSN show changes in rank from 1st case to last case per SSN_SCR), and Demotions is dichotomized as occurrence=1 and non-occurrence=0 where there is a reduction in rank among the records per SSN_SCR. > >I hope that I am more succinct. > >Thank you, > >Dan Gerstenfield > >-----Original Message----- >From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Bob Walker >Sent: Tuesday, October 12, 2010 1:51 PM >To: [hidden email] >Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR > >Hi Dan, > >Gene is right... what exactly are we dealing with here. > >- I'm assuming that you have just one master file, correct? >- Do you want to maintain the original file structure (multiple rows per >- Which variables change values from row to row, and which ones do not (i.e., are constant)? > >SSN_SCR >RANK [We know that these values change, but what about the rest?] RANK_EFF_DATE DOD_OCC MOS AGE GENDER RACE ETHNIC AFQT DEP MRTL EDUC HOR_C HOR_S G_MO FILE_DATE D_YR D_MO EDUC SVS > >Bob Walker >Surveys & Forecasts, LLC >www.safllc.com > > >-----Original Message----- >From: Daniel W. Gerstenfield [mailto:[hidden email]] >Sent: Tuesday, October 12, 2010 12:47 PM >To: Bob Walker; [hidden email] >Subject: RE: Want to reduce multiple SSN_SCR to one SSN_SCR > >Bob: Thank you very much for sharing your knowledge. I just posted this > >When I ran the syntax without the begin data and end data and placing the command exe. at the end to run, I received only 4 variables in the output "SSN_SCR," "Rank1," "Rank2," and "Demotions." I lost all the other variables. What is the syntax that would allow me to keep the other variables? > >This is what I got > >SSN_SCR RANK1 RANK2 DEMOTIONS >505381.0 E07 E07 0.0 >525462.0 0.0 >528122.0 E01 E04 0.0 >530788.0 E01 E02 0.0 >546009.0 E03 E05 0.0 >549763.0 0.0 >560083.0 E01 E04 0.0 >560763.0 0.0 >598286.0 0.0 >605081.0 0.0 >605365.0 E01 E04 0.0 >607508.0 E04 E04 1.0 >610245.0 0.0 >612805.0 0.0 >613129.0 E02 E02 0.0 >614626.0 0.0 >617349.0 E03 E04 0.0 >617803.0 E02 E04 0.0 > >This is what I wanted (the variables "Rank1," "Rank2," and "Demotions" at > >SSN_SCR RANK RANK_EFF_DATE DOD_OCC MOS AGE GENDER RACE ETHNIC >AFQT DEP MRTL EDUC HOR_C HOR_S G_MO FILE_DATE D_YR D_MO EDUC SVS >RANK1 RANK2 DEMOTIONS >528122.0 E01 19970221 154200 73C1O 031 F >003 BG 53 01 M 21 US FL 001 >19970331 96.0 2.0 27.0 1.0 E01 E04 0.0 >530788.0 E01 19970213 110200 31R1O 022 M >999 AL 52 00 N 99 US WI 001 >19970331 96.0 10.0 31.0 1.0 E01 E02 0.0 > >I hope that my question is much clearer. > >Thank you, > >Dan Gerstenfield > >-----Original Message----- >From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of >Sent: Tuesday, October 12, 2010 12:26 PM >To: [hidden email] >Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR > >Hi Dan, > >Perhaps you could restate the problem? If you have more variables, simply add them to AGGREGATE. Start by looking up on that command. > >Regards, > >Bob Walker >Surveys & Forecasts, LLC >www.safllc.com > >-----Original Message----- >From: Daniel W. Gerstenfield [mailto:[hidden email]] >Sent: Tuesday, October 12, 2010 11:13 AM >To: Bob Walker; [hidden email] >Subject: RE: Want to reduce multiple SSN_SCR to one SSN_SCR > >I attempted to run the syntax but I have over 100,000 cases like this with commands such as exe. > >Thank you for your help. > >Dan Gerstenfield > >-----Original Message----- >From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Bob Walker >Sent: Monday, October 11, 2010 5:42 PM >To: [hidden email] >Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR > >Dan, > >DATA LIST FREE /SSN_SCR RANK. >BEGIN DATA >112110 01 >112110 02 >112110 03 >112110 03 >112110 04 >112110 04 >112111 03 >112111 01 >112111 02 >112111 03 >112111 04 >112112 03 >112112 03 >112112 04 >112112 05 >END DATA. > >COMPUTE DEMOTE=0. >IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1. >AGGREGATE OUTFILE=* /BREAK=SSN_SCR /RANK1=FIRST(RANK) /RANK2=LAST(RANK) / > >HTH, > >Bob Walker >Surveys & Forecasts, LLC >www.safllc.com > >-----Original Message----- >From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Daniel Gerstenfield >Sent: Monday, October 11, 2010 4:53 PM >To: [hidden email] >Subject: Want to reduce multiple SSN_SCR to one SSN_SCR > >I have more than 1 case for each SSN_SCR and I need to reduce it to one SSN_SCR but retain information in the variables and create new variables. >I want a column for beginning rank (RANK), ending rank (RANK_Change), rank decreases dichotomized into yes=1 and no=0 (demotions). My data looks like the following: > >SSN_SCR RANK >112110 01 >112110 02 >112110 03 >112110 03 >112110 04 >112110 04 >112111 03 >112111 01 >112111 02 >112111 03 >112111 04 >112112 03 >112112 03 >112112 04 >112112 05 > >I want the table to look like this >SSN_SCR RANK RANK_CHANGE DEMOTIONS >112110 01 04 0 >112111 03 04 1 >112112 03 05 0 > >I need help in coming up with a syntax to format my data in this manner and > >Thank you, > >Daniel Gerstenfield > >===================== >To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD > >===================== >To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD > >===================== >To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD > >===================== >To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD > >===================== >To manage your subscription to SPSSX-L, send a message to >[hidden email] (not to SPSSX-L), with no body text except the >command. To leave the list, send the command >SIGNOFF SPSSX-L >For a list of commands to manage subscriptions, send the command >INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
|
Hey David,
Good idea... using AGGREGATE is verbose, but Dan needs that one variable (DEMOTIONS) to capture reversals on *any* record within each SSN-SCR, so keeping just first/last records using MATCH FILES would miss this. BTW, I see an error in my earlier example -- I duplicated /EDUC=FIRST(EDUC), so I think that the second occurrence should have been ACC_EDUC=FIRST(ACC_EDUC) or whatever that variable should have been. Thanks! Bob Walker Surveys & Forecasts, LLC www.safllc.com -----Original Message----- From: David Marso [mailto:[hidden email]] Sent: Wednesday, October 13, 2010 1:23 AM To: [hidden email]; Bob Walker Cc: David Marso Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR Bouncing off of Bob's astute observation ">FYI, you're not really doing any "aggregating" in any of this -- you are simply taking advantage of functions available in AGGREGATE. The FIRST and LAST functions retain the variable type and formatting, too." SO!!!...Might as well just do a simple MATCH FILES with FIRST and LAST operators. Blast everything else then lag the desired variables from the first record. Something like: (Untested - It's late here!- and OP should have the FM available...OTOH, there are SCADS of examples of MATCH and LAG in the archives (Google is your friend!!!). * From Bob's code *. COMPUTE DEMOTE=0. IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1. *----*. MATCH FILES / FILE * / BY SSN_SCR / FIRST=TOP / LAST=BOT. SELECT IF TOP OR BOT. You now have only 2 records per case (perhaps 1 if you have a noob). USE LAG function to drag the things you want from the first record to the second. DO IF BOT and NOT(TOP). + DO REPEAT V = List of desired variables from first record...If you + want them from both first and second record then those you need to LAG separately into new unique variables. + COMPUTE V=LAG(V). + END REPEAT. END IF. FREQ something... Why waste a data pass on an EXECUTE?... SELECT IF BOT. Now time to drag out the Syntax Manual and review what people have posted, LAG, AGGREGATE, MATCH (FIRST and LAST operators), DO REPEAT, DO IF. HTH, David ---------------------- On Tue, 12 Oct 2010 20:45:15 -0400, Bob Walker <[hidden email]> wrote: >Hi Dan, > >You would need to do something like this (not tested)... > >COMPUTE DEMOTE=0. >IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1. >AGGREGATE OUTFILE=* /BREAK=SSN_SCR > /RANK_EFF_DATE=FIRST(RANK_EFF_DATE) > /DOD_OCC=FIRST(DOD_OCC) /MOS=FIRST(MOS) /AGE=FIRST(AGE) >/GENDER=FIRST(GENDER) > /RACE=FIRST(RACE) /ETHNIC=FIRST(ETHNIC) /AFQT=FIRST(AFQT) >/EDUC=FIRST(EDUC) > /HOR_C=FIRST(HOR_C) /HOR_S=FIRST(HOR_S) /G_MO=FIRST(G_MO) > /FILE_DATE=FIRST(FILE_DATE) /D_YR=FIRST(D_YR) /D_MO=FIRST(D_MO) > /DEPENDENTS1=FIRST(DEP) /DEPENDENTS2=LAST(DEP) > /MARITAL1=FIRST(MRTL) /MARITAL2=LAST(MRTL) > /RANK1=FIRST(RANK) /RANK2=LAST(RANK) /DEMOTIONS=MAX(DEMOTE). > >The last three lines pick up the first and last record values for DEP, MRTL, AND RANK. And DEMOTIONS works the same way as before. > >FYI, you're not really doing any "aggregating" in any of this -- you >are simply taking advantage of functions available in AGGREGATE. The FIRST and LAST functions retain the variable type and formatting, too. > >HTH, > >Bob Walker >Surveys & Forecasts, LLC >www.safllc.com > >-----Original Message----- >From: Daniel W. Gerstenfield [mailto:[hidden email]] >Sent: Tuesday, October 12, 2010 6:10 PM >To: Bob Walker; [hidden email] >Subject: RE: Want to reduce multiple SSN_SCR to one SSN_SCR > >Bob & Gene: > >I am very glad that ya'll are here. > >I have 1 master file. What I want to keep constant are the following > >The variables to remain constant are: > >SSN_SCR >RANK [We know that these values change, but what about the rest?] RANK_EFF_DATE DOD_OCC MOS AGE (Does change over time but am interested in age for 1st record per SSN) GENDER RACE ETHNIC AFQT EDUC HOR_CNTRY HOR_STATE GRADE_MONTHS FILE_DATE DOT_YR DOT_MO ACC_EDUC SVS > >The variables that would be added to those that could possibly change are: > >DEPENDENTS (Would want to know if there is a change in the first record >to the last record by having a Dependents1=1st record and Dependents2=last >record) > >MRTL_STAT (Would want to know if the person had a change in marital >status with the value showing a difference from the 1st SSN_SCR value compared to the last SSN_SCR value per SSN_SCR. MRTL_STAT1=1st MRTLSTAT. >MRTL_STAT2=Last MRTLSTAT. > >Added to the data set would be: >Rank1=rank, >Rank2=last rank in set of ranks among SSN_SCR (the records per SSN show changes in rank from 1st case to last case per SSN_SCR), and Demotions is dichotomized as occurrence=1 and non-occurrence=0 where there is a reduction in rank among the records per SSN_SCR. > >I hope that I am more succinct. > >Thank you, > >Dan Gerstenfield > >-----Original Message----- >From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf >Of >Sent: Tuesday, October 12, 2010 1:51 PM >To: [hidden email] >Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR > >Hi Dan, > >Gene is right... what exactly are we dealing with here. > >- I'm assuming that you have just one master file, correct? >- Do you want to maintain the original file structure (multiple rows >per >- Which variables change values from row to row, and which ones do not (i.e., are constant)? > >SSN_SCR >RANK [We know that these values change, but what about the rest?] RANK_EFF_DATE DOD_OCC MOS AGE GENDER RACE ETHNIC AFQT DEP MRTL EDUC HOR_C HOR_S G_MO FILE_DATE D_YR D_MO EDUC SVS > >Bob Walker >Surveys & Forecasts, LLC >www.safllc.com > > >-----Original Message----- >From: Daniel W. Gerstenfield [mailto:[hidden email]] >Sent: Tuesday, October 12, 2010 12:47 PM >To: Bob Walker; [hidden email] >Subject: RE: Want to reduce multiple SSN_SCR to one SSN_SCR > >Bob: Thank you very much for sharing your knowledge. I just posted >this > >When I ran the syntax without the begin data and end data and placing >the command exe. at the end to run, I received only 4 variables in the output "SSN_SCR," "Rank1," "Rank2," and "Demotions." I lost all the other variables. What is the syntax that would allow me to keep the other variables? > >This is what I got > >SSN_SCR RANK1 RANK2 DEMOTIONS >505381.0 E07 E07 0.0 >525462.0 0.0 >528122.0 E01 E04 0.0 >530788.0 E01 E02 0.0 >546009.0 E03 E05 0.0 >549763.0 0.0 >560083.0 E01 E04 0.0 >560763.0 0.0 >598286.0 0.0 >605081.0 0.0 >605365.0 E01 E04 0.0 >607508.0 E04 E04 1.0 >610245.0 0.0 >612805.0 0.0 >613129.0 E02 E02 0.0 >614626.0 0.0 >617349.0 E03 E04 0.0 >617803.0 E02 E04 0.0 > >This is what I wanted (the variables "Rank1," "Rank2," and "Demotions" >at > >SSN_SCR RANK RANK_EFF_DATE DOD_OCC MOS AGE GENDER RACE ETHNIC >AFQT DEP MRTL EDUC HOR_C HOR_S G_MO FILE_DATE D_YR D_MO EDUC SVS >RANK1 RANK2 DEMOTIONS >528122.0 E01 19970221 154200 73C1O 031 F >003 BG 53 01 M 21 US FL 001 >19970331 96.0 2.0 27.0 1.0 E01 E04 0.0 >530788.0 E01 19970213 110200 31R1O 022 M >999 AL 52 00 N 99 US WI 001 >19970331 96.0 10.0 31.0 1.0 E01 E02 0.0 > >I hope that my question is much clearer. > >Thank you, > >Dan Gerstenfield > >-----Original Message----- >From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf >Of >Sent: Tuesday, October 12, 2010 12:26 PM >To: [hidden email] >Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR > >Hi Dan, > >Perhaps you could restate the problem? If you have more variables, >simply add them to AGGREGATE. Start by looking up on that command. > >Regards, > >Bob Walker >Surveys & Forecasts, LLC >www.safllc.com > >-----Original Message----- >From: Daniel W. Gerstenfield [mailto:[hidden email]] >Sent: Tuesday, October 12, 2010 11:13 AM >To: Bob Walker; [hidden email] >Subject: RE: Want to reduce multiple SSN_SCR to one SSN_SCR > >I attempted to run the syntax but I have over 100,000 cases like this >with > >Thank you for your help. > >Dan Gerstenfield > >-----Original Message----- >From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf >Of Bob Walker >Sent: Monday, October 11, 2010 5:42 PM >To: [hidden email] >Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR > >Dan, > >DATA LIST FREE /SSN_SCR RANK. >BEGIN DATA >112110 01 >112110 02 >112110 03 >112110 03 >112110 04 >112110 04 >112111 03 >112111 01 >112111 02 >112111 03 >112111 04 >112112 03 >112112 03 >112112 04 >112112 05 >END DATA. > >COMPUTE DEMOTE=0. >IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1. >AGGREGATE OUTFILE=* /BREAK=SSN_SCR /RANK1=FIRST(RANK) /RANK2=LAST(RANK) >/ > >HTH, > >Bob Walker >Surveys & Forecasts, LLC >www.safllc.com > >-----Original Message----- >From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf >Of >Sent: Monday, October 11, 2010 4:53 PM >To: [hidden email] >Subject: Want to reduce multiple SSN_SCR to one SSN_SCR > >I have more than 1 case for each SSN_SCR and I need to reduce it to one SSN_SCR but retain information in the variables and create new variables. >I want a column for beginning rank (RANK), ending rank (RANK_Change), >rank decreases dichotomized into yes=1 and no=0 (demotions). My data looks like the following: > >SSN_SCR RANK >112110 01 >112110 02 >112110 03 >112110 03 >112110 04 >112110 04 >112111 03 >112111 01 >112111 02 >112111 03 >112111 04 >112112 03 >112112 03 >112112 04 >112112 05 > >I want the table to look like this >SSN_SCR RANK RANK_CHANGE DEMOTIONS >112110 01 04 0 >112111 03 04 1 >112112 03 05 0 > >I need help in coming up with a syntax to format my data in this manner >and > >Thank you, > >Daniel Gerstenfield > >===================== >To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD > >===================== >To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD > >===================== >To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD > >===================== >To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD > >===================== >To manage your subscription to SPSSX-L, send a message to >[hidden email] (not to SPSSX-L), with no body text except >the command. To leave the list, send the command SIGNOFF SPSSX-L For a >list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
Administrator
|
In reply to this post by Daniel Gerstenfield
Good point Bob,
I need to stop posting after midnight ;-) We can do it as follows: LEAVE is the silver bullet here. IF SSN_SCR NE LAG(SSN_SCR) DEMOTIONS=0. DO IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK). COMPUTE DEMOTIONS=1. LEAVE DEMOTIONS. END IF . MATCH FILES / FILE * / BY SSN_SCR / FIRST=TOP / LAST=BOT. HTH, David -- On Wed, 13 Oct 2010 11:14:54 -0400, Bob Walker <[hidden email]> wrote: >Hey David, > >Good idea... using AGGREGATE is verbose, but Dan needs that one variable (DEMOTIONS) to capture reversals on *any* record within each SSN-SCR, so keeping just first/last records using MATCH FILES would miss this. > >BTW, I see an error in my earlier example -- I duplicated /EDUC=FIRST(EDUC), so I think that the second occurrence should have been ACC_EDUC=FIRST(ACC_EDUC) or whatever that variable should have been. > >Thanks! > >Bob Walker >Surveys & Forecasts, LLC >www.safllc.com > >-----Original Message----- >From: David Marso [mailto:[hidden email]] >Sent: Wednesday, October 13, 2010 1:23 AM >To: [hidden email]; Bob Walker >Cc: David Marso >Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR > >Bouncing off of Bob's astute observation ">FYI, you're not really doing any available in AGGREGATE. The FIRST and LAST functions retain the variable type and formatting, too." > >SO!!!...Might as well just do a simple MATCH FILES with FIRST and LAST operators. Blast everything else then lag the desired variables from the first record. > >Something like: (Untested - It's late here!- and OP should have the FM available...OTOH, there are SCADS of examples of MATCH and LAG in the archives (Google is your friend!!!). > >* From Bob's code *. >COMPUTE DEMOTE=0. >IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1. >*----*. > >MATCH FILES > / FILE * > / BY SSN_SCR > / FIRST=TOP > / LAST=BOT. >SELECT IF TOP OR BOT. >You now have only 2 records per case (perhaps 1 if you have a noob). >USE LAG function to drag the things you want from the first record to the >DO IF BOT and NOT(TOP). >+ DO REPEAT V = List of desired variables from first record...If you >+ want >them from both first and second record then those you need to LAG separately into new unique variables. >+ COMPUTE V=LAG(V). >+ END REPEAT. >END IF. >FREQ something... Why waste a data pass on an EXECUTE?... >SELECT IF BOT. >Now time to drag out the Syntax Manual and review what people have posted, LAG, AGGREGATE, MATCH (FIRST and LAST operators), DO REPEAT, DO IF. >HTH, David >---------------------- >On Tue, 12 Oct 2010 20:45:15 -0400, Bob Walker <[hidden email]> wrote: >>Hi Dan, >> >>You would need to do something like this (not tested)... >> >>COMPUTE DEMOTE=0. >>IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1. >>AGGREGATE OUTFILE=* /BREAK=SSN_SCR >> /RANK_EFF_DATE=FIRST(RANK_EFF_DATE) >> /DOD_OCC=FIRST(DOD_OCC) /MOS=FIRST(MOS) /AGE=FIRST(AGE) >>/GENDER=FIRST(GENDER) >> /RACE=FIRST(RACE) /ETHNIC=FIRST(ETHNIC) /AFQT=FIRST(AFQT) >>/EDUC=FIRST(EDUC) >> /HOR_C=FIRST(HOR_C) /HOR_S=FIRST(HOR_S) /G_MO=FIRST(G_MO) >> /FILE_DATE=FIRST(FILE_DATE) /D_YR=FIRST(D_YR) /D_MO=FIRST(D_MO) >/EDUC=FIRST(EDUC) >> /DEPENDENTS1=FIRST(DEP) /DEPENDENTS2=LAST(DEP) >> /MARITAL1=FIRST(MRTL) /MARITAL2=LAST(MRTL) >> /RANK1=FIRST(RANK) /RANK2=LAST(RANK) /DEMOTIONS=MAX(DEMOTE). >> >>The last three lines pick up the first and last record values for DEP, >MRTL, AND RANK. And DEMOTIONS works the same way as before. >> >>FYI, you're not really doing any "aggregating" in any of this -- you >>are >simply taking advantage of functions available in AGGREGATE. The FIRST and >> >>HTH, >> >>Bob Walker >>Surveys & Forecasts, LLC >>www.safllc.com >><SNIP> ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
|
David & Bob:
I am matching what I did by hand with the syntax that ya'll have been extremely helpful in providing. I think it does what I want it to do. I am amazed at the power of this program and those that have the extensive experience in working with it. Ya'll make it look easy. I am just on the front end of this learning curve but look forward to becoming one that can share the knowledge and help others like me. Thank you, Dan -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of David Marso Sent: Wednesday, October 13, 2010 1:16 PM To: [hidden email] Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR Good point Bob, I need to stop posting after midnight ;-) We can do it as follows: LEAVE is the silver bullet here. IF SSN_SCR NE LAG(SSN_SCR) DEMOTIONS=0. DO IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK). COMPUTE DEMOTIONS=1. LEAVE DEMOTIONS. END IF . MATCH FILES / FILE * / BY SSN_SCR / FIRST=TOP / LAST=BOT. HTH, David -- On Wed, 13 Oct 2010 11:14:54 -0400, Bob Walker <[hidden email]> wrote: >Hey David, > >Good idea... using AGGREGATE is verbose, but Dan needs that one variable (DEMOTIONS) to capture reversals on *any* record within each SSN-SCR, so keeping just first/last records using MATCH FILES would miss this. > >BTW, I see an error in my earlier example -- I duplicated /EDUC=FIRST(EDUC), so I think that the second occurrence should have been ACC_EDUC=FIRST(ACC_EDUC) or whatever that variable should have been. > >Thanks! > >Bob Walker >Surveys & Forecasts, LLC >www.safllc.com > >-----Original Message----- >From: David Marso [mailto:[hidden email]] >Sent: Wednesday, October 13, 2010 1:23 AM >To: [hidden email]; Bob Walker >Cc: David Marso >Subject: Re: Want to reduce multiple SSN_SCR to one SSN_SCR > >Bouncing off of Bob's astute observation ">FYI, you're not really doing any available in AGGREGATE. The FIRST and LAST functions retain the variable type and formatting, too." > >SO!!!...Might as well just do a simple MATCH FILES with FIRST and LAST operators. Blast everything else then lag the desired variables from the first record. > >Something like: (Untested - It's late here!- and OP should have the FM available...OTOH, there are SCADS of examples of MATCH and LAG in the archives (Google is your friend!!!). > >* From Bob's code *. >COMPUTE DEMOTE=0. >IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1. >*----*. > >MATCH FILES > / FILE * > / BY SSN_SCR > / FIRST=TOP > / LAST=BOT. >SELECT IF TOP OR BOT. >You now have only 2 records per case (perhaps 1 if you have a noob). >USE LAG function to drag the things you want from the first record to the >DO IF BOT and NOT(TOP). >+ DO REPEAT V = List of desired variables from first record...If you >+ want >them from both first and second record then those you need to LAG separately into new unique variables. >+ COMPUTE V=LAG(V). >+ END REPEAT. >END IF. >FREQ something... Why waste a data pass on an EXECUTE?... >SELECT IF BOT. >Now time to drag out the Syntax Manual and review what people have posted, LAG, AGGREGATE, MATCH (FIRST and LAST operators), DO REPEAT, DO IF. >HTH, David >---------------------- >On Tue, 12 Oct 2010 20:45:15 -0400, Bob Walker <[hidden email]> wrote: >>Hi Dan, >> >>You would need to do something like this (not tested)... >> >>COMPUTE DEMOTE=0. >>IF SSN_SCR = LAG(SSN_SCR) AND RANK < LAG(RANK) DEMOTE=1. >>AGGREGATE OUTFILE=* /BREAK=SSN_SCR >> /RANK_EFF_DATE=FIRST(RANK_EFF_DATE) >> /DOD_OCC=FIRST(DOD_OCC) /MOS=FIRST(MOS) /AGE=FIRST(AGE) >>/GENDER=FIRST(GENDER) >> /RACE=FIRST(RACE) /ETHNIC=FIRST(ETHNIC) /AFQT=FIRST(AFQT) >>/EDUC=FIRST(EDUC) >> /HOR_C=FIRST(HOR_C) /HOR_S=FIRST(HOR_S) /G_MO=FIRST(G_MO) >> /FILE_DATE=FIRST(FILE_DATE) /D_YR=FIRST(D_YR) /D_MO=FIRST(D_MO) >/EDUC=FIRST(EDUC) >> /DEPENDENTS1=FIRST(DEP) /DEPENDENTS2=LAST(DEP) >> /MARITAL1=FIRST(MRTL) /MARITAL2=LAST(MRTL) >> /RANK1=FIRST(RANK) /RANK2=LAST(RANK) /DEMOTIONS=MAX(DEMOTE). >> >>The last three lines pick up the first and last record values for DEP, >MRTL, AND RANK. And DEMOTIONS works the same way as before. >> >>FYI, you're not really doing any "aggregating" in any of this -- you >>are >simply taking advantage of functions available in AGGREGATE. The FIRST and >> >>HTH, >> >>Bob Walker >>Surveys & Forecasts, LLC >>www.safllc.com >><SNIP> ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
| Free forum by Nabble | Edit this page |
