I am using a longitudinal data. My IV is HC taken in 2009 and DV is PERF taken in 2011. The problem is that I could not analyse the correlation and regression since the PERF variable in other year is coded as sysmis. Thus I have to used the max value in PERF and replace it to the sysmis, but I could not do that. This is the syntax to compute HC and PERF.
MISSING VALUE hampgllb hampglla hampgkno hampollb hampolla (999999999). EXECUTE. MISSING VALUE warrflhr warrjobs (999999999, 55555). EXECUTE. IF (fin_year = 200809) HC= hampgllb + hampglla + hampgkno + hampollb + hampolla + warrflhr + warrjobs. EXECUTE. MISSING VALUE compsogs compprts compprof compprod comptra compcont compit (999999999, 7777777). EXECUTE. MISSING VALUE compjobs compsoc(999999999, 7777777, 55555). EXECUTE. IF (fin_year = 201011) PERF = compsogs + compprts + compprof + compprod + compjobs + comptra + compsoc + compcont + compit. EXECUTE. I have tried different ways to replace the sysmis with the max value in PERF using : IF SYSMIS(PERF) NPERF = MAX(PERF) do if missing(PERF). COMPUTE NPERF = MAX(PERF). end if. EXECUTE . USE ALL. COMPUTE filter_$=(MISSING(PERF)). VARIABLE LABELS filter_$ 'MISSING(PERF) (FILTER)'. VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'. FORMATS filter_$ (f1.0). FILTER BY filter_$. EXECUTE. COMPUTE NPERF = MAX(PERF). EXECUTE . SORT CASES BY FILE. SPLIT FILE LAYERED BY FILE. if (PERF=$SYSMIS) NPERF=MAX(PERF). EXECUTE. But it does not work. I still could not analyse the correlation and regression. What is the right code? PLS HELP. |
COMPUTE and the conditional compute IF (Condition) works on each case (row) without reference to other cases. If any value on the right side of the equal sign is missing the result will be system missing. Your HC and PERF are dependent on 6 and 9 terms respectively. You could try the statistical function SUM (and other functions, see the FM), which even can specify the minimum of terms needed to do the calculation.
Your use of NPERF=MAX(PERF) specifies one variable in the case (row) and is equal to NPERF=PERF. If you want the maximum of a variable (column), see AGGREGATE. I don't understand why you would replace missing values with the maximum. HTH, PR
|
In reply to this post by min
In principle, I think I understand what you want to do. As written in syntax, I don't understand.
The data structure is confusing. Usually longitudinal data is wide format, like this: ID HC2009 PERF2009 HC2011 PERF2011 But it could be long format, like this ID year HC PERF So, which is it: wide or long? The way you write your code suggests it is long. Your code also suggests you have two data points: 2009 and 2011. True? If long format and two data points, a sample case would be (I understand that both HC and PERF are the sum of variables. Skip over that for now because that is not where the problem is.) ID year HC PERF 11 2009 3.4 . 11 2011 5.2 6.7 Your problem is that in the example case you want to replace the 2009 PERF value (now, sysmis) with the 2011 PERF value (6.7). One way is this. Do you see the logic? Sort cases id (a) year(d). If (id eq lag(id)) perf=lag(perf). Lastly, I urge you to read the documentation for the Max function in the help system or syntax reference because you do not understand it. Gently, an Execute command is not needed after the Missing values statement (read the documentation) because the missing values command takes effect as soon as it is read. Gene Maguin -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of min Sent: Tuesday, June 23, 2015 1:08 AM To: [hidden email] Subject: Recoding system missing with max value I am using a longitudinal data. My IV is HC taken in 2009 and DV is PERF taken in 2011. The problem is that I could not analyse the correlation and regression since the PERF variable in other year is coded as sysmis. Thus I have to used the max value in PERF and replace it to the sysmis, but I could not do that. This is the syntax to compute HC and PERF. MISSING VALUE hampgllb hampglla hampgkno hampollb hampolla (999999999). EXECUTE. MISSING VALUE warrflhr warrjobs (999999999, 55555). EXECUTE. IF (fin_year = 200809) HC= hampgllb + hampglla + hampgkno + hampollb + hampolla + warrflhr + warrjobs. EXECUTE. MISSING VALUE compsogs compprts compprof compprod comptra compcont compit (999999999, 7777777). EXECUTE. MISSING VALUE compjobs compsoc(999999999, 7777777, 55555). EXECUTE. IF (fin_year = 201011) PERF = compsogs + compprts + compprof + compprod + compjobs + comptra + compsoc + compcont + compit. EXECUTE. I have tried different ways to replace the sysmis with the max value in PERF using : IF SYSMIS(PERF) NPERF = MAX(PERF) do if missing(PERF). COMPUTE NPERF = MAX(PERF). end if. EXECUTE . USE ALL. COMPUTE filter_$=(MISSING(PERF)). VARIABLE LABELS filter_$ 'MISSING(PERF) (FILTER)'. VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'. FORMATS filter_$ (f1.0). FILTER BY filter_$. EXECUTE. COMPUTE NPERF = MAX(PERF). EXECUTE . SORT CASES BY FILE. SPLIT FILE LAYERED BY FILE. if (PERF=$SYSMIS) NPERF=MAX(PERF). EXECUTE. But it does not work. I still could not analyse the correlation and regression. What is the right code? PLS HELP. -- View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Recoding-system-missing-with-max-value-tp5729895.html Sent from the SPSSX Discussion mailing list archive at Nabble.com. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
Hi Gene,
Yes you are right, it is a long format. Thank you for you suggestion. I managed to get what I really want using your code. I really appreciate it so much. Thank you But can you pls explain it to me why must I sort the case year(d) and not (a). And one more thing, is it possible if I would like to recode PERF into NPERF using the code that you give. I have try but it does not work. Am I missing anything? You are a real life saving. Thanks a lot. MIN |
In reply to this post by Maguin, Eugene
Hi Gene,
I have one more problem. I have one more variable INOV that is on 2010 data. SO the data look like this: id year HC INOV PERF 7 2009 5 . . 8 2009 11 . . 7 2010 . 9 . 8 2010 . 5 . 7 2011 . . 13 8 2011 . . 7 I am trying to do the same like PERF to INOV where I am going to replace id 7-PERF -2009 & 2010 with 13 and id 8-PERF -2009 & 2010 with 7. Similar to PERF, I am trying to do the same to INOV too, unfortunately the code that you give works on PERF but for INOV it turn out to be all sysmis(.). What should I do. I used the same code that you give and replace it : Sort cases id (a) year(d). If (id eq lag(id)) INOV=lag(INOV). But it doesn't works. Pls help me. |
This post was updated on .
In reply to this post by Maguin, Eugene
Hi Gene,
It's me again. Yes I have figure out the reasons behind the code. I have figure out how to deal with INOV using this code: Sort cases ABSBID (a) fin_year(d). create ORGINNO = lead(INNO, 1). EXECUTE. If (ABSBID eq lag(ABSBID)) ORGINNO =lag(ORGINNO). EXECUTE. It works on my trial(not real but similar data) data but it does not work on my real database( that I have to generate using remote system). The database that I bought is quite complicated. Is there any different way to do this. Pls help me. Thanks. |
In reply to this post by PRogman
Thanks PRogman for your reply.
I hope that you understand the reasons why I need to replace system missing with value. If you have any idea to solve my latest problem I would really appreciate. |
In reply to this post by min
I'll reply separately to each message rather than aggregating the messages.
Embedded replies. -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of min Sent: Tuesday, June 23, 2015 10:32 PM To: [hidden email] Subject: Re: Recoding system missing with max value Hi Gene, Yes you are right, it is a long format. Thank you for you suggestion. I managed to get what I really want using your code. I really appreciate it so much. Thank you But can you pls explain it to me why must I sort the case year(d) and not (a). >>write out (on paper) a few example cases sorted in id(a) year(a) order. The 2009 value is missing so you look forward to the next record, the 2011 record for that case, note the value and carry it back to the 2009 record. That's easy for people to do and while spss can do that also using the Creat command, Lead function. The alternative, that I like better, is to sort the file in id (a) year(d) order and look backward to the 2009 record, note the value and carry that value forward to the 2011 record. I am doing the id (a) year(d) sort so that I can use the Lag function. Some people might roll their eyes at that and that's ok with me. And one more thing, is it possible if I would like to recode PERF into NPERF using the code that you give. I have try but it does not work. Am I missing anything? >>I don't understand this based on your original posting. But ... Sure (I think). In my logic you have to look backwards at the 2011 record to get the PERF value to fill in the 2009 value. Once you done that, you can write Compute nperf=perf. You are a real life saving. Thanks a lot. MIN -- View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Recoding-system-missing-with-max-value-tp5729895p5729910.html Sent from the SPSSX Discussion mailing list archive at Nabble.com. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
Administrator
|
In reply to this post by min
Does the following accomplish what you want?
RECODE HC to PERF (sysmis = -999). SORT CASES by id year. CASESTOVARS /ID=id /INDEX=year /GROUPBY=VARIABLE. MISSING VALUES HC.2009 TO PERF.2011 (-999). LIST. OUTPUT: id HC.2009 HC.2010 HC.2011 INOV.2009 INOV.2010 INOV.2011 PERF.2009 PERF.2010 PERF.2011 7 5 -999 -999 -999 9 -999 -999 -999 13 8 11 -999 -999 -999 5 -999 -999 -999 7
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
In reply to this post by min
Resorting your file.
id year HC INOV PERF 7 2009 5 . . 7 2010 . 9 . 7 2011 . . 13 8 2009 11 . . 8 2010 . 5 . 8 2011 . . 7 Let's back up and step back and look at the whole problem as thus far presented. HC, PERF, and, probably, INOV as well are the sums of other variables. Once you done the sum computations you're done with those other variables. Bruce posted a good suggestion. Here is another. Aggregate outfile=*/break=id/hc inov perf=first(hc inov perf). The result of this will be 7 5 9 13 8 11 5 7 Which might not be what you need. Gene Maguin -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of min Sent: Tuesday, June 23, 2015 11:24 PM To: [hidden email] Subject: Re: Recoding system missing with max value Hi Gene, I have one more problem. I have one more variable INOV that is on 2010 data. SO the data look like this: id year HC INOV PERF 7 2009 5 . . 8 2009 11 . . 7 2010 . 9 . 8 2010 . 5 . 7 2011 . . 13 8 2011 . . 7 I am trying to do the same like PERF to INOV where I am going to replace id 7-PERF -2009 & 2010 with 13 and id 8-PERF -2009 & 2010 with 7. Similar to PERF, I am trying to do the same to INOV too, unfortunately the code that you give works on PERF but for INOV it turn out to be all sysmis(.). What should I do. I used the same code that you give and replace it : Sort cases id (a) year(d). If (id eq lag(id)) INOV=lag(INOV). But it doesn't works. Pls help me. -- View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Recoding-system-missing-with-max-value-tp5729895p5729911.html Sent from the SPSSX Discussion mailing list archive at Nabble.com. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
In reply to this post by min
To replace missing values of PERF and INOV with the maximum of each id, maybe this would solve it?
/PRogman DATASET CLOSE ALL. DATA LIST list / id year HC INOV PERF (5F4). BEGIN DATA. 7 2009 5 . . 8 2009 11 . . 7 2010 . 9 . 8 2010 . 5 . 7 2011 . . 13 8 2011 . . 7 END DATA. DATASET NAME Min. AGGREGATE /OUTFILE=* MODE=ADDVARIABLES OVERWRITEVARS=YES /BREAK=id /INOV_max=MAX(INOV) /PERF_max=MAX(PERF). IF MISSING(PERF) PERF=PERF_max. IF MISSING(INOV) INOV=INOV_max. EXECUTE. DELETE VARIABLES PERF_max INOV_max. LIST. id year HC INOV PERF 7 2009 5 9 13 8 2009 11 5 7 7 2010 . 9 13 8 2010 . 5 7 7 2011 . 9 13 8 2011 . 5 7 Number of cases read: 6 Number of cases listed: 6
|
Administrator
|
In reply to this post by min
* I have misgivings as to this 'manufacturing' of data, here is an approach with a simple AGGREGATE .
Good luck explaining your imputation strategy to 'hostile critics/reviewers/peers etc'. DATA LIST LIST /id year HC INOV PERF . BEGIN DATA 7 2009 5 . . 8 2009 11 . . 7 2010 . 9 . 8 2010 . 5 . 7 2011 . . 13 8 2011 . . 7 END DATA. AGGREGATE OUTFILE * MODE ADDVARIABLES /BREAK ID / new_hc new_inov new_perf =MAX( hc inov perf). DO REPEAT v=hc inov perf /n_v=new_hc new_inov new_perf . IF MISSING(v) v=n_v. END REPEAT. MATCH FILES / FILE * / DROP new_hc new_inov new_perf . LIST. id year HC INOV PERF 7.00 2009.00 5.00 9.00 13.00 8.00 2009.00 11.00 5.00 7.00 7.00 2010.00 5.00 9.00 13.00 8.00 2010.00 11.00 5.00 7.00 7.00 2011.00 5.00 9.00 13.00 8.00 2011.00 11.00 5.00 7.00 Number of cases read: 6 Number of cases listed: 6
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
In reply to this post by PRogman
Thank you for all your suggestions.
The code from PRogman works perfectly on my trial data but on my real database this error come out. How do I fix this. AGGREGATE /OUTFILE=* MODE=ADDVARIABLES OVERWRITEVARS=YES /BREAK=ABSBID /ORGINNO_max=MAX(ORGINNO) /PERF_max=MAX(PERF). >Error # 10912 in column 14. Text: MODE >The OUTFILE subcommand on the AGGREGATE command must be terminated by a >slash. >This command not executed. Thanks. |
In reply to this post by PRogman
Hi
I have tried this code but still fail. BTW the database that I'm using only support SPSS 11.5. AGGREGATE OUTFILE=* /MODE=ADDVARIABLES OVERWRITEVARS=YES /BREAK=ABSBID /ORGINNO_max=MAX(ORGINNO) /PERF_max=MAX(PERF). >Error # 10907 in column 4. Text: MODE >The OUTFILE subcommand on the AGGREGATE command must be followed by the >optional MISSING or PRESORTED or DOCUMENT subcommands or the required BREAK >subcommand. None of these was found. >This command not executed. >Error # 10920 in column 10. Text: ABSBID >The definition of a new variable on the AGGREGATE command specifies an >unknown or misspelled function name. The following are valid function >names: FGT, FIN, FIRST, FLT, FOUT, LAST, MAX, MEAN, MEDIAN, MIN, N, NMISS, >NU, NUMISS, PGT, PIN, PLT, POUT, SD and SUM. Does anyone knows why? |
Administrator
|
You need to then aggregate to an external file and then match that as a TABLE. The MODE modifier does not exist in 11.5 . If this is mysterious then search the archives and/or RTFM.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
In reply to this post by min
The Mode keyword was introduced in version 13. In one of your messages, the other one you sent this morning, you mentioned something about 11.5. What version of spss are you using?
Also, OVERWRITEVARS=YES should be OVERWRITE=YES I see that you said that the syntax listed works in your trial data but not in your real database. That's frustrating. Are you saying that this snippet works Get file='trialdata.sav'. AGGREGATE /OUTFILE=* MODE=ADDVARIABLES OVERWRITEVARS=YES /BREAK=ABSBID /ORGINNO_max=MAX(ORGINNO) /PERF_max=MAX(PERF). And this does not and gives the error you cited?. Get file='myrealdatabase.sav'. AGGREGATE /OUTFILE=* MODE=ADDVARIABLES OVERWRITEVARS=YES /BREAK=ABSBID /ORGINNO_max=MAX(ORGINNO) /PERF_max=MAX(PERF). Gene Maguin -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of min Sent: Wednesday, June 24, 2015 9:03 PM To: [hidden email] Subject: Re: Recoding system missing with max value Thank you for all your suggestions. The code from PRogman works perfectly on my trial data but on my real database this error come out. How do I fix this. AGGREGATE /OUTFILE=* MODE=ADDVARIABLES OVERWRITEVARS=YES /BREAK=ABSBID /ORGINNO_max=MAX(ORGINNO) /PERF_max=MAX(PERF). >Error # 10912 in column 14. Text: MODE The OUTFILE subcommand on the >AGGREGATE command must be terminated by a slash. >This command not executed. Thanks. -- View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Recoding-system-missing-with-max-value-tp5729895p5729930.html Sent from the SPSSX Discussion mailing list archive at Nabble.com. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
In reply to this post by min
Do what the error message suggests and put a slash before MODE.
=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
> Date: Wed, 24 Jun 2015 18:02:49 -0700 > From: [hidden email] > Subject: Re: Recoding system missing with max value > To: [hidden email] > > Thank you for all your suggestions. > > The code from PRogman works perfectly on my trial data but on my real > database this error come out. How do I fix this. > > AGGREGATE > /OUTFILE=* MODE=ADDVARIABLES OVERWRITEVARS=YES > /BREAK=ABSBID > /ORGINNO_max=MAX(ORGINNO) > /PERF_max=MAX(PERF). > > >Error # 10912 in column 14. Text: MODE > >The OUTFILE subcommand on the AGGREGATE command must be terminated by a > >slash. > >This command not executed. > |
Administrator
|
I suspect OP is using a non supported version (11.5) and MODE is NOT one of the options in OUTFILE.
Solution is to AGGREGATE to external file and then MATCH back as a TABLE. File needs to be SORTED for the MATCH to succeed. ---
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Hi David,
Since I am new in using syntax, I am not sure what is the code to your suggestion. If you don't mind, can you write the code for me. Thanks a lot. |
In reply to this post by Maguin, Eugene
Hi Gene,
The real data is using spss 11.5. and yes, it does not work on my real data. Do you have any other suggestion how to do this? Thanks. |
Free forum by Nabble | Edit this page |