Dear List members
I have two variables PersonCNI (numeric ) and a date variable Personbirthdate (dd.mm.yyyy) Many of the personid’s have multiple records. The task I have to check if the personbirthdate has been correctly recorded for each record. Dummy data Personcni Personbirthdate 1234 21.04.1959 1234 21.04.1959 1234 21.04.1959 I would be thankful if anyone could help me with this. Many thanks Regards Thara vardhan
|
Given how the dob is written, I’m going to assume that it is an a10 variable and not a date variable. And, no missing data. I think this will do it. if (($casenum ge 2 or personcni eq lag(personcni)) and
Personbirthdate ne lag(Personbirthdate)) err=1. Gene Maguin From: SPSSX(r) Discussion [mailto:[hidden email]]
On Behalf Of Thara Vardhan Dear List members
I have two variables PersonCNI (numeric ) and a date variable Personbirthdate (dd.mm.yyyy)
Many of the personid’s have multiple records.
The task I have to check if the personbirthdate has been correctly recorded for each record.
Dummy data
Personcni Personbirthdate
1234 21.04.1959
1234 21.04.1959
1234 21.04.1959
I would be thankful if anyone could help me with this.
Many thanks
Regards
Thara vardhan
|
Administrator
|
If Personbirthdate is a true date variable, then use AGGREGATE with Personci as the BREAK variable, and add the SD of the birthdates to the file as a new variable. If that SD is greater than 0, not all birth dates are the same.
HTH.
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
Hi Bruce and Gene
Thank you so much for your help with this. Bruce your solution worked like magic. I have to confess I did try Aggregate in the first instance - i did break by personcni and for persondateofbirth i specified it as unweighted - i did not know that i had to specify SD. What a valuable tool you have taught me. Many thanks once again regards thara From: Bruce Weaver <[hidden email]> To: [hidden email] Date: 10/09/2013 11:09 Subject: Re: Checking if date of birth has been correctly recorded in each record for personid?s with multiple records Sent by: "SPSSX(r) Discussion" <[hidden email]> If Personbirthdate is a true date variable, then use AGGREGATE with Personci as the BREAK variable, and add the SD of the birthdates to the file as a new variable. If that SD is greater than 0, not all birth dates are the same. HTH. Maguin, Eugene wrote > Given how the dob is written, I'm going to assume that it is an a10 > variable and not a date variable. And, no missing data. > I think this will do it. > > if (($casenum ge 2 or personcni eq lag(personcni)) and Personbirthdate ne > lag(Personbirthdate)) err=1. > > Gene Maguin > > From: SPSSX(r) Discussion [mailto: > SPSSX-L@.UGA > ] On Behalf Of Thara Vardhan > Sent: Monday, September 09, 2013 8:33 PM > To: > SPSSX-L@.UGA > Subject: Checking if date of birth has been correctly recorded in each > record for personid?s with multiple records > > Dear List members > > I have two variables PersonCNI (numeric ) and a date variable > Personbirthdate (dd.mm.yyyy) > > Many of the personid's have multiple records. > > The task I have to check if the personbirthdate has been correctly > recorded for each record. > > Dummy data > > Personcni Personbirthdate > > 1234 21.04.1959 > > 1234 21.04.1959 > > 1234 21.04.1959 > > I would be thankful if anyone could help me with this. > > Many thanks > > Regards > > Thara vardhan > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ _ _ _ > > The information contained in this email is intended for the named > recipient(s) > only. It may contain private, confidential, copyright or legally > privileged > information. If you are not the intended recipient or you have received > this > email by mistake, please reply to the author and delete this email > immediately. > You must not copy, print, forward or distribute this email, nor place > reliance > on its contents. This email and any attachment have been virus scanned. > However, > you are requested to conduct a virus scan as well. No liability is > accepted > for any loss or damage resulting from a computer virus, or resulting from > a delay > or defect in transmission of this email or any attached file. This email > does not > constitute a representation by the NSW Police Force unless the author is > legally > entitled to do so. ----- -- Bruce Weaver [hidden email] http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." NOTE: My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. -- View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Checking-if-date-of-birth-has-been-correctly-recorded-in-each-record-for-personid-s-with-multiple-res-tp5721875p5721877.html Sent from the SPSSX Discussion mailing list archive at Nabble.com. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ All mail is subject to content scanning for possible violation of NSW Police Force policy, including the Email and Internet Policy and Guidelines. All NSW Police Force employees are required to familiarise themselves with these policies, available on the NSW Police Force Intranet.
|
Administrator
|
In reply to this post by Bruce Weaver
For either numeric or strings use MIN and MAX functions in AGGREGATE.
If they differ then the BDs are different. --
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Administrator
|
In reply to this post by Bruce Weaver
For *either* numeric *or* strings use MIN and MAX functions in AGGREGATE.
If they differ then the BDs are different. -- Bruce Weaver wrote > If Personbirthdate is a true date variable, then use AGGREGATE with > Personci as the BREAK variable, and add the SD of the birthdates to the > file as a new variable. If that SD is greater than 0, not all birth dates > are the same. > > HTH. > > Maguin, Eugene wrote >> Given how the dob is written, I'm going to assume that it is an a10 >> variable and not a date variable. And, no missing data. >> I think this will do it. >> >> if (($casenum ge 2 or personcni eq lag(personcni)) and Personbirthdate ne >> lag(Personbirthdate)) err=1. >> >> Gene Maguin >> >> From: SPSSX(r) Discussion [mailto: >> SPSSX-L@.UGA >> ] On Behalf Of Thara Vardhan >> Sent: Monday, September 09, 2013 8:33 PM >> To: >> SPSSX-L@.UGA >> Subject: Checking if date of birth has been correctly recorded in each >> record for personid?s with multiple records >> >> Dear List members >> >> I have two variables PersonCNI (numeric ) and a date variable >> Personbirthdate (dd.mm.yyyy) >> >> Many of the personid's have multiple records. >> >> The task I have to check if the personbirthdate has been correctly >> recorded for each record. >> >> Dummy data >> >> Personcni Personbirthdate >> >> 1234 21.04.1959 >> >> 1234 21.04.1959 >> >> 1234 21.04.1959 >> >> I would be thankful if anyone could help me with this. >> >> Many thanks >> >> Regards >> >> Thara vardhan >> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ >> _ _ _ _ >> >> The information contained in this email is intended for the named >> recipient(s) >> only. It may contain private, confidential, copyright or legally >> privileged >> information. If you are not the intended recipient or you have received >> this >> email by mistake, please reply to the author and delete this email >> immediately. >> You must not copy, print, forward or distribute this email, nor place >> reliance >> on its contents. This email and any attachment have been virus scanned. >> However, >> you are requested to conduct a virus scan as well. No liability is >> accepted >> for any loss or damage resulting from a computer virus, or resulting from >> a delay >> or defect in transmission of this email or any attached file. This email >> does not >> constitute a representation by the NSW Police Force unless the author is >> legally >> entitled to do so. ----- Please reply to the list and not to my personal email. Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" -- View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Checking-if-date-of-birth-has-been-correctly-recorded-in-each-record-for-personid-s-with-multiple-res-tp5721875p5721881.html Sent from the SPSSX Discussion mailing list archive at Nabble.com. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
For *either* numeric *or* strings use MIN and MAX functions
in AGGREGATE.
If they differ then the BDs are different. -- Hi David Thank you so much. But if i have understood these two functions then min stands for the first value and max the last one isn't it? So what happens if the middle value or record is the incorrect one? For example for a numeric variable let say Personid location 1234 1 1234 2 1234 1 where 1 is a supermarket and 2 is a chemist shop but record should also be 1 rather than 2. Looking forward to your reply. many thanks regards thara
|
Administrator
|
Actually Thara,
MIN returns the smallest value, MAX returns the largest value. In strings: A<B,..< Z, Numbers: 1 < 2 < ..99999999 Perhaps you are thinking of FIRST and LAST functions? -- <quote author="Thara Vardhan"> For *either* numeric *or* strings use MIN and MAX functions in AGGREGATE. If they differ then the BDs are different. -- Hi David Thank you so much. But if i have understood these two functions then min stands for the first value and max the last one isn't it? So what happens if the middle value or record is the incorrect one? For example for a numeric variable let say Personid location 1234 1 1234 2 1234 1 where 1 is a supermarket and 2 is a chemist shop but record should also be 1 rather than 2. Looking forward to your reply. many thanks regards thara _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ The information contained in this email is intended for the named recipient(s) only. It may contain private, confidential, copyright or legally privileged information. If you are not the intended recipient or you have received this email by mistake, please reply to the author and delete this email immediately. You must not copy, print, forward or distribute this email, nor place reliance on its contents. This email and any attachment have been virus scanned. However, you are requested to conduct a virus scan as well. No liability is accepted for any loss or damage resulting from a computer virus, or resulting from a delay or defect in transmission of this email or any attached file. This email does not constitute a representation by the NSW Police Force unless the author is legally entitled to do so.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Hi David
So sorry you are right i was actually thinking about first and last. I feel so foolish now. My sincere apologies. I will definitely use the method suggested by you for numeric and string values because i have to practically check these kind of values every day. Many thanks regards thara From: David Marso <[hidden email]> To: [hidden email] Date: 11/09/2013 15:36 Subject: Re: Checking if date of birth has been correctly recorded in each - reply to David's suggestion Sent by: "SPSSX(r) Discussion" <[hidden email]> Actually Thara, MIN returns the smallest value, MAX returns the largest value. In strings: A<B,..< Z, Numbers: 1 < 2 < ..99999999 Perhaps you are thinking of FIRST and LAST functions? -- <quote author="Thara Vardhan"> For *either* numeric *or* strings use MIN and MAX functions in AGGREGATE. If they differ then the BDs are different. -- Hi David Thank you so much. But if i have understood these two functions then min stands for the first value and max the last one isn't it? So what happens if the middle value or record is the incorrect one? For example for a numeric variable let say Personid location 1234 1 1234 2 1234 1 where 1 is a supermarket and 2 is a chemist shop but record should also be 1 rather than 2. Looking forward to your reply. many thanks regards thara _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ The information contained in this email is intended for the named recipient(s) only. It may contain private, confidential, copyright or legally privileged information. If you are not the intended recipient or you have received this email by mistake, please reply to the author and delete this email immediately. You must not copy, print, forward or distribute this email, nor place reliance on its contents. This email and any attachment have been virus scanned. However, you are requested to conduct a virus scan as well. No liability is accepted for any loss or damage resulting from a computer virus, or resulting from a delay or defect in transmission of this email or any attached file. This email does not constitute a representation by the NSW Police Force unless the author is legally entitled to do so. For *either* numeric *or* strings use MIN and MAX functions in AGGREGATE. If they differ then the BDs are different. -- Hi David Thank you so much. But if i have understood these two functions then min stands for the first value and max the last one isn't it? So what happens if the middle value or record is the incorrect one? For example for a numeric variable let say Personid location 1234 1 1234 2 1234 1 where 1 is a supermarket and 2 is a chemist shop but record should also be 1 rather than 2. Looking forward to your reply. many thanks regards thara _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ The information contained in this email is intended for the named recipient(s) only. It may contain private, confidential, copyright or legally privileged information. If you are not the intended recipient or you have received this email by mistake, please reply to the author and delete this email immediately. You must not copy, print, forward or distribute this email, nor place reliance on its contents. This email and any attachment have been virus scanned. However, you are requested to conduct a virus scan as well. No liability is accepted for any loss or damage resulting from a computer virus, or resulting from a delay or defect in transmission of this email or any attached file. This email does not constitute a representation by the NSW Police Force unless the author is legally entitled to do so. ----- Please reply to the list and not to my personal email. Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" -- View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Checking-if-date-of-birth-has-been-correctly-recorded-in-each-record-for-personid-s-with-multiple-res-tp5721875p5721905.html Sent from the SPSSX Discussion mailing list archive at Nabble.com. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ All mail is subject to content scanning for possible violation of NSW Police Force policy, including the Email and Internet Policy and Guidelines. All NSW Police Force employees are required to familiarise themselves with these policies, available on the NSW Police Force Intranet.
|
In reply to this post by thara vardhan-2
I would go with min and
max. the wrong data could be in some
other place in the set of records.
data list list/ID (f1) visit (f1) strdate (a10). begin data 1 1 9/11/2001 1 2 9/11/2001 1 3 9/11/2001 1 4 9/11/2001 1 5 9/11/2001 2 1 8/10/2000 2 2 8/10/2000 2 3 8/10/2000 2 4 10/8/2000 2 5 8/10/2000 end data. dataset name input. dataset declare aggfile. aggregate outfile=aggfile /break= ID / minstring = min(strdate) / maxstring = max(strdate) / firststring = first(strdate) / laststring = last(strdate). dataset activate aggfile. list. Art Kendall Social Research ConsultantsOn 9/10/2013 11:27 PM, Thara Vardhan [via SPSSX Discussion] wrote: For *either* numeric *or* strings use MIN and MAX functions in AGGREGATE.
Art Kendall
Social Research Consultants |
Just a note of caution, if you had a numeric variable with some null cases, any function in a mathematical mode would ignore it, consequently max, min and sd would not be appropriate, though the STRING function could be used to convert numerics to string for the checking process.
|
Administrator
|
NOT the case here at all!!! AGGREGATE does just fine when there are missing 'null' values.
You should test your 'theory' before propagating false information!
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
I take what Robert is saying is that if you assume missing is a "different" date, then using the numeric aggregate commands doesn't return information including that value, so the SD = 0 test will not quite work.
data list free / ID X. begin data 1 1 1 . 1 1 1 1 2 2 2 2 2 2 end data. AGGREGATE /OUTFILE=* MODE=ADDVARIABLES /BREAK=ID /X_sd = SD(X). Of course simple workarounds are to recode missings to some illogical value (will return 0 if all dates are missing), or just use the string functions like others have suggested. |
Administrator
|
This just fleshes out Andy's example a little.
new file. dataset close all. data list free / Note(A20) ID(f1) DateVar(adate10) StringDate (a10). begin data "All dates the same" 1 9/11/2013 "09/11/2013" "All dates the same" 1 9/11/2013 "09/11/2013" "All dates the same" 1 9/11/2013 "09/11/2013" "One date different" 2 9/11/2013 "09/11/2013" "One date different" 2 11/9/2013 "11/09/2013" "One date different" 2 9/11/2013 "09/11/2013" "One date missing " 3 9/11/2013 "09/11/2013" "One date missing " 3 9/11/2013 "09/11/2013" "One date missing " 3 . "" end data. AGGREGATE /OUTFILE=* MODE=ADDVARIABLES /BREAK=ID /SD_DateVar = SD(DateVar) /MIN_DateVar = MIN(DateVar) /MAX_DateVar = MAX(DateVar) /MIN_StDate = MIN(StringDate) /MAX_StDate = MAX(StringDate) . * Flag IDs where dates are not all the same. * 1) Using SD of DateVar method. COMPUTE FlagSD = SD_DateVar GT 0. * 2) Using Min NE Max method with the true date variable. COMPUTE FlagMinMax1 = Min_DateVar NE Max_DateVar. * 3) Using Min NE Max method with the string date variable. COMPUTE FlagMinMax2 = Min_StDate NE Max_StDate. FORMATS FlagSD to FlagMinMax2 (f1). LIST Note TO StringDate FlagSD to FlagMinMax2. OUTPUT: Note ID DateVar StringDate FlagSD FlagMinMax1 FlagMinMax2 All dates the same 1 09/11/2013 09/11/2013 0 0 0 All dates the same 1 09/11/2013 09/11/2013 0 0 0 All dates the same 1 09/11/2013 09/11/2013 0 0 0 One date different 2 09/11/2013 09/11/2013 1 1 1 One date different 2 11/09/2013 11/09/2013 1 1 1 One date different 2 09/11/2013 09/11/2013 1 1 1 One date missing 3 09/11/2013 09/11/2013 0 0 1 One date missing 3 09/11/2013 09/11/2013 0 0 1 One date missing 3 . 0 0 1 Number of cases read: 9 Number of cases listed: 9 As suggested by others, the flag that is based on the SD of the true date variable being greater than 0 fails in the case of a missing date; the min NE max flag using the true date variable also fails; but the min NE max flag based on the string date works. HTH.
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
Thank you Art, Bruce, Andy, David, Gene
and Robert for your support and tremendous help.
This time fortunately my data file did not have any missing values for personbirthdate. However since this one of the variables that I have to check regularly and quite often i do have missing values I created a test copy of the current data file and included a) some missing values and b) modified dates for 1 or 2 records for some personcni's. For all personcni with multiple records I ran the following syntax in 2 steps FINAL SYNTAX USED to CHECK oldpersonbirthdate records 1) Keep oldpersonbirthdate as a date variable - note oldpersonbirthdate is in the format yyyy/mm/dd SORT CASES BY PersonCNI. AGGREGATE /OUTFILE=* MODE=ADDVARIABLES /PRESORTED /BREAK=PersonCNI /OLDPersonBirthdate_sd=SD(OLDPersonBirthdate). COMPUTE FlagSD = oldpersonbirthDate_sd GT 0. 2) make both personcni and oldpersonbirthdate a string variable SORT CASES BY PersonCNI. AGGREGATE /OUTFILE=* MODE=ADDVARIABLES /BREAK=PersonCNI /OLDPersonBirthdate_min=MIN(OLDPersonBirthdate) /OLDPersonBirthdate_max=MAX(OLDPersonBirthdate) /OLDPersonBirthdate_first_1=FIRST(OLDPersonBirthdate) /OLDPersonBirthdate_last=LAST(OLDPersonBirthdate). COMPUTE FlagMinMax2 = oldPersonbirthDate_min NE oldPersonbirthDate_max. Flagminmax2 gives a value of 0 where they match and a value of 1 where they do not match BRUCE THIS WORKS WELL FOR PERSONS WITH MULTIPLE RECORDS HOWEVER THE PROBLEM IS WHERE I HAVE ONLY 1 RECORD PER PERSONCNI - I REMOVED the birthdate for A FEW OF THEM AND MADE THEM MISSING - SINCE BOTH MIN AND MAX HAVE NO VALUES IN THESE RECORDS VARIABLE FLAGMAX2 GIVES ME 0 AS A VALUE. HOW DO I OVERCOME THIS PROBLEM? For example Dummy data Persioncni oldpersonbirthdate OLDPersonBirthdate_sd OLDPersonBirthdate_min OLDPersonBirthdate_max OLDPersonBirthdate_first OLDPersonBirthdate_last FlagMinMax2 78910 missing . . 0 121314 missing . . 0 CAN YOU HELP ME PLEASE? Many thanks regards thara From: Bruce Weaver <[hidden email]> To: [hidden email] Date: 12/09/2013 04:11 Subject: Re: Checking if date of birth has been correctly recorded in each - reply to David's suggestion Sent by: "SPSSX(r) Discussion" <[hidden email]> This just fleshes out Andy's example a little. new file. dataset close all. data list free / Note(A20) ID(f1) DateVar(adate10) StringDate (a10). begin data "All dates the same" 1 9/11/2013 "09/11/2013" "All dates the same" 1 9/11/2013 "09/11/2013" "All dates the same" 1 9/11/2013 "09/11/2013" "One date different" 2 9/11/2013 "09/11/2013" "One date different" 2 11/9/2013 "11/09/2013" "One date different" 2 9/11/2013 "09/11/2013" "One date missing " 3 9/11/2013 "09/11/2013" "One date missing " 3 9/11/2013 "09/11/2013" "One date missing " 3 . "" end data. AGGREGATE /OUTFILE=* MODE=ADDVARIABLES /BREAK=ID /SD_DateVar = SD(DateVar) /MIN_DateVar = MIN(DateVar) /MAX_DateVar = MAX(DateVar) /MIN_StDate = MIN(StringDate) /MAX_StDate = MAX(StringDate) . * Flag IDs where dates are not all the same. * 1) Using SD of DateVar method. COMPUTE FlagSD = SD_DateVar GT 0. * 2) Using Min NE Max method with the true date variable. COMPUTE FlagMinMax1 = Min_DateVar NE Max_DateVar. * 3) Using Min NE Max method with the string date variable. COMPUTE FlagMinMax2 = Min_StDate NE Max_StDate. FORMATS FlagSD to FlagMinMax2 (f1). LIST Note TO StringDate FlagSD to FlagMinMax2. OUTPUT: Note ID DateVar StringDate FlagSD FlagMinMax1 FlagMinMax2 All dates the same 1 09/11/2013 09/11/2013 0 0 0 All dates the same 1 09/11/2013 09/11/2013 0 0 0 All dates the same 1 09/11/2013 09/11/2013 0 0 0 One date different 2 09/11/2013 09/11/2013 1 1 1 One date different 2 11/09/2013 11/09/2013 1 1 1 One date different 2 09/11/2013 09/11/2013 1 1 1 One date missing 3 09/11/2013 09/11/2013 0 0 1 One date missing 3 09/11/2013 09/11/2013 0 0 1 One date missing 3 . 0 0 1 Number of cases read: 9 Number of cases listed: 9 As suggested by others, the flag that is based on the SD of the true date variable being greater than 0 fails in the case of a missing date; the min NE max flag using the true date variable also fails; but the min NE max flag based on the string date works. HTH. Andy W wrote > I take what Robert is saying is that if you assume missing is a > "different" date, then using the numeric aggregate commands doesn't return > information including that value, so the SD = 0 test will not quite work. > > data list free / ID X. > begin data > 1 1 > 1 . > 1 1 > 1 1 > 2 2 > 2 2 > 2 2 > end data. > AGGREGATE > /OUTFILE=* > MODE=ADDVARIABLES > /BREAK=ID > /X_sd = SD(X). > > Of course simple workarounds are to recode missings to some illogical > value (will return 0 if all dates are missing), or just use the string > functions like others have suggested. ----- -- Bruce Weaver [hidden email] http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." NOTE: My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. -- View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Checking-if-date-of-birth-has-been-correctly-recorded-in-each-record-for-personid-s-with-multiple-res-tp5721875p5721918.html Sent from the SPSSX Discussion mailing list archive at Nabble.com. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ All mail is subject to content scanning for possible violation of NSW Police Force policy, including the Email and Internet Policy and Guidelines. All NSW Police Force employees are required to familiarise themselves with these policies, available on the NSW Police Force Intranet.
|
Administrator
|
I think you're saying that you also want the flag variable set to 1 when there is only 1 case for an ID and the date variable is missing (or blank for a string date). If so, you you'll need to have the AGGREGATE also generate a variable recording the number of records per ID (using NU function). Suppose that variable is called NumRecs:
COMPUTE #Cond1 = (NumRecs GT) 1 AND (oldPersonbirthDate_min NE oldPersonbirthDate_max). COMPUTE #Cond2 = (NumRecs EQ 1) AND (oldPersonbirthDate EQ ""). COMPUTE FlagMinMax2 = #Cond1 OR #Cond2. Does this flag the cases you want correctly?
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
Hi Bruce
Thank you so much for your prompt reply and help. This is exactly what i needed and it works - fabulous!! But i got an error message initially regarding (numrecs GT) 1- so I just changed it to as follows: COMPUTE #Cond1 = (NumRecs GT 1) AND (oldPersonbirthDate_min NE oldPersonbirthDate_max). COMPUTE #Cond2 = (NumRecs EQ 1) AND (oldPersonbirthDate EQ ""). COMPUTE FlagMinMax2 = #Cond1 OR #Cond2. Once again thank you all. regards thara From: Bruce Weaver <[hidden email]> To: [hidden email] Date: 12/09/2013 21:09 Subject: Re: Checking if date of birth has been correctly recorded in each - test check - problem with missing birthdate for single personcni records in the same file Sent by: "SPSSX(r) Discussion" <[hidden email]> I think you're saying that you also want the flag variable set to 1 when there is only 1 case for an ID and the date variable is missing (or blank for a string date). If so, you you'll need to have the AGGREGATE also generate a variable recording the number of records per ID (using NU function). Suppose that variable is called NumRecs: COMPUTE #Cond1 = (NumRecs GT) 1 AND (oldPersonbirthDate_min NE oldPersonbirthDate_max). COMPUTE #Cond2 = (NumRecs EQ 1) AND (oldPersonbirthDate EQ ""). COMPUTE FlagMinMax2 = #Cond1 OR #Cond2. Does this flag the cases you want correctly? Thara Vardhan wrote > Thank you Art, Bruce, Andy, David, Gene and Robert for your support and > tremendous help. > > This time fortunately my data file did not have any missing values for > personbirthdate. > > However since this one of the variables that I have to check regularly and > quite often i do have missing values I created a test copy of the current > data file and included a) some missing values and b) modified dates for > 1 or 2 records for some personcni's. > > For all personcni with multiple records I ran the following syntax in 2 > steps > > FINAL SYNTAX USED to CHECK oldpersonbirthdate records > > 1) Keep oldpersonbirthdate as a date variable - note oldpersonbirthdate is > in the format yyyy/mm/dd > > SORT CASES BY PersonCNI. > AGGREGATE > /OUTFILE=* MODE=ADDVARIABLES > /PRESORTED > /BREAK=PersonCNI > /OLDPersonBirthdate_sd=SD(OLDPersonBirthdate). > > COMPUTE FlagSD = oldpersonbirthDate_sd GT 0. > > 2) make both personcni and oldpersonbirthdate a string variable > > SORT CASES BY PersonCNI. > AGGREGATE > /OUTFILE=* MODE=ADDVARIABLES > /BREAK=PersonCNI > /OLDPersonBirthdate_min=MIN(OLDPersonBirthdate) > /OLDPersonBirthdate_max=MAX(OLDPersonBirthdate) > /OLDPersonBirthdate_first_1=FIRST(OLDPersonBirthdate) > /OLDPersonBirthdate_last=LAST(OLDPersonBirthdate). > > COMPUTE FlagMinMax2 = oldPersonbirthDate_min NE oldPersonbirthDate_max. > > Flagminmax2 gives a value of 0 where they match and a value of 1 where > they do not match > > BRUCE THIS WORKS WELL FOR PERSONS WITH MULTIPLE RECORDS HOWEVER THE > PROBLEM IS WHERE I HAVE ONLY 1 RECORD PER PERSONCNI - I REMOVED the > birthdate for A FEW OF THEM AND MADE THEM MISSING - SINCE BOTH MIN AND MAX > HAVE NO VALUES IN THESE RECORDS VARIABLE FLAGMAX2 GIVES ME 0 AS A VALUE. > HOW DO I OVERCOME THIS PROBLEM? > > For example Dummy data > > Persioncni oldpersonbirthdate OLDPersonBirthdate_sd > OLDPersonBirthdate_min OLDPersonBirthdate_max OLDPersonBirthdate_first > OLDPersonBirthdate_last FlagMinMax2 > > 78910 missing . . 0 > 121314 missing . . 0 > > > > CAN YOU HELP ME PLEASE? > > Many thanks > regards > thara > > > > > > > From: Bruce Weaver < > bruce.weaver@ > > > To: > SPSSX-L@.UGA > Date: 12/09/2013 04:11 > Subject: Re: Checking if date of birth has been correctly recorded > in each - reply to David's suggestion > Sent by: "SPSSX(r) Discussion" < > SPSSX-L@.UGA > > > > > > This just fleshes out Andy's example a little. > > new file. > dataset close all. > > data list free / Note(A20) ID(f1) DateVar(adate10) StringDate (a10). > begin data > "All dates the same" 1 9/11/2013 "09/11/2013" > "All dates the same" 1 9/11/2013 "09/11/2013" > "All dates the same" 1 9/11/2013 "09/11/2013" > "One date different" 2 9/11/2013 "09/11/2013" > "One date different" 2 11/9/2013 "11/09/2013" > "One date different" 2 9/11/2013 "09/11/2013" > "One date missing " 3 9/11/2013 "09/11/2013" > "One date missing " 3 9/11/2013 "09/11/2013" > "One date missing " 3 . "" > end data. > > AGGREGATE > /OUTFILE=* MODE=ADDVARIABLES > /BREAK=ID > /SD_DateVar = SD(DateVar) > /MIN_DateVar = MIN(DateVar) > /MAX_DateVar = MAX(DateVar) > /MIN_StDate = MIN(StringDate) > /MAX_StDate = MAX(StringDate) > . > > * Flag IDs where dates are not all the same. > > * 1) Using SD of DateVar method. > COMPUTE FlagSD = SD_DateVar GT 0. > > * 2) Using Min NE Max method with the true date variable. > COMPUTE FlagMinMax1 = Min_DateVar NE Max_DateVar. > > * 3) Using Min NE Max method with the string date variable. > COMPUTE FlagMinMax2 = Min_StDate NE Max_StDate. > > FORMATS FlagSD to FlagMinMax2 (f1). > LIST Note TO StringDate FlagSD to FlagMinMax2. > > > OUTPUT: > > Note ID DateVar StringDate FlagSD FlagMinMax1 > FlagMinMax2 > > All dates the same 1 09/11/2013 09/11/2013 0 0 0 > All dates the same 1 09/11/2013 09/11/2013 0 0 0 > All dates the same 1 09/11/2013 09/11/2013 0 0 0 > > One date different 2 09/11/2013 09/11/2013 1 1 1 > One date different 2 11/09/2013 11/09/2013 1 1 1 > One date different 2 09/11/2013 09/11/2013 1 1 1 > > One date missing 3 09/11/2013 09/11/2013 0 0 1 > One date missing 3 09/11/2013 09/11/2013 0 0 1 > One date missing 3 . 0 0 1 > > Number of cases read: 9 Number of cases listed: 9 > > As suggested by others, the flag that is based on the SD of the true date > variable being greater than 0 fails in the case of a missing date; the min > NE max flag using the true date variable also fails; but the min NE max > flag > based on the string date works. > > HTH. > > > > Andy W wrote >> I take what Robert is saying is that if you assume missing is a >> "different" date, then using the numeric aggregate commands doesn't > return >> information including that value, so the SD = 0 test will not quite > work. >> >> data list free / ID X. >> begin data >> 1 1 >> 1 . >> 1 1 >> 1 1 >> 2 2 >> 2 2 >> 2 2 >> end data. >> AGGREGATE >> /OUTFILE=* >> MODE=ADDVARIABLES >> /BREAK=ID >> /X_sd = SD(X). >> >> Of course simple workarounds are to recode missings to some illogical >> value (will return 0 if all dates are missing), or just use the string >> functions like others have suggested. > > > > > > ----- > -- > Bruce Weaver > bweaver@ > http://sites.google.com/a/lakeheadu.ca/bweaver/ > > "When all else fails, RTFM." > > NOTE: My Hotmail account is not monitored regularly. > To send me an e-mail, please use the address shown above. > > -- > View this message in context: > http://spssx-discussion.1045642.n5.nabble.com/Checking-if-date-of-birth-has-been-correctly-recorded-in-each-record-for-personid-s-with-multiple-res-tp5721875p5721918.html > > Sent from the SPSSX Discussion mailing list archive at Nabble.com. > > ===================== > To manage your subscription to SPSSX-L, send a message to > LISTSERV@.UGA > (not to SPSSX-L), with no body text except the > command. To leave the list, send the command > SIGNOFF SPSSX-L > For a list of commands to manage subscriptions, send the command > INFO REFCARD > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ _ _ _ _ > > All mail is subject to content scanning for possible violation of NSW > Police > Force policy, including the Email and Internet Policy and Guidelines. All > NSW > Police Force employees are required to familiarise themselves with these > policies, available on the NSW Police Force Intranet. > > > > > > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ _ _ _ > > The information contained in this email is intended for the named > recipient(s) > only. It may contain private, confidential, copyright or legally > privileged > information. If you are not the intended recipient or you have received > this > email by mistake, please reply to the author and delete this email > immediately. > You must not copy, print, forward or distribute this email, nor place > reliance > on its contents. This email and any attachment have been virus scanned. > However, > you are requested to conduct a virus scan as well. No liability is > accepted > for any loss or damage resulting from a computer virus, or resulting from > a delay > or defect in transmission of this email or any attached file. This email > does not > constitute a representation by the NSW Police Force unless the author is > legally > entitled to do so. ----- -- Bruce Weaver [hidden email] http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." NOTE: My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. -- View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Checking-if-date-of-birth-has-been-correctly-recorded-in-each-record-for-personid-s-with-multiple-res-tp5721875p5721933.html Sent from the SPSSX Discussion mailing list archive at Nabble.com. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ All mail is subject to content scanning for possible violation of NSW Police Force policy, including the Email and Internet Policy and Guidelines. All NSW Police Force employees are required to familiarise themselves with these policies, available on the NSW Police Force Intranet.
|
Administrator
|
Good catch on the misplaced right bracket, Thara. (My initial code didn't have the brackets, but I decided to add them for clarity, and didn't test again afterwards!)
Cheers, Bruce
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
Free forum by Nabble | Edit this page |