Bruce Weaver wrote:
In that case, here is a quick stab at including the AGGREGATE command in a macro that loops through the variables.
* Create a sample dataset.
NEW FILE.
DATASET CLOSE all.
DATA LIST list / ID (F5.0) v1 to v5 (5A12).
BEGIN DATA
1 test house tree nothing none
2 test garden car nothing key
3 sky --- people key nothing
END DATA.
LIST.

* Insert the AGGREGATE command in a looping macro.
DEFINE !Flags ( Root  = !CHAREND('/') /
                First = !CHAREND('/') /
                Last  = !CMDEND )
!DO !i = !First !TO !LAST
 !LET !V = !CONCAT(!Root,!i)
 !LET !Flag = !CONCAT("Flag",!V)
 AGGREGATE
  /BREAK = !V
  /!Flag = NU.
 RECODE !Flag (1=0) (ELSE=1).
 FORMATS !Flag(F1).
 VARIABLE LABELS !Flag !CONCAT(!V," value appears 2 or more times").
!DOEND
EXECUTE.
!ENDDEFINE.

* Call the macro.
*SET MPRINT ON.
!Flags Root = V / First = 1 / Last = 5.
*SET MPRINT OFF.
LIST FlagV1 to FlagV5.

Output from LIST:

FlagV1 FlagV2 FlagV3 FlagV4 FlagV5

     1      0      0      1      0
     1      0      0      1      0
     0      0      0      0      0

Number of cases read:  3    Number of cases listed:  3
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/
"When all else fails, RTFM."
David Marso wrote:
Bruce Weaver posted:
"In that case, here is a quick stab at including the AGGREGATE command in a macro that loops through the variables. " <SNIP> I don't know if I like the one data pass per variable here ;-( Why not simply wrap FREQUENCIES in OMS with SPLIT FILE on ID ? Not sure why someone would want to do this in the first place.
Bruce Weaver wrote:
Good point, FREQUENCIES with OMS would eliminate all those data passes. But if I understood correctly, Emma does not want the SPLIT FILE by ID.
In reply to this post by Maguin, Eugene
Yes, that's right. If I find some with the same value, I will have a closer look at the data and probably delete those cases.
In reply to this post by Bruce Weaver
Hi Bruce,
Your syntax works very well :-) Is there a chance to make it more general? For example, if the variables are not named v_1 up to v_100, but somewhat haphazardly, like q3_1, q3_1_1, q_3_2, q4? I tried to adapt it but I didn't succeed... Also, what do you mean by "SPLIT FILE by ID"? I really appreciate your help!
An alternative way to work this problem is through VARSTOCASES followed by AGGREGATE in ADDVARIABLES mode. A non-duplicated value will show an NU function count of 1. What you do next depends on what the resulting dataset is to be. If the intent is to retain the first instance, eliminate (i.e., blank) instances 2 to n, and then restructure the dataset back to wide format, that's just one data pass (I believe, but have not actually tested that part) and a CASESTOVARS.
Gene Maguin
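A minimal, untested sketch of Gene's long-format idea, again using Bruce's v1 to v5 sample variables (the choice of BREAK variable is an assumption; break on both the index and the value if duplicates only matter within the same original variable):

* Stack the string variables into one long column.
VARSTOCASES /MAKE value FROM v1 TO v5 /INDEX = varnum.
* Add the number of cases sharing each value to every row.
AGGREGATE OUTFILE = * MODE = ADDVARIABLES
 /BREAK = value
 /nvals = NU.
* nvals = 1 means the value is unique in the stacked file; blanking the
* repeats and running CASESTOVARS would restore the wide layout.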
In reply to this post by emma78
This would be a good place to use the SPSSINC SELECT VARIABLES extension command.
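For example, something along these lines (a sketch from memory: the MACRONAME and PATTERN keywords should be checked against the extension's installed help, and the regular expression simply assumes the names of interest start with q):

* Build a macro, !qvars, that expands to every variable whose name
* matches the pattern, then use it wherever a variable list is needed.
SPSSINC SELECT VARIABLES MACRONAME = "!qvars"
 /PROPERTIES PATTERN = "q.*".
FREQUENCIES VARIABLES = !qvars.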
Sorry for my stupid questions, but how can I use it for the macro? Obviously I haven't got a clue... This doesn't work:

DEFINE !Flags ( First = !CHAREND('/') /
                Last  = !CMDEND )
!DO !i = !First !TO !LAST
 !LET !V = !qish
 !LET !Flag = !CONCAT("Flag",!i)
 AGGREGATE
  /BREAK = !V
  /!Flag = NU.
 RECODE !Flag (1=0) (ELSE=1).
 FORMATS !Flag(F1).
 VARIABLE LABELS !Flag !CONCAT(!V,"mehrfach").
!DOEND
EXECUTE.
!ENDDEFINE.

* Call the macro.
*SET MPRINT ON.
!Flags First = 1 / Last = 3.
David Marso wrote:
In reply to this post by Maguin, Eugene
I for one am having extreme difficulty grasping what the OP actually wants to achieve here, and whether there is even any utility in such a thing. What is the point of this exercise in the first place? Do you REALLY want to remove all cases for which there is a duplicated value in any variable? That sounds truly suspect and a very weird thing to do, in my opinion.
Hi David,
Yes, I want to delete them because I do not know if one person filled out the survey twice. If the data in the string variables are the same, it's a hint of duplicates for me. It sounds weird, but unfortunately the data look like that...
David Marso wrote:
Sounds like overkill as stated. Wouldn't it be more reasonable to consider cases with some substantial number of same answers as duplicates rather than basing it on a single match?
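For what it's worth, exact full-record duplicates can be flagged without any macro via the MATCH FILES /FIRST trick. This sketch reuses the v1 to v5 names from Bruce's sample data and only catches identical answer sets, not the fuzzier "substantial number of matches" idea, which would need pairwise comparisons:

* Flag every case that repeats an earlier case's complete set of answers.
SORT CASES BY v1 v2 v3 v4 v5.
MATCH FILES /FILE = * /BY v1 v2 v3 v4 v5 /FIRST = PrimaryCase.
COMPUTE PossibleDup = 1 - PrimaryCase.
* PossibleDup = 1 marks the second and later occurrences for review.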
The probability that someone filled out a survey WITH MULTIPLE OPENS and wrote the exact same thing in all of them (case and all) is VIRTUALLY 0, so that would not find duplicate cases at all, I argue. If that's what you are looking for, it is a waste of time.
Bruce Weaver wrote:
In reply to this post by emma78
Here is a modified version of the macro that takes a single argument, which is a list of variables.
* Emma had problems with macro because variable names
* are odd (e.g., q3_1, q3_1_1, q_3_2, q4).
* Create a sample dataset using those variable names.
NEW FILE.
DATASET CLOSE all.
DATA LIST list / ID (F5.0) q3_1 q3_1_1 q_3_2 q4 q4_1 (5A12).
BEGIN DATA
1 test house tree nothing none
2 test garden car nothing key
3 sky --- people key nothing
END DATA.
LIST.

* Here is a modified version of the macro that takes
* a LIST of variable names.
DEFINE !Flags (Vlist = !CMDEND )
!DO !V !IN (!Vlist)
 !LET !Flag = !CONCAT("Flag_",!V)
 AGGREGATE
  /BREAK = !V
  /!Flag = NU.
 RECODE !Flag (1=0) (ELSE=1).
 FORMATS !Flag(F1).
 VARIABLE LABELS !Flag !CONCAT(!V," value appears 2 or more times").
!DOEND
EXECUTE.
!ENDDEFINE.

* Call the macro.
*SET MPRINT ON.
!Flags Vlist = q3_1 q3_1_1 q_3_2 q4 q4_1.
*SET MPRINT OFF.
LIST Flag_q3_1 to Flag_q4_1.

Output from LIST:

Flag_q3_1 Flag_q3_1_1 Flag_q_3_2 Flag_q4 Flag_q4_1

        1           0          0       1         0
        1           0          0       1         0
        0           0          0       0         0

Number of cases read:  3    Number of cases listed:  3

The SPLIT FILE by ID stuff was referring to a completely different way to approach this that David was suggesting. But that would assume you are only looking for duplicate values WITHIN the same ID (in a file with multiple rows per ID). As I said before, that does not appear to be what you want to do.

p.s. - Like David and some others, I too am not entirely clear on WHY you want to do what you're doing. But never mind! ;-)
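Not part of Bruce's post, but if the end goal is still to review (or delete) the flagged cases, a possible follow-up to the macro call above is:

* Flag any case with at least one repeated value, then list those cases.
COMPUTE AnyRepeat = MAX(Flag_q3_1 TO Flag_q4_1).
FORMATS AnyRepeat (F1).
TEMPORARY.
SELECT IF (AnyRepeat = 1).
LIST ID q3_1 TO q4_1.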
In reply to this post by Maguin, Eugene
It seems that you gathered the data via some online survey system. It makes a big difference whether the responses are "choose one" or "type in an answer".
Are the questions presented in a fixed order, a branched order, or a randomized order? If there are answers that are logically inconsistent, you might subset your cases, e.g., (males vs females) by (retired vs still working) etc. Would it be a clue if some cases were incomplete? How many cases do you have?
Art Kendall
Social Research Consultants