All,
Thanks in advance for any assistance you can offer. I'm not a terrific syntax writer, but I am trying very hard here so please be patient with me. Am trying to parse out some survey data where there are multiple responses in a single cell. Respondents to the survey have up to 3 choices for this question. There are 15 options. I need to parse the three choices out into separate columns/variables (I know this was a silly way to put the data but I have no choice here - not involved in the design of the instrument).

Data comes back to me looking like this. Q66 is a string variable (A10):

2,10,11
4,5,6
3
2,7
11,12,15
4,15
8

Here is my code:

STRING #TEMP(A50).
COMPUTE #TEMP=Q66.
VECTOR VAR(3, A2).
LOOP #I=1 TO 3.
COMPUTE #INDEX=INDEX(#TEMP, ",").
COMPUTE VAR(#I)=SUBSTR(#TEMP,1, #INDEX-1).
COMPUTE #TEMP=SUBSTR(#TEMP, #INDEX+1).
END LOOP IF INDEX(#TEMP, ",")=0.
EXECUTE.

This is working pretty well to put the first and second variables into separate columns if there is a comma present. It is not working for the rows where there is only 1 choice selected (no comma). Also, none of the third responses are being parsed out. SPSS does not seem to care for my #INDEX-1 and +1 statements. Does anyone see what I am doing wrong here or have a suggestion for another way to accomplish this task? I feel like I am close to getting this accomplished...

Many thanks again!
Nancy

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command INFO REFCARD |
Administrator
|
Nancy (looks like you are a quick study and avid to learn),
Please see my thread killer post/Coup de Grace at the end of the previous thread you posted. Simple macro solution. It deals with the last component as well as singlets correctly.

IIRC: In your original, you were trying to establish dummy codes. Looks like your current attempt just extracts the actual values. See my second macro in the CDG post for how to extract the actual elements and apply whatever format desired.

The issue in your code is that you are not picking up the last piece after the END LOOP IF. This macro does so (ELSE clause) if the delimiter is not present in the remainder (#ind=0).

I call this technique "Reverse Ouroboros" - think of a snake swallowing its tail, except this is a snake swallowing its head - Yeah I'm weird ;-))
--
DATA LIST /strX (A80).
BEGIN DATA
4
1,2,4,7
2,7,9,10,12
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
END DATA.

DEFINE strSPLIT ( Origstr !TOKENS(1)
                / New !TOKENS(1)
                / NumElem !TOKENS(1)
                / Delim !TOKENS(1) !DEFAULT (",") ).
STRING #cpyx(A80).
COMPUTE #cpyx=!OrigStr .
VECTOR !New (!Numelem).
RECODE !CONCAT(!New,1," TO ", !New,!NumElem, '(SYSMIS=0)').
FORMATS !CONCAT(!New,1," TO ", !New,!NumElem, '(F1.0)').
LOOP.
+  COMPUTE #ind=INDEX(#cpyx,!QUOTE(!Delim)).
+  DO IF (#ind GT 0).
+    COMPUTE !New(NUMBER(RTRIM(SUBSTR(#cpyx,1,#ind-1)),F2))=1.
+    COMPUTE #cpyx=SUBSTR(#cpyx,#ind+1).
+  ELSE .
+    COMPUTE !New(NUMBER(RTRIM(#cpyx),F2))=1.
+  END IF.
END LOOP IF #ind=0.
!ENDDEFINE .

SET PRINTBACK ON MPRINT ON.
strSPlit Origstr=strX New=X NumElem=15 .
LIST.

The variables are listed in the following order:
LINE 1: STRX
LINE 2: X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15

STRX: 4
X1: 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
STRX: 1,2,4,7
X1: 1 1 0 1 0 0 1 0 0 0 0 0 0 0 0
STRX: 2,7,9,10,12
X1: 0 1 0 0 0 0 1 0 1 1 0 1 0 0 0
STRX: 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
X1: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

Number of cases read: 4 Number of cases listed: 4
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
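The "swallow the head" loop in the macro above may be easier to follow outside of macro syntax. Here is a minimal Python sketch of the same idea, not the macro itself: find the delimiter, take everything before it, keep the remainder, and take the whole remainder when no delimiter is left. The name `split_to_dummies` and its parameters are made up for illustration, and it assumes every element is an integer between 1 and `num_elem`.

```python
def split_to_dummies(text, num_elem=15, delim=","):
    """Return num_elem 0/1 flags; flag k-1 is 1 when option k appears in text."""
    flags = [0] * num_elem
    rest = text.strip()
    while rest:
        ind = rest.find(delim)  # -1 when no delimiter is left (SPSS INDEX returns 0)
        if ind >= 0:
            # Delimiter found: take the head, keep the tail for the next pass.
            piece, rest = rest[:ind], rest[ind + 1:]
        else:
            # No delimiter: the remainder is the last (or only) element.
            piece, rest = rest, ""
        if piece.strip():
            flags[int(piece) - 1] = 1
    return flags


if __name__ == "__main__":
    for row in ["4", "1,2,4,7", "2,7,9,10,12"]:
        print(row, split_to_dummies(row))
```

The `else` branch is the part the original loop was missing: without it, a single value and the final value of any row are never assigned.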
In reply to this post by Nancy Rusinak-2
A very old-fashioned way to do it, posted because I want to test my flame-proofing.

If I recall correctly this problem has come up on this list in the last few months. There may even be a macro or python extension to handle it.

data list fixed/q66(a10).
begin data
2,10,11
4,5,6
3
2,7
11,12,15
4,15
8
end data.
write outfile= 'c:\project\temporary.txt' /q66 *.
*execute needed because WRITE is not a procedure.
execute.
*drafted using <file> <read text data>.
GET DATA /TYPE=TXT
 /FILE="C:\project\temporary.txt"
 /ENCODING='Locale'
 /DELCASE=LINE
 /DELIMITERS=","
 /ARRANGEMENT=DELIMITED
 /FIRSTCASE=1
 /IMPORTCASE=ALL
 /VARIABLES=
 V1 F2
 V2 F2
 V3 F2.
CACHE.
list.
Art Kendall
Social Research Consultants |
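The write-then-reread approach above amounts to letting a delimited-text reader do the splitting. A rough Python sketch of that idea follows, using the standard `csv` module in place of GET DATA. It assumes the same seven sample rows; `read_choices` and `width` are hypothetical names, and an in-memory string stands in for the temporary file.

```python
import csv
import io

# Stand-in for the temporary text file written from Q66.
RAW = "2,10,11\n4,5,6\n3\n2,7\n11,12,15\n4,15\n8\n"


def read_choices(stream, width=3):
    """Read comma-delimited rows and pad each one to `width` values with None."""
    rows = []
    for fields in csv.reader(stream):
        values = [int(f) for f in fields if f.strip()]
        values += [None] * (width - len(values))  # short rows get missing values
        rows.append(values[:width])
    return rows


choices = read_choices(io.StringIO(RAW))

if __name__ == "__main__":
    for row in choices:
        print(row)
```

As with the GET DATA version, rows with fewer than three choices simply come back with missing values in the trailing columns.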
In reply to this post by Nancy Rusinak-2
Try this:

data list fixed/q66(a10).
begin data
2,10,11
4,5,6
3
2,7
11,12,15
4,15
8
end data.
*adapted from <help> for parsing phone numbers.
STRING #workstr(A11).
COMPUTE #workstr = q66.
VECTOR var(3,f3).
LOOP #i = 1 to 2.
- COMPUTE #comma = CHAR.INDEX(#telstr,",").
- COMPUTE var(#i) = NUMBER(CHAR.SUBSTR(#workstr,1,#comma-1),f10).
- COMPUTE #workstr = CHAR.SUBSTR(#workstr,#comma+1).
END LOOP.
COMPUTE var(3) = NUMBER(#telstr,f10).
formats var1 to var3 (n2).
list /variables = q66 var1 to var3.
Art Kendall
Social Research Consultants |
In reply to this post by Nancy Rusinak-2
Please ignore my second post; it was mistakenly taken from an earlier draft of the syntax.
Art Kendall
Social Research Consultants |
In reply to this post by Nancy Rusinak-2
Nancy,
I also added a line to your code. That line is needed because the code uses the comma to find the break between values, so there has to be a break between the true last value and the nonexistent next value. I tested it and it works.

STRING #TEMP(A50).
COMPUTE #TEMP=Q66.
VECTOR VAR(3,A2).
If (substr(#temp,char.len(#temp),1) ne ',') #temp=concat(rtrim(q66),','). /*<<have to have this line.
LOOP #I=1 TO 3.
COMPUTE #INDEX=INDEX(#TEMP,",").
COMPUTE VAR(#I)=SUBSTR(#TEMP,1, #INDEX-1).
COMPUTE #TEMP=SUBSTR(#TEMP, #INDEX+1).
END LOOP IF INDEX(#TEMP,",")=0.
EXECUTE.

q66        VAR1  VAR2  VAR3
2,10,11    2     10    11
4,5,6      4     5     6
3          3
2,7        2     7
11,12,15   11    12    15
4,15       4     15
8          8

Number of cases read: 7 Number of cases listed: 7

Bruce,
Thank you for taking the time to look at my original. I'll test out what you suggested. I wanted to make the code independent of the max number of elements.

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Nancy Rusinak
Sent: Tuesday, April 23, 2013 7:58 AM
To: [hidden email]
Subject: What's wrong with my code? |
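The trailing-delimiter trick above can also be sketched in Python, purely as an illustration of the logic rather than as SPSS syntax. Once every string ends in a comma, each value (including a lone value or the last one) sits in front of a delimiter, so the loop body never has to special-case it. The name `split_fixed` and its parameters are hypothetical.

```python
def split_fixed(text, max_elem=3, delim=","):
    """Split text into at most max_elem string pieces, padding with ''."""
    rest = text.strip()
    if not rest.endswith(delim):
        rest += delim  # guarantee a delimiter after the true last value
    out = [""] * max_elem
    for i in range(max_elem):
        ind = rest.find(delim)
        if ind < 0:
            break
        out[i] = rest[:ind]      # everything before the delimiter
        rest = rest[ind + 1:]    # drop the piece just consumed
        if delim not in rest:    # mirrors END LOOP IF INDEX(...)=0
            break
    return out


if __name__ == "__main__":
    for row in ["2,10,11", "4,5,6", "3", "2,7", "11,12,15", "4,15", "8"]:
        print(row, split_fixed(row))
```

Note that this version still depends on the maximum number of elements (`max_elem`), just as the VECTOR declaration does in the syntax above.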
In reply to this post by Nancy Rusinak-2
This thread has gone out under two different subject headings: "What's wrong with my code?" and "parsing out one variable to many". Can we just stick to one?

John F Hall (Mr)
[Retired academic survey researcher]
Email: [hidden email]
Website: www.surveyresearch.weebly.com
Start page: www.surveyresearch.weebly.com/spss-without-tears.html

PS Gene’s code is nice

From: John F Hall [mailto:[hidden email]]

David

In all the years I've been designing and analysing surveys I've never needed to parse a script. I offered a solution to do the analysis in response to an off-list note from Nancy about MULT RESPONSE and copied my reply and her note to the list. In any case there were plenty of suggestions offered on parsing the script. The solution I offered took me about 20 minutes in which I modified Nancy's data, checked the FM, wrote and tested the syntax, then produced a specimen analysis.

Nancy has never used MULT RESPONSE, but I'm sure she'll need it in her analysis. Your solution creates a string of 1s and 0s over 15 variables which can be analysed using MULT RESPONSE in dichotomous mode. My SPSS syntax deals with codes 1 - 15 in up to three variables, leaving the original values intact. A further solution would be for your code to be modified to yield codes 1 - 15 each with its own variable. It depends whether one wants to use MULT RESPONSE in integer or dichotomous mode. I've frequently used both (15 variable) conventions in the same survey on the same variables by recoding 1-15 to 1 or the series of 1s to 1-15.

I've dealt with coding conventions like these over many years, and run many an SPSS job to convert between one convention and another, but in the days of 80-column cards the fewer columns used the cheaper the data-prep.

I'm busy writing tutorials, but will have a look at scripts etc when I have time. However, if they're not covered in Gradpack, I probably won't bother as I'm concerned mainly with entry level materials to get people started on, and getting an appetite for, the world of survey research and survey analysis.

John

-----Original Message-----

@John,
"time = money and I don’t have time to learn about strings".
Over the long haul you will save a lot of time=money by spending 20 minutes to learn about parsing strings.
I am surprised/shocked/appalled to read this from someone who claims to have been using SPSS since 1972?
--
John F Hall wrote
> Nancy
>
> I can see your problem, but other listers have suggested solutions to spread out the strings. I'm copying this back to the list so others can see it.
>
> How many cases do you have? Are there only up to three responses for each respondent? If so you need to spread the responses out on to three separate columns, which is what you've tried to do with your syntax. String manipulation is not my forte, I'm afraid. You can apply the same logic if there are more than three responses per case.
>
> I'm always aware that time = money and I don't have time to learn about strings. Provided the data set is not too large, I would probably attempt something by hand such as copying the strings into two more columns and then editing them. I would also try copying the column for Q.66 into Word to edit into something that data list can read as three variables (even if the 2nd and 3rd are blank).
>
> Here's my rough and ready solution:
>
> Copy your data into a txt file
>
> 0,1,14
> 2,6,7
> 10,11,15
> 3
> 0,4
> 8,10,12
>
> Replace all single digits with leading 0 and commas by blanks (I did this by hand)
>
> 00 01 14
> 02 06 07
> 10 11 15
> 03
> 00 04
> 08 10 12
>
> ... and save to nancy.txt on f:
>
> SPSS syntax:
>
> data list file 'f:\nancy.txt' rec 1
> / x1 1-2 x2 4-5 x3 7-8.
>
> list /cases 6.
>
> freq var x1 x2 x3.
>
> mult resp groups q66 (x1 to x3 (0,15)) /freq q66.
>
> From: Nancy Rusinak [mailto:nancy@]
> Sent: 23 April 2013 05:15
> To: John F Hall
> Subject: Re: parsing out one variable to many
>
> Hello Mr. Hall,
>
> Thank you for writing & attempting to assist me.
>
> I'm afraid I have never worked with MULT RESPONSE syntax in the past so it's all Greek to me at this point.
>
> I have worked a good bit with SUBSTR and am trying to create substrings using the "," as a delimiter. I'm almost there but, alas, not quite. Can you possibly help?
>
> Here's a sample of my data - variable is called Q66. It's string, 10 wide:
>
> 0,1,14
> 2,6,7
> 10,11,15
> 3
> 0,4
> 8,10,12
>
> I only want 3 variables when I finish. For example, for the first row of data, variable 1 would be "0," variable 2 would be "1," and variable 3 would be "14."
>
> Here's my syntax, which I found online and have been trying to adapt for my purposes:
>
> STRING #TEMP(A50).
> COMPUTE #TEMP=Q66.
> VECTOR VAR(3, A2).
> LOOP #I=1 TO 3.
> COMPUTE #INDEX=INDEX(#TEMP, ",").
> COMPUTE VAR(#I)=SUBSTR(#TEMP,1, #INDEX-1).
> COMPUTE #TEMP=SUBSTR(#TEMP, #INDEX+1).
> END LOOP IF INDEX(#TEMP, ",")=0.
> EXECUTE.
>
> When I run this, the first two variables are correctly populated for the data strings that contain a "," but not for data strings that do not. Those are null. Also, the third variable is not populated for any of the data - just null down the column. SPSS output logs the following: "the third argument to SUBSTR (the length) is missing or otherwise invalid. The argument must be a non-negative integer. The result has been set to the null string." This indicates to me that the #INDEX-1 and #INDEX+1 are an issue but I do not know a workaround...
>
> Any help you could offer would be greatly appreciated!
>
> Best,
>
> Nancy Rusinak, Ed.S.
>
> Sent from my iPad
>
> On Apr 22, 2013, at 10:38 AM, John F Hall <johnfhall@> wrote:
> Nancy
>
> In SPSS, easiest would be to have 15 variables, one for each response category. If you label each value (1st variable only) you can then use MULT RESPONSE to tabulate the responses.
>
> val lab v1 1 'Strategy' 2 ' ~~~~' etc etc
> mult resp category (v1 to v15 (1,15))
> /freq category.
>
> Another way is to label each variable and use MULT RESPONSE in dichotomous mode, but to do this you first have to recode the values for all variables to 1. I've played safe and kept your originals, but recoded to a new variable
>
> recode v1 to v15 (1 thru 15 = 1)(else =0) into r1 to r15.
> var lab r1 '~~~'
> /r2 '~~~'
> ~~~
> /r15 '~~~'.
> mult resp rcategory (r1 to r15 (1))
> /freq rcategory.
>
> There are examples of this (illustrated by multiple response questions from the British Social Attitudes survey 1987 and the European Social Survey 2002) in sections 3.2 and 3.3 (pp 27-35) and the 3rd slide show of my 2006 Old Dog, Old Tricks presentation SPSS usage in major surveys.
> (See: http://surveyresearch.weebly.com/old-dog-old-tricks-using-spss-syntax-to-beat-the-mouse-trap.html)
>
> Get back to me if you don't understand that or if you need further help.
>
> John F Hall (Mr)
>
> -----Original Message-----
> From: SPSSX(r) Discussion [mailto: ] On Behalf Of Nancy Rusinak
> Sent: 22 April 2013 15:54
> To:
> Subject: parsing out one variable to many
>
> I am collecting survey data. Respondents are asked to categorize their comments into different buckets. They can choose as many as they like or none at all. If they choose the first category, the data puts a "1" in the field. If they choose the first and fifth category, the data reads "1,5." There are 15 possible choices.
>
> I'd like to create a variable called "strategy" for category 1 and have a "1" put in that variable if the respondent chose that bucket; a variable called "communication" for category 2 and a "1" put in that variable if the respondent chose that bucket and so on.
>
> sample data, variable named qID_9109:
> 1,2
> 3,6,7
> 2
> 5,10,12,14
>
> Can anyone help here?

View this message in context: http://spssx-discussion.1045642.n5.nabble.com/parsing-out-one-variable-to-many-tp5719621p5719656.html |
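John's point about the two MULT RESPONSE conventions (up to three variables holding codes 1-15 versus fifteen 0/1 variables) is essentially a reversible recode. A small Python sketch of the round trip, offered only as an illustration: `codes_to_flags` and `flags_to_codes` are hypothetical names, and it assumes codes run from 1 to 15, so a code of 0 as in some of the sample rows would need its own handling.

```python
def codes_to_flags(codes, n_options=15):
    """Integer convention -> dichotomous convention (one 0/1 flag per option)."""
    flags = [0] * n_options
    for code in codes:
        if code is None:          # unused response slot
            continue
        if not 1 <= code <= n_options:
            raise ValueError(f"code {code} is outside 1..{n_options}")
        flags[code - 1] = 1
    return flags


def flags_to_codes(flags, width=3):
    """Dichotomous convention -> integer convention, padded with None."""
    codes = [i + 1 for i, flag in enumerate(flags) if flag == 1]
    if len(codes) > width:
        raise ValueError(f"more than {width} options selected")
    return codes + [None] * (width - len(codes))


if __name__ == "__main__":
    flags = codes_to_flags([2, 10, 11])
    print(flags)
    print(flags_to_codes(flags))
```

Which direction is needed depends on whether the tabulation is run in integer or dichotomous mode, as described in the message above.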
Administrator
|
I suspect Nancy has sorted out the remaining issue, given she was one tiny step away from closure earlier this AM. I do agree that, for the sake of continuity, follow-ups should be pursued in the initial thread. OTOH: Gotta give newbies the benefit of the doubt ;-)
I would personally prefer sometimes that a thread gets a new post with residual issues rather than the 20-30 post many branched 'what is where' and 'who posted what/when' craziness that often transpires ;-) . For example the INPUT DATA thread has branched into a 'dialog' re the +/- of 'newer *insert tongue into cheek* post 1961' but unfortunately unfamiliar to many Rasch type models over 'classical' SUM(Pi*Xi) approach which is somewhat OT at this point despite the very intriguing and enlightening discussion. Parse that ;-) --
|