|
I know this is possible in Excel, but I was wondering if SPSS had a similar
feature. The problem: Multiple data entries (String) in a single field separated by semi-colons. Is there syntax code to separate out the multiple data entries in a single field into separate fields for each datum? Thanks in advance for any help. Teddy ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
Well, my first recommendation would be to specify a semi-colon as a value delimiter when reading the data, but that only works for delimited text files (using either GET DATA or DATA LIST).
But assuming your stuck with a string value that contains multiple values delimited by a semi-colon: *sample data. data list free /oldvar (a22). begin data ab;abc;abcd;abcde;f end data. *real code starts here. string #temp(a22). compute #temp=oldvar. vector newvar(5, a5). loop if index(#temp, ";")>0. compute #index=index(#temp,";"). compute #i=#i+1. compute newvar(#i)=substr(#temp,1, #index-1). compute #temp=substr(#temp, #index+1). end loop. compute newvar(#i+1)=#temp. execute. The one catch here is that you either need to know the maximum number of values and the maximum width of a single value (for the Vector command that creates the new variables) or you need to overestimate to make sure you get all of them, in which case you may end up with a bunch of superfluous empty new variables. For example, "vector newvar(100,a100)" would also work in my example, but the defined string widths would be much wider than necessary and you'd have 95 unnecessary extra (and empty) variables. There may be more elegant solutions. -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Theodora B. Consolacion Sent: Thursday, November 29, 2007 1:09 PM To: [hidden email] Subject: Syntax to reverse concatenate in SPSS? I know this is possible in Excel, but I was wondering if SPSS had a similar feature. The problem: Multiple data entries (String) in a single field separated by semi-colons. Is there syntax code to separate out the multiple data entries in a single field into separate fields for each datum? Thanks in advance for any help. Teddy ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
In reply to this post by Theodora B. Consolacion
Oops. My bad. I forgot that scratch variables don't automatically get reinitialized...
*sample data. data list list /oldvar (a22). begin data ab;abc;abcd abcde;f end data. *real code starts here. string #temp(a22). compute #temp=oldvar. vector newvar(5, a5). compute #i=0. /*reinitialize the counter for each case. loop if index(#temp, ";")>0. compute #index=index(#temp,";"). compute #i=#i+1. compute newvar(#i)=substr(#temp,1, #index-1). compute #temp=substr(#temp, #index+1). end loop. compute newvar(#i+1)=#temp. execute. -----Original Message----- From: Oliver, Richard Sent: Thursday, November 29, 2007 3:33 PM To: 'Theodora B. Consolacion'; [hidden email] Subject: RE: Syntax to reverse concatenate in SPSS? Well, my first recommendation would be to specify a semi-colon as a value delimiter when reading the data, but that only works for delimited text files (using either GET DATA or DATA LIST). But assuming your stuck with a string value that contains multiple values delimited by a semi-colon: *sample data. data list free /oldvar (a22). begin data ab;abc;abcd;abcde;f end data. *real code starts here. string #temp(a22). compute #temp=oldvar. vector newvar(5, a5). loop if index(#temp, ";")>0. compute #index=index(#temp,";"). compute #i=#i+1. compute newvar(#i)=substr(#temp,1, #index-1). compute #temp=substr(#temp, #index+1). end loop. compute newvar(#i+1)=#temp. execute. The one catch here is that you either need to know the maximum number of values and the maximum width of a single value (for the Vector command that creates the new variables) or you need to overestimate to make sure you get all of them, in which case you may end up with a bunch of superfluous empty new variables. For example, "vector newvar(100,a100)" would also work in my example, but the defined string widths would be much wider than necessary and you'd have 95 unnecessary extra (and empty) variables. There may be more elegant solutions. -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Theodora B. Consolacion Sent: Thursday, November 29, 2007 1:09 PM To: [hidden email] Subject: Syntax to reverse concatenate in SPSS? I know this is possible in Excel, but I was wondering if SPSS had a similar feature. The problem: Multiple data entries (String) in a single field separated by semi-colons. Is there syntax code to separate out the multiple data entries in a single field into separate fields for each datum? Thanks in advance for any help. Teddy ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
At 08:59 PM 11/29/2007, Oliver, Richard wrote:
>Oops. My bad. I forgot that scratch variables don't automatically >get reinitialized... Sure. But, why not make #i the loop counter instead, and save initializing and incrementing it? The loop has a determinate upper limit: you're allowing for parsing out up to 5 tokens. (Your code would blow on an indexing error, if there were more than 5 in the string.) The LTRIM strips leading blanks from tokens; omit, if that's not desired. |-----------------------------|---------------------------| |Output Created |29-NOV-2007 22:20:49 | |-----------------------------|---------------------------| oldvar ab;abc;abcd abcde;f abcde;hijk;lmn;op;qr;s zzzz;yyy;xwvu;aeiou; Number of cases read: 4 Number of cases listed: 4 *real code starts here. string #temp(a22). compute #temp=oldvar. vector newvar(5, a5). loop #i = 1 TO 5. . compute #index=index(#temp,";"). . DO IF #index GT 0. . compute newvar(#i)=LTRIM(substr(#temp,1, #index-1)). . compute #temp=substr(#temp, #index+1). . ELSE. . compute newvar(#i)=LTRIM(#temp). . compute #temp=''. . END IF. end loop IF #index EQ 0. LIST. List |-----------------------------|---------------------------| |Output Created |29-NOV-2007 22:20:50 | |-----------------------------|---------------------------| oldvar newvar1 newvar2 newvar3 newvar4 newvar5 ab;abc;abcd ab abc abcd abcde;f abcde f abcde;hijk;lmn;op;qr;s abcde hijk lmn op qr zzzz;yyy;xwvu;aeiou; zzzz yyy xwvu aeiou Number of cases read: 4 Number of cases listed: 4 =================== APPENDIX: Test data =================== *sample data. data list list /oldvar (a22). begin data ab;abc;abcd abcde;f abcde;hijk;lmn;op;qr;s zzzz;yyy;xwvu;aeiou; end data. LIST. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
In reply to this post by Oliver, Richard
At 04:32 PM 11/29/2007, Oliver, Richard wrote:
>Assuming you're stuck with a string value that contains multiple >values delimited by a semi-colon: [...parsing logic omitted...] >The one catch here is that you either need to know the maximum >number of values and the maximum width of a single value (for the >Vector command that creates the new variables) or you need to >overestimate to make sure you get all of them, in which case you may >end up with a bunch of superfluous empty new variables. For >example, "vector newvar(100,a100)" would also work in my example, >but the defined string widths would be much wider than necessary and >you'd have 95 unnecessary extra (and empty) variables. > >There may be more elegant solutions. Yours is pretty much what I'd do myself. As for the limitations, I don't know any way around setting a maximum token length. But if you don't know the maximum number of tokens, you can write the tokens to separate records, using XSAVE; read back the resulting file, and use CASESTOVARS put the tokens into a set of variables instead of separate cases. (Is it obvious how to write this code?) This works because CASESTOVARS scans the data to determine the largest number of tokens in any case, and builds the variable list accordingly. I believe that it is the only command in native SPSS that can generate a data dictionary whose content depends on the data in the file (can anybody think of another?); that can also be done with Python, or by generating SPSS code from an SPSS program, writing it out, and INCLUDE/INSERTing it. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
In reply to this post by Richard Ristow
I was trying to provide a solution that would work (albeit not in an optimal fashion) even if you didn't know the maximum number of tokens.
-----Original Message----- From: Richard Ristow [mailto:[hidden email]] Sent: Thursday, November 29, 2007 9:24 PM To: Oliver, Richard; [hidden email] Cc: Theodora B. Consolacion Subject: Re: Syntax to reverse concatenate in SPSS? At 08:59 PM 11/29/2007, Oliver, Richard wrote: >Oops. My bad. I forgot that scratch variables don't automatically >get reinitialized... Sure. But, why not make #i the loop counter instead, and save initializing and incrementing it? The loop has a determinate upper limit: you're allowing for parsing out up to 5 tokens. (Your code would blow on an indexing error, if there were more than 5 in the string.) The LTRIM strips leading blanks from tokens; omit, if that's not desired. |-----------------------------|---------------------------| |Output Created |29-NOV-2007 22:20:49 | |-----------------------------|---------------------------| oldvar ab;abc;abcd abcde;f abcde;hijk;lmn;op;qr;s zzzz;yyy;xwvu;aeiou; Number of cases read: 4 Number of cases listed: 4 *real code starts here. string #temp(a22). compute #temp=oldvar. vector newvar(5, a5). loop #i = 1 TO 5. . compute #index=index(#temp,";"). . DO IF #index GT 0. . compute newvar(#i)=LTRIM(substr(#temp,1, #index-1)). . compute #temp=substr(#temp, #index+1). . ELSE. . compute newvar(#i)=LTRIM(#temp). . compute #temp=''. . END IF. end loop IF #index EQ 0. LIST. List |-----------------------------|---------------------------| |Output Created |29-NOV-2007 22:20:50 | |-----------------------------|---------------------------| oldvar newvar1 newvar2 newvar3 newvar4 newvar5 ab;abc;abcd ab abc abcd abcde;f abcde f abcde;hijk;lmn;op;qr;s abcde hijk lmn op qr zzzz;yyy;xwvu;aeiou; zzzz yyy xwvu aeiou Number of cases read: 4 Number of cases listed: 4 =================== APPENDIX: Test data =================== *sample data. data list list /oldvar (a22). begin data ab;abc;abcd abcde;f abcde;hijk;lmn;op;qr;s zzzz;yyy;xwvu;aeiou; end data. LIST. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
In reply to this post by Oliver, Richard
Thank you! This works fantastically for the data that I have.
cheers, teddy On Nov 29, 2007 5:59 PM, Oliver, Richard <[hidden email]> wrote: > Oops. My bad. I forgot that scratch variables don't automatically get > reinitialized... > > *sample data. > data list list /oldvar (a22). > begin data > ab;abc;abcd > abcde;f > end data. > *real code starts here. > string #temp(a22). > compute #temp=oldvar. > vector newvar(5, a5). > compute #i=0. /*reinitialize the counter for each case. > loop if index(#temp, ";")>0. > compute #index=index(#temp,";"). > compute #i=#i+1. > compute newvar(#i)=substr(#temp,1, #index-1). > compute #temp=substr(#temp, #index+1). > end loop. > compute newvar(#i+1)=#temp. > execute. > > -----Original Message----- > From: Oliver, Richard > Sent: Thursday, November 29, 2007 3:33 PM > To: 'Theodora B. Consolacion'; [hidden email] > Subject: RE: Syntax to reverse concatenate in SPSS? > > Well, my first recommendation would be to specify a semi-colon as a value > delimiter when reading the data, but that only works for delimited text > files (using either GET DATA or DATA LIST). > > But assuming your stuck with a string value that contains multiple values > delimited by a semi-colon: > > *sample data. > data list free /oldvar (a22). > begin data > ab;abc;abcd;abcde;f > end data. > *real code starts here. > string #temp(a22). > compute #temp=oldvar. > vector newvar(5, a5). > loop if index(#temp, ";")>0. > compute #index=index(#temp,";"). > compute #i=#i+1. > compute newvar(#i)=substr(#temp,1, #index-1). > compute #temp=substr(#temp, #index+1). > end loop. > compute newvar(#i+1)=#temp. > execute. > > The one catch here is that you either need to know the maximum number of > values and the maximum width of a single value (for the Vector command that > creates the new variables) or you need to overestimate to make sure you get > all of them, in which case you may end up with a bunch of superfluous empty > new variables. For example, "vector newvar(100,a100)" would also work in my > example, but the defined string widths would be much wider than necessary > and you'd have 95 unnecessary extra (and empty) variables. > > There may be more elegant solutions. > > -----Original Message----- > From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of > Theodora B. Consolacion > Sent: Thursday, November 29, 2007 1:09 PM > To: [hidden email] > Subject: Syntax to reverse concatenate in SPSS? > > I know this is possible in Excel, but I was wondering if SPSS had a > similar > feature. > > The problem: > Multiple data entries (String) in a single field separated by semi-colons. > > Is there syntax code to separate out the multiple data entries in a single > field into separate fields for each datum? > > Thanks in advance for any help. > Teddy > > ===================== > To manage your subscription to SPSSX-L, send a message to > [hidden email] (not to SPSSX-L), with no body text except the > command. To leave the list, send the command > SIGNOFF SPSSX-L > For a list of commands to manage subscriptions, send the command > INFO REFCARD > ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
| Free forum by Nabble | Edit this page |
