|
Dear list members,
I am struggeling with finding a efficient and foolproof way to restructure a large datafile (10Mb). My data looks like this. For some cases there are extra information, that is spread over several lines. I want to restructure the datafile so that every case is on one line. The number of pieces of extra information varies (class), and I do not know what will be the maximum number. Common will from 0 to 10, but I suspect 20 pieces, or more, of extra info could take place, too. DATA LIST LIST (",")/ casenum(A8) class (A12) . begin data 09/1 ,1-SF-007 08/2010 ,1-SF- 08/2010 ,1-SF- 08/2009 ,1-SF-513.2; ,2-SF-711; ,3-SF-822; ,4-SF-902 08/2008 ,1-SF-471; ,2-SF-551; ,3-SF-711; ,4-SF-821; ,5-SF-902 END DATA. LIST. I would like to have the data in a format like this, where all extra information is joined on one long text string (Class_long). And I would like to move the extra pieces to the line with case number (because there are actually 50 more variables on this line, while the pieces of extra information are each alone on their line). CASENUM CLASS_LONG 09/1 1-SF-007 08/2010 1-SF- 08/2010 1-SF- 08/2009 1-SF-513.2;2-SF-711;3-SF-822;4-SF-902 08/2008 1-SF-471;2-SF-551;3-SF-711;4-SF-821;5-SF-902 What do you think is the best way to approach this? I can see three ways to restructure A) some LOOP with LAG and CONCAT B) first CASESTOVARS and then CONCAT C) python I am sure that there is a simple and elegant solution, that I am missing. Sincerely, Eero ________________________________________ Eero Olli Advisor the Equality and Anti-discrimination Ombud [hidden email] +47 2405 5951 POB 8048 Dep, N-0031 Oslo, Norway ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
Try this:
if (casenum ne ' ') x=$casenum. exe. if (casenum = ' ') x = lag(x). exe. SORT CASES BY x . CASESTOVARS /ID=x /GROUPBY=VARIABLE. Tim Daciuk Director, Worldwide Demo Resources SPSS Inc. -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Eero Olli Sent: Tuesday, January 06, 2009 6:54 AM To: [hidden email] Subject: Restructure datafile Dear list members, I am struggeling with finding a efficient and foolproof way to restructure a large datafile (10Mb). My data looks like this. For some cases there are extra information, that is spread over several lines. I want to restructure the datafile so that every case is on one line. The number of pieces of extra information varies (class), and I do not know what will be the maximum number. Common will from 0 to 10, but I suspect 20 pieces, or more, of extra info could take place, too. DATA LIST LIST (",")/ casenum(A8) class (A12) . begin data 09/1 ,1-SF-007 08/2010 ,1-SF- 08/2010 ,1-SF- 08/2009 ,1-SF-513.2; ,2-SF-711; ,3-SF-822; ,4-SF-902 08/2008 ,1-SF-471; ,2-SF-551; ,3-SF-711; ,4-SF-821; ,5-SF-902 END DATA. LIST. I would like to have the data in a format like this, where all extra information is joined on one long text string (Class_long). And I would like to move the extra pieces to the line with case number (because there are actually 50 more variables on this line, while the pieces of extra information are each alone on their line). CASENUM CLASS_LONG 09/1 1-SF-007 08/2010 1-SF- 08/2010 1-SF- 08/2009 1-SF-513.2;2-SF-711;3-SF-822;4-SF-902 08/2008 1-SF-471;2-SF-551;3-SF-711;4-SF-821;5-SF-902 What do you think is the best way to approach this? I can see three ways to restructure A) some LOOP with LAG and CONCAT B) first CASESTOVARS and then CONCAT C) python I am sure that there is a simple and elegant solution, that I am missing. Sincerely, Eero ________________________________________ Eero Olli Advisor the Equality and Anti-discrimination Ombud [hidden email] +47 2405 5951 POB 8048 Dep, N-0031 Oslo, Norway ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
| Free forum by Nabble | Edit this page |
