Hi All
I have to append (eg. add files) two datasets and they have different variables, what would be the most efficient way. Below are examples of the two datasets and the final dataset (how I need it to look). Dataset 1 Name Age Sex Score1 Smith, John 23 m 14 Jones, Terry 34 m 12 Johns, Tracey 25 f 25 Dataset 2 Firstname Surname Age Sex Score_1 Score_2 Samantha Jones 23 . 12 13 David Smith 34 . 12 24 Timothy Johnson 23 . 12 23 Final dataset Firstname Surname Age Sex Score1 Score2 John Smith 23 m 14 . Terry Jones 34 m 12 . Tracey Johns 25 f 25 . Samantha Jones 23 . 12 13 David Smith 34 . 12 24 Timothy Johnson 23 . 12 23 I need to pull apart the names in dataset1 to create 'Firstname' and 'Surname'. I then need to rename 'Score_1' and 'Score_2' so they are 'Score1' and 'Score2'. The 'Sex' variable in dataset1 is a string and 'Sex' in dataset2 is numeric (even though they are all missing) so I need to change Sex in dataset2 to be a string. The scores are formated differently in each dataset, for example, Score_1 (f6), Score1 (f8). Thanks in advacne for any help Cheers Craig |
If you can safely assume that the format of names in the first file is always "last name, first name" with a comma separating the two values, the job is fairly simple. Note that numeric variable width doesn't matter, but string variable width does. Numeric variables of the same name can vary in width across files; string variables of the same name must be the same defined width, and sometimes that requires a little extra work. Also, there's probably a more elegant solution to the the string/numeric sex variable problem.
****sample data***. data list list (";") /Name (a25) Age (f2) Sex (a2) Score1 (f8). begin data Smith, John; 23; m; 14 Jones, Terry; 34; m; 12 Johns, Tracey; 25; f; 25 end data. string firstname surname (a25). compute #comma=index(name, ","). compute surname=substr(name, 1, #comma-1). compute firstname=ltrim(substr(name, #comma+1)). dataset name dataset1. ***sample data***. data list list (";")/Firstname (a25) Surname (a25) Age (f3) Sex (f2) Score_1 (f6) Score_2 (f6). begin data Samantha;Jones; 23;; 12; 13 David;Smith; 34;; 12;24 Timothy; Johnson; 23;; 12; 23 end data. string gender (a2). *next two commands not technically necessary if sex is always sysmis. compute gender=string(sex, f2). execute. delete variables sex. dataset name dataset2. add files file=dataset1 /file=dataset2 /rename (Gender=Sex) (Score_1=Score1) (Score_2=Score2) /keep Firstname Surname Age Sex Score1 Score2. ________________________________ From: SPSSX(r) Discussion on behalf of Craig Hansen Sent: Wed 9/27/2006 6:59 PM To: [hidden email] Subject: Add filles and pull apart a string Hi All I have to append (eg. add files) two datasets and they have different variables, what would be the most efficient way. Below are examples of the two datasets and the final dataset (how I need it to look). Dataset 1 Name Age Sex Score1 Smith, John 23 m 14 Jones, Terry 34 m 12 Johns, Tracey 25 f 25 Dataset 2 Firstname Surname Age Sex Score_1 Score_2 Samantha Jones 23 . 12 13 David Smith 34 . 12 24 Timothy Johnson 23 . 12 23 Final dataset Firstname Surname Age Sex Score1 Score2 John Smith 23 m 14 . Terry Jones 34 m 12 . Tracey Johns 25 f 25 . Samantha Jones 23 . 12 13 David Smith 34 . 12 24 Timothy Johnson 23 . 12 23 I need to pull apart the names in dataset1 to create 'Firstname' and 'Surname'. I then need to rename 'Score_1' and 'Score_2' so they are 'Score1' and 'Score2'. The 'Sex' variable in dataset1 is a string and 'Sex' in dataset2 is numeric (even though they are all missing) so I need to change Sex in dataset2 to be a string. The scores are formated differently in each dataset, for example, Score_1 (f6), Score1 (f8). Thanks in advacne for any help Cheers Craig |
Free forum by Nabble | Edit this page |