|
Dear list: I checked out the syntax examples at spsstools.net in order to take many rows of data for each participant and convert them into one row. The sample syntax files work fine but when I use my data they do not work. I believe the problem may have to do with missing values and I am not sure how to handle them.
I have the following variables. ID, Sequencenumber, DV1 DV2. Highest sequence for any one subject is 5 The data looks something like this. ID Sequencenumber DV1 DV2 1 1 10 8 1 2 . . 1 3 5 5 1 4 8 3 2 1 6 5 3 1 8 2 3 2 7 . 3 3 . 7 3 4 8 . 3 5 7 4 4 1 2 6 5 1 2 8 5 2 7 3 and I want to make it look like. ID DV1_1 DV1_2 DV1_3 DV1_4 DV1_5 DV2_1 DV2_2 DV2_3 DV2_4 DV2_5 1 10 . 5 8 . 8 . 5 3 . 2 6 . . . . 5 . . . . 3 8 7 . 8 7 2 . 7 . 4 etc. I welcome any advice. thanks, martin sherman |
|
Try the command casestovars (see below) or use the Restructure option
under the Data menu data list fixed/id 1-2 Sequencenumber 4 DV1 6-7 DV2 9. begin data 1 1 10 8 1 2 1 3 5 5 1 4 8 3 2 1 6 5 3 1 8 2 3 2 7 3 3 7 3 4 8 3 5 7 4 4 1 2 6 5 1 2 8 5 2 7 3 end data. sort cases id Sequencenumber . casestovars/ id = id/ index = Sequencenumber/ groupby = variable. |
|
The restructure command works perfectly. thank you. martin
>>> "Judith Saebel" <[hidden email]> 10/8/2007 9:41 PM >>> Try the command casestovars (see below) or use the Restructure option under the Data menu data list fixed/id 1-2 Sequencenumber 4 DV1 6-7 DV2 9. begin data 1 1 10 8 1 2 1 3 5 5 1 4 8 3 2 1 6 5 3 1 8 2 3 2 7 3 3 7 3 4 8 3 5 7 4 4 1 2 6 5 1 2 8 5 2 7 3 end data. sort cases id Sequencenumber . casestovars/ id = id/ index = Sequencenumber/ groupby = variable. |
|
In reply to this post by msherman
Sort of one of my soap-box speeches -
At 09:18 PM 10/8/2007, Martin Sherman wrote: >I have the following variables. >ID, Sequencenumber, DV1 DV2. Highest sequence for any one subject is >5 > >The data looks something like this. >ID Sequencenumber DV1 DV2 >1 1 10 8 >1 2 . . >1 3 5 5 >1 4 8 3 >2 1 6 5 >3 1 8 2 >3 2 7 . >3 3 . 7 >3 4 8 . >3 5 7 4 >4 1 2 6 >5 1 2 8 >5 2 7 3 > >and I want to make it look like. > >ID DV1_1 DV1_2 DV1_3 DV1_4 DV1_5 DV2_1 >DV2_2 DV2_3 DV2_4 DV2_5 > 1 10 . 5 8 > . 8 . 5 > 3 . > 2 6 . . . > . 5 . . > . . > 3 8 7 . 8 > 7 2 . 7 > . 4 As you saw, Judith Saebel's advice to use CASESTOVARS, either directly as syntax or via the Restructure wizard, is exactly what you want. My advice or question: Do you really want to do this? For many purposes, data in 'long' organization, as you have it, works better than data in 'wide' organization, as it is after restructuring. It depends on what you are going to do, so I won't go farther; but do give one hard look whether the restructuring is a good idea. -Best of regards, Richard |
|
Responding to an off-list question, as it's probably of general
interest. At 11:50 AM 10/10/2007, Elena Tuerk wrote: >I've been struggling with this data management issue myself and had a >question for you regarding your preference for "long" vs. "wide" >organization. Can you still run paired samples t-tests for the same >subject ID with a long organization? As it is, I've been restructuring >data in order to get the format needed to run these analyses. I'd >appreciate your perspective. No, you need wide organization for a paired-sample t-test. That's true for within-subjects factors in other procedures as well, including at least GLM, MANOVA and GENLIN. I preach long organization, but organization choice actually depends on taste and circumstances. In yours, I'd consider long organization, with CASESTOVARS for just the variables needed for analysis immediately before running a procedure that requires wide organization. (But be careful: CASESTOVARS irrevocably drops variables not included in the restructured data. Make sure you're keeping a separate master copy of the original data.) Long organization is preferable or required for a few procedures - notably MEANS, which I like for producing descriptive statistics. (But don't do an ANOVA in MEANS if your BY variable is actually a within-subjects factor.) When you do data transformations with wide data, you commonly have to do them separately for each variable that's part of a within-subjects factor; repeating the code, or using a DO REPEAT, LOOP, or macro loop. With long organization, you usually need only one set of transformation code; the required loop is implicit in SPSS's own looping through the cases. However, if I were doing a lot of within-subjects analyses, I might well go to wide organization and stay there. That's particularly true if there are many variables that don't vary within subjects, and there's not much need for looping over within-subjects variables in transformation programs. (For example, sometimes you won't have to transform those variables at all.) Now - any help? -Best wishes and best of luck, Richard |
|
Richard,
Thanks for your comments, they were very helpful. I'm finding that it's most efficient to maintain a long organization so as to only create one set of variables for the items and one set of code (rather than write separate syntax to score pre- and post- data) and then to convert the variables of interest (e.g., scale scores) to pre- and post- for within-subject analyses. I've been working with a relatively small sample, so I've been doing this manually (which is not a great idea, of course) because I've run into problems with the CASETOVARS point-and-click menu. To elaborate (and elicit help!), if I select the "participant id" as the identifier variable and then the outcome variable "BDI Total Score" as the index variable, with times of measurement identified by a third variable, I keep running into errors. I would greatly appreciate any clarification so as to make better use of this tool. Best, Elena |
|
In reply to this post by Richard Ristow
Dear List,
I have 400 new ID numbers like the following (12644168). To fit the structure of my data, I want to insert the letter "A" in front of each ID number like the following (A12644168). Can anyone, share syntax on how to do this. Thanks, Stace --------------------------------- Take the Internet to Go: Yahoo!Go puts the Internet in your pocket: mail, news, photos & more. |
|
This is not the most elegant way but it works.
The new string 'A12644168' will be called new_id. I first create a new var called var01 which is all 'A's STRING var01 (A1). COMPUTE var01="A". Then create the new id number. STRING new_id (A9). COMPUTE new_id=CONCAT(RTRIM(var01),var02). You can then drop var01. DELETE VAR var01. Put it all together and it looks like this. STRING var01 (A1). COMPUTE var01="A". STRING new_id (A9). COMPUTE new_id=CONCAT(RTRIM(var01),var02). DELETE VAR var01. *************************************************************************************************************************************************************** Mark A. Davenport Ph.D. Senior Research Analyst Office of Institutional Research The University of North Carolina at Greensboro 336.256.0395 [hidden email] 'An approximate answer to the right question is worth a good deal more than an exact answer to an approximate question.' --a paraphrase of J. W. Tukey (1962) stace swayne <[hidden email]> Sent by: "SPSSX(r) Discussion" <[hidden email]> 10/10/2007 03:57 PM Please respond to stace swayne <[hidden email]> To [hidden email] cc Subject Inserting data Dear List, I have 400 new ID numbers like the following (12644168). To fit the structure of my data, I want to insert the letter "A" in front of each ID number like the following (A12644168). Can anyone, share syntax on how to do this. Thanks, Stace --------------------------------- Take the Internet to Go: Yahoo!Go puts the Internet in your pocket: mail, news, photos & more. |
|
In reply to this post by Tuerk, Elena
At 03:19 PM 10/10/2007, Elena Tuerk wrote:
>I'm finding that it's most efficient to maintain a long organization >so as to only create one set of variables for the items and one set of >code (rather than write separate syntax to score pre- and post- data) >and then to convert the variables of interest (e.g., scale scores) to >pre- and post- for within-subject analyses. Good. To my mind, an excellent strategy. >[But] I've run into problems with the CASETOVARS point-and-click menu. >If I select the "participant id" as the identifier variable and then >the outcome variable "BDI Total Score" as the index variable, with >times of measurement identified by a third variable, I keep running >into errors. I would greatly appreciate any clarification. I'd feel surer of this if I tested it, but I think you've hit a maddening quirk of the CASESTOVARS menu (or wizard). The quirky panel is Step 2 of 5: "Cases to Variables: Select Variables". It has a window for "Identifier Variable(s)" and one for "Index Variable(s)"; but, where the dickens to you put the variables you're actually transposing, like "BDI Total Score"? Answer: you OMIT THEM ALTOGETHER. That's far from intuitive. It's not comfortable to do, even if you know it's the right answer. But, try it. Now, "Index Variable(s)"? You should have a variable in your data that marks whether a case is the pre-test or the post-test for that subject. Use that as the index variable. |
|
This worked beautifully. Thanks so much for your help.
Richard Ristow wrote: > At 03:19 PM 10/10/2007, Elena Tuerk wrote: > >> I'm finding that it's most efficient to maintain a long organization >> so as to only create one set of variables for the items and one set >> of code (rather than write separate syntax to score pre- and post- >> data) and then to convert the variables of interest (e.g., scale >> scores) to pre- and post- for within-subject analyses. > > Good. To my mind, an excellent strategy. > >> [But] I've run into problems with the CASETOVARS point-and-click >> menu. If I select the "participant id" as the identifier variable and >> then the outcome variable "BDI Total Score" as the index variable, >> with times of measurement identified by a third variable, I keep >> running into errors. I would greatly appreciate any clarification. > > I'd feel surer of this if I tested it, but I think you've hit a > maddening quirk of the CASESTOVARS menu (or wizard). > > The quirky panel is Step 2 of 5: "Cases to Variables: Select > Variables". It has a window for "Identifier Variable(s)" and one for > "Index Variable(s)"; but, where the dickens to you put the variables > you're actually transposing, like "BDI Total Score"? Answer: you OMIT > THEM ALTOGETHER. > > That's far from intuitive. It's not comfortable to do, even if you > know it's the right answer. But, try it. > > Now, "Index Variable(s)"? You should have a variable in your data that > marks whether a case is the pre-test or the post-test for that > subject. Use that as the index variable. |
| Free forum by Nabble | Edit this page |
