Have 2x datasets A & B.
Dataset A contains the subjects ID# & answers to questions phq1 to phq240. xxxx-xx-xxxx 1251212351.... Dataset B contains the item name, possible response categories, & weights to be assigned each response category. phq1 1 .3355 phq1 2 .4736 phq1 3 .2692 phq1 4 .2628 phq1 5 .3103 phq2 1 .2345 phq2 2 .3445 . . phq240 How do I use dataset B as a table lookup for dataset A so that I can score a subject's answers to questions phq1 to phq240 according to their assigned weights in dataset B? ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
Harrison,
I think there are a couple of ways to do this. What follows is untested. I think the easier way is to restructure dataset A from wide to long. I assume dataset A looks like Pid phq1 phq2 ... Phw240 Restructure then sort. Varstocases /make phq from phq1 to phq240/null=keep/count=vnum. Sort cases by vnum phq. The result will look like Pid phq vnum 1234-56-1111 1 1 1234-56-1112 1 1 1234-56-1113 1 1 ... 9999-99-9999 5 240 The sort is needed to match up with dataset B I'm assuming dataset B looks like Varid varvalue weight phq1 1 .3355 In dataset A the variable vnum carries the variable number, i.e.. Vnum=1 for phq1, etc. A similar variable needs to be created in dataset B. So in dataset B: Sort cases by varid varvalue. Do if ($casenum eq 1). + compute vnum=1. Else if (varid eq lag(varid)). + compute vnum=lag(vnum). Else. + compute vnum=lag(vnum)+1. End if. So now dataset B looks like Varid varvalue weight vnum phq1 1 .3355 1 ... phq1 5 .3103 1 phq2 1 .2345 2 Now match files using the table subcommand. Match files file='dataset A'/rename=(phq=varvalue)/ table='dataset B'/by vnum varvalue. Execute. The rename is needed because I gave the same functional variable different names in the two datasets. A rename will have to be done some where. I did it here, it could be done elsewhere. The resulting dataset will look like Pid varvalue vnum weight 1234-56-1111 1 1 .3355 1234-56-1122 1 1 .3355 1234-56-1145 1 1 .3355 ... 9999-99-9934 5 240 .2345 Next, sort cases to get back to the correct order. Sort cases by pid vnum. I'm not sure what you need to do from this point on but it sounds like you might simply add the weights for each person. Thus: Aggregate outfile=*/break=pid/total=sum(weight). >>>> Warning <<<< No allowance has been made for missing data in phq1 to phq240, either user or sysmis. If usermissing data is present, I think you will have to turn the usermissing designation off so that the match will work and will give sysmis to the weight variable in the final dataset. If you have sysmis values, I think you should recode the sysmis value to a user missing value and proceed as for usermissing. Careful data checking should be done. Gene Maguin -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Harrison Sent: Wednesday, January 26, 2011 2:59 PM To: [hidden email] Subject: Merging data or table lookup problem Have 2x datasets A & B. Dataset A contains the subjects ID# & answers to questions phq1 to phq240. xxxx-xx-xxxx 1251212351.... Dataset B contains the item name, possible response categories, & weights to be assigned each response category. phq1 1 .3355 phq1 2 .4736 phq1 3 .2692 phq1 4 .2628 phq1 5 .3103 phq2 1 .2345 phq2 2 .3445 . . phq240 How do I use dataset B as a table lookup for dataset A so that I can score a subject's answers to questions phq1 to phq240 according to their assigned weights in dataset B? ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
Administrator
|
In reply to this post by Harrisonlv
How about something like the following?
new file. dataset close all. data list list / id (a12) A1 A2 (2f1.0). begin data aaaa-aa-aaaa 1 2 bbbb-bb-bbbb 2 3 cccc-cc-cccc 4 5 end data. dataset name A . data list list / question (a6) answer(f1.0) wght (f5.4). begin data phq1 1 .3355 phq1 2 .4736 phq1 3 .2692 phq1 4 .2628 phq1 5 .3103 phq2 1 .2345 phq2 2 .3445 phq2 3 .2692 phq2 4 .2628 phq2 5 .3103 end data. dataset name B. * First, restructure dataset A from WIDE to LONG . dataset activate A. VARSTOCASES /MAKE Answer FROM A1 A2 /INDEX=item(2) /KEEP=id /NULL=KEEP. * Now prepare dataset B for merging. * Specifically, recode QUESTION into a numeric variable called ITEM . * AUTORECODE will not work properly, because phq10 < phq2, etc; so * the ITEM numbers would be out of order. Therefore, we must use * another method. The approach below assumes the questions are * in order from phq1 to phq240, as shown in mailing list post. DATASET ACTIVATE B. do if $casenum EQ 1. - compute item = 1. else. - compute #increment = question NE lag(question). - compute item = lag(item) + #increment. end if. exe. DATASET ACTIVATE A. sort cases by item answer. MATCH FILES FILE=* / TABLE='B' / BY item answer. exe. * If you want the original case order, sort again. sort cases by id item.
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
Free forum by Nabble | Edit this page |