I'd like some help finding a more efficient way to work a cases-to-vars problem. I am going to present only a simplified version of the actual dataset, because the actual dataset has additional but irrelevant complexity. The data are from an EMA (ecological momentary assessment) measure. Briefly, persons have a piece of software installed on their phones that presents a set of survey items to them at multiple times during the day. This is now the simplified version. The survey consists of 7 items. Item 1 is always presented first, and the remaining six items are presented in a randomized order for each presentation.
The data are provided to us in long format, and a data record has the following structure: Person, instance, item_id, item_label, response, item_time, response_time.
The data for one presentation of the survey items consist of 7 records, one for each of the items presented. Call_time is the time of presentation initiation, item_time is the time at which the item was presented, and response_time is the time at which the response was entered. Time is UTC seconds.
Casestovars works exactly as it should; however, the resulting dataset is not correctly structured because column contents are in call_time order. They need to be in a consistent order across records (person-call_time). I have a plan for this, which is basically a programmed casestovars operation using aggregate. That is going to be a lot of work (and error prone), as the number of items is not 7 but rather 45. I'm interested in better ideas.
Here is a sample dataset.
person,call_time,item_id,item_label,response,item_time,response_time,seq
1278,1520541611,1022,Qxtwii,8,1520541619,1520541623
1278,1520541611,1031,NrfghJ,6,1520541641,1520541643
1278,1520541611,1024,AgHmmw,5,1520541648,1520541657
1278,1520541611,1051,BpweBd,5,1520541665,1520541672
1278,1520541611,1059,LkxCwu,6,1520541689,1520541691
1278,1520541611,1029,BooDzz,4,1520541704,1520541712
1278,1520541611,1040,KeioUy,2,1520541719,1520541722
1389,1521539923,1022,Qxtwii,7,1521539951,1521539957
1389,1521539923,1024,AgHmmw,7,1521539972,1521539977
1389,1521539923,1029,BooDzz,4,1521539981,1521539984
1389,1521539923,1051,BpweBd,5,1521540003,1521540010
1389,1521539923,1059,LkxCwu,6,1521540018,1521540022
1389,1521539923,1040,KeioUy,6,1521540026,1521540034
1389,1521539923,1031,NrfghJ,5,1521540047,1521540055
1395,1521783301,1022,Qxtwii,5,1521783306,1521783312
1395,1521783301,1024,AgHmmw,3,1521783323,1521783332
1395,1521783301,1029,BooDzz,6,1521783347,1521783349
1395,1521783301,1040,KeioUy,1,1521783365,1521783374
1395,1521783301,1031,NrfghJ,5,1521783381,1521783388
1395,1521783301,1051,BpweBd,4,1521783398,1521783403
1395,1521783301,1059,LkxCwu,3,1521783408,1521783414
Thanks, Gene Maguin
|
I don't understand where the great complexity comes in.
There seem to be two choices for how to order the responses - by order
of presentation, which you easily obtain but don't want; or by item_label.
I think that you want to add a "line number" for order of presentation
to a person, so you can save that information; then sort by Person
and item_id before doing CasesToVars on Person and item_id.
What am I missing?
--
Rich Ulrich
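The suggestion above (save a presentation-order "line number", then restructure with columns keyed by item_id) can be sketched outside SPSS. The following is a minimal illustration in Python, not SPSS syntax and not anyone's actual production code. It assumes a case is identified by person plus call_time and that presentation order can be recovered by ranking item_time within a case. The function name long_to_wide and the output column names response_<item_id> and seq_<item_id> are hypothetical, chosen only for this example. The data are the first two presentations from the sample posted in this thread; the header's trailing seq column is left out because the sample rows carry no value for it.

```python
import csv
import io
from collections import defaultdict

# First two presentations from the sample data in this thread.
SAMPLE = """\
person,call_time,item_id,item_label,response,item_time,response_time
1278,1520541611,1022,Qxtwii,8,1520541619,1520541623
1278,1520541611,1031,NrfghJ,6,1520541641,1520541643
1278,1520541611,1024,AgHmmw,5,1520541648,1520541657
1278,1520541611,1051,BpweBd,5,1520541665,1520541672
1278,1520541611,1059,LkxCwu,6,1520541689,1520541691
1278,1520541611,1029,BooDzz,4,1520541704,1520541712
1278,1520541611,1040,KeioUy,2,1520541719,1520541722
1389,1521539923,1022,Qxtwii,7,1521539951,1521539957
1389,1521539923,1024,AgHmmw,7,1521539972,1521539977
1389,1521539923,1029,BooDzz,4,1521539981,1521539984
1389,1521539923,1051,BpweBd,5,1521540003,1521540010
1389,1521539923,1059,LkxCwu,6,1521540018,1521540022
1389,1521539923,1040,KeioUy,6,1521540026,1521540034
1389,1521539923,1031,NrfghJ,5,1521540047,1521540055
"""


def long_to_wide(rows):
    """Return one dict per (person, call_time) case, columns keyed by item_id."""
    # Group the long records into cases (assumption: case = person + call_time).
    cases = defaultdict(list)
    for row in rows:
        cases[(row["person"], row["call_time"])].append(row)

    wide = []
    for (person, call_time), items in sorted(cases.items()):
        record = {"person": person, "call_time": call_time}
        # Rank items by item_time to save presentation order (1 = shown first).
        by_time = sorted(items, key=lambda r: int(r["item_time"]))
        seq = {r["item_id"]: n for n, r in enumerate(by_time, start=1)}
        # Emit columns in item_id order so every case has the same layout.
        for r in sorted(items, key=lambda r: int(r["item_id"])):
            item = r["item_id"]
            record[f"response_{item}"] = int(r["response"])
            record[f"seq_{item}"] = seq[item]
        wide.append(record)
    return wide


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
wide = long_to_wide(rows)
for record in wide:
    print(record)
```

Each output record holds the responses in a fixed item_id order, while the seq_ columns keep the order in which the items were shown, so neither piece of information is lost in the restructure.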
From: SPSSX(r) Discussion <[hidden email]> on behalf of Maguin, Eugene <[hidden email]>
Sent: Monday, September 30, 2019 11:21 AM To: [hidden email] <[hidden email]> Subject: (complex) cases to vars problem
|
Hi Rich, Perhaps I've made the problem more complex than it really is, but that is why I'm asking. I hope there are other people on the list who work with EMA data and have encountered this problem.
I first want to add just two points. A "case" is defined by both person and call_time. Presentation order is recorded by item_time.
I appreciate your point about the choice between ordering by item_id and ordering by call_time. It seems like we can have one or the other but not both.
Thank you, Gene Maguin
From: Rich Ulrich <[hidden email]>
Sent: Monday, September 30, 2019 12:28 PM To: [hidden email]; Maguin, Eugene Subject: Re: (complex) cases to vars problem