SPSSX Discussion

Fwd: complex restructure

Classic

List

Threaded

5 messages Options

alex mcveigh

Fwd: complex restructure

Dear all, this is my first post to the forum. I searched for a similar query but couldn’t find. I am using SPSS v20 on windows 7 32-bit.

I'm a postgrad medical researcher with some raw blood pressure (SBP) data in long format which needs reshaping to allow me to generate new variables to mixed-effects model. Extract of the data is here. the full dataset has >30000 rows.

ID	Treat_date	BP	PrePost_T
365	09/04/2009	171	1
365	09/04/2009	162	2
365	09/06/2009	171	1
365	09/06/2009	174	2
365	09/08/2009	163	1
365	09/08/2009	196	2
365	09/11/2009	154	1
365	9/13/2009	158	1
365	9/13/2009	180	2
350	10/20/2009	163	1
350	10/20/2009	143	2
350	10/22/2009	136	1
350	10/22/2009	140	2
350	10/25/2009	189	1
350	10/25/2009	185	2
355	02/01/2011	152	1
355	02/01/2011	133	2
355	02/02/2011	144	1
355	02/02/2011	144	2
355	02/03/2011	137	1
355	02/03/2011	142	2
355	02/06/2011	145	1
355	02/06/2011	139	2
355	02/08/2011	133	1
355	02/08/2011	169	2

Subjects (ID) had SBP measured both before and after a treatment (PrePost_T, 1=Pre-treatment, 2=Post-treatment) on several dates (Treat_date) usually 2-3 days apart. Data are unbalanced with varying number of observations per ID. My main interest is in the change in SBP on the same Treat_date. Ideally I need the Pre-Treatment BP in a column adjacent to the Post-treatment BP column. I tried using a variety of setups in CASESTOVARS using a combination of ID, Treat_date and PrePost_T to make a unique INDEX.

SORT CASES BY ID Treat_date PrePost_T.

CASESTOVARS

/INDEX=Treat_date PrePost_T

/GROUPBY=VARIABLE.

This led to either an error message saying my observations were not unique or generating 1 row per ID with all observations as new variables in wide format. My questions are:

1. I suspect there is an alternative method/syntax to separate my pre and post-treatment BPs in column PrePost_T for each ID but seems beyond my lateral thinking or SPSS knowledge. Can anyone help?

2. Assuming CASESTOVARS is the wrong approach, I would still find it useful in future to create a unique index from long format data using a combination of ID and 2 variables. Is this possible?

My thanks in advance

Alex McVeigh

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

Andy W

Re: Fwd: complex restructure

You may be confusing the INDEX subcommand with the ID subcommand for CASESTOVARS. Index just tells SPSS what to put as the suffix for the new variables. The ID subcommand (which can take more than one variable) tells SPSS which rows in the old dataset to collapse into one row in the new dataset.

Try this:

*************************************.
DATA LIST FREE / ID (F3.0) Treat_date (ADATE10) BP PrePost_T (2F3.0).
BEGIN DATA
365 09/04/2009 171 1
365 09/04/2009 162 2
365 09/06/2009 171 1
365 09/06/2009 174 2
365 09/08/2009 163 1
365 09/08/2009 196 2
365 09/11/2009 154 1
365 9/13/2009 158 1
365 9/13/2009 180 2
350 10/20/2009 163 1
350 10/20/2009 143 2
350 10/22/2009 136 1
350 10/22/2009 140 2
350 10/25/2009 189 1
350 10/25/2009 185 2
355 02/01/2011 152 1
355 02/01/2011 133 2
355 02/02/2011 144 1
355 02/02/2011 144 2
355 02/03/2011 137 1
355 02/03/2011 142 2
355 02/06/2011 145 1
355 02/06/2011 139 2
355 02/08/2011 133 1
355 02/08/2011 169 2
END DATA.

SORT CASES BY ID Treat_Date PrePost_T.
CASESTOVARS /ID = ID Treat_Date /INDEX = PrePost_T.
*************************************.

Andy W
apwheele@gmail.com
http://andrewpwheeler.wordpress.com/

alex mcveigh

Re: Fwd: complex restructure

David Marso

Re: Fwd: complex restructure

Administrator

Look up the LAG function! or use MATCH with FIRST function. Your call!

alex mcveigh wrote

Thanks Andy for the explanation and syntax. I obviously didn't understand how CASESTOVARS worked.
Using your syntax in my full raw data the restructure is partly running but stopped by an error. i have reproduced the error below adding the variable casenum for illustrative purposes (which i don't have in the real data):

DATA LIST FREE / casenum (F1.0) ID (F3.0) Treat_date (ADATE10) BP PrePost_T (2F3.0).
BEGIN DATA
1 350 10/20/2009 163 1
2 350 10/20/2009 143 2
3 350 10/22/2009 136 1
4 350 10/22/2009 140 2
5 350 10/25/2009 189 1
6 350 10/25/2009 185 2
7 355 02/01/2011 152 1
8 355 02/01/2011 133 2
9 355 02/02/2011 144 1
10 355 02/02/2011 144 2
11 355 02/03/2011 137 1
12 355 02/03/2011 142 2
13 355 02/06/2011 145 1
14 355 02/06/2011 139 2
15 355 02/08/2011 133 1
16 355 02/08/2011 136 1
17 355 02/08/2011 169 2
18 365 09/04/2009 171 1
19 365 09/04/2009 162 2
20 365 09/06/2009 171 1
21 365 09/06/2009 174 2
22 365 09/06/2009 176 2
23 365 09/08/2009 163 1
24 365 09/08/2009 196 2
25 365 09/13/2009 158 1
26 365 09/13/2009 180 2
END DATA.

SORT CASES BY ID Treat_Date PrePost_T.
CASESTOVARS /ID = ID Treat_Date /INDEX = PrePost_T.

The INDEX values for case 16 have occurred before in the cases with the same ID values.
Execution of this command stops.

if i delete/filter case 16 it occurs for further cases (22 in this simulated example). This is obviously due to data entry errors in the raw data with 2 pre or post-treatment BPs when there should only be one. to save time manually filtering the large dataset and (improve reproducibility of syntax for similar data) i wonder how i can filter cases to accept only the first value of BP for PrePost_T 1 or 2? in other words how to filter out casenum 16 and 22 in the example data?

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Jon K Peck

Re: Fwd: complex restructure

In reply to this post by alex mcveigh

See Data > Identify Duplicate Cases

Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621

From: alex mcveigh <[hidden email]>
To: [hidden email]
Date: 08/13/2014 06:44 AM
Subject: Re: [SPSSX-L] Fwd: complex restructure
Sent by: "SPSSX(r) Discussion" <[hidden email]>

Thanks Andy for the explanation and syntax. I obviously didn't understand how CASESTOVARS worked. Using your syntax in my full raw data the restructure is partly running but stopped by an error. i have reproduced the error below adding the variable casenum for illustrative purposes (which i don't have in the real data): DATA LIST FREE / casenum (F1.0) ID (F3.0) Treat_date (ADATE10) BP PrePost_T (2F3.0). BEGIN DATA 1 350 10/20/2009 163 1 2 350 10/20/2009 143 2 3 350 10/22/2009 136 1 4 350 10/22/2009 140 2 5 350 10/25/2009 189 1 6 350 10/25/2009 185 2 7 355 02/01/2011 152 1 8 355 02/01/2011 133 2 9 355 02/02/2011 144 1 10 355 02/02/2011 144 2 11 355 02/03/2011 137 1 12 355 02/03/2011 142 2 13 355 02/06/2011 145 1 14 355 02/06/2011 139 2 15 355 02/08/2011 133 1 16 355 02/08/2011 136 1 17 355 02/08/2011 169 2 18 365 09/04/2009 171 1 19 365 09/04/2009 162 2 20 365 09/06/2009 171 1 21 365 09/06/2009 174 2 22 365 09/06/2009 176 2 23 365 09/08/2009 163 1 24 365 09/08/2009 196 2 25 365 09/13/2009 158 1 26 365 09/13/2009 180 2 END DATA. SORT CASES BY ID Treat_Date PrePost_T. CASESTOVARS /ID = ID Treat_Date /INDEX = PrePost_T. The INDEX values for case 16 have occurred before in the cases with the same ID values. Execution of this command stops. if i delete/filter case 16 it occurs for further cases (22 in this simulated example). This is obviously due to data entry errors in the raw data with 2 pre or post-treatment BPs when there should only be one. to save time manually filtering the large dataset and (improve reproducibility of syntax for similar data) i wonder how i can filter cases to accept only the first value of BP for PrePost_T 1 or 2? in other words how to filter out casenum 16 and 22 in the example data? -- View this message in context:http://spssx-discussion.1045642.n5.nabble.com/Fwd-complex-restructure-tp5726939p5726943.htmlSent from the SPSSX Discussion mailing list archive at Nabble.com. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD