SPSSX Discussion

SPSS: Question re merging and restructuring large files with repeated measures and missing cases

Marike

SPSS: Question re merging and restructuring large files with repeated measures and missing cases

Hello!

I would like to know how best to merge 4 large national survey data files
(each with about 18000 cases and about 500 variables), one for each of 4
repeated measures, in SPSS using a unique ID. However, cases are not always
present in each of the years. I would like to have both a wide and a long
file for my analyses. By using merge files --> add variables I am able to
create a wide file, but when I then restructure the data, the missing cases
are also included in the long file.

To make this clearer, here are 2 example files that show what I am struggling with.

File T1:
ID var1_T1 var2_T1 var3_T1 var4_T1
1003 41122 4 2 24
1008 41007 -4 1 1
1009 41007 -4 2 1
1010 21155 1 2 24
1011 11002 1 2 4

File T2:
ID var1_T2 var2_T2 var3_T2 var4_T2
1003 41122 1 2 24
1005 41122 1 2 13
1006 41072 2 1 4
1007 41034 999 999 13
1010 21178 3 2 1

Using merge files --> add variables I arrive at the following syntax and
merged file, which I think are correct.
* Open the T1 file as the active dataset (both files must be sorted by ID).
GET FILE='fileT1.sav'.
MATCH FILES /FILE=*
/FILE='fileT2.sav'
/BY ID.
EXECUTE.

File T1 and T2 merged:
ID var1_T1 var2_T1 var3_T1 var4_T1 var1_T2 var2_T2 var3_T2 var4_T2
1003 41122 4 2 24 41122 1 2 24
1005 . . . . 41122 1 2 13
1006 . . . . 41072 2 1 4
1007 . . . . 41034 999 999 13
1008 41007 -4 1 1 . . . .
1009 41007 -4 2 1 . . . .
1010 21155 1 2 24 21178 3 2 1
1011 11002 1 2 4 . . . .

I then tried RESTRUCTURE to turn this wide file into a long file, but I must
be doing something wrong. I may have missed an option, or there may be other
ways to do this, but the cases that are missing in a given year are still
included in the file. My current file looks like this:

Current restructured file:
ID var1 var2 var3 var4 time
1003 41122 4 2 24 1
1003 41122 1 2 24 2
1005 . . . . 1 <
1005 41122 1 2 13 2
1006 . . . . 1 <
1006 41072 2 1 4 2
1007 . . . . 1 <
1007 41034 999 999 13 2
1008 41007 -4 1 1 1
1008 . . . . 2 <
1009 41007 -4 2 1 1
1009 . . . . 2 <
1010 21155 1 2 24 1
1010 21178 3 2 1 2
1011 11002 1 2 4 1
1011 . . . . 2 <

I used this syntax:
VARSTOCASES
/MAKE var1 FROM var1_T1 var1_T2
/MAKE var2 FROM var2_T1 var2_T2
/MAKE var3 FROM var3_T1 var3_T2
/MAKE var4 FROM var4_T1 var4_T2
/INDEX=Index1(2)
/KEEP=ID
/NULL=KEEP.

I do not want these missing cases included in the long file. For this
example I left out the fixed variables, but I also want to include them. So
what am I doing wrong? Is there an option I am missing in the restructure
step, or should I not create the long file from the wide file at all, and
instead build it directly from the 4 separate files, one per measurement?
What I am looking for is this:

Restructured file I want:
ID var1 var2 var3 var4 time
1003 41122 4 2 24 1
1003 41122 1 2 24 2
1005 41122 1 2 13 2
1006 41072 2 1 4 2
1007 41034 999 999 13 2
1008 41007 -4 1 1 1
1009 41007 -4 2 1 1
1010 21155 1 2 24 1
1010 21178 3 2 1 2
1011 11002 1 2 4 1

Of course I can use syntax to delete these cases, but I would have to do this
many times for different combinations of years and variables. So, in short:
what is the best way to merge my files with repeated measures into both a
wide and a long file, such that the long file does not contain the missing
cases?

Thank you so much!!
Marike

--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

John F Hall

Re: SPSS: Question re merging and restructuring large files with repeated measures and missing cases

It's a bit complex for this early in the morning, but I had a similar problem.
Check out the Nabble thread "Estimating actual earnings from earnings groups":
http://spssx-discussion.1045642.n5.nabble.com/Estimating-actual-earnings-from-earnings-groups-td5736859.html
You might find something helpful in the commentaries at the bottom of my page:
https://surveyresearch.weebly.com/british-social-attitudes-1983-onwards-cumulative-spss-file.html
I'll have a look at your sample data later and see if I can sort something
out for you.
Also check out: combinations using:
~~IF~~~ (not (missing(<varlist>))).
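
One possible reading of this hint, sketched under assumptions rather than taken from the post: after restructuring with /NULL=KEEP, drop every long-file row in which all of the restructured variables are system-missing. The names var1 to var4 come from the example in the question; the counter nsys is a hypothetical helper, and the threshold 4 is simply the number of variables counted.

```spss
* Sketch only: run on the long file produced by VARSTOCASES with NULL=KEEP.
* nsys is a hypothetical helper that counts system-missing values per row.
COUNT nsys = var1 var2 var3 var4 (SYSMIS).
* Keep rows with at least one value that is not system-missing.
SELECT IF (nsys < 4).
EXECUTE.
DELETE VARIABLES nsys.
```

Because only system-missing values are counted, rows that contain codes such as 999 or -4 are kept.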

John F Hall MA (Cantab) Dip Ed (Dunelm)
[Retired academic survey researcher]

Email: [hidden email]
Website: Journeys in Survey Research
Course: Survey Analysis Workshop (SPSS)
Research: Subjective Social Indicators (Quality of Life)


John F Hall

Re: SPSS: Question re merging and restructuring large files with repeated measures and missing cases

In reply to this post by Marike

Both files were previously sorted by ID using:

sort cases by id.

This seems to work (tested):

MATCH FILES /FILE=dataset4
 /FILE=dataset5
 /BY ID.

Listing all cases in the merged file:

list var1_T1, var2_T1, var3_T1, var4_T1,
 var1_T2, var2_T2, var3_T2, var4_T2.

var1_T1 var2_T1 var3_T1 var4_T1 var1_T2 var2_T2 var3_T2 var4_T2
41122 4 2 24 41122 1 2 24
. . . . 41122 1 2 13
. . . . 41072 2 1 4
. . . . 41034 999 999 13
41007 -4 1 1 . . . .
41007 -4 2 1 . . . .
21155 1 2 24 21178 3 2 1
11002 1 2 4 . . . .

Number of cases read: 8 Number of cases listed: 8

Checking for cases not present in both files:

count t1missing = var1_T1, var2_T1, var3_T1, var4_T1 (sysmis)
 /t2missing = var1_T2, var2_T2, var3_T2, var4_T2 (sysmis).

freq t1missing t2missing.

t1missing
		Frequency	Percent	Valid Percent	Cumulative Percent
Valid	0	5	62.5	62.5	62.5
	4	3	37.5	37.5	100.0
	Total	8	100.0	100.0

t2missing
		Frequency	Percent	Valid Percent	Cumulative Percent
Valid	0	5	62.5	62.5	62.5
	4	3	37.5	37.5	100.0
	Total	8	100.0	100.0

Selecting only cases present in both files:

TEMPORARY.
select if (t1missing ne 4) and (t2missing ne 4).
list id var1_T1, var2_T1, var3_T1, var4_T1,
 var1_T2, var2_T2, var3_T2, var4_T2.

ID var1_T1 var2_T1 var3_T1 var4_T1 var1_T2 var2_T2 var3_T2 var4_T2
1003 41122 4 2 24 41122 1 2 24
1010 21155 1 2 24 21178 3 2 1

Number of cases read: 2 Number of cases listed: 2

If you want to save the merged file with only cases present in both files you can do a permanent selection with:

select if (t1missing ne 4) and (t2missing ne 4).

Hope this helps

Bruce Weaver

Re: SPSS: Question re merging and restructuring large files with repeated measures and missing cases

Administrator

In reply to this post by Marike

Hello Marike. If you change /NULL=KEEP to /NULL=DROP, do you get the result
you want?

https://www.ibm.com/support/knowledgecenter/en/SSLVMB_25.0.0/statistics_reference_project_ddita/spss/base/syn_varstocases_null.html
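
As a minimal sketch of that change, here is the restructure from the question with only the NULL subcommand altered. Naming the index variable time (to match the desired output) is an assumption on my part; the original syntax used Index1. Any fixed variables would be listed on /KEEP alongside ID.

```spss
* Sketch only: the VARSTOCASES from the question with NULL=DROP.
* The index name "time" is an assumption (the question used Index1).
VARSTOCASES
  /MAKE var1 FROM var1_T1 var1_T2
  /MAKE var2 FROM var2_T1 var2_T2
  /MAKE var3 FROM var3_T1 var3_T2
  /MAKE var4 FROM var4_T1 var4_T2
  /INDEX=time(2)
  /KEEP=ID
  /NULL=DROP.
```

With NULL=DROP, a row is left out of the long file only when all of the variables built by MAKE are system-missing (or blank, for strings), so rows holding codes such as 999 or -4 are still written. As far as I know DROP is also the default, so leaving out the NULL subcommand should give the same result.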


--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING:
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Ki Park

Re: SPSS: Question re merging and restructuring large files with repeated measures and missing cases

I think you could start with ADD FILES to get the long file (the second
thing you want) first, and then do the restructure. Something like this?

*1.
ADD FILES /FILE="fileT1.sav"
/FILE="fileT2.sav".
EXECUTE.

Sort cases by ID time.

*2.
CASESTOVARS
/ID=ID
/GROUPBY=VARIABLE.


MLIves

Re: SPSS: Question re merging and restructuring large files with repeated measures and missing cases

That was my thought as well, since CASESTOVARS has an easier syntax.
Based on your examples, to do what Ki suggests, before running ADD FILES you'd need to rename the variables in each file to drop the _T1, _T2 etc., and add a 'Time' variable to each file (both can be done simply using syntax).
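
A rough sketch of that preparation, done inside a single ADD FILES step instead of editing each file beforehand (a variation, not what was posted above). The file names come from the question; the flag fromT2 and the variable name time are hypothetical. RENAME and IN each apply to the FILE subcommand that precedes them.

```spss
* Sketch only: stack both waves into a long file, renaming on the fly.
* fromT2 is a hypothetical flag: 1 if the case came from fileT2.sav, else 0.
ADD FILES
  /FILE='fileT1.sav'
  /RENAME=(var1_T1 var2_T1 var3_T1 var4_T1 = var1 var2 var3 var4)
  /FILE='fileT2.sav'
  /RENAME=(var1_T2 var2_T2 var3_T2 var4_T2 = var1 var2 var3 var4)
  /IN=fromT2.
* Derive the wave number from the flag.
COMPUTE time = fromT2 + 1.
FORMATS time (F1.0).
SORT CASES BY ID time.
DELETE VARIABLES fromT2.
```

Since each row comes from a file in which the respondent actually appears, the long file has no all-missing rows to drop, and it is already sorted by ID and time for CASESTOVARS.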

In CASESTOVARS, you can specify how the variable names will indicate the time source (SEPARATOR and INDEX) and whether the case has data for that time (VIND). From the documentation on VIND: "An optional rootname can be specified after the ROOT keyword on the subcommand. The default rootname is ind."

CASESTOVARS
/ID=ID
/SEPARATOR="_T"
/INDEX=time
/VIND
/GROUPBY=VARIABLE.
--------------
Melissa


David Marso

Re: SPSS: Question re merging and restructuring large files with repeated measures and missing cases

Administrator

Drum Roll...
Here is another way to do it. Avoid manually renaming variables at all ;-)

DATA LIST FREE /ID var1_T1 var2_T1 var3_T1 var4_T1 .
BEGIN DATA
1003 41122 4 2 24
1008 41007 -4 1 1
1009 41007 -4 2 1
1010 21155 1 2 24
1011 11002 1 2 4
END DATA.
DATASET NAME Data1.

DATA LIST FREE /ID var1_T2 var2_T2 var3_T2 var4_T2 .
BEGIN DATA
1003 41122 1 2 24
1005 41122 1 2 13
1006 41072 2 1 4
1007 41034 999 999 13
1010 21178 3 2 1
END DATA.
DATASET NAME Data2.

ADD FILES / FILE Data1 / FILE Data2 / BY ID.

BEGIN PROGRAM.
import spssaux
import spss
varlist=" ".join(spssaux.VariableDict().expand("var1_T1 TO var4_T2" ))
spss.Submit( "AGGREGATE OUTFILE * /BREAK ID / " + varlist + "=MAX ("+
varlist +")" )
END PROGRAM.

DATASET NAME Wide.
DATASET COPY CpyForLong.
DATASET ACTIVATE CpyForLong.
VARSTOCASES MAKE New FROM var1_T1 TO var4_T2 /NULL=DROP /INDEX=Varname(New).
COMPUTE #=CHAR.INDEX(Varname,"_").
COMPUTE Index=NUMBER(CHAR.SUBSTR(Varname,#+2),F1).
COMPUTE Varname=CHAR.SUBSTR(Varname,1,#-1).
CASESTOVARS /ID=ID INDEX/INDEX=Varname .

-----
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Marike

Re: SPSS: Question re merging and restructuring large files with repeated measures and missing cases

In reply to this post by Marike

Thank you all for your answers.
I have been able to create the long file now.


David Marso

Re: SPSS: Question re merging and restructuring large files with repeated measures and missing cases

Administrator

In reply to this post by David Marso

SCRATCH THAT... brain fart.
Here is a better and more general way.
DATA LIST FREE /ID var1_T1 var2_T1 var3_T1 var4_T1 .
BEGIN DATA
1003 41122 4 2 24
1008 41007 -4 1 1
1009 41007 -4 2 1
1010 21155 1 2 24
1011 11002 1 2 4
END DATA.
DATASET NAME Data1.

DATA LIST FREE /ID var1_T2 var2_T2 var3_T2 var4_T2 .
BEGIN DATA
1003 41122 1 2 24
1005 41122 1 2 13
1006 41072 2 1 4
1007 41034 999 999 13
1010 21178 3 2 1
END DATA.
DATASET NAME Data2.

MATCH FILES / FILE Data1 / FILE Data2 / BY ID.
DATASET NAME Wide.
DATASET COPY CpyForLong.
DATASET ACTIVATE CpyForLong.

/* Build @ and @@ so we do not need to know anything about the variable names */.
NUMERIC @ (F1).
MATCH FILES /FILE * / KEEP ID @ ALL.
NUMERIC @@ (F1).

/* Note: this will fail if there are STRING variables; if there are, DROP them in the previous MATCH */.
VARSTOCASES MAKE New FROM @ TO @@ /NULL=DROP /INDEX=Varname(New).

/* Use RINDEX in case the root variable name has multiple _ characters */.
COMPUTE #=CHAR.RINDEX(Varname,"_").
COMPUTE Index=NUMBER(CHAR.SUBSTR(Varname,#+2),F1).
COMPUTE Varname=CHAR.SUBSTR(Varname,1,#-1).
CASESTOVARS /ID=ID INDEX/INDEX=Varname .
