SPSS: Question re merging and restructuring large files with repeated measures and missing cases


Marike
Hello!

I have a question about how best to merge 4 large national survey data files
(each with about 18,000 cases and about 500 variables) that hold 4 repeated
measures, using SPSS and a unique ID. However, cases are not always present
in every year. I would like to have both a wide and a long file for my
analyses. Using Merge Files --> Add Variables I am able to create a wide
file, but when I then restructure the data, the missing cases are also
included in the long file.

To make this clearer, I show what I am struggling with using 2 of the files.

File T1:
ID        var1_T1    var2_T1     var3_T1     var4_T1
1003     41122       4              2               24
1008     41007      -4              1               1
1009     41007      -4              2               1
1010     21155       1              2               24
1011     11002       1              2               4

File T2:
ID         var1_T2    var2_T2     var3_T2     var4_T2
1003     41122       1               2               24
1005     41122       1               2               13
1006     41072       2               1               4
1007     41034       999           999            13
1010     21178       3               2               1

Using Merge Files --> Add Variables I arrive at this syntax and file, which
I think are correct.
* Both files need to be sorted by ID before matching.
DATASET ACTIVATE 'fileT1.sav'.
MATCH FILES /FILE=*
  /FILE='fileT2.sav'
  /BY ID.
EXECUTE.

File T1 and T2 merged:
ID    var1_T1  var2_T1  var3_T1  var4_T1  var1_T2  var2_T2  var3_T2  var4_T2
1003    41122        4        2       24    41122        1        2       24
1005        .        .        .        .    41122        1        2       13
1006        .        .        .        .    41072        2        1        4
1007        .        .        .        .    41034      999      999       13
1008    41007       -4        1        1        .        .        .        .
1009    41007       -4        2        1        .        .        .        .
1010    21155        1        2       24    21178        3        2        1
1011    11002        1        2        4        .        .        .        .

Then I thought I would use RESTRUCTURE to turn this wide file into a long
file, but I am doing something wrong. I may be missing an option, or there
may be other ways to do this, but the cases that are missing in a certain
year are still included in the file. My current file looks like this:

Current restructured file:
ID       var1     var2     var3     var4     time
1003    41122        4        2       24        1
1003    41122        1        2       24        2
1005        .        .        .        .        1    <
1005    41122        1        2       13        2
1006        .        .        .        .        1    <
1006    41072        2        1        4        2
1007        .        .        .        .        1    <
1007    41034      999      999       13        2
1008    41007       -4        1        1        1
1008        .        .        .        .        2    <
1009    41007       -4        2        1        1
1009        .        .        .        .        2    <
1010    21155        1        2       24        1
1010    21178        3        2        1        2
1011    11002        1        2        4        1
1011        .        .        .        .        2    <

I used this syntax:
VARSTOCASES
  /MAKE var1 FROM var1_T1 var1_T2
  /MAKE var2 FROM var2_T1 var2_T2
  /MAKE var3 FROM var3_T1 var3_T2
  /MAKE var4 FROM var4_T1 var4_T2
  /INDEX=Index1(2)
  /KEEP=ID
  /NULL=KEEP.

I do not want these missing cases included in the long file. For this
example I left out the fixed variables, but I also want to include them. So
what am I doing wrong? Is there an option I need to tick when restructuring
that I am missing right now, or should I not create the long file from the
wide file at all, and instead build it straight away from the 4 separate
files, one per measurement? What I am looking for is this:

Restructured file I want:
ID         var1         var2          var3          var4      time
1003     41122       4               2              24        1
1003     41122       1               2              24        2
1005     41122       1               2              13        2
1006     41072       2               1              4          2
1007     41034       999           999           13        2
1008     41007       -4              1              1          1
1009     41007       -4              2              1          1
1010     21155       1               2              24        1
1010     21178       3               2              1          2
1011     11002       1               2              4          1

Of course I can use syntax to delete these cases, but I would have to do
this many times for different combinations of years and variables. So, in
short: what is the best way to merge my files with repeated measures into
both a wide and a long file, such that the long file does not contain the
missing cases?

Thank you so much!!
Marike



--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: SPSS: Question re merging and restructuring large files with repeated measures and missing cases

John F Hall
This is a bit complex this early in the morning, but I had a similar problem.
Check out the Nabble thread: Estimating actual earnings from earnings groups

http://spssx-discussion.1045642.n5.nabble.com/Estimating-actual-earnings-from-earnings-groups-td5736859.html
You might find something helpful in the commentaries at the bottom of my
page
https://surveyresearch.weebly.com/british-social-attitudes-1983-onwards-cumulative-spss-file.html
I'll have a look at your sample data later and see if I can sort something
out for you.
Also check out:  combinations using:
        IF (not (missing(<varlist>))).

John F Hall  MA (Cantab) Dip Ed (Dunelm)
[Retired academic survey researcher]

Email:          [hidden email]
Website:     Journeys in Survey Research
Course:       Survey Analysis Workshop (SPSS)
Research:   Subjective Social Indicators (Quality of Life)


Re: SPSS: Question re merging and restructuring large files with repeated measures and missing cases

John F Hall
In reply to this post by Marike

Both files previously sorted by ID using:

sort cases by id.

This seems to work (tested):

MATCH FILES /FILE=dataset4
  /FILE=dataset5
  /BY ID.

On the merged file, all cases:

list  var1_T1, var2_T1, var3_T1, var4_T1,
    var1_T2, var2_T2, var3_T2, var4_T2.

var1_T1  var2_T1  var3_T1  var4_T1  var1_T2  var2_T2  var3_T2  var4_T2

   41122        4        2       24    41122    41122        2       24
       .        .        .        .    41122    41122        2       13
       .        .        .        .    41072    41072        1        4
       .        .        .        .    41034    41034      999       13
   41007       -4        1        1        .        .        .        .
   41007       -4        2        1        .        .        .        .
   21155        1        2       24    21178    21178        2        1
   11002        1        2        4        .        .        .        .

Number of cases read:  8    Number of cases listed:  8

Checking for cases not present in both files:

count t1missing = var1_T1, var2_T1, var3_T1, var4_T1 (sysmis)
    /t2missing = var1_T2, var2_T2, var3_T2, var4_T2 (sysmis).
freq t1missing t2missing.

t1missing

               Frequency  Percent  Valid Percent  Cumulative Percent
Valid  0               5     62.5           62.5                62.5
       4               3     37.5           37.5               100.0
       Total           8    100.0          100.0

t2missing

               Frequency  Percent  Valid Percent  Cumulative Percent
Valid  0               5     62.5           62.5                62.5
       4               3     37.5           37.5               100.0
       Total           8    100.0          100.0

Selecting only cases present in both files:

TEMPORARY.
select if (t1missing ne 4) and (t2missing ne 4).
list id var1_T1, var2_T1, var3_T1, var4_T1,
    var1_T2, var2_T2, var3_T2, var4_T2.

   ID  var1_T1  var2_T1  var3_T1  var4_T1  var1_T2  var2_T2  var3_T2  var4_T2

1003    41122        4        2       24    41122    41122        2       24
1010    21155        1        2       24    21178    21178        2        1

Number of cases read:  2    Number of cases listed:  2

If you want to save the merged file with only the cases present in both files, you can make the selection permanent by running it without TEMPORARY:

select if (t1missing ne 4) and (t2missing ne 4).

Hope this helps

 


Re: SPSS: Question re merging and restructuring large files with repeated measures and missing cases

Bruce Weaver
Administrator
In reply to this post by Marike
Hello Marike.  If you change /NULL=KEEP to /NULL=DROP, do you get the result
you want?  

https://www.ibm.com/support/knowledgecenter/en/SSLVMB_25.0.0/statistics_reference_project_ddita/spss/base/syn_varstocases_null.html
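
For reference, a minimal sketch of the revised command, assuming the merged wide file from the original post is the active dataset. Only the /NULL line differs in substance from the syntax Marike posted; naming the index variable time instead of Index1 is an assumption made here to match the desired output.

```spss
* Wide to long: one row per ID and time point.
* NULL=DROP omits generated rows in which var1 to var4 are all system-missing.
VARSTOCASES
  /MAKE var1 FROM var1_T1 var1_T2
  /MAKE var2 FROM var2_T1 var2_T2
  /MAKE var3 FROM var3_T1 var3_T2
  /MAKE var4 FROM var4_T1 var4_T2
  /INDEX=time(2)
  /KEEP=ID
  /NULL=DROP.
```

As far as I know, a row is dropped only when every variable being created is system-missing (or blank, for strings), so a wave that contains codes such as -4 or 999 is kept. Fixed variables can be listed on /KEEP next to ID.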



--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Re: SPSS: Question re merging and restructuring large files with repeated measures and missing cases

Ki Park
I think you could start with ADD FILES to get the long file (the second
thing you want) first, and then restructure it into the wide file.
Something like this?

*1.
ADD FILES /FILE="fileT1.sav"
  /FILE="fileT2.sav".
EXECUTE.

Sort cases by ID time.

*2.
CASESTOVARS
  /ID=ID
  /GROUPBY=VARIABLE.





Re: SPSS: Question re merging and restructuring large files with repeated measures and missing cases

MLIves
That was my thought as well since casestovars is an easier syntax.
Based on your examples, to do what Ki suggests, before running the 'Add files', you'd need to rename your variables in the files to drop the _T1, _T2 etc, then add a 'Time' variable to each file.  (both can be done simply using syntax).

In CASESTOVARS, you can specify how the new variable names indicate the time point (SEPARATOR and INDEX) and request indicator variables that flag whether a case has data for that time point (VIND). From the documentation for VIND: "An optional rootname can be specified after the ROOT keyword on the subcommand. The default rootname is ind."

CASESTOVARS
  /ID=ID
  /SEPARATOR="_T"
  /INDEX=time
  /VIND
  /GROUPBY=VARIABLE.
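
As a rough illustration of what this restructure is meant to produce, here is a small Python sketch of the long-to-wide logic. It is not how SPSS implements CASESTOVARS. The function name long_to_wide is made up, the wide names are built as root + separator + index value, and the indicator names (ind1, ind2) are only an assumption about how the default rootname combines with the index value. The rows use the first two variables of the example files.

```python
def long_to_wide(rows, id_key="ID", index_key="time", separator="_T", ind_root="ind"):
    """Spread long rows (one dict per ID and time) into one dict per ID."""
    times = sorted({r[index_key] for r in rows})
    wide = {}
    for r in rows:
        rec = wide.setdefault(r[id_key], {id_key: r[id_key]})
        for name, value in r.items():
            if name not in (id_key, index_key):
                # e.g. var1 at time 2 becomes var1_T2
                rec["%s%s%s" % (name, separator, r[index_key])] = value
        rec["%s%s" % (ind_root, r[index_key])] = 1  # case is present at this time
    # Cases absent at a time point get indicator 0 and no values for that wave.
    for rec in wide.values():
        for t in times:
            rec.setdefault("%s%s" % (ind_root, t), 0)
    return [wide[k] for k in sorted(wide)]


long_rows = [
    {"ID": 1003, "time": 1, "var1": 41122, "var2": 4},
    {"ID": 1003, "time": 2, "var1": 41122, "var2": 1},
    {"ID": 1005, "time": 2, "var1": 41122, "var2": 1},
    {"ID": 1008, "time": 1, "var1": 41007, "var2": -4},
]
wide_rows = long_to_wide(long_rows)
for rec in wide_rows:
    print(rec)
```

ID 1005 ends up with ind1 = 0 and no var1_T1 value, which is the kind of flag that lets you tell "not in the survey that year" apart from "in the survey but no answer".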
--------------
Melissa

________________________________________
From: SPSSX(r) Discussion <[hidden email]> on behalf of Ki Park <[hidden email]>
Sent: Monday, October 15, 2018 11:59 AM
To: [hidden email]
Subject: Re: [SPSSX-L] SPSS: Question re merging and restructuring large files with repeated measures and missing cases

I think you could start with ADD FILES first to get what you want in the
second place, and then do the restructure. Something like this?

*1.
ADD FILES /FILE="fileT1.sav"
                /FILE="fileT2.sav".
EXECUTE.

Sort cases by ID time.

*2.
CASESTOVARS
  /ID=ID
  /GROUPBY=VARIABLE.





Re: SPSS: Question re merging and restructuring large files with repeated measures and missing cases

David Marso
Administrator
Drum Roll...
Here is another way to do it.  Avoid manually renaming variables at all ;-)

DATA LIST FREE /ID        var1_T1    var2_T1     var3_T1     var4_T1 .
BEGIN DATA
1003     41122       4              2               24
1008     41007      -4              1               1
1009     41007      -4              2               1
1010     21155       1              2               24
1011     11002       1              2               4
END DATA.
DATASET NAME Data1.

DATA LIST FREE /ID         var1_T2    var2_T2     var3_T2     var4_T2 .
BEGIN DATA
1003     41122       1               2               24
1005     41122       1               2               13
1006     41072       2               1               4
1007     41034       999           999            13
1010     21178       3               2               1
END DATA.
DATASET NAME Data2.

ADD FILES / FILE Data1 / FILE Data2 / BY ID.

BEGIN PROGRAM.
import spssaux  
import spss
varlist=" ".join(spssaux.VariableDict().expand("var1_T1 TO var4_T2" ))
spss.Submit( "AGGREGATE OUTFILE * /BREAK ID / " + varlist + "=MAX ("+
varlist +")" )
END PROGRAM.

DATASET NAME Wide.
DATASET COPY CpyForLong.
DATASET ACTIVATE CpyForLong.
VARSTOCASES MAKE New FROM var1_T1 TO var4_T2 /NULL=DROP /INDEX=Varname(New).
COMPUTE #=CHAR.INDEX(Varname,"_").
COMPUTE Index=NUMBER(CHAR.SUBSTR(Varname,#+2),F1).
COMPUTE Varname=CHAR.SUBSTR(Varname,1,#-1).
CASESTOVARS /ID=ID INDEX/INDEX=Varname .
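
For readers who want to see what the VARSTOCASES step with /NULL=DROP followed by CASESTOVARS is meant to achieve, here is a minimal Python sketch of the same idea. It only mimics the logic and is not SPSS itself. The function name wide_to_long is made up, None stands in for system-missing, and only the first two variables of the example files are used.

```python
def wide_to_long(wide_rows, id_key="ID"):
    """Turn one record per ID into one record per ID and wave, skipping empty waves.

    None stands in for SPSS system-missing. Codes such as -4 or 999 are
    ordinary values in this sketch and are kept.
    """
    long_rows = {}
    for rec in wide_rows:
        for name, value in rec.items():
            if name == id_key or value is None:
                continue  # mimic /NULL=DROP: no row for a missing value
            root, _, suffix = name.rpartition("_T")  # "var1_T2" -> "var1", "2"
            key = (rec[id_key], int(suffix))
            row = long_rows.setdefault(key, {id_key: rec[id_key], "Index": int(suffix)})
            row[root] = value
    return [long_rows[k] for k in sorted(long_rows)]


wide = [
    {"ID": 1003, "var1_T1": 41122, "var2_T1": 4, "var1_T2": 41122, "var2_T2": 1},
    {"ID": 1005, "var1_T1": None, "var2_T1": None, "var1_T2": 41122, "var2_T2": 1},
    {"ID": 1008, "var1_T1": 41007, "var2_T1": -4, "var1_T2": None, "var2_T2": None},
]
long_rows = wide_to_long(wide)
for row in long_rows:
    print(row)
```

IDs 1005 and 1008 each contribute a single row, for the one wave in which they took part, which is the long file the original question was after.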





-----
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Re: SPSS: Question re merging and restructuring large files with repeated measures and missing cases

Marike
In reply to this post by Marike
Thank you all for your answers.
I have been able to create the long file now.






Re: SPSS: Question re merging and restructuring large files with repeated measures and missing cases

David Marso
Administrator
In reply to this post by David Marso
SCRATCH THAT... brain fart.
Here is a better and more general way.
DATA LIST FREE /ID        var1_T1    var2_T1     var3_T1     var4_T1 .
BEGIN DATA
1003     41122       4              2               24
1008     41007      -4              1               1
1009     41007      -4              2               1
1010     21155       1              2               24
1011     11002       1              2               4
END DATA.
DATASET NAME Data1.

DATA LIST FREE /ID         var1_T2    var2_T2     var3_T2     var4_T2 .
BEGIN DATA
1003     41122       1               2               24
1005     41122       1               2               13
1006     41072       2               1               4
1007     41034       999           999            13
1010     21178       3               2               1
END DATA.
DATASET NAME Data2.

MATCH FILES / FILE Data1 / FILE Data2 / BY ID.
DATASET NAME Wide.
DATASET COPY CpyForLong.
DATASET ACTIVATE CpyForLong.

/*  Build @ and @@ so we don't need to know anything about the variable names */.
NUMERIC @ (F1).
MATCH FILES /FILE * / KEEP ID @ ALL.
NUMERIC @@ (F1).

/* Note: this will fail if there are STRING variables; if there are, DROP them in the previous MATCH */.
VARSTOCASES MAKE New FROM @ TO @@ /NULL=DROP /INDEX=Varname(New).

/* Use RINDEX in case the root variable name has multiple _ characters */.
COMPUTE #=CHAR.RINDEX(Varname,"_").
COMPUTE Index=NUMBER(CHAR.SUBSTR(Varname,#+2),F1).
COMPUTE Varname=CHAR.SUBSTR(Varname,1,#-1).
CASESTOVARS /ID=ID INDEX/INDEX=Varname .
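
To show why the last underscore matters, here is a small Python sketch of the name-splitting that the three COMPUTE lines do. The function names are made up and hh_income_T2 is a hypothetical variable name, not one from the survey files. Python positions are 0-based while SPSS positions are 1-based, so the offsets differ by one from the syntax above.

```python
def split_wave_name(varname):
    """Split 'root_T<digit>' at the LAST underscore, as CHAR.RINDEX does."""
    pos = varname.rindex("_")      # 0-based position of the last "_"
    root = varname[:pos]           # like CHAR.SUBSTR(Varname, 1, #-1)
    wave = int(varname[pos + 2:])  # skip "_T", like CHAR.SUBSTR(Varname, #+2)
    return root, wave


def split_wave_name_first(varname):
    """Same split at the FIRST underscore, as CHAR.INDEX would do it."""
    pos = varname.index("_")
    return varname[:pos], varname[pos + 2:]


print(split_wave_name("var1_T1"))
print(split_wave_name("hh_income_T2"))
print(split_wave_name_first("hh_income_T2"))
```

With the first-underscore version, the hypothetical hh_income_T2 splits into "hh" and "ncome_T2", so the wave part is no longer a number. Splitting at the last underscore keeps the full root name and the wave digit.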



