Filling in the top half of a correlation matrix

Bruce Weaver
Administrator
In another thread (http://spssx-discussion.1045642.n5.nabble.com/Multiple-Imputation-td4994372.html), I suggested that one can use the EM correlations from MVA as input for FACTOR when one is doing exploratory factor analysis, but has missing data.  As I've been exploring how to do that, I've run up against a small problem:  The matrix of EM correlations can be captured via OMS, but it contains only the lower half (and main diagonal) of the correlation matrix.  But FACTOR wants the full matrix as input.  The only way I could think of to fill in the top half was with a little MATRIX program, like the one shown below.  Dataset LowerHalf holds the lower half of the EM correlation matrix, and looks like this, for example:

    V1     V2     V3     V4     V5  
 1.000   .      .      .      .
  .508  1.000   .      .      .
  .347   .583  1.000   .      .
  .204   .243   .294  1.000   .
  .108   .166   .213   .250  1.000

On my first attempt to read variables V1 to V5 with the GET command in the MATRIX program, it objected to the system-missing values.  So I first recoded SYSMIS to 99 in the upper half of the correlation matrix.

DATASET ACTIVATE LowerHalf.
recode V1 to V5 (sysmis=99).
execute.

MATRIX.
get CM  / file = * / variables = V1 to V5.
print CM / format = "f5.3".
loop r = 1 to nrow(CM)-1.
loop c = 2 to ncol(CM).
compute CM(r,c) = CM(c,r).
end loop.
end loop.
print CM / format = "f5.3".
msave CM /TYPE=CORR /OUTFILE = * /VARIABLES=V1 to V5.
END MATRIX.

The two PRINT statements in the matrix program were included initially to check that things were working as expected.  But later, I found that it didn't work properly if I removed them.

So, I have two questions:

1. Is there some easier alternative to the double loop in a matrix program for filling in the top half of the correlation matrix?  (I was thinking MCONVERT or something like that, but found nothing suitable.)

2. Any thoughts on why removal of the two PRINT commands in my matrix program is causing it to go all FUBAR?  (I tried removing the PRINTs and including an EXECUTE after the double-loop, but that did not fix it.)  
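
For readers who want to check the copying logic outside SPSS, here is a minimal Python sketch of the same idea as the double loop in the MATRIX program above. It is an illustration only, not SPSS syntax: fill_upper is a hypothetical helper name, and the 99 entries stand in for the recoded system-missing cells. The inner loop starts at r + 1, so only cells strictly above the diagonal are overwritten.

```python
def fill_upper(m):
    """Return a copy of square matrix m with the lower triangle mirrored upward."""
    n = len(m)
    full = [row[:] for row in m]      # work on a copy; leave the input alone
    for r in range(n - 1):
        for c in range(r + 1, n):     # cells strictly above the diagonal
            full[r][c] = full[c][r]
    return full


# Lower half from the example above; 99 marks the recoded SYSMIS cells.
lower = [
    [1.000, 99,    99,    99,    99],
    [0.508, 1.000, 99,    99,    99],
    [0.347, 0.583, 1.000, 99,    99],
    [0.204, 0.243, 0.294, 1.000, 99],
    [0.108, 0.166, 0.213, 0.250, 1.000],
]

full = fill_upper(lower)
for row in full:
    print(" ".join(f"{x:6.3f}" for x in row))
```

Because the helper copies its input, the original lower-half data stays available for comparison.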

Thanks,
Bruce

--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Re: Filling in the top half of a correlation matrix

Art Kendall
Does the EM Correlations procedure have a /MATRIX option?
Art Kendall
Social Research Consultants
(Replying to Bruce Weaver's post of 9/25/2013 4:06 PM.)

Re: Filling in the top half of a correlation matrix

David Marso
Administrator
In reply to this post by Bruce Weaver
Does something like the following work?
--
DATA LIST LIST   /V1     V2     V3     V4     V5   .
BEGIN DATA
 1.000   .      .      .      .
  .508  1.000   .      .      .
  .347   .583  1.000   .      .
  .204   .243   .294  1.000   .
  .108   .166   .213   .250  1.000
END DATA.
DATASET NAME lower .
DATASET DECLARE cout.
DATASET ACTIVATE lower.
MATRIX.
get CM  / file = lower / variables = V1 to V5/MISSING=ACCEPT / VALUE=0.
COMPUTE CM=CM+T(CM).
SAVE CM / OUTFILE COut .
END MATRIX.
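
The arithmetic behind CM + T(CM) can be sketched in plain Python. This is an illustration only, not SPSS syntax; symmetrize_by_sum is a made-up name, and the sketch assumes the missing upper-half cells have already been read in as 0. Adding a matrix to its transpose also doubles the main diagonal, so the sketch resets the diagonal to 1 afterwards.

```python
def symmetrize_by_sum(m):
    """Add a square matrix to its transpose, then restore a unit diagonal."""
    n = len(m)
    total = [[m[r][c] + m[c][r] for c in range(n)] for r in range(n)]
    for i in range(n):
        total[i][i] = 1.0             # 1 + 1 = 2 on the diagonal, so reset it
    return total


# Lower half with the missing upper-half cells treated as 0 (assumption).
lower_zero = [
    [1.000, 0.0,   0.0,   0.0,   0.0],
    [0.508, 1.000, 0.0,   0.0,   0.0],
    [0.347, 0.583, 1.000, 0.0,   0.0],
    [0.204, 0.243, 0.294, 1.000, 0.0],
    [0.108, 0.166, 0.213, 0.250, 1.000],
]

sym = symmetrize_by_sum(lower_zero)
for row in sym:
    print(" ".join(f"{x:6.3f}" for x in row))
```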
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Re: Filling in the top half of a correlation matrix

Bruce Weaver
Administrator
In reply to this post by Art Kendall
Hi Art.  The table of EM correlations is generated via the MVA command, like this:

DATASET DECLARE EM1.
OMS
  /SELECT TABLES
  /IF COMMANDS=['MVA'] SUBTYPES=['EOUT_EM CORRELATIONS']
  /DESTINATION FORMAT=SAV NUMBERED=TableNumber_
   OUTFILE='EM1'.

dataset activate raw.
MVA VARIABLES= {variable list} /EM.

OMSEND.

I wrap it in OMS, because that's the only way I can see to get the EM correlations into a data file.

You can add an OUTFILE option to the /EM sub-command to write a file of "raw" data with missing values imputed*, but I see no /MATRIX option.

* As John Graham notes on p. 556 of the following, use of this imputed dataset is not recommended.  One should use MI instead to get multiple imputed datasets.  

   http://www.stats.ox.ac.uk/~snijders/Graham2009.pdf

Cheers,
Bruce



Re: Filling in the top half of a correlation matrix

David Marso
Administrator
In reply to this post by David Marso
Or the following?
NEW FILE.
DATASET CLOSE ALL.
DATA LIST LIST   /V1     V2     V3     V4     V5   .
BEGIN DATA
 1.000   .      .      .      .
  .508  1.000   .      .      .
  .347   .583  1.000   .      .
  .204   .243   .294  1.000   .
  .108   .166   .213   .250  1.000
END DATA.

DATASET NAME lower.
COMPUTE ID=$CASENUM.
FLIP.
DATASET NAME flipped.
RENAME VARIABLES (var001 TO var005 = V1 TO V5).
COMPUTE ID=$CASENUM.
UPDATE FILE flipped  / FILE lower/ BY ID.
SELECT IF (CASE_LBL NE 'ID').
EXECUTE.
DELETE VARIABLES ID.
LIST.
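
The logic of the syntax above appears to be: FLIP transposes the lower-half data, and UPDATE then overlays the non-missing values of the original file on the transposed copy, matching rows by ID. A rough Python equivalent follows, with None standing in for system-missing. It is a sketch of the idea rather than of SPSS itself, and overlay is a hypothetical helper name.

```python
def overlay(master, transaction):
    """Take the transaction value where it is present, else keep the master value."""
    return [
        [t if t is not None else m for m, t in zip(m_row, t_row)]
        for m_row, t_row in zip(master, transaction)
    ]


# Lower half from the example; None marks the missing upper-half cells.
lower = [
    [1.000, None,  None,  None,  None],
    [0.508, 1.000, None,  None,  None],
    [0.347, 0.583, 1.000, None,  None],
    [0.204, 0.243, 0.294, 1.000, None],
    [0.108, 0.166, 0.213, 0.250, 1.000],
]

flipped = [list(col) for col in zip(*lower)]   # transpose: upper half filled, lower half missing
full = overlay(flipped, lower)                 # the lower-half values fill the remaining gaps

for row in full:
    print(" ".join(f"{x:6.3f}" for x in row))
```

Unlike the sum-with-transpose approach, this overlay never adds the diagonal to itself, so the diagonal stays at 1 without a separate fix-up step.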


Re: Filling in the top half of a correlation matrix

Kirill Orlov
In reply to this post by Bruce Weaver
Bruce,
If you choose to do it via MATRIX, here it is.

matrix.
get m /vari= v1 to v5 /miss= 0 /names= names.
comp m= m+t(m).
print m.
call setdiag(m,1).
save m /outfile= * /names= names.
end matrix.

The above example takes only the matrix body - i.e. without variables ROWTYPE_ and VARNAME_ (if you have such there) - but you can modify it to take in those, too.

You can do a similar thing using the MGET / MSAVE matrix statements, but I would never recommend it.


(Replying to Bruce Weaver's post of 26.09.2013 0:06.)

Re: Filling in the top half of a correlation matrix

Kirill Orlov
In reply to this post by David Marso
Ah, that was an interesting piece. +1, David.

(Replying to David Marso's post of 26.09.2013 1:12.)

Re: Filling in the top half of a correlation matrix

Bruce Weaver
Administrator
In reply to this post by David Marso
Aha...CM = CM + T(CM) was what I was looking for!  Nice one, David.  Thanks.  I'm dead chuffed, as our British friends might say.  ;-)

For some reason, /MISSING=ACCEPT / VALUE=0 is not working as expected--I end up with sysmis in the top half of the matrix.  But no big deal.  I can just recode sysmis to 0 before running the matrix program.  

So for the record, here's what it looks like now.  !MyVarList is a macro defining the list of variables in the correlation matrix.

********************************************* .

dataset activate EM1. /* the EM correlations from OMS.
recode !MyVarList (sysmis=0).
execute.

MATRIX.
get CM  / file = * / variables = !MyVarList .
compute CM=CM+T(CM).
msave CM /TYPE=CORR /OUTFILE=* /VARIABLES=!MyVarList.
END MATRIX.
DATASET NAME EM2.

********************************************* .

There's still a bit to do after that (e.g., adding an N row with the count of cases in the raw data file), but that's all fairly straightforward.  Here's an example of the final result in a matrix file ready for input to FACTOR:

ROWTYPE_ VARNAME_        V1        V2        V3        V4        V5
N                  235.0000  235.0000  235.0000  235.0000  235.0000
CORR     V1         1.00000    .50810    .34713    .20362    .10753
CORR     V2          .50810   1.00000    .58283    .24337    .16637
CORR     V3          .34713    .58283   1.00000    .29423    .21310
CORR     V4          .20362    .24337    .29423   1.00000    .24957
CORR     V5          .10753    .16637    .21310    .24957   1.00000
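
The row layout above (one N row followed by one CORR row per variable) can be sketched in Python as follows. This only illustrates the structure of the rows; it does not write an SPSS file. matrix_rows is a hypothetical helper name, and 235 is the case count from the example.

```python
def matrix_rows(names, corr, n_cases):
    """Build (ROWTYPE_, VARNAME_, values) tuples: one N row, then one CORR row per variable."""
    rows = [("N", "", [float(n_cases)] * len(names))]
    for name, values in zip(names, corr):
        rows.append(("CORR", name, list(values)))
    return rows


names = ["V1", "V2", "V3", "V4", "V5"]
corr = [
    [1.00000, 0.50810, 0.34713, 0.20362, 0.10753],
    [0.50810, 1.00000, 0.58283, 0.24337, 0.16637],
    [0.34713, 0.58283, 1.00000, 0.29423, 0.21310],
    [0.20362, 0.24337, 0.29423, 1.00000, 0.24957],
    [0.10753, 0.16637, 0.21310, 0.24957, 1.00000],
]

rows = matrix_rows(names, corr, 235)
for rowtype, varname, values in rows:
    print(f"{rowtype:<8} {varname:<8}" + "".join(f"{v:10.5f}" for v in values))
```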






Re: Filling in the top half of a correlation matrix

Art Kendall
In reply to this post by Bruce Weaver
Although pairwise deletion can result in a very problematic matrix, it would be an interesting exercise to compare that matrix to the one output by the MVA procedure, and to the one produced by listwise deletion.
It might also be an interesting exercise to run INDSCAL on the matrices from listwise deletion, pairwise deletion, and each of the missing value imputation methods, or on the matrices from each of the different imputed data sets.
Art Kendall
Social Research Consultants
(Replying to Bruce Weaver's post of 9/25/2013 5:12 PM.)

Re: Filling in the top half of a correlation matrix

Bruce Weaver
Administrator
In reply to this post by Kirill Orlov
Another aha moment!  Thank you Kirill for showing me that I was misunderstanding how the /MISSING sub-command works for GET.  With /MISSING=0, I no longer need my recode out front.  

I didn't really need the /NAMES=NAMES bit, because I have my variable list as a macro.

My final syntax (I think) looks like this, without the recode out front:

dataset activate EM1. /* EM correlations from OMS.
MATRIX.
get CM  / file = * / variables = !MyVarList / missing=0 .
compute CM=CM+T(CM).
msave CM /TYPE=CORR /OUTFILE=* /VARIABLES=!MyVarList.
END MATRIX.
DATASET NAME EM2.

Note that I use MSAVE rather than SAVE, as this gives me the ROWTYPE_ and VARNAME_ variables I need.  (They are not present in the EM1 dataset obtained via OMS.)

Thanks fellas.  ;-)




Re: Filling in the top half of a correlation matrix

Bruce Weaver
Administrator
In reply to this post by Art Kendall
I was thinking along similar lines, Art.  In a first draft of the syntax, I run the factor analysis using the EM correlations as input; but then I follow up using the raw data as input, and the following /MISSING options (in 3 separate models, obviously):

  /MISSING LISTWISE
  /MISSING PAIRWISE
  /MISSING MEANSUB

I figure we should be prepared in case any inquisitive members of the team want to know how the results from EM correlations differ from any of these other approaches (given that the latter approaches are probably more familiar to them).  We don't have the complete dataset yet, but judging from a quick glance at what I do have to date, I don't think it is going to make a huge difference to the final solution in this case.   But I'll still argue for using the EM correlations, given the well-known limitations of the other methods for dealing with missing data.

Cheers,
Bruce


Art Kendall wrote
Although, pairwise deletion
                can result in a very problematic
                matrix, it would be an interesting exercise to compare
      that matrix to the one output by the MVA procedure, and the one
      output by list deletion.
      It might also be an interesting exercise to do an INDSCAL on
      matrices with listwise, pairwise, and each of the missing
      value imputation methods or from each of the different imputed
      data sets.
      Art Kendall
Social Research Consultants
On 9/25/2013 5:12 PM, Bruce Weaver [via SPSSX Discussion] wrote:
Hi Art.  The table of EM correlations is generated via the MVA command, like this:

DATASET DECLARE EM1.
OMS
  /SELECT TABLES
  /IF COMMANDS=['MVA'] SUBTYPES=['EOUT_EM CORRELATIONS']
  /DESTINATION FORMAT=SAV NUMBERED=TableNumber_
   OUTFILE='EM1'.

dataset activate raw.
MVA VARIABLES= {variable list} /EM.
OMSEND.

I wrap it in OMS, because that's the only way I can see to get the EM correlations into a data file.

You can add an OUTFILE option to the /EM sub-command to write a file of "raw" data with missing values imputed*, but I see no /MATRIX option.

* As John Graham notes on p. 556 of the following, use of this imputed dataset is not recommended.  One should use MI instead to get multiple imputed datasets.

    http://www.stats.ox.ac.uk/~snijders/Graham2009.pdf

Cheers,
Bruce

Art Kendall wrote
Does the EM Correlations procedure have a /MATRIX option?

Art Kendall
Social Research Consultants

Re: Filling in the top half of a correlation matrix

Bruce Weaver
Administrator
In reply to this post by Bruce Weaver
After some off-list discussion with David, I inserted a PRINT after CM=CM+T(CM) and discovered that the main diagonal terms were all equal to 2 -- that is why Kirill included the CALL SETDIAG line in his solution.  But interestingly, the MSAVE I used to write out the matrix seems to have fixed things, because the matrix in my EM2 dataset had 1's on the main diagonal.  

dataset activate EM1. /* EM correlations from OMS.
MATRIX.
get CM  / file = * / variables = !MyVarList / missing=0 .
compute CM=CM+T(CM).
print CM / format = "f5.3".  /* 2's on the main diagonal at this point */
msave CM /TYPE=CORR /OUTFILE=* /VARIABLES=!MyVarList.
END MATRIX.
DATASET NAME EM2.  /* 1's on the main diagonal by this point */
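
To see why the diagonal comes out as 2's, here is the same arithmetic sketched in plain Python (not SPSS), using the five-variable example matrix from the start of the thread. The zeros in the upper half stand in for what /MISSING=0 does on GET; the helper names are just for illustration.

```python
# Lower half of the example correlation matrix, with 0 standing in
# for the missing upper-triangle cells (the effect of /MISSING=0 on GET).
lower = [
    [1.000, 0.0,   0.0,   0.0,   0.0],
    [0.508, 1.000, 0.0,   0.0,   0.0],
    [0.347, 0.583, 1.000, 0.0,   0.0],
    [0.204, 0.243, 0.294, 1.000, 0.0],
    [0.108, 0.166, 0.213, 0.250, 1.000],
]

def transpose(m):
    """Swap rows and columns."""
    return [list(row) for row in zip(*m)]

def add(a, b):
    """Element-wise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# Analogue of: compute CM = CM + T(CM).
full = add(lower, transpose(lower))

# Each off-diagonal cell is r + 0, but each diagonal cell is 1 + 1.
diagonal = [full[i][i] for i in range(len(full))]
print(diagonal)
```

The off-diagonal cells are fine because each one is added to a zero, but every diagonal cell is added to itself.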


Meanwhile, it occurred to me this morning that I could replace missing with 1 when grabbing the matrix via GET, and then use element-wise multiplication (&*) of CM and T(CM).  Now there is definitely no need for SETDIAG.  Here is a self-contained example, which demonstrates that it works just fine:

EDIT:  Note that in the original version of this post, I showed %* as the element-wise multiplication function.  It should have been &*, as in the example below!  

NEW FILE.
DATASET CLOSE all.
MATRIX.
* Create lower half of a correlation matrix, but with 1's in the top half .
COMPUTE CM =
 {1,1,1,1,1 ;
 .508,1,1,1,1 ;
 .347,.583,1,1,1;
 .204,.243,.294,1,1;
 .108,.166,.213,.250,1 }.
PRINT CM / format = "f5.3".
COMPUTE CM = CM &* T(CM).
PRINT CM / format = "f5.3".
MSAVE CM /TYPE=CORR /OUTFILE=* /VARIABLES=V1 to V5.
END MATRIX.
FORMATS V1 to V5 (f5.3).
LIST.

There are 1's on the main diagonal at all points, and here is the final result of the MSAVE:

ROWTYPE_ VARNAME_    V1    V2    V3    V4    V5
 
CORR     V1       1.000  .508  .347  .204  .108
CORR     V2        .508 1.000  .583  .243  .166
CORR     V3        .347  .583 1.000  .294  .213
CORR     V4        .204  .243  .294 1.000  .250
CORR     V5        .108  .166  .213  .250 1.000
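
For anyone who wants to check that arithmetic outside SPSS, the element-wise product can be sketched in plain Python as well. This is only an illustration of why multiplying by 1 leaves each correlation untouched; the helper names are made up, and it does nothing MSAVE-like.

```python
# Lower half of the example matrix, with 1 (not 0) in the upper half.
lower_ones = [
    [1.000, 1.0,   1.0,   1.0,   1.0],
    [0.508, 1.000, 1.0,   1.0,   1.0],
    [0.347, 0.583, 1.000, 1.0,   1.0],
    [0.204, 0.243, 0.294, 1.000, 1.0],
    [0.108, 0.166, 0.213, 0.250, 1.000],
]

def transpose(m):
    """Swap rows and columns."""
    return [list(row) for row in zip(*m)]

def elementwise_product(a, b):
    """Analogue of the MATRIX &* operator: multiply matching cells."""
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# Analogue of: COMPUTE CM = CM &* T(CM).
full = elementwise_product(lower_ones, transpose(lower_ones))

for row in full:
    print(" ".join(f"{value:5.3f}" for value in row))
```

Every off-diagonal cell becomes r * 1 and every diagonal cell becomes 1 * 1, so the diagonal never needs repair, but only because the diagonal of a correlation matrix is 1.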


Bruce Weaver wrote
Another aha moment!  Thank you Kirill for showing me that I was misunderstanding how the /MISSING sub-command works for GET.  With /MISSING=0, I no longer need my recode out front.  

I didn't really need the /NAMES=NAMES bit, because I have my variable list as a macro.

My final syntax (I think) looks like this, without the recode out front:

dataset activate EM1. /* EM correlations from OMS.
MATRIX.
get CM  / file = * / variables = !MyVarList / missing=0 .
compute CM=CM+T(CM).
msave CM /TYPE=CORR /OUTFILE=* /VARIABLES=!MyVarList.
END MATRIX.
DATASET NAME EM2.

Note that I use MSAVE rather than SAVE, as this gives me the ROWTYPE_ and VARNAME_ variables I need.  (They are not present in the EM1 dataset obtained via OMS.)

Thanks fellas.  ;-)



Kirill Orlov wrote
Bruce,
If you chose to do it via MATRIX, here it is.

matrix.
get m /vari= v1 to v5 /miss= 0 /names= names.
comp m= m+t(m).
print m.
call setdiag(m,1).
save m /outfile= * /names= names.
end matrix.

The above example takes only the matrix body - i.e. without variables
ROWTYPE_ and VARNAME_ (if you have such there) - but you can modify it
to take in those, too.

You can do a similar thing using MGET / MSAVE matrix statements, but I
don't recommend it ever.



Re: Filling in the top half of a correlation matrix

Bruce Weaver
Administrator
At the risk of beating a dead horse, here is one final (I hope) variation on the matrix program for filling in the top half of my correlation matrix.  In a nutshell, I have gone back to setting the SYSMIS values in the top half of the matrix to 0, using CM = CM + T(CM) to fill in the top half, and using CALL SETDIAG to restore the values on the main diagonal to their correct original values.  The reason for changing back to this approach is that it is more general--e.g., it will work for a covariance matrix just as well as a correlation matrix.  Using my previous method (CM &* T(CM)) with a covariance matrix would have resulted in the terms on the main diagonal being equal to the squares of the variances.  With the approach I'm using now, after CM + T(CM) the diagonal terms are exactly twice the correct values for both covariance and correlation matrices, so halving them fixes either.

Thanks again to Kirill & David for their help.

From my syntax file...

* In an earlier version of this matrix program, I set the missing
* values in the top half of the correlation matrix to 1, and then
* filled in the top half by setting CM = CM &* T(CM), where CM =
* the correlation matrix, &* is the element-wise multiplication
* operator, and T(CM) is the transpose of matrix CM.  But I have
* now changed it to fill in the top half as follows:
* 1) Set the missing values in the top half of the matrix to 0;
* 2) Let CM = CM + T(CM), which doubles the terms on the main
*    diagonal (so a correlation matrix has 2's there);
* 3) Use CALL SETDIAG to set the diagonal to DIAG(CM)/2.
* I have changed to this more general approach because it will also
* work for covariance matrices, where the main diagonal holds variances,
* not 1's. Using CM = CM &* T(CM) would give me a matrix where the terms
* on the main diagonal are equal to the squares of the variances.
* With the approach I now use below, I always end up with the terms
* on the diagonal being twice as large as they should be, and this
* is so for either correlation or covariance matrices.  Therefore,
* I can simply divide the terms on the diagonal by 2 in either case.

dataset activate EM1. /* bottom half of matrix of EM correlations.
MATRIX.
get CM  / file = * / variables = !MyVarList / missing=0 .
* MISSING=0 on the previous line replaces the SYSMIS values with zeroes.
compute CM = CM + T(CM).
* At this point, the terms on the main diagonal are twice
* as large as they should be, so divide them by 2.
call setdiag(CM,DIAG(CM)/2).
*print CM / format = "f5.3".
msave CM /TYPE=CORR /OUTFILE=* /VARIABLES=!MyVarList.
* MSAVE saves the specified matrix as a matrix file, the
* sort of file needed as input to FACTOR.
* If you are modifying this code to use with a covariance
* matrix, change TYPE=CORR to TYPE=COV.
END MATRIX.
DATASET NAME EM2.


I suppose this could all be stuck in a macro with CORR vs COV as an argument.  Maybe I'll do that someday if I need to do this with a covariance matrix.
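
To make the covariance point concrete, here is a small sketch in plain Python (not SPSS) that runs both fill-in methods on a made-up 2 x 2 covariance matrix with variances 4 and 9 and covariance 3. The matrix and the function names are hypothetical, chosen only so the arithmetic is easy to follow by eye.

```python
def transpose(m):
    """Swap rows and columns."""
    return [list(row) for row in zip(*m)]

def fill_by_sum(lower_zero):
    """Add the transpose, then halve the doubled diagonal
    (analogue of CM + T(CM) followed by SETDIAG(CM, DIAG(CM)/2))."""
    t = transpose(lower_zero)
    full = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(lower_zero, t)]
    for i in range(len(full)):
        full[i][i] /= 2
    return full

def fill_by_product(lower_one):
    """Element-wise product with the transpose (analogue of CM &* T(CM))."""
    t = transpose(lower_one)
    return [[a * b for a, b in zip(ra, rb)] for ra, rb in zip(lower_one, t)]

# Hypothetical covariance matrix: variances 4 and 9, covariance 3.
cov_zero = [[4.0, 0.0],
            [3.0, 9.0]]   # upper half set to 0
cov_one = [[4.0, 1.0],
           [3.0, 9.0]]    # upper half set to 1

by_sum = fill_by_sum(cov_zero)
by_product = fill_by_product(cov_one)

print(by_sum)      # [[4.0, 3.0], [3.0, 9.0]]   variances preserved
print(by_product)  # [[16.0, 3.0], [3.0, 81.0]] variances squared
```

Both methods recover the off-diagonal covariance, but only the sum-and-halve version leaves the variances intact.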

