|
Hi Gene: Yes, in response to your question. We are interested in getting an IPA for every combination of persons and then using that value for the cluster analysis. I am playing around with the PROXIMITIES command right now. Martin

From: Maguin, Eugene <[hidden email]>
Sent: Tuesday, May 23, 2017 2:11:57 PM
To: Martin Sherman; [hidden email]
Subject: RE: How to get an index of profile agreement for 1090 by 23 matrix

I'm just curious. So, as I understand this, you are going to calculate an IPA for every combination of persons and then use that value in your cluster analysis. So, you'll have 522753 IPA values (1023*1022/2), functionally the upper or lower triangle of a 1023 by 1023 matrix.
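The pair count Gene mentions follows from n*(n-1)/2. A small sketch in Python (not SPSS; the function name is made up for illustration) reproduces his figure for 1023 cases and shows what the same formula gives for the 1090 rows named in the subject line:

```python
def pair_count(n_cases):
    """Number of unordered pairs among n_cases profiles: n * (n - 1) / 2."""
    return n_cases * (n_cases - 1) // 2


print(pair_count(1023))  # 522753, the figure quoted above
print(pair_count(1090))  # 593505, using the 1090 rows from the subject line
```

Either way the result is one triangle of a square case-by-case matrix, which is why the thread treats it as a matrix problem.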
=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
I'm not familiar with the formula you give, but I kind of don't think you've reproduced it accurately because, using your example data:

Profile 1: 1.2, 0.3, -0.5, 0.1, 0.9
Profile 2: 0.6, 0.7, -1.1, -0.3, 0.7
SigmaM2: 0.9, 0.5, -0.8, -0.2, 0.8 = 1.3
SigmaD2: 0.6, -0.4, 0.6, 0.4, -0.2 = 1.0

IPA = 5 + 2*1.3 - 1.0/sqrt(50) = 7.46 (probably not)
Maybe it's (5 + 2*1.3 - 1.0)/sqrt(50) = 0.933 (plausible)

I agree that it is most directly a matrix problem and best done there. But, in theory but not practically, couldn't this be done if the data matrix were flipped?

Gene Maguin

-----Original Message-----
From: SPSSX(r) Discussion [[hidden email]] On Behalf Of Martin F. Sherman
Sent: Tuesday, May 23, 2017 11:21 AM
To: [hidden email]
Subject: How to get an index of profile agreement for 1090 by 23 matrix

Dear listers: Looking for some advice on how to proceed with the following. Is there a macro or syntax file that I could manipulate to do what I need to have done? Thanks. Martin

Here is what we have. We have a 1090 by 23 matrix of data. The rows are subjects and the columns are variables. These variables are scale scores on a measure. We wish to do a cluster analysis of these variables to identify homogenous groups of profiles across the 23 scales. To do this, we want to use as data an index of profile agreement. That is, we want to calculate the Ipa (index of profile agreement) for each person with every other person. We would like to do this using the matrix routine from SPSS or even to develop a specific macro to calculate these values.

Here is the equation, as best as I can get it in email:

Ipa = k + 2 Sigma M^2 - Sigma d^2 / sqrt(10*k)

k = number of variables (which is 23 in this case)
M = the mean of two scores on the same scale
d = the difference between two scores on the same scale

So, if we had only 5 scales and wanted to compare two profiles, here is an example.

Profile 1: 1.2, 0.3, -0.5, 0.1, and 0.9 (all values are z-scores)
Profile 2: 0.6, 0.7, -1.1, -0.3, and 0.7

Sigma M^2 = (0.9^2 + 0.5^2 + (-0.8)^2 + (-0.1)^2 + 0.8^2) = 2.35
Sigma d^2 = (0.6^2 + 0.4^2 + 0.6^2 + 0.4^2 + 0.2^2) = 1.08

Martin F. Sherman, Ph.D.
Professor of Psychology
Director of Masters Education: Thesis Track
Loyola University Maryland
4501 North Charles Street
Baltimore, MD 21210 |
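The worked example in the original question can be checked numerically. What follows is a minimal sketch in Python rather than SPSS. It assumes the grouping (k + 2*Sigma M^2 - Sigma d^2) / sqrt(10*k), which is the reading Gene calls plausible and the grouping used by the macro posted later in this thread. The function names are hypothetical. Note that the sums in the question are sums of squares, which gives 2.35 and 1.08 for the two example profiles.

```python
import math


def ipa_components(profile_a, profile_b):
    """Return (sum of squared pair means, sum of squared differences)."""
    sum_m2 = sum(((a + b) / 2) ** 2 for a, b in zip(profile_a, profile_b))
    sum_d2 = sum((a - b) ** 2 for a, b in zip(profile_a, profile_b))
    return sum_m2, sum_d2


def ipa(profile_a, profile_b):
    """Index of profile agreement under the assumed grouping (see lead-in)."""
    k = len(profile_a)
    sum_m2, sum_d2 = ipa_components(profile_a, profile_b)
    return (k + 2 * sum_m2 - sum_d2) / math.sqrt(10 * k)


profile_1 = [1.2, 0.3, -0.5, 0.1, 0.9]
profile_2 = [0.6, 0.7, -1.1, -0.3, 0.7]

sum_m2, sum_d2 = ipa_components(profile_1, profile_2)
print(round(sum_m2, 2))  # 2.35, as in the question
print(round(sum_d2, 2))  # 1.08, as in the question
print(round(ipa(profile_1, profile_2), 3))  # 1.219 under the assumed grouping
```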
|
In reply to this post by msherman
Just so you hear another point of view - I never cared for Clustering. And the better examples I've seen (where clustering seemed to give useful results) started out with demographics, not scale score.

With scales, you end up characterizing the Clusters by their means on several variables... I thought it was economical and easier to start out with the variables, and do Factor Analysis.

I would look at the factors both before and after varimax rotation, to pick what seems "meaningful" - usually, the first few rotated factors.

Then I could score up the meaningful factors, probably with the simple average-of-top-items, and look at scatterplots between pairs of the top three or so factors. If scores seem to "cluster" in top, middle, or bottom tiers, then there are "clusters". Even without clusters, there were extremes that could be identified that accounted for much of the observed variance.

My prejudice in this direction was partly shaped by reading early work in looking at psychiatric symptoms (Overall and Klett, 1971 textbook), where the clear-cut diagnostic groups that existed in the teaching of psychiatrists in the 1960s did not show up as clusters; rather, there were dimensions of behavior which never did show sharp breaks between Dx groups.
-- Rich Ulrich
|
|
Clustering is exploratory/heuristic. Many of the times I have seen meaningful results, items were factored, a unit weight scoring key was developed, then the scale scores were used as the input to clustering.
Art Kendall
Social Research Consultants |
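The "simple average-of-top-items" and "unit weight scoring key" mentioned in the two replies above amount to averaging the items assigned to each factor, with every item weighted equally. A minimal sketch in Python, where the item names, factor names, and key are invented purely for illustration and do not come from anyone's data:

```python
def unit_weight_scores(item_scores, scoring_key):
    """Average the items listed for each scale, every item weighted 1."""
    scores = {}
    for scale, items in scoring_key.items():
        scores[scale] = sum(item_scores[item] for item in items) / len(items)
    return scores


# Hypothetical key: which items were judged to belong to which factor.
key = {
    "factor_1": ["item_a", "item_b", "item_c"],
    "factor_2": ["item_d", "item_e"],
}
case = {"item_a": 4, "item_b": 2, "item_c": 3, "item_d": 1, "item_e": 5}

print(unit_weight_scores(case, key))  # {'factor_1': 3.0, 'factor_2': 3.0}
```

The resulting scale scores are what would then go into scatterplots or into a clustering run.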
|
Administrator
|
In reply to this post by msherman
This is what I was getting at.
DEFINE !RpaCluster ( Varlist !CMDEND)
PRESERVE.
SET MXLOOPS=1000000.
MATRIX.
GET data /FILE */VARIABLES !Varlist.
/* Calculate rpa Profile similarity measure*/.
COMPUTE N=NRow(data).
COMPUTE K=NCOL(data).
COMPUTE rpa=IDENT(N,N).
COMPUTE Mean=CSUM(data)/NROW(data).
COMPUTE SMeanSq=Mean * T(Mean).
LOOP #=1 TO N - 1.
LOOP ##=#+1 TO N .
COMPUTE Diff=data(#,:) - data(##,:).
COMPUTE ipa=(K + 2 * SMeanSq - Diff * T(Diff) ) /SQRT(10*K).
COMPUTE rpa(#,##)=ipa/SQRT((K - 2) + ipa**2).
COMPUTE rpa(##,#)=rpa(#,##).
END LOOP.
END LOOP.
SAVE rpa /OUTFILE *.
END MATRIX.

/* Prepare matrix data file for Cluster */.
STRING ROWTYPE_ ID (A8) VARNAME_ (A10).
COMPUTE CASENO_=$CASENUM.
FORMATS CASENO_ (F4.0).
COMPUTE ROWTYPE_="PROX".
VALUE LABELS ROWTYPE_"PROX" ="SIMILARITY".
COMPUTE ID=CONCAT("Case ", LTRIM(STRING($CASENUM,F4.0))).
COMPUTE VARNAME_=CONCAT("COL",LTRIM( STRING($CASENUM,F4.0))).
MATCH FILES /FILE */KEEP ROWTYPE_ ID CaseNo_ VARNAME_ ALL.

/* Customize the following to your specifications re METHOD etc */.
CLUSTER / MATRIX IN (*) .
RESTORE.
!ENDDEFINE .

/* Simulate some normal deviates */.
MATRIX.
SAVE UNIFORM(2000,22)/OUTFILE */VARIABLES x1 TO x22.
END MATRIX.
DO REPEAT v=x1 TO x22.
COMPUTE v=RV.NORMAL(0,1).
END REPEAT.

/* Call the macro */.
!RpaCluster !Varlist = x1 TO x22 .
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
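For readers without SPSS at hand, the double loop in the macro above can be mirrored in a few lines. This is a rough sketch in Python, not a translation supplied by anyone in the thread, and the function names are hypothetical. One assumption to flag: the sketch takes M as the mean of the two scores being compared, as in Martin's definition, whereas the macro as posted builds its mean term once from the column means of the whole data set. Which of the two is intended should be checked against the source of the formula. The conversion from ipa to rpa copies the expression used in the macro.

```python
import math


def ipa(profile_a, profile_b):
    """Index of profile agreement, with M taken as the pair mean (assumption)."""
    k = len(profile_a)
    sum_m2 = sum(((a + b) / 2) ** 2 for a, b in zip(profile_a, profile_b))
    sum_d2 = sum((a - b) ** 2 for a, b in zip(profile_a, profile_b))
    return (k + 2 * sum_m2 - sum_d2) / math.sqrt(10 * k)


def rpa_from_ipa(ipa_value, k):
    """Same rescaling as the macro: ipa / sqrt((k - 2) + ipa**2)."""
    return ipa_value / math.sqrt((k - 2) + ipa_value ** 2)


def rpa_matrix(data):
    """Symmetric case-by-case matrix with 1.0 on the diagonal."""
    n = len(data)
    k = len(data[0])
    rpa = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n - 1):
        for j in range(i + 1, n):
            value = rpa_from_ipa(ipa(data[i], data[j]), k)
            rpa[i][j] = value
            rpa[j][i] = value  # fill the lower triangle from the upper
    return rpa


# The two example profiles from the question, plus an all-zero profile.
data = [
    [1.2, 0.3, -0.5, 0.1, 0.9],
    [0.6, 0.7, -1.1, -0.3, 0.7],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]
for row in rpa_matrix(data):
    print([round(value, 3) for value in row])
```

With 1090 cases the loop runs a few hundred thousand times, which is small for this kind of computation. The square matrix it returns plays the same role as the proximity matrix the macro hands to CLUSTER.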
|
David: Could I send you my data file? I must be missing something. Martin

From: SPSSX(r) Discussion <[hidden email]> on behalf of David Marso <[hidden email]>
Sent: Wednesday, May 24, 2017 4:02:19 PM
To: [hidden email]
Subject: Re: How to get an index of profile agreement for 1090 by 23 matrix

This is what I was getting at.
|
|
Administrator
|
Martin, Contact me off list and we can discuss. David On Wed, May 24, 2017 at 4:24 PM, msherman [via SPSSX Discussion] <[hidden email]> wrote:
|
|
Administrator
|
In reply to this post by David Marso
The macro call is incorrect. It should be

/* Call the macro */.
!RpaCluster Varlist = x1 TO x22 .

instead of

!RpaCluster !Varlist = x1 TO x22 .
|
|
Administrator
|
In reply to this post by David Marso
Martin,
Did you ever connect the dots on this, or are you oblivious to the fact that I went way beyond the call of duty and answered your exact question? Perhaps you need to do some reading up on CLUSTER and the MATRIX language. Alternatively, restate your question or indicate what part of my solution fails to penetrate your cerebrum.
|
|
Administrator
|
In reply to this post by Maguin, Eugene
"I agree that it is most directly a matrix problem and best done there. But, in theory but not practically, couldn't this be done if the data matrix were flipped?"
I really wouldn't want to try this outside of MATRIX. Perhaps explain how this could possibly work short of creating hundreds of THOUSANDS of new variables and a nearly intractable AGGREGATION. The very suggestion makes my head spin/explode.
|
|
David, I agree with your statement. If I were faced with that problem, I'd try to figure out MATRIX-END MATRIX but also beg some code from you because you can write it quickly and correctly. Gene Maguin
-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of David Marso
Sent: Monday, June 05, 2017 5:21 PM
To: [hidden email]
Subject: Re: How to get an index of profile agreement for 1090 by 23 matrix

"I agree that it is most directly a matrix problem and best done there. But, in theory but not practically, couldn't this be done if the data matrix were flipped?"

I really wouldn't want to try this outside of MATRIX. Perhaps explain how this could possibly work short of creating hundreds of THOUSANDS of new variables and a nearly intractable AGGREGATION. The very suggestion makes my head spin/explode. |
|
Yes, David, you went beyond the call of duty, and it is very much appreciated. I have been waiting for my colleague to get back to me. He is the one who created the original code and is in the process of checking it out with a small dummy set. I have a meeting with him later this week to review where we are. My initial reaction is that the code you wrote is spot on. Again, your work on this and the comments on the listserv are proving to be invaluable. Thanks. Martin
Sent from my iPhone
> On Jun 5, 2017, at 5:32 PM, Maguin, Eugene <[hidden email]> wrote:
> David, I agree with your statement. If I were faced with that problem, I'd try figure out matrix-end matrix but also beg some code from you because you can write it quickly and correctly. Gene Maguin |
|
In reply to this post by msherman
A distinction is sometimes made between profile similarity and profile agreement. One way to look at that is to draw profile graphs. [Still a hand operation; SPSS has yet to implement them.] A profile graph is much the same as a profile coordinate plot except that all axes are the same, e.g., z-scores, T-scores, Likert response scale, percentiles, etc. One of the most common uses of such graphs is in achievement testing, the axes being things like spelling, noun-verb number agreement, plurals, etc. I believe the archives of this list have some syntax for profile graphs you could use.

Two profiles can differ in shape, elevation, and/or scatter. Two (case or cluster) profiles have similar shape when the line segments between axes have the same direction, ignoring the steepness. Two (case or cluster) profiles have similar elevation when the line segments between axes have about the same score (e.g., z-score). Two (case or cluster) profiles have similar shape when the line segments between axes have about the same direction and slope, i.e., visually are on top of one another.

While you are waiting for your colleague, I suggest that you use PROXIMITIES to look at some of the similarity coefficients for continuous variables on your small set of example cases. Besides eyeballing the coefficients themselves, you might ask the list to provide some syntax to make a "sausage" file. That is, make each pair of entities a row (case) and each kind of coefficient a variable. If you are ambitious, you could use random sampling to select a 25 by 23 matrix of z-scores and then produce the 300 by # of coefficients "sausage" file of the off-diagonal matrix. Then scatterplot pairs of coefficients, some from the old PROXIMITIES coefficients and some from your new kind of coefficients.
Art Kendall
Social Research Consultants |
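The "sausage" file described above (one row per pair of cases, one column per coefficient) can be sketched as follows. This is an illustration in Python, not SPSS syntax. Euclidean distance and Pearson correlation stand in here for whichever PROXIMITIES measures are chosen, and the ipa function assumes the grouping (k + 2*Sigma M^2 - Sigma d^2) / sqrt(10*k) with M as the pair mean. All function and column names are hypothetical, and the data are simulated.

```python
import math
import random
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pearson(a, b):
    """Pearson correlation across scales; undefined for a flat profile."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(ss_a * ss_b)


def ipa(a, b):
    """Index of profile agreement under the assumptions in the lead-in."""
    k = len(a)
    sum_m2 = sum(((x + y) / 2) ** 2 for x, y in zip(a, b))
    sum_d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return (k + 2 * sum_m2 - sum_d2) / math.sqrt(10 * k)


def sausage_rows(data):
    """One dict per unordered pair of cases, one key per coefficient."""
    rows = []
    for i, j in combinations(range(len(data)), 2):
        rows.append({
            "case_i": i + 1,
            "case_j": j + 1,
            "euclid": euclidean(data[i], data[j]),
            "pearson": pearson(data[i], data[j]),
            "ipa": ipa(data[i], data[j]),
        })
    return rows


# Simulated 25 by 23 matrix of z-scores, matching the sizes suggested above.
random.seed(1)
sample = [[random.gauss(0.0, 1.0) for _ in range(23)] for _ in range(25)]
rows = sausage_rows(sample)
print(len(rows))  # 300 pairs from 25 cases
```

Each pair of columns in the resulting rows can then be scatterplotted to see how the new coefficient relates to the familiar ones.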
