Hi all,
I'm trying to create a Euclidean distance table comparing cases for a sample of 4000+ cases and 5+ variables. I've pasted the syntax below. I'm only interested in the distances between one particular case and all other cases (rather than the full table of distances among all cases, which takes a long time to run and is too large to export). I would appreciate any commands I can add to this syntax to make that adjustment (similar to the "with" keyword in bivariate correlation syntax).

Thanks,
Adam

PROXIMITIES ZVar1 ZVar2 ZVar3 ZVar4 ZVar5
  /ID=CaseID
  /VIEW=CASE
  /MEASURE=EUCLID
  /STANDARDIZE=NONE.
Try this. Is this what you are looking for?
If your PRNG does not give exactly the same z-score values for case 1, so that the distance to case 1 is not close to zero, go to the Data View and copy the z scores for the 5 variables. Paste them into the target list and rerun the DO REPEAT part of the syntax. When you use this on your own data, paste the appropriate values into the target list.

set seed 20120509.
input program.
vector zvar (5,f5.3).
loop caseid = 1 to 4000.
loop #p = 1 to 5.
compute zvar(#p) = rv.normal(0,1).
end loop.
end case.
end loop.
end file.
end input program.
formats zvar1 to zvar5 (f10.8).
*part to re-run.
do repeat z = zvar1 to zvar5
  /target = -.46208301 .88636630 -2.00144742 -1.64200938 -.15693197
  /dsq = dsq1 to dsq5.
compute dsq= (target-z)**2.
end repeat.
compute dist2target = sqrt(sum(dsq1 to dsq5)).
formats dist2target(f6.3).
execute.

(In reply to Adam Troy's message of 5/9/2012, 10:03 AM.)
Art Kendall
Social Research Consultants
I would avoid the pasting and just use scratch variables. Also avoid the extra unneeded variables by using SUM within DO REPEAT. My personal preference for such things is to use the MATRIX language.
* Version 1: DO REPEAT with scratch variables, case 1 is the target.
* Note that dist stays system-missing for case 1 itself.
DO REPEAT z = zvar1 to zvar5 /#targ = #targ1 to #targ5.
+ DO IF $CASENUM=1.
+ COMPUTE #targ=Z.
+ ELSE.
+ COMPUTE dist= SUM(dist,(z-#targ)**2).
+ END IF.
END REPEAT.
COMPUTE dist = SQRT(dist).
formats dist(f6.3).

* Version 2: MATRIX language, row 1 is the target.
PRESERVE.
SET MXLOOPS=10000.
MATRIX.
GET data / var zvar1 to zvar5.
COMPUTE dist=MAKE(NROW(data),1,0).
LOOP #=1 TO NROW(data).
COMPUTE dist(#)=SQRT(RSUM((data(#,:)-data(1,:))&**2)).
END LOOP.
SAVE dist / OUTFILE "tmp.sav".
END MATRIX.
RESTORE.
MATCH FILES / FILE * / FILE "tmp.sav".
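A loop-free variant of the MATRIX approach might look like the sketch below. It assumes, as in the code above, that the target is the first case in the active file and that the variables are zvar1 to zvar5. It relies on the MATRIX function RSSQ (row sums of squares). The name diff is introduced here only for illustration, and tmp.sav is the same placeholder file name used above. Since there is no LOOP, the MXLOOPS setting should not need changing.

```spss
* Sketch only: assumes case 1 is the target and zvar1 to zvar5 exist.
MATRIX.
GET data / VARIABLES = zvar1 TO zvar5.
* Subtract row 1 from every row (column of ones times the target row).
COMPUTE diff = data - MAKE(NROW(data),1,1) * data(1,:).
* Euclidean distance is the square root of the row sums of squares.
COMPUTE dist = SQRT(RSSQ(diff)).
SAVE dist / OUTFILE = "tmp.sav" / VARIABLES = dist.
END MATRIX.
MATCH FILES / FILE = * / FILE = "tmp.sav".
EXECUTE.
```

As with the looped version, GET stops with an error if any of the variables contains missing values, so this is best treated as a starting point rather than a finished solution.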
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"