calculating distances

calculating distances

Volker, Gerard
Dear SPSS-users,
 
I have the following dataset:
 
VAR1 VAR2
H1  0
H2  10
H3  30
H(n)  200.
 
VAR1 is a string (train stations, A3); VAR2 is numeric (F8.0) and holds the cumulative distance along the line at each station (H1 to H(n)).

I need to calculate the distance between each combination of stations (H1H2, H1H3, H1H4, etc.).
The output that I'm looking for is in this case (n=4):
 
comb distance
H1H2 10
H1H3 30
H1H4 200
H2H3 20
H2H4 190
H3H4 170.
 
H(n) can be up to 30 so I'm looking at a staggering number of possible combinations. Obviously combinations such as H2H1 are the same as H1H2 and can be left out.
 
What is the best way to tackle this problem? I've been looking at SPSS tools but can't find a macro or syntax that fits this particular problem.
 
Your advice is much appreciated! 
 
Regards,
Gerard Volker 

Syntax for adding reference line?

Robert L
Hi all,

Could anyone help me with a (hopefully) small graphical issue? I simply want to add reference lines to boxplots, but I would like to do it with syntax rather than opening the Chart Editor for each of my numerous plots. Better still would be to add single points placed near the boxplots, as in the example below:

   |------|   |  |-----|
           (*)

where the (*) is the value for the single point. Is there a way to do either of these, the reference line and/or the single point, using syntax? Any suggestions would be appreciated.

Robert
*****************
Robert Lundqvist
Norrbotten county council
Lulea, Sweden

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Robert Lundqvist

Re: calculating distances

Maguin, Eugene
In reply to this post by Volker, Gerard
Gerard,

It's time to start reading the syntax reference documentation, specifically
the CASESTOVARS, VECTOR, and LOOP-END LOOP commands. Basically, you run
CASESTOVARS to get a dataset consisting of one record (case) with variables
var1.1 to var1.4 and var2.1 to var2.4. I think you will need to add an id
variable with a constant value across the four cases for the CASESTOVARS
command.

Then define var1.1 to var1.4 and var2.1 to var2.4 as two vectors of size 4;
call them vecv1 and vecv2. Also define a third vector of size (4*3)/2 = 6;
call this one result.

Then,

* #i and #j index the two stations of a pair; #k counts the pairs.
Compute #k=0.
Loop #i=1 to 3.
Loop #j=#i+1 to 4.
Compute #k=#k+1.
Compute result(#k)=abs(vecv2(#j)-vecv2(#i)).
End loop.
End loop.
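
For anyone who wants to check the expected numbers outside SPSS, the same double loop over station pairs can be sketched in plain Python. This is only an illustration and not part of the SPSS solution: the function name `pairwise_distances` and the list-of-tuples input are assumptions made up for the example. It also shows that n stations give n(n-1)/2 pairs.

```python
from itertools import combinations


def pairwise_distances(stations):
    """Return (label, distance) for every unordered station pair.

    `stations` is a list of (name, cumulative_distance) tuples in route
    order; this input shape is an assumption made for the sketch.
    """
    return [
        (name_a + name_b, abs(cum_b - cum_a))
        for (name_a, cum_a), (name_b, cum_b) in combinations(stations, 2)
    ]


# The four-station example from the original question.
example = [("H1", 0), ("H2", 10), ("H3", 30), ("H4", 200)]
for comb, distance in pairwise_distances(example):
    print(comb, distance)

# n stations yield n * (n - 1) / 2 pairs.
n = 30
print(n * (n - 1) // 2)
```

For the four-station example this prints the six combinations from the question (H1H2 10 through H3H4 170), and for n = 30 it reports 435 pairs.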

Gene Maguin



Re: calculating distances

Marta Garcia-Granero
In reply to this post by Volker, Gerard

A different approach to the one offered by Gene is this one, using MATRIX:

DATA LIST LIST/VAR1(A3) VAR2(F8).
BEGIN DATA.
H1   0
H2  10
H3  30
H4  200
END DATA.

PRESERVE.
* MXLOOPS caps the iterations of each MATRIX loop; increase it if there are more than 100 rows of data.
SET MXLOOPS=100.
MATRIX.
GET Var1 /VAR=VAR1 /NAMES=Vname1.
GET Var2 /VAR=VAR2 /NAMES=Vname2.
COMPUTE NStation=NROW(Var1).
COMPUTE Distance=MAKE(NStation*(NStation-1)/2,1,0).
COMPUTE NameStat=MAKE(NStation*(NStation-1)/2,2,0).
COMPUTE a=1.
LOOP I=1 TO NStation-1.
. LOOP J=I+1 TO NStation.
.  COMPUTE Distance(a)=Var2(J)-Var2(I).
.  COMPUTE NameStat(a,1)=Var1(I).
.  COMPUTE NameStat(a,2)=Var1(J).
.  COMPUTE a=a+1.
. END LOOP.
END LOOP.
PRINT NameStat
 /FORMAT='A3'
 /TITLE='Stations being compared'.
PRINT Distance
 /FORMAT='F8'
 /TITLE='Distances'.
END MATRIX.
RESTORE.



--
For miscellaneous SPSS related statistical stuff, visit:
http://gjyp.nl/marta/


Re: calculating distances

Volker, Gerard
Thank you Gene and García-Granero for your quick and most adequate response!

Your solution works brilliantly and saves a lot of time! Much to learn from.

Kind regards,
Gerard Volker

-----Original message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On behalf of García-Granero
Sent: Tuesday 16 February 2010 17:45
To: [hidden email]
Subject: Re: calculating distances


Re: calculating distances

Ruben Geert van den Berg
In reply to this post by Volker, Gerard
Dear Gerard,
 
Alternatively, you could use
 
cd 'c:\temp'.
 
DATA LIST LIST/V1(A2) V2(F8).
BEGIN DATA.
H1 0
H2 10
H3 30
H4 200
END DATA.
 
flip
/new=v1.
 
* The loop bounds (4) are hard-coded to the number of stations in this example.
define !blah()
!do !t1=1 !to 4
!do !t2=1 !to 4
!if (!t1 !lt !t2) !then
comp !concat('Distance_between_H',!t1,'_and_H',!t2)=abs(!concat('H',!t1)-!concat('H',!t2)).
!ifend
!doend
!doend
exe.
!end.

!blah.
 
flip.
ren var (CASE_LBL V2=Description Distance).
sel if index(Description,'Distance')>0.
exe.

Best regards,

Ruben van den Berg

Methodologist

TNS NIPO

E: [hidden email]

P: +31 20 522 5738

I: www.tns-nipo.com




 

Date: Tue, 16 Feb 2010 11:45:09 +0100
From: [hidden email]
Subject: calculating distances
To: [hidden email]


Re: Syntax for adding reference line?

Albert-Jan Roskam
In reply to this post by Robert L
Hi!
 
The approach below uses R. The boxplot() and par() calls are tested; the spssdata call is not, as I'm using a neolithically old SPSS version ;-). You could easily set the value of the horizontal line programmatically (the mean, the median, or whatever; see below).
 
begin program R.
# Your data (edit the var names)
# df <- spssdata.GetDataFromSPSS(variables=c("X1","X2"))
df <- data.frame(cbind(runif(10, 0.5), rnorm(10, 2))) # sample data
boxplot(df$X1, df$X2, main="Just some test", xlab="X label", ylab="Y label")
par(abline(h=2))
end program.
 
par(abline(h=mean(c(df$X1, df$X2)))) # horizontal line at the grand mean of X1 and X2.

Cheers!!
Albert-Jan

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In the face of ambiguity, refuse the temptation to guess.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

--- On Tue, 2/16/10, Robert Lundqvist <[hidden email]> wrote:

From: Robert Lundqvist <[hidden email]>
Subject: [SPSSX-L] Syntax for adding reference line?
To: [hidden email]
Date: Tuesday, February 16, 2010, 4:52 PM


Syntax for adding reference line?

Jon K Peck

This is easy to do with a template.  Create the chart you want and add reference lines as desired in the Chart Editor.
Then save the chart as a template (File > Save Chart Template). When you save the template, you can choose to include just the reference line settings or other properties as well (checkboxes near the bottom of the dialog).

When you want to do another chart, just include this template in the syntax.  For example,
GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=educ salary MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE
    TEMPLATE=["C:\spss18\Looks\referenceLine.sgt"].

To build more complicated charts you could add additional features to the boxplot using other ELEMENT commands, including computed reference lines etc.



Jon Peck
SPSS, an IBM Company
[hidden email]
312-651-3435



From: Albert-Jan Roskam <[hidden email]>
To: [hidden email]
Date: 02/17/2010 08:19 AM
Subject: Re: [SPSSX-L] Syntax for adding reference line?
Sent by: "SPSSX(r) Discussion" <[hidden email]>







Re: Syntax for adding reference line?

Jon K Peck

Here is a slightly more complicated example. It draws a boxplot and adds a reference line plus small squares at the mean of each category. (Infinitely) many variations are possible.

GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=educ salary mean(salary)[name="meansal"] MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: educ=col(source(s), name("educ"), unit.category())
  DATA: salary=col(source(s), name("salary"))
  DATA: meansal=col(source(s), name("meansal"))
  DATA: id=col(source(s), name("$CASENUM"), unit.category())
  GUIDE: axis(dim(1), label("Educational Level (years)"))
  GUIDE: axis(dim(2), label("Current Salary"))
  SCALE: cat(dim(1), include("8", "12", "14", "15", "16", "17", "18", "19", "20", "21"))
  SCALE: linear(dim(2), include(0))
  ELEMENT: schema(position(bin.quantile.letter(educ*salary)), label(id))
  ELEMENT: point(position(summary.mean(educ*salary)),shape.interior(shape.square))
  GUIDE: form.line(position(*, 100000))
END GPL.

Jon Peck
SPSS, an IBM Company
[hidden email]
312-651-3435
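
To make explicit what the square markers in the chart above represent, here is a minimal Python sketch, outside SPSS, of a per-category mean compared against the fixed reference value of 100000. The helper name `category_means` and the (educ, salary) sample pairs are invented for illustration; they are not taken from any dataset in this thread.

```python
from collections import defaultdict
from statistics import mean


def category_means(rows):
    """Mean of the second field for each distinct value of the first."""
    groups = defaultdict(list)
    for category, value in rows:
        groups[category].append(value)
    return {category: mean(values) for category, values in sorted(groups.items())}


# Invented (educ, salary) pairs, for illustration only.
sample = [(12, 30000), (12, 34000), (16, 52000), (16, 60000), (16, 71000)]
REFERENCE_LINE = 100000  # the fixed value passed to form.line in the GPL example

for educ, avg in category_means(sample).items():
    side = "below" if avg < REFERENCE_LINE else "at or above"
    print(educ, avg, side, "the reference line")
```

In the GPL block the same grouping is done by the chart engine itself; the sketch only shows the arithmetic behind the overlay.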



From: Jon K Peck/Chicago/IBM@IBMUS
To: [hidden email]
Date: 02/17/2010 09:12 AM
Subject: [SPSSX-L] Syntax for adding reference line?
Sent by: "SPSSX(r) Discussion" <[hidden email]>







GPL box plot?

Robert L
I am trying to come to grips with GPL commands, and it seems to work quite fine. At least when making a histogram. Box plots however end up with a series of bars. No box, no whiskers, just a sequence of bars. Here's the code for a small example:
 
NEW FILE.
INPUT PROGRAM.
   LOOP #I=1 TO 10.
      COMPUTE Normal = RV.NORMAL(100,20).
      END CASE.
   END LOOP.
   END FILE.
END INPUT PROGRAM.
dataset name rnd.
 
ggraph
/GRAPHDATASET name="rnd" variables=Normal
/GRAPHSPEC source=inline.
BEGIN GPL
SOURCE:slumptal=userSource(id("rnd"))
DATA:Normal=col(source(rnd), name("Normal"))
ELEMENT: schema(position(bin.quantile.letter(Normal)))
END GPL.
This should work, shouldn't it? If not, what am I missing? Your suggestions would really be appreciated.
 
Robert
Robert Lundqvist

Re: GPL box plot?

Jon K Peck

Two changes: the DATA statement must refer to the source name (slumptal) rather than the dataset id, and, most importantly, your graph algebra has no x-axis. Try this:
ggraph
/GRAPHDATASET name="rnd" variables=Normal
/GRAPHSPEC source=inline.
BEGIN GPL
SOURCE:slumptal=userSource(id("rnd"))
DATA:Normal=col(source(slumptal), name("Normal"))
ELEMENT: schema(position(bin.quantile.letter(1*Normal)))
END GPL.

HTH,

Jon Peck
SPSS, an IBM Company
[hidden email]
312-651-3435



From: Robert Lundqvist <[hidden email]>
To: [hidden email]
Date: 03/01/2010 10:10 PM
Subject: [SPSSX-L] GPL box plot?
Sent by: "SPSSX(r) Discussion" <[hidden email]>




