Looping by index from variable contents

Looping by index from variable contents

Frank H. Millard
So how do I create a loop like this:

For I = 1 to var(I)
   do something
Next I
I know this is not the proper syntax, but it represents what I need to
accomplish.

Thank you

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Looping by index from variable contents

Maguin, Eugene
All,

Yesterday, Frank Millard posted what I thought was a simple question about
how to set up loops.

I replied, 'Look at the Loop-end loop command in the documentation.'

He replied:
I did before asking for help.  The solution I was looking for was not
straightforward in the docs.

I want to control a loop based on the results of a two step cluster
analysis.
I want to find clusters within clusters.
So, if the analysis produces 4 clusters, saved in a new variable "var1" by
the analysis, I need to construct a loop control such that:
Loop var1
Cases are selected for each cluster in var1; the two step cluster syntax is
modified to create new variable output for each cluster in var1;
Run the two step cluster
END

My reply this morning.

Frank, I think there are others on the list with more experience with
clustering and clusters. I've never done a cluster analysis. That said, the
idea of 'finding clusters within clusters' seems odd to me. That is, why
wouldn't you just extract more clusters? Also, I think that a macro
loop may be more relevant to your question because you want to loop
through procedure commands rather than transformation commands. There are
macro examples, including a loop command, in one of the appendices of the
syntax reference manual. I've only set up a couple of simple macros. Other
folks, please jump in.

Gene Maguin

Re: Looping by index from variable contents

Frank H. Millard
Gene,
Thank you for your time.
I'll work out my looping problem, I hope -- it's not critical.

About the clustering.  Very crudely, cluster analysis attempts to find
groups in data based on case attributes (or measures) and a distance measure
applied to each case to determine either similarity or dissimilarity of a
case to a group centroid.  Generally, a clustering tool finds a set of
initial clusters, assigns cases to these initial groups and, by iteration,
updates group centroids and case membership until either the number of
iterations is exhausted or no further distance optimization can be achieved.

I am attempting to build automated production of demographic clusters based
on census measures at the block group unit of analysis.   2 Step Cluster
analysis, automatically -- the number of clusters is not predefined, so the
expectation is that the results are data driven -- produces 4 major clusters
from 4788 block groups: Higher Income, Middle Class, Lower Middle Class and
Lower Income; ordered by Median Household Income. Discriminant function
analysis validates that 93% of the cases are "correctly" classified by the
cluster analysis, which is very good for census data.  Moreover, the
discriminant function structure matrix shows that the first two discriminant
functions separate the groups by affluence and urbanity, which is reproduced
by factor analysis and by major commercial segmentation systems such as
Claritas and ACORN.
Now, each of the major clusters or groups has sub-groups (or sub-types),
which are found by selecting cases in terms of each major cluster and
running the cluster analysis again to find the sub clusters.

You can see the result of this at:
http://health.state.ga.us/demographicprofiles/index.htm
Frank Millard

Re: Looping by index from variable contents

ViAnn Beadle
Could you use the cluster group variable as a split file variable and then
run cluster again on the splits?
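
A minimal sketch of this suggestion, assuming Python is used only to assemble the syntax text. The helper name `wrap_in_split_file` and the FREQUENCIES placeholder are hypothetical; the placeholder stands in for your own TWOSTEP CLUSTER command, and whether TWOSTEP saves cluster membership under SPLIT FILE is not something this sketch verifies.

```python
def wrap_in_split_file(analysis_syntax, split_var):
    """Return SPSS syntax that runs analysis_syntax once per value of split_var."""
    body = analysis_syntax.strip()
    if not body.endswith("."):
        body += "."  # every SPSS command ends with a period
    return "\n".join([
        f"SORT CASES BY {split_var}.",           # SPLIT FILE expects sorted cases
        f"SPLIT FILE SEPARATE BY {split_var}.",  # one set of output per group
        body,
        "SPLIT FILE OFF.",                       # restore whole-file processing
    ])


# Placeholder analysis command (hypothetical variable name "income").
syntax = wrap_in_split_file("FREQUENCIES VARIABLES=income", "var1")
print(syntax)
```

The printed text can be pasted into a syntax window and run as-is once the placeholder command is replaced.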

Re: Looping by index from variable contents

Art Kendall
In reply to this post by Maguin, Eugene
Are you looking for a TREE or for hierarchical clustering?

Art Kendall
Social Research Consultants

Re: Looping by index from variable contents

Art Kendall
> The SAVE subcommand allows you to save cluster output to the active
> dataset.
> CLUSTER  Save the cluster identification. The cluster number for each
> case is saved; the user may specify a variable name using the VARIABLE
> keyword, otherwise, it is saved to TSC_n, where n is a positive integer
> indicating the ordinal of the SAVE operation completed by this procedure
> in a given session.
Are you sure you are using /SAVE and still not getting TSC_1, TSC_2, etc.?

Try setting an explicit stem variable to replace "TSC_":
/SAVE CLUSTER VARIABLE =  firstvar
AIM firstvar

Then try SPLIT FILE
and specify
/SAVE CLUSTER VARIABLE = secondvar
Then see what new variables start with "second".


If you have two separate variables for the runs, crosstab the two
membership variables (firstvar and secondvar).
Create a new variable with a different value for each cell in the
crosstab and try that in AIM:
AIM mynewvar.
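
One way to picture the "different value for each cell" step, as a minimal Python sketch rather than SPSS syntax. The function name `combine_memberships` and the sample values are hypothetical; the lists stand in for the firstvar and secondvar columns.

```python
def combine_memberships(first, second):
    """Give every observed (first, second) pair its own integer code, starting at 1."""
    pairs = sorted(set(zip(first, second)))  # the non-empty crosstab cells
    code_for = {pair: i for i, pair in enumerate(pairs, start=1)}
    return [code_for[pair] for pair in zip(first, second)]


# Made-up membership values for five cases.
firstvar = [1, 1, 2, 2, 2]
secondvar = [1, 2, 1, 1, 3]
mynewvar = combine_memberships(firstvar, secondvar)
print(mynewvar)  # [1, 2, 3, 3, 4]
```

Only cells that actually occur receive a code, so empty cells of the crosstab do not leave gaps in the numbering.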



I am still not sure why you wouldn't use a TREE or CLUSTER procedure or
keep a higher number of clusters in TWOSTEP.

Art Kendall
Social Research Consultants




Frank H. Millard wrote:

> I use 2 step clustering.  I am looking for a way to use the output from an
> initial 2 step cluster, which is stored in data file, as a loop control for
> finding sub clusters for each of the original clusters.  I have a syntax
> file for the 2 step.  I can split the file by initial clusters, but only the
> AIM output is produced; no sub clusters for each initial cluster, each saved
> in new, respective, variable in data file.
> If I knew how to use syntax to replace the 2 step /SAVE and /AIM contents
> with new output names, on the fly,  that would solve my problem.
>
> Also, if you know how to get scripting user input into a syntax file that
> would be useful too.
>
> Thank you
> Frank Millard

Re: Looping by index from variable contents

Frank H. Millard
Thank you Art,
Indeed, the syntax doc says it will save a new variable TSC_x, but it does
not, unless I am doing something wrong.

I'll try your recommendations later.

Why keep a higher number of clusters from 2 step?  My goal, in using 2 step,
was to find clusters w/o specifying the number of clusters, so that the
number of clusters found would be "data driven".  So the initial 2 step
output is 4 clusters, and I get sub clusters for each of the initial clusters
by selecting cases in terms of the initial cluster output (e.g., select cases
where TSC_x=1 to 4) and running 2 step again.  Moreover, the initial
clusters are recoded from highest to lowest by median HH income such that
initial cluster 1 has the highest median HH income and cluster 4 the lowest
median HH income, and the recoded clusters are used in a discriminant
function analysis, using the same input variables as the 2 step; and, indeed,
4 distinct groups exist. Decomposing cluster 1, for example, produces 3
"Higher Income" sub clusters which are mostly urban and suburban; and
cluster 3 decomposition produces 4 "Lower Middle Class" sub clusters which
are mainly rural. You can see this at
http://health.state.ga.us/demographicprofiles/index.htm.

So my goal is to find "super types" and "sub types" within "super types"
without having to specify the number of "super types" or "sub types".


Re: Looping by index from variable contents

Richard Ristow
In reply to this post by Maguin, Eugene
At 08:53 AM 7/2/2008, Gene Maguin forwarded from Frank Millard:

>The solution I was looking was not straight forward in the docs.
>
>I want to control a loop based on the results of a two step cluster analysis.
>I want to find clusters within clusters.
>So, if the analysis produces 4 clusters, saved in a new variable
>"var1" by the analysis, I need to construct a loop control such that:
>Loop var1
>Cases are selected for each cluster in var1; the two step cluster
>syntax is modified to create new variable output for each cluster in var1;
>Run the two step cluster
>END

The LOOP/END LOOP construct works within a transformation program or
INPUT PROGRAM; you can't use it to re-run analyses. There are various
ways of creating a loop that runs multiple analyses, but the
currently favored one (providing you have SPSS 14+) is a loop within
a Python program.

However, ViAnn Beadle's suggestion, that you use SPLIT FILE by
'var1', is probably still easier and neater.
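
A rough sketch of the Python-loop idea, assuming the loop only fills in a syntax template once per cluster value. Everything named here is hypothetical: the helper `build_subcluster_syntax`, the analysis variables medinc and pcturban, the filter variable in_group, and the output names sub_1, sub_2, and so on. The /SAVE line copies the form quoted earlier in this thread, and the TWOSTEP CLUSTER subcommands are untested placeholders to be replaced with the ones from your own syntax file.

```python
def build_subcluster_syntax(template, cluster_values, group_var="var1"):
    """Fill the template once per cluster value and return the combined syntax."""
    blocks = []
    for k in cluster_values:
        # Each pass substitutes the grouping variable and the cluster number,
        # which also gives every run its own output variable name.
        blocks.append(template.format(group_var=group_var, k=k))
    return "\n".join(blocks)


# Hypothetical template; assumes the procedure honors FILTER (not verified here).
TEMPLATE = """\
COMPUTE in_group = ({group_var} = {k}).
FILTER BY in_group.
TWOSTEP CLUSTER
  /CONTINUOUS VARIABLES=medinc pcturban
  /NUMCLUSTERS AUTO 15 BIC
  /SAVE CLUSTER VARIABLE = sub_{k}.
FILTER OFF.
"""

syntax = build_subcluster_syntax(TEMPLATE, [1, 2, 3, 4])
print(syntax)
# Inside SPSS with the Python plug-in, this string could be handed to
# spss.Submit(syntax) instead of being printed.
```

Generating the text first and inspecting it before submitting makes it easy to check that each run writes to a distinct variable.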

Re: Looping by index from variable contents

Art Kendall
In reply to this post by Frank H. Millard
Wow, things have really progressed since I worked on a taxonomy of places
while at Census in the 70's!

What you are describing still sounds a lot like a TREE.


Art


Frank H. Millard wrote:

> Thank you Art,
> Indeed the syntax doc. Says it will save a new variable TSC_x, but it does
> not, unless I am doing something wrong.
>
> I'll try your recommendations later.
>
> Why keep a higher number of clusters from 2 step?  My goal, in using 2 step,
> was to find clusters w/o specifying the number of clusters, so that the
> number of clusters found would be "data driven".  So the initial 2 step
> output is 4 clusters, and by selecting cases in term of the initial cluster
> output (e.g., select cases where TSC_x=1 to 4) and running 2 step again to
> get sub clusters for each of the initial clusters.  Moreover, the intial
> clusters are recoded from highest to lowest by median HH income such that
> initial cluster 1 has the highest median hh income and cluster 4 the lowest
> median hh income, and the recoded clusters are used in a discriminant
> function analysis, using the same input variables as the 2 step; and, indeed
> 4 distinct groups exist. Decomposing, for example cluster 1, produces 3
> "Higher Income" sub clusters which are mostly urban and sub urban; and,
> cluster 3 decomposition produces 4  "Lower Middle Class" sub clusters which
> are mainly rural. You can see this at
> http://health.state.ga.us/demographicprofiles/index.htm.
>
> So my goal is to find "super types" and "sub types" within "super types"
> without having to specify the number of "super types" or "sub types".
>
>
> -----Original Message-----
> From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Art
> Kendall
> Sent: Wednesday, July 02, 2008 14:23
> To: [hidden email]
> Subject: Re: Looping by index from variable contents
>
>
>> The SAVE subcommand allows you to save cluster output to the active
>> dataset.
>> CLUSTER Save the cluster identification. The cluster number for each
>> case is saved; the user
>> may specify a variable name using the VARIABLE keyword, otherwise, it
>> is saved to
>> TSC_n, where n is a positive integer indicating the ordinal of the
>> SAVE operation
>> completed by this procedure in a given session.
>>
> Are you sure you are using /SAVE and still not getting TSC_1, TSC_2, etc?
>
> try setting an explicit stem variable to replace "TSC_"
> /SAVE CLUSTER VARIABLE =  firstvar
> AIM firstvar
>
> then try SPLIT FILE
> and specify
> /SAVE CLUSTER VARIABLE = secondvar
> then see what new variables start with "second"
>
>
> If you have two separate variables for the runs, crosstab the two
> membership variables (firstvar and secondvar).
> create a new variable with a different value for each cell in the
> crosstab and try that in AIM.
> AIM mynewvar.
>
>
>
> I am still not sure why you wouldn't use a TREE or CLUSTER procedure or
> keep a higher number of clusters in TWOSTEP.
>
> Art Kendall
> Social Research Consultants
>
>
>
>
> Frank H. Millard wrote:
>
>> I use 2 step clustering.  I am looking for a way to use the output from an
>> initial 2 step cluster, which is stored in data file, as a loop control
>>
> for
>
>> finding sub clusters for each of the original clusters.  I have a syntax
>> file for the 2 step.  I can split the file by initial clusters, but only
>>
> the
>
>> AIM output is produced; no sub clusters for each initial cluster, each
>>
> saved
>
>> in new, respective, variable in data file.
>> If I knew how to use syntax to replace the 2 step /SAVE and /AIM contents
>> with new output names, on the fly,  that would solve my problem.
>>
>> Also, if you know how to get scripting user input into a syntax file that
>> would be useful too.
>>
>> Thank you
>> Frank Millard
>>
>> -----Original Message-----
>> From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
>>
> Art
>
>> Kendall
>> Sent: Wednesday, July 02, 2008 12:02
>> To: [hidden email]
>> Subject: Re: Looping by index from variable contents
>>
>> Are you looking for a TREE or for hierarchical clustering?
>>
>> Art Kendall
>> Social Research Consultants
>>

Art Kendall
Social Research Consultants

Re: Looping by index from variable contents

Art Kendall
You are right about the DV.

WRT the large dendrogram, it is very common in hierarchical clustering
to "cut" the dendrogram, using "elbows" in the criterion to determine
the approximate "height" at which to cut the tree.

Did you check whether TWOSTEP produced TSC_n variables in the file on
the first pass but failed to do so under SPLIT FILE?

If TWOSTEP produces TSC_n only when there is no SPLIT FILE, perhaps you
could create as many files as there are clusters on the first run,
process each file separately, saving the cluster membership, and then
ADD FILES back together.
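
A rough sketch of that file-per-cluster idea, written as a macro loop. It assumes the first-pass membership was saved as firstvar (as suggested earlier in the thread), that there are 4 first-pass clusters, and that the full data sit in a file called all_cases.sav. The macro name !subclus, the file names, the variable subclus, and the inputs x1 x2 x3 are all made up for illustration, and the TWOSTEP subcommands would need to be replaced with the ones from your own run. The /SAVE line follows the form quoted from the syntax reference in this thread. Untested.

```spss
* Sketch only. all_cases.sav, cluster1.sav etc., subclus and x1 x2 x3 are hypothetical names.
* firstvar is assumed to hold the first-pass cluster membership.

DEFINE !subclus (nclus = !TOKENS(1))
!DO !i = 1 !TO !nclus
GET FILE = 'all_cases.sav'.
SELECT IF (firstvar = !i).
TWOSTEP CLUSTER
  /CONTINUOUS VARIABLES = x1 x2 x3
  /NUMCLUSTERS AUTO 15 BIC
  /SAVE CLUSTER VARIABLE = subclus.
SAVE OUTFILE = !QUOTE(!CONCAT('cluster', !i, '.sav')).
!DOEND
!ENDDEFINE.

* Run the second pass once for each of the 4 first-pass clusters.
!subclus nclus = 4.

* Stack the per-cluster files back into one dataset.
ADD FILES
  /FILE = 'cluster1.sav'
  /FILE = 'cluster2.sav'
  /FILE = 'cluster3.sav'
  /FILE = 'cluster4.sav'.
EXECUTE.
```

Because SELECT IF here is permanent within each reloaded copy, no SPLIT FILE is involved, which is the point of the workaround. The number of first-pass clusters is passed in by hand; the macro does not read it from the data.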

It would be interesting to know whether the clusters found on the second
pass match what you would have found had you retained that total number
of clusters on the first pass.  For example, if you found 3 clusters on
the first pass, then 2 from the first of those clusters and 3 each from
the second and third clusters, would that be the same as if you had
retained 8 on the first pass?
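
One way that comparison might be set up, as a sketch only. It assumes the first-pass and second-pass memberships are already in the active dataset as firstvar and subclus, and that no first-pass cluster has more than 9 sub clusters. The names subclus, nested, flat8 and x1 x2 x3 are hypothetical, and the TWOSTEP subcommands should mirror the ones used in the original runs. Untested.

```spss
* Sketch only. subclus, nested, flat8 and x1 x2 x3 are hypothetical names.

* One code per first-pass/second-pass combination, e.g. 12 = cluster 1, sub cluster 2.
COMPUTE nested = firstvar * 10 + subclus.
EXECUTE.

* Single pass with the total number of clusters fixed at 8.
TWOSTEP CLUSTER
  /CONTINUOUS VARIABLES = x1 x2 x3
  /NUMCLUSTERS FIXED = 8
  /SAVE CLUSTER VARIABLE = flat8.

* Cross-classify the two solutions.
CROSSTABS /TABLES = nested BY flat8.
```

If the two solutions agree, each row of the table should have nearly all of its cases in a single column. Cases spread across several columns would suggest the nested and the single-pass solutions carve up the data differently.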

Art Kendall
Social Research Consultants

Frank H. Millard wrote:

>
> Yes a tree, but the tree based classification tools I have require a
> dependent variable.
>
> So if I do not know, or have, a dependent variable, what do I do?
>
> Also, hierarchical clustering by case produces a really large
> dendrogram for the 4788 cases.
>
> I used the 2 step:
>
> 1.    Large number of cases
>
> 2.    Clusters need not be predefined.
>
>
>
> Thanks
>
> Frank
>
> *From:* Art Kendall [mailto:[hidden email]]
> *Sent:* Wednesday, July 02, 2008 17:52
> *To:* Frank H. Millard; SPSSX-L post
> *Subject:* Re: Looping by index from variable contents
>
>
>
> Wow things have really progressed since I worked on taxonomy of places
> while at Census in the 70's!
>
> What you are describing still sounds a lot like a TREE.
>
>
> Art
>
>
> Frank H. Millard wrote:
>
> Thank you Art,
> Indeed the syntax doc says it will save a new variable TSC_x, but it does
> not, unless I am doing something wrong.
>
> I'll try your recommendations later.
>
> Why keep a higher number of clusters from 2 step?  My goal, in using 2 step,
> was to find clusters w/o specifying the number of clusters, so that the
> number of clusters found would be "data driven".  The initial 2 step
> output is 4 clusters.  I then select cases by initial cluster (e.g.,
> select cases where TSC_x = 1, then 2, up to 4) and run 2 step again to
> get sub clusters for each of the initial clusters.  Moreover, the initial
> clusters are recoded from highest to lowest by median HH income such that
> initial cluster 1 has the highest median HH income and cluster 4 the
> lowest, and the recoded clusters are used in a discriminant function
> analysis, using the same input variables as the 2 step; and, indeed,
> 4 distinct groups exist.  Decomposing, for example, cluster 1 produces 3
> "Higher Income" sub clusters which are mostly urban and suburban;
> decomposing cluster 3 produces 4 "Lower Middle Class" sub clusters which
> are mainly rural. You can see this at
> http://health.state.ga.us/demographicprofiles/index.htm.
>
> So my goal is to find "super types" and "sub types" within "super types"
> without having to specify the number of "super types" or "sub types".
>
>
> -----Original Message-----
> From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Art
> Kendall
> Sent: Wednesday, July 02, 2008 14:23
> To: [hidden email]
> Subject: Re: Looping by index from variable contents
>
>
>
>     The SAVE subcommand allows you to save cluster output to the active
>     dataset.
>     CLUSTER Save the cluster identification. The cluster number for each
>     case is saved; the user may specify a variable name using the VARIABLE
>     keyword, otherwise, it is saved to TSC_n, where n is a positive integer
>     indicating the ordinal of the SAVE operation completed by this
>     procedure in a given session.
>
>
>
> Are you sure you are using /SAVE and still not getting TSC_1, TSC_2, etc?
>
> try setting an explicit stem variable to replace "TSC_"
> /SAVE CLUSTER VARIABLE =  firstvar
> AIM firstvar
>
> then try SPLIT FILE
> and specify
> /SAVE CLUSTER VARIABLE = secondvar
> then see what new variables start with "second"
>
>
> If you have two separate variables for the runs, crosstab the two
> membership variables (firstvar and secondvar).
> create a new variable with a different value for each cell in the
> crosstab and try that in AIM.
> AIM mynewvar.
>
>
>
> I am still not sure why you wouldn't use a TREE or CLUSTER procedure or
> keep a higher number of clusters in TWOSTEP.
>
> Art Kendall
> Social Research Consultants
>
>
>
>
> Frank H. Millard wrote:
>
>
>     I use 2 step clustering.  I am looking for a way to use the output from an
>     initial 2 step cluster, which is stored in data file, as a loop control for
>     finding sub clusters for each of the original clusters.  I have a syntax
>     file for the 2 step.  I can split the file by initial clusters, but only the
>     AIM output is produced; no sub clusters for each initial cluster, each saved
>     in new, respective, variable in data file.
>
>     If I knew how to use syntax to replace the 2 step /SAVE and /AIM contents
>     with new output names, on the fly,  that would solve my problem.
>
>     Also, if you know how to get scripting user input into a syntax file that
>     would be useful too.
>
>     Thank you
>
>     Frank Millard

Art Kendall
Social Research Consultants