SPSSX Discussion

Newbie problem with Clusters

Hello,
this is my first time on this list, and I have a newbie question about clusters.

In a large dataset, I have a nominal variable (cntrs) with different countries. Each case is marked by a
letter code, like US, UK, and so on.
Then I have a dummy variable (dmm), YES - NO, for every case in my dataset.

I'm interested in finding out whether countries can be clustered based on the number of "NO" values in the second variable
(dmm).

How do I do that? What type of cluster analysis should I use (if any), and how?

If someone has time to answer my questions, I will be glad.

thank you,
c.

p.s.: sorry for my poor english.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

David Marso

Re: Newbie problem with Clusters

Administrator

I have absolutely *NO* idea what you mean by "clustered" in this case.
With one variable all you will have is a count of NO responses for each country.
Please be specific about what you are looking for.
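
The per-country count described above could be tabulated with something like the following sketch. It assumes the variable names cntrs and dmm from the original post; the ROW cell option is added here only to show the share of NO within each country next to the raw count.

```spss
* Sketch: count and row percentage of YES/NO within each country.
* Assumes cntrs and dmm are the variable names in the active dataset.
CROSSTABS
  /TABLES=cntrs BY dmm
  /CELLS=COUNT ROW.
```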

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Art Kendall

Re: Newbie problem with Clusters

In reply to this post by c.

I am not sure, but it sounds like you just want SORT CASES.
Open a new instance of SPSS.
Copy the syntax below into a syntax window and run it. Compare the data before and after the SORT.
If this is not what you are asking, please post a small example of what you are trying to do.

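* Read a small example dataset, one row per case.
DATA LIST LIST /country (A2) dummy (A3).
BEGIN DATA
AA YES
BB NO
CC YES
DD NO
EE NO
FF YES
GG YES
END DATA.
* Sort so that cases with the same dummy value are adjacent.
SORT CASES BY dummy.
LIST.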

Art Kendall
Social Research Consultants


Jerabek Jindrich

Re: Newbie problem with Clusters

In reply to this post by c.

Hello,

You can cluster countries based on the number of Yes/No answers. First, aggregate your data file by cntrs:
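* Collapse to one row per country. SUM needs a numeric dmm.
* The sum equals the number of Yes answers only if Yes is coded 1 and No is coded 0 (an assumption).
* OUTFILE=* replaces the active dataset with the aggregated file.
AGGREGATE
  /OUTFILE=*
  /BREAK=cntrs
  /dmm_1 = SUM(dmm).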

This gives you, for each country, the number of Yes answers in the dmm variable.

Then use k-means clustering to create several groups of countries based on the number of Yes answers, e.g. 3 groups: "Few Yes, Average, Many ..."

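* k-means on the aggregated count: 3 clusters, at most 10 iterations.
* SAVE CLUSTER adds each country's cluster membership as a new variable.
* PRINT INITIAL shows the initial cluster centers.
QUICK CLUSTER dmm_1
  /MISSING=LISTWISE
  /CRITERIA=CLUSTER(3) MXITER(10) CONVERGE(0)
  /METHOD=KMEANS(NOUPDATE)
  /SAVE CLUSTER
  /PRINT INITIAL.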

Usually clustering is done with more than one variable, but it is possible to use your single dummy variable to create clusters.
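
If dmm is stored as a string ('YES'/'NO') rather than as a numeric 0/1 variable, SUM cannot be applied to it directly. The following is a minimal sketch of one way to prepare the data, counting NO answers as the original question asked. The names dmm_no and n_no are hypothetical, and the string values are assumed to be exactly 'YES' and 'NO' in upper case.

```spss
* Flag NO answers as 1 and YES answers as 0 in a new numeric variable.
RECODE dmm ('NO'=1) ('YES'=0) INTO dmm_no.
* One row per country with the number of NO answers.
AGGREGATE
  /OUTFILE=*
  /BREAK=cntrs
  /n_no = SUM(dmm_no).
```

Under these assumptions, QUICK CLUSTER would then be run on n_no instead of dmm_1.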

Is this what you are looking for?

Jindra

> ------------ Original message ------------
> From: c. <[hidden email]>
> Subject: Newbie problem with Clusters
> Date: 18.5.2012 15:34:26
> ----------------------------------------

Re: Re: Newbie problem with Clusters

In reply to this post by c.

Thank you! That's exactly what I was looking for.
Thank you for your kind and quick answer. I wish I had your experience with SPSS ;)

bye,
c.


David Marso

Re: Newbie problem with Clusters

Administrator

In reply to this post by Jerabek Jindrich

OTOH: unless the countries are equally sampled, the count might merely reflect disproportionate sampling.
In that case, one might consider the proportion of No's rather than simple counts.
See the PIN function under the AGGREGATE command in the manual.
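
One way the proportion idea might look in syntax, as a sketch rather than a definitive recipe. It assumes dmm is numeric with 1 = Yes and 0 = No (the coding implied by the SUM example earlier in the thread); pct_no and n_cases are hypothetical names. PIN returns the percentage of cases in each break group whose value lies between the two limits, inclusive, so PIN(dmm, 0, 0) is the percentage of No's.

```spss
* One row per country: percentage of No answers and the number of cases.
AGGREGATE
  /OUTFILE=*
  /BREAK=cntrs
  /pct_no = PIN(dmm, 0, 0)
  /n_cases = N.
* Cluster countries on the percentage instead of the raw count.
QUICK CLUSTER pct_no
  /MISSING=LISTWISE
  /CRITERIA=CLUSTER(3) MXITER(10) CONVERGE(0)
  /METHOD=KMEANS(NOUPDATE)
  /SAVE CLUSTER
  /PRINT INITIAL.
```

Keeping n_cases in the aggregated file makes it easy to spot countries whose percentage rests on very few cases.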