finding the modal value in a string variable


finding the modal value in a string variable

Hoover, Matthew

Hello SPSS experts,

 

This is probably an easy question, but I can’t seem to figure it out. I am working with a dataset that includes a variable for a business code (NAICS) and a variable for a product or service code. Both of these variables are strings. I’d like to aggregate the file with the business code as the break variable, taking the modal value of the product or service code (the value that occurs most often within each business code). SPSS seems to allow me to take only the first, last, minimum, or maximum. Is there some easy way I can get the most frequently occurring value (the mode) of a string variable?

 

Thanks!

 

Matt

 

Matthew Hoover

Research Analyst

Donahue Institute: CareerWorks Center

34 School Street

Brockton, MA 02301

Voice: 1-508-513-3444

Fax: 1-508-513-3450

 


Re: finding the modal value in a string variable

Maguin, Eugene
Matt,

Three steps, I think. Step 1 is to aggregate by business code and
product/service code and count records. The result is one record for each
business code-product/service code combination, holding the number of
records with that combination (call this variable NumRecs); this is your
distribution. Step 2 is to sort by business code and NumRecs: ascending on
business code and descending on NumRecs. Now the file has the modal
product/service code as the first record in the set of records with the same
business code. Step 3 is to run this bit of code to flag the first record
in each business code group.

Compute pick=0.
If ($casenum eq 1 or BusinessCode ne lag(BusinessCode)) pick=1.

This will do it.
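
Put together, the three steps might look something like the sketch below. This is only an illustration under assumptions: ProductCode is a placeholder name for the product/service code variable (the thread never names it), and the final SELECT IF is an optional extra that reduces the file to the flagged records. The EXECUTE before SELECT IF is there as a precaution, so that the LAG comparison is evaluated against the full sorted file before any cases are dropped.

```spss
* Sketch only: ProductCode is a placeholder name for the product/service code variable.
* Step 1: one record per BusinessCode-ProductCode combination, with its record count.
AGGREGATE
  /OUTFILE=*
  /BREAK=BusinessCode ProductCode
  /NumRecs=N.

* Step 2: within each business code, put the most frequent combination first.
SORT CASES BY BusinessCode (A) NumRecs (D).

* Step 3: flag the first record of each business code group.
COMPUTE pick=0.
IF ($casenum EQ 1 OR BusinessCode NE LAG(BusinessCode)) pick=1.
EXECUTE.

* Optional: keep only the flagged (modal) records.
SELECT IF (pick EQ 1).
EXECUTE.
```

Note that AGGREGATE with OUTFILE=* replaces the active dataset, so save the original file first if you still need it.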

Gene Maguin




=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: finding the modal value in a string variable

John F Hall
In reply to this post by Hoover, Matthew
If you have a huge number of values, try something like this to see what it throws up (substitute your var name(s) in <varlist> )
 
FREQUENCIES VARIABLES=<varlist> /FORMAT=NOTABLE /STATISTICS=MODE.
----- Original Message -----
Sent: Friday, June 18, 2010 4:51 PM
Subject: finding the modal value in a string variable


Re: finding the modal value in a string variable

Jon K Peck
In reply to this post by Hoover, Matthew
Gene's suggestion should work, but I have two comments.

The mode need not be unique, and ties need not be between nearby values, so you need to decide how you want to handle these cases.
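
One way to see where ties occur, before deciding how to handle them, is sketched below. It assumes the file has already been aggregated to one record per business code and product/service code combination with the count in NumRecs (as in Gene's first step); MaxRecs, isModal and NumModes are hypothetical variable names introduced only for this illustration.

```spss
* Sketch only: MaxRecs, isModal and NumModes are hypothetical variable names.
* Assumes one record per BusinessCode and product/service code pair, with its count in NumRecs.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=BusinessCode
  /MaxRecs=MAX(NumRecs).

* isModal is 1 for every code whose count equals the group maximum.
COMPUTE isModal=(NumRecs EQ MaxRecs).

* NumModes greater than 1 marks a business code whose mode is tied.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=BusinessCode
  /NumModes=SUM(isModal).
EXECUTE.
```

Business codes with NumModes greater than 1 have more than one modal value, so picking the first record after sorting would choose among them arbitrarily.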

Second, another approach would be to combine split files for the groups with OMS capture of an output table that contains the mode (Summarize and Examine might be candidates - we don't have a Blackberry version of Statistics yet). Use that table to match to the original dataset as needed.

Jon Peck
-----------------
Sent from my BlackBerry Handheld.


----- Original Message -----
From: Gene Maguin [[hidden email]]
Sent: 06/18/2010 11:39 AM AST
To: [hidden email]
Subject: Re: [SPSSX-L] finding the modal value in a string variable


