Rank Variables Equivalent for String Variables

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

Rank Variables Equivalent for String Variables

ariel barak

Hi All,

I have what should be an easy problem. I am interested in keeping records sorted by multiple string variables, last_name and first_name, and then creating a record number that starts at 1 and increases only if the last_name and first_name from the record above are different and otherwise stays the same. In the dummy data below, the record number would be 1, 2, 2, 2, 3, 3. If I were to sort on the client_no, I could use the Rank command but I am interested in sorting the data by the string variables in order to use the conditional format functionality in an MS Access report so that each group of record numbers will alternate from being white (no fill) and gray based on whether the record number is odd or not.
 
Thanks,
Ariel Barak

DATA LIST LIST /client_no (F11.0) last_name (A15) first_name (A15).
BEGIN DATA
773821 Doe John
4072419 Johnson Joe
4072419 Johnson Joe
4072419 Johnson Joe
771763 Smith Steve
771763 Smith Steve
END DATA.
Reply | Threaded
Open this post in threaded view
|

Re: Rank Variables Equivalent for String Variables

Rick Oliver-3
You could concatenate first and last name and use autorecode:

DATA LIST LIST /client_no (F11.0) last_name (A15) first_name (A15).
BEGIN DATA
773821 Doe John
4072419 Johnson Joe
4072419 Johnson Joe
4072419 Johnson Joe
771763 Smith Steve
771763 Smith Steve
END DATA.
string fullname (a30).
compute fullname=concat(ltrim(last_name), ltrim(first_name)).
autorecode fullname /into rankvar.
delete variables fullname.




From: Ariel Barak <[hidden email]>
To: [hidden email]
Date: 09/10/2010 10:59 AM
Subject: Rank Variables Equivalent for String Variables
Sent by: "SPSSX(r) Discussion" <[hidden email]>





Hi All,

I have what should be an easy problem. I am interested in keeping records sorted by multiple string variables, last_name and first_name, and then creating a record number that starts at 1 and increases only if the last_name and first_name from the record above are different and otherwise stays the same. In the dummy data below, the record number would be 1, 2, 2, 2, 3, 3. If I were to sort on the client_no, I could use the Rank command but I am interested in sorting the data by the string variables in order to use the conditional format functionality in an MS Access report so that each group of record numbers will alternate from being white (no fill) and gray based on whether the record number is odd or not.
 
Thanks,
Ariel Barak

DATA LIST LIST /client_no (F11.0) last_name (A15) first_name (A15).
BEGIN DATA
773821 Doe John
4072419 Johnson Joe
4072419 Johnson Joe
4072419 Johnson Joe
771763 Smith Steve
771763 Smith Steve
END DATA.


Reply | Threaded
Open this post in threaded view
|

Re: Rank Variables Equivalent for String Variables

Bruce Weaver
Administrator
In reply to this post by ariel barak
ariel barak wrote
Hi All,
I have what should be an easy problem. I am interested in keeping records
sorted by multiple string variables, last_name and first_name, and then
creating a record number that starts at 1 and increases only if the
last_name and first_name from the record above are different and otherwise
stays the same. In the dummy data below, the record number would be 1, 2, 2,
2, 3, 3. If I were to sort on the client_no, I could use the Rank command
but I am interested in sorting the data by the string variables in order to
use the conditional format functionality in an MS Access report so that each
group of record numbers will alternate from being white (no fill) and gray
based on whether the record number is odd or not.

Thanks,
Ariel Barak

DATA LIST LIST /client_no (F11.0) last_name (A15) first_name (A15).
BEGIN DATA
773821 Doe John
4072419 Johnson Joe
4072419 Johnson Joe
4072419 Johnson Joe
771763 Smith Steve
771763 Smith Steve
END DATA.
sort cases by last_name first_name.
do if $casenum EQ 1.
-  compute recnum = 1.
else.
-  compute #newid = NOT (last_name EQ lag(last_name) and first_name EQ lag(first_name)).
-  compute recnum = lag(recnum) + #newid.
end if.
format recnum (f5.0).
list.

OUTPUT:

  client_no last_name       first_name      recnum

     773821 Doe             John                 1
    4072419 Johnson         Joe                  2
    4072419 Johnson         Joe                  2
    4072419 Johnson         Joe                  2
     771763 Smith           Steve                3
     771763 Smith           Steve                3

Number of cases read:  6    Number of cases listed:  6
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
Reply | Threaded
Open this post in threaded view
|

Re: Rank Variables Equivalent for String Variables

Art Kendall
In reply to this post by ariel barak
  see if this is what you want. The /print option gives you a pace to
check for closely spelled variations on a name (JOE, JOS., JOEY) etc.
this syntax should be readable.  You might possibly be able to lose some
person-efficiency and gain some machine-efficiency by using sort and lags.
DATA LIST LIST /client_no (F11.0) last_name (A15) first_name (A15).
BEGIN DATA
773821 Doe John
4072419 Johnson Joe
4072419 Johnson Joe
4072419 Johnson Joe
4072419 Johnson Joey
4072419 Johnson Jos.
4072419 Johnsen Joe
4072419 Johnsen Joey
771763 Smith Steve
771763 Smith Steve
END DATA.
string wholename(a30).
compute wholename = concat(upcase(last_name),UPCASE(first_name)).
autorecode variables= first_name last_name wholename /into fnames lnames
nameset/print.
list.

Art Kendall
Social research Consultants

On 9/10/2010 11:57 AM, Ariel Barak wrote:

>
> Hi All,
>
> I have what should be an easy problem. I am interested in keeping
> records sorted by multiple string variables, last_name and first_name,
> and then creating a record number that starts at 1 and increases only
> if the last_name and first_name from the record above are different
> and otherwise stays the same. In the dummy data below, the record
> number would be 1, 2, 2, 2, 3, 3. If I were to sort on the client_no,
> I could use the Rank command but I am interested in sorting the data
> by the string variables in order to use the conditional format
> functionality in an MS Access report so that each group of record
> numbers will alternate from being white (no fill) and gray based on
> whether the record number is odd or not.
> Thanks,
> Ariel Barak
>
> DATA LIST LIST /client_no (F11.0) last_name (A15) first_name (A15).
> BEGIN DATA
> 773821 Doe John
> 4072419 Johnson Joe
> 4072419 Johnson Joe
> 4072419 Johnson Joe
> 771763 Smith Steve
> 771763 Smith Steve
> END DATA.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Art Kendall
Social Research Consultants
Reply | Threaded
Open this post in threaded view
|

Re: Rank Variables Equivalent for String Variables

David Marso
Administrator
In reply to this post by ariel barak
On Fri, 10 Sep 2010 09:19:35 -0700, Bruce Weaver <[hidden email]>
wrote:

>ariel barak wrote:
>>
>> Hi All,
>> I have what should be an easy problem. I am interested in keeping records
>> sorted by multiple string variables, last_name and first_name, and then
>> creating a record number that starts at 1 and increases only if the
>> last_name and first_name from the record above are different and otherwise
>> stays the same. In the dummy data below, the record number would be 1, 2,
>> 2,
>> 2, 3, 3. If I were to sort on the client_no, I could use the Rank command
>> but I am interested in sorting the data by the string variables in order
>> to
>> use the conditional format functionality in an MS Access report so that
>> each
>> group of record numbers will alternate from being white (no fill) and gray
>> based on whether the record number is odd or not.
>>
>> Thanks,
>> Ariel Barak
>>
>> DATA LIST LIST /client_no (F11.0) last_name (A15) first_name (A15).
>> BEGIN DATA
>> 773821 Doe John
>> 4072419 Johnson Joe
>> 4072419 Johnson Joe
>> 4072419 Johnson Joe
>> 771763 Smith Steve
>> 771763 Smith Steve
>> END DATA.
>>
>>
>
>sort cases by last_name first_name.
>do if $casenum EQ 1.
>-  compute recnum = 1.
>else.
>-  compute #newid = NOT (last_name EQ lag(last_name) and first_name EQ
>lag(first_name)).
>-  compute recnum = lag(recnum) + #newid.
>end if.
>format recnum (f5.0).
>list.
>
>OUTPUT:
>
>  client_no last_name       first_name      recnum
>
>     773821 Doe             John                 1
>    4072419 Johnson         Joe                  2
>    4072419 Johnson         Joe                  2
>    4072419 Johnson         Joe                  2
>     771763 Smith           Steve                3
>     771763 Smith           Steve                3
>
>Number of cases read:  6    Number of cases listed:  6
>
>
>-----
>--
>Bruce Weaver
>[hidden email]
>http://sites.google.com/a/lakeheadu.ca/bweaver/
>
>"When all else fails, RTFM."
>
Close, but I would also account for multiple clients with the same
first/last names.  NOTE, this counting code can be smashed into an almost
equally comprehensible two liner ;-)
Variable names modified to account for my prehistoric flint and bear skin
version of SPSS 11.5 ;-(
------------
DATA LIST LIST /client(F11.0) lastname (A15) firsname (A15).
BEGIN DATA
 773821 Doe John
 4072419 Johnson Joe
 4072419 Johnson Joe
 4072419 Johnson Joe
 771763 Smith Steve
 771763 Smith Steve
 333333 Smith Steve
 333333 Smith Steve
END DATA.
sort cases by lastname firsname client.

COMPUTE #recnum = ($casenum NE 1) * #recnum
                +   NOT (  lastname EQ LAG(lastname)
                       AND firsname EQ LAG(firsname)
                       AND client   EQ LAG(client)
                         ) .
COMPUTE RECNUM = #recnum.
format recnum (f5.0).
list.


     CLIENT LASTNAME        FIRSNAME        RECNUM

     773821 Doe             John                 1
    4072419 Johnson         Joe                  2
    4072419 Johnson         Joe                  2
    4072419 Johnson         Joe                  2
     333333 Smith           Steve                3
     333333 Smith           Steve                3
     771763 Smith           Steve                4
     771763 Smith           Steve                4


Number of cases read:  8    Number of cases listed:  8

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|

Re: Rank Variables Equivalent for String Variables

Bruce Weaver
Administrator
Commenting on the solution I posted, David said:

David Marso wrote
Close, but I would also account for multiple clients with the same
first/last names.  
Good point, David.  This could also be done with a slight modification to the AUTORECODE solution Art posted (which I quite liked):  i.e., CLIENT_NO would have to be appended to his "wholename" variable.  Something like:

string wholename(a40).
compute wholename = concat(upcase(last_name),UPCASE(first_name),string(client_no,f11.0).
autorecode variables= wholename /into recnum /print.

--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
Reply | Threaded
Open this post in threaded view
|

Re: Rank Variables Equivalent for String Variables

Richard Ristow
In reply to this post by David Marso
At 08:58 AM 9/12/2010, David Marso wrote:

On Fri, 10 Sep 2010 09:19:35 -0700, Bruce Weaver <[hidden email]> wrote:

sort cases by last_name first_name.
do if $casenum EQ 1.
-  compute recnum = 1.
else.
-  compute #newid = NOT
        (last_name  EQ lag(last_name)
     and first_name EQ lag(first_name)).
-  compute recnum = lag(recnum) + #newid.
end if.
format recnum (f5.0).

I would also account for multiple clients with the same first/last names:

sort cases by lastname firsname client.
COMPUTE #recnum = ($casenum NE 1) * #recnum
                +   NOT (  lastname EQ LAG(lastname)
                       AND firsname EQ LAG(firsname)
                       AND client   EQ LAG(client)
                         ) .
COMPUTE RECNUM = #recnum.
format recnum (f5.0).

However, with a multiple-variable key, the LAG logic can be awkward to write and to read. I suggest the following variation, though leaving variable "New_Person" in the file may be undesirable. (And, code not tested.)

sort cases by lastname firsname client.
ADD FILES
   /FILE  =*
   /BY    lastname firsname client
   /FIRST=NewPerson.

COMPUTE RecNum = SUM(NewPerson,LAG(RecNum)).
format recnum (f5.0).

In the COMPUTE statement, using "SUM" rather than "+" is a 'cute' trick. It gives the correct result in case 1, where "LAG(RecNum)" is missing.

Like many cute tricks, this may be confusing. One might prefer something like

DO IF $CASENUM EQ 1.
.  COMPUTE RecNum = 1.
ELSE.
.  COMPUTE RecNum = NewPerson +LAG(RecNum).
END IF.

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD