select variable name with greatest value

classic Classic list List threaded Threaded
14 messages Options
Reply | Threaded
Open this post in threaded view
|

select variable name with greatest value

Peter Spangler
I would like to select the variable name for each case based on the highest value such that:

id v1 v2 v3 newVar
1  1   5   6      v3
2   2  10  9     v2
3   6   7   5     v2
4   5   2    1    v1

I am beginning with

VECTOR v =v1 to v3.
LOOP #I = v1 TO v3.
  COMPUTE MX=MAX(v1 to v3). 
  END IF.
END LOOP.

Reply | Threaded
Open this post in threaded view
|

Re: select variable name with greatest value

Rich Ulrich

COMPUTE MX=MAX(v1 to v3).
DO REPEAT test= v1 to v3/literal= 'v1', 'v2', 'v3' .
IF  ( MX= test )  newVar= literal .
END REPEAT.

--
Rich Ulrich


Date: Tue, 10 Sep 2013 13:47:13 -0700
From: [hidden email]
Subject: select variable name with greatest value
To: [hidden email]

I would like to select the variable name for each case based on the highest value such that:

id v1 v2 v3 newVar
1  1   5   6      v3
2   2  10  9     v2
3   6   7   5     v2
4   5   2    1    v1

I am beginning with

VECTOR v =v1 to v3.
LOOP #I = v1 TO v3.
  COMPUTE MX=MAX(v1 to v3). 
  END IF.
END LOOP.

Reply | Threaded
Open this post in threaded view
|

Re: select variable name with greatest value

Bruce Weaver
Administrator
While Rich was posting that, I was also thinking about using DO-REPEAT.  Here's my version:

STRING NewVar(a2).
DO REPEAT v = v1 to v3 / New = "V1" "V2" "V3".
-  IF (Max(v1 to v3) EQ v) NewVar = New.
END REPEAT.
LIST.

Output:

ID V1 V2 V3 NewVar
 1  1  5  6 V3
 2  2 10  9 V2
 3  6  7  5 V2
 4  5  2  1 V1

However, if this was just a simplified example and your actual data file has a lot more than 3 variables to look at, there's probably a better way.  I.e., you don't really want to list "V1" "V2" ... "V946" on your DO REPEAT command.  ;-)

HTH.


Rich Ulrich wrote
COMPUTE MX=MAX(v1 to v3).
DO REPEAT test= v1 to v3/literal= 'v1', 'v2', 'v3' .
IF  ( MX= test )  newVar= literal .
END REPEAT.

--
Rich Ulrich

Date: Tue, 10 Sep 2013 13:47:13 -0700
From: [hidden email]
Subject: select variable name with greatest value
To: [hidden email]

I would like to select the variable name for each case based on the highest value such that:
id v1 v2 v3 newVar1  1   5   6      v32   2  10  9     v23   6   7   5     v2
4   5   2    1    v1
I am beginning with
VECTOR v =v1 to v3.LOOP #I = v1 TO v3.  COMPUTE MX=MAX(v1 to v3).
  END IF.
END LOOP.
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
Reply | Threaded
Open this post in threaded view
|

Re: select variable name with greatest value

Bruce Weaver
Administrator
p.s. - You haven't said what you want to do if there are ties.


Bruce Weaver wrote
While Rich was posting that, I was also thinking about using DO-REPEAT.  Here's my version:

STRING NewVar(a2).
DO REPEAT v = v1 to v3 / New = "V1" "V2" "V3".
-  IF (Max(v1 to v3) EQ v) NewVar = New.
END REPEAT.
LIST.

Output:

ID V1 V2 V3 NewVar
 1  1  5  6 V3
 2  2 10  9 V2
 3  6  7  5 V2
 4  5  2  1 V1

However, if this was just a simplified example and your actual data file has a lot more than 3 variables to look at, there's probably a better way.  I.e., you don't really want to list "V1" "V2" ... "V946" on your DO REPEAT command.  ;-)

HTH.


Rich Ulrich wrote
COMPUTE MX=MAX(v1 to v3).
DO REPEAT test= v1 to v3/literal= 'v1', 'v2', 'v3' .
IF  ( MX= test )  newVar= literal .
END REPEAT.

--
Rich Ulrich

Date: Tue, 10 Sep 2013 13:47:13 -0700
From: [hidden email]
Subject: select variable name with greatest value
To: [hidden email]

I would like to select the variable name for each case based on the highest value such that:
id v1 v2 v3 newVar1  1   5   6      v32   2  10  9     v23   6   7   5     v2
4   5   2    1    v1
I am beginning with
VECTOR v =v1 to v3.LOOP #I = v1 TO v3.  COMPUTE MX=MAX(v1 to v3).
  END IF.
END LOOP.
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
Reply | Threaded
Open this post in threaded view
|

Re: select variable name with greatest value

Peter Spangler
Just getting back. 1)  Yes, this is a simplified version with more than 3 vars :) 
                           2)  Ties to other variables? For example, v1a to v3a and v1b to v3b? 


On Tue, Sep 10, 2013 at 2:59 PM, Bruce Weaver <[hidden email]> wrote:
p.s. - You haven't said what you want to do if there are ties.



Bruce Weaver wrote
> While Rich was posting that, I was also thinking about using DO-REPEAT.
> Here's my version:
>
> STRING NewVar(a2).
> DO REPEAT v = v1 to v3 / New = "V1" "V2" "V3".
> -  IF (Max(v1 to v3) EQ v) NewVar = New.
> END REPEAT.
> LIST.
>
> Output:
>
> ID V1 V2 V3 NewVar
>  1  1  5  6 V3
>  2  2 10  9 V2
>  3  6  7  5 V2
>  4  5  2  1 V1
>
> However, if this was just a simplified example and your actual data file
> has a lot more than 3 variables to look at, there's probably a better way.
> I.e., you don't really want to list "V1" "V2" ... "V946" on your DO REPEAT
> command.  ;-)
>
> HTH.
>
> Rich Ulrich wrote
>> COMPUTE MX=MAX(v1 to v3).
>> DO REPEAT test= v1 to v3/literal= 'v1', 'v2', 'v3' .
>> IF  ( MX= test )  newVar= literal .
>> END REPEAT.
>>
>> --
>> Rich Ulrich
>>
>> Date: Tue, 10 Sep 2013 13:47:13 -0700
>> From:

>> pspangler@

>> Subject: select variable name with greatest value
>> To:

>> SPSSX-L@.UGA

>>
>> I would like to select the variable name for each case based on the
>> highest value such that:
>> id v1 v2 v3 newVar1  1   5   6      v32   2  10  9     v23   6   7   5
>> v2
>> 4   5   2    1    v1
>> I am beginning with
>> VECTOR v =v1 to v3.LOOP #I = v1 TO v3.  COMPUTE MX=MAX(v1 to v3).
>>   END IF.
>> END LOOP.





-----
--
Bruce Weaver
[hidden email]
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

NOTE: My Hotmail account is not monitored regularly.
To send me an e-mail, please use the address shown above.

--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/select-variable-name-with-greatest-value-tp5721890p5721893.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Reply | Threaded
Open this post in threaded view
|

Re: select variable name with greatest value

Bruce Weaver
Administrator
Ties as in tied scores -- as in the new case I've added to your original data below.

id v1 v2 v3 newVar
1  1   5   6      v3
2   2  10  9     v2
3   6   7   5     v2
4   5   2    1    v1
5   4   2    4    v1 or v3?  Or maybe "v1, v3"?  Not clear what you want here!


Peter Spangler wrote
Just getting back. 1)  Yes, this is a simplified version with more than 3
vars :)
                           2)  Ties to other variables? For example, v1a to
v3a and v1b to v3b?


On Tue, Sep 10, 2013 at 2:59 PM, Bruce Weaver <[hidden email]>wrote:

> p.s. - You haven't said what you want to do if there are ties.
>
>
>
> Bruce Weaver wrote
> > While Rich was posting that, I was also thinking about using DO-REPEAT.
> > Here's my version:
> >
> > STRING NewVar(a2).
> > DO REPEAT v = v1 to v3 / New = "V1" "V2" "V3".
> > -  IF (Max(v1 to v3) EQ v) NewVar = New.
> > END REPEAT.
> > LIST.
> >
> > Output:
> >
> > ID V1 V2 V3 NewVar
> >  1  1  5  6 V3
> >  2  2 10  9 V2
> >  3  6  7  5 V2
> >  4  5  2  1 V1
> >
> > However, if this was just a simplified example and your actual data file
> > has a lot more than 3 variables to look at, there's probably a better
> way.
> > I.e., you don't really want to list "V1" "V2" ... "V946" on your DO
> REPEAT
> > command.  ;-)
> >
> > HTH.
> >
> > Rich Ulrich wrote
> >> COMPUTE MX=MAX(v1 to v3).
> >> DO REPEAT test= v1 to v3/literal= 'v1', 'v2', 'v3' .
> >> IF  ( MX= test )  newVar= literal .
> >> END REPEAT.
> >>
> >> --
> >> Rich Ulrich
> >>
> >> Date: Tue, 10 Sep 2013 13:47:13 -0700
> >> From:
>
> >> pspangler@
>
> >> Subject: select variable name with greatest value
> >> To:
>
> >> SPSSX-L@.UGA
>
> >>
> >> I would like to select the variable name for each case based on the
> >> highest value such that:
> >> id v1 v2 v3 newVar1  1   5   6      v32   2  10  9     v23   6   7   5
> >> v2
> >> 4   5   2    1    v1
> >> I am beginning with
> >> VECTOR v =v1 to v3.LOOP #I = v1 TO v3.  COMPUTE MX=MAX(v1 to v3).
> >>   END IF.
> >> END LOOP.
>
>
>
>
>
> -----
> --
> Bruce Weaver
> [hidden email]
> http://sites.google.com/a/lakeheadu.ca/bweaver/
>
> "When all else fails, RTFM."
>
> NOTE: My Hotmail account is not monitored regularly.
> To send me an e-mail, please use the address shown above.
>
> --
> View this message in context:
> http://spssx-discussion.1045642.n5.nabble.com/select-variable-name-with-greatest-value-tp5721890p5721893.html
> Sent from the SPSSX Discussion mailing list archive at Nabble.com.
>
> =====================
> To manage your subscription to SPSSX-L, send a message to
> [hidden email] (not to SPSSX-L), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD
>
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
Reply | Threaded
Open this post in threaded view
|

Re: select variable name with greatest value

Peter Spangler
These values are more distinct. They are transnational data associated with associated with the category (v1 to v3). So not likely to have issue with ties.

Say I had 946 variables, would I get rid of DO REPEAT for another command?


On Tue, Sep 10, 2013 at 3:14 PM, Bruce Weaver <[hidden email]> wrote:
Ties as in tied scores -- as in the new case I've added to your original data
below.

id v1 v2 v3 newVar
1  1   5   6      v3
2   2  10  9     v2
3   6   7   5     v2
4   5   2    1    v1
5   4   2    4    v1 or v3?  Or maybe "v1, v3"?  Not clear what you want
here!



Peter Spangler wrote
> Just getting back. 1)  Yes, this is a simplified version with more than 3
> vars :)
>                            2)  Ties to other variables? For example, v1a
> to
> v3a and v1b to v3b?
>
>
> On Tue, Sep 10, 2013 at 2:59 PM, Bruce Weaver <

> bruce.weaver@

> >wrote:
>
>> p.s. - You haven't said what you want to do if there are ties.
>>
>>
>>
>> Bruce Weaver wrote
>> > While Rich was posting that, I was also thinking about using DO-REPEAT.
>> > Here's my version:
>> >
>> > STRING NewVar(a2).
>> > DO REPEAT v = v1 to v3 / New = "V1" "V2" "V3".
>> > -  IF (Max(v1 to v3) EQ v) NewVar = New.
>> > END REPEAT.
>> > LIST.
>> >
>> > Output:
>> >
>> > ID V1 V2 V3 NewVar
>> >  1  1  5  6 V3
>> >  2  2 10  9 V2
>> >  3  6  7  5 V2
>> >  4  5  2  1 V1
>> >
>> > However, if this was just a simplified example and your actual data
>> file
>> > has a lot more than 3 variables to look at, there's probably a better
>> way.
>> > I.e., you don't really want to list "V1" "V2" ... "V946" on your DO
>> REPEAT
>> > command.  ;-)
>> >
>> > HTH.
>> >
>> > Rich Ulrich wrote
>> >> COMPUTE MX=MAX(v1 to v3).
>> >> DO REPEAT test= v1 to v3/literal= 'v1', 'v2', 'v3' .
>> >> IF  ( MX= test )  newVar= literal .
>> >> END REPEAT.
>> >>
>> >> --
>> >> Rich Ulrich
>> >>
>> >> Date: Tue, 10 Sep 2013 13:47:13 -0700
>> >> From:
>>
>> >> pspangler@
>>
>> >> Subject: select variable name with greatest value
>> >> To:
>>
>> >> SPSSX-L@.UGA
>>
>> >>
>> >> I would like to select the variable name for each case based on the
>> >> highest value such that:
>> >> id v1 v2 v3 newVar1  1   5   6      v32   2  10  9     v23   6   7   5
>> >> v2
>> >> 4   5   2    1    v1
>> >> I am beginning with
>> >> VECTOR v =v1 to v3.LOOP #I = v1 TO v3.  COMPUTE MX=MAX(v1 to v3).
>> >>   END IF.
>> >> END LOOP.
>>
>>
>>
>>
>>
>> -----
>> --
>> Bruce Weaver
>>

> bweaver@

>> http://sites.google.com/a/lakeheadu.ca/bweaver/
>>
>> "When all else fails, RTFM."
>>
>> NOTE: My Hotmail account is not monitored regularly.
>> To send me an e-mail, please use the address shown above.
>>
>> --
>> View this message in context:
>> http://spssx-discussion.1045642.n5.nabble.com/select-variable-name-with-greatest-value-tp5721890p5721893.html
>> Sent from the SPSSX Discussion mailing list archive at Nabble.com.
>>
>> =====================
>> To manage your subscription to SPSSX-L, send a message to
>>

> LISTSERV@.UGA

>  (not to SPSSX-L), with no body text except the
>> command. To leave the list, send the command
>> SIGNOFF SPSSX-L
>> For a list of commands to manage subscriptions, send the command
>> INFO REFCARD
>>





-----
--
Bruce Weaver
[hidden email]
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

NOTE: My Hotmail account is not monitored regularly.
To send me an e-mail, please use the address shown above.

--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/select-variable-name-with-greatest-value-tp5721890p5721895.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Reply | Threaded
Open this post in threaded view
|

Re: select variable name with greatest value

David Marso
Administrator
In reply to this post by Rich Ulrich
Of course if two variables are tied it will flag the LAST one.
If this is a problem then you will need to adapt the code.
---
Rich Ulrich wrote
COMPUTE MX=MAX(v1 to v3).
DO REPEAT test= v1 to v3/literal= 'v1', 'v2', 'v3' .
IF  ( MX= test )  newVar= literal .
END REPEAT.

--
Rich Ulrich

Date: Tue, 10 Sep 2013 13:47:13 -0700
From: [hidden email]
Subject: select variable name with greatest value
To: [hidden email]

I would like to select the variable name for each case based on the highest value such that:
id v1 v2 v3 newVar1  1   5   6      v32   2  10  9     v23   6   7   5     v2
4   5   2    1    v1
I am beginning with
VECTOR v =v1 to v3.LOOP #I = v1 TO v3.  COMPUTE MX=MAX(v1 to v3).
  END IF.
END LOOP.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|

Re: select variable name with greatest value

Richard Ristow
In reply to this post by Peter Spangler
At 04:47 PM 9/10/2013, Peter Spangler wrote:

>I would like to select the variable name for each case based on the
>highest value such that:
>
>id v1  v2  v3  newVar
>1   1   5   6    v3
>2   2  10   9  v2
>3   6   7   5  v2
>4   5   2   1    v1

And you've written elsewhere that tied largest values are not a
problem; but that you have nearer 900 variables than 3, so the code
should be scaleable.

Here's code that works by unrolling the file into one record per ID
per variable. It doesn't need to be changed when more variables are
added, except to change the variable list in VARSTOCASES.

|-----------------------------|---------------------------|
|Output Created               |10-SEP-2013 21:18:32       |
|-----------------------------|---------------------------|
  [Input]
id v1 v2 v3

  1  1  5  6
  2  2 10  9
  3  6  7  5
  4  5  2  1

Number of cases read:  4    Number of cases listed:  4


DATASET COPY     Unroll.
DATASET ACTIVATE Unroll WINDOW=FRONT.
VARSTOCASES
   /MAKE Value FROM v1 TO v3
   /INDEX = VName(Value)
   /KEEP =  id
   /NULL = KEEP.

SORT CASES BY ID (A) Value (D).
AGGREGATE OUTFILE=*
    /BREAK=ID
    /HiName  'Variable with largest value' = FIRST(VName)
    /HiValue 'Highest value encountered'   = FIRST(Value).

DATASET NAME     Biggest WINDOW=FRONT.
LIST.
List
|-----------------------------|---------------------------|
|Output Created               |10-SEP-2013 21:18:33       |
|-----------------------------|---------------------------|
  [Biggest]
id HiName HiValue

  1 v3         6
  2 v2        10
  3 v2         7
  4 v1         5

Number of cases read:  4    Number of cases listed:  4

*  Now it's simple to add these values to the original file:.

MATCH FILES
   /FILE =Input
   /TABLE=Biggest
   /BY ID.

DATASET NAME     Final   WINDOW=FRONT.
LIST.

List
|-----------------------------|---------------------------|
|Output Created               |10-SEP-2013 21:18:33       |
|-----------------------------|---------------------------|
  [Final]
id v1 v2 v3 HiName HiValue

  1  1  5  6 v3         6
  2  2 10  9 v2        10
  3  6  7  5 v2         7
  4  5  2  1 v1         5

Number of cases read:  4    Number of cases listed:  4

========================================================
APPENDIX: All code, and test data (not saved separately)
========================================================
NEW FILE.
DATA LIST LIST/
    id v1 v2 v3
   (4F2).
BEGIN DATA
    1  1   5   6      v3
    2   2  10  9     v2
    3   6   7   5     v2
    4   5   2    1    v1
END DATA.
DATASET NAME     Input WINDOW=FRONT.
LIST.

DATASET COPY     Unroll.
DATASET ACTIVATE Unroll WINDOW=FRONT.
VARSTOCASES
   /MAKE Value FROM v1 TO v3
   /INDEX = VName(Value)
   /KEEP =  id
   /NULL = KEEP.

LIST.

SORT CASES BY ID (A) Value (D).
AGGREGATE OUTFILE=*
    /BREAK=ID
    /HiName  'Variable with largest value' = FIRST(VName)
    /HiValue 'Highest value encountered'   = FIRST(Value).

DATASET NAME     Biggest WINDOW=FRONT.
LIST.

*  Now it's simple to add these values to the original file:.

MATCH FILES
   /FILE =Input
   /TABLE=Biggest
   /BY ID.

DATASET NAME     Final   WINDOW=FRONT.
LIST.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: select variable name with greatest value

Rich Ulrich
In reply to this post by Peter Spangler
If I had 946 variables, I think I would drop the luxury of hard-coding
the number with a V in front of it as a string variable, and just
find the number.  Getting that literal string could be an extra
step where you compute a string with the number after a prefix.

To get a number more efficiently, you could use a LOOP --
Loop until V(#) eq MaxVal; then save the # and BREAK out of the
loop.

--
Rich Ulrich


Date: Tue, 10 Sep 2013 15:24:45 -0700
From: [hidden email]
Subject: Re: select variable name with greatest value
To: [hidden email]

These values are more distinct. They are transnational data associated with associated with the category (v1 to v3). So not likely to have issue with ties.

Say I had 946 variables, would I get rid of DO REPEAT for another command?
... [snip, previous]
Reply | Threaded
Open this post in threaded view
|

Re: select variable name with greatest value

Bruce Weaver
Administrator
In reply to this post by Richard Ristow
Very nice, Richard.  I cobbled together a macro that also seems to work, but this approach is much tidier.  

Cheers,
Bruce


Richard Ristow wrote
At 04:47 PM 9/10/2013, Peter Spangler wrote:

>I would like to select the variable name for each case based on the
>highest value such that:
>
>id v1  v2  v3  newVar
>1   1   5   6    v3
>2   2  10   9  v2
>3   6   7   5  v2
>4   5   2   1    v1

And you've written elsewhere that tied largest values are not a
problem; but that you have nearer 900 variables than 3, so the code
should be scaleable.

Here's code that works by unrolling the file into one record per ID
per variable. It doesn't need to be changed when more variables are
added, except to change the variable list in VARSTOCASES.

|-----------------------------|---------------------------|
|Output Created               |10-SEP-2013 21:18:32       |
|-----------------------------|---------------------------|
  [Input]
id v1 v2 v3

  1  1  5  6
  2  2 10  9
  3  6  7  5
  4  5  2  1

Number of cases read:  4    Number of cases listed:  4


DATASET COPY     Unroll.
DATASET ACTIVATE Unroll WINDOW=FRONT.
VARSTOCASES
   /MAKE Value FROM v1 TO v3
   /INDEX = VName(Value)
   /KEEP =  id
   /NULL = KEEP.

SORT CASES BY ID (A) Value (D).
AGGREGATE OUTFILE=*
    /BREAK=ID
    /HiName  'Variable with largest value' = FIRST(VName)
    /HiValue 'Highest value encountered'   = FIRST(Value).

DATASET NAME     Biggest WINDOW=FRONT.
LIST.
List
|-----------------------------|---------------------------|
|Output Created               |10-SEP-2013 21:18:33       |
|-----------------------------|---------------------------|
  [Biggest]
id HiName HiValue

  1 v3         6
  2 v2        10
  3 v2         7
  4 v1         5

Number of cases read:  4    Number of cases listed:  4

*  Now it's simple to add these values to the original file:.

MATCH FILES
   /FILE =Input
   /TABLE=Biggest
   /BY ID.

DATASET NAME     Final   WINDOW=FRONT.
LIST.

List
|-----------------------------|---------------------------|
|Output Created               |10-SEP-2013 21:18:33       |
|-----------------------------|---------------------------|
  [Final]
id v1 v2 v3 HiName HiValue

  1  1  5  6 v3         6
  2  2 10  9 v2        10
  3  6  7  5 v2         7
  4  5  2  1 v1         5

Number of cases read:  4    Number of cases listed:  4

========================================================
APPENDIX: All code, and test data (not saved separately)
========================================================
NEW FILE.
DATA LIST LIST/
    id v1 v2 v3
   (4F2).
BEGIN DATA
    1  1   5   6      v3
    2   2  10  9     v2
    3   6   7   5     v2
    4   5   2    1    v1
END DATA.
DATASET NAME     Input WINDOW=FRONT.
LIST.

DATASET COPY     Unroll.
DATASET ACTIVATE Unroll WINDOW=FRONT.
VARSTOCASES
   /MAKE Value FROM v1 TO v3
   /INDEX = VName(Value)
   /KEEP =  id
   /NULL = KEEP.

LIST.

SORT CASES BY ID (A) Value (D).
AGGREGATE OUTFILE=*
    /BREAK=ID
    /HiName  'Variable with largest value' = FIRST(VName)
    /HiValue 'Highest value encountered'   = FIRST(Value).

DATASET NAME     Biggest WINDOW=FRONT.
LIST.

*  Now it's simple to add these values to the original file:.

MATCH FILES
   /FILE =Input
   /TABLE=Biggest
   /BY ID.

DATASET NAME     Final   WINDOW=FRONT.
LIST.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
Reply | Threaded
Open this post in threaded view
|

Re: select variable name with greatest value

Peter Spangler
In reply to this post by Richard Ristow
Simply fantastic, Richard! Many thanks. 


On Tue, Sep 10, 2013 at 6:30 PM, Richard Ristow <[hidden email]> wrote:
At 04:47 PM 9/10/2013, Peter Spangler wrote:

I would like to select the variable name for each case based on the highest value such that:

id v1  v2  v3  newVar
1   1   5   6    v3
2   2  10   9  v2
3   6   7   5  v2
4   5   2   1    v1

And you've written elsewhere that tied largest values are not a problem; but that you have nearer 900 variables than 3, so the code should be scaleable.

Here's code that works by unrolling the file into one record per ID per variable. It doesn't need to be changed when more variables are added, except to change the variable list in VARSTOCASES.

|-----------------------------|---------------------------|
|Output Created               |10-SEP-2013 21:18:32       |
|-----------------------------|---------------------------|
 [Input]
id v1 v2 v3

 1  1  5  6
 2  2 10  9
 3  6  7  5
 4  5  2  1

Number of cases read:  4    Number of cases listed:  4


DATASET COPY     Unroll.
DATASET ACTIVATE Unroll WINDOW=FRONT.
VARSTOCASES
  /MAKE Value FROM v1 TO v3
  /INDEX = VName(Value)
  /KEEP =  id
  /NULL = KEEP.

SORT CASES BY ID (A) Value (D).
AGGREGATE OUTFILE=*
   /BREAK=ID
   /HiName  'Variable with largest value' = FIRST(VName)
   /HiValue 'Highest value encountered'   = FIRST(Value).

DATASET NAME     Biggest WINDOW=FRONT.
LIST.
List
|-----------------------------|---------------------------|
|Output Created               |10-SEP-2013 21:18:33       |
|-----------------------------|---------------------------|
 [Biggest]
id HiName HiValue

 1 v3         6
 2 v2        10
 3 v2         7
 4 v1         5

Number of cases read:  4    Number of cases listed:  4

*  Now it's simple to add these values to the original file:.

MATCH FILES
  /FILE =Input
  /TABLE=Biggest
  /BY ID.

DATASET NAME     Final   WINDOW=FRONT.
LIST.

List
|-----------------------------|---------------------------|
|Output Created               |10-SEP-2013 21:18:33       |
|-----------------------------|---------------------------|
 [Final]
id v1 v2 v3 HiName HiValue

 1  1  5  6 v3         6
 2  2 10  9 v2        10
 3  6  7  5 v2         7
 4  5  2  1 v1         5

Number of cases read:  4    Number of cases listed:  4

========================================================
APPENDIX: All code, and test data (not saved separately)
========================================================
NEW FILE.
DATA LIST LIST/
   id v1 v2 v3
  (4F2).
BEGIN DATA

   1  1   5   6      v3
   2   2  10  9     v2
   3   6   7   5     v2
   4   5   2    1    v1
END DATA.
DATASET NAME     Input WINDOW=FRONT.
LIST.

DATASET COPY     Unroll.
DATASET ACTIVATE Unroll WINDOW=FRONT.
VARSTOCASES
  /MAKE Value FROM v1 TO v3
  /INDEX = VName(Value)
  /KEEP =  id
  /NULL = KEEP.

LIST.

SORT CASES BY ID (A) Value (D).
AGGREGATE OUTFILE=*
   /BREAK=ID
   /HiName  'Variable with largest value' = FIRST(VName)
   /HiValue 'Highest value encountered'   = FIRST(Value).

DATASET NAME     Biggest WINDOW=FRONT.
LIST.

*  Now it's simple to add these values to the original file:.

MATCH FILES
  /FILE =Input
  /TABLE=Biggest
  /BY ID.

DATASET NAME     Final   WINDOW=FRONT.
LIST.



Reply | Threaded
Open this post in threaded view
|

Re: select variable name with greatest value

Jon K Peck
Adding to the heap of solutions, here is Python solution using the SPSSINC TRANS extension command that does not require the varstocases transformation.  One would change the two instances of v1 to v3 below as appropriate.

begin program.
import spss, spssaux
vnames = spssaux.VariableDict().expand("v1 to v3")

def nameOfMax(*args):
    return vnames[args.index(max(args))]
end program.

spssinc trans result=vname type=64
/variables v1 to v3
/formula "nameOfMax(<>)".


Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621




From:        Peter Spangler <[hidden email]>
To:        [hidden email],
Date:        09/11/2013 10:03 AM
Subject:        Re: [SPSSX-L] select variable name with greatest value
Sent by:        "SPSSX(r) Discussion" <[hidden email]>




Simply fantastic, Richard! Many thanks. 


On Tue, Sep 10, 2013 at 6:30 PM, Richard Ristow <wrristow@...> wrote:
At 04:47 PM 9/10/2013, Peter Spangler wrote:

I would like to select the variable name for each case based on the highest value such that:

id v1  v2  v3  newVar
1   1   5   6    v3
2   2  10   9  v2
3   6   7   5  v2
4   5   2   1    v1


And you've written elsewhere that tied largest values are not a problem; but that you have nearer 900 variables than 3, so the code should be scaleable.

Here's code that works by unrolling the file into one record per ID per variable. It doesn't need to be changed when more variables are added, except to change the variable list in VARSTOCASES.

|-----------------------------|---------------------------|
|Output Created               |10-SEP-2013 21:18:32       |
|-----------------------------|---------------------------|
 [Input]
id v1 v2 v3

 1  1  5  6
 2  2 10  9
 3  6  7  5
 4  5  2  1

Number of cases read:  4    Number of cases listed:  4


DATASET COPY     Unroll.
DATASET ACTIVATE Unroll WINDOW=FRONT.
VARSTOCASES
  /MAKE Value FROM v1 TO v3
  /INDEX = VName(Value)
  /KEEP =  id
  /NULL = KEEP.

SORT CASES BY ID (A) Value (D).
AGGREGATE OUTFILE=*
   /BREAK=ID
   /HiName  'Variable with largest value' = FIRST(VName)
   /HiValue 'Highest value encountered'   = FIRST(Value).

DATASET NAME     Biggest WINDOW=FRONT.
LIST.
List
|-----------------------------|---------------------------|
|Output Created               |10-SEP-2013 21:18:33       |
|-----------------------------|---------------------------|
 [Biggest]
id HiName HiValue

 1 v3         6
 2 v2        10
 3 v2         7
 4 v1         5

Number of cases read:  4    Number of cases listed:  4

*  Now it's simple to add these values to the original
<a href=file:///>file:.

MATCH FILES
  /FILE =Input
  /TABLE=Biggest
  /BY ID.

DATASET NAME     Final   WINDOW=FRONT.
LIST.

List
|-----------------------------|---------------------------|
|Output Created               |10-SEP-2013 21:18:33       |
|-----------------------------|---------------------------|
 [Final]
id v1 v2 v3 HiName HiValue

 1  1  5  6 v3         6
 2  2 10  9 v2        10
 3  6  7  5 v2         7
 4  5  2  1 v1         5

Number of cases read:  4    Number of cases listed:  4

========================================================
APPENDIX: All code, and test data (not saved separately)
========================================================
NEW FILE.
DATA LIST LIST/
   id v1 v2 v3
  (4F2).
BEGIN DATA


   1  1   5   6      v3
   2   2  10  9     v2
   3   6   7   5     v2
   4   5   2    1    v1

END DATA.
DATASET NAME     Input WINDOW=FRONT.
LIST.

DATASET COPY     Unroll.
DATASET ACTIVATE Unroll WINDOW=FRONT.
VARSTOCASES
  /MAKE Value FROM v1 TO v3
  /INDEX = VName(Value)
  /KEEP =  id
  /NULL = KEEP.

LIST.

SORT CASES BY ID (A) Value (D).
AGGREGATE OUTFILE=*
   /BREAK=ID
   /HiName  'Variable with largest value' = FIRST(VName)
   /HiValue 'Highest value encountered'   = FIRST(Value).

DATASET NAME     Biggest WINDOW=FRONT.
LIST.

*  Now it's simple to add these values to the original
<a href=file:///>file:.

MATCH FILES
  /FILE =Input
  /TABLE=Biggest
  /BY ID.

DATASET NAME     Final   WINDOW=FRONT.
LIST.



Reply | Threaded
Open this post in threaded view
|

Re: select variable name with greatest value

Ruben Geert van den Berg
Dear Peter,

I wrote a solution (http://www.pythonforspss.org/find-within-subjects-favorite-over-several-variables/) for this a while ago.

In case of ties, one of the maximum values is chosen at random and one of the new variables will hold the number of maximum values. So you can easily check afterwards whether there really weren't any ties. It also generates clearly formulated variable and value labels for the newly created variables.

Obviously, all this requires a bit more code but after copying-pasting, you need to adjust only one line (which variables to take into account).

HTH,

Ruben