SPSSX Discussion

select variable name with greatest value

Classic

List

Threaded

14 messages Options

Peter Spangler

select variable name with greatest value

I would like to select the variable name for each case based on the highest value such that:

id v1 v2 v3 newVar

1 1 5 6 v3

2 2 10 9 v2

3 6 7 5 v2

4 5 2 1 v1

I am beginning with

VECTOR v =v1 to v3.

LOOP #I = v1 TO v3.

COMPUTE MX=MAX(v1 to v3).

END IF.

END LOOP.

Rich Ulrich

Re: select variable name with greatest value

COMPUTE MX=MAX(v1 to v3).
DO REPEAT test= v1 to v3/literal= 'v1', 'v2', 'v3' .
IF ( MX= test ) newVar= literal .
END REPEAT.

--
Rich Ulrich

Date: Tue, 10 Sep 2013 13:47:13 -0700
From: [hidden email]
Subject: select variable name with greatest value
To: [hidden email]

I would like to select the variable name for each case based on the highest value such that:

id v1 v2 v3 newVar

1 1 5 6 v3

2 2 10 9 v2

3 6 7 5 v2

4 5 2 1 v1

I am beginning with

VECTOR v =v1 to v3.

LOOP #I = v1 TO v3.

COMPUTE MX=MAX(v1 to v3).

END IF.

END LOOP.

Bruce Weaver

Re: select variable name with greatest value

Administrator

While Rich was posting that, I was also thinking about using DO-REPEAT. Here's my version:

STRING NewVar(a2).
DO REPEAT v = v1 to v3 / New = "V1" "V2" "V3".
- IF (Max(v1 to v3) EQ v) NewVar = New.
END REPEAT.
LIST.

Output:

ID V1 V2 V3 NewVar
1 1 5 6 V3
2 2 10 9 V2
3 6 7 5 V2
4 5 2 1 V1

However, if this was just a simplified example and your actual data file has a lot more than 3 variables to look at, there's probably a better way. I.e., you don't really want to list "V1" "V2" ... "V946" on your DO REPEAT command. ;-)

HTH.

Rich Ulrich wrote

COMPUTE MX=MAX(v1 to v3).
DO REPEAT test= v1 to v3/literal= 'v1', 'v2', 'v3' .
IF ( MX= test ) newVar= literal .
END REPEAT.

--
Rich Ulrich

Date: Tue, 10 Sep 2013 13:47:13 -0700
From: [hidden email]
Subject: select variable name with greatest value
To: [hidden email]

I would like to select the variable name for each case based on the highest value such that:
id v1 v2 v3 newVar1 1 5 6 v32 2 10 9 v23 6 7 5 v2
4 5 2 1 v1
I am beginning with
VECTOR v =v1 to v3.LOOP #I = v1 TO v3. COMPUTE MX=MAX(v1 to v3).
END IF.
END LOOP.

--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING:
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Bruce Weaver

Re: select variable name with greatest value

Administrator

p.s. - You haven't said what you want to do if there are ties.

Bruce Weaver wrote

While Rich was posting that, I was also thinking about using DO-REPEAT. Here's my version:

STRING NewVar(a2).
DO REPEAT v = v1 to v3 / New = "V1" "V2" "V3".
- IF (Max(v1 to v3) EQ v) NewVar = New.
END REPEAT.
LIST.

Output:

ID V1 V2 V3 NewVar
1 1 5 6 V3
2 2 10 9 V2
3 6 7 5 V2
4 5 2 1 V1

However, if this was just a simplified example and your actual data file has a lot more than 3 variables to look at, there's probably a better way. I.e., you don't really want to list "V1" "V2" ... "V946" on your DO REPEAT command. ;-)

HTH.

Rich Ulrich wrote

COMPUTE MX=MAX(v1 to v3).
DO REPEAT test= v1 to v3/literal= 'v1', 'v2', 'v3' .
IF ( MX= test ) newVar= literal .
END REPEAT.

--
Rich Ulrich

Date: Tue, 10 Sep 2013 13:47:13 -0700
From: [hidden email]
Subject: select variable name with greatest value
To: [hidden email]

I would like to select the variable name for each case based on the highest value such that:
id v1 v2 v3 newVar1 1 5 6 v32 2 10 9 v23 6 7 5 v2
4 5 2 1 v1
I am beginning with
VECTOR v =v1 to v3.LOOP #I = v1 TO v3. COMPUTE MX=MAX(v1 to v3).
END IF.
END LOOP.

Peter Spangler

Re: select variable name with greatest value

Just getting back. 1) Yes, this is a simplified version with more than 3 vars :)

2) Ties to other variables? For example, v1a to v3a and v1b to v3b?

On Tue, Sep 10, 2013 at 2:59 PM, Bruce Weaver <[hidden email]> wrote:

p.s. - You haven't said what you want to do if there are ties.

Bruce Weaver wrote

> While Rich was posting that, I was also thinking about using DO-REPEAT.
> Here's my version:
>
> STRING NewVar(a2).
> DO REPEAT v = v1 to v3 / New = "V1" "V2" "V3".
> - IF (Max(v1 to v3) EQ v) NewVar = New.
> END REPEAT.
> LIST.
>
> Output:
>
> ID V1 V2 V3 NewVar
> 1 1 5 6 V3
> 2 2 10 9 V2
> 3 6 7 5 V2
> 4 5 2 1 V1
>
> However, if this was just a simplified example and your actual data file
> has a lot more than 3 variables to look at, there's probably a better way.
> I.e., you don't really want to list "V1" "V2" ... "V946" on your DO REPEAT
> command. ;-)
>
> HTH.
>
> Rich Ulrich wrote
>> COMPUTE MX=MAX(v1 to v3).
>> DO REPEAT test= v1 to v3/literal= 'v1', 'v2', 'v3' .
>> IF ( MX= test ) newVar= literal .
>> END REPEAT.
>>
>> --
>> Rich Ulrich
>>
>> Date: Tue, 10 Sep 2013 13:47:13 -0700
>> From:

>> pspangler@

>> Subject: select variable name with greatest value
>> To:

>> SPSSX-L@.UGA

>>
>> I would like to select the variable name for each case based on the
>> highest value such that:
>> id v1 v2 v3 newVar1 1 5 6 v32 2 10 9 v23 6 7 5
>> v2
>> 4 5 2 1 v1
>> I am beginning with
>> VECTOR v =v1 to v3.LOOP #I = v1 TO v3. COMPUTE MX=MAX(v1 to v3).
>> END IF.
>> END LOOP.

-----
--
Bruce Weaver
[hidden email]
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

NOTE: My Hotmail account is not monitored regularly.
To send me an e-mail, please use the address shown above.

--

View this message in context: http://spssx-discussion.1045642.n5.nabble.com/select-variable-name-with-greatest-value-tp5721890p5721893.html

Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Bruce Weaver

Re: select variable name with greatest value

Administrator

Ties as in tied scores -- as in the new case I've added to your original data below.

id v1 v2 v3 newVar
1 1 5 6 v3
2 2 10 9 v2
3 6 7 5 v2
4 5 2 1 v1
5 4 2 4 v1 or v3? Or maybe "v1, v3"? Not clear what you want here!

Peter Spangler wrote

Just getting back. 1) Yes, this is a simplified version with more than 3
vars :)
2) Ties to other variables? For example, v1a to
v3a and v1b to v3b?

On Tue, Sep 10, 2013 at 2:59 PM, Bruce Weaver <[hidden email]>wrote:

> p.s. - You haven't said what you want to do if there are ties.
>
>
>
> Bruce Weaver wrote
> > While Rich was posting that, I was also thinking about using DO-REPEAT.
> > Here's my version:
> >
> > STRING NewVar(a2).
> > DO REPEAT v = v1 to v3 / New = "V1" "V2" "V3".
> > - IF (Max(v1 to v3) EQ v) NewVar = New.
> > END REPEAT.
> > LIST.
> >
> > Output:
> >
> > ID V1 V2 V3 NewVar
> > 1 1 5 6 V3
> > 2 2 10 9 V2
> > 3 6 7 5 V2
> > 4 5 2 1 V1
> >
> > However, if this was just a simplified example and your actual data file
> > has a lot more than 3 variables to look at, there's probably a better
> way.
> > I.e., you don't really want to list "V1" "V2" ... "V946" on your DO
> REPEAT
> > command. ;-)
> >
> > HTH.
> >
> > Rich Ulrich wrote
> >> COMPUTE MX=MAX(v1 to v3).
> >> DO REPEAT test= v1 to v3/literal= 'v1', 'v2', 'v3' .
> >> IF ( MX= test ) newVar= literal .
> >> END REPEAT.
> >>
> >> --
> >> Rich Ulrich
> >>
> >> Date: Tue, 10 Sep 2013 13:47:13 -0700
> >> From:
>
> >> pspangler@
>
> >> Subject: select variable name with greatest value
> >> To:
>
> >> SPSSX-L@.UGA
>
> >>
> >> I would like to select the variable name for each case based on the
> >> highest value such that:
> >> id v1 v2 v3 newVar1 1 5 6 v32 2 10 9 v23 6 7 5
> >> v2
> >> 4 5 2 1 v1
> >> I am beginning with
> >> VECTOR v =v1 to v3.LOOP #I = v1 TO v3. COMPUTE MX=MAX(v1 to v3).
> >> END IF.
> >> END LOOP.
>
>
>
>
>
> -----
> --
> Bruce Weaver
> [hidden email]
> http://sites.google.com/a/lakeheadu.ca/bweaver/
>
> "When all else fails, RTFM."
>
> NOTE: My Hotmail account is not monitored regularly.
> To send me an e-mail, please use the address shown above.
>
> --
> View this message in context:
> http://spssx-discussion.1045642.n5.nabble.com/select-variable-name-with-greatest-value-tp5721890p5721893.html
> Sent from the SPSSX Discussion mailing list archive at Nabble.com.
>
> =====================
> To manage your subscription to SPSSX-L, send a message to
> [hidden email] (not to SPSSX-L), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD
>

Peter Spangler

Re: select variable name with greatest value

These values are more distinct. They are transnational data associated with associated with the category (v1 to v3). So not likely to have issue with ties.

Say I had 946 variables, would I get rid of DO REPEAT for another command?

On Tue, Sep 10, 2013 at 3:14 PM, Bruce Weaver <[hidden email]> wrote:

Ties as in tied scores -- as in the new case I've added to your original data
below.

id v1 v2 v3 newVar
1 1 5 6 v3
2 2 10 9 v2

3 6 7 5 v2
4 5 2 1 v1

5 4 2 4 v1 or v3? Or maybe "v1, v3"? Not clear what you want
here!

Peter Spangler wrote

> Just getting back. 1) Yes, this is a simplified version with more than 3
> vars :)
> 2) Ties to other variables? For example, v1a
> to
> v3a and v1b to v3b?
>
>
> On Tue, Sep 10, 2013 at 2:59 PM, Bruce Weaver <

> bruce.weaver@

> >wrote:
>
>> p.s. - You haven't said what you want to do if there are ties.
>>
>>
>>
>> Bruce Weaver wrote
>> > While Rich was posting that, I was also thinking about using DO-REPEAT.
>> > Here's my version:
>> >
>> > STRING NewVar(a2).
>> > DO REPEAT v = v1 to v3 / New = "V1" "V2" "V3".
>> > - IF (Max(v1 to v3) EQ v) NewVar = New.
>> > END REPEAT.
>> > LIST.
>> >
>> > Output:
>> >
>> > ID V1 V2 V3 NewVar
>> > 1 1 5 6 V3
>> > 2 2 10 9 V2
>> > 3 6 7 5 V2
>> > 4 5 2 1 V1
>> >
>> > However, if this was just a simplified example and your actual data
>> file
>> > has a lot more than 3 variables to look at, there's probably a better
>> way.
>> > I.e., you don't really want to list "V1" "V2" ... "V946" on your DO
>> REPEAT
>> > command. ;-)
>> >
>> > HTH.
>> >
>> > Rich Ulrich wrote
>> >> COMPUTE MX=MAX(v1 to v3).
>> >> DO REPEAT test= v1 to v3/literal= 'v1', 'v2', 'v3' .
>> >> IF ( MX= test ) newVar= literal .
>> >> END REPEAT.
>> >>
>> >> --
>> >> Rich Ulrich
>> >>
>> >> Date: Tue, 10 Sep 2013 13:47:13 -0700
>> >> From:
>>
>> >> pspangler@
>>
>> >> Subject: select variable name with greatest value
>> >> To:
>>
>> >> SPSSX-L@.UGA
>>
>> >>
>> >> I would like to select the variable name for each case based on the
>> >> highest value such that:
>> >> id v1 v2 v3 newVar1 1 5 6 v32 2 10 9 v23 6 7 5
>> >> v2
>> >> 4 5 2 1 v1
>> >> I am beginning with
>> >> VECTOR v =v1 to v3.LOOP #I = v1 TO v3. COMPUTE MX=MAX(v1 to v3).
>> >> END IF.
>> >> END LOOP.
>>
>>
>>
>>
>>
>> -----
>> --
>> Bruce Weaver
>>

> bweaver@

>> http://sites.google.com/a/lakeheadu.ca/bweaver/
>>
>> "When all else fails, RTFM."
>>
>> NOTE: My Hotmail account is not monitored regularly.
>> To send me an e-mail, please use the address shown above.
>>
>> --
>> View this message in context:
>> http://spssx-discussion.1045642.n5.nabble.com/select-variable-name-with-greatest-value-tp5721890p5721893.html
>> Sent from the SPSSX Discussion mailing list archive at Nabble.com.
>>
>> =====================
>> To manage your subscription to SPSSX-L, send a message to
>>

> LISTSERV@.UGA

> (not to SPSSX-L), with no body text except the
>> command. To leave the list, send the command
>> SIGNOFF SPSSX-L
>> For a list of commands to manage subscriptions, send the command
>> INFO REFCARD
>>

-----
--
Bruce Weaver
[hidden email]
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

NOTE: My Hotmail account is not monitored regularly.
To send me an e-mail, please use the address shown above.

--

View this message in context: http://spssx-discussion.1045642.n5.nabble.com/select-variable-name-with-greatest-value-tp5721890p5721895.html

Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

David Marso

Re: select variable name with greatest value

Administrator

In reply to this post by Rich Ulrich

Of course if two variables are tied it will flag the LAST one.
If this is a problem then you will need to adapt the code.
---

Rich Ulrich wrote

COMPUTE MX=MAX(v1 to v3).
DO REPEAT test= v1 to v3/literal= 'v1', 'v2', 'v3' .
IF ( MX= test ) newVar= literal .
END REPEAT.

--
Rich Ulrich

Date: Tue, 10 Sep 2013 13:47:13 -0700
From: [hidden email]
Subject: select variable name with greatest value
To: [hidden email]

I would like to select the variable name for each case based on the highest value such that:
id v1 v2 v3 newVar1 1 5 6 v32 2 10 9 v23 6 7 5 v2
4 5 2 1 v1
I am beginning with
VECTOR v =v1 to v3.LOOP #I = v1 TO v3. COMPUTE MX=MAX(v1 to v3).
END IF.
END LOOP.

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Richard Ristow

Re: select variable name with greatest value

In reply to this post by Peter Spangler

At 04:47 PM 9/10/2013, Peter Spangler wrote:

>I would like to select the variable name for each case based on the
>highest value such that:
>
>id v1 v2 v3 newVar
>1 1 5 6 v3
>2 2 10 9 v2
>3 6 7 5 v2
>4 5 2 1 v1

And you've written elsewhere that tied largest values are not a
problem; but that you have nearer 900 variables than 3, so the code
should be scaleable.

Here's code that works by unrolling the file into one record per ID
per variable. It doesn't need to be changed when more variables are
added, except to change the variable list in VARSTOCASES.

|-----------------------------|---------------------------|
|Output Created |10-SEP-2013 21:18:32 |
|-----------------------------|---------------------------|
[Input]
id v1 v2 v3

1 1 5 6
2 2 10 9
3 6 7 5
4 5 2 1

Number of cases read: 4 Number of cases listed: 4

DATASET COPY Unroll.
DATASET ACTIVATE Unroll WINDOW=FRONT.
VARSTOCASES
/MAKE Value FROM v1 TO v3
/INDEX = VName(Value)
/KEEP = id
/NULL = KEEP.

SORT CASES BY ID (A) Value (D).
AGGREGATE OUTFILE=*
/BREAK=ID
/HiName 'Variable with largest value' = FIRST(VName)
/HiValue 'Highest value encountered' = FIRST(Value).

DATASET NAME Biggest WINDOW=FRONT.
LIST.
List
|-----------------------------|---------------------------|
|Output Created |10-SEP-2013 21:18:33 |
|-----------------------------|---------------------------|
[Biggest]
id HiName HiValue

1 v3 6
2 v2 10
3 v2 7
4 v1 5

Number of cases read: 4 Number of cases listed: 4

* Now it's simple to add these values to the original file:.

MATCH FILES
/FILE =Input
/TABLE=Biggest
/BY ID.

DATASET NAME Final WINDOW=FRONT.
LIST.

List
|-----------------------------|---------------------------|
|Output Created |10-SEP-2013 21:18:33 |
|-----------------------------|---------------------------|
[Final]
id v1 v2 v3 HiName HiValue

1 1 5 6 v3 6
2 2 10 9 v2 10
3 6 7 5 v2 7
4 5 2 1 v1 5

Number of cases read: 4 Number of cases listed: 4

========================================================
APPENDIX: All code, and test data (not saved separately)
========================================================
NEW FILE.
DATA LIST LIST/
id v1 v2 v3
(4F2).
BEGIN DATA
1 1 5 6 v3
2 2 10 9 v2
3 6 7 5 v2
4 5 2 1 v1
END DATA.
DATASET NAME Input WINDOW=FRONT.
LIST.

DATASET COPY Unroll.
DATASET ACTIVATE Unroll WINDOW=FRONT.
VARSTOCASES
/MAKE Value FROM v1 TO v3
/INDEX = VName(Value)
/KEEP = id
/NULL = KEEP.

LIST.

SORT CASES BY ID (A) Value (D).
AGGREGATE OUTFILE=*
/BREAK=ID
/HiName 'Variable with largest value' = FIRST(VName)
/HiValue 'Highest value encountered' = FIRST(Value).

DATASET NAME Biggest WINDOW=FRONT.
LIST.

* Now it's simple to add these values to the original file:.

MATCH FILES
/FILE =Input
/TABLE=Biggest
/BY ID.

DATASET NAME Final WINDOW=FRONT.
LIST.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Rich Ulrich

Re: select variable name with greatest value

In reply to this post by Peter Spangler

If I had 946 variables, I think I would drop the luxury of hard-coding
the number with a V in front of it as a string variable, and just
find the number. Getting that literal string could be an extra
step where you compute a string with the number after a prefix.

To get a number more efficiently, you could use a LOOP --
Loop until V(#) eq MaxVal; then save the # and BREAK out of the
loop.

--
Rich Ulrich

Date: Tue, 10 Sep 2013 15:24:45 -0700
From: [hidden email]
Subject: Re: select variable name with greatest value
To: [hidden email]

These values are more distinct. They are transnational data associated with associated with the category (v1 to v3). So not likely to have issue with ties.

Say I had 946 variables, would I get rid of DO REPEAT for another command?

... [snip, previous]

Bruce Weaver

Re: select variable name with greatest value

Administrator

In reply to this post by Richard Ristow

Very nice, Richard. I cobbled together a macro that also seems to work, but this approach is much tidier.

Cheers,
Bruce

Richard Ristow wrote

At 04:47 PM 9/10/2013, Peter Spangler wrote:

>I would like to select the variable name for each case based on the
>highest value such that:
>
>id v1 v2 v3 newVar
>1 1 5 6 v3
>2 2 10 9 v2
>3 6 7 5 v2
>4 5 2 1 v1

And you've written elsewhere that tied largest values are not a
problem; but that you have nearer 900 variables than 3, so the code
should be scaleable.

Here's code that works by unrolling the file into one record per ID
per variable. It doesn't need to be changed when more variables are
added, except to change the variable list in VARSTOCASES.

|-----------------------------|---------------------------|
|Output Created |10-SEP-2013 21:18:32 |
|-----------------------------|---------------------------|
[Input]
id v1 v2 v3

1 1 5 6
2 2 10 9
3 6 7 5
4 5 2 1

Number of cases read: 4 Number of cases listed: 4

DATASET COPY Unroll.
DATASET ACTIVATE Unroll WINDOW=FRONT.
VARSTOCASES
/MAKE Value FROM v1 TO v3
/INDEX = VName(Value)
/KEEP = id
/NULL = KEEP.

SORT CASES BY ID (A) Value (D).
AGGREGATE OUTFILE=*
/BREAK=ID
/HiName 'Variable with largest value' = FIRST(VName)
/HiValue 'Highest value encountered' = FIRST(Value).

DATASET NAME Biggest WINDOW=FRONT.
LIST.
List
|-----------------------------|---------------------------|
|Output Created |10-SEP-2013 21:18:33 |
|-----------------------------|---------------------------|
[Biggest]
id HiName HiValue

1 v3 6
2 v2 10
3 v2 7
4 v1 5

Number of cases read: 4 Number of cases listed: 4

* Now it's simple to add these values to the original file:.

MATCH FILES
/FILE =Input
/TABLE=Biggest
/BY ID.

DATASET NAME Final WINDOW=FRONT.
LIST.

List
|-----------------------------|---------------------------|
|Output Created |10-SEP-2013 21:18:33 |
|-----------------------------|---------------------------|
[Final]
id v1 v2 v3 HiName HiValue

1 1 5 6 v3 6
2 2 10 9 v2 10
3 6 7 5 v2 7
4 5 2 1 v1 5

Number of cases read: 4 Number of cases listed: 4

========================================================
APPENDIX: All code, and test data (not saved separately)
========================================================
NEW FILE.
DATA LIST LIST/
id v1 v2 v3
(4F2).
BEGIN DATA
1 1 5 6 v3
2 2 10 9 v2
3 6 7 5 v2
4 5 2 1 v1
END DATA.
DATASET NAME Input WINDOW=FRONT.
LIST.

DATASET COPY Unroll.
DATASET ACTIVATE Unroll WINDOW=FRONT.
VARSTOCASES
/MAKE Value FROM v1 TO v3
/INDEX = VName(Value)
/KEEP = id
/NULL = KEEP.

LIST.

SORT CASES BY ID (A) Value (D).
AGGREGATE OUTFILE=*
/BREAK=ID
/HiName 'Variable with largest value' = FIRST(VName)
/HiValue 'Highest value encountered' = FIRST(Value).

DATASET NAME Biggest WINDOW=FRONT.
LIST.

* Now it's simple to add these values to the original file:.

MATCH FILES
/FILE =Input
/TABLE=Biggest
/BY ID.

DATASET NAME Final WINDOW=FRONT.
LIST.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Peter Spangler

Re: select variable name with greatest value

In reply to this post by Richard Ristow

Simply fantastic, Richard! Many thanks.

On Tue, Sep 10, 2013 at 6:30 PM, Richard Ristow <[hidden email]> wrote:

At 04:47 PM 9/10/2013, Peter Spangler wrote:

I would like to select the variable name for each case based on the highest value such that:

id v1 v2 v3 newVar
1 1 5 6 v3
2 2 10 9 v2
3 6 7 5 v2
4 5 2 1 v1

And you've written elsewhere that tied largest values are not a problem; but that you have nearer 900 variables than 3, so the code should be scaleable.

Here's code that works by unrolling the file into one record per ID per variable. It doesn't need to be changed when more variables are added, except to change the variable list in VARSTOCASES.

|-----------------------------|---------------------------|
|Output Created |10-SEP-2013 21:18:32 |
|-----------------------------|---------------------------|
[Input]
id v1 v2 v3

1 1 5 6
2 2 10 9
3 6 7 5
4 5 2 1

Number of cases read: 4 Number of cases listed: 4

DATASET COPY Unroll.
DATASET ACTIVATE Unroll WINDOW=FRONT.
VARSTOCASES
/MAKE Value FROM v1 TO v3
/INDEX = VName(Value)
/KEEP = id
/NULL = KEEP.

SORT CASES BY ID (A) Value (D).
AGGREGATE OUTFILE=*
/BREAK=ID
/HiName 'Variable with largest value' = FIRST(VName)
/HiValue 'Highest value encountered' = FIRST(Value).

DATASET NAME Biggest WINDOW=FRONT.
LIST.
List
|-----------------------------|---------------------------|
|Output Created |10-SEP-2013 21:18:33 |
|-----------------------------|---------------------------|
[Biggest]
id HiName HiValue

1 v3 6
2 v2 10
3 v2 7
4 v1 5

Number of cases read: 4 Number of cases listed: 4

* Now it's simple to add these values to the original file:.

MATCH FILES
/FILE =Input
/TABLE=Biggest
/BY ID.

DATASET NAME Final WINDOW=FRONT.
LIST.

List
|-----------------------------|---------------------------|
|Output Created |10-SEP-2013 21:18:33 |
|-----------------------------|---------------------------|
[Final]
id v1 v2 v3 HiName HiValue

1 1 5 6 v3 6
2 2 10 9 v2 10
3 6 7 5 v2 7
4 5 2 1 v1 5

Number of cases read: 4 Number of cases listed: 4

========================================================
APPENDIX: All code, and test data (not saved separately)
========================================================
NEW FILE.
DATA LIST LIST/
id v1 v2 v3
(4F2).
BEGIN DATA

1 1 5 6 v3
2 2 10 9 v2
3 6 7 5 v2
4 5 2 1 v1
END DATA.
DATASET NAME Input WINDOW=FRONT.
LIST.

DATASET COPY Unroll.
DATASET ACTIVATE Unroll WINDOW=FRONT.
VARSTOCASES
/MAKE Value FROM v1 TO v3
/INDEX = VName(Value)
/KEEP = id
/NULL = KEEP.

LIST.

SORT CASES BY ID (A) Value (D).
AGGREGATE OUTFILE=*
/BREAK=ID
/HiName 'Variable with largest value' = FIRST(VName)
/HiValue 'Highest value encountered' = FIRST(Value).

DATASET NAME Biggest WINDOW=FRONT.
LIST.

* Now it's simple to add these values to the original file:.

MATCH FILES
/FILE =Input
/TABLE=Biggest
/BY ID.

DATASET NAME Final WINDOW=FRONT.
LIST.

Jon K Peck

Re: select variable name with greatest value

Adding to the heap of solutions, here is Python solution using the SPSSINC TRANS extension command that does not require the varstocases transformation. One would change the two instances of v1 to v3 below as appropriate.

begin program.
import spss, spssaux
vnames = spssaux.VariableDict().expand("v1 to v3")

def nameOfMax(*args):
return vnames[args.index(max(args))]
end program.

spssinc trans result=vname type=64
/variables v1 to v3
/formula "nameOfMax(<>)".

Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621

From: Peter Spangler <[hidden email]>
To: [hidden email],
Date: 09/11/2013 10:03 AM
Subject: Re: [SPSSX-L] select variable name with greatest value
Sent by: "SPSSX(r) Discussion" <[hidden email]>

Simply fantastic, Richard! Many thanks.

On Tue, Sep 10, 2013 at 6:30 PM, Richard Ristow <wrristow@...> wrote:
At 04:47 PM 9/10/2013, Peter Spangler wrote:

I would like to select the variable name for each case based on the highest value such that:

id v1 v2 v3 newVar
1 1 5 6 v3
2 2 10 9 v2
3 6 7 5 v2
4 5 2 1 v1

And you've written elsewhere that tied largest values are not a problem; but that you have nearer 900 variables than 3, so the code should be scaleable.

Here's code that works by unrolling the file into one record per ID per variable. It doesn't need to be changed when more variables are added, except to change the variable list in VARSTOCASES.

|-----------------------------|---------------------------|
|Output Created |10-SEP-2013 21:18:32 |
|-----------------------------|---------------------------|
[Input]
id v1 v2 v3

1 1 5 6
2 2 10 9
3 6 7 5
4 5 2 1

Number of cases read: 4 Number of cases listed: 4

DATASET COPY Unroll.
DATASET ACTIVATE Unroll WINDOW=FRONT.
VARSTOCASES
/MAKE Value FROM v1 TO v3
/INDEX = VName(Value)
/KEEP = id
/NULL = KEEP.

SORT CASES BY ID (A) Value (D).
AGGREGATE OUTFILE=*
/BREAK=ID
/HiName 'Variable with largest value' = FIRST(VName)
/HiValue 'Highest value encountered' = FIRST(Value).

DATASET NAME Biggest WINDOW=FRONT.
LIST.
List
|-----------------------------|---------------------------|
|Output Created |10-SEP-2013 21:18:33 |
|-----------------------------|---------------------------|
[Biggest]
id HiName HiValue

1 v3 6
2 v2 10
3 v2 7
4 v1 5

Number of cases read: 4 Number of cases listed: 4

* Now it's simple to add these values to the original <a href=file:///>file:.

MATCH FILES
/FILE =Input
/TABLE=Biggest
/BY ID.

DATASET NAME Final WINDOW=FRONT.
LIST.

List
|-----------------------------|---------------------------|
|Output Created |10-SEP-2013 21:18:33 |
|-----------------------------|---------------------------|
[Final]
id v1 v2 v3 HiName HiValue

1 1 5 6 v3 6
2 2 10 9 v2 10
3 6 7 5 v2 7
4 5 2 1 v1 5

Number of cases read: 4 Number of cases listed: 4

========================================================
APPENDIX: All code, and test data (not saved separately)
========================================================
NEW FILE.
DATA LIST LIST/
id v1 v2 v3
(4F2).
BEGIN DATA

1 1 5 6 v3
2 2 10 9 v2
3 6 7 5 v2
4 5 2 1 v1
END DATA.
DATASET NAME Input WINDOW=FRONT.
LIST.

DATASET COPY Unroll.
DATASET ACTIVATE Unroll WINDOW=FRONT.
VARSTOCASES
/MAKE Value FROM v1 TO v3
/INDEX = VName(Value)
/KEEP = id
/NULL = KEEP.

LIST.

SORT CASES BY ID (A) Value (D).
AGGREGATE OUTFILE=*
/BREAK=ID
/HiName 'Variable with largest value' = FIRST(VName)
/HiValue 'Highest value encountered' = FIRST(Value).

DATASET NAME Biggest WINDOW=FRONT.
LIST.

* Now it's simple to add these values to the original <a href=file:///>file:.

MATCH FILES
/FILE =Input
/TABLE=Biggest
/BY ID.

DATASET NAME Final WINDOW=FRONT.
LIST.

Ruben Geert van den Berg

Re: select variable name with greatest value

Dear Peter,

I wrote a solution (http://www.pythonforspss.org/find-within-subjects-favorite-over-several-variables/) for this a while ago.

In case of ties, one of the maximum values is chosen at random and one of the new variables will hold the number of maximum values. So you can easily check afterwards whether there really weren't any ties. It also generates clearly formulated variable and value labels for the newly created variables.

Obviously, all this requires a bit more code but after copying-pasting, you need to adjust only one line (which variables to take into account).

HTH,

Ruben