Dynamic way to create quartiles

classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

Dynamic way to create quartiles

Jignesh Sutar
Hi all,

Is there not a dynamic way to group cases into quartiles (or any number of ntiles)?

I am aware of visual binning but I want to have a dynamic syntax to be able to group cases on different datasets rather than have to hard code actual cut off points?

There does not seem to be a direct way of doing so using either AGGREGATE or RANK which perhaps you would might have thought so, unless I am missing something here?


Thanks in advance,
Jignesh
===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Dynamic way to create quartiles

David Marso
Administrator
Yes you must be missing something!
RANK (NTILES) followed by AGGREGATE using Max?
Perhaps post your failed attempts?
--
Jignesh Sutar wrote
Hi all,

Is there not a dynamic way to group cases into quartiles (or any number of
ntiles)?

I am aware of visual binning but I want to have a dynamic syntax to be able
to group cases on different datasets rather than have to hard code actual
cut off points?

There does not seem to be a direct way of doing so using either AGGREGATE
or RANK which perhaps you would might have thought so, unless I am missing
something here?


Thanks in advance,
Jignesh

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|

Re: Dynamic way to create quartiles

Jignesh Sutar
ahh yes of course! My mind was looking for a one liner solution but I'll settle for a two liner!

On 9 October 2014 12:02, David Marso <[hidden email]> wrote:
Yes you must be missing something!
RANK (NTILES) followed by AGGREGATE using Max?
Perhaps post your failed attempts?
--

Jignesh Sutar wrote
> Hi all,
>
> Is there not a dynamic way to group cases into quartiles (or any number of
> ntiles)?
>
> I am aware of visual binning but I want to have a dynamic syntax to be
> able
> to group cases on different datasets rather than have to hard code actual
> cut off points?
>
> There does not seem to be a direct way of doing so using either AGGREGATE
> or RANK which perhaps you would might have thought so, unless I am missing
> something here?
>
>
> Thanks in advance,
> Jignesh
>
> =====================
> To manage your subscription to SPSSX-L, send a message to

> LISTSERV@.UGA

>  (not to SPSSX-L), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD





-----
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Dynamic-way-to-create-quartiles-tp5727552p5727553.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Dynamic way to create quartiles

Jignesh Sutar
Are there any extension/modules that replicate Visual Binning? Specifically I am wanting to bin a variable into NTILES. Visually binning (GUI) generates the recode specification and value labels to achieve this. However I need to do this over multiple variables so would like to generalize the code so that it can be done programmatically rather than having to static/fixed definitions. And also so that I can set something up on interim data which would also then be valid on final or any updated data.


I could use RANK to create the NTILES but then there is no way of attributing the correct cut off point labels.


Visual binning creates non-breaking labels whereas if I was to read the minimum/maximum value from each of the bins created from RANK I may not be able to deduce the actual bin start and end points.

On 9 October 2014 at 12:08, Jignesh Sutar <[hidden email]> wrote:
ahh yes of course! My mind was looking for a one liner solution but I'll settle for a two liner!

On 9 October 2014 12:02, David Marso <[hidden email]> wrote:
Yes you must be missing something!
RANK (NTILES) followed by AGGREGATE using Max?
Perhaps post your failed attempts?
--

Jignesh Sutar wrote
> Hi all,
>
> Is there not a dynamic way to group cases into quartiles (or any number of
> ntiles)?
>
> I am aware of visual binning but I want to have a dynamic syntax to be
> able
> to group cases on different datasets rather than have to hard code actual
> cut off points?
>
> There does not seem to be a direct way of doing so using either AGGREGATE
> or RANK which perhaps you would might have thought so, unless I am missing
> something here?
>
>
> Thanks in advance,
> Jignesh
>
> =====================
> To manage your subscription to SPSSX-L, send a message to

> LISTSERV@.UGA

>  (not to SPSSX-L), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD





-----
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Dynamic-way-to-create-quartiles-tp5727552p5727553.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD


===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Dynamic way to create quartiles

Jignesh Sutar
Also, Visual Binning vs. RANK seem to give different results (I am specifying same number of tiles in each method. Is there some difference in their calculation I am missing?):

dataset close all.
new file.
output close all.

data list list /id Q1.
BEGIN DATA
1 3
2 10
3 40
4 50
5 50
6 60
7 70
8 80
9 80
10 80
11 100
12 100
13 100
14 100
15 100
16 100
17 100
18 100
19 110
20 120
21 120
22 150
23 150
24 150
25 150
26 150
27 150
28 150
29 150
30 150
31 160
32 180
33 180
34 180
35 200
36 200
37 200
38 200
39 200
40 200
41 200
42 200
43 200
44 200
45 250
46 250
47 250
48 250
49 250
50 250
51 270
52 270
53 300
54 300
55 300
56 300
57 300
58 300
59 300
60 300
61 400
62 500
63 500
64 600
65 900
end data.
variable level Q1 (scale).

* Visual Binning - Version 1 Upper end points Included (<=).
*Q1.
RECODE  Q1 (MISSING=COPY) (LO THRU 100=1) (LO THRU 150=2) (LO THRU 200=3) (LO THRU 285=4) (LO THRU 
    HI=5) (ELSE=SYSMIS) INTO Q1Bin5V1.
VARIABLE LABELS  Q1Bin5V1 'Q1 (Binned)'.
FORMATS  Q1Bin5V1 (F5.0).
VALUE LABELS  Q1Bin5V1 1 '<= 100.00' 2 '101.00 - 150.00' 3 '151.00 - 200.00' 4 '201.00 - 285.00' 5 '286.00+'.
VARIABLE LEVEL  Q1Bin5V1 (ORDINAL).


* Visual Binning - Version 2 Upper end points Excluded (<).
*Q1.
RECODE  Q1 (MISSING=COPY) (285 THRU HI=5) (200 THRU HI=4) (150 THRU HI=3) (100 THRU HI=2) (LO THRU 
    HI=1) (ELSE=SYSMIS) INTO Q1Bin5V2.
VARIABLE LABELS  Q1Bin5V2 'Q1 (Binned)'.
FORMATS  Q1Bin5V2 (F5.0).
VALUE LABELS  Q1Bin5V2 1 '< 100.00' 2 '100.00 - 149.00' 3 '150.00 - 199.00' 4 '200.00 - 284.00' 5 '285.00+'.
VARIABLE LEVEL  Q1Bin5V2 (ORDINAL).


rank variables = Q1 /rank /ntile(5) into Q1Bin5_2.

cro Q1Bin5V1 Q1Bin5V2 by Q1Bin5_2.

On 26 November 2014 at 17:38, Jignesh Sutar <[hidden email]> wrote:
Are there any extension/modules that replicate Visual Binning? Specifically I am wanting to bin a variable into NTILES. Visually binning (GUI) generates the recode specification and value labels to achieve this. However I need to do this over multiple variables so would like to generalize the code so that it can be done programmatically rather than having to static/fixed definitions. And also so that I can set something up on interim data which would also then be valid on final or any updated data.


I could use RANK to create the NTILES but then there is no way of attributing the correct cut off point labels.


Visual binning creates non-breaking labels whereas if I was to read the minimum/maximum value from each of the bins created from RANK I may not be able to deduce the actual bin start and end points.

On 9 October 2014 at 12:08, Jignesh Sutar <[hidden email]> wrote:
ahh yes of course! My mind was looking for a one liner solution but I'll settle for a two liner!

On 9 October 2014 12:02, David Marso <[hidden email]> wrote:
Yes you must be missing something!
RANK (NTILES) followed by AGGREGATE using Max?
Perhaps post your failed attempts?
--

Jignesh Sutar wrote
> Hi all,
>
> Is there not a dynamic way to group cases into quartiles (or any number of
> ntiles)?
>
> I am aware of visual binning but I want to have a dynamic syntax to be
> able
> to group cases on different datasets rather than have to hard code actual
> cut off points?
>
> There does not seem to be a direct way of doing so using either AGGREGATE
> or RANK which perhaps you would might have thought so, unless I am missing
> something here?
>
>
> Thanks in advance,
> Jignesh
>
> =====================
> To manage your subscription to SPSSX-L, send a message to

> LISTSERV@.UGA

>  (not to SPSSX-L), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD





-----
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Dynamic-way-to-create-quartiles-tp5727552p5727553.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD



===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Dynamic way to create quartiles

Rick Oliver-3
Rank has a number of different methods for dealing with ties.

Rick Oliver
Senior Information Developer
IBM Business Analytics (SPSS)
E-mail: [hidden email]




From:        Jignesh Sutar <[hidden email]>
To:        [hidden email]
Date:        11/26/2014 01:14 PM
Subject:        Re: Dynamic way to create quartiles
Sent by:        "SPSSX(r) Discussion" <[hidden email]>




Also, Visual Binning vs. RANK seem to give different results (I am specifying same number of tiles in each method. Is there some difference in their calculation I am missing?):

dataset close all.
new file.
output close all.

data list list /id Q1.
BEGIN DATA
1 3
2 10
3 40
4 50
5 50
6 60
7 70
8 80
9 80
10 80
11 100
12 100
13 100
14 100
15 100
16 100
17 100
18 100
19 110
20 120
21 120
22 150
23 150
24 150
25 150
26 150
27 150
28 150
29 150
30 150
31 160
32 180
33 180
34 180
35 200
36 200
37 200
38 200
39 200
40 200
41 200
42 200
43 200
44 200
45 250
46 250
47 250
48 250
49 250
50 250
51 270
52 270
53 300
54 300
55 300
56 300
57 300
58 300
59 300
60 300
61 400
62 500
63 500
64 600
65 900
end data.
variable level Q1 (scale).

* Visual Binning - Version 1 Upper end points Included (<=).
*Q1.
RECODE  Q1 (MISSING=COPY) (LO THRU 100=1) (LO THRU 150=2) (LO THRU 200=3) (LO THRU 285=4) (LO THRU 
    HI=5) (ELSE=SYSMIS) INTO Q1Bin5V1.
VARIABLE LABELS  Q1Bin5V1 'Q1 (Binned)'.
FORMATS  Q1Bin5V1 (F5.0).
VALUE LABELS  Q1Bin5V1 1 '<= 100.00' 2 '101.00 - 150.00' 3 '151.00 - 200.00' 4 '201.00 - 285.00' 5 '286.00+'.
VARIABLE LEVEL  Q1Bin5V1 (ORDINAL).


* Visual Binning - Version 2 Upper end points Excluded (<).
*Q1.
RECODE  Q1 (MISSING=COPY) (285 THRU HI=5) (200 THRU HI=4) (150 THRU HI=3) (100 THRU HI=2) (LO THRU 
    HI=1) (ELSE=SYSMIS) INTO Q1Bin5V2.
VARIABLE LABELS  Q1Bin5V2 'Q1 (Binned)'.
FORMATS  Q1Bin5V2 (F5.0).
VALUE LABELS  Q1Bin5V2 1 '< 100.00' 2 '100.00 - 149.00' 3 '150.00 - 199.00' 4 '200.00 - 284.00' 5 '285.00+'.
VARIABLE LEVEL  Q1Bin5V2 (ORDINAL).


rank variables = Q1 /rank /ntile(5) into Q1Bin5_2.

cro Q1Bin5V1 Q1Bin5V2 by Q1Bin5_2.

On 26 November 2014 at 17:38, Jignesh Sutar <jsutar@...> wrote:
Are there any extension/modules that replicate Visual Binning? Specifically I am wanting to bin a variable into NTILES. Visually binning (GUI) generates the recode specification and value labels to achieve this. However I need to do this over multiple variables so would like to generalize the code so that it can be done programmatically rather than having to static/fixed definitions. And also so that I can set something up on interim data which would also then be valid on final or any updated data.


I could use RANK to create the NTILES but then there is no way of attributing the correct cut off point labels.


Visual binning creates non-breaking labels whereas if I was to read the minimum/maximum value from each of the bins created from RANK I may not be able to deduce the actual bin start and end points.


On 9 October 2014 at 12:08, Jignesh Sutar <jsutar@...> wrote:
ahh yes of course! My mind was looking for a one liner solution but I'll settle for a two liner!

On 9 October 2014 12:02, David Marso <david.marso@...> wrote:
Yes you must be missing something!
RANK (NTILES) followed by AGGREGATE using Max?
Perhaps post your failed attempts?
--

Jignesh Sutar wrote

> Hi all,
>
> Is there not a dynamic way to group cases into quartiles (or any number of
> ntiles)?
>
> I am aware of visual binning but I want to have a dynamic syntax to be
> able
> to group cases on different datasets rather than have to hard code actual
> cut off points?
>
> There does not seem to be a direct way of doing so using either AGGREGATE
> or RANK which perhaps you would might have thought so, unless I am missing
> something here?
>
>
> Thanks in advance,
> Jignesh
>

> =====================
> To manage your subscription to SPSSX-L, send a message to

> LISTSERV@.UGA

>  (not to SPSSX-L), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD





-----
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
--
View this message in context:
http://spssx-discussion.1045642.n5.nabble.com/Dynamic-way-to-create-quartiles-tp5727552p5727553.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to

LISTSERV@... (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD




===================== To manage your subscription to SPSSX-L, send a message to LISTSERV@... (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Dynamic way to create quartiles

Jignesh Sutar
They do. But the options I am specifying here ought to be returning the same results, no?

On 26 November 2014 at 19:28, Rick Oliver <[hidden email]> wrote:
Rank has a number of different methods for dealing with ties.

Rick Oliver
Senior Information Developer
IBM Business Analytics (SPSS)
E-mail: [hidden email]




From:        Jignesh Sutar <[hidden email]>
To:        [hidden email]
Date:        11/26/2014 01:14 PM
Subject:        Re: Dynamic way to create quartiles
Sent by:        "SPSSX(r) Discussion" <[hidden email]>




Also, Visual Binning vs. RANK seem to give different results (I am specifying same number of tiles in each method. Is there some difference in their calculation I am missing?):

dataset close all.
new file.
output close all.

data list list /id Q1.
BEGIN DATA
1 3
2 10
3 40
4 50
5 50
6 60
7 70
8 80
9 80
10 80
11 100
12 100
13 100
14 100
15 100
16 100
17 100
18 100
19 110
20 120
21 120
22 150
23 150
24 150
25 150
26 150
27 150
28 150
29 150
30 150
31 160
32 180
33 180
34 180
35 200
36 200
37 200
38 200
39 200
40 200
41 200
42 200
43 200
44 200
45 250
46 250
47 250
48 250
49 250
50 250
51 270
52 270
53 300
54 300
55 300
56 300
57 300
58 300
59 300
60 300
61 400
62 500
63 500
64 600
65 900
end data.
variable level Q1 (scale).

* Visual Binning - Version 1 Upper end points Included (<=).
*Q1.
RECODE  Q1 (MISSING=COPY) (LO THRU 100=1) (LO THRU 150=2) (LO THRU 200=3) (LO THRU 285=4) (LO THRU 
    HI=5) (ELSE=SYSMIS) INTO Q1Bin5V1.
VARIABLE LABELS  Q1Bin5V1 'Q1 (Binned)'.
FORMATS  Q1Bin5V1 (F5.0).
VALUE LABELS  Q1Bin5V1 1 '<= 100.00' 2 '101.00 - 150.00' 3 '151.00 - 200.00' 4 '201.00 - 285.00' 5 '286.00+'.
VARIABLE LEVEL  Q1Bin5V1 (ORDINAL).


* Visual Binning - Version 2 Upper end points Excluded (<).
*Q1.
RECODE  Q1 (MISSING=COPY) (285 THRU HI=5) (200 THRU HI=4) (150 THRU HI=3) (100 THRU HI=2) (LO THRU 
    HI=1) (ELSE=SYSMIS) INTO Q1Bin5V2.
VARIABLE LABELS  Q1Bin5V2 'Q1 (Binned)'.
FORMATS  Q1Bin5V2 (F5.0).
VALUE LABELS  Q1Bin5V2 1 '< 100.00' 2 '100.00 - 149.00' 3 '150.00 - 199.00' 4 '200.00 - 284.00' 5 '285.00+'.
VARIABLE LEVEL  Q1Bin5V2 (ORDINAL).


rank variables = Q1 /rank /ntile(5) into Q1Bin5_2.

cro Q1Bin5V1 Q1Bin5V2 by Q1Bin5_2.

On 26 November 2014 at 17:38, Jignesh Sutar <[hidden email]> wrote:
Are there any extension/modules that replicate Visual Binning? Specifically I am wanting to bin a variable into NTILES. Visually binning (GUI) generates the recode specification and value labels to achieve this. However I need to do this over multiple variables so would like to generalize the code so that it can be done programmatically rather than having to static/fixed definitions. And also so that I can set something up on interim data which would also then be valid on final or any updated data.


I could use RANK to create the NTILES but then there is no way of attributing the correct cut off point labels.


Visual binning creates non-breaking labels whereas if I was to read the minimum/maximum value from each of the bins created from RANK I may not be able to deduce the actual bin start and end points.


On 9 October 2014 at 12:08, Jignesh Sutar <[hidden email]> wrote:
ahh yes of course! My mind was looking for a one liner solution but I'll settle for a two liner!

On 9 October 2014 12:02, David Marso <[hidden email]> wrote:
Yes you must be missing something!
RANK (NTILES) followed by AGGREGATE using Max?
Perhaps post your failed attempts?
--

Jignesh Sutar wrote

> Hi all,
>
> Is there not a dynamic way to group cases into quartiles (or any number of
> ntiles)?
>
> I am aware of visual binning but I want to have a dynamic syntax to be
> able
> to group cases on different datasets rather than have to hard code actual
> cut off points?
>
> There does not seem to be a direct way of doing so using either AGGREGATE
> or RANK which perhaps you would might have thought so, unless I am missing
> something here?
>
>
> Thanks in advance,
> Jignesh
>

> =====================
> To manage your subscription to SPSSX-L, send a message to

> LISTSERV@.UGA

>  (not to SPSSX-L), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD





-----
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
--
View this message in context:
http://spssx-discussion.1045642.n5.nabble.com/Dynamic-way-to-create-quartiles-tp5727552p5727553.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to

[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD




===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Dynamic way to create quartiles

Rick Oliver-3
I think you'll find that visual binning is equivalent to TIES=LOW with RANK.

rank variables = Q1 /ntile(5) into Q1Bin5_2 /ties=low.
crosstabs Q1Bin5V1 by Q1Bin5_2.


Rick Oliver
Senior Information Developer
IBM Business Analytics (SPSS)
E-mail: [hidden email]




From:        Jignesh Sutar <[hidden email]>
To:        [hidden email]
Date:        11/26/2014 02:24 PM
Subject:        Re: Dynamic way to create quartiles
Sent by:        "SPSSX(r) Discussion" <[hidden email]>




They do. But the options I am specifying here ought to be returning the same results, no?

On 26 November 2014 at 19:28, Rick Oliver <oliverr@...> wrote:
Rank has a number of different methods for dealing with ties.

Rick Oliver
Senior Information Developer
IBM Business Analytics (SPSS)
E-mail:
oliverr@...



From:        
Jignesh Sutar <jsutar@...>
To:        
[hidden email]
Date:        
11/26/2014 01:14 PM
Subject:        
Re: Dynamic way to create quartiles
Sent by:        
"SPSSX(r) Discussion" <[hidden email]>





Also, Visual Binning vs. RANK seem to give different results (I am specifying same number of tiles in each method. Is there some difference in their calculation I am missing?):


dataset close all.

new file.

output close all.


data list list /id Q1.

BEGIN DATA

1 3

2 10

3 40

4 50

5 50

6 60

7 70

8 80

9 80

10 80

11 100

12 100

13 100

14 100

15 100

16 100

17 100

18 100

19 110

20 120

21 120

22 150

23 150

24 150

25 150

26 150

27 150

28 150

29 150

30 150

31 160

32 180

33 180

34 180

35 200

36 200

37 200

38 200

39 200

40 200

41 200

42 200

43 200

44 200

45 250

46 250

47 250

48 250

49 250

50 250

51 270

52 270

53 300

54 300

55 300

56 300

57 300

58 300

59 300

60 300

61 400

62 500

63 500

64 600

65 900

end data.

variable level Q1 (scale).


* Visual Binning - Version 1 Upper end points Included (<=).

*Q1.

RECODE  Q1 (MISSING=COPY) (LO THRU 100=1) (LO THRU 150=2) (LO THRU 200=3) (LO THRU 285=4) (LO THRU 

    HI=5) (ELSE=SYSMIS) INTO Q1Bin5V1.

VARIABLE LABELS  Q1Bin5V1 'Q1 (Binned)'.

FORMATS  Q1Bin5V1 (F5.0).

VALUE LABELS  Q1Bin5V1 1 '<= 100.00' 2 '101.00 - 150.00' 3 '151.00 - 200.00' 4 '201.00 - 285.00' 5 '286.00+'.

VARIABLE LEVEL  Q1Bin5V1 (ORDINAL).



* Visual Binning - Version 2 Upper end points Excluded (<).

*Q1.

RECODE  Q1 (MISSING=COPY) (285 THRU HI=5) (200 THRU HI=4) (150 THRU HI=3) (100 THRU HI=2) (LO THRU 

    HI=1) (ELSE=SYSMIS) INTO Q1Bin5V2.

VARIABLE LABELS  Q1Bin5V2 'Q1 (Binned)'.

FORMATS  Q1Bin5V2 (F5.0).

VALUE LABELS  Q1Bin5V2 1 '< 100.00' 2 '100.00 - 149.00' 3 '150.00 - 199.00' 4 '200.00 - 284.00' 5 '285.00+'.

VARIABLE LEVEL  Q1Bin5V2 (ORDINAL).



rank variables = Q1 /rank /ntile(5) into Q1Bin5_2.


cro Q1Bin5V1 Q1Bin5V2 by Q1Bin5_2.


On 26 November 2014 at 17:38, Jignesh Sutar <
jsutar@...> wrote:
Are there any extension/modules that replicate Visual Binning? Specifically I am wanting to bin a variable into NTILES. Visually binning (GUI) generates the recode specification and value labels to achieve this. However I need to do this over multiple variables so would like to generalize the code so that it can be done programmatically rather than having to static/fixed definitions. And also so that I can set something up on interim data which would also then be valid on final or any updated data.


I could use RANK to create the NTILES but then there is no way of attributing the correct cut off point labels.


Visual binning creates non-breaking labels whereas if I was to read the minimum/maximum value from each of the bins created from RANK I may not be able to deduce the actual bin start and end points.

On 9 October 2014 at 12:08, Jignesh Sutar <
jsutar@...> wrote:
ahh yes of course! My mind was looking for a one liner solution but I'll settle for a two liner!

On 9 October 2014 12:02, David Marso <
david.marso@...> wrote:
Yes you must be missing something!
RANK (NTILES) followed by AGGREGATE using Max?
Perhaps post your failed attempts?
--

Jignesh Sutar wrote
> Hi all,
>
> Is there not a dynamic way to group cases into quartiles (or any number of
> ntiles)?
>
> I am aware of visual binning but I want to have a dynamic syntax to be
> able
> to group cases on different datasets rather than have to hard code actual
> cut off points?
>
> There does not seem to be a direct way of doing so using either AGGREGATE
> or RANK which perhaps you would might have thought so, unless I am missing
> something here?
>
>
> Thanks in advance,
> Jignesh
>
> =====================
> To manage your subscription to SPSSX-L, send a message to

> LISTSERV@.UGA

>  (not to SPSSX-L), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD





-----
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
--
View this message in context:
http://spssx-discussion.1045642.n5.nabble.com/Dynamic-way-to-create-quartiles-tp5727552p5727553.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to

LISTSERV@... (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD



===================== To manage your subscription to SPSSX-L, send a message to
LISTSERV@... (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

===================== To manage your subscription to SPSSX-L, send a message to LISTSERV@... (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Dynamic way to create quartiles

Jignesh Sutar
Brilliant, thanks Rick. I'm probably going to have to setup a python solution to be able to set the labelling. 

On Wednesday, 26 November 2014, Rick Oliver <[hidden email]> wrote:
I think you'll find that visual binning is equivalent to TIES=LOW with RANK.

rank variables = Q1 /ntile(5) into Q1Bin5_2 /ties=low.
crosstabs Q1Bin5V1 by Q1Bin5_2.


Rick Oliver
Senior Information Developer
IBM Business Analytics (SPSS)
E-mail: <a href="javascript:_e(%7B%7D,&#39;cvml&#39;,&#39;oliverr@us.ibm.com&#39;);" target="_blank">oliverr@...




From:        Jignesh Sutar <<a href="javascript:_e(%7B%7D,&#39;cvml&#39;,&#39;jsutar@gmail.com&#39;);" target="_blank">jsutar@...>
To:        <a href="javascript:_e(%7B%7D,&#39;cvml&#39;,&#39;SPSSX-L@LISTSERV.UGA.EDU&#39;);" target="_blank">SPSSX-L@...
Date:        11/26/2014 02:24 PM
Subject:        Re: Dynamic way to create quartiles
Sent by:        "SPSSX(r) Discussion" <<a href="javascript:_e(%7B%7D,&#39;cvml&#39;,&#39;SPSSX-L@LISTSERV.UGA.EDU&#39;);" target="_blank">SPSSX-L@...>




They do. But the options I am specifying here ought to be returning the same results, no?

On 26 November 2014 at 19:28, Rick Oliver <<a href="javascript:_e(%7B%7D,&#39;cvml&#39;,&#39;oliverr@us.ibm.com&#39;);" target="_blank">oliverr@...> wrote:
Rank has a number of different methods for dealing with ties.

Rick Oliver
Senior Information Developer
IBM Business Analytics (SPSS)
E-mail:
<a href="javascript:_e(%7B%7D,&#39;cvml&#39;,&#39;oliverr@us.ibm.com&#39;);" target="_blank">oliverr@...



From:        
Jignesh Sutar <<a href="javascript:_e(%7B%7D,&#39;cvml&#39;,&#39;jsutar@gmail.com&#39;);" target="_blank">jsutar@...>
To:        
<a href="javascript:_e(%7B%7D,&#39;cvml&#39;,&#39;SPSSX-L@LISTSERV.UGA.EDU&#39;);" target="_blank">SPSSX-L@...
Date:        
11/26/2014 01:14 PM
Subject:        
Re: Dynamic way to create quartiles
Sent by:        
"SPSSX(r) Discussion" <<a href="javascript:_e(%7B%7D,&#39;cvml&#39;,&#39;SPSSX-L@LISTSERV.UGA.EDU&#39;);" target="_blank">SPSSX-L@...>





Also, Visual Binning vs. RANK seem to give different results (I am specifying same number of tiles in each method. Is there some difference in their calculation I am missing?):


dataset close all.

new file.

output close all.


data list list /id Q1.

BEGIN DATA

1 3

2 10

3 40

4 50

5 50

6 60

7 70

8 80

9 80

10 80

11 100

12 100

13 100

14 100

15 100

16 100

17 100

18 100

19 110

20 120

21 120

22 150

23 150

24 150

25 150

26 150

27 150

28 150

29 150

30 150

31 160

32 180

33 180

34 180

35 200

36 200

37 200

38 200

39 200

40 200

41 200

42 200

43 200

44 200

45 250

46 250

47 250

48 250

49 250

50 250

51 270

52 270

53 300

54 300

55 300

56 300

57 300

58 300

59 300

60 300

61 400

62 500

63 500

64 600

65 900

end data.

variable level Q1 (scale).


* Visual Binning - Version 1 Upper end points Included (<=).

*Q1.

RECODE  Q1 (MISSING=COPY) (LO THRU 100=1) (LO THRU 150=2) (LO THRU 200=3) (LO THRU 285=4) (LO THRU 

    HI=5) (ELSE=SYSMIS) INTO Q1Bin5V1.

VARIABLE LABELS  Q1Bin5V1 'Q1 (Binned)'.

FORMATS  Q1Bin5V1 (F5.0).

VALUE LABELS  Q1Bin5V1 1 '<= 100.00' 2 '101.00 - 150.00' 3 '151.00 - 200.00' 4 '201.00 - 285.00' 5 '286.00+'.

VARIABLE LEVEL  Q1Bin5V1 (ORDINAL).



* Visual Binning - Version 2 Upper end points Excluded (<).

*Q1.

RECODE  Q1 (MISSING=COPY) (285 THRU HI=5) (200 THRU HI=4) (150 THRU HI=3) (100 THRU HI=2) (LO THRU 

    HI=1) (ELSE=SYSMIS) INTO Q1Bin5V2.

VARIABLE LABELS  Q1Bin5V2 'Q1 (Binned)'.

FORMATS  Q1Bin5V2 (F5.0).

VALUE LABELS  Q1Bin5V2 1 '< 100.00' 2 '100.00 - 149.00' 3 '150.00 - 199.00' 4 '200.00 - 284.00' 5 '285.00+'.

VARIABLE LEVEL  Q1Bin5V2 (ORDINAL).



rank variables = Q1 /rank /ntile(5) into Q1Bin5_2.


cro Q1Bin5V1 Q1Bin5V2 by Q1Bin5_2.


On 26 November 2014 at 17:38, Jignesh Sutar <
<a href="javascript:_e(%7B%7D,&#39;cvml&#39;,&#39;jsutar@gmail.com&#39;);" target="_blank">jsutar@...> wrote:
Are there any extension/modules that replicate Visual Binning? Specifically I am wanting to bin a variable into NTILES. Visually binning (GUI) generates the recode specification and value labels to achieve this. However I need to do this over multiple variables so would like to generalize the code so that it can be done programmatically rather than having to static/fixed definitions. And also so that I can set something up on interim data which would also then be valid on final or any updated data.


I could use RANK to create the NTILES but then there is no way of attributing the correct cut off point labels.


Visual binning creates non-breaking labels whereas if I was to read the minimum/maximum value from each of the bins created from RANK I may not be able to deduce the actual bin start and end points.

On 9 October 2014 at 12:08, Jignesh Sutar <
<a href="javascript:_e(%7B%7D,&#39;cvml&#39;,&#39;jsutar@gmail.com&#39;);" target="_blank">jsutar@...> wrote:
ahh yes of course! My mind was looking for a one liner solution but I'll settle for a two liner!

On 9 October 2014 12:02, David Marso <
<a href="javascript:_e(%7B%7D,&#39;cvml&#39;,&#39;david.marso@gmail.com&#39;);" target="_blank">david.marso@...> wrote:
Yes you must be missing something!
RANK (NTILES) followed by AGGREGATE using Max?
Perhaps post your failed attempts?
--

Jignesh Sutar wrote
> Hi all,
>
> Is there not a dynamic way to group cases into quartiles (or any number of
> ntiles)?
>
> I am aware of visual binning but I want to have a dynamic syntax to be
> able
> to group cases on different datasets rather than have to hard code actual
> cut off points?
>
> There does not seem to be a direct way of doing so using either AGGREGATE
> or RANK which perhaps you would might have thought so, unless I am missing
> something here?
>
>
> Thanks in advance,
> Jignesh
>
> =====================
> To manage your subscription to SPSSX-L, send a message to

> LISTSERV@.UGA

>  (not to SPSSX-L), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD





-----
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
--
View this message in context:
http://spssx-discussion.1045642.n5.nabble.com/Dynamic-way-to-create-quartiles-tp5727552p5727553.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to

<a href="javascript:_e(%7B%7D,&#39;cvml&#39;,&#39;LISTSERV@LISTSERV.UGA.EDU&#39;);" target="_blank">LISTSERV@... (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD



===================== To manage your subscription to SPSSX-L, send a message to
<a href="javascript:_e(%7B%7D,&#39;cvml&#39;,&#39;LISTSERV@LISTSERV.UGA.EDU&#39;);" target="_blank">LISTSERV@... (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

===================== To manage your subscription to SPSSX-L, send a message to <a href="javascript:_e(%7B%7D,&#39;cvml&#39;,&#39;LISTSERV@LISTSERV.UGA.EDU&#39;);" target="_blank">LISTSERV@... (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD