Add percentage or proportion to dataset

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

Add percentage or proportion to dataset

Ananda Victoria
 

Hi

I have a problem to solve and I hope you can help me

I have a big data set and I have shown below a small sample to get the spss syntax.

Except the gender (numeric) other variables are stringing.

School_no       Year            subject      gender             Male           Female

1                          2010            Maths        2                  0.6               0

1                          2010            Maths        2                  0.6               0

1                          2010            Maths        2                  0.6               0

1                          2010            Maths        1                  0                   0.4

1                          2010            Maths        1                  0                   0.4

1                          2010            english       2                  0.5               0

1                          2010            english       2                  0.5               0

1                          2010            english       2                  0.5               0

1                          2010            english       1                  0                   0.5

1                          2010            english       1                  0                   0.5

1                          2010            english       1                  0                   0.5

1                          2011            Maths        2                  0.66             0

1                          2011            Maths        2                  0.66             0

1                          2011            Maths        1                  0                   0.33

1                          2011            english       2                  0.5               0

1                          2011            english       2                  0.5               0

1                          2011            english       1                  0                   0.5

1                          2011            english       1                  0                   0.5

2                          2010            Maths        2                  1                   0

2                          2010            Maths        2                  1                   0

2                          2010            english       2                  0.5               0

2                          2010            english       1                  0                   0.5

 

I need to get proportion (if possible percentage too) of gender(male=1, female=2) for a particular school_no,per year, per subject per each line. So the proportion will be same for same (School_no , Year , subject,  gender).

 

I need to create 2 columns(One with Female and one with male as shown above. I did this in excel to show the outcome )

I hope someone can help me with the script to get Male and fFmale columns from the dataset.

Thanks

Ananda

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Add percentage or proportion to dataset

Andy W
See the AGGREGATE command. I'm confused by your example though, why in the first line is Male 0.6 and Female 0? (I would expect female would be 0.4 per your criteria.) See example below.

**********************************************.
DATA LIST FREE / School_no  (F1.0) Year (F4.0) subject (A10) gender (F1.0) Male Female (2F2.1).
BEGIN DATA
1  2010 Maths    2 0.6   0
1  2010 Maths    2 0.6   0
1  2010 Maths    2 0.6   0
1  2010 Maths    1 0     0.4
1  2010 Maths    1 0     0.4
1  2010 english  2 0.5   0
1  2010 english  2 0.5   0
1  2010 english  2 0.5   0
1  2010 english  1 0     0.5
1  2010 english  1 0     0.5
1  2010 english  1 0     0.5
1  2011 Maths    2 0.66  0
1  2011 Maths    2 0.66  0
1  2011 Maths    1 0     0.33
1  2011 english  2 0.5   0
1  2011 english  2 0.5   0
1  2011 english  1 0     0.5
1  2011 english  1 0     0.5
2  2010 Maths    2 1     0
2  2010 Maths    2 1     0
2  2010 english  2 0.5   0
2  2010 english  1 0     0.5
END DATA.

AGGREGATE OUTFILE=* MODE=ADDVARIABLES
  /BREAK School_no Year subject
  /Male_Per = FGT(gender,1)
  /Female_Per = FLT(gender,2).
**********************************************.
Andy W
apwheele@gmail.com
http://andrewpwheeler.wordpress.com/
Reply | Threaded
Open this post in threaded view
|

Re: Add percentage or proportion to dataset

Ananda Victoria

Hi Andy

Thanks for your help Andy.

The reason for 0 in female column is in that,  it has female name etc. So can’t put the male proportion in the female column in the same row and need to put the proportion in Female column same row. (Gender 2=male and 1= female).

1.       How can I put zero in the line if not male or female in the row

2.       How can I have new variable called ‘Gender_2 and put both proportion in the same line (Female and male)?

 

Thanks for your help in advance

 

Ananda


> Date: Wed, 25 Nov 2015 05:47:00 -0700

> From: [hidden email]
> Subject: Re: Add percentage or proportion to dataset
> To: [hidden email]
>
> See the AGGREGATE command. I'm confused by your example though, why in the
> first line is Male 0.6 and Female 0? (I would expect female would be 0.4 per
> your criteria.) See example below.
>
> **********************************************.
> DATA LIST FREE / School_no (F1.0) Year (F4.0) subject (A10) gender (F1.0)
> Male Female (2F2.1).
> BEGIN DATA
> 1 2010 Maths 2 0.6 0
> 1 2010 Maths 2 0.6 0
> 1 2010 Maths 2 0.6 0
> 1 2010 Maths 1 0 0.4
> 1 2010 Maths 1 0 0.4
> 1 2010 english 2 0.5 0
> 1 2010 english 2 0.5 0
> 1 2010 english 2 0.5 0
> 1 2010 english 1 0 0.5
> 1 2010 english 1 0 0.5
> 1 2010 english 1 0 0.5
> 1 2011 Maths 2 0.66 0
> 1 2011 Maths 2 0.66 0
> 1 2011 Maths 1 0 0.33
> 1 2011 english 2 0.5 0
> 1 2011 english 2 0.5 0
> 1 2011 english 1 0 0.5
> 1 2011 english 1 0 0.5
> 2 2010 Maths 2 1 0
> 2 2010 Maths 2 1 0
> 2 2010 english 2 0.5 0
> 2 2010 english 1 0 0.5
> END DATA.
>
> AGGREGATE OUTFILE=* MODE=ADDVARIABLES
> /BREAK School_no Year subject
> /Male_Per = FGT(gender,1)
> /Female_Per = FLT(gender,2).
> **********************************************.
>
>
>
> -----
> Andy W
> [hidden email]
> http://andrewpwheeler.wordpress.com/
> --
> View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Add-percentage-or-proportion-to-dataset-tp5731050p5731052.html
> Sent from the SPSSX Discussion mailing list archive at Nabble.com.
>
> =====================
> To manage your subscription to SPSSX-L, send a message to
> [hidden email] (not to SPSSX-L), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD
===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Add percentage or proportion to dataset

PRogman
Your presented data structure is confusing; Variable Subject is not further increasing the data resolution as it is the same for each combination of school and year.  I miss a subject level variable or similar.
As one of Male and Female always is 0 a quick way to (almost) get a long data format is
COMPUTE Proportion = Male + Female.
EXECUTE.
Variables Male and Female are now redundant, but then you cannot know which cases to combine to get male and female proportions. There are 5 cases of '1  2010 Maths' and 3 cases of '2  2010 Maths'.

Another way, if every case is a separate case:
IF (Male=0)    M_prop   = 1 - Female.
IF (Female=0)  F_prop = 1 - Male.
EXECUTE.
this makes variable gender redundant (and is equivalent to the solution by Andy W)

HTH, PRogman

Aoba Victoria wrote
Hi Andy

Thanks for your help Andy.

The reason for 0 in female column is in that,  it has female name etc. So can’t put the male proportion
in the female column in the same row and need to put the proportion in Female column
same row. (Gender 2=male and 1= female).

1.      
How can I put zero in the line if not male or
female in the row

2.      
How can I have new variable called ‘Gender_2 and
put both proportion in the same line (Female and male)?

Thanks for your help in advance
Ananda

> Date: Wed, 25 Nov 2015 05:47:00 -0700
> From: [hidden email]
> Subject: Re: Add percentage or proportion to dataset
> To: [hidden email]
>
> See the AGGREGATE command. I'm confused by your example though, why in the
> first line is Male 0.6 and Female 0? (I would expect female would be 0.4 per
> your criteria.) See example below.
>
> **********************************************.
> DATA LIST FREE / School_no  (F1.0) Year (F4.0) subject (A10) gender (F1.0)
> Male Female (2F2.1).
> BEGIN DATA
> 1  2010 Maths    2 0.6   0
> 1  2010 Maths    2 0.6   0
> 1  2010 Maths    2 0.6   0
> 1  2010 Maths    1 0     0.4
> 1  2010 Maths    1 0     0.4
> 1  2010 english  2 0.5   0
> 1  2010 english  2 0.5   0
> 1  2010 english  2 0.5   0
> 1  2010 english  1 0     0.5
> 1  2010 english  1 0     0.5
> 1  2010 english  1 0     0.5
> 1  2011 Maths    2 0.66  0
> 1  2011 Maths    2 0.66  0
> 1  2011 Maths    1 0     0.33
> 1  2011 english  2 0.5   0
> 1  2011 english  2 0.5   0
> 1  2011 english  1 0     0.5
> 1  2011 english  1 0     0.5
> 2  2010 Maths    2 1     0
> 2  2010 Maths    2 1     0
> 2  2010 english  2 0.5   0
> 2  2010 english  1 0     0.5
> END DATA.
>
> AGGREGATE OUTFILE=* MODE=ADDVARIABLES
>   /BREAK School_no Year subject
>   /Male_Per = FGT(gender,1)
>   /Female_Per = FLT(gender,2).
> **********************************************.
>
> -----
> Andy W
> [hidden email]
> http://andrewpwheeler.wordpress.com/
> --

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Add percentage or proportion to dataset

Ananda Victoria
Hi Andy and PRogman

Thank you both very much for your help.
I managed to fix the problems using the below scripts with minor changes

AGGREGATE OUTFILE=* MODE=ADDVARIABLES
 /BREAK School_no Year subject
 /Male_Per = FGT(gender,1)
 /Female_Per = FLT(gender,2).


IF (Male=0) M_prop = 1 - Female.
IF (Female=0) F_prop = 1 - Male.
EXECUTE.

IF (Male=1) T_prop = 1 - Female.
IF (Female=2) T_prop = 1 - Male.
EXECUTE.

Thanks again and have a nice weekend

Regards

Ananda


> Date: Thu, 26 Nov 2015 16:51:21 -0700

> From: [hidden email]
> Subject: Re: Add percentage or proportion to dataset
> To: [hidden email]
>
> Your presented data structure is confusing; Variable Subject is not further
> increasing the data resolution as it is the same for each combination of
> school and year. I miss a subject level variable or similar.
> As one of Male and Female always is 0 a quick way to (almost) get a long
> data format is
> COMPUTE Proportion = Male + Female.
> EXECUTE.
> Variables Male and Female are now redundant, but then you cannot know which
> cases to combine to get male and female proportions. There are 5 cases of '1
> 2010 Maths' and 3 cases of '2 2010 Maths'.
>
> Another way, if every case is a separate case:
> IF (Male=0) M_prop = 1 - Female.
> IF (Female=0) F_prop = 1 - Male.
> EXECUTE.
> this makes variable gender redundant (and is equivalent to the solution by
> Andy W)
>
> HTH, PRogman
>
>
> Aoba Victoria wrote
> > Hi Andy
> >
> > Thanks for your help Andy.
> >
> > The reason for 0 in female column is in that, it has female name etc. So
> > can’t put the male proportion
> > in the female column in the same row and need to put the proportion in
> > Female column
> > same row. (Gender 2=male and 1= female).
> >
> > 1.
> > How can I put zero in the line if not male or
> > female in the row
> >
> > 2.
> > How can I have new variable called ‘Gender_2 and
> > put both proportion in the same line (Female and male)?
> >
> > Thanks for your help in advance
> > Ananda
> >
> >> Date: Wed, 25 Nov 2015 05:47:00 -0700
> >> From:
>
> > apwheele@
>
> >> Subject: Re: Add percentage or proportion to dataset
> >> To:
>
> > SPSSX-L@.UGA
>
> >>
> >> See the AGGREGATE command. I'm confused by your example though, why in
> >> the
> >> first line is Male 0.6 and Female 0? (I would expect female would be 0.4
> >> per
> >> your criteria.) See example below.
> >>
> >> **********************************************.
> >> DATA LIST FREE / School_no (F1.0) Year (F4.0) subject (A10) gender
> >> (F1.0)
> >> Male Female (2F2.1).
> >> BEGIN DATA
> >> 1 2010 Maths 2 0.6 0
> >> 1 2010 Maths 2 0.6 0
> >> 1 2010 Maths 2 0.6 0
> >> 1 2010 Maths 1 0 0.4
> >> 1 2010 Maths 1 0 0.4
> >> 1 2010 english 2 0.5 0
> >> 1 2010 english 2 0.5 0
> >> 1 2010 english 2 0.5 0
> >> 1 2010 english 1 0 0.5
> >> 1 2010 english 1 0 0.5
> >> 1 2010 english 1 0 0.5
> >> 1 2011 Maths 2 0.66 0
> >> 1 2011 Maths 2 0.66 0
> >> 1 2011 Maths 1 0 0.33
> >> 1 2011 english 2 0.5 0
> >> 1 2011 english 2 0.5 0
> >> 1 2011 english 1 0 0.5
> >> 1 2011 english 1 0 0.5
> >> 2 2010 Maths 2 1 0
> >> 2 2010 Maths 2 1 0
> >> 2 2010 english 2 0.5 0
> >> 2 2010 english 1 0 0.5
> >> END DATA.
> >>
> >> AGGREGATE OUTFILE=* MODE=ADDVARIABLES
> >> /BREAK School_no Year subject
> >> /Male_Per = FGT(gender,1)
> >> /Female_Per = FLT(gender,2).
> >> **********************************************.
> >>
> >> -----
> >> Andy W
> >>
>
> > apwheele@
>
> >> http://andrewpwheeler.wordpress.com/
> >> --
> >
> > =====================
> > To manage your subscription to SPSSX-L, send a message to
>
> > LISTSERV@.UGA
>
> > (not to SPSSX-L), with no body text except the
> > command. To leave the list, send the command
> > SIGNOFF SPSSX-L
> > For a list of commands to manage subscriptions, send the command
> > INFO REFCARD
>
>
>
>
>
> --
> View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Add-percentage-or-proportion-to-dataset-tp5731050p5731059.html
> Sent from the SPSSX Discussion mailing list archive at Nabble.com.
>
> =====================
> To manage your subscription to SPSSX-L, send a message to
> [hidden email] (not to SPSSX-L), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD
===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD