Duplicate IDs

J Scelza
I'm working with a data set in which an ID was assigned to each department
that responded to our survey. We have several variables from a question
matrix that we would like to analyze, particularly which departments did not
have women (1 variable) or minorities (another variable) working on their
staff.

There are multiple records for each department, one for each person they
have on staff. For example, a department that had 3 males and 1 female on
staff has 4 records in our dataset. A department that has 2 males and 0
females on staff would have 2 records. The dataset, if just looking at
these variables, would look like this:

CaseID   Gender
1009       2
1009       1
1009       1
1009       1
1048       1
1048       1

I would like to go through each block of IDs and determine the number of
departments that did NOT hire any females. What steps might I use to answer
this question? (particularly if I want to write the commands in a Syntax file).

Thank you.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Duplicate IDs

ViAnn Beadle
What is your goal here--do you simply want a report of the departments
without females or do you want to get a result file in which each case is a
department for further analysis?

If the former, you can use something like SUMMARIZE grouping by department
ID and request the PIN function on your GENDER variable--PIN(gender,1,1)
will give you the percentage of records with gender=1 for the break group.
Going by your example, 1 appears to be the code for males, so a department
with no females would show 100.

If the latter, do the same thing with AGGREGATE, breaking on the department
ID and requesting the PIN function.
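
To make the logic concrete, here is a minimal Python sketch of what a percent-in-range summary per break group computes. It is not SPSS syntax; it only mirrors the calculation on the six sample records from the question. The coding 1 = male, 2 = female is an assumption inferred from that example, the helper name percent_in_range is made up for illustration, and missing Gender values are not handled.

```python
from collections import defaultdict

# Sample records from the question: (CaseID, Gender).
# Assumption: 1 = male, 2 = female, inferred from the "3 males and 1 female" example.
records = [
    (1009, 2), (1009, 1), (1009, 1), (1009, 1),
    (1048, 1), (1048, 1),
]


def percent_in_range(rows, low, high):
    """Per department ID, the percentage of records with low <= Gender <= high.

    Hypothetical helper that imitates a percent-in-range summary
    computed within each break group.
    """
    total = defaultdict(int)
    hits = defaultdict(int)
    for dept, gender in rows:
        total[dept] += 1
        if low <= gender <= high:
            hits[dept] += 1
    return {dept: 100.0 * hits[dept] / total[dept] for dept in total}


# One value per department: the share of its records coded 1 (male).
pct_male = percent_in_range(records, 1, 1)

# A department whose records are 100% male has no females on staff.
no_females = sorted(dept for dept, pct in pct_male.items() if pct == 100.0)

print(pct_male)         # {1009: 75.0, 1048: 100.0}
print(len(no_females))  # 1
print(no_females)       # [1048]
```

The count of departments that did not hire any females is then just the number of IDs whose percentage is 100, which is 1 for the sample data (department 1048).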

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of J
Scelza
Sent: Wednesday, July 09, 2008 1:10 PM
To: [hidden email]
Subject: Duplicate IDs

