Employee data.sav: When was the data collected?

Employee data.sav: When was the data collected?

Bruce Weaver
Administrator
Short version of the question:  Does anyone know what year the data in the
"Employee data.sav" sample file were gathered?  


Now here's the longer version.

Here is a listing of the first 10 records from "Employee data.sav", one of
the sample files that comes with (or at least used to come with) SPSS:

  id gender      bdate educ jobcat   salary salbegin jobtime prevexp minority

   1 m      02/03/1952   15      3  $57,000  $27,000      98     144        0
   2 m      05/23/1958   16      1  $40,200  $18,750      98      36        0
   3 f      07/26/1929   12      1  $21,450  $12,000      98     381        0
   4 f      04/15/1947    8      1  $21,900  $13,200      98     190        0
   5 m      02/09/1955   15      1  $45,000  $21,000      98     138        0
   6 m      08/22/1958   15      1  $32,100  $13,500      98      67        0
   7 m      04/26/1956   15      1  $36,000  $18,750      98     114        0
   8 f      05/06/1966   12      1  $21,900   $9,750      98       0        0
   9 f      01/23/1946   15      1  $27,900  $12,750      98     115        0
  10 f      02/13/1946   12      1  $24,000  $13,500      98     244        0

I am considering using it for a class exercise, and if I do, I would like to
have the students compute an age variable.  Birth date is given, but I don't
know when the data were gathered.  I used SYSFILE INFO to find the file
creation date, which appears to be 24-Jan-2012.  But when I use that value
to compute age, I get ages ranging from 40 to 82 with a mean of about 55.
Those ages are higher than I expected.  I could always just knock 20 years
off and say that the data were gathered in 1992.  But it would be nice to
have the actual date if anyone knows it.  
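
For anyone who wants to reproduce that check, here is a minimal sketch. The path is a placeholder (an assumption; point it at wherever the sample file lives on your machine), and Age2012 is a hypothetical variable name. DATEDIFF with "years" returns completed whole years, so the output should show the 40-to-82 range described above.

```spss
* Display dictionary information, including the file creation date.
SYSFILE INFO FILE="C:\SPSSdata\Employee data.sav".

* Open the file (placeholder path; change as needed).
GET FILE="C:\SPSSdata\Employee data.sav".

* Age in completed years as of the file creation date (24-Jan-2012).
COMPUTE Age2012 = DATEDIFF(DATE.DMY(24,1,2012), bdate, "years").

* Summarize the resulting ages.
DESCRIPTIVES VARIABLES=Age2012 /STATISTICS=MIN MAX MEAN.
```

Swapping a different day, month, and year into DATE.DMY is all it takes to try another candidate collection date.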

Cheers,
Bruce




-----
--
Bruce Weaver
[hidden email]
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

NOTE: My Hotmail account is not monitored regularly.
To send me an e-mail, please use the address shown above.

Re: Employee data.sav: When was the data collected?

Rick Oliver
It is essentially fake data, although based on real data. It does not stand up well to close scrutiny. It's based on data used in a lawsuit. I think the income numbers were already a little outdated when I joined SPSS in 1989. At some point we bumped up the values, and we may have updated the age values at the same time, while also changing some values to make more interesting relationships.

On Thu, Jan 16, 2020, 1:54 PM Bruce Weaver <[hidden email]> wrote:
Short version of the question:  Does anyone know what year the data in the
"Employee data.sav" sample file were gathered? 


Re: Employee data.sav: When was the data collected?

Bruce Weaver
Administrator
Thanks Rick (and Jon, who told me the same thing via email).  If I use
24-Jan-1995 as the date for calculating Age, things work out reasonably
well, although the salaries may be a bit low.  Given that it's just for an
assignment, I won't worry about that.  

Cheers,
Bruce


* Change path on next line as needed.
GET FILE "C:\SPSSdata\Employee data.sav".
* Assume data are from 1995.
COMPUTE Age = DATEDIFF(DATE.DMY(24,1,1995), bdate, "years").
* Compute Age minus years in current job.
COMPUTE Check1 = Age-jobtime/12. /* jobtime = time in job (mos).
* Compute Age minus years of previous experience.
COMPUTE Check2 = Age-prevexp/12. /* prevexp = previous experience (mos).

DESCRIPTIVES Age Check1 Check2 /STATISTICS=MIN MAX MEAN.

*Descriptive Statistics
            N  Minimum  Maximum     Mean
   Age    473    23.00    65.00  37.7696
Check1    473    17.58    59.50  31.0078
Check2    473    21.08    60.00  29.7740
.

* Get median & IQR for salaries.
FREQUENCIES VARIABLES=salary
  /FORMAT=NOTABLE
  /NTILES=4
  /STATISTICS=MEAN MEDIAN
  /ORDER=ANALYSIS.

*Statistics
Current Salary
Mean $34,419.57
Median $28,875.00
Percentiles
        25 $24,000.00
        50 $28,875.00
        75 $37,162.50
.
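
One detail about the Age values above: DATEDIFF with "years" truncates to completed years, which is why the minimum and maximum print as whole numbers. If fractional ages are wanted, a sketch along these lines should work, assuming Age has already been computed as above. AgeFrac is a hypothetical name, and dividing by 365.25 only approximates calendar years.

```spss
* Fractional age in years as of the assumed date (24-Jan-1995).
* SPSS stores dates in seconds, so CTIME.DAYS converts the difference to days.
COMPUTE AgeFrac = CTIME.DAYS(DATE.DMY(24,1,1995) - bdate) / 365.25.
FORMATS AgeFrac (F6.2).

* Compare the truncated and fractional versions side by side.
DESCRIPTIVES VARIABLES=Age AgeFrac /STATISTICS=MIN MAX MEAN.
```

For an age-in-years exercise the truncated version is usually what people mean by "age", so this is only worth doing if the extra precision matters.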








