Syntax help - duplicating variables

Syntax help - duplicating variables

Jack Noone
Dear colleagues,

I would like to be able to duplicate a variable (V1)  X times according to
the value of another variable (V2).

In other words, I would like to convert this:

        V1      V2
Case1   43.2    3
Case2   48.1    4

To this:
        V1      V2      V2_a    V2_b    V2_c    V2_d
Case1   43.2    3       43.2    43.2    43.2    .
Case2   48.1    4       48.1    48.1    48.1    48.1


Unfortunately my syntax skills aren't up to the task. Could anyone offer
any assistance please?

Thanks,

Jack

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Syntax help - duplicating variables

Richard Ristow
At 09:40 PM 10/9/2012, Jack Noone wrote:

>I would like to be able to duplicate a variable (V1)  X times according to
>the value of another variable (V2). In other words, I would like to
>convert this:
>
>         V1      V2
>Case1   43.2    3
>Case2   48.1    4
>
>To this:
>         V1      V2      V2_a    V2_b    V2_c    V2_d
>Case1   43.2    3       43.2    43.2    43.2    .
>Case2   48.1    4       48.1    48.1    48.1    48.1

I'm tossing this off, untested. Besides any mistakes I may make, one
warning: it doesn't check for invalid values of V2 (though it does
cap at 6 copies).

NUMERIC V2_a V2_b V2_c V2_d V2_e V2_f (F5.1).
VECTOR  New_V = V2_a TO V2_f.

LOOP  #idx = 1 TO (MIN(V2,6)).
.  COMPUTE New_V(#idx) = V1.
END LOOP.
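
One gap worth closing in a sketch like the one above: MIN(V2,6) returns 6 when V2 is missing, because SPSS statistical functions skip missing arguments, so a case with no V2 would receive six copies. A guarded variant, equally untested and assuming V2 is meant to be a count of at least 1, might look like this:

```spss
* Sketch only: skip cases whose V2 is missing or below 1.
NUMERIC V2_a V2_b V2_c V2_d V2_e V2_f (F5.1).
VECTOR  New_V = V2_a TO V2_f.

DO IF (NOT MISSING(V2) AND V2 GE 1).
.  LOOP #idx = 1 TO MIN(TRUNC(V2), 6).
.     COMPUTE New_V(#idx) = V1.
.  END LOOP.
END IF.
```

Cases that fail the test simply keep system-missing values in all six new variables.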

Let me end with the question that bothers me: Why do you want to do
this? If you tell us what you need to accomplish, there may be
another way to do it.


Re: Syntax help - duplicating variables

Jack Noone
Hi Richard and others.

The variable V1 represents a continuous socioeconomic indicator for a
participant's given job and the variable V2 represents the number of years
a participant has worked in this position.

I want to create a wide data set (I've not had much to do with long data
sets) with 63 variables. These variables correspond to each year from 1950
to 2012, and the value for each variable will be the socioeconomic
indicator for the job held during that year.

I have variables that represent:
1) The year each job started
2) The year each job ended
3) The number of years in each job (item 2 minus item 1, which gives the V2
variable)
4) The socioeconomic value for each job held (the V1 variable).

So in reality I have:

V1_1 (SES for first job), V1_2 (SES for second job), ... V1_j (SES for
last job)
V2_1 (number of years in first job), V2_2, ... V2_j (number of years
in last job)

I would like to turn this into something like this:

        1950    1951    1952    1953    1954    1955    1956    1957    1958    1959   ..2012
Case1   43.2    43.2    43.2    60.2    60.2    60.2    60.2    60.2    60.2    60.2  ...

This is a person who started  their first job (SES value = 43.2) in 1950
and left the job in 1952. They started their second job (SES value=60.2)
in 1953 and left it in 2012.

There are also lots of little caveats along the way too!

In the end, the data is going to form the basis for growth mixture
modeling.

I'm sure there is a better way, and I am open to suggestions.

Jack


Re: Syntax help - duplicating variables

John F Hall
In reply to this post by Richard Ristow
Not the answer you need, but an opportunity for me to ask yet again the same
question I was asking in 1974: why doesn't SPSS have a facility for the
automatic generation of variable names ending in alphabetic characters?

DO REPEAT
 X = v2_a to v2_d
 ~ ~ ~ ~
END REPEAT.
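
In the meantime, numeric suffixes are the usual workaround, since VECTOR can generate those names itself. A minimal, untested sketch, assuming the copies may be called Copy1 to Copy6 (a hypothetical name, not one from the original question):

```spss
* Sketch only: VECTOR Copy(6) creates the new variables Copy1 TO Copy6.
VECTOR  Copy(6).
FORMATS Copy1 TO Copy6 (F5.1).

* k runs in parallel with the stand-in X, so copy number k is filled
* only when k does not exceed V2.
DO REPEAT X = Copy1 TO Copy6 / k = 1 TO 6.
.  IF (k LE V2) X = V1.
END REPEAT.
EXECUTE.
```

A case with a missing V2 fails every comparison and is left with all six copies system-missing.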


John F Hall (Mr)

Website: www.surveyresearch.weebly.com

Re: Syntax help - duplicating variables

Rich Ulrich
In reply to this post by Jack Noone
I think you will save yourself a lot of grief in the long run if
you do take the trouble to use a "long data set" version.
For one thing, I would not trust the number of years to add up
to the right total until they have been massaged a bit, because
"years in job" is not likely to be without rounding error.

So -- write out the appropriate number of lines (one per year), using
XSAVE, for each job.  In that new file, find the people who have
inconsistent figures for total time versus the number of lines.  The long
form is also useful for any other edits that you might want to do on
SES or whatever, since every occurrence of SES becomes the
same variable in the long form.

If you do need the file back in wide form, you can use
CasesToVars.
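
One way such a consistency check might be sketched, untested, on the long file: count each person's yearly lines and compare that count with the span from first to last year. The names Case_ID and Year are assumptions about how the long file is laid out, and the span is only a stand-in for whatever "total time" figure you trust.

```spss
* Sketch only: Case_ID and Year are assumed names in the long file.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK   = Case_ID
  /N_Lines = N
  /FirstYr = MIN(Year)
  /LastYr  = MAX(Year).

* More lines than years suggests overlapping jobs, fewer suggests gaps.
COMPUTE Span     = LastYr - FirstYr + 1.
COMPUTE Mismatch = (N_Lines NE Span).

TEMPORARY.
SELECT IF (Mismatch EQ 1).
LIST Case_ID Year N_Lines Span.
```

The listing shows every yearly line for the people who need a closer look.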

--
Rich Ulrich



Re: Syntax help - duplicating variables

David Marso
Administrator
Additionally, AFAIK, growth mixture modelling in SPSS would most likely require the MIXED procedure, which presumes a LONG format for the data!
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Re: Syntax help - duplicating variables

Jack Noone
Thanks All. I'll do it in the long format as you suggest. However, I'll be
doing the analysis in Mplus.

Regards,

Jack


Re: Syntax help - duplicating variables

Richard Ristow
In reply to this post by Jack Noone
At 11:33 PM 10/9/2012, Jack Noone wrote:

>The variable V1 represents a continuous
>socioeconomic indicator for a participant's
>given job and the variable V2 represents the
>number of years a participant has worked in this position.
>
>I have variables that represent:
>1) The year each job started
>2) The year each job ended
>4) The socioeconomic value for each job held (V1 variable).

Thanks. Notice that this is considerably more
complex than the problem you originally posted.
Generally, we can give you better answers, and
more quickly, if you give us the full problem and context at the beginning.

>3) the number of years in each job (subtract 1 from 2 to create V2 variable)

You probably don't want to create and use these.
The beginning and ending years, themselves, are
much easier to work with accurately.

Now, how are your variables laid out in the
dataset? If all the beginning years are together,
all the ending years together, and all the SES
variables together, you can define each kind of
variable as a vector, and the code is much
easier. But I'll bet it's not that way; that you
have Start_year_1, End_Year_1, SES_1, then
Start_year_2, End_Year_2, SES_2, and so on for
however many jobs your records accommodate.
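
For the tidy layout, where each kind of variable is stored together, the vector approach could fill the yearly variables directly. A rough, untested sketch, assuming at most three jobs, contiguous StartYr1 TO StartYr3, EndYr1 TO EndYr3 and SES1 TO SES3, and hypothetical output variables Y1 TO Y63 standing for 1950 to 2012:

```spss
* Sketch only: the input names, the 3-job limit and Y1 TO Y63 are assumptions.
VECTOR StartV = StartYr1 TO StartYr3.
VECTOR EndV   = EndYr1   TO EndYr3.
VECTOR SESV   = SES1     TO SES3.
VECTOR Y(63).
FORMATS Y1 TO Y63 (F5.1).

* Y(1) stands for 1950, so year yr maps to element yr - 1949.
LOOP #job = 1 TO 3.
.  DO IF (NOT MISSING(StartV(#job)) AND NOT MISSING(EndV(#job))).
.     LOOP #yr = MAX(StartV(#job), 1950) TO MIN(EndV(#job), 2012).
.        COMPUTE Y(#yr - 1949) = SESV(#job).
.     END LOOP.
.  END IF.
END LOOP.
EXECUTE.
```

Years not covered by any job stay system-missing. With the interleaved layout, the TO lists would sweep up the wrong variables, which is one more reason to prefer the long form.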

Then, are there variables other than SES that you
want to have for each year? I'll bet there are.
Are those variables associated with the jobs
held, i.e. don't change during the duration of
the job, or do they change at times besides when
a job changes? I'll bet they do.

As others have said, a long-form dataset will be
easier to create. It's almost imperative if there
are any variables that vary with time but change
at other times than when jobs start and end. For
many analyses, the long-form data is easier to
work with, and for some it is essential. And, as
Rich Ulrich noted, if you finally do need a wide
dataset, you can create it easily from the long dataset, using CASESTOVARS.

The following code is tested (see the listing in
Appendix I). Notice that it unrolls your data
twice: first to one record per job, then to one record per year.
Finally, three cautions:
a.) The code destroys the input data, converting
it to another form. Make sure you have a copy saved elsewhere.
b.) If you do have variables other than SES,
you'll have to ask about them separately.
c.) File 'By_Year' must be an external file (or
a file handle for one), not a dataset.
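
Cautions a.) and c.) might be handled with two commands run before the code. This is a sketch only; both paths are hypothetical placeholders to replace with your own locations.

```spss
* Sketch only: both file paths are hypothetical placeholders.
* Keep a copy of the original wide data before it is restructured.
SAVE OUTFILE='C:\Temp\Jobs_Wide_Backup.sav'.

* Point the handle By_Year at an external scratch file for XSAVE.
FILE HANDLE By_Year /NAME='C:\Temp\By_Year.sav'.
```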

Input data:
[TestData]
Case_ID StartYr1 EndYr1   SES1 StartYr2 EndYr2   SES2 StartYr3 EndYr3   SES3

Case1     1974    1982   42.10   1983    2000   45.80   2001    2005   46.00
Case2     1952    1970   39.90   1971    1980   42.00      .       .     .

Number of cases read:  2    Number of cases listed:  2


Code:
VARSTOCASES
    /MAKE StartYear FROM StartYr1 StartYr2 StartYr3
    /MAKE EndYear   FROM EndYr1   EndYr2   EndYr3
    /MAKE SES       FROM SES1     SES2     SES3
    /KEEP = Case_ID
    /NULL = DROP.

LIST.

NUMERIC Year (F4).
LOOP    Year  = StartYear TO EndYear.
.  XSAVE  OUTFILE=By_Year
        /KEEP=Case_ID Year SES.
END LOOP.
EXECUTE /* (yes, really needed) */.

GET FILE=By_Year.

LIST.

SORT CASES BY Case_ID Year .
CASESTOVARS
  /ID      = Case_ID
  /INDEX   = Year
  /GROUPBY = VARIABLE .

================================
Appendix I:  Listing of test run
================================
*  C:\Documents and Settings\Richard\My Documents               .
*    \Technical\spssx-l\Z-2012\                                 .
*    2012-10-09 Noone - Syntax help - duplicating variables.SPS .


*  In response to postings                                      .
*  From:     Jack Noone <[hidden email]>              .
*  Subject:  Syntax help - duplicating variables                .
*  To:       [hidden email]                           .
*  Date:     [09:40 PM 10/9/2012]                               .
*     and follow-up                                             .
*  Date:     Wed, 10 Oct 2012 03:33:23 +0000                    .
*  From:     Jack Noone <[hidden email]>              .
*  Subject:  Re: Syntax help - duplicating variables            .
*  To: [hidden email]                                 .

*  I want to create a wide data set (I've not had much to do with
*  long data sets) with 63 variables [corresponding] to each year
*  from 1950 to 2012, and the value for each variable will be the
*  socioeconomic indicator for a job held during that year.
*
*  I have variables that represent:                                   .
*  1) The year each job started                                       .
*  2) The year each job ended                                         .
*  3) the number of years in each job (subtract 1 from 2 to create    .
*     V2 variable)                                                    .
*  4) The socioeconomic value for each job held (V1 variable).        .


*  Scratch file, as target for XSAVE:    ............................ .

FILE HANDLE By_Year
  /NAME='C:\Documents and Settings\Richard\My Documents'               +
          '\Temporary\SPSS\'                                           +
        '2012-10-09 Noone - Syntax help - duplicating variables'       +
        '-UNROLLED'                                                    +
        '.SAV'.


*  Test data:   ..................................................... .

PRESERVE.
SET MXWARNS 0.
DATA LIST LIST/
    Case_ID StartYr1 EndYr1  SES1
            StartYr2 EndYr2  SES2
            StartYr3 EndYr3  SES3
   (A5,     F4,  F4,  F5.2, F4,  F4,  F5.2, F4,  F4,  F5.2).
BEGIN DATA
    Case1   1974 1982 42.1  1983 2000 45.8  2001 2005 46.0
    Case2   1952 1970 39.9  1971 1980 42.0

 >Warning # 92
 >The limit of MXWARNS warnings in this data pass has been printed.  Further
 >warnings have been suppressed.

END DATA.
RESTORE.

DATASET NAME     TestData WINDOW=FRONT.
LIST.
   List
┌─────────────────────────────┬───────────────┐
│Output Created               │10-OCT-2012    │
│                             │21:49:59       │
├─────────────────────────────┼───────────────┤

[TestData]
Case_ID StartYr1 EndYr1   SES1 StartYr2 EndYr2   SES2 StartYr3 EndYr3   SES3

Case1     1974    1982   42.10   1983    2000   45.80   2001    2005   46.00
Case2     1952    1970   39.90   1971    1980   42.00      .       .     .

Number of cases read:  2    Number of cases listed:  2


VARSTOCASES
    /MAKE StartYear FROM StartYr1 StartYr2 StartYr3
    /MAKE EndYear   FROM EndYr1   EndYr2   EndYr3
    /MAKE SES       FROM SES1     SES2     SES3
    /KEEP = Case_ID
    /NULL = DROP.


Variables to Cases
[TestData]
Generated Variables
┌─────────┬──────┐
│Name     │Label │
├─────────┼──────┤
│StartYear│<none>│
│EndYear  │<none>│
│SES      │<none>│
└─────────┴──────┘
Processing Statistics
┌─────────────┬──┐
│Variables In │10│
│Variables Out│4 │
└─────────────┴──┘

LIST.
┌─────────────────────────────┬───────────────┐
│Output Created               │10-OCT-2012    │
│                             │21:49:59       │
├─────────────────────────────┼───────────────┤
[TestData]
Case_ID StartYear EndYear    SES

Case1      1974     1982   42.10
Case1      1983     2000   45.80
Case1      2001     2005   46.00
Case2      1952     1970   39.90
Case2      1971     1980   42.00

Number of cases read:  5    Number of cases listed:  5


NUMERIC Year (F4).
LOOP    Year  = StartYear TO EndYear.
.  XSAVE  OUTFILE=By_Year
        /KEEP=Case_ID Year SES.
END LOOP.
EXECUTE /* (yes, really needed) */.

GET FILE=By_Year

LIST.
┌─────────────────────────────┬───────────────┐
│Output Created               │10-OCT-2012    │
│                             │21:50:00       │
└─────────────┴───────────────┴───────────────┘
C:\Documents and Settings\Richard\My Documents\Temporary\SPSS\
   2012-10-09 Noone - Syntax help - duplicating variables-UNROLLED.SAV
Case_ID Year    SES

Case1   1974  42.10
Case1   1975  42.10
Case1   1976  42.10
Case1   1977  42.10
Case1   1978  42.10
Case1   1979  42.10
Case1   1980  42.10
Case1   1981  42.10
Case1   1982  42.10
Case1   1983  45.80
Case1   1984  45.80
Case1   1985  45.80
Case1   1986  45.80
Case1   1987  45.80
Case1   1988  45.80
Case1   1989  45.80
Case1   1990  45.80
Case1   1991  45.80
Case1   1992  45.80
Case1   1993  45.80
Case1   1994  45.80
Case1   1995  45.80
Case1   1996  45.80
Case1   1997  45.80
Case1   1998  45.80
Case1   1999  45.80
Case1   2000  45.80
Case1   2001  46.00
Case1   2002  46.00
Case1   2003  46.00
Case1   2004  46.00
Case1   2005  46.00
Case2   1952  39.90
Case2   1953  39.90
Case2   1954  39.90
Case2   1955  39.90
Case2   1956  39.90
Case2   1957  39.90
Case2   1958  39.90
Case2   1959  39.90
Case2   1960  39.90
Case2   1961  39.90
Case2   1962  39.90
Case2   1963  39.90
Case2   1964  39.90
Case2   1965  39.90
Case2   1966  39.90
Case2   1967  39.90
Case2   1968  39.90
Case2   1969  39.90
Case2   1970  39.90
Case2   1971  42.00
Case2   1972  42.00
Case2   1973  42.00
Case2   1974  42.00
Case2   1975  42.00
Case2   1976  42.00
Case2   1977  42.00
Case2   1978  42.00
Case2   1979  42.00
Case2   1980  42.00

Number of cases read:  61    Number of cases listed:  61

SORT CASES BY Case_ID Year .
CASESTOVARS
  /ID      = Case_ID
  /INDEX   = Year
  /GROUPBY = VARIABLE .

┌──────────────────────────┬───────────────┐
│Output Created            │10-OCT-2012    │
│                          │21:50:00       │
└─────────────┴────────────┴───────────────┘

  C:\Documents and Settings\Richard\My Documents\Temporary\SPSS\
    2012-10-09 Noone - Syntax help - duplicating variables-UNROLLED.SAV

Generated Variables
┌────────┬────┬────────┐
│Original│Year│Result  │
│Variable│    ├────────┤
│        │    │Name    │
├────────┼────┼────────┤
│SES     │1952│SES.1952│
│        │1953│SES.1953│
│        │1954│SES.1954│
│        │1955│SES.1955│
│        │1956│SES.1956│
│        │1957│SES.1957│
│        │1958│SES.1958│
│        │1959│SES.1959│
│        │1960│SES.1960│
│        │1961│SES.1961│
│        │1962│SES.1962│
│        │1963│SES.1963│
│        │1964│SES.1964│
│        │1965│SES.1965│
│        │1966│SES.1966│
│        │1967│SES.1967│
│        │1968│SES.1968│
│        │1969│SES.1969│
│        │1970│SES.1970│
│        │1971│SES.1971│
│        │1972│SES.1972│
│        │1973│SES.1973│
│        │1974│SES.1974│
│        │1975│SES.1975│
│        │1976│SES.1976│
│        │1977│SES.1977│
│        │1978│SES.1978│
│        │1979│SES.1979│
│        │1980│SES.1980│
│        │1981│SES.1981│
│        │1982│SES.1982│
│        │1983│SES.1983│
│        │1984│SES.1984│
│        │1985│SES.1985│
│        │1986│SES.1986│
│        │1987│SES.1987│
│        │1988│SES.1988│
│        │1989│SES.1989│
│        │1990│SES.1990│
│        │1991│SES.1991│
│        │1992│SES.1992│
│        │1993│SES.1993│
│        │1994│SES.1994│
│        │1995│SES.1995│
│        │1996│SES.1996│
│        │1997│SES.1997│
│        │1998│SES.1998│
│        │1999│SES.1999│
│        │2000│SES.2000│
│        │2001│SES.2001│
│        │2002│SES.2002│
│        │2003│SES.2003│
│        │2004│SES.2004│
│        │2005│SES.2005│
└────────┴────┴────────┘

Processing Statistics
+---------------+----+
|Cases In       |61  |
|Cases Out      |2   |
+---------------+----+
|Cases In/Cases |30.5|
|Out            |    |
+---------------+----+
|Variables In   |3   |
|Variables Out  |55  |
+---------------+----+
|Index Values   |54  |
+---------------+----+
================================
Appendix II: Test data, and code
================================
*  C:\Documents and Settings\Richard\My Documents               .
*    \Technical\spssx-l\Z-2012\                                 .
*    2012-10-09 Noone - Syntax help - duplicating variables.SPS .


*  In response to postings                                      .
*  From:     Jack Noone <[hidden email]>              .
*  Subject:  Syntax help - duplicating variables                .
*  To:       [hidden email]                           .
*  Date:     [09:40 PM 10/9/2012]                               .
*     and follow-up                                             .
*  Date:     Wed, 10 Oct 2012 03:33:23 +0000                    .
*  From:     Jack Noone <[hidden email]>              .
*  Subject:  Re: Syntax help - duplicating variables            .
*  To: [hidden email]                                 .

*  I want to create a wide data set (I've not had much to do with
*  long data sets) with 62 variables [corresponging] to each year
*  from 1950 to 2012 ,and the value for each variable will be the
*  socioeconomic indicator for a job held during that year.
*
*  I have variables that represent:                                   .
*  1) The year each job started                                       .
*  2) The year each job ended                                         .
*  3) the number of years in each job (subtract 1 from 2 to create    .
*     V2 variable)                                                    .
*  4) The socioeconomic value for each job held (V1 variable).        .


*  Scratch file, as target for XSAVE:    ............................ .

FILE HANDLE By_Year
  /NAME='C:\Documents and Settings\Richard\My Documents'               +
          '\Temporary\SPSS\'                                           +
        '2012-10-09 Noone - Syntax help - duplicating variables'       +
        '-UNROLLED'                                                    +
        '.SAV'.


*  Test data:   ..................................................... .

PRESERVE.
SET MXWARNS 0.
DATA LIST LIST/
    Case_ID StartYr1 EndYr1  SES1
            StartYr2 EndYr2  SES2
            StartYr3 EndYr3  SES3
   (A5,     F4,  F4,  F5.2, F4,  F4,  F5.2, F4,  F4,  F5.2).
BEGIN DATA
    Case1   1974 1982 42.1  1983 2000 45.8  2001 2005 46.0
    Case2   1952 1970 39.9  1971 1980 42.0
END DATA.
RESTORE.

DATASET NAME     TestData WINDOW=FRONT.
LIST.

VARSTOCASES
    /MAKE StartYear FROM StartYr1 StartYr2 StartYr3
    /MAKE EndYear   FROM EndYr1   EndYr2   EndYr3
    /MAKE SES       FROM SES1     SES2     SES3
    /KEEP = Case_ID
    /NULL = DROP.

LIST.

NUMERIC Year (F4).
LOOP    Year  = StartYear TO EndYear.
.  XSAVE  OUTFILE=By_Year
        /KEEP=Case_ID Year SES.
END LOOP.
EXECUTE /* (yes, really needed) */.

GET FILE=By_Year.

LIST.

SORT CASES BY Case_ID Year .
CASESTOVARS
  /ID      = Case_ID
  /INDEX   = Year
  /GROUPBY = VARIABLE .


Re: Syntax help - duplicating variables

Richard Ristow
At 01:36 AM 10/13/2012, Jack Noone wrote, off-list:
(I'm replying on-list, as is my usual practice, because
+ Posting gives others the opportunity to respond. Someone else might
think of something I don't.
+ The list isn't just a question-and-answer resource; it's a forum
for all of us to learn more about SPSS. Posted questions and
responses can inform everyone on the list; off-list questions and
responses inform only one person.)

>I've located the original, long format, SPSS file! Here is an
>example/snippet of what one participant's data looks like in long format:
>
>P_ID    Job_number year_start year_end Job_SES  years_in_job
>
>0001    1               1964    1966    64.6    2
>0001    2               1966    1970    70.2    4
>
>I then ran the following syntax to create one line for every year in
>paid work and this works well:
>
>loop copy = 1 to years_in_job.
>.  xsave
>    outfile  = "/Users/jacknoone/desktop/expanded file.sav" /
>    keep     = copy all .
>end loop.
>execute.
>
>However, I appreciate the issue you identified with start/end years
>versus "years_in_job". In particular, some people have finished and
>started another job in the same year (as above). In this instance I
>would like to use the job that has the highest SES value. Other
>people are doing two part-time jobs at once, and again I would like
>to use the job with the highest SES rating.
>
>1. In the loop copy syntax above, what would be a better alternative
>to "years_in_job"?

As I wrote in my last post, it's far better to write out the calendar
year than the year-number within job. (Among other things, that's the
only way you can recognize when two jobs were held in the same year):

>loop year = year_start TO year_end.
>.  xsave
>       outfile  = "/Users/jacknoone/desktop/expanded file.sav" /
>       keep     = P_ID year Job_number Job_SES.
>end loop.
>execute.
>
>2. Is there a way to write syntax that would automatically select
>the highest SES job for any one year?

Quite easy, though this loses the job number in the process. AFTER
the above code,

GET FILE="/Users/jacknoone/desktop/expanded file.sav".
AGGREGATE
    /OUTFILE=*
    /BREAK  =P_ID year
    /JinYear 'Number of jobs held in calendar year'=NU
    /Job_SES 'Highest job SES in calendar year'    =MAX(Job_SES).

I haven't tested the code for this posting; apologies for any mistakes.
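
If the end goal is still the wide file with one SES column per calendar year, the aggregated file could then be restructured along the following lines. This is only a sketch, also untested; it assumes the variable names used above, and it drops JinYear so that only Job_SES is spread across the years.

```spss
* Sketch only, untested; run after the AGGREGATE above.
* Assumes the active file holds P_ID, year, JinYear and Job_SES.
SORT CASES BY P_ID year.
CASESTOVARS
  /ID      = P_ID
  /INDEX   = year
  /GROUPBY = VARIABLE
  /DROP    = JinYear.
```

Years in which a participant held no job should come out as system-missing in that participant's row.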


Re: Syntax help - duplicating variables

Jack Noone
Hi Richard,

You may remember this thread. The syntax you wrote was perfect;
however, I need to keep some other variables as well and I can't seem to
figure out how to do it. Here is the piece of syntax in question.


GET FILE="/Users/jacknoone/desktop/expanded file.sav".
AGGREGATE
    /OUTFILE=*
    /BREAK  =P_ID year
    /JinYear 'Number of jobs held in calendar year'=NU
    /Job_SES 'Highest job SES in calendar year'    =MAX(Job_SES).

So, how would I fit
keep = p_id to AUSEI06_3digit.
into the syntax above?

Thanks,

Jack


Dr. Jack Noone
Research Fellow & LHH/ABBA Project Manager
Ageing, Work and Health Research Unit
Faculty of Health Sciences
University of Sydney

Ph: 02 9351 9411





On 14/10/12 5:36 AM, "Richard Ristow" <[hidden email]> wrote:

><SNIP>

Re: Syntax help - duplicating variables

David Marso
Administrator
Without reviewing the entire thread:
If p_id to AUSEI06_3digit are CONSTANT within each (P_ID * year) group, simply add them to the BREAK list,
i.e.
AGGREGATE
    /OUTFILE=*
    /BREAK  =P_ID to AUSEI06_3digit year
    /JinYear 'Number of jobs held in calendar year'=NU
    /Job_SES 'Highest job SES in calendar year'    =MAX(Job_SES).

Otherwise your question requires greater specificity.
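
One way to check the "constant within (P_ID * year)" condition before adding a variable to BREAK is sketched here. It is untested, it assumes AUSEI06_3digit is numeric, and lo_val, hi_val and varies are made-up names used only for illustration.

```spss
* Sketch only, untested; lo_val, hi_val and varies are hypothetical names.
* Adds the within-group minimum and maximum to every case.
AGGREGATE
    /OUTFILE=* MODE=ADDVARIABLES
    /BREAK  =P_ID year
    /lo_val =MIN(AUSEI06_3digit)
    /hi_val =MAX(AUSEI06_3digit).

* varies is 1 wherever the variable is not constant within P_ID * year.
COMPUTE varies = (lo_val NE hi_val).
FREQUENCIES VARIABLES=varies.
```

If any case shows varies = 1, adding that variable to BREAK will split the year into more than one group, and the MAX will then be taken within each of those smaller groups.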

Jack Noone wrote
<SNIP>

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Re: Syntax help - duplicating variables

Jack Noone
Hi David,

Thanks for your help, but unfortunately the syntax didn't work as I'd
hoped.

I believe the context for the problem is in the thread so far. According
to point number 2 in Richard's reply, the original syntax was designed
to "automatically select the highest SES job for any one year", and it did
this perfectly. Some people had more than one job in a calendar year and I
wanted to select the job with the highest socioeconomic rating.

But, if I add the other variables under the break command as suggested,
then the highest SES job for any one year is not selected out. I thought
that they were constant within the structure, but I now suspect that I
didn't understand what you meant. Could you elaborate please?

I also have another, somewhat related, syntax query. Having converted my
long file to wide I end up with a file looking like this:

P_ID    ses_yr1 ses_yr2 ses_yr3 ses_yr4 ses_yr5 ses_yr6     .
1       34      34      .       .       48      48
2       48      48      48      75      75      75

This is simply an occupation-based socioeconomic index for each year of my
participants' working lives - exactly what I wanted. However, I need to
fill in the missing data by substituting in the last SES score. For
example, participant 1 was out of the workforce for year 3 and year 4 and
I would like to substitute in their SES score of 34 (from their last job)
for the two points of missing data.

I'm sure there is an easy way to do this, but I have no idea how.

Thanks,

Jack








On 14/11/12 5:39 PM, "David Marso" <[hidden email]> wrote:

><SNIP>

Re: Syntax help - duplicating variables

David Marso
Administrator
Your initial followup:
"I need to keep some other variables as well and I can't seem to figure out how to do it."
If the values of these additional variables vary within a (P_ID * year) group, then you need to specify how they should be represented in the new data file. If they don't vary, then the result should be exactly the same as if they were not used in the AGG BREAK. Maybe it's time for you to post what the data look like before and after the AGGREGATE.

Point 2:
Data x1...x10
1 1 . . 3 4 5
-----
VECTOR V=x1 TO x10.
LOOP #=2 TO 10.
IF MISSING(V(#)) V(#)=V(#-1).
END LOOP.
----------------
Jack Noone wrote
<SNIP>

Re: Syntax help - duplicating variables

Jack Noone
Hi David and all,

Point 1:

Here is what the data looks like prior to AGG BREAK.

P_ID    year    job     yr_start        yr_stop         job_SES         self_employed
1       1964    1       1964            1965            48.4            1
1       1965    1       1964            1965            48.4            1
1       1965    2       1965            1967            48.4            0
1       1965    2       1965            1967            48.4            0
1       1965    2       1965            1967            48.4            0
1       1967    3       1967            1969            48.4            1
1       1967    3       1967            1969            48.4            1
1       1968    4       1968            1969            48.4            0
1       1969    4       1968            1969            48.4            0
1       1969    5       1969            1974            83.7            1
1       1969    5       1969            1974            83.7            1

And so on


However, we can see that people are holding more than one job in a
calendar year.
So I applied this syntax (c/o R. Ristow) with the aim of keeping only the
highest job_SES for any given year:

AGGREGATE
     /OUTFILE=*
     /BREAK  =P_ID year
     /JinYear 'Number of jobs held in calendar year'=NU
     /Job_SES 'Highest job SES in calendar year'    =MAX(Job_SES).

Which resulted in

P_ID    year    jinyear         job_SES
1       1964    1               48.4
1       1965    2               48.4
1       1966    1               48.4
1       1967    2               48.4
1       1968    2               48.4
1       1969    2               83.7
1       1970    1               83.7
1       1971    1               83.7

Perfect! In 1969, this participant held one job with a SES rating of 48.4
and one with SES rating of 83.7. However, only the higher rating SES value
is chosen for 1969.

However, I want to know if the person was self-employed for the job that
has been selected. So I tried this:

AGGREGATE
     /OUTFILE=*
     /BREAK  =P_ID year self_employed
     /JinYear 'Number of jobs held in calendar year'=NU
     /Job_SES 'Highest job SES in calendar year'    =MAX(Job_SES).

But it didn't work. There were no error messages but the sorting by
Job_SES didn't run. HELP!


Point 2:

I converted the long format file to wide format so I could take a look at
the missing data.

I then applied this syntax after sorting the variables

VECTOR V=Job_ses_1 TO Job_SES_55.
LOOP #=2 TO 10.
IF MISSING(V(#)) V(#)=V(#-1),
END LOOP.

There were no errors but the missing data were not filled.




Sadly I don't have the knowledge to solve this one myself. HELP AGAIN!

Jack








On 16/11/12 1:23 AM, "David Marso" <[hidden email]> wrote:

>Your initial followup:
>"I need to keep some other variables as well and I can't seem to figure
>out
>how to do it."
>If the values of these additional variables vary over year then you need
>to
>specify how these new variables will be represented in the new data file.
>If they don't vary then everything should be exactly as if they were not
>used in the AGG BREAK.  Maybe time for you to post what the before/after
>(pre aggregate/post aggregated) data appear.
>
>Point 2:
>Data x1...x10
>1 1 . . 3 4 5
>-----
>VECTOR V=V1 TO V10.
>LOOP #=2 TO 10.
>IF MISSING(V(#)) V(#)=V(#-1),
>END LOOP.
>----------------
>
>Jack Noone wrote
>> Hi David,
>>
>> Thanks for your help, but unfortunately the syntax didn't work as I'd
>> hoped.
>>
>> I believe the context for the problem is in thread below. But, according
>> to point number 2 (see bottom of thread), the original syntax was
>>designed
>> to "automatically select
>> the highest SES job for any one year" and it did this perfectly. Some
>> people had more than one job in a calendar year and I wanted to select
>>the
>> job with the highest socioeconomic rating.
>>
>> But, if I add the other variables under the break command as suggested,
>> then the highest SES job for any one year is not selected out. I thought
>> that they were constant within the structure, but I now suspect that I
>> didn't understand what you meant. Could you elaborate please?
>>
>> I also have a another, somewhat related, syntax query. Having converted
>>my
>> long file to wide I end up with a file looking like this:
>>
>> P_ID    ses_yr1 ses_yr2 ses_yr3 ses_yr4 ses_yr5 ses_yr6     .
>> 1       34      34      .       .       48      48
>> 2       48      48      48      75      75      75
>>
>> This is simply a occupation-based socioeconomic index for each year of
>>my
>> participants' working lives - exactly what I wanted. However, I need to
>> fill in the missing data by substituting in the last SES score. For
>> example, participant 1 was out of the workforce for year 3 and year 4
>>and
>> I would like to substitute in their SES score of 34 (from their last
>>job)
>> for the two points of missing data.
>>
>> I'm sure there is an easy way to do this, but I have no idea how.
>>
>> Thanks,
>>
>> Jack
>>
>>
>>
>> Dr. Jack Noone
>> Research Fellow & LHH/ABBA Project Manager
>> Ageing, Work and Health Research Unit
>> Faculty of Health Sciences
>> University of Sydney
>>
>> Ph: 02 9351 9411
>>
>>
>>
>>
>>
>> On 14/11/12 5:39 PM, "David Marso" &lt;
>
>> david.marso@
>
>> &gt; wrote:
>>
>>>Without reviewing the entire thread:
>>>*If*  p_id to AUSEI06_3digit are *CONSTANT *within the structure (P_ID *
>>>year) simply add them to the list of BREAKS
>>>ie
>>>AGGREGATE
>>>    /OUTFILE=*
>>>    /BREAK  =P_ID to AUSEI06_3digit year
>>>    /JinYear 'Number of jobs held in calendar year'=NU
>>>    /Job_SES 'Highest job SES in calendar year'    =MAX(Job_SES).
>>>
>>>*Otherwise *your question requires greater specificity.
>>>
>>>
>>>Jack Noone wrote
>>>> Hi Richard,
>>>>
>>>> You may remember the thread below. The syntax you wrote was perfect,
>>>> however I need to keep some other variables as well and I can't seem
>>>>to
>>>> figure out how to do it. Here is the piece of syntax in question.
>>>>
>>>>
>>>> GET FILE="/Users/jacknoone/desktop/expanded file.sav".
>>>> AGGREGATE
>>>>     /OUTFILE=*
>>>>     /BREAK  =P_ID year
>>>>     /JinYear 'Number of jobs held in calendar year'=NU
>>>>     /Job_SES 'Highest job SES in calendar year'    =MAX(Job_SES).
>>>>
>>>> So, how would I fit
>>>> keep = p_id to AUSEI06_3digit.
>>>> into the syntax above
>>>>
>>>> Thanks,
>>>>
>>>> Jack
>>>>
>>>>
>>>> Dr. Jack Noone
>>>> Research Fellow & LHH/ABBA Project Manager
>>>> Ageing, Work and Health Research Unit
>>>> Faculty of Health Sciences
>>>> University of Sydney
>>>>
>>>> Ph: 02 9351 9411
>>>>
>>>>
>>>>
>>>>
>> <SNIP>
>
>
>
>
>
>-----
>Please reply to the list and not to my personal email.
>Those desiring my consulting or training services please feel free to
>email me.

Re: Syntax help - duplicating variables

David Marso
Administrator
Please RTFM re AGGREGATE rather than just blindly running donated code,
i.e. MODE=ADDVARIABLES will be useful.
--
If Self Employed is constant within P_ID and Year then you will get precisely the result required.
"There were no error messages but the sorting by Job_SES didn't run. HELP!"

"didn't run" is not informative! What DID happen?
Maybe post what did occur.

Run the original AGG per RW using MODE=ADDVAR then do a SELECT IF for the MAX(SES).
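
A minimal sketch of that suggestion, assuming the long file has one row per job per year. Max_SES is a hypothetical variable name (it does not appear in the thread); a new name is used because MODE=ADDVARIABLES writes the aggregates back onto the active file, where Job_SES already exists.

```
* Sketch only, Max_SES is a hypothetical name for the yearly maximum.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=P_ID year
  /JinYear 'Number of jobs held in calendar year'=NU
  /Max_SES 'Highest job SES in calendar year'=MAX(Job_SES).
* Keep only the rows whose SES equals the yearly maximum.
SELECT IF (Job_SES = Max_SES).
EXECUTE.
```

All other variables on the surviving rows are kept as they are. Rows that tie on the maximum within a year all pass the SELECT IF, so a year can still end up with more than one row.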

Point 2.
See EXECUTE command (so the data pass is performed and the fills are populated).

---
Jack Noone wrote
Hi David and all,

Point 1:

Here is what the data looks like prior to AGG BREAK.

P_ID    year    job     yr_start        yr_stop         job_SES         self_employed
1       1964    1       1964            1965            48.4            1
1       1965    1       1964            1965            48.4            1
1       1965    2       1965            1967            48.4            0
1       1965    2       1965            1967            48.4            0
1       1965    2       1965            1967            48.4            0
1       1967    3       1967            1969            48.4            1
1       1967    3       1967            1969            48.4            1
1       1968    4       1968            1969            48.4            0
1       1969    4       1968            1969            48.4            0
1       1969    5       1969            1974            83.7            1
1       1969    5       1969            1974            83.7            1

And so on


However, we can see that people are holding more than one job in a
calendar year.
So I applied this syntax (c/o R. Ristow) with the aim of keeping only the
highest job_SES for any given year:

AGGREGATE
     /OUTFILE=*
     /BREAK  =P_ID year
     /JinYear 'Number of jobs held in calendar year'=NU
     /Job_SES 'Highest job SES in calendar year'    =MAX(Job_SES).

Which resulted in

P_ID    year    jinyear         job_SES
1       1964    1               48.4
1       1965    2               48.4
1       1966    1               48.4
1       1967    2               48.4
1       1968    2               48.4
1       1969    2               83.7
1       1970    1               83.7
1       1971    1               83.7

Perfect! In 1969, this participant held one job with a SES rating of 48.4
and one with SES rating of 83.7. However, only the higher rating SES value
is chosen for 1969.

However, I want to know if the person was self-employed for the job that
has been selected. So I tried this:

AGGREGATE
     /OUTFILE=*
     /BREAK  =P_ID year self_employed
     /JinYear 'Number of jobs held in calendar year'=NU
     /Job_SES 'Highest job SES in calendar year'    =MAX(Job_SES).

But it didn't work. There were no error messages but the sorting by
Job_SES didn't run. HELP!
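
A likely reason, offered as a reading of the listing above rather than a confirmed diagnosis: every variable named on BREAK helps define the groups, so P_ID year self_employed gives one output row per combination. For 1969 that means one row for self_employed=0 (maximum 48.4) and another for self_employed=1 (maximum 83.7), rather than a single row for the year. One way to keep exactly one row per person and year while carrying self_employed along is to sort so that the highest SES comes first and then keep the first case in each group. The sketch below assumes that approach; top_job is a hypothetical flag name.

```
* Sketch only, top_job is a hypothetical flag variable.
SORT CASES BY P_ID (A) year (A) Job_SES (D).
MATCH FILES
  /FILE=*
  /BY P_ID year
  /FIRST=top_job.
SELECT IF (top_job = 1).
EXECUTE.
```

If two jobs tie on SES within a year, this keeps whichever sorts first, so a further sort key would be needed if the choice matters. The job count (JinYear) is not produced by this step.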


Point 2:

I converted the long format file to wide format so I could take a look at
the missing data.

I then applied this syntax after sorting the variables

VECTOR V=Job_ses_1 TO Job_SES_55.
LOOP #=2 TO 10.
IF MISSING(V(#)) V(#)=V(#-1).
END LOOP.

There were no errors but the missing data were not filled.
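
Two details of the loop as posted may explain this, though neither is confirmed in the thread: the index stops at 10 while the vector spans 55 variables, and transformations stay pending until a procedure or EXECUTE forces a data pass. A sketch of the fill over the full range, assuming Job_SES_1 to Job_SES_55 sit next to each other in the file in year order (#i is just a scratch loop index):

```
* Sketch only, assumes Job_SES_1 to Job_SES_55 are adjacent and in year order.
VECTOR V = Job_SES_1 TO Job_SES_55.
LOOP #i = 2 TO 55.
  IF MISSING(V(#i)) V(#i) = V(#i - 1).
END LOOP.
EXECUTE.
```

Because the loop runs left to right, a filled value is itself carried forward, so a run of consecutive gaps all take the last observed score. Values that are missing before the first recorded job stay missing.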




Sadly I don't have the knowledge to solve this one myself. HELP AGAIN!

Jack



Dr. Jack Noone
Research Fellow & LHH/ABBA Project Manager
Ageing, Work and Health Research Unit
Faculty of Health Sciences
University of Sydney

Ph: 02 9351 9411





On 16/11/12 1:23 AM, "David Marso" <[hidden email]> wrote:

>Your initial followup:
>"I need to keep some other variables as well and I can't seem to figure
>out how to do it."
>If the values of these additional variables vary over year then you need
>to specify how these new variables will be represented in the new data file.
>If they don't vary then everything should be exactly as if they were not
>used in the AGG BREAK.  Maybe time for you to post what the before/after
>(pre aggregate/post aggregate) data look like.
>
>Point 2:
>Data x1...x10
>1 1 . . 3 4 5
>-----
>VECTOR V=x1 TO x10.
>LOOP #=2 TO 10.
>IF MISSING(V(#)) V(#)=V(#-1).
>END LOOP.
>----------------
>
>Jack Noone wrote
>> Hi David,
>>
>> Thanks for your help, but unfortunately the syntax didn't work as I'd
>> hoped.
>>
>> I believe the context for the problem is in the thread below. But, according
>> to point number 2 (see bottom of thread), the original syntax was designed
>> to "automatically select the highest SES job for any one year" and it did
>> this perfectly. Some people had more than one job in a calendar year and I
>> wanted to select the job with the highest socioeconomic rating.
>>
>> But, if I add the other variables under the break command as suggested,
>> then the highest SES job for any one year is not selected out. I thought
>> that they were constant within the structure, but I now suspect that I
>> didn't understand what you meant. Could you elaborate please?
>>
>> I also have another, somewhat related, syntax query. Having converted my
>> long file to wide I end up with a file looking like this:
>>
>> P_ID    ses_yr1 ses_yr2 ses_yr3 ses_yr4 ses_yr5 ses_yr6     .
>> 1       34      34      .       .       48      48
>> 2       48      48      48      75      75      75
>>
>> This is simply an occupation-based socioeconomic index for each year of
>> my participants' working lives - exactly what I wanted. However, I need to
>> fill in the missing data by substituting in the last SES score. For
>> example, participant 1 was out of the workforce for year 3 and year 4 and
>> I would like to substitute in their SES score of 34 (from their last job)
>> for the two points of missing data.
>>
>> I'm sure there is an easy way to do this, but I have no idea how.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Re: Syntax help - duplicating variables

Jack Noone
Dear David,

I find your language (e.g. RTFM) completely inappropriate for this forum.

I am doing my best to solve a complex problem within a tight time-frame
and with only limited knowledge of SPSS syntax.

I would appreciate it if you did not respond to any more of my posts
including this one.

Jack.

Dr. Jack Noone
Research Fellow & LHH/ABBA Project Manager
Ageing, Work and Health Research Unit
Faculty of Health Sciences
University of Sydney

Ph: 02 9351 9411





On 16/11/12 1:09 PM, "David Marso" <[hidden email]> wrote:

>Please RTFM re AGGREGATE rather than just blindly running donated code.
>ie MODE=ADDVARIABLES will be useful.
> <SNIP>

Re: Syntax help - duplicating variables -

thara vardhan-2
Hi Jack

I have been following this discussion quite keenly.

It is a pity that you are reacting so strongly to David's response.

Just wanted to let you know that David is one of the most helpful and knowledgeable people on the list.

In fact he is a guru of SPSS syntax in the real sense.

His comments, suggestions and help with syntax go beyond the initial problem posted by members, and thereby help the person think more carefully and come to the best possible solution and conclusion for the issue they are working on.

Perhaps you are under a lot of stress right now.

Hopefully this will change your mind about not wanting to interact with him anymore on this forum. Oh yes, I am writing from down under!

cheers
thara vardhan

 



From:        Jack Noone <[hidden email]>
To:        [hidden email]
Date:        16/11/2012 11:35
Subject:        Re: Syntax help - duplicating variables
Sent by:        "SPSSX(r) Discussion" <[hidden email]>




Dear David,

I find your language (e.g. RTFM) completely inappropriate for this forum.

I am doing my best to solve a complex problem within a tight time-frame
and with only limited knowledge of SPSS syntax.

I would appreciate it if you did not respond to any more of my posts
including this one.

Jack.

Dr. Jack Noone
Research Fellow & LHH/ABBA Project Manager
Ageing, Work and Health Research Unit
Faculty of Health Sciences
University of Sydney

Ph: 02 9351 9411





On 16/11/12 1:09 PM, "David Marso" <[hidden email]> wrote:

>Please RTFM re AGGREGATE rather than just blindly running donated code.
>ie MODE=ADDVARIABLES will be useful.
>--
>If Self Employed is a constant for P_ID and Year then you will get
>precisely
>the result required.
>"Their were no error messages but the sorting by Job_SES didn't run.
>HELP!"
>
>"didn't run" is *not *informative!  What DID happen????
>Maybe post what did occur???
>
>Run the original AGG per RW using MODE=ADDVAR then do a SELECT IF for the
>MAX(SES).
>
>Point 2.
>See EXECUTE command (so the data pass is performed and the fills are
>populated).
>
>---
>
>Jack Noone wrote
>> Hi David and all,
>>
>> Point 1:
>>
>> Here is what the data looks like prior to AGG BREAK.
>>
>> P_ID    year    job     yr_start        yr_stop         job_SES
>> self_employed
>> 1       1964    1       1964            1965            48.4
>>1
>> 1       1965    1       1964            1965            48.4
>>1
>> 1       1965    2       1965            1967            48.4
>>0
>> 1       1965    2       1965            1967            48.4
>>0
>> 1       1965    2       1965            1967            48.4
>>0
>> 1       1967    3       1967            1969            48.4
>>1
>> 1       1967    3       1967            1969            48.4
>>1
>> 1       1968    4       1968            1969            48.4
>>0
>> 1       1969    4       1968            1969            48.4
>>0
>> 1       1969    5       1969            1974            83.7
>>1
>> 1       1969    5       1969            1974            83.7
>>1
>>
>> And so on
>>
>>
>> However, we can see that people are holding more than one job in a
>> calendar year.
>> So I applied this syntax (℅ R.Ristow) with the aim to have only the
>> highest job_SES for any given year:
>>
>> AGGREGATE
>>      /OUTFILE=*
>>      /BREAK  =P_ID year
>>      /JinYear 'Number of jobs held in calendar year'=NU
>>      /Job_SES 'Highest job SES in calendar year'    =MAX(Job_SES).
>>
>> Which resulted in
>>
>> P_ID    year    jinyear         job_SES
>> 1       1964    1               48.4
>> 1       1965    2               48.4
>> 1       1966    1               48.4
>> 1       1967    2               48.4
>> 1       1968    2               48.4
>> 1       1969    2               83.7
>> 1       1970    1               83.7
>> 1       1971    1               83.7
>>
>> Perfect! In 1969, this participant held one job with a SES rating of
>>48.4
>> and one with SES rating of 83.7. However, only the higher rating SES
>>value
>> is chosen for 1969.
>>
>> However, I want to know if the person was self-employed for the job that
>> has been selected. So I tried this:
>>
>> AGGREGATE
>>      /OUTFILE=*
>>      /BREAK  =P_ID year self_employed
>>      /JinYear 'Number of jobs held in calendar year'=NU
>>      /Job_SES 'Highest job SES in calendar year'    =MAX(Job_SES).
>>
>> But it didn't work. Their were no error messages but the sorting by
>> Job_SES didn't run. HELP!
>>
>>
>> Point 2:
>>
>> I converted the long format file to wide format so I could take a look
>>at
>> the missing data.
>>
>> I then applied this syntax after sorting the variables
>>
>> VECTOR V=Job_ses_1 TO Job_SES_55.
>> LOOP #=2 TO 10.
>> IF MISSING(V(#)) V(#)=V(#-1),
>> END LOOP.
>>
>> There were no errors but the missing data were not filled.
>>
>>
>>
>>
>> Sadly I don't have the knowledge to solve this one myself. HELP AGAIN!
>>
>> Jack
>>
>>
>>
>> Dr. Jack Noone
>> Research Fellow & LHH/ABBA Project Manager
>> Ageing, Work and Health Research Unit
>> Faculty of Health Sciences
>> University of Sydney
>>
>> Ph: 02 9351 9411
>>
>>
>>
>>
>>
>> On 16/11/12 1:23 AM, "David Marso" &lt;
>
>> david.marso@
>
>> &gt; wrote:
>>
>>>Your initial followup:
>>>"I need to keep some other variables as well and I can't seem to figure
>>>out
>>>how to do it."
>>>If the values of these additional variables vary over year then you need
>>>to
>>>specify how these new variables will be represented in the new data
>>>file.
>>>If they don't vary then everything should be exactly as if they were not
>>>used in the AGG BREAK.  Maybe time for you to post what the before/after
>>>(pre aggregate/post aggregated) data appear.
>>>
>>>Point 2:
>>>Data x1...x10
>>>1 1 . . 3 4 5
>>>-----
>>>VECTOR V=V1 TO V10.
>>>LOOP #=2 TO 10.
>>>IF MISSING(V(#)) V(#)=V(#-1),
>>>END LOOP.
>>>----------------
>>>
>>>Jack Noone wrote
>>>> Hi David,
>>>>
>>>> Thanks for your help, but unfortunately the syntax didn't work as I'd
>>>> hoped.
>>>>
>>>> I believe the context for the problem is in thread below. But,
>>>>according
>>>> to point number 2 (see bottom of thread), the original syntax was
>>>>designed
>>>> to "automatically select
>>>> the highest SES job for any one year" and it did this perfectly. Some
>>>> people had more than one job in a calendar year and I wanted to select
>>>>the
>>>> job with the highest socioeconomic rating.
>>>>
>>>> But, if I add the other variables under the break command as
>>>>suggested,
>>>> then the highest SES job for any one year is not selected out. I
>>>>thought
>>>> that they were constant within the structure, but I now suspect that I
>>>> didn't understand what you meant. Could you elaborate please?
>>>>
>>>> I also have a another, somewhat related, syntax query. Having
>>>>converted
>>>>my
>>>> long file to wide I end up with a file looking like this:
>>>>
>>>> P_ID    ses_yr1 ses_yr2 ses_yr3 ses_yr4 ses_yr5 ses_yr6     .
>>>> 1       34      34      .       .       48      48
>>>> 2       48      48      48      75      75      75
>>>>
>>>> This is simply a occupation-based socioeconomic index for each year of
>>>>my
>>>> participants' working lives - exactly what I wanted. However, I need
>>>>to
>>>> fill in the missing data by substituting in the last SES score. For
>>>> example, participant 1 was out of the workforce for year 3 and year 4
>>>>and
>>>> I would like to substitute in their SES score of 34 (from their last
>>>>job)
>>>> for the two points of missing data.
>>>>
>>>> I'm sure there is an easy way to do this, but I have no idea how.
>>>>
>>>> Thanks,
>>>>
>>>> Jack
>>>>
>>>>
>>>>
>>>> Dr. Jack Noone
>>>> Research Fellow & LHH/ABBA Project Manager
>>>> Ageing, Work and Health Research Unit
>>>> Faculty of Health Sciences
>>>> University of Sydney
>>>>
>>>> Ph: 02 9351 9411
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 14/11/12 5:39 PM, "David Marso" &lt;
>>>
>>>> david.marso@
>>>
>>>> &gt; wrote:
>>>>
>>>>>Without reviewing the entire thread:
>>>>>*If*  p_id to AUSEI06_3digit are *CONSTANT *within the structure
>>>>>(P_ID *
>>>>>year) simply add them to the list of BREAKS
>>>>>ie
>>>>>AGGREGATE
>>>>>    /OUTFILE=*
>>>>>    /BREAK  =P_ID to AUSEI06_3digit year
>>>>>    /JinYear 'Number of jobs held in calendar year'=NU
>>>>>    /Job_SES 'Highest job SES in calendar year'    =MAX(Job_SES).
>>>>>
>>>>>*Otherwise *your question requires greater specificity.
>>>>>
>>>>>
>>>>>Jack Noone wrote
>>>>>> Hi Richard,
>>>>>>
>>>>>> You may remember the thread below. The syntax you wrote was perfect,
>>>>>> however I need to keep some other variables as well and I can't seem
>>>>>>to
>>>>>> figure out how to do it. Here is the piece of syntax in question.
>>>>>>
>>>>>>
>>>>>> GET FILE="/Users/jacknoone/desktop/expanded file.sav".
>>>>>> AGGREGATE
>>>>>>     /OUTFILE=*
>>>>>>     /BREAK  =P_ID year
>>>>>>     /JinYear 'Number of jobs held in calendar year'=NU
>>>>>>     /Job_SES 'Highest job SES in calendar year'    =MAX(Job_SES).
>>>>>>
>>>>>> So, how would I fit
>>>>>> keep = p_id to AUSEI06_3digit.
>>>>>> into the syntax above
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Jack
>>>>>>
>>>>>>
>>>>>> Dr. Jack Noone
>>>>>> Research Fellow & LHH/ABBA Project Manager
>>>>>> Ageing, Work and Health Research Unit
>>>>>> Faculty of Health Sciences
>>>>>> University of Sydney
>>>>>>
>>>>>> Ph: 02 9351 9411
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>
>> <SNIP>
>>>
>>>
>>>
>>>
>>>
>>>-----
>>>Please reply to the list and not to my personal email.
>>>Those desiring my consulting or training services please feel free to
>>>email me.
>>>--
>>>View this message in context:
>>>
http://spssx-discussion.1045642.n5.nabble.com/Syntax-help-duplicating-va
>>>ri
>>>ables-tp5715562p5716214.html
>>>Sent from the SPSSX Discussion mailing list archive at Nabble.com.
>>>
>>>=====================
>>>To manage your subscription to SPSSX-L, send a message to
>>>
>
>> LISTSERV@.UGA
>
>>  (not to SPSSX-L), with no body text except the
>>>command. To leave the list, send the command
>>>SIGNOFF SPSSX-L
>>>For a list of commands to manage subscriptions, send the command
>>>INFO REFCARD
>>
>> =====================
>> To manage your subscription to SPSSX-L, send a message to
>
>> LISTSERV@.UGA
>
>>  (not to SPSSX-L), with no body text except the
>> command. To leave the list, send the command
>> SIGNOFF SPSSX-L
>> For a list of commands to manage subscriptions, send the command
>> INFO REFCARD
>
>
>
>
>
>-----
>Please reply to the list and not to my personal email.
>Those desiring my consulting or training services please feel free to
>email me.
>--
>View this message in context:
>
http://spssx-discussion.1045642.n5.nabble.com/Syntax-help-duplicating-vari
>ables-tp5715562p5716240.html
>Sent from the SPSSX Discussion mailing list archive at Nabble.com.
>
>=====================
>To manage your subscription to SPSSX-L, send a message to
>[hidden email] (not to SPSSX-L), with no body text except the
>command. To leave the list, send the command
>SIGNOFF SPSSX-L
>For a list of commands to manage subscriptions, send the command
>INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD



Re: Syntax help - duplicating variables

David Marso
Administrator
In reply to this post by Jack Noone
RTFM = Read the FINE manual. If for some reason you believed otherwise, you have not been here in this forum very long. When people such as Richard and myself reach out to assist, perhaps you should take some responsibility for your own outcomes and do a bit of reading!
I will refrain from any further assistance on any of your future issues, since you really think looking a gift horse in the mouth is the modus operandi!
"complex problem"?? Hardly!
--
Jack Noone wrote
Dear David,

I find your language (e.g. RTFM) completely inappropriate for this forum.

I am doing my best to solve a complex problem within a tight time-frame
and with only limited knowledge of SPSS syntax.

I would appreciate it if you did not respond to any more of my posts
including this one.

Jack.

Dr. Jack Noone
Research Fellow & LHH/ABBA Project Manager
Ageing, Work and Health Research Unit
Faculty of Health Sciences
University of Sydney

Ph: 02 9351 9411





On 16/11/12 1:09 PM, "David Marso" <[hidden email]> wrote:

>Please RTFM re AGGREGATE rather than just blindly running donated code.
>ie MODE=ADDVARIABLES will be useful.
>--
>If Self Employed is a constant for P_ID and Year then you will get precisely
>the result required.
>"There were no error messages but the sorting by Job_SES didn't run. HELP!"
>
>"didn't run" is *not *informative!  What DID happen????
>Maybe post what did occur???
>
>Run the original AGG per RW using MODE=ADDVAR then do a SELECT IF for the MAX(SES).
>
>Point 2.
>See EXECUTE command (so the data pass is performed and the fills are populated).
>
>---
>
>Jack Noone wrote
>> Hi David and all,
>>
>> Point 1:
>>
>> Here is what the data looks like prior to AGG BREAK.
>>
>> P_ID    year    job     yr_start    yr_stop    job_SES    self_employed
>> 1       1964    1       1964        1965       48.4       1
>> 1       1965    1       1964        1965       48.4       1
>> 1       1965    2       1965        1967       48.4       0
>> 1       1965    2       1965        1967       48.4       0
>> 1       1965    2       1965        1967       48.4       0
>> 1       1967    3       1967        1969       48.4       1
>> 1       1967    3       1967        1969       48.4       1
>> 1       1968    4       1968        1969       48.4       0
>> 1       1969    4       1968        1969       48.4       0
>> 1       1969    5       1969        1974       83.7       1
>> 1       1969    5       1969        1974       83.7       1
>>
>> And so on
>>
>>
>> However, we can see that people are holding more than one job in a
>> calendar year.
>> So I applied this syntax (℅ R.Ristow) with the aim to have only the
>> highest job_SES for any given year:
>>
>> AGGREGATE
>>      /OUTFILE=*
>>      /BREAK  =P_ID year
>>      /JinYear 'Number of jobs held in calendar year'=NU
>>      /Job_SES 'Highest job SES in calendar year'    =MAX(Job_SES).
>>
>> Which resulted in
>>
>> P_ID    year    jinyear         job_SES
>> 1       1964    1               48.4
>> 1       1965    2               48.4
>> 1       1966    1               48.4
>> 1       1967    2               48.4
>> 1       1968    2               48.4
>> 1       1969    2               83.7
>> 1       1970    1               83.7
>> 1       1971    1               83.7
>>
>> Perfect! In 1969, this participant held one job with a SES rating of 48.4
>> and one with SES rating of 83.7. However, only the higher rating SES value
>> is chosen for 1969.
>>
>> However, I want to know if the person was self-employed for the job that
>> has been selected. So I tried this:
>>
>> AGGREGATE
>>      /OUTFILE=*
>>      /BREAK  =P_ID year self_employed
>>      /JinYear 'Number of jobs held in calendar year'=NU
>>      /Job_SES 'Highest job SES in calendar year'    =MAX(Job_SES).
>>
>> But it didn't work. There were no error messages but the sorting by
>> Job_SES didn't run. HELP!
>>
>>
>> Point 2:
>>
>> I converted the long format file to wide format so I could take a look at
>> the missing data.
>>
>> I then applied this syntax after sorting the variables
>>
>> VECTOR V=Job_ses_1 TO Job_SES_55.
>> LOOP #=2 TO 10.
>> IF MISSING(V(#)) V(#)=V(#-1),
>> END LOOP.
>>
>> There were no errors but the missing data were not filled.
>>
>>
>>
>>
>> Sadly I don't have the knowledge to solve this one myself. HELP AGAIN!
>>
>> Jack
>>
>>
>>
>> Dr. Jack Noone
>> Research Fellow & LHH/ABBA Project Manager
>> Ageing, Work and Health Research Unit
>> Faculty of Health Sciences
>> University of Sydney
>>
>> Ph: 02 9351 9411
>>
>>
>>
>>
>>
>> On 16/11/12 1:23 AM, "David Marso" <david.marso@> wrote:
>>
>>>Your initial followup:
>>>"I need to keep some other variables as well and I can't seem to figure out
>>>how to do it."
>>>If the values of these additional variables vary over year then you need to
>>>specify how these new variables will be represented in the new data file.
>>>If they don't vary then everything should be exactly as if they were not
>>>used in the AGG BREAK.  Maybe time for you to post what the before/after
>>>(pre aggregate/post aggregated) data appear.
>>>
>>>Point 2:
>>>Data x1...x10
>>>1 1 . . 3 4 5
>>>-----
>>>VECTOR V=V1 TO V10.
>>>LOOP #=2 TO 10.
>>>IF MISSING(V(#)) V(#)=V(#-1),
>>>END LOOP.
>>>----------------
>>>
>>>Jack Noone wrote
>>>> Hi David,
>>>>
>>>> Thanks for your help, but unfortunately the syntax didn't work as I'd
>>>> hoped.
>>>>
>>>> I believe the context for the problem is in thread below. But, according
>>>> to point number 2 (see bottom of thread), the original syntax was designed
>>>> to "automatically select the highest SES job for any one year" and it did
>>>> this perfectly. Some people had more than one job in a calendar year and I
>>>> wanted to select the job with the highest socioeconomic rating.
>>>>
>>>> But, if I add the other variables under the break command as suggested,
>>>> then the highest SES job for any one year is not selected out. I thought
>>>> that they were constant within the structure, but I now suspect that I
>>>> didn't understand what you meant. Could you elaborate please?
>>>>
>>>> I also have another, somewhat related, syntax query. Having converted my
>>>> long file to wide I end up with a file looking like this:
>>>>
>>>> P_ID    ses_yr1 ses_yr2 ses_yr3 ses_yr4 ses_yr5 ses_yr6     .
>>>> 1       34      34      .       .       48      48
>>>> 2       48      48      48      75      75      75
>>>>
>>>> This is simply an occupation-based socioeconomic index for each year of my
>>>> participants' working lives - exactly what I wanted. However, I need to
>>>> fill in the missing data by substituting in the last SES score. For
>>>> example, participant 1 was out of the workforce for year 3 and year 4 and
>>>> I would like to substitute in their SES score of 34 (from their last job)
>>>> for the two points of missing data.
>>>>
>>>> I'm sure there is an easy way to do this, but I have no idea how.
>>>>
>>>> Thanks,
>>>>
>>>> Jack
>>>>
>>>>
>>>>
>>>> Dr. Jack Noone
>>>> Research Fellow & LHH/ABBA Project Manager
>>>> Ageing, Work and Health Research Unit
>>>> Faculty of Health Sciences
>>>> University of Sydney
>>>>
>>>> Ph: 02 9351 9411
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 14/11/12 5:39 PM, "David Marso" <david.marso@> wrote:
>>>>
>>>>>Without reviewing the entire thread:
>>>>>*If* p_id to AUSEI06_3digit are *CONSTANT* within the structure
>>>>>(P_ID * year) simply add them to the list of BREAKS
>>>>>ie
>>>>>AGGREGATE
>>>>>    /OUTFILE=*
>>>>>    /BREAK  =P_ID to AUSEI06_3digit year
>>>>>    /JinYear 'Number of jobs held in calendar year'=NU
>>>>>    /Job_SES 'Highest job SES in calendar year'    =MAX(Job_SES).
>>>>>
>>>>>*Otherwise *your question requires greater specificity.
>>>>>
>>>>>
>>>>>Jack Noone wrote
>>>>>> Hi Richard,
>>>>>>
>>>>>> You may remember the thread below. The syntax you wrote was perfect,
>>>>>> however I need to keep some other variables as well and I can't seem to
>>>>>> figure out how to do it. Here is the piece of syntax in question.
>>>>>>
>>>>>>
>>>>>> GET FILE="/Users/jacknoone/desktop/expanded file.sav".
>>>>>> AGGREGATE
>>>>>>     /OUTFILE=*
>>>>>>     /BREAK  =P_ID year
>>>>>>     /JinYear 'Number of jobs held in calendar year'=NU
>>>>>>     /Job_SES 'Highest job SES in calendar year'    =MAX(Job_SES).
>>>>>>
>>>>>> So, how would I fit
>>>>>> keep = p_id to AUSEI06_3digit.
>>>>>> into the syntax above
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Jack
>>>>>>
>>>>>>
>>>>>> Dr. Jack Noone
>>>>>> Research Fellow & LHH/ABBA Project Manager
>>>>>> Ageing, Work and Health Research Unit
>>>>>> Faculty of Health Sciences
>>>>>> University of Sydney
>>>>>>
>>>>>> Ph: 02 9351 9411
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>
>> <SNIP>
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
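
For anyone following the quoted advice above (run the AGGREGATE with MODE=ADDVARIABLES, then SELECT IF on the maximum SES), here is a rough, untested sketch of how that might look. Max_SES is a made-up name for the added variable. It is used on the assumption that, with MODE=ADDVARIABLES, the aggregated variable needs a name that does not already exist in the file, so Job_SES cannot be reused as the target. JinYear is likewise assumed not to exist yet.

```spss
* Untested sketch of the MODE=ADDVARIABLES approach.
* Max_SES is a hypothetical new variable name and does not come from the thread.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=P_ID year
  /JinYear 'Number of jobs held in calendar year' = NU
  /Max_SES 'Highest job SES in calendar year' = MAX(Job_SES).
* Keep only the job rows whose SES equals the maximum for that person and year.
SELECT IF (Job_SES EQ Max_SES).
EXECUTE.
```

Because every original row is kept until the SELECT IF, self_employed stays attached to the job that is retained. If two jobs in the same year tie on Job_SES, both rows survive the SELECT IF, so a further step would be needed to pick one.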

Re: Syntax help - duplicating variables

Rich Ulrich
In reply to this post by Jack Noone
That IF MISSING   line needs to end with a period, not a comma.

It is curious that no syntax error was generated.
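
For reference, a sketch of the quoted loop with the terminator fixed. It is untested and assumes Job_ses_1 TO Job_SES_55 are adjacent in the file. It also assumes all 55 columns should be filled, so the upper bound is 55 rather than the 10 in the posted loop, which would leave elements 11 to 55 untouched. The EXECUTE mentioned elsewhere in the thread is added so the transformation is actually carried out. #i is just a scratch index name.

```spss
* Untested sketch: carry the last value forward into missing slots.
VECTOR V = Job_ses_1 TO Job_SES_55.
LOOP #i = 2 TO 55.
IF MISSING(V(#i)) V(#i) = V(#i - 1).
END LOOP.
EXECUTE.
```

Since the loop runs left to right, a run of consecutive missing values is filled from the last non-missing column before it. A missing value in the first column stays missing, as there is nothing earlier to copy from.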

--
Rich Ulrich

> Date: Thu, 15 Nov 2012 22:58:45 +0000
> From: [hidden email]
> Subject: Re: Syntax help - duplicating variables
> To: [hidden email]
...

>
> Point 2:
>
> I converted the long format file to wide format so I could take a look at
> the missing data.
>
> I then applied this syntax after sorting the variables
>
> VECTOR V=Job_ses_1 TO Job_SES_55.
> LOOP #=2 TO 10.
> IF MISSING(V(#)) V(#)=V(#-1),
> END LOOP.
>
> There were no errors but the missing data were not filled.
>
> ...

Re: Syntax help - duplicating variables

Jack Noone
It is curious isn't it?

I'd actually corrected the comma and added the EXECUTE beforehand; I'd just copied in David's syntax without correcting the error.

So this is what I am running:

VECTOR V=AUSEI06_2.1 TO AUSEI06_2.55.
LOOP #=2 TO 10.
IF MISSING(V(#)) V(#)=V(#-1).
END LOOP.
Execute.

The AUSEI06_2.1 variable is first in the database followed by all the others up to AUSEI06_2.55.

Any ideas? According to the output window, everything went fine. I might send the file and data to a friend to see if it works for them.

Thanks,

Jack

Dr. Jack Noone
Research Fellow & LHH/ABBA Project Manager
Ageing, Work and Health Research Unit
Faculty of Health Sciences
University of Sydney

Ph: 02 9351 9411


From: Rich Ulrich <[hidden email]>
Date: Fri, 16 Nov 2012 01:35:10 -0500
To: Jack Noone <[hidden email]>, SPSS list <[hidden email]>
Subject: RE: Syntax help - duplicating variables

That IF MISSING   line needs to end with a period, not a comma.

It is curious that no syntax error was generated.

--
Rich Ulrich
