SPSSX Discussion

Syntax help - duplicating variables


Jack Noone

Syntax help - duplicating variables

Dear colleagues,

I would like to be able to duplicate a variable (V1) X times according to
the value of another variable (V2).

In other words, I would like to convert this:

V1 V2
Case1 43.2 3
Case2 48.1 4

To this:
V1 V2 V2_a V2_b V2_c V2_d
Case1 43.2 3 43.2 43.2 43.2 .
Case2 48.1 4 48.1 48.1 48.1 48.1

Unfortunately my syntax skills aren't up to the task. Could anyone offer
any assistance please?

Thanks,

Jack

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Richard Ristow

Re: Syntax help - duplicating variables

At 09:40 PM 10/9/2012, Jack Noone wrote:

>I would like to be able to duplicate a variable (V1) X times according to
>the value of another variable (V2). In other words, I would like to
>convert this:
>
> V1 V2
>Case1 43.2 3
>Case2 48.1 4
>
>To this:
> V1 V2 V2_a V2_b V2_c V2_d
>Case1 43.2 3 43.2 43.2 43.2 .
>Case2 48.1 4 48.1 48.1 48.1 48.1

I'm tossing this off, untested. Besides any mistakes I may make, one
warning: it doesn't check for invalid values of V2 (though it does
cap at 6 copies).

NUMERIC V2_a V2_b V2_c V2_d V2_e V2_f (F5.1).
VECTOR New_V = V2_a TO V2_f.

LOOP #idx = 1 TO (MIN(V2,6)).
. COMPUTE New_V(#idx) = V1.
END LOOP.

Let me end with the question that bothers me: Why do you want to do
this? If you tell us what you need to accomplish, there may be
another way to do it.
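
One way to add the check that the warning above mentions, as a minimal untested sketch: flag cases whose V2 is missing, negative, fractional, or larger than 6, and fill the copies only for the rest. The flag variable V2_ok is a hypothetical name, and treating V2 > 6 as invalid (rather than capping at 6) is an assumption, not part of the suggestion above.

```spss
* Untested sketch. V2_ok is a hypothetical flag variable.
NUMERIC V2_a V2_b V2_c V2_d V2_e V2_f (F5.1).
* VECTOR needs the equals sign between the vector name and the variable list.
VECTOR  New_V = V2_a TO V2_f.

* V2_ok is 1 when V2 is a whole number from 0 to 6, otherwise 0.
COMPUTE V2_ok = (NOT MISSING(V2) AND V2 >= 0 AND V2 <= 6 AND V2 = TRUNC(V2)).

* Fill the first V2 copies only for cases that passed the check.
DO IF (V2_ok = 1).
.  LOOP #idx = 1 TO V2.
.     COMPUTE New_V(#idx) = V1.
.  END LOOP.
END IF.

* List the cases that were skipped, so they can be reviewed by hand.
TEMPORARY.
SELECT IF (V2_ok = 0).
LIST V1 V2.
```

Cases that fail the check keep system-missing values in all six copies, which makes them easy to spot later.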

Jack Noone

Re: Syntax help - duplicating variables

Hi Richard and others.

The variable V1 represents a continuous socioeconomic indicator for a
participant's given job and the variable V2 represents the number of years
a participant has worked in this position.

I want to create a wide data set (I've not had much to do with long data
sets) with 62 variables. These variables correspond to each year from 1950
to 2012 and the value for each variable will be the socioeconomic
indicator for a job held during that year.

I have variables that represent:
1) The year each job started
2) The year each job ended
3) the number of years in each job (subtract 1 from 2 to create V2
variable)
4) The socioeconomic value for each job held (V1 variable).

So in reality I have:

V1_1 (SES for first job), V1_2 (SES for second job), ..., V1_J (SES for
last job)
V2_1 (number of years in first job), V2_2, ..., V2_J (number of years
in last job)

I would like to turn this into something like this:

1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 ..2012
Case1 43.2 43.2 43.2 60.2 60.2 60.2 60.2 60.2 60.2 60.2 ...

This is a person who started their first job (SES value = 43.2) in 1950
and left the job in 1952. They started their second job (SES value=60.2)
in 1953 and left it in 2012.

There are also lots of little caveats along the way too!

In the end, the data is going to form the basis for growth mixture
modeling.

I'm sure there is a better way and I am open to suggestions.

Jack

John F Hall

Re: Syntax help - duplicating variables

In reply to this post by Richard Ristow

Not the answer you need, but an opportunity for me to ask yet again the same
question I was asking in 1974, why doesn't SPSS have a facility for
automatic generation of variables ending in alphabetic characters?

DO REPEAT
X = v2_a to v2_d
~ ~ ~ ~
END REPEAT.

John F Hall (Mr)

Email: [hidden email]
Website: www.surveyresearch.weebly.com

Rich Ulrich

Re: Syntax help - duplicating variables

In reply to this post by Jack Noone

I think you will save yourself a lot of grief in the long run if
you do take the trouble to use a "long data set" version.
For one thing, I would not trust the number of years to add up
to the right total until they have been massaged a bit, because
"years in job" is not likely to be without rounding error.

So -- Write out the number of lines, using XSAVE, for each
job. For that new file, find the people who have inconsistent figures
for total-time versus the number of lines. The long form is
also useful for any other edits that you might want to do on
SES or whatever, since every occurrence of SES becomes the
same variable in the long form.

If you do need the file back in wide form, you can use
CasesToVars.

--
Rich Ulrich

David Marso

Re: Syntax help - duplicating variables

Additionally, AFAIK growth mixture modelling in SPSS would most likely require the MIXED procedure which presumes a LONG format for the data!!!

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Jack Noone

Re: Syntax help - duplicating variables

Thanks All. I'll do it in the long format as you suggest. However, I'll be
doing the analysis in Mplus.

Regards,

Jack

Richard Ristow

Re: Syntax help - duplicating variables

In reply to this post by Jack Noone

At 11:33 PM 10/9/2012, Jack Noone wrote:

>The variable V1 represents a continuous
>socioeconomic indicator for a participant's
>given job and the variable V2 represents the
>number of years a participant has worked in this position.
>
>I have variables that represent:
>1) The year each job started
>2) The year each job ended
>4) The socioeconomic value for each job held (V1 variable).

Thanks. Notice that this is considerably more
complex than the problem you originally posted.
Generally, we can give you better answers, and
more quickly, if you give us the full problem and context at the beginning.

>3) the number of years in each job (subtract 1 from 2 to create V2 variable)

You probably don't want to create and use these.
The beginning and ending years, themselves, are
much easier to work with accurately.

Now, how are your variables laid out in the
dataset? If all the beginning years are together,
all the ending years together, and all the SES
variables together, you can define each kind of
variable as a vector, and the code is much
easier. But I'll bet it's not that way; that you
have Start_year_1, End_Year_1, SES_1, then
Start_year_2, End_Year_2, SES_2, and so on for
however many jobs your records accommodate.

Then, are there variables other than SES that you
want to have for each year? I'll bet there are.
Are those variables associated with the jobs
held, i.e. don't change during the duration of
the job, or do they change at times besides when
a job changes? I'll bet they do.

As others have said, a long-form dataset will be
easier to create. It's almost imperative if there
are any variables that vary with time but change
at other times than when jobs start and end. For
many analyses, the long-form data is easier to
work with, and for some it is essential. And, as
Rich Ulrich noted, if you finally do need a wide
dataset, you can create it easily from the long dataset, using CASESTOVARS.

The following code is tested (see listing in
Appendix). Notice that it unrolls your data
twice, first to one record per job, then to one record per year.
Finally,
a.) The code destroys the input data, converting
it to another form. Make sure you have a copy saved elsewhere.
b.) If you do have variables other than SES,
you'll have to ask about them separately.
c.) File 'By_Year' must be an external file (or a
file handle for one), not a dataset.

Input data:
[TestData]
Case_ID StartYr1 EndYr1 SES1 StartYr2 EndYr2 SES2 StartYr3 EndYr3 SES3

Case1 1974 1982 42.10 1983 2000 45.80 2001 2005 46.00
Case2 1952 1970 39.90 1971 1980 42.00 . . .

Number of cases read: 2 Number of cases listed: 2

Code:
VARSTOCASES
/MAKE StartYear FROM StartYr1 StartYr2 StartYr3
/MAKE EndYear FROM EndYr1 EndYr2 EndYr3
/MAKE SES FROM SES1 SES2 SES3
/KEEP = Case_ID
/NULL = DROP.

LIST.

NUMERIC Year (F4).
LOOP Year = StartYear TO EndYear.
. XSAVE OUTFILE=By_Year
/KEEP=Case_ID Year SES.
END LOOP.
EXECUTE /* (yes, really needed) */.

GET FILE=By_Year.

LIST.

SORT CASES BY Case_ID Year .
CASESTOVARS
/ID = Case_ID
/INDEX = Year
/GROUPBY = VARIABLE .

================================
Appendix I: Listing of test run
================================
* C:\Documents and Settings\Richard\My Documents .
* \Technical\spssx-l\Z-2012\ .
* 2012-10-09 Noone - Syntax help - duplicating variables.SPS .

* In response to postings .
* From: Jack Noone <[hidden email]> .
* Subject: Syntax help - duplicating variables .
* To: [hidden email] .
* Date: [09:40 PM 10/9/2012] .
* and follow-up .
* Date: Wed, 10 Oct 2012 03:33:23 +0000 .
* From: Jack Noone <[hidden email]> .
* Subject: Re: Syntax help - duplicating variables .
* To: [hidden email] .

* I want to create a wide data set (I've not had much to do with
* long data sets) with 62 variables [corresponding] to each year
* from 1950 to 2012, and the value for each variable will be the
* socioeconomic indicator for a job held during that year.
*
* I have variables that represent: .
* 1) The year each job started .
* 2) The year each job ended .
* 3) the number of years in each job (subtract 1 from 2 to create .
* V2 variable) .
* 4) The socioeconomic value for each job held (V1 variable). .

* Scratch file, as target for XSAVE: ............................ .

FILE HANDLE By_Year
/NAME='C:\Documents and Settings\Richard\My Documents' +
'\Temporary\SPSS\' +
'2012-10-09 Noone - Syntax help - duplicating variables' +
'-UNROLLED' +
'.SAV'.

* Test data: ..................................................... .

PRESERVE.
SET MXWARNS 0.
DATA LIST LIST/
Case_ID StartYr1 EndYr1 SES1
StartYr2 EndYr2 SES2
StartYr3 EndYr3 SES3
(A5, F4, F4, F5.2, F4, F4, F5.2, F4, F4, F5.2).
BEGIN DATA
Case1 1974 1982 42.1 1983 2000 45.8 2001 2005 46.0
Case2 1952 1970 39.9 1971 1980 42.0

>Warning # 92
>The limit of MXWARNS warnings in this data pass has been printed. Further
>warnings have been suppressed.

END DATA.
RESTORE.

DATASET NAME TestData WINDOW=FRONT.
LIST.
List
Output Created: 10-OCT-2012 21:49:59

[TestData]
Case_ID StartYr1 EndYr1 SES1 StartYr2 EndYr2 SES2 StartYr3 EndYr3 SES3

Case1 1974 1982 42.10 1983 2000 45.80 2001 2005 46.00
Case2 1952 1970 39.90 1971 1980 42.00 . . .

Number of cases read: 2 Number of cases listed: 2

VARSTOCASES
/MAKE StartYear FROM StartYr1 StartYr2 StartYr3
/MAKE EndYear FROM EndYr1 EndYr2 EndYr3
/MAKE SES FROM SES1 SES2 SES3
/KEEP = Case_ID
/NULL = DROP.

Variables to Cases
[TestData]
Generated Variables
+---------+------+
|Name     |Label |
+---------+------+
|StartYear|<none>|
|EndYear  |<none>|
|SES      |<none>|
+---------+------+
Processing Statistics
+-------------+--+
|Variables In |10|
|Variables Out|4 |
+-------------+--+

LIST.
Output Created: 10-OCT-2012 21:49:59
[TestData]
Case_ID StartYear EndYear SES

Case1 1974 1982 42.10
Case1 1983 2000 45.80
Case1 2001 2005 46.00
Case2 1952 1970 39.90
Case2 1971 1980 42.00

Number of cases read: 5 Number of cases listed: 5

NUMERIC Year (F4).
LOOP Year = StartYear TO EndYear.
. XSAVE OUTFILE=By_Year
/KEEP=Case_ID Year SES.
END LOOP.
EXECUTE /* (yes, really needed) */.

GET FILE=By_Year

LIST.
Output Created: 10-OCT-2012 21:50:00
C:\Documents and Settings\Richard\My Documents\Temporary\SPSS\
2012-10-09 Noone - Syntax help - duplicating variables-UNROLLED.SAV
Case_ID Year SES

Case1 1974 42.10
Case1 1975 42.10
Case1 1976 42.10
Case1 1977 42.10
Case1 1978 42.10
Case1 1979 42.10
Case1 1980 42.10
Case1 1981 42.10
Case1 1982 42.10
Case1 1983 45.80
Case1 1984 45.80
Case1 1985 45.80
Case1 1986 45.80
Case1 1987 45.80
Case1 1988 45.80
Case1 1989 45.80
Case1 1990 45.80
Case1 1991 45.80
Case1 1992 45.80
Case1 1993 45.80
Case1 1994 45.80
Case1 1995 45.80
Case1 1996 45.80
Case1 1997 45.80
Case1 1998 45.80
Case1 1999 45.80
Case1 2000 45.80
Case1 2001 46.00
Case1 2002 46.00
Case1 2003 46.00
Case1 2004 46.00
Case1 2005 46.00
Case2 1952 39.90
Case2 1953 39.90
Case2 1954 39.90
Case2 1955 39.90
Case2 1956 39.90
Case2 1957 39.90
Case2 1958 39.90
Case2 1959 39.90
Case2 1960 39.90
Case2 1961 39.90
Case2 1962 39.90
Case2 1963 39.90
Case2 1964 39.90
Case2 1965 39.90
Case2 1966 39.90
Case2 1967 39.90
Case2 1968 39.90
Case2 1969 39.90
Case2 1970 39.90
Case2 1971 42.00
Case2 1972 42.00
Case2 1973 42.00
Case2 1974 42.00
Case2 1975 42.00
Case2 1976 42.00
Case2 1977 42.00
Case2 1978 42.00
Case2 1979 42.00
Case2 1980 42.00

Number of cases read: 61 Number of cases listed: 61

SORT CASES BY Case_ID Year .
CASESTOVARS
/ID = Case_ID
/INDEX = Year
/GROUPBY = VARIABLE .

Output Created: 10-OCT-2012 21:50:00

C:\Documents and Settings\Richard\My Documents\Temporary\SPSS\
2012-10-09 Noone - Syntax help - duplicating variables-UNROLLED.SAV

Generated Variables
+--------+----+--------+
|Original|Year|Result  |
|Variable|    |Name    |
+--------+----+--------+
|SES     |1952|SES.1952|
|        |1953|SES.1953|
|        |1954|SES.1954|
|        |1955|SES.1955|
|        |1956|SES.1956|
|        |1957|SES.1957|
|        |1958|SES.1958|
|        |1959|SES.1959|
|        |1960|SES.1960|
|        |1961|SES.1961|
|        |1962|SES.1962|
|        |1963|SES.1963|
|        |1964|SES.1964|
|        |1965|SES.1965|
|        |1966|SES.1966|
|        |1967|SES.1967|
|        |1968|SES.1968|
|        |1969|SES.1969|
|        |1970|SES.1970|
|        |1971|SES.1971|
|        |1972|SES.1972|
|        |1973|SES.1973|
|        |1974|SES.1974|
|        |1975|SES.1975|
|        |1976|SES.1976|
|        |1977|SES.1977|
|        |1978|SES.1978|
|        |1979|SES.1979|
|        |1980|SES.1980|
|        |1981|SES.1981|
|        |1982|SES.1982|
|        |1983|SES.1983|
|        |1984|SES.1984|
|        |1985|SES.1985|
|        |1986|SES.1986|
|        |1987|SES.1987|
|        |1988|SES.1988|
|        |1989|SES.1989|
|        |1990|SES.1990|
|        |1991|SES.1991|
|        |1992|SES.1992|
|        |1993|SES.1993|
|        |1994|SES.1994|
|        |1995|SES.1995|
|        |1996|SES.1996|
|        |1997|SES.1997|
|        |1998|SES.1998|
|        |1999|SES.1999|
|        |2000|SES.2000|
|        |2001|SES.2001|
|        |2002|SES.2002|
|        |2003|SES.2003|
|        |2004|SES.2004|
|        |2005|SES.2005|
+--------+----+--------+

Processing Statistics
+------------------+----+
|Cases In          |61  |
|Cases Out         |2   |
+------------------+----+
|Cases In/Cases Out|30.5|
+------------------+----+
|Variables In      |3   |
|Variables Out     |55  |
+------------------+----+
|Index Values      |54  |
+------------------+----+
================================
Appendix II: Test data, and code
================================
* C:\Documents and Settings\Richard\My Documents .
* \Technical\spssx-l\Z-2012\ .
* 2012-10-09 Noone - Syntax help - duplicating variables.SPS .

* In response to postings .
* From: Jack Noone <[hidden email]> .
* Subject: Syntax help - duplicating variables .
* To: [hidden email] .
* Date: [09:40 PM 10/9/2012] .
* and follow-up .
* Date: Wed, 10 Oct 2012 03:33:23 +0000 .
* From: Jack Noone <[hidden email]> .
* Subject: Re: Syntax help - duplicating variables .
* To: [hidden email] .

* I want to create a wide data set (I've not had much to do with
* long data sets) with 62 variables [corresponding] to each year
* from 1950 to 2012, and the value for each variable will be the
* socioeconomic indicator for a job held during that year.
*
* I have variables that represent: .
* 1) The year each job started .
* 2) The year each job ended .
* 3) the number of years in each job (subtract 1 from 2 to create .
* V2 variable) .
* 4) The socioeconomic value for each job held (V1 variable). .

* Scratch file, as target for XSAVE: ............................ .

FILE HANDLE By_Year
/NAME='C:\Documents and Settings\Richard\My Documents' +
'\Temporary\SPSS\' +
'2012-10-09 Noone - Syntax help - duplicating variables' +
'-UNROLLED' +
'.SAV'.

* Test data: ..................................................... .

PRESERVE.
SET MXWARNS 0.
DATA LIST LIST/
Case_ID StartYr1 EndYr1 SES1
StartYr2 EndYr2 SES2
StartYr3 EndYr3 SES3
(A5, F4, F4, F5.2, F4, F4, F5.2, F4, F4, F5.2).
BEGIN DATA
Case1 1974 1982 42.1 1983 2000 45.8 2001 2005 46.0
Case2 1952 1970 39.9 1971 1980 42.0
END DATA.
RESTORE.

DATASET NAME TestData WINDOW=FRONT.
LIST.

VARSTOCASES
/MAKE StartYear FROM StartYr1 StartYr2 StartYr3
/MAKE EndYear FROM EndYr1 EndYr2 EndYr3
/MAKE SES FROM SES1 SES2 SES3
/KEEP = Case_ID
/NULL = DROP.

LIST.

NUMERIC Year (F4).
LOOP Year = StartYear TO EndYear.
. XSAVE OUTFILE=By_Year
/KEEP=Case_ID Year SES.
END LOOP.
EXECUTE /* (yes, really needed) */.

GET FILE=By_Year.

LIST.

SORT CASES BY Case_ID Year .
CASESTOVARS
/ID = Case_ID
/INDEX = Year
/GROUPBY = VARIABLE .

Richard Ristow

Re: Syntax help - duplicating variables

At 01:36 AM 10/13/2012, Jack Noone wrote, off-list:
(I'm replying on-list, as is my usual practice, because
+ Posting gives others the opportunity to respond. Someone else might
think of something I don't.
+ The list isn't just a question-and-answer resource; it's a forum
for all of us to learn more about SPSS. Posted questions and
responses can inform everyone on the list; off-list questions and
responses inform only one person.)

>I've located the original, long format, SPSS file! Here is an
>example/snippet of what one participant's data looks like in long format:
>
>P_ID Job_number year_start year_end Job_SES years_in_job
>
>0001 1 1964 1966 64.6 2
>0001 2 1966 1970 70.2 4
>
>I then ran the following syntax to create one line for every year in
>paid work and this works well:
>
>loop copy = 1 to years_in_job.
>. xsave
> outfile = "/Users/jacknoone/desktop/expanded file.sav" /
> keep = copy all .
>end loop.
>execute.
>
>However, I appreciate the issue you identified with start/end years
>versus "years_in_job". In particular, some people have finished and
>started another job in the same year (as above). In this instance I
>would like to use the job that has the highest SES value. Other
>people are doing two part-time jobs at once and again I would like to use
>the job with the highest SES rating.
>
>1. In the loop copy syntax above, what would be a better alternative
>to "years_in_job"?

As I wrote in my last post, it's far better to write out the calendar
year than the year-number within job. (Among other things, that's the
only way you can recognize when two jobs were held in the same year):

>loop year = year_start TO year_end.
>. xsave
> outfile = "/Users/jacknoone/desktop/expanded file.sav" /
> keep = P_ID year Job_number Job_SES.
>end loop.
>execute.
>
>2. Is there a way to write syntax that would automatically select
>the highest SES job for any one year?

Quite easy, though this loses the job number in the process. AFTER
the above code,

GET FILE="/Users/jacknoone/desktop/expanded file.sav".
AGGREGATE
/OUTFILE=*
/BREAK =P_ID year
/JinYear 'Number of jobs held in calendar year'=NU
/Job_SES 'Highest job SES in calendar year' =MAX(Job_SES).

I haven't tested the code for this posting; apologies for any mistakes.
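
As a quick check of the two steps above, here is a minimal, untested sketch that runs them on the two sample records quoted earlier. The file handle name Expanded and the file name expanded_demo.sav are placeholders invented for this example, and years_in_job is left out because the loop no longer needs it.

```spss
* Hypothetical scratch file; adjust the path for your system.
FILE HANDLE Expanded /NAME='expanded_demo.sav'.

* The two sample records from the quoted message.
DATA LIST LIST
  / P_ID (A4) Job_number (F2) year_start (F4) year_end (F4) Job_SES (F5.1).
BEGIN DATA
0001 1 1964 1966 64.6
0001 2 1966 1970 70.2
END DATA.

* Write one record per job per calendar year.
NUMERIC year (F4).
LOOP year = year_start TO year_end.
.  XSAVE OUTFILE=Expanded
      /KEEP=P_ID year Job_number Job_SES.
END LOOP.
EXECUTE.

* Collapse to one record per person per calendar year.
GET FILE=Expanded.
AGGREGATE
  /OUTFILE=*
  /BREAK=P_ID year
  /JinYear 'Number of jobs held in calendar year' = NU
  /Job_SES 'Highest job SES in calendar year' = MAX(Job_SES).
LIST.
```

If the logic is right, 1966 should come out with JinYear = 2 and Job_SES = 70.2, and every other year from 1964 to 1970 should have JinYear = 1.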

Jack Noone

Re: Syntax help - duplicating variables

Hi Richard,

You may remember the thread below. The syntax you wrote was perfect,
however I need to keep some other variables as well and I can't seem to
figure out how to do it. Here is the piece of syntax in question.

GET FILE="/Users/jacknoone/desktop/expanded file.sav".
AGGREGATE
/OUTFILE=*
/BREAK =P_ID year
/JinYear 'Number of jobs held in calendar year'=NU
/Job_SES 'Highest job SES in calendar year' =MAX(Job_SES).

So, how would I fit
keep = p_id to AUSEI06_3digit.
into the syntax above?

Thanks,

Jack

Dr. Jack Noone
Research Fellow & LHH/ABBA Project Manager
Ageing, Work and Health Research Unit
Faculty of Health Sciences
University of Sydney

Ph: 02 9351 9411

David Marso

Re: Syntax help - duplicating variables

Without reviewing the entire thread:
If p_id to AUSEI06_3digit are CONSTANT within the structure (P_ID * year), simply add them to the list of BREAK variables, i.e.:
AGGREGATE
/OUTFILE=*
/BREAK =P_ID to AUSEI06_3digit year
/JinYear 'Number of jobs held in calendar year'=NU
/Job_SES 'Highest job SES in calendar year' =MAX(Job_SES).

Otherwise your question requires greater specificity.

Jack Noone

Re: Syntax help - duplicating variables

Hi David,

Thanks for your help, but unfortunately the syntax didn't work as I'd
hoped.

I believe the context for the problem is in the thread below. But, according
to point number 2 (see bottom of thread), the original syntax was designed
to "automatically select the highest SES job for any one year" and it did
this perfectly. Some people had more than one job in a calendar year and I
wanted to select the job with the highest socioeconomic rating.

But, if I add the other variables under the break command as suggested,
then the highest SES job for any one year is not selected out. I thought
that they were constant within the structure, but I now suspect that I
didn't understand what you meant. Could you elaborate please?

I also have another, somewhat related, syntax query. Having converted my
long file to wide I end up with a file looking like this:

P_ID ses_yr1 ses_yr2 ses_yr3 ses_yr4 ses_yr5 ses_yr6 .
1 34 34 . . 48 48
2 48 48 48 75 75 75

This is simply an occupation-based socioeconomic index for each year of my
participants' working lives - exactly what I wanted. However, I need to
fill in the missing data by substituting in the last SES score. For
example, participant 1 was out of the workforce for year 3 and year 4 and
I would like to substitute in their SES score of 34 (from their last job)
for the two points of missing data.

I'm sure there is an easy way to do this, but I have no idea how.

Thanks,

Jack

Dr. Jack Noone
Research Fellow & LHH/ABBA Project Manager
Ageing, Work and Health Research Unit
Faculty of Health Sciences
University of Sydney

Ph: 02 9351 9411

David Marso

Re: Syntax help - duplicating variables

Your initial followup:
"I need to keep some other variables as well and I can't seem to figure out how to do it."
If the values of these additional variables vary within a P_ID and year combination, then you need to specify how they should be represented in the aggregated file. If they don't vary, then everything should be exactly as if they were not used in the AGGREGATE BREAK. It may be time to post what the data look like before and after the AGGREGATE.

Point 2:
Data x1...x10
1 1 . . 3 4 5
-----
* Carry the previous value forward into each missing slot.
VECTOR V=x1 TO x10.
LOOP #=2 TO 10.
IF MISSING(V(#)) V(#)=V(#-1).
END LOOP.
----------------

Jack Noone

Re: Syntax help - duplicating variables

Hi David and all,

Point 1:

Here is what the data looks like prior to AGG BREAK.

P_ID  year  job  yr_start  yr_stop  job_SES  self_employed
   1  1964    1      1964     1965     48.4              1
   1  1965    1      1964     1965     48.4              1
   1  1965    2      1965     1967     48.4              0
   1  1965    2      1965     1967     48.4              0
   1  1965    2      1965     1967     48.4              0
   1  1967    3      1967     1969     48.4              1
   1  1967    3      1967     1969     48.4              1
   1  1968    4      1968     1969     48.4              0
   1  1969    4      1968     1969     48.4              0
   1  1969    5      1969     1974     83.7              1
   1  1969    5      1969     1974     83.7              1

And so on

However, we can see that people are holding more than one job in a
calendar year.
So I applied this syntax (c/o R. Ristow) with the aim to have only the
highest job_SES for any given year:

AGGREGATE
/OUTFILE=*
/BREAK =P_ID year
/JinYear 'Number of jobs held in calendar year'=NU
/Job_SES 'Highest job SES in calendar year' =MAX(Job_SES).

Which resulted in

P_ID  year  jinyear  job_SES
   1  1964        1     48.4
   1  1965        2     48.4
   1  1966        1     48.4
   1  1967        2     48.4
   1  1968        2     48.4
   1  1969        2     83.7
   1  1970        1     83.7
   1  1971        1     83.7

Perfect! In 1969, this participant held one job with a SES rating of 48.4
and one with SES rating of 83.7. However, only the higher rating SES value
is chosen for 1969.

However, I want to know if the person was self-employed for the job that
has been selected. So I tried this:

AGGREGATE
/OUTFILE=*
/BREAK =P_ID year self_employed
/JinYear 'Number of jobs held in calendar year'=NU
/Job_SES 'Highest job SES in calendar year' =MAX(Job_SES).

But it didn't work. There were no error messages but the sorting by
Job_SES didn't run. HELP!

Point 2:

I converted the long format file to wide format so I could take a look at
the missing data.

I then applied this syntax after sorting the variables

VECTOR V=Job_ses_1 TO Job_SES_55.
LOOP #=2 TO 10.
IF MISSING(V(#)) V(#)=V(#-1).
END LOOP.

There were no errors but the missing data were not filled.

Sadly I don't have the knowledge to solve this one myself. HELP AGAIN!

Jack

Dr. Jack Noone
Research Fellow & LHH/ABBA Project Manager
Ageing, Work and Health Research Unit
Faculty of Health Sciences
University of Sydney

Ph: 02 9351 9411

David Marso

Re: Syntax help - duplicating variables

Please RTFM re AGGREGATE rather than just blindly running donated code.
i.e., MODE=ADDVARIABLES will be useful.
--
If Self Employed is a constant for P_ID and Year then you will get precisely the result required.
"Their were no error messages but the sorting by Job_SES didn't run. HELP!"

"didn't run" is not informative! What DID happen????
Maybe post what did occur???
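
To make the point about BREAK variables concrete, here is a small untested sketch using the two 1969 jobs from the posted data (SES 48.4 with self_employed = 0, and SES 83.7 with self_employed = 1). Only the four variables needed are read in; the rest of the layout is assumed.

```spss
* Two jobs held by the same person in 1969, values taken from the posted data.
DATA LIST LIST / P_ID (F4) year (F4) Job_SES (F5.1) self_employed (F1).
BEGIN DATA
1 1969 48.4 0
1 1969 83.7 1
END DATA.

* self_employed differs between the two jobs, so it splits the break group.
AGGREGATE
  /OUTFILE=*
  /BREAK=P_ID year self_employed
  /JinYear 'Number of jobs held in calendar year' = NU
  /Job_SES 'Highest job SES in calendar year' = MAX(Job_SES).
LIST.
```

Because self_employed is not constant within P_ID and year, each distinct value forms its own break group. The output should therefore contain two records for 1969, each with JinYear = 1, rather than a single record carrying the maximum SES.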

Run the original AGGREGATE (per Richard Ristow) using MODE=ADDVARIABLES, then do a SELECT IF for the MAX(SES).
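
A minimal, untested sketch of that suggestion follows. MaxSES is a variable name made up for this example; a new name is used because in MODE=ADDVARIABLES the aggregate is appended to the active file, where Job_SES already exists. It assumes the active file is the expanded one-record-per-job-per-year file and still contains self_employed.

```spss
* Append the yearly maximum SES to every record (MaxSES is a hypothetical name).
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=P_ID year
  /JinYear 'Number of jobs held in calendar year' = NU
  /MaxSES 'Highest job SES in calendar year' = MAX(Job_SES).

* Keep only the record(s) whose SES equals the yearly maximum.
SELECT IF (Job_SES = MaxSES).
EXECUTE.
```

All original variables, including self_employed and the job number, stay on the selected records. Two caveats: records with a missing Job_SES are dropped by the SELECT IF, and if two jobs in the same year tie for the highest SES (or a record is duplicated), more than one record per year remains and a further rule is needed to pick one.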

Point 2.
See EXECUTE command (so the data pass is performed and the fills are populated).
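
A possible corrected version of the fill is sketched below, untested. It assumes the wide file holds Job_SES_1 through Job_SES_55 as adjacent variables in year order. Besides the missing EXECUTE, the posted syntax defined a 55-element vector but looped only to 10, so later years would not have been touched; whether that explains the result seen is a guess.

```spss
* Carry the last known SES forward across all 55 yearly variables.
VECTOR V = Job_SES_1 TO Job_SES_55.
LOOP #i = 2 TO 55.
.  IF MISSING(V(#i)) V(#i) = V(#i - 1).
END LOOP.
* Transformations stay pending until a data pass is forced.
EXECUTE.
```

Because the loop runs left to right, a run of consecutive missing years is filled from the most recent non-missing value. Missing values before a participant's first job stay missing, since there is nothing earlier to copy.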

---

Jack Noone wrote

Hi David and all,

Point 1:

Here is what the data looks like prior to AGG BREAK.

P_ID year job yr_start yr_stop job_SES self_employed
1 1964 1 1964 1965 48.4 1
1 1965 1 1964 1965 48.4 1
1 1965 2 1965 1967 48.4 0
1 1965 2 1965 1967 48.4 0
1 1965 2 1965 1967 48.4 0
1 1967 3 1967 1969 48.4 1
1 1967 3 1967 1969 48.4 1
1 1968 4 1968 1969 48.4 0
1 1969 4 1968 1969 48.4 0
1 1969 5 1969 1974 83.7 1
1 1969 5 1969 1974 83.7 1

And so on

However, we can see that people are holding more than one job in a
calendar year.
So I applied this syntax (℅ R.Ristow) with the aim to have only the
highest job_SES for any given year:

AGGREGATE
/OUTFILE=*
/BREAK =P_ID year
/JinYear 'Number of jobs held in calendar year'=NU
/Job_SES 'Highest job SES in calendar year' =MAX(Job_SES).

Which resulted in

P_ID year jinyear job_SES
1 1964 1 48.4
1 1965 2 48.4
1 1966 1 48.4
1 1967 2 48.4
1 1968 2 48.4
1 1969 2 83.7
1 1970 1 83.7
1 1971 1 83.7

Perfect! In 1969, this participant held one job with a SES rating of 48.4
and one with SES rating of 83.7. However, only the higher rating SES value
is chosen for 1969.

However, I want to know if the person was self-employed for the job that
has been selected. So I tried this:

AGGREGATE
/OUTFILE=*
/BREAK =P_ID year self_employed
/JinYear 'Number of jobs held in calendar year'=NU
/Job_SES 'Highest job SES in calendar year' =MAX(Job_SES).

But it didn't work. Their were no error messages but the sorting by
Job_SES didn't run. HELP!

Point 2:

I converted the long format file to wide format so I could take a look at
the missing data.

I then applied this syntax after sorting the variables

VECTOR V=Job_ses_1 TO Job_SES_55.
LOOP #=2 TO 10.
IF MISSING(V(#)) V(#)=V(#-1),
END LOOP.

There were no errors but the missing data were not filled.

Sadly I don't have the knowledge to solve this one myself. HELP AGAIN!

Jack

Dr. Jack Noone
Research Fellow & LHH/ABBA Project Manager
Ageing, Work and Health Research Unit
Faculty of Health Sciences
University of Sydney

Ph: 02 9351 9411

On 16/11/12 1:23 AM, "David Marso" <[hidden email]> wrote:

>Your initial followup:
>"I need to keep some other variables as well and I can't seem to figure
>out
>how to do it."
>If the values of these additional variables vary over year then you need
>to
>specify how these new variables will be represented in the new data file.
>If they don't vary then everything should be exactly as if they were not
>used in the AGG BREAK. Maybe time for you to post what the before/after
>(pre aggregate/post aggregated) data appear.
>
>Point 2:
>Data x1...x10
>1 1 . . 3 4 5
>-----
>VECTOR V=V1 TO V10.
>LOOP #=2 TO 10.
>IF MISSING(V(#)) V(#)=V(#-1),
>END LOOP.
>----------------
>
>Jack Noone wrote
>> Hi David,
>>
>> Thanks for your help, but unfortunately the syntax didn't work as I'd
>> hoped.
>>
>> I believe the context for the problem is in thread below. But, according
>> to point number 2 (see bottom of thread), the original syntax was
>>designed
>> to "automatically select
>> the highest SES job for any one year" and it did this perfectly. Some
>> people had more than one job in a calendar year and I wanted to select
>>the
>> job with the highest socioeconomic rating.
>>
>> But, if I add the other variables under the break command as suggested,
>> then the highest SES job for any one year is not selected out. I thought
>> that they were constant within the structure, but I now suspect that I
>> didn't understand what you meant. Could you elaborate please?
>>
>> I also have a another, somewhat related, syntax query. Having converted
>>my
>> long file to wide I end up with a file looking like this:
>>
>> P_ID ses_yr1 ses_yr2 ses_yr3 ses_yr4 ses_yr5 ses_yr6 .
>> 1 34 34 . . 48 48
>> 2 48 48 48 75 75 75
>>
>> This is simply a occupation-based socioeconomic index for each year of
>>my
>> participants' working lives - exactly what I wanted. However, I need to
>> fill in the missing data by substituting in the last SES score. For
>> example, participant 1 was out of the workforce for year 3 and year 4
>>and
>> I would like to substitute in their SES score of 34 (from their last
>>job)
>> for the two points of missing data.
>>
>> I'm sure there is an easy way to do this, but I have no idea how.
>>
>> Thanks,
>>
>> Jack
>>
>>
>>
>> Dr. Jack Noone
>> Research Fellow & LHH/ABBA Project Manager
>> Ageing, Work and Health Research Unit
>> Faculty of Health Sciences
>> University of Sydney
>>
>> Ph: 02 9351 9411
>>
>>
>>
>>
>>
>> On 14/11/12 5:39 PM, "David Marso" <
>
>> david.marso@
>
>> > wrote:
>>
>>>Without reviewing the entire thread:
>>>*If* p_id to AUSEI06_3digit are *CONSTANT *within the structure (P_ID *
>>>year) simply add them to the list of BREAKS
>>>ie
>>>AGGREGATE
>>> /OUTFILE=*
>>> /BREAK =P_ID to AUSEI06_3digit year
>>> /JinYear 'Number of jobs held in calendar year'=NU
>>> /Job_SES 'Highest job SES in calendar year' =MAX(Job_SES).
>>>
>>>*Otherwise *your question requires greater specificity.
>>>
>>>
>>>Jack Noone wrote
>>>> Hi Richard,
>>>>
>>>> You may remember the thread below. The syntax you wrote was perfect,
>>>> however I need to keep some other variables as well and I can't seem
>>>>to
>>>> figure out how to do it. Here is the piece of syntax in question.
>>>>
>>>>
>>>> GET FILE="/Users/jacknoone/desktop/expanded file.sav".
>>>> AGGREGATE
>>>> /OUTFILE=*
>>>> /BREAK =P_ID year
>>>> /JinYear 'Number of jobs held in calendar year'=NU
>>>> /Job_SES 'Highest job SES in calendar year' =MAX(Job_SES).
>>>>
>>>> So, how would I fit
>>>> keep = p_id to AUSEI06_3digit.
>>>> into the syntax above
>>>>
>>>> Thanks,
>>>>
>>>> Jack
>>>>
>>>>
>>>> Dr. Jack Noone
>>>> Research Fellow & LHH/ABBA Project Manager
>>>> Ageing, Work and Health Research Unit
>>>> Faculty of Health Sciences
>>>> University of Sydney
>>>>
>>>> Ph: 02 9351 9411
>>>>
>>>>
>>>>
>>>>
>> <SNIP>
>
>
>
>
>
>-----
>Please reply to the list and not to my personal email.
>Those desiring my consulting or training services please feel free to
>email me.
>--
>View this message in context:
>http://spssx-discussion.1045642.n5.nabble.com/Syntax-help-duplicating-vari
>ables-tp5715562p5716214.html
>Sent from the SPSSX Discussion mailing list archive at Nabble.com.
>

Jack Noone

Re: Syntax help - duplicating variables

>Please RTFM re AGGREGATE rather than just blindly running donated code.
>ie MODE=ADDVARIABLES will be useful.
>--
>If Self Employed is a constant for P_ID and Year then you will get precisely
>the result required.
>"There were no error messages but the sorting by Job_SES didn't run. HELP!"
>
>"didn't run" is *not *informative! What DID happen????
>Maybe post what did occur???
>
>Run the original AGG per RW using MODE=ADDVAR then do a SELECT IF for the
>MAX(SES).
>
>Point 2.
>See EXECUTE command (so the data pass is performed and the fills are
>populated).
>
>---
>
>Jack Noone wrote
>> Hi David and all,
>>
>> Point 1:
>>
>> Here is what the data looks like prior to AGG BREAK.
>>
>> P_ID  year  job  yr_start  yr_stop  job_SES  self_employed
>> 1     1964  1    1964      1965     48.4     1
>> 1     1965  1    1964      1965     48.4     1
>> 1     1965  2    1965      1967     48.4     0
>> 1     1965  2    1965      1967     48.4     0
>> 1     1965  2    1965      1967     48.4     0
>> 1     1967  3    1967      1969     48.4     1
>> 1     1967  3    1967      1969     48.4     1
>> 1     1968  4    1968      1969     48.4     0
>> 1     1969  4    1968      1969     48.4     0
>> 1     1969  5    1969      1974     83.7     1
>> 1     1969  5    1969      1974     83.7     1
>>
>> And so on
>>
>>
>> However, we can see that people are holding more than one job in a
>> calendar year.
>> So I applied this syntax (℅ R.Ristow) with the aim to have only the
>> highest job_SES for any given year:
>>
>> AGGREGATE
>> /OUTFILE=*
>> /BREAK =P_ID year
>> /JinYear 'Number of jobs held in calendar year'=NU
>> /Job_SES 'Highest job SES in calendar year' =MAX(Job_SES).
>>
>> Which resulted in
>>
>> P_ID year jinyear job_SES
>> 1 1964 1 48.4
>> 1 1965 2 48.4
>> 1 1966 1 48.4
>> 1 1967 2 48.4
>> 1 1968 2 48.4
>> 1 1969 2 83.7
>> 1 1970 1 83.7
>> 1 1971 1 83.7
>>
>> Perfect! In 1969, this participant held one job with a SES rating of 48.4
>> and one with SES rating of 83.7. However, only the higher rating SES value
>> is chosen for 1969.
>>
>> However, I want to know if the person was self-employed for the job that
>> has been selected. So I tried this:
>>
>> AGGREGATE
>> /OUTFILE=*
>> /BREAK =P_ID year self_employed
>> /JinYear 'Number of jobs held in calendar year'=NU
>> /Job_SES 'Highest job SES in calendar year' =MAX(Job_SES).
>>
>> But it didn't work. There were no error messages but the sorting by
>> Job_SES didn't run. HELP!
>>
>>
>> Point 2:
>>
>> I converted the long format file to wide format so I could take a look at
>> the missing data.
>>
>> I then applied this syntax after sorting the variables
>>
>> VECTOR V=Job_ses_1 TO Job_SES_55.
>> LOOP #=2 TO 10.
>> IF MISSING(V(#)) V(#)=V(#-1),
>> END LOOP.
>>
>> There were no errors but the missing data were not filled.
>>
>>
>>
>>
>> Sadly I don't have the knowledge to solve this one myself. HELP AGAIN!
>>
>> Jack
>>
>>
>>
>> Dr. Jack Noone
>> Research Fellow & LHH/ABBA Project Manager
>> Ageing, Work and Health Research Unit
>> Faculty of Health Sciences
>> University of Sydney
>>
>> Ph: 02 9351 9411
>>
>>
>>
>>
>>
>> On 16/11/12 1:23 AM, "David Marso" <david.marso@> wrote:
>>
>>>Your initial followup:
>>>"I need to keep some other variables as well and I can't seem to figure out
>>>how to do it."
>>>If the values of these additional variables vary over year then you need to
>>>specify how these new variables will be represented in the new data file.
>>>If they don't vary then everything should be exactly as if they were not
>>>used in the AGG BREAK. Maybe time for you to post what the before/after
>>>(pre aggregate/post aggregated) data look like.
>>>
>>>Point 2:
>>>Data x1...x10
>>>1 1 . . 3 4 5
>>>-----
>>>VECTOR V=V1 TO V10.
>>>LOOP #=2 TO 10.
>>>IF MISSING(V(#)) V(#)=V(#-1),
>>>END LOOP.
>>>----------------
>>>
>>>Jack Noone wrote
>>>> Hi David,
>>>>
>>>> Thanks for your help, but unfortunately the syntax didn't work as I'd
>>>> hoped.
>>>>
>>>> I believe the context for the problem is in the thread below. But, according
>>>> to point number 2 (see bottom of thread), the original syntax was designed
>>>> to "automatically select the highest SES job for any one year" and it did
>>>> this perfectly. Some people had more than one job in a calendar year and I
>>>> wanted to select the job with the highest socioeconomic rating.
>>>>
>>>> But, if I add the other variables under the break command as suggested,
>>>> then the highest SES job for any one year is not selected out. I thought
>>>> that they were constant within the structure, but I now suspect that I
>>>> didn't understand what you meant. Could you elaborate please?
>>>>
>>>> I also have another, somewhat related, syntax query. Having converted my
>>>> long file to wide I end up with a file looking like this:
>>>>
>>>> P_ID ses_yr1 ses_yr2 ses_yr3 ses_yr4 ses_yr5 ses_yr6 .
>>>> 1 34 34 . . 48 48
>>>> 2 48 48 48 75 75 75
>>>>
>>>> This is simply an occupation-based socioeconomic index for each year of my
>>>> participants' working lives - exactly what I wanted. However, I need to
>>>> fill in the missing data by substituting in the last SES score. For
>>>> example, participant 1 was out of the workforce for year 3 and year 4 and
>>>> I would like to substitute in their SES score of 34 (from their last job)
>>>> for the two points of missing data.
>>>>
>>>> I'm sure there is an easy way to do this, but I have no idea how.
>>>>
>>>> Thanks,
>>>>
>>>> Jack
>-----
>Please reply to the list and not to my personal email.
>Those desiring my consulting or training services please feel free to
>email me.
>--
>View this message in context:
>http://spssx-discussion.1045642.n5.nabble.com/Syntax-help-duplicating-vari
>ables-tp5715562p5716240.html
>Sent from the SPSSX Discussion mailing list archive at Nabble.com.
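
For readers following along, the two-step approach David describes (AGGREGATE with MODE=ADDVARIABLES, then SELECT IF on the maximum) might be sketched as below. This is an untested sketch, not code posted in the thread: the variable name Max_SES is a hypothetical name chosen here, while P_ID, year and Job_SES are taken from Jack's example.

```spss
* Step 1: append the maximum Job_SES within each P_ID and year to every case.
* Max_SES is a hypothetical new variable name and must not already exist.
AGGREGATE
 /OUTFILE=* MODE=ADDVARIABLES
 /BREAK=P_ID year
 /JinYear 'Number of jobs held in calendar year'=NU
 /Max_SES 'Highest job SES in calendar year'=MAX(Job_SES).

* Step 2: keep only the cases whose job carries that maximum.
SELECT IF (Job_SES = Max_SES).
EXECUTE.
```

Because the surviving cases keep all of their original variables, self_employed for the selected job stays in the file. If two jobs in the same year tie on Job_SES, or the same job appears on more than one row, more than one case per P_ID and year will remain and would still need to be collapsed.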

thara vardhan-2

Re: Syntax help - duplicating variables -

Hi Jack

I have been following this discussion quite keenly.

It is a pity that you are reacting so strongly to David's response.

Just wanted to let you know that David is one of the most helpful and knowledgeable persons on the list.

In fact, he is a guru of SPSS syntax in the real sense.

His comments, suggestions and help with syntax go beyond the initial problem posted by members, and thereby help the person think more carefully and come to the best possible solution and conclusion for the issue they are working on.

Perhaps you are under a lot of stress right now.

Hopefully this will change your mind about not wanting to interact with him anymore on this forum. Oh yes I am writing from down under!

cheers
thara vardhan

From: Jack Noone <[hidden email]>
To: [hidden email]
Date: 16/11/2012 11:35
Subject: Re: Syntax help - duplicating variables
Sent by: "SPSSX(r) Discussion" <[hidden email]>

Dear David,

I find your language (e.g. RTFM) completely inappropriate for this forum.

I am doing my best to solve a complex problem within a tight time-frame
and with only limited knowledge of SPSS syntax.

I would appreciate it if you did not respond to any more of my posts
including this one.

Jack.

Dr. Jack Noone
Research Fellow & LHH/ABBA Project Manager
Ageing, Work and Health Research Unit
Faculty of Health Sciences
University of Sydney

Ph: 02 9351 9411

David Marso

Re: Syntax help - duplicating variables

Administrator

In reply to this post by Jack Noone

RTFM = Read the FINE manual. If for some reason you believed otherwise, you have not been in this forum very long. When people such as Richard and I reach out to assist, perhaps you should take some responsibility for your own outcomes and do a bit of reading!
I will refrain from any further assistance on any of your future issues, since you really think looking a gift horse in the mouth is the modus operandi!
"complex problem"?? Hardly!
--

Jack Noone wrote

Dear David,

I find your language (e.g. RTFM) completely inappropriate for this forum.

I am doing my best to solve a complex problem within a tight time-frame
and with only limited knowledge of SPSS syntax.

I would appreciate it if you did not respond to any more of my posts
including this one.

Jack.

Dr. Jack Noone
Research Fellow & LHH/ABBA Project Manager
Ageing, Work and Health Research Unit
Faculty of Health Sciences
University of Sydney

Ph: 02 9351 9411

On 16/11/12 1:09 PM, "David Marso" <[hidden email]> wrote:

>Please RTFM re AGGREGATE rather than just blindly running donated code.
>ie MODE=ADDVARIABLES will be useful.
>--
>If Self Employed is a constant for P_ID and Year then you will get
>precisely
>the result required.
>"Their were no error messages but the sorting by Job_SES didn't run.
>HELP!"
>
>"didn't run" is *not *informative! What DID happen????
>Maybe post what did occur???
>
>Run the original AGG per RW using MODE=ADDVAR then do a SELECT IF for the
>MAX(SES).
>
>Point 2.
>See EXECUTE command (so the data pass is performed and the fills are
>populated).

Rich Ulrich

Re: Syntax help - duplicating variables

In reply to this post by Jack Noone

That IF MISSING line needs to end with a period, not a comma.

It is curious that no syntax error was generated.

--
Rich Ulrich

> Date: Thu, 15 Nov 2012 22:58:45 +0000
> From: [hidden email]
> Subject: Re: Syntax help - duplicating variables
> To: [hidden email]
...

>
> Point 2:
>
> I converted the long format file to wide format so I could take a look at
> the missing data.
>
> I then applied this syntax after sorting the variables
>
> VECTOR V=Job_ses_1 TO Job_SES_55.
> LOOP #=2 TO 10.
> IF MISSING(V(#)) V(#)=V(#-1),
> END LOOP.
>
> There were no errors but the missing data were not filled.
>
> ...

Jack Noone

Re: Syntax help - duplicating variables

It is curious, isn't it?

I'd actually corrected the comma and added the EXECUTE beforehand – I'd just copied in David's syntax without correcting the error.

So this is what I am running:

VECTOR V=AUSEI06_2.1 TO AUSEI06_2.55.
LOOP #=2 TO 10.
IF MISSING(V(#)) V(#)=V(#-1).
END LOOP.
EXECUTE.

The AUSEI06_2.1 variable is first in the database followed by all the others up to AUSEI06_2.55.

Any ideas? According to the output window, everything went fine. I might send the file and data to a friend to see if it works for them.

Thanks,

Jack

Dr. Jack Noone
Research Fellow & LHH/ABBA Project Manager
Ageing, Work and Health Research Unit
Faculty of Health Sciences
University of Sydney

Ph: 02 9351 9411
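
One thing that may be worth checking, offered as a guess rather than a diagnosis: the loop above stops at 10 while the vector has 55 elements, so elements 11 to 55 are never examined. A minimal, untested sketch that walks the whole vector, assuming the 55 variables are contiguous and in year order in the file (the scratch index name #i is just an illustrative choice):

```spss
* Carry the last valid value forward across all 55 elements of the vector.
VECTOR V=AUSEI06_2.1 TO AUSEI06_2.55.
LOOP #i=2 TO 55.
IF MISSING(V(#i)) V(#i)=V(#i-1).
END LOOP.
EXECUTE.
```

Because each element is filled before the next one is tested, a run of consecutive missing values all receive the last valid value to their left; a missing first element stays missing.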

From: Rich Ulrich <[hidden email]>
Date: Fri, 16 Nov 2012 01:35:10 -0500
To: Jack Noone <[hidden email]>, SPSS list <[hidden email]>
Subject: RE: Syntax help - duplicating variables

That IF MISSING line needs to end with a period, not a comma.

It is curious that no syntax error was generated.

--
Rich Ulrich
