Syntax for creating a composite Variable


Syntax for creating a composite Variable

Johnny Amora
I ran the following syntax to create a composite demographic variable, SEXAGE. The variable was created, but it shows up empty. What is wrong with or missing from the syntax?
 
IF (SEX=1 AND AGE=1) SEXAGE= 1.
IF (SEX=1 AND AGE=2) SEXAGE= 2.
IF (SEX=1 AND AGE=3) SEXAGE= 3.
IF (SEX=1 AND AGE=4) SEXAGE= 4.
IF (SEX=1 AND AGE=5) SEXAGE= 5.
IF (SEX=1 AND AGE=6) SEXAGE= 6.
IF (SEX=1 AND AGE=7) SEXAGE= 7.
IF (SEX=1 AND AGE=8) SEXAGE= 8.
IF (SEX=1 AND AGE=9) SEXAGE= 9.
IF (SEX=2 AND AGE=1) SEXAGE=10.
IF (SEX=2 AND AGE=2) SEXAGE=11.
IF (SEX=2 AND AGE=3) SEXAGE=12.
IF (SEX=2 AND AGE=4) SEXAGE=13.
IF (SEX=2 AND AGE=5) SEXAGE=14.
IF (SEX=2 AND AGE=6) SEXAGE=15.
IF (SEX=2 AND AGE=7) SEXAGE=16.
IF (SEX=2 AND AGE=8) SEXAGE=17.
IF (SEX=2 AND AGE=9) SEXAGE=18.
VALUE LABELS
 SEXAGE
  1 'M<12'
  2 'M12-14'
  3 'M15-17'
  4 'M18-20'
  5 'M21-29'
  6 'M30-39'
  7 'M40-49'
  8 'M50-64'
  9 'M65+'
 10 'F<12'
 11 'F12-14'
 12 'F15-17'
 13 'F18-20'
 14 'F21-29'
 15 'F30-39'
 16 'F40-49'
 17 'F50-64'
 18 'F65+'.
 
Thank you.
Johnny



Re: Syntax for creating a composite Variable

Marta Garcia-Granero
Hi Johnny:

First, you could simplify your syntax:

COMPUTE SEXAGE = 9*(SEX=2) + AGE.

This will replace all the IF lines.

Second: With Richard Ristow's permission (big smile),  add EXE at the end.

COMPUTE SEXAGE = 9*(SEX=2) + AGE.
EXE.

Or, instead of EXE, use any procedure that reads your data:

COMPUTE SEXAGE = 9*(SEX=2) + AGE.
VALUE LABELS .............(add the rest of the command)...
LIST SEXAGE /CASES=FROM 1 TO 10.

The values aren't actually computed until EXE or any procedure that
reads the data is run.
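
A minimal sketch of the point about pending transformations, assuming three made-up cases (the inline data and the shortened VALUE LABELS list are only there to keep the example self-contained):

```spss
* Three made-up cases, only to make the example self-contained.
DATA LIST FREE / SEX AGE.
BEGIN DATA
1 1
2 1
2 9
END DATA.

* COMPUTE is only queued here and nothing has been calculated yet.
COMPUTE SEXAGE = 9*(SEX=2) + AGE.
VALUE LABELS SEXAGE 1 'M<12' 10 'F<12' 18 'F65+'.

* LIST reads the data, so the pending COMPUTE is carried out now.
LIST SEX AGE SEXAGE.
```

With these three cases, SEXAGE should come out as 1, 10 and 18.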

HTH,
Marta


--
For miscellaneous SPSS related statistical stuff, visit:
http://gjyp.nl/marta/

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Syntax for creating a composite Variable

Johnny Amora
In reply to this post by Johnny Amora
Hi Marta,
 
I don't understand the right side of your equation:
 
"COMPUTE SEXAGE = 9*(SEX=2) + AGE."
 
Let me illustrate what I want to do. Suppose I have a dataset with two numeric variables (SEX and AGE), as shown. My goal is to create a third numeric variable, SEXAGE:
 
SEX   AGE     SEXAGE
M     <12     M<12
F     12-14   F12-14
F     <12     F<12
M     12-14   M12-14

Where:
the codes for SEX: 1=male and 2=female
the codes for AGE: 1=<12 and 2=12-14
the codes for SEXAGE: 1=M<12, 2=F<12, 3=M12-14, 4=F12-14.
 
My syntax is written exactly as it appears in the SPSS training manual. I wonder why it does not work.
Would you be able to explain your suggestion to me?
 
Thank you.
Johnny
 




Re: Syntax for creating a composite Variable

Marta Garcia-Granero
Johnny Amora wrote:
> I don't understand the right side of your equation
>
> "COMPUTE SEXAGE = 9*(SEX=2) + AGE."
>

(SEX=2) is a logical expression that returns 0 when it is false
(cases where sex=1) and 1 when it is true (cases where sex=2).

For cases with sex=1, the expression reads as: sexage = 9*0 + age =
age (if age=1, sexage= 1; age=2, sexage= 2, and so on...).
For cases with sex=2, the expression reads as: sexage = 9*1 + age =
9+age (if age=1, sexage=10; age=2, sexage=11, and so on...)
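
The arithmetic above can be checked step by step. The following is a sketch with made-up data; FEMALE and OFFSET are hypothetical helper variables that are not part of the original syntax:

```spss
DATA LIST FREE / SEX AGE.
BEGIN DATA
1 1
1 9
2 1
2 9
END DATA.

* FEMALE is a hypothetical helper: 1 when SEX=2, 0 when SEX=1.
COMPUTE FEMALE = (SEX=2).
* OFFSET shifts the female codes past the nine male codes.
COMPUTE OFFSET = 9*FEMALE.
COMPUTE SEXAGE = OFFSET + AGE.
LIST.
```

For these four cases SEXAGE should be 1, 9, 10 and 18, which matches what the original IF lines assign to the same combinations.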

>
>
> The syntax was exactly written in the spss training manual.  I wonder
> why it does not work.
>
Try adding EXE at the end of your syntax (since you don't trust mine,
although I can assure you it works).

Marta


Re: Syntax for creating a composite Variable

DataMaestro
Very elegant, Marta: one line of code!
 
Johnny, if either variable has system-missing values, you might need to precede the syntax with IF NOT (SYSMIS(SEX) OR SYSMIS(AGE)). If the variables use "no answer" codes instead, the IF condition would have to change to accommodate that possibility.
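
As a sketch of such a guard, assuming SEX and AGE already exist and that 99 is a hypothetical "no answer" code for both (MISSING() is true for system-missing and user-missing values alike):

```spss
* Hypothetical: 99 is declared as the no-answer code for both variables.
MISSING VALUES SEX AGE (99).

* SEXAGE is assigned only when both inputs are valid.
IF (NOT MISSING(SEX) AND NOT MISSING(AGE)) SEXAGE = 9*(SEX=2) + AGE.
EXECUTE.
```

Cases that fail the condition are left system-missing on SEXAGE.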
 
George



Re: Syntax for creating a composite Variable

Johnny Amora
In reply to this post by Johnny Amora
It is not that I do not trust you, Marta. I asked a follow-up question because I don't like to pretend that I understand the logic. At first glance I did not understand where you got the "9" :). Now I get the logic. Thank you so much.
 
By the way, the syntax in the SPSS manual does not work even when an EXE. is added to it.
 
A million thanks, Marta.
Johnny
 
 


Re: Syntax for creating a composite Variable

Marta Garcia-Granero
Johnny Amora wrote:
> It is not that I do not trust you, Marta.
>
I was joking, of course. Perhaps I should have added an emoticon to make
that clear...
> I asked a follow-up question because I don't like to pretend that I
> understand the logic. At first glance I did not understand where
> you got the "9"  :).  Now I get the logic.
>
I admit it is complicated stuff. I got some off-line questions
concerning it.
> Thank you so much.
>

Always welcome.

>
> By the way, the syntax in the SPSS manual does not work even when an EXE.
> is added to it.
>
Strange, it should. Does mine work? If it fails too, then you should
take a look at your data. You have already been given a hint concerning
missing values for one or both variables. Check that.

Do you get any error messages?

Just a silly question (no offense intended, please): is the SEX variable
really coded as NUMERIC, with 1=males and 2=females? I had problems in
the past when people handed me SEX variables coded as STRING, but with
values of 1 & 2 ('1' and '2', to be more precise), and value labels
attached to the numeric strings (a mess). That would cause both
versions of the syntax (long and short) to fail. If that is the case, then
this modified version should work:

COMPUTE SEXAGE = 9*(SEX='2') + AGE.
EXE.

Check AGE too. If it is also a STRING variable instead of NUMERIC, the
code might fail as well, unless modified:

COMPUTE SEXAGE = 9*(SEX='2') + NUMBER(AGE,F8).
EXE.
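
One way to check the variable types before choosing a version is sketched below. The one-case inline data, with both variables read as strings, and the names SEXNUM and AGENUM are assumptions made only for illustration:

```spss
* One made-up case, with both variables deliberately read as strings.
DATA LIST FREE / SEX (A1) AGE (A1).
BEGIN DATA
2 3
END DATA.

* The dictionary listing shows each format: A is string, F is numeric.
DISPLAY DICTIONARY /VARIABLES=SEX AGE.

* If both are strings, convert them once and work with numbers afterwards.
COMPUTE SEXNUM = NUMBER(SEX, F8).
COMPUTE AGENUM = NUMBER(AGE, F8).
COMPUTE SEXAGE = 9*(SEXNUM=2) + AGENUM.
LIST.
```

For the single case shown (SEX '2', AGE '3') SEXAGE should be 12.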

HTH,
Marta


Re: Syntax for creating a composite Variable

Johnny Amora
In reply to this post by Johnny Amora
Hi Marta,
Both gender and age are numeric. Your suggested syntax works well. It is short, and I found it easy once I understood the logic. I like it. Thank you once again.

Johnny


recoding all values greater than 1 in a file

bgreen
In reply to this post by Johnny Amora
Hello,

Can someone advise me how I would recode all values in a file that
are greater than 1 to the value of 1? I can do this for individual
variables, but I want to make the change to all variables.

Any assistance is appreciated,

Bob


Re: recoding all values greater than 1 in a file

Marks, Jim
No time to test--

does the ALL keyword work in RECODE?

How about the keyword TO?-- "firstvar TO lastvar"
--jim



Re: recoding all values greater than 1 in a file

Art Kendall
In reply to this post by bgreen
Is something like this what you mean?
recode firstvar to lastvar (1 thru hi=1) (else=copy) into newvar001 to
newvar999.
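
A small sketch of the same idea done in place, without INTO. The variable names v1 to v3 and the data are made up, and the three variables are assumed to be numeric and adjacent in the file so that TO can span them:

```spss
DATA LIST FREE / v1 v2 v3.
BEGIN DATA
0 1 2
3 0 5
1 4 0
END DATA.

* Values of 1 or more become 1 and every other value is left unchanged.
RECODE v1 TO v3 (1 THRU HI = 1).
LIST.
```

After the recode the three cases should read 0 1 1, then 1 0 1, then 1 1 0. Without INTO the original values are overwritten, so Art's version with new target variables is the safer choice when the raw values still matter.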


Art Kendall
Social Research Consultants




Re: Syntax for creating a composite Variable

Richard Ristow
In reply to this post by Marta Garcia-Granero
At 03:29 AM 4/23/2009, Marta García-Granero wrote:

>First, you could simplify your syntax:
>
>COMPUTE SEXAGE = 9*(SEX=2) + AGE.
>
>Will replace all the IF lines.

Exactly. However, I'd use

COMPUTE SEXAGE = 10*(SEX=2) + AGE.

Marta's syntax gives the coding the poster requested, and mine doesn't.
But I'd recommend my coding, so that the first (decimal) digit is sex,
the second one is age, and the 'age' digit has the same meaning
regardless of sex.
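
To illustrate how that coding can be taken apart again, here is a sketch with made-up data. SEXDIGIT and AGEDIGIT are hypothetical names introduced only for this example:

```spss
DATA LIST FREE / SEX AGE.
BEGIN DATA
1 3
2 3
2 9
END DATA.

COMPUTE SEXAGE = 10*(SEX=2) + AGE.

* Recover the two parts from the combined code.
COMPUTE SEXDIGIT = TRUNC(SEXAGE/10).
COMPUTE AGEDIGIT = MOD(SEXAGE, 10).
LIST.
```

The three cases should get SEXAGE 3, 13 and 19, with SEXDIGIT 0, 1, 1 (0 for males, 1 for females) and AGEDIGIT 3, 3, 9.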

>Second: With Richard Ristow's permission (big smile),  add EXE at the end.

Marta, for you anything, even that.

-Smiling,
  Richard


Strange behavior

Mark A Davenport MADAVENP
In reply to this post by bgreen

Working with v 16.0.1--networked.

I have a dataset with almost 4200 cases and about 150 fields.  Almost all are designated as strings although many contain only numbers. With at least one field, statistics such as frequencies and crosstabs miss picking up several valid values.

An example:

I have a string field for employee status that is a value of 'E' or 'S' or 'G'; there are no missing values.

I have a second string field for Occupational Activity Code:  '10', '20', '30', '40', '50', '60', '70'.  Again, no missing values.  Both are nominal measures.

When I look at the data file and sort by status and OACode, I can count down the rows and see that I have 341 'E's with an OACode of '10' and 2 'S's with a code of '10'. When I crosstab, I should see these as observed cells. There are NO filter variables in the dataset. There is NO filter turned on.
                                                                           
When I run crosstabs, I get the following:

Occupational Activity Category * Personnel Act Crosstabulation
                         Personnel Act
                       E       G       S   Total
OACode                 0     880       0     880
        10             3       0       0       3
        20          1242       0       0    1242
        30           346       0     286     632
        40             0       0     201     201
        50             0       0     501     501
        60             0       0     116     116
        70             0       0     277     277
        Total       1929     880    1383    4192

Note the 'E' column. The cells sum to 1591 'E's, but there are 1929 'E's in the dataset. Looking at Frequencies, the same thing occurs:

                                                   
                Occupational Activity Category
                    Freq    Perc   Valid     Cum
Valid                880    21.0    21.0    21.0
        10             3      .1      .1    29.2
        20          1242    29.6    29.6    58.8
        30           632    15.1    15.1    73.9
        40           201     4.8     4.8    78.7
        50           501    12.0    12.0    90.6
        60           116     2.8     2.8    93.4
        70           277     6.6     6.6   100.0
        Total       4192   100.0   100.0

If you sum down the frequency column, you will see that 340 cases are not being counted.  

An added rub: I CAN turn on a filter that excludes cases where OACode is not '10', and it
recognizes all of the '10' values, filtering out all but the 343 valid cases.

                Personnel Act
                    Freq     Per   Valid     Cum
Valid   E            341    99.4    99.4    99.4
        S              2      .6      .6   100.0
        Total        343   100.0   100.0

As you can see above, all 343 non-filtered cases are there, yet they aren't registering when I
run frequencies.
                                             
                Occupational Activity Category
                    Freq     Per   Valid     Cum
Valid   10             3      .9      .9   100.0
        Total        343   100.0   100.0
                                               

Again, I just opened this dataset.  There are no filters on.   I rerun it after Select Cases = all
and I get the same discrepancy.  I get no error messages in the log.

This file is my bread and butter and was CREATED using SPSS but now seems to be
untrustworthy in SPSS.  Has anyone else seen this issue?

***************************************************************************************************************************************************************
Mark A. Davenport Ph.D.
Senior Research Analyst
Office of Institutional Research
The University of North Carolina at Greensboro
336.256.0395
[hidden email]

'An approximate answer to the right question is worth a good deal more than an exact answer to an approximate question.' --a paraphrase of J. W. Tukey (1962)



Re: Strange behavior

Art Kendall
If you send me a .sav file with 3 variables,  an ID you can match back to, the status variable, and the Activity variable, I'll take a look at it.

Art Kendall
Social Research Consultants


Re: Strange behavior

Mark A Davenport MADAVENP
In reply to this post by Mark A Davenport MADAVENP

Thanks to Art and Jon.

Jon wins the cookie. Your solution didn't seem to work, but your diagnosis may have been right on the money. I created a new variable manually and recoded it to equal the old OACODE. I now have my 341 missing cases. I thought I had done this recode with this data before. Perhaps not. I have had a great deal of trouble with this data set (hundreds of valid values suddenly turn into '?', etc.). It has been coded and recoded a hundred times. I guess it is getting tired.
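
One possible reading of that workaround, as a sketch only: the exact commands used are not given in this thread, OACode2 and Status are hypothetical names, and the made-up inline data will not reproduce the original problem.

```spss
DATA LIST FREE / Status (A1) OACode (A2).
BEGIN DATA
E 10
S 10
E 20
END DATA.

* Declare a fresh two-character string and fill it from the old codes.
STRING OACode2 (A2).
RECODE OACode ('10'='10') ('20'='20') ('30'='30') ('40'='40')
  ('50'='50') ('60'='60') ('70'='70') INTO OACode2.

* Compare the old and new variables side by side.
FREQUENCIES VARIABLES=OACode OACode2.
```

If the two frequency tables disagree on a real file, that would be consistent with stray content in the old variable, as suggested in the diagnosis quoted below.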

Thanks again guys!

Mark







"Fry, Jonathan B." <[hidden email]>

04/28/2009 12:28 PM
To: "Mark A Davenport MADAVENP" <[hidden email]>
Subject: RE: Strange behavior

Strange behavior, indeed.  I have a strong hunch about what’s wrong.  Some of your cases may have some junk in the Activity Code variable past the second byte.  To check this, you could just try inserting
 
temporary.
compute  OACode = OACode.
 
in front of your CROSSTABS command.
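
Put together with the procedure, the suggestion might look like the sketch below. The thread never names the status variable, so Status is a placeholder, and the CROSSTABS line is only a guess at what was being run.

```spss
* Status is a hypothetical name for the Personnel Act variable.
TEMPORARY.
COMPUTE OACode = OACode.
CROSSTABS /TABLES=OACode BY Status.
```

Because of TEMPORARY, the COMPUTE applies only to the CROSSTABS that follows and leaves the working file unchanged.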
 
Jonathan Fry
SPSS Inc.
 



From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Mark A Davenport MADAVENP
Sent: Monday, April 27, 2009 12:20 PM
To: [hidden email]
Subject: Strange behavior

 

Working with v 16.0.1--networked.


I have a dataset with almost 4200 cases and about 150 fields.  Almost all are designated as strings although many contain only numbers. With at least one field, statistics such as frequencies and crosstabs miss picking up several valid values.


An example:


I have a string field for employee status that is a value of 'E' or 'S' or 'G'; there are no missing values.


I have a second string field for Occupational Activity Code:  '10', '20', '30', '40', '50', '60', '70'.  Again, no missing values.  Both are nominal measures.


When I look at the data file and sort by status and OACode, I can count down the rows and see that I have 341 'E's with an OACode of '10' and 2 'S's with a code of '10'.  When I crosstab, I should see these as observed cells.  There are NO filter variables in the dataset.  There is NO filter turned on.

                                                                         
 
When I run crosstabs, I get the following:


Occupational Activity Category * Personnel Act Crosstabulation

                  Personnel Act
OACode         E       G       S   Total
               0     880       0     880
10             3       0       0       3
20          1242       0       0    1242
30           346       0     286     632
40             0       0     201     201
50             0       0     501     501
60             0       0     116     116
70             0       0     277     277
Total       1929     880    1383    4192


Note the 'E' column.  The cells sum to 1591 'E's, but there are 1929 'E's in the dataset.  Looking at Frequencies, the same thing occurs:


                                                 
 
Occupational Activity Category

                  Freq    Perc   Valid     Cum
Valid              880    21.0    21.0    21.0
        10           3      .1      .1    29.2
        20        1242    29.6    29.6    58.8
        30         632    15.1    15.1    73.9
        40         201     4.8     4.8    78.7
        50         501    12.0    12.0    90.6
        60         116     2.8     2.8    93.4
        70         277     6.6     6.6   100.0
        Total     4192   100.0   100.0


If you sum down the frequency column, you will see that 340 cases are not being counted.  
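
The arithmetic behind that count, using only the Freq column of the table above:

```latex
\begin{align*}
880 + 3 + 1242 + 632 + 201 + 501 + 116 + 277 &= 3852 \\
4192 - 3852 &= 340
\end{align*}
```

That gap matches the 343 cases counted by hand for code '10' minus the 3 that the table reports.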


An added rub: I CAN turn on the filter and filter out cases where OACode is not '10', and it recognizes all of the '10' values, leaving only the 343 valid cases.
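
In syntax, that filter step could be written roughly as follows. This is a sketch with assumed names: filter_10 and Status are placeholders that do not appear in the thread.

```spss
* filter_10 and Status are hypothetical names, not from the thread.
COMPUTE filter_10 = (OACode = '10').
FILTER BY filter_10.
FREQUENCIES VARIABLES=Status OACode.
FILTER OFF.
```

The comparison in the COMPUTE evaluates to 1 for matching cases and 0 otherwise, so FILTER BY keeps only the cases where OACode equals '10'.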


Personnel Act

                  Freq     Per   Valid     Cum
Valid   E          341    99.4    99.4    99.4
        S            2      .6      .6   100.0
        Total      343   100.0   100.0


As you can see above, all 343 non-filtered cases are there, yet they aren't registering when I run frequencies.

                                             
Occupational Activity Category

                  Freq     Per   Valid     Cum
Valid   10           3      .9      .9   100.0
        Total      343   100.0   100.0

                                               


Again, I just opened this dataset.  There are no filters on.  I reran it after Select Cases = All
and I get the same discrepancy.  I get no error messages in the log.


This file is my bread and butter and was CREATED using SPSS but now seems to be
untrustworthy in SPSS.  Has anyone else seen this issue?

