retrieving part of numeric variable to recode it


retrieving part of numeric variable to recode it

msherman
Dear Listers: I have a variable that is built from a number of other variables. For instance:
var1=10 or 0         car
var2=100 or 0        bus
var3=1000 or 0       walk
var4=10000 or 0      subway
var5=100000 or 0     train
var6=1000000 or 0    ferry
var7=10000000 or 0   bike
var8=100000000 or 0  other
Since individuals could respond to more than one variable, I have added the above
variables together to get one combined variable. Thus, I have all possible
combinations. Now I want to recode so that a new variable has subway, and any combination that includes subway, coded as 1 and all others (every combination without subway) coded as 0. Any suggestions? Thanks,



Martin F. Sherman, Ph.D.
Professor of Psychology
Director of Masters Education, Thesis Track
4501 North Charles Street
Psychology Department
222 B Beatty Hall
Baltimore, MD 21210

Phone: 410 617-2417
Fax: 410 617-5341
email: [hidden email]

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: retrieving part of numeric variable to recode it

ViAnn Beadle
Rather than doing all this, why not just tabulate your multiple responses
with something like CTABLES or MULT RESPONSE? You can get back to your
original coding by recoding var1 to var8 with (0=0) (else=1).


Break a LAG

Albert-Jan Roskam
Hi,

I am trying to solve a little LAG mystery out here. I have two case IDs (a & b), which are string vars. I want to create a new id var (Y) that starts at zero and increases only when a new a/b combination appears. Or, if that's not clear: it should look like the variable 'desired' below.

The code works for numerical vars, but not for string vars. Why is that? In my real data file the first 90-or-so cases have missing data. Should I use the LEAVE command or something?

... with frowning greets! ;-)
Albert-Jan


* data list free / id_a id_b desired. /* works!.
data list free / id_a (a) id_b (a) desired.  /* does not work!.
begin data
1 1 0
2 2 1
3 3 2
4 4 3
4 4 3
5 5 4
6 6 5
6 6 5
6 6 5
end data.

compute y = 0.
exe.
if ( id_a ne lag(id_a) and id_b ne lag(id_b) ) y = lag(y) + 1.
if ( id_a = lag(id_a)  and id_b   = lag(id_b) ) y = lag(y).
exe.



Re: Break a LAG

Art Kendall
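Try something like this untested syntax.



* Substitute your own names for a and b (id_a and id_b in the original post).
* id_ab is declared a2, which fits two one-character IDs. Widen it for longer IDs.
string id_ab(a2).
compute id_ab=concat(a,b).
autorecode variables = id_ab /into num_id.
compute num_id = num_id-1.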



Art Kendall
Social Research Consultants


Re: Break a LAG

Maguin, Eugene
In reply to this post by Albert-Jan Roskam
Albert,

I'm testing this on 16.0.2. And on my version, I think there may be a
computation error. I may be wrong and if so, I'd like to know--very
clearly--and why.

Here is my reasoning.

Y is set to zero for all cases because of the execute. Now, the case 1
enters the first if statement.

if ( id_a ne lag(id_a) and id_b ne lag(id_b) ) y = lag(y) + 1.

Lag(id_a) and lag(id_b) should return sysmis. I think each comparison should
then evaluate to sysmis, but I'm not sure: the documentation covers missing
values for AND and OR, but I found nothing about comparisons. If my
assumption is correct, the whole condition evaluates to missing. Per CSR page
881, the assignment is executed only if the condition is true.

if ( id_a = lag(id_a)  and id_b   = lag(id_b) ) y = lag(y).

Same here.

Next, case 2 enters. Lag(id_a) and lag(id_b) both evaluate to a value. The
first IF's condition can be evaluated, and is true. The second IF's condition
can also be evaluated, and is false. Likewise for the remaining cases. Thus,
I think you should get

Id_a id_b y.
1 1 .
2 2 1
3 3 2
4 4 3
4 4 3
5 5 4
6 6 5
6 6 5
6 6 5

I don't know why it works differently for id_a and Id_b as strings and as
numbers.

What exact version are you running?

The better code is

do if ($casenum gt 1).
+  if (id_a ne lag(id_a) and id_b ne lag(id_b)) y=lag(y)+1.
+  if (id_a = lag(id_a) and id_b = lag(id_b)) y=lag(y).
end if.

Gene Maguin


Re: retrieving part of numeric variable to recode it

Richard Ristow
In reply to this post by msherman
At 09:24 AM 10/20/2008, Martin Sherman wrote:

>Dear Listers: I have a variable that is made up from another of
>other variables. For instance
>var1=10 or 0     car
>var2=100 or 0    bus
>var3-1000 or 0  walk
>var4=10000 or 0 subway
>var5=100000 or 0 train
>var6=1000000 or 0 ferry
>var7=10000000 or 0 bike
>var8=100000000 or 0 other

>Since individuals could respond to more than one variable I have
>added the above variables to together to get one combined variable.

It's neither here nor there, but, why do this? It saves some file
space, but it makes most processing tasks much harder.

>I want to recode so that I will have a new variable that has subway
>and any combination of subway coded as 1 and all others [excluding
>subway and any combination of subway) as 0. Any suggestions. thanks,

If you don't have the original variables, just the combined variable
(I'll call it 'AllModes'), >and< you want to flag users of the
subway, regardless of what other modes they used, then (untested)

NUMERIC   Subway  (F2).
VAR LABEL Subway 'Flag: Uses subway'.

COMPUTE   Subway = TRUNC(AllModes/1E4).
COMPUTE   Subway = MOD(Subway,10).
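
For readers who want to check the arithmetic outside SPSS, here is a minimal Python sketch of the same digit extraction. It assumes the combined variable is a non-negative integer in which each mode contributes either 0 or its own power of ten, as in the original post. The names `MODE_WEIGHTS`, `uses_mode`, and `decode_modes` are illustrative only and do not come from the thread.

```python
# Weights as listed in the original post (var1..var8).
MODE_WEIGHTS = {
    "car": 10,
    "bus": 100,
    "walk": 1_000,
    "subway": 10_000,
    "train": 100_000,
    "ferry": 1_000_000,
    "bike": 10_000_000,
    "other": 100_000_000,
}


def uses_mode(all_modes: int, weight: int) -> int:
    """Return the decimal digit of all_modes at the position given by weight.

    Mirrors MOD(TRUNC(AllModes/weight), 10). The result is 0 or 1 as long as
    every component is either 0 or its own power of ten (an assumption).
    """
    return (all_modes // weight) % 10


def decode_modes(all_modes: int) -> dict:
    """Split the combined code back into one 0/1 flag per mode."""
    return {name: uses_mode(all_modes, w) for name, w in MODE_WEIGHTS.items()}


if __name__ == "__main__":
    print(uses_mode(10_010, MODE_WEIGHTS["subway"]))   # car + subway
    print(uses_mode(100_100, MODE_WEIGHTS["subway"]))  # bus + train
    print(decode_modes(110_110))                       # car + bus + subway + train
```

Dividing by the weight drops the lower digits, and the remainder modulo 10 drops the higher ones, so only the digit for the chosen mode survives.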


Re: Break a LAG

Richard Ristow
In reply to this post by Albert-Jan Roskam
At 10:36 AM 10/20/2008, Albert-jan Roskam wrote:

>I have two case ID's (a & b), which are string vars. I want to
>create a new id var (Y) such that it only increases with a unique
>id  a/b combination, starting with zero. Or, if that's not clear: it
>should look like the variable 'desired' below. The code works for
>numerical vars, but not for string vars. Why is that?

Your code, somewhat reformatted:

compute y = 0.
if  ( id_a ne lag(id_a)
   and id_b ne lag(id_b) ) y = lag(y) + 1.
if  ( id_a =  lag(id_a)
   and id_b =  lag(id_b) ) y = lag(y).

I'm not posting the listings, but the behavior is: if 'id_a' and
'id_b' are numeric, 'y' matches 'desired' (see test data, below); if
they are string variables, 'y' is system-missing in all cases.

To start with, I think you have a logic mistake: neither IF will be
executed unless 'id_a' and 'id_b' either both change, or both don't
change, from the preceding case. In your test data (see the original
posting), 'id_a' and 'id_b' always have the same value, which masks
this problem.

But, also: Because, in your data, every case has both keys the same,
normally one of the two 'IF' statements will test 'true'. 'y',
therefore, will be computed based on 'lag(y)'. In the first case,
'lag(y)' is always missing, so 'y' becomes missing; in the second
case, 'lag(y)' is now missing, so 'y' is missing, ...  Initializing
'y=0' should have no effect.

EXCEPT, for numeric (but not string!) data, both tests fail in the
first case: 'lag(id_a)' and 'lag(id_b)' are missing, neither
assignment is executed, 'y' is properly initialized to 0, and all
proceeds. For string data, 'lag(id_a)' and 'lag(id_b)' are blank
(>not< missing) in the first case, the first 'if' tests 'true', 'y'
is initialized to system-missing, and ...
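
The explanation above can be traced with a small Python sketch. This is a simplified model, not an emulation of SPSS: it assumes that LAG of a numeric key is missing (`None`) in the first case, that LAG of a string key is an empty string there, that a comparison involving a missing value leaves the IF unexecuted, and that `lag(y)` is the previous case's final `y`. The function name `run_original` is hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def run_original(keys: Sequence[Tuple[object, object]],
                 first_lag: object) -> List[Optional[int]]:
    """Model the two original IF statements.

    first_lag is what LAG(id_a) and LAG(id_b) return in the first case:
    None for numeric keys (missing), "" for string keys (blank).
    """
    out: List[Optional[int]] = []
    lag_a = lag_b = first_lag
    lag_y: Optional[int] = None          # LAG(y) is missing in the first case
    for a, b in keys:
        y: Optional[int] = 0             # compute y = 0.
        # A comparison against a missing value is treated as "not true",
        # so neither IF runs when the lagged keys are missing.
        if lag_a is not None and lag_b is not None:
            if a != lag_a and b != lag_b:          # first IF
                y = None if lag_y is None else lag_y + 1
            if a == lag_a and b == lag_b:          # second IF
                y = lag_y
        out.append(y)
        lag_a, lag_b, lag_y = a, b, y
    return out


if __name__ == "__main__":
    ids = [1, 2, 3, 4, 4, 5, 6, 6, 6]    # test data from the original post
    numeric_keys = [(i, i) for i in ids]
    string_keys = [(str(i), str(i)) for i in ids]
    print(run_original(numeric_keys, None))  # numeric keys
    print(run_original(string_keys, ""))     # string keys
```

Under these assumptions the numeric run reproduces 'desired', while the string run is missing in every case, because the first case already assigns `lag(y) + 1` from a missing `lag(y)`.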

I think the following is what you want. Versions for numeric and for
string keys are both tested; I'm posting output only for string keys
(since you say that's what you have), but both versions of the code
are in the APPENDIX. You can't use the same code for numeric and
string keys, because for numeric keys you have to test for
'MISSING(LAG(<key>))', and that's invalid for strings.

. data list free / id_a (a) id_b (a) desired   /* String  keys */.
begin data
<data omitted>
end data.
FORMATS desired (F3).

.  /**/  LIST  /*-*/.
List
|-----------------------------|---------------------------|
|Output Created               |22-OCT-2008 19:18:20       |
|-----------------------------|---------------------------|
id_a id_b desired

.    1        0
.    2        1
1    1        2
1    1        2
1    2        3
2    2        4
3    3        5
3    5        6
3    5        6
4    4        7
4    4        7
5    .        8
5    5        9
6    6       10
6    6       10
6    6       10
7    .       11
7    1       12
7    2       13

Number of cases read:  19    Number of cases listed:  19


NUMERIC y       (F3).

*  First case in the file            .
DO IF   $CASENUM EQ 1.
.  COMPUTE y = 0.

*  Change value          of key 'a'  .
ELSE IF id_a NE LAG(id_a).
.  COMPUTE y = LAG(y) + 1.

*  Change value          of key 'b'  .
ELSE IF id_b NE LAG(id_b).
.  COMPUTE y = LAG(y) + 1.

*  Neither key has changed           .
ELSE.
.  COMPUTE y = LAG(y).
END IF.

LIST.
List
|-----------------------------|---------------------------|
|Output Created               |22-OCT-2008 19:18:20       |
|-----------------------------|---------------------------|
id_a id_b desired   y

.    1        0     0
.    2        1     1
1    1        2     2
1    1        2     2
1    2        3     3
2    2        4     4
3    3        5     5
3    5        6     6
3    5        6     6
4    4        7     7
4    4        7     7
5    .        8     8
5    5        9     9
6    6       10    10
6    6       10    10
6    6       10    10
7    .       11    11
7    1       12    12
7    2       13    13

Number of cases read:  19    Number of cases listed:  19
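
As a cross-check on the DO IF logic above, here is a minimal Python sketch of the same counter: start at 0 in the first case, add 1 whenever either key differs from the previous case, and otherwise carry the previous value forward. The function name `combination_counter` is hypothetical. In Python, `None` compares like any other value with `!=`, so this one function also covers numeric keys with missing values, which in SPSS need the separate MISSING() tests.

```python
from typing import List, Sequence, Tuple


def combination_counter(keys: Sequence[Tuple[object, object]]) -> List[int]:
    """Return a counter that increases whenever (id_a, id_b) changes."""
    out: List[int] = []
    for i, key in enumerate(keys):
        if i == 0:
            y = 0                      # first case in the file
        elif key != keys[i - 1]:
            y = out[-1] + 1            # key 'a' or key 'b' changed
        else:
            y = out[-1]                # neither key has changed
        out.append(y)
    return out


if __name__ == "__main__":
    # String-key test data from the listing above ("." is a literal string).
    id_a = [".", ".", "1", "1", "1", "2", "3", "3", "3", "4",
            "4", "5", "5", "6", "6", "6", "7", "7", "7"]
    id_b = ["1", "2", "1", "1", "2", "2", "3", "5", "5", "4",
            "4", ".", "5", "6", "6", "6", ".", "1", "2"]
    print(combination_counter(list(zip(id_a, id_b))))
```

Like the SPSS version, this counts changes between adjacent cases, so it only numbers distinct combinations correctly when identical combinations sit next to each other in the file.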
===============================
APPENDIX: Test data and code,
for numeric and for string keys
===============================
NEW FILE.
. data list free / id_a     id_b     desired   /* Numeric keys */.
begin data
. 1 0
. 2 1
1 1 2
1 1 2
1 2 3
2 2 4
3 3 5
3 5 6
3 5 6
4 4 7
4 4 7
5 . 8
5 5 9
6 6 10
6 6 10
6 6 10
7 . 11
7 1 12
7 2 13
end data.
FORMATS desired (F3).

.  /**/  LIST  /*-*/.

NUMERIC y       (F3).

*  First case in the file            .
DO IF   $CASENUM EQ 1.
.  COMPUTE y = 0.

*  Change missing status of key 'a'  .
ELSE IF MISSING(id_a) NE MISSING(LAG(id_a)).
.  COMPUTE y = LAG(y) + 1.
*  Change value          of key 'a'  .
ELSE IF NOT MISSING(id_a)
     AND NOT MISSING(LAG(id_a))
     AND id_a NE LAG(id_a).
.  COMPUTE y = LAG(y) + 1.

*  Change missing status of key 'b'  .
ELSE IF MISSING(id_b) NE MISSING(LAG(id_b)).
.  COMPUTE y = LAG(y) + 1.
*  Change value          of key 'b'  .
ELSE IF NOT MISSING(id_b)
     AND NOT MISSING(LAG(id_b))
     AND id_b NE LAG(id_b).
.  COMPUTE y = LAG(y) + 1.

*  Neither key has changed           .
ELSE.
.  COMPUTE y = LAG(y).
END IF.

LIST.



NEW FILE.
. data list free / id_a (a) id_b (a) desired   /* String  keys */.
begin data
. 1 0
. 2 1
1 1 2
1 1 2
1 2 3
2 2 4
3 3 5
3 5 6
3 5 6
4 4 7
4 4 7
5 . 8
5 5 9
6 6 10
6 6 10
6 6 10
7 . 11
7 1 12
7 2 13
end data.
FORMATS desired (F3).

.  /**/  LIST  /*-*/.

NUMERIC y       (F3).

*  First case in the file            .
DO IF   $CASENUM EQ 1.
.  COMPUTE y = 0.

*  Change value          of key 'a'  .
ELSE IF id_a NE LAG(id_a).
.  COMPUTE y = LAG(y) + 1.

*  Change value          of key 'b'  .
ELSE IF id_b NE LAG(id_b).
.  COMPUTE y = LAG(y) + 1.

*  Neither key has changed           .
ELSE.
.  COMPUTE y = LAG(y).
END IF.

LIST.


Re: Break a LAG

Richard Ristow
In reply to this post by Art Kendall
At 11:06 AM 10/20/2008, Art Kendall wrote, posting a really nice
solution, here tested and illustrated with the test data I previously
posted. Thanks, Art!

string id_ab(a2).
compute id_ab=concat(a,b).
autorecode variables = id_ab /into num_id.
compute num_id = num_id-1.

LIST.
List
|-----------------------------|---------------------------|
|Output Created               |22-OCT-2008 23:43:42       |
|-----------------------------|---------------------------|
a b desired id_ab num_id

. 1     0   .1       0
. 2     1   .2       1
1 1     2   11       2
1 1     2   11       2
1 2     3   12       3
2 2     4   22       4
3 3     5   33       5
3 5     6   35       6
3 5     6   35       6
4 4     7   44       7
4 4     7   44       7
5 .     8   5.       8
5 5     9   55       9
6 6    10   66      10
6 6    10   66      10
6 6    10   66      10
7 .    11   7.      11
7 1    12   71      12
7 2    13   72      13

Number of cases read:  19    Number of cases listed:  19
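
The idea behind this approach can be sketched in Python: concatenate the two keys, number the distinct combined values in ascending order, and subtract 1 so the codes start at zero. This is only an illustration under assumptions: the function name `autorecode_zero_based` is hypothetical, Python's code-point sort may order some characters differently from SPSS, and every ID is assumed to have the same width so that concatenation cannot make two different pairs look alike.

```python
from typing import Dict, List, Sequence


def autorecode_zero_based(id_a: Sequence[str], id_b: Sequence[str]) -> List[int]:
    """Number each distinct id_a+id_b value by its rank in sorted order, from 0."""
    combined = [a + b for a, b in zip(id_a, id_b)]
    # enumerate() starts at 0, which plays the role of "num_id - 1".
    codes: Dict[str, int] = {v: i for i, v in enumerate(sorted(set(combined)))}
    return [codes[v] for v in combined]


if __name__ == "__main__":
    a = [".", ".", "1", "1", "1", "2", "3", "3", "3", "4",
         "4", "5", "5", "6", "6", "6", "7", "7", "7"]
    b = ["1", "2", "1", "1", "2", "2", "3", "5", "5", "4",
         "4", ".", "5", "6", "6", "6", ".", "1", "2"]
    print(autorecode_zero_based(a, b))
```

One design difference from the LAG approach is worth keeping in mind: the codes follow the sorted order of the combined values, not the order of cases in the file, so the two methods agree only when the file is already sorted by the two IDs, as the test data is.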


Re: Break a LAG

Art Kendall
You're welcome.

Art
