recode variable

Tian Qiu

Hi all,

I have one variable with values from 1 to 4. I want to randomly select 60% of the respondents with value '4' and recode them to '8', recode the other 40% to '9', and end up with a new variable with values 1, 2, 3, 8 and 9.

Any help would be appreciated!


Re: recode variable

Bruce Weaver
Administrator
Do you mean exactly 60% and 40%, or only approximately?  If approximately, something like this should work.

* Generate some sample data.
new file.
- input program.
-  loop #i = 1 to 1000.
-  compute v1 = trunc(rv.uniform(1,5)).
-  end case.
- end loop.
end file.
end input program.
execute.

compute v2 = v1. /* make copy of original variable.
if (v2 EQ 4) v2 = 8 + (uniform(1) GT .60) .
frequencies v1 v2.
crosstabs v1 by v2 / cells = row.
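
Because each case with v1 = 4 gets its own independent random draw, the split among the 4s only averages 60/40. With roughly 250 fours, the observed share of 9s can be expected to wander by about three percentage points around 40%. A minimal sketch of the same idea, written with RV.BERNOULLI and a fixed seed so that a run can be repeated, might look like this. The seed value and the name v2b are assumptions for illustration, and SET SEED is assumed to apply to the default random number generator (with SET RNG=MT, MTINDEX would be set instead).

```spss
* Sketch only: the seed value and the name v2b are illustrative assumptions.
set seed = 20240517.
compute v2b = v1.
* rv.bernoulli(.40) returns 1 with probability .40, so about 40 percent of the 4s become 9.
if (v1 EQ 4) v2b = 8 + rv.bernoulli(.40).
frequencies v2b.
```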

HTH.

--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Re: recode variable

David Marso
Administrator
In reply to this post by Tian Qiu
* simulation of data * .
INPUT PROGRAM.
LOOP ID=1 TO 1000.
COMPUTE myvar=TRUNC(UNIFORM(4))+1.
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.
FREQ myvar.
**** Begin actual code using 'myvar' for your variable name ****.
*N.B. You can use MODE = ADDVARIABLES on AGGREGATE and skip the MATCH FILES **.
** See FM for details! **.
COMPUTE scramble=UNIFORM(1).
SORT CASES BY myvar scramble.
COMPUTE counter=SUM(1,LAG(COUNTER)*(Myvar EQ LAG(myvar))).
AGGREGATE OUTFILE 'tmp.sav' / BREAK= myvar / NCASES=N.
MATCH FILES / FILE * / TABLE 'tmp.sav' / BY myvar.
COMPUTE recoded=myvar.
IF myvar EQ 4 AND counter /ncases LE .6 recoded=8.
RECODE recoded (4=9).
FREQ recoded.
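
The sort-and-count logic above amounts to putting the cases within each value of myvar in random order and taking the first 60% of the 4s. As a hedged alternative, the same ordering can probably be obtained without sorting or a temporary file by letting RANK compute a fractional rank within each group. The names rnd, frac and recoded2 below are illustrative assumptions.

```spss
* Sketch only: rnd, frac and recoded2 are illustrative names.
COMPUTE rnd = RV.UNIFORM(0,1).
* RFRACTION is the rank of rnd within each value of myvar divided by the group size.
RANK VARIABLES = rnd BY myvar
  /RFRACTION INTO frac
  /PRINT = NO.
COMPUTE recoded2 = myvar.
IF (myvar EQ 4 AND frac LE .60) recoded2 = 8.
IF (myvar EQ 4 AND frac GT .60) recoded2 = 9.
FREQUENCIES recoded2.
```

As with the counter approach, the number of 8s is the largest whole number of cases that does not exceed 60% of the 4s.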


Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Re: recode variable

Bruce Weaver
Administrator
Nice method for generating a counter, David.  Combining the two methods that have been posted, and using MODE = ADDVARIABLES...


* Simulate some data.
new file.
- input program.
-  loop ID = 1 to 1000.
-  compute v1 = TRUNC(UNIFORM(4))+1.
-  end case.
- end loop.
end file.
end input program.
execute.

* Recode V1 such that 60% of the 4's become 8 and 40% become 9.

* Approximate version .
compute v2 = v1. /* make copy of original variable.
if (v2 EQ 4) v2 = 8 + (uniform(1) GT .60) .

* Pretty close to exact version (depending on the number of cases with v1=4).
COMPUTE scramble=UNIFORM(1).
SORT CASES BY v1 scramble.
COMPUTE counter=SUM(1,LAG(COUNTER)*(v1 EQ LAG(v1))).
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=v1
  /NCASES=N.
COMPUTE v3=v1.
IF v3 EQ 4 AND (counter / ncases LE .6) v3=8.
RECODE v3 (4=9).

variable labels
 v1 "Original variable"
 v2 "Approximate version"
 v3 "Pretty-close-to-exact version"
.
crosstabs v1 by v2 v3 / cells = row.
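
The "pretty close to exact" qualifier arises because 60% of the number of 4s is usually not a whole number, and the test counter / ncases LE .6 effectively rounds that target down. If rounding to the nearest case is preferred, one possible variant compares the counter with a rounded target instead. The name v4 is an illustrative assumption. The variables counter and ncases are the ones created by the syntax above.

```spss
* Sketch only: v4 is an illustrative name, and counter and ncases come from the syntax above.
COMPUTE v4 = v1.
IF (v1 EQ 4 AND counter LE RND(.60 * ncases)) v4 = 8.
IF (v1 EQ 4 AND counter GT RND(.60 * ncases)) v4 = 9.
FREQUENCIES v4.
```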




Re: recode variable

David Marso
Administrator
Thought that was pretty slick ;-)
Note it can be extended to multiple nested variables quite easily (simply multiply):
SORT CASES BY myvar myvar2 .
COMPUTE counter=SUM(1,LAG(COUNTER)*(Myvar EQ LAG(myvar))*(myvar2=LAG(myvar2) )).
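
Put together with the earlier steps, the nested version might be sketched as follows. It assumes the random key is added as the last sort variable so that the order within each myvar by myvar2 cell is random, and it uses the illustrative name ncases2 to avoid clashing with an existing ncases variable.

```spss
* Sketch only: ncases2 is an illustrative name.
COMPUTE scramble = UNIFORM(1).
SORT CASES BY myvar myvar2 scramble.
* The counter restarts at 1 whenever either grouping variable changes.
COMPUTE counter = SUM(1, LAG(counter) * (myvar EQ LAG(myvar)) * (myvar2 EQ LAG(myvar2))).
AGGREGATE
  /OUTFILE = * MODE = ADDVARIABLES
  /BREAK = myvar myvar2
  /ncases2 = N.
COMPUTE recoded = myvar.
IF (myvar EQ 4 AND counter / ncases2 LE .60) recoded = 8.
RECODE recoded (4 = 9).
CROSSTABS myvar2 BY recoded / CELLS = ROW.
```

With this arrangement the 60/40 split is applied separately within each level of myvar2 among the cases where myvar is 4.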


Re: recode variable

Bruce Weaver
Administrator
Yep...as you'll have seen by the time you read this, I was already working on extending it.  ;-)

