delete string variables

delete string variables

cristiano1974
Dear Listers,
I'd like to delete all string variables in my dataset that contain only blanks.

I've used the following syntax for numeric data:

* create a sample dataset: .

data list free / v1 to v4.
begin data
0 0 1 0
0 1 0 0
0 0 0 0
0 1 0 0
0 0 0 0
0 1 1 0
0 0 0 0
0 0 0 0
0 0 1 0
0 0 0 0
end data.
recode all (0 = sysmis).
exe.

dataset name source.

* the solution: .
DATASET DECLARE agg.
temp.
compute w = 1.
AGGREGATE
  /OUTFILE='agg'
  /BREAK=w
  /v1 to v4 =NMISS(v1 to v4)
  /N_BREAK=N.
DATASET activate agg.
do repe x = ALL .
- compute x = (100 * x / N_BREAK = 100).
end repe.
exe.

DELETE VARIABLES w N_BREAK.
FLIP.
SELECT IF  var001.
string d (a250).
compute d = concat("DELETE VARIABLES ",  CASE_LBL, ".").
WRITE OUTFILE= 'c:/temp/delevar.sps' /d.
exe.

dataset activate source.
INSERT FILE='c:/temp/delevar.sps' .


But I cannot find a solution for string variables.

Any kind of suggestion would be greatly appreciated.

Thanks in advance.

Cristiano.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: delete string variables

Albert-Jan Roskam
Hi!

The code below works, but it's not very concise. Maybe Jon Peck or somebody else has an ingenious two-line solution ;-)
v1, v4, blankstr and fubar are the vars to be deleted.

Cheers!!
Albert-Jan

* create a sample dataset: .
data list free / v1 to v4.
begin data
0 0 1 0
0 1 0 0
0 0 0 0
0 1 0 0
0 0 0 0
0 1 1 0
0 0 0 0
0 0 0 0
0 0 1 0
0 0 0 0
end data.
recode all (0 = sysmis).
string blankstr fubar (a8).
compute blankstr = " " .
compute fubar = " " .
exe.

* actual code.
begin program.
import spss, spssaux

spss.Submit("""
compute const = 1.
exe.
dataset name source.""")

# check numeric vars
numDict = spssaux.VariableDict(variableType='numeric')
cmd = ["aggr out = * / break = const / "] + ["  " + var.VariableName + " = sum (" + var.VariableName + ") / " \
 for var in numDict if var.VariableName != "const"] + ["."]
spss.Submit(cmd)
spss.Submit("""
dataset name thisone.
dataset activate source.""")

# check string vars
strDict = spssaux.VariableDict(variableType='string')
for var in strDict:
        spss.Submit("compute len_%s = length(rtrim(ltrim( %s )))." % (var.VariableName, var.VariableName))
cmd = ["aggr out = * / break = const / "] + ["   " + var.VariableName + " = sum (len_" + var.VariableName + ") / " \
 for var in strDict if var.VariableName != "const"] + ["."]
spss.Submit(cmd)

# build list of vars to be deleted
spss.Submit("""
dataset name thatone window = asis.
match files / file = thisone / file = thatone / by = const.
recode all (sysmis = 0) (else = copy).
exe.""")
dataCursor=spss.Cursor()
oneRow=dataCursor.fetchone()
varDict = spssaux.VariableDict()
delvars = [str(var) for k, var in enumerate(varDict) if oneRow[k] == 0]
dataCursor.close()
spss.Submit("dataset activate source.")
lastvar = ' '.join(spssaux.GetVariableNamesList()[-1:])
spss.Submit("""
add files / file = * / drop = %s const to %s.
exe.
dataset close all.
""" % (' '.join(delvars), lastvar))
end program.
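
For readers without the plug-in at hand, the core test in the program above can be sketched in plain Python. This is only an illustration under stated assumptions: the rows are an ordinary list of tuples rather than an SPSS cursor, None stands in for system-missing, and the helper name find_empty_vars is hypothetical (it is not the spssaux2 function of the same purpose). It checks cells one by one instead of summing, so a numeric column of genuine zeros is not flagged.

```python
# Sketch only: plain-Python stand-in for the "is this variable empty?" test.
# Assumptions: rows are tuples, None means system-missing, and
# find_empty_vars is a hypothetical helper, not part of spss/spssaux.

def is_empty_cell(value):
    """True for None (missing) or a string containing only blanks."""
    if value is None:
        return True
    if isinstance(value, str):
        return value.strip() == ""
    return False


def find_empty_vars(names, rows):
    """Return the names whose cells are empty in every row."""
    empty = []
    for index, name in enumerate(names):
        if all(is_empty_cell(row[index]) for row in rows):
            empty.append(name)
    return empty


# A few cases shaped like the sample dataset (0 already recoded to missing).
names = ["v1", "v2", "v3", "v4", "blankstr", "fubar"]
rows = [
    (None, None, 1, None, "        ", " "),
    (None, 1, None, None, "        ", " "),
    (None, None, None, None, "        ", " "),
]
empty_vars = find_empty_vars(names, rows)
print(empty_vars)
```

With these rows the flagged names are v1, v4, blankstr and fubar, the same variables named in the post.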


Re: delete string variables

Peck, Jon
Well, one pretty easy way, if you have the Data Validation option, would be to use it to construct a table of empty variables, capture that table with OMS, and use a trivial Python program to generate a DELETE VARIABLES command listing them.

It's not two commands, but it would be fewer than ten.
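
The last step of that approach, generating the command, could look roughly like the sketch below. It assumes the names of the empty variables have already been read from the captured table into a Python list; that reading step is not shown. The function name build_delete_command is made up for illustration, and the spss.Submit call is left as a comment so the snippet also runs outside SPSS.

```python
# Sketch only: turn a list of variable names into one DELETE VARIABLES command.
# build_delete_command is a hypothetical helper; extracting the names from
# the OMS output is assumed to have happened already.

def build_delete_command(var_names):
    """Return SPSS syntax deleting var_names, or None if there is nothing to delete."""
    names = [name.strip() for name in var_names if name.strip()]
    if not names:
        return None
    return "DELETE VARIABLES " + " ".join(names) + "."


command = build_delete_command(["v1", "v4", "blankstr", "fubar"])
print(command)
# Inside SPSS one would then run something like: spss.Submit(command)
```

Returning None for an empty list lets the caller skip the submit step when no variable qualifies.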

Even shorter would be to use the function FindEmptyVars in the spssaux2.py module on Developer Central.

begin program.
import spss, spssaux2
spssaux2.FindEmptyVars(delete=True)
end program.

HTH,
Jon Peck
-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Albert-jan Roskam
Sent: Monday, January 26, 2009 11:40 AM
To: [hidden email]
Subject: Re: [SPSSX-L] delete string variables


Re: delete string variables

Art Kendall
In reply to this post by cristiano1974
This is a new version of the syntax in your post.  If you do not use the
PYTHON approach, you can make empty strings be user missing.

It has a mix of string and numeric variables.
It removes the misuse of sysmis.
It does recode into the same variables because this is a demo.  This is
usually a poor practice.
It simplifies the logical condition in the DO REPEAT.
It changes the WRITE to go to the listing since this is just a demo.

Art Kendall
Social Research Consultants



data list list / v1 (A3) V2(A3) V3(F1) V4 (F1).
begin data
0 0 1 0
0 1 0 0
0 0 0 0
0 1 0 0
0 0 0 0
0 1 1 0
0 0 0 0
0 0 0 0
0 0 1 0
0 0 0 0
end data.
* one of the strengths of SPSS is the distinction between user missing
and system missing.
* a value is user missing because the user says it is.
* a value is system missing because the computer is unable to obey the
instructions the user has given.
*
*recode all (0 = sysmis).
*RECODE IN PLACE USED BECAUSE THIS IS ONLY A DEMO.
RECODE V1 V2 ('0  ' ='').
MISSING VALUES V1 V2 ('') V3 V4 (0).
exe.

dataset name source.

* the solution: .
DATASET DECLARE agg.
temp.
compute w = 1.
AGGREGATE
  /OUTFILE='agg'
  /BREAK=w
  /v1 to v4 =NMISS(v1 to v4)
  /N_BREAK=N.
DATASET activate agg.
do repe x = ALL .
- COMPUTE X = X EQ N_BREAK.
*- compute x = (100 * x / N_BREAK = 100).
end repe.
exe.

DELETE VARIABLES w N_BREAK.
FLIP.
SELECT IF  var001.
string d (a50).
compute d = concat("DELETE VARIABLES ",  CASE_LBL, ".").
WRITE  /d.
exe.
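
The counting logic in the syntax above (NMISS per variable, then X EQ N_BREAK) can be mirrored in plain Python. The sketch below is an illustration, not SPSS code: the dictionary user_missing plays the role of the MISSING VALUES declaration, the rows are the demo cases as they look after the RECODE, and the function name all_missing_vars is hypothetical.

```python
# Sketch only: a variable is a deletion candidate when its count of
# user-missing values equals the number of cases (NMISS == N_BREAK).
# all_missing_vars and the data layout are assumptions for illustration.

def all_missing_vars(names, rows, user_missing):
    """Return names whose every value is declared missing for that variable."""
    n_cases = len(rows)
    flagged = []
    for index, name in enumerate(names):
        missing_codes = user_missing.get(name, set())
        n_miss = sum(1 for row in rows if row[index] in missing_codes)
        if n_cases > 0 and n_miss == n_cases:
            flagged.append(name)
    return flagged


names = ["v1", "V2", "V3", "V4"]
# Demo cases after the RECODE: the string "0" became "" in v1 and V2.
rows = [
    ("", "", 1, 0), ("", "1", 0, 0), ("", "", 0, 0), ("", "1", 0, 0),
    ("", "", 0, 0), ("", "1", 1, 0), ("", "", 0, 0), ("", "", 0, 0),
    ("", "", 1, 0), ("", "", 0, 0),
]
user_missing = {"v1": {""}, "V2": {""}, "V3": {0}, "V4": {0}}
flagged = all_missing_vars(names, rows, user_missing)
print(flagged)
```

Here v1 and V4 are flagged, because every one of their ten values is declared missing, while V2 and V3 each keep three valid values.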



Re: delete string variables

cristiano1974
Thanks to ALL!!
I'll try the Python solution because I'd like to see how Python works in SPSS.
Should I install the Python Integration Plug-in for SPSS 17?

Thanks again ;)

On Tue, Jan 27, 2009 at 9:56 PM, Art Kendall <[hidden email]> wrote:
