syntax to share

classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

syntax to share

Dominic Lusinchi

Dear fellow SPSS users,

 

I wanted to share some data management syntax I used to delete variables. I’m sure that for experienced users, it’s old news, and that the real programmers among you might even be able to improve on it. I am using version 21 for this exercise.

The problem I faced was a large data set with many empty variables, i.e. variables that had no values, questions in the survey that were answered by no one. I wanted a data file with only the variables I would work with. Of course, you could use DELETE VARIABLES (or even delete manually); but that means that you need to go through every variable to determine which are “empty” and which are not. That could be very tedious and time consuming, not to mention prone to error. So here is the syntax I used (the succinct version first; the explained version second):

 

FLIP VARIABLES=ALL.

COMPUTE FLTRvar=NVALID(var1 TO varN).

FILTER OFF.

USE ALL.

SELECT IF (FLTRvar > 0).

EXECUTE.

DELETE VARIABLES  FLTRvar.

FLIP VARIABLES=ALL.

DELETE VARIABLES  CASE_LBL.

 

FLIP VARIABLES=ALL.

* Cases become variables and variables become cases; so if you have 100 cases, the new data set will have 100 variables from VAR001 to VAR100.

* Also FLIP creates a new string variable CASE_LBL whose values are the names of the variables in the original data set.

* CAUTION: FLIP SETS ALL STRING VARIABLES in the original data set TO SYSMIS .

COMPUTE FLTRvar=NVALID(var1 TO varN).

* The FLTRvar using NVALID will tell me which of the variables (now cases) in the original data set are empty, which means they are set to “0”.

* NVALID tells us how many variables have valid values, i.e. are not missing; so if there 5 variables and all have values (none missing) NVALID returns a “5”.

* Replace “var1” and “varN” with the names of your variables.

FILTER OFF.

USE ALL.

SELECT IF (FLTRvar > 0).

EXECUTE.

* The SELECT IF deletes all the empty variables (now cases) from the original data set that have “0” on FLTRvar.

DELETE VARIABLES  FLTRvar.

* I get rid of the filter variable since I do not need it anymore.

FLIP VARIABLES=ALL.

* This sets back the original variables (cases in the first FLIP) to their status of variable.

DELETE VARIABLES  CASE_LBL.

* I delete the CASE_LBL created by the second FLIP.

* Then I save the resulting data file.

* ALWAYS SAVE THE ORIGINAL DATA  so you can always come back to it.

 

Best regards – Dominic

 

*********************************************

Dominic Lusinchi

Far West Research Consulting

Applied Statistics – Social Research – Sociology

San Francisco, California

[hidden email]

1-415-664-3032

CV: http://www.farwestresearch.com/staff/dl/dlcv.html

*********************************************

 

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: syntax to share

Jignesh Sutar
Well done! This is very much a common procedure that I personally use frequently also.

Just some tips...

You should look into working with DATASET NAMEs as when you flip a dataset a new (unnamed) dataset is created and preceding commands are intended to run on that dataset so your code should really should name that dataset and activate it. In this way you can run the entire syntax without any issues (else you have to step through it and in between select/activate the appropriate dataset, which I suspect is what you are doing to get this to work).

Secondly, the code "COMPUTE FLTRvar=NVALID(var1 TO varN)." requires input from the user, again which isn't ideal, and often there are easy workaround to avoid this.

You could also wrap and save this into a macro also (or even create a custom extension command), so it can be called upon by a single line than have to repeat all those lines of code. I shall let you figure out how to achieve that.

Here's a revised version of your code, that you may want to play with and try understand it's benefits in practice:


Ohhh and by the way, note, there is a spssaux2.FindEmptyVariables(delete=True) procedure which is dependent on python achieving this very same task.


DATA LIST FREE / A B C D.
BEGIN DATA 
1 2 3 4 5 6 7 8 9 10 11 12
END DATA. 
COMPUTE C=$SYSMIS.
DATASET NAME DSSIM.


FLIP VARIABLES=ALL.
DATASET NAME dsFlip1.
DATASET ACTIVATE dsFlip1.
NUMERIC MyEndVar(F1.0).
COMPUTE FLTRvar=NVALID(var001 TO MyEndVar).
FILTER OFF.
USE ALL.
SELECT IF (FLTRvar > 0).
EXECUTE.
DELETE VARIABLES  FLTRvar.
FLIP VARIABLES=ALL.
DATASET NAME dsFlip2.
DATASET ACTIVATE dsFlip2.
SELECT IF CASE_LBL<>"MyEndVar".
EXECUTE.
DELETE VARIABLES  CASE_LBL.




On 18 November 2015 at 18:40, Dominic Lusinchi <[hidden email]> wrote:

Dear fellow SPSS users,

 

I wanted to share some data management syntax I used to delete variables. I’m sure that for experienced users, it’s old news, and that the real programmers among you might even be able to improve on it. I am using version 21 for this exercise.

The problem I faced was a large data set with many empty variables, i.e. variables that had no values, questions in the survey that were answered by no one. I wanted a data file with only the variables I would work with. Of course, you could use DELETE VARIABLES (or even delete manually); but that means that you need to go through every variable to determine which are “empty” and which are not. That could be very tedious and time consuming, not to mention prone to error. So here is the syntax I used (the succinct version first; the explained version second):

 

FLIP VARIABLES=ALL.

COMPUTE FLTRvar=NVALID(var1 TO varN).

FILTER OFF.

USE ALL.

SELECT IF (FLTRvar > 0).

EXECUTE.

DELETE VARIABLES  FLTRvar.

FLIP VARIABLES=ALL.

DELETE VARIABLES  CASE_LBL.

 

FLIP VARIABLES=ALL.

* Cases become variables and variables become cases; so if you have 100 cases, the new data set will have 100 variables from VAR001 to VAR100.

* Also FLIP creates a new string variable CASE_LBL whose values are the names of the variables in the original data set.

* CAUTION: FLIP SETS ALL STRING VARIABLES in the original data set TO SYSMIS .

COMPUTE FLTRvar=NVALID(var1 TO varN).

* The FLTRvar using NVALID will tell me which of the variables (now cases) in the original data set are empty, which means they are set to “0”.

* NVALID tells us how many variables have valid values, i.e. are not missing; so if there 5 variables and all have values (none missing) NVALID returns a “5”.

* Replace “var1” and “varN” with the names of your variables.

FILTER OFF.

USE ALL.

SELECT IF (FLTRvar > 0).

EXECUTE.

* The SELECT IF deletes all the empty variables (now cases) from the original data set that have “0” on FLTRvar.

DELETE VARIABLES  FLTRvar.

* I get rid of the filter variable since I do not need it anymore.

FLIP VARIABLES=ALL.

* This sets back the original variables (cases in the first FLIP) to their status of variable.

DELETE VARIABLES  CASE_LBL.

* I delete the CASE_LBL created by the second FLIP.

* Then I save the resulting data file.

* ALWAYS SAVE THE ORIGINAL DATA  so you can always come back to it.

 

Best regards – Dominic

 

*********************************************

Dominic Lusinchi

Far West Research Consulting

Applied Statistics – Social Research – Sociology

San Francisco, California

[hidden email]

1-415-664-3032

CV: http://www.farwestresearch.com/staff/dl/dlcv.html

*********************************************

 

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: syntax to share

Jignesh Sutar
Preceding commands? I meant subsequent commands after any FLIP command.

On 18 November 2015 at 19:14, Jignesh Sutar <[hidden email]> wrote:
Well done! This is very much a common procedure that I personally use frequently also.

Just some tips...

You should look into working with DATASET NAMEs as when you flip a dataset a new (unnamed) dataset is created and preceding commands are intended to run on that dataset so your code should really should name that dataset and activate it. In this way you can run the entire syntax without any issues (else you have to step through it and in between select/activate the appropriate dataset, which I suspect is what you are doing to get this to work).

Secondly, the code "COMPUTE FLTRvar=NVALID(var1 TO varN)." requires input from the user, again which isn't ideal, and often there are easy workaround to avoid this.

You could also wrap and save this into a macro also (or even create a custom extension command), so it can be called upon by a single line than have to repeat all those lines of code. I shall let you figure out how to achieve that.

Here's a revised version of your code, that you may want to play with and try understand it's benefits in practice:


Ohhh and by the way, note, there is a spssaux2.FindEmptyVariables(delete=True) procedure which is dependent on python achieving this very same task.


DATA LIST FREE / A B C D.
BEGIN DATA 
1 2 3 4 5 6 7 8 9 10 11 12
END DATA. 
COMPUTE C=$SYSMIS.
DATASET NAME DSSIM.


FLIP VARIABLES=ALL.
DATASET NAME dsFlip1.
DATASET ACTIVATE dsFlip1.
NUMERIC MyEndVar(F1.0).
COMPUTE FLTRvar=NVALID(var001 TO MyEndVar).
FILTER OFF.
USE ALL.
SELECT IF (FLTRvar > 0).
EXECUTE.
DELETE VARIABLES  FLTRvar.
FLIP VARIABLES=ALL.
DATASET NAME dsFlip2.
DATASET ACTIVATE dsFlip2.
SELECT IF CASE_LBL<>"MyEndVar".
EXECUTE.
DELETE VARIABLES  CASE_LBL.




On 18 November 2015 at 18:40, Dominic Lusinchi <[hidden email]> wrote:

Dear fellow SPSS users,

 

I wanted to share some data management syntax I used to delete variables. I’m sure that for experienced users, it’s old news, and that the real programmers among you might even be able to improve on it. I am using version 21 for this exercise.

The problem I faced was a large data set with many empty variables, i.e. variables that had no values, questions in the survey that were answered by no one. I wanted a data file with only the variables I would work with. Of course, you could use DELETE VARIABLES (or even delete manually); but that means that you need to go through every variable to determine which are “empty” and which are not. That could be very tedious and time consuming, not to mention prone to error. So here is the syntax I used (the succinct version first; the explained version second):

 

FLIP VARIABLES=ALL.

COMPUTE FLTRvar=NVALID(var1 TO varN).

FILTER OFF.

USE ALL.

SELECT IF (FLTRvar > 0).

EXECUTE.

DELETE VARIABLES  FLTRvar.

FLIP VARIABLES=ALL.

DELETE VARIABLES  CASE_LBL.

 

FLIP VARIABLES=ALL.

* Cases become variables and variables become cases; so if you have 100 cases, the new data set will have 100 variables from VAR001 to VAR100.

* Also FLIP creates a new string variable CASE_LBL whose values are the names of the variables in the original data set.

* CAUTION: FLIP SETS ALL STRING VARIABLES in the original data set TO SYSMIS .

COMPUTE FLTRvar=NVALID(var1 TO varN).

* The FLTRvar using NVALID will tell me which of the variables (now cases) in the original data set are empty, which means they are set to “0”.

* NVALID tells us how many variables have valid values, i.e. are not missing; so if there 5 variables and all have values (none missing) NVALID returns a “5”.

* Replace “var1” and “varN” with the names of your variables.

FILTER OFF.

USE ALL.

SELECT IF (FLTRvar > 0).

EXECUTE.

* The SELECT IF deletes all the empty variables (now cases) from the original data set that have “0” on FLTRvar.

DELETE VARIABLES  FLTRvar.

* I get rid of the filter variable since I do not need it anymore.

FLIP VARIABLES=ALL.

* This sets back the original variables (cases in the first FLIP) to their status of variable.

DELETE VARIABLES  CASE_LBL.

* I delete the CASE_LBL created by the second FLIP.

* Then I save the resulting data file.

* ALWAYS SAVE THE ORIGINAL DATA  so you can always come back to it.

 

Best regards – Dominic

 

*********************************************

Dominic Lusinchi

Far West Research Consulting

Applied Statistics – Social Research – Sociology

San Francisco, California

[hidden email]

1-415-664-3032

CV: http://www.farwestresearch.com/staff/dl/dlcv.html

*********************************************

 

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD


===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: syntax to share

David Marso
Administrator
In reply to this post by Dominic Lusinchi
The basic problem I see here is that very first line.
What if you have 1 Million Cases?  FLIP will create havoc in attempting to build 1 Million variables.
In other words this scales horribly.
Avoid that problem as follows:  FWIW, This solution does not play nice with STRING variables so you will need to float a different boat for that.

MATRIX.
SAVE TRUNC(UNIFORM(10000,300)*3) /OUTFILE * / VARIABLES x001 TO x300.
END MATRIX.

RECODE x001 TO x010 x020 TO x025 x100 TO x200 (ELSE=999).
MISSING VALUES ALL (999).
DATASET NAME raw.
/* Sample data are all set */.

DATASET DECLARE DESC2.
OMS
  /SELECT TABLES
  /IF COMMANDS=['Descriptives'] SUBTYPES=['Descriptive Statistics']
  /DESTINATION FORMAT=SAV NUMBERED=TableNumber_
   OUTFILE='DESC2' .
DESCRIPTIVES ALL.
OMSEND.
DATASET ACTIVATE DESC2.
MATCH FILES /FILE * /KEEP Var1 N.
SELECT IF N GT 0.
RENAME VARIABLES (Var1 = CASE_LBL).
FLIP.
COMPUTE @LastValid@=1.
ADD FILES /FILE * / FILE raw.
COMPUTE @LastVar@=1.
SELECT IF CASE_LBL NE 'N'.
EXECUTE.
DELETE VARIABLES CASE_LBL @LastValid@ TO @LastVar@ .

Dominic Lusinchi wrote
Dear fellow SPSS users,

 

I wanted to share some data management syntax I used to delete variables.
I'm sure that for experienced users, it's old news, and that the real
programmers among you might even be able to improve on it. I am using
version 21 for this exercise.

The problem I faced was a large data set with many empty variables, i.e.
variables that had no values, questions in the survey that were answered by
no one. I wanted a data file with only the variables I would work with. Of
course, you could use DELETE VARIABLES (or even delete manually); but that
means that you need to go through every variable to determine which are
"empty" and which are not. That could be very tedious and time consuming,
not to mention prone to error. So here is the syntax I used (the succinct
version first; the explained version second):

 

FLIP VARIABLES=ALL.

COMPUTE FLTRvar=NVALID(var1 TO varN).

FILTER OFF.

USE ALL.

SELECT IF (FLTRvar > 0).

EXECUTE.

DELETE VARIABLES  FLTRvar.

FLIP VARIABLES=ALL.

DELETE VARIABLES  CASE_LBL.

 

FLIP VARIABLES=ALL.

* Cases become variables and variables become cases; so if you have 100
cases, the new data set will have 100 variables from VAR001 to VAR100.

* Also FLIP creates a new string variable CASE_LBL whose values are the
names of the variables in the original data set.

* CAUTION: FLIP SETS ALL STRING VARIABLES in the original data set TO SYSMIS
.

COMPUTE FLTRvar=NVALID(var1 TO varN).

* The FLTRvar using NVALID will tell me which of the variables (now cases)
in the original data set are empty, which means they are set to "0".

* NVALID tells us how many variables have valid values, i.e. are not
missing; so if there 5 variables and all have values (none missing) NVALID
returns a "5".

* Replace "var1" and "varN" with the names of your variables.

FILTER OFF.

USE ALL.

SELECT IF (FLTRvar > 0).

EXECUTE.

* The SELECT IF deletes all the empty variables (now cases) from the
original data set that have "0" on FLTRvar.

DELETE VARIABLES  FLTRvar.

* I get rid of the filter variable since I do not need it anymore.

FLIP VARIABLES=ALL.

* This sets back the original variables (cases in the first FLIP) to their
status of variable.

DELETE VARIABLES  CASE_LBL.

* I delete the CASE_LBL created by the second FLIP.

* Then I save the resulting data file.

* ALWAYS SAVE THE ORIGINAL DATA  so you can always come back to it.

 

Best regards - Dominic

 

*********************************************

Dominic Lusinchi

Far West Research Consulting

Applied Statistics - Social Research - Sociology

San Francisco, California

[hidden email]

1-415-664-3032

CV: http://www.farwestresearch.com/staff/dl/dlcv.html

*********************************************

 


=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|

Re: syntax to share

Kirill Orlov
This post was updated on .
Before OMS was introduced I'd designed my way to drop empty (all cases
missing) numeric variables as follows (Even now I still consider this
way to be acceptable. This is a third path (the first is via FLIP, the
second is via OMS). Goes as follows.

Do replace missing values (RMV command) with all the variables: replace
with series mean (If your variables either full or fully empty  you may
do replacing right in the original variables, otherwise you'll have to
produce variables with new names).

If a variable was empty the first its case will remain empty. Otherwise
it won't be empty. So, inspect, programmically, just the first case. Namely,

WRITE out the 'MATCH FILES' or 'DELETE VARIABLES' command (as a syntax
file) along with the list of variable names where case 1 is empty.
Finally, INSERT or INCLUDE what you've written out.

The tricky task to do the above WRITEing can be performed in more than
one way. One way is using MATRIX.
Another way is using it within a macro, macro-looping through variable
names. Such as:
do if $casenum=1.
  !do !var !in (!varlist)
-  do if sysmis(!var).
-   write outfile= 'to_insert.sps'  /' ' !quote(!var).

or within DO REPEAT, something like
do if $casenum=1.
- do repeat var= var1 var2 var3 etc /qname= 'var1' 'var2' 'var3' etc.
(quoted name list)
-  do if sysmis(var).
-   write outfile= 'to_insert.sps'  /' ' qname.



18.11.2015 22:21, David Marso пишет:
> The basic problem I see here is that very first line.
> What if you have 1 Million Cases?  FLIP will create havoc in attempting to
> build 1 Million variables.
> In other words this scales horribly.
> Avoid that problem as follows:  FWIW, This solution does not play nice with
> STRING variables so you will need to float a different boat for that.
>
> MATRIX.
> SAVE TRUNC(UNIFORM(10000,300)*3) /OUTFILE * / VARIABLES x001 TO x300.
> END MATRIX.
>
> RECODE x001 TO x010 x020 TO x025 x100 TO x200 (ELSE=999).
> MISSING VALUES ALL (999).
> DATASET NAME raw.
> /* Sample data are all set */.
>
> DATASET DECLARE DESC2.
> OMS
>    /SELECT TABLES
>    /IF COMMANDS=['Descriptives'] SUBTYPES=['Descriptive Statistics']
>    /DESTINATION FORMAT=SAV NUMBERED=TableNumber_
>     OUTFILE='DESC2' .
> DESCRIPTIVES ALL.
> OMSEND.
> DATASET ACTIVATE DESC2.
> MATCH FILES /FILE * /KEEP Var1 N.
> SELECT IF N GT 0.
> RENAME VARIABLES (Var1 = CASE_LBL).
> FLIP.
> COMPUTE @LastValid@=1.
> ADD FILES /FILE * / FILE raw.
> COMPUTE @LastVar@=1.
> SELECT IF CASE_LBL NE 'N'.
> EXECUTE.
> DELETE VARIABLES CASE_LBL @LastValid@ TO @LastVar@ .
>


=====================
To manage your subscription to SPSSX-L, send a message to
LISTSERV@LISTSERV.UGA.EDU (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: syntax to share

Bruce Weaver
Administrator
In reply to this post by David Marso
Nice one, David.  

It's not immediately clear to me how one might tweak that to work with string variables.  One can assign missing values to string variables (MISSING VALUES command), but the NVALID and NMISS functions only work with numeric variables.  

Perhaps one could AUTORECODE string variables to numeric, and include a BLANK sub-command to convert blanks to user-defined missing values?  Maybe SORT VARIABLES BY TYPE would be useful too?  Just "thinking out loud" here in case I say something that inspires you.  ;-)

Here's a variation on an AUTORECODE example from the FM that illustrates use of the BLANK sub-command.

DATA LIST /stringVar (A1).
BEGIN DATA
a
b

c
d
END DATA.
AUTORECODE VARIABLES=stringVar /INTO NumericVar /BLANK=MISSING.
COMPUTE MissingValue = MISSING(NumericVar).
FORMATS MissingValue(F1).
LIST.

OUTPUT:
stringVar NumericVar MissingValue
 
a              1           0
b              2           0
               5           1
c              3           0
d              4           0
 
Number of cases read:  5    Number of cases listed:  5

v The values a, b, c, and d are autorecoded into the numeric values 1 through 4.
v The blank value is autorecoded to 5, and 5 is defined as user-missing.



David Marso wrote
The basic problem I see here is that very first line.
What if you have 1 Million Cases?  FLIP will create havoc in attempting to build 1 Million variables.
In other words this scales horribly.
Avoid that problem as follows:  FWIW, This solution does not play nice with STRING variables so you will need to float a different boat for that.

MATRIX.
SAVE TRUNC(UNIFORM(10000,300)*3) /OUTFILE * / VARIABLES x001 TO x300.
END MATRIX.

RECODE x001 TO x010 x020 TO x025 x100 TO x200 (ELSE=999).
MISSING VALUES ALL (999).
DATASET NAME raw.
/* Sample data are all set */.

DATASET DECLARE DESC2.
OMS
  /SELECT TABLES
  /IF COMMANDS=['Descriptives'] SUBTYPES=['Descriptive Statistics']
  /DESTINATION FORMAT=SAV NUMBERED=TableNumber_
   OUTFILE='DESC2' .
DESCRIPTIVES ALL.
OMSEND.
DATASET ACTIVATE DESC2.
MATCH FILES /FILE * /KEEP Var1 N.
SELECT IF N GT 0.
RENAME VARIABLES (Var1 = CASE_LBL).
FLIP.
COMPUTE @LastValid@=1.
ADD FILES /FILE * / FILE raw.
COMPUTE @LastVar@=1.
SELECT IF CASE_LBL NE 'N'.
EXECUTE.
DELETE VARIABLES CASE_LBL @LastValid@ TO @LastVar@ .

Dominic Lusinchi wrote
Dear fellow SPSS users,

 

I wanted to share some data management syntax I used to delete variables.
I'm sure that for experienced users, it's old news, and that the real
programmers among you might even be able to improve on it. I am using
version 21 for this exercise.

The problem I faced was a large data set with many empty variables, i.e.
variables that had no values, questions in the survey that were answered by
no one. I wanted a data file with only the variables I would work with. Of
course, you could use DELETE VARIABLES (or even delete manually); but that
means that you need to go through every variable to determine which are
"empty" and which are not. That could be very tedious and time consuming,
not to mention prone to error. So here is the syntax I used (the succinct
version first; the explained version second):

 

FLIP VARIABLES=ALL.

COMPUTE FLTRvar=NVALID(var1 TO varN).

FILTER OFF.

USE ALL.

SELECT IF (FLTRvar > 0).

EXECUTE.

DELETE VARIABLES  FLTRvar.

FLIP VARIABLES=ALL.

DELETE VARIABLES  CASE_LBL.

 

FLIP VARIABLES=ALL.

* Cases become variables and variables become cases; so if you have 100
cases, the new data set will have 100 variables from VAR001 to VAR100.

* Also FLIP creates a new string variable CASE_LBL whose values are the
names of the variables in the original data set.

* CAUTION: FLIP SETS ALL STRING VARIABLES in the original data set TO SYSMIS
.

COMPUTE FLTRvar=NVALID(var1 TO varN).

* The FLTRvar using NVALID will tell me which of the variables (now cases)
in the original data set are empty, which means they are set to "0".

* NVALID tells us how many variables have valid values, i.e. are not
missing; so if there 5 variables and all have values (none missing) NVALID
returns a "5".

* Replace "var1" and "varN" with the names of your variables.

FILTER OFF.

USE ALL.

SELECT IF (FLTRvar > 0).

EXECUTE.

* The SELECT IF deletes all the empty variables (now cases) from the
original data set that have "0" on FLTRvar.

DELETE VARIABLES  FLTRvar.

* I get rid of the filter variable since I do not need it anymore.

FLIP VARIABLES=ALL.

* This sets back the original variables (cases in the first FLIP) to their
status of variable.

DELETE VARIABLES  CASE_LBL.

* I delete the CASE_LBL created by the second FLIP.

* Then I save the resulting data file.

* ALWAYS SAVE THE ORIGINAL DATA  so you can always come back to it.

 

Best regards - Dominic

 

*********************************************

Dominic Lusinchi

Far West Research Consulting

Applied Statistics - Social Research - Sociology

San Francisco, California

[hidden email]

1-415-664-3032

CV: http://www.farwestresearch.com/staff/dl/dlcv.html

*********************************************

 


=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
Reply | Threaded
Open this post in threaded view
|

Re: syntax to share

Dominic Lusinchi
In reply to this post by Jignesh Sutar

You see folks what I mean by “real programmers”! Thank you, Jignesh, David, Kirill and Bruce!! Had to lookup “FWIW”. I’m not up on… what kind of speak is that BTW?

I do have a question: are you saying that SPSS can’t handle 1M variables? If I understand the SPSS convention in naming variables, with 1M variables it would go: VAR0000001….VAR1000000.

– Dominic

 

From: [hidden email] [[hidden email]] On Behalf Of Jignesh Sutar
Sent: Wednesday, November 18, 2015 11:18 AM
To: Dominic Lusinchi
Cc:
[hidden email]
Subject: Re: syntax to share

 

Preceding commands? I meant subsequent commands after any FLIP command.

 

On 18 November 2015 at 19:14, Jignesh Sutar <[hidden email]> wrote:

Well done! This is very much a common procedure that I personally use frequently also.


Just some tips...

 

You should look into working with DATASET NAMEs as when you flip a dataset a new (unnamed) dataset is created and preceding commands are intended to run on that dataset so your code should really should name that dataset and activate it. In this way you can run the entire syntax without any issues (else you have to step through it and in between select/activate the appropriate dataset, which I suspect is what you are doing to get this to work).

Secondly, the code "COMPUTE FLTRvar=NVALID(var1 TO varN)." requires input from the user, again which isn't ideal, and often there are easy workaround to avoid this.

 

You could also wrap and save this into a macro also (or even create a custom extension command), so it can be called upon by a single line than have to repeat all those lines of code. I shall let you figure out how to achieve that.

 

Here's a revised version of your code, that you may want to play with and try understand it's benefits in practice:

 

 

Ohhh and by the way, note, there is a spssaux2.FindEmptyVariables(delete=True) procedure which is dependent on python achieving this very same task.

 

 

DATA LIST FREE / A B C D.

BEGIN DATA 

1 2 3 4 5 6 7 8 9 10 11 12

END DATA. 

COMPUTE C=$SYSMIS.

DATASET NAME DSSIM.

 

 

FLIP VARIABLES=ALL.

DATASET NAME dsFlip1.

DATASET ACTIVATE dsFlip1.

NUMERIC MyEndVar(F1.0).

COMPUTE FLTRvar=NVALID(var001 TO MyEndVar).

FILTER OFF.

USE ALL.

SELECT IF (FLTRvar > 0).

EXECUTE.

DELETE VARIABLES  FLTRvar.

FLIP VARIABLES=ALL.

DATASET NAME dsFlip2.

DATASET ACTIVATE dsFlip2.

SELECT IF CASE_LBL<>"MyEndVar".

EXECUTE.

DELETE VARIABLES  CASE_LBL.

 

 

 

 

On 18 November 2015 at 18:40, Dominic Lusinchi <[hidden email]> wrote:

Dear fellow SPSS users,

 

I wanted to share some data management syntax I used to delete variables. I’m sure that for experienced users, it’s old news, and that the real programmers among you might even be able to improve on it. I am using version 21 for this exercise.

The problem I faced was a large data set with many empty variables, i.e. variables that had no values, questions in the survey that were answered by no one. I wanted a data file with only the variables I would work with. Of course, you could use DELETE VARIABLES (or even delete manually); but that means that you need to go through every variable to determine which are “empty” and which are not. That could be very tedious and time consuming, not to mention prone to error. So here is the syntax I used (the succinct version first; the explained version second):

 

FLIP VARIABLES=ALL.

COMPUTE FLTRvar=NVALID(var1 TO varN).

FILTER OFF.

USE ALL.

SELECT IF (FLTRvar > 0).

EXECUTE.

DELETE VARIABLES  FLTRvar.

FLIP VARIABLES=ALL.

DELETE VARIABLES  CASE_LBL.

 

FLIP VARIABLES=ALL.

* Cases become variables and variables become cases; so if you have 100 cases, the new data set will have 100 variables from VAR001 to VAR100.

* Also FLIP creates a new string variable CASE_LBL whose values are the names of the variables in the original data set.

* CAUTION: FLIP SETS ALL STRING VARIABLES in the original data set TO SYSMIS .

COMPUTE FLTRvar=NVALID(var1 TO varN).

* The FLTRvar using NVALID will tell me which of the variables (now cases) in the original data set are empty, which means they are set to “0”.

* NVALID tells us how many variables have valid values, i.e. are not missing; so if there 5 variables and all have values (none missing) NVALID returns a “5”.

* Replace “var1” and “varN” with the names of your variables.

FILTER OFF.

USE ALL.

SELECT IF (FLTRvar > 0).

EXECUTE.

* The SELECT IF deletes all the empty variables (now cases) from the original data set that have “0” on FLTRvar.

DELETE VARIABLES  FLTRvar.

* I get rid of the filter variable since I do not need it anymore.

FLIP VARIABLES=ALL.

* This sets back the original variables (cases in the first FLIP) to their status of variable.

DELETE VARIABLES  CASE_LBL.

* I delete the CASE_LBL created by the second FLIP.

* Then I save the resulting data file.

* ALWAYS SAVE THE ORIGINAL DATA  so you can always come back to it.

 

Best regards – Dominic

 

*********************************************

Dominic Lusinchi

Far West Research Consulting

Applied Statistics – Social Research – Sociology

San Francisco, California

[hidden email]

1-415-664-3032

CV: http://www.farwestresearch.com/staff/dl/dlcv.html

*********************************************

 

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

 

 

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: syntax to share

David Marso
Administrator
FWIW = For What It's Worth: a FLA (Four Letter Acronym).
Just an abbreviation for a commonly used expression.  Sort of like WTF, RTFM, LOL,  ROFLMAO ...
I'm not saying that SPSS can't handle 1M variables (It likely can, I have never tried it... don't tempt me).
The question is whether you want it to or need it to?  How much time do you have on your hands?
Or more to the point, can your computer handle 1M variables?  Do you want it to or need it to?

Dominic Lusinchi wrote
You see folks what I mean by “real programmers”! Thank you, Jignesh, David, Kirill and Bruce!! Had to lookup “FWIW”. I’m not up on… what kind of speak is that BTW?

I do have a question: are you saying that SPSS can’t handle 1M variables? If I understand the SPSS convention in naming variables, with 1M variables it would go: VAR0000001….VAR1000000.

– Dominic

 

From:  <mailto:[hidden email]> [hidden email] [ <mailto:[hidden email]> mailto:[hidden email]] On Behalf Of Jignesh Sutar
Sent: Wednesday, November 18, 2015 11:18 AM
To: Dominic Lusinchi
Cc:  <mailto:[hidden email]> [hidden email]
Subject: Re: syntax to share

 

Preceding commands? I meant subsequent commands after any FLIP command.

 

On 18 November 2015 at 19:14, Jignesh Sutar <[hidden email]> wrote:

Well done! This is very much a common procedure that I personally use frequently also.


Just some tips...

 

You should look into working with DATASET NAMEs as when you flip a dataset a new (unnamed) dataset is created and preceding commands are intended to run on that dataset so your code should really should name that dataset and activate it. In this way you can run the entire syntax without any issues (else you have to step through it and in between select/activate the appropriate dataset, which I suspect is what you are doing to get this to work).

Secondly, the code "COMPUTE FLTRvar=NVALID(var1 TO varN)." requires input from the user, again which isn't ideal, and often there are easy workaround to avoid this.

 

You could also wrap and save this into a macro also (or even create a custom extension command), so it can be called upon by a single line than have to repeat all those lines of code. I shall let you figure out how to achieve that.

 

Here's a revised version of your code, that you may want to play with and try understand it's benefits in practice:

 

 

Ohhh and by the way, note, there is a spssaux2.FindEmptyVariables(delete=True) procedure which is dependent on python achieving this very same task.

 

 

DATA LIST FREE / A B C D.

BEGIN DATA

1 2 3 4 5 6 7 8 9 10 11 12

END DATA.

COMPUTE C=$SYSMIS.

DATASET NAME DSSIM.

 

 

FLIP VARIABLES=ALL.

DATASET NAME dsFlip1.

DATASET ACTIVATE dsFlip1.

NUMERIC MyEndVar(F1.0).

COMPUTE FLTRvar=NVALID(var001 TO MyEndVar).

FILTER OFF.

USE ALL.

SELECT IF (FLTRvar > 0).

EXECUTE.

DELETE VARIABLES  FLTRvar.

FLIP VARIABLES=ALL.

DATASET NAME dsFlip2.

DATASET ACTIVATE dsFlip2.

SELECT IF CASE_LBL<>"MyEndVar".

EXECUTE.

DELETE VARIABLES  CASE_LBL.

 

 

 

 

On 18 November 2015 at 18:40, Dominic Lusinchi <[hidden email]> wrote:

Dear fellow SPSS users,

 

I wanted to share some data management syntax I used to delete variables. I’m sure that for experienced users, it’s old news, and that the real programmers among you might even be able to improve on it. I am using version 21 for this exercise.

The problem I faced was a large data set with many empty variables, i.e. variables that had no values, questions in the survey that were answered by no one. I wanted a data file with only the variables I would work with. Of course, you could use DELETE VARIABLES (or even delete manually); but that means that you need to go through every variable to determine which are “empty” and which are not. That could be very tedious and time consuming, not to mention prone to error. So here is the syntax I used (the succinct version first; the explained version second):

 

FLIP VARIABLES=ALL.

COMPUTE FLTRvar=NVALID(var1 TO varN).

FILTER OFF.

USE ALL.

SELECT IF (FLTRvar > 0).

EXECUTE.

DELETE VARIABLES  FLTRvar.

FLIP VARIABLES=ALL.

DELETE VARIABLES  CASE_LBL.

 

FLIP VARIABLES=ALL.

* Cases become variables and variables become cases; so if you have 100 cases, the new data set will have 100 variables from VAR001 to VAR100.

* Also FLIP creates a new string variable CASE_LBL whose values are the names of the variables in the original data set.

* CAUTION: FLIP SETS ALL STRING VARIABLES in the original data set TO SYSMIS .

COMPUTE FLTRvar=NVALID(var1 TO varN).

* The FLTRvar using NVALID will tell me which of the variables (now cases) in the original data set are empty, which means they are set to “0”.

* NVALID tells us how many variables have valid values, i.e. are not missing; so if there 5 variables and all have values (none missing) NVALID returns a “5”.

* Replace “var1” and “varN” with the names of your variables.

FILTER OFF.

USE ALL.

SELECT IF (FLTRvar > 0).

EXECUTE.

* The SELECT IF deletes all the empty variables (now cases) from the original data set that have “0” on FLTRvar.

DELETE VARIABLES  FLTRvar.

* I get rid of the filter variable since I do not need it anymore.

FLIP VARIABLES=ALL.

* This sets back the original variables (cases in the first FLIP) to their status of variable.

DELETE VARIABLES  CASE_LBL.

* I delete the CASE_LBL created by the second FLIP.

* Then I save the resulting data file.

* ALWAYS SAVE THE ORIGINAL DATA  so you can always come back to it.

 

Best regards – Dominic

 

*********************************************

Dominic Lusinchi

Far West Research Consulting

Applied Statistics – Social Research – Sociology

San Francisco, California

[hidden email]

1-415-664-3032

CV: http://www.farwestresearch.com/staff/dl/dlcv.html

*********************************************

 

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

 

 


=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|

Re: syntax to share

Jon K Peck
There are no limits on the number of variables that can be handled (other than available memory), but I can promise that with 1M variables, it will be slow.  And FLIPping a dataset of that size is surely a bad idea.
 

Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621
 
 
----- Original message -----
From: David Marso <[hidden email]>
Sent by: "SPSSX(r) Discussion" <[hidden email]>
To: [hidden email]
Cc:
Subject: Re: [SPSSX-L] syntax to share
Date: Thu, Nov 19, 2015 8:04 AM
 
FWIW = For What It's Worth: a FLA (Four Letter Acronym).
Just an abbreviation for a commonly used expression.  Sort of like WTF,
RTFM, LOL,  ROFLMAO ...
I'm not saying that SPSS can't handle 1M variables (It likely can, I have
never tried it... don't tempt me).
The question is whether you want it to or need it to?  How much time do you
have on your hands?
Or more to the point, can your computer handle 1M variables?  Do you want it
to or need it to?


Dominic Lusinchi wrote

> You see folks what I mean by “real programmers”! Thank you, Jignesh,
> David, Kirill and Bruce!! Had to lookup “FWIW”. I’m not up on… what kind
> of speak is that BTW?
>
> I do have a question: are you saying that SPSS can’t handle 1M variables?
> If I understand the SPSS convention in naming variables, with 1M variables
> it would go: VAR0000001….VAR1000000.
>
> – Dominic
>
>  
>
> From:  &lt;mailto:

> jignesh.sutar@

> &gt;

> jignesh.sutar@

>  [ &lt;mailto:

> jignesh.sutar@

> &gt; mailto:

> jignesh.sutar@

> ] On Behalf Of Jignesh Sutar
> Sent: Wednesday, November 18, 2015 11:18 AM
> To: Dominic Lusinchi
> Cc:  &lt;mailto:

> SPSSX-L@.UGA

> &gt;

> SPSSX-L@.UGA

> Subject: Re: syntax to share
>
>  
>
> Preceding commands? I meant subsequent commands after any FLIP command.
>
>  
>
> On 18 November 2015 at 19:14, Jignesh Sutar &lt;

> jsutar@

> &gt; wrote:
>
> Well done! This is very much a common procedure that I personally use
> frequently also.
>
>
> Just some tips...
>
>  
>
> You should look into working with DATASET NAMEs as when you flip a dataset
> a new (unnamed) dataset is created and preceding commands are intended to
> run on that dataset so your code should really should name that dataset
> and activate it. In this way you can run the entire syntax without any
> issues (else you have to step through it and in between select/activate
> the appropriate dataset, which I suspect is what you are doing to get this
> to work).
>
> Secondly, the code "COMPUTE FLTRvar=NVALID(var1 TO varN)." requires input
> from the user, again which isn't ideal, and often there are easy
> workaround to avoid this.
>
>  
>
> You could also wrap and save this into a macro also (or even create a
> custom extension command), so it can be called upon by a single line than
> have to repeat all those lines of code. I shall let you figure out how to
> achieve that.
>
>  
>
> Here's a revised version of your code, that you may want to play with and
> try understand it's benefits in practice:
>
>  
>
>  
>
> Ohhh and by the way, note, there is a
> spssaux2.FindEmptyVariables(delete=True) procedure which is dependent on
> python achieving this very same task.
>
>  
>
>  
>
> DATA LIST FREE / A B C D.
>
> BEGIN DATA
>
> 1 2 3 4 5 6 7 8 9 10 11 12
>
> END DATA.
>
> COMPUTE C=$SYSMIS.
>
> DATASET NAME DSSIM.
>
>  
>
>  
>
> FLIP VARIABLES=ALL.
>
> DATASET NAME dsFlip1.
>
> DATASET ACTIVATE dsFlip1.
>
> NUMERIC MyEndVar(F1.0).
>
> COMPUTE FLTRvar=NVALID(var001 TO MyEndVar).
>
> FILTER OFF.
>
> USE ALL.
>
> SELECT IF (FLTRvar > 0).
>
> EXECUTE.
>
> DELETE VARIABLES  FLTRvar.
>
> FLIP VARIABLES=ALL.
>
> DATASET NAME dsFlip2.
>
> DATASET ACTIVATE dsFlip2.
>
> SELECT IF CASE_LBL<>"MyEndVar".
>
> EXECUTE.
>
> DELETE VARIABLES  CASE_LBL.
>
>  
>
>  
>
>  
>
>  
>
> On 18 November 2015 at 18:40, Dominic Lusinchi &lt;

> dominic@

> &gt; wrote:
>
> Dear fellow SPSS users,
>
>  
>
> I wanted to share some data management syntax I used to delete variables.
> I’m sure that for experienced users, it’s old news, and that the real
> programmers among you might even be able to improve on it. I am using
> version 21 for this exercise.
>
> The problem I faced was a large data set with many empty variables, i.e.
> variables that had no values, questions in the survey that were answered
> by no one. I wanted a data file with only the variables I would work with.
> Of course, you could use DELETE VARIABLES (or even delete manually); but
> that means that you need to go through every variable to determine which
> are “empty” and which are not. That could be very tedious and time
> consuming, not to mention prone to error. So here is the syntax I used
> (the succinct version first; the explained version second):
>
>  
>
> FLIP VARIABLES=ALL.
>
> COMPUTE FLTRvar=NVALID(var1 TO varN).
>
> FILTER OFF.
>
> USE ALL.
>
> SELECT IF (FLTRvar > 0).
>
> EXECUTE.
>
> DELETE VARIABLES  FLTRvar.
>
> FLIP VARIABLES=ALL.
>
> DELETE VARIABLES  CASE_LBL.
>
>  
>
> FLIP VARIABLES=ALL.
>
> * Cases become variables and variables become cases; so if you have 100
> cases, the new data set will have 100 variables from VAR001 to VAR100.
>
> * Also FLIP creates a new string variable CASE_LBL whose values are the
> names of the variables in the original data set.
>
> * CAUTION: FLIP SETS ALL STRING VARIABLES in the original data set TO
> SYSMIS .
>
> COMPUTE FLTRvar=NVALID(var1 TO varN).
>
> * The FLTRvar using NVALID will tell me which of the variables (now cases)
> in the original data set are empty, which means they are set to “0”.
>
> * NVALID tells us how many variables have valid values, i.e. are not
> missing; so if there 5 variables and all have values (none missing) NVALID
> returns a “5”.
>
> * Replace “var1” and “varN” with the names of your variables.
>
> FILTER OFF.
>
> USE ALL.
>
> SELECT IF (FLTRvar > 0).
>
> EXECUTE.
>
> * The SELECT IF deletes all the empty variables (now cases) from the
> original data set that have “0” on FLTRvar.
>
> DELETE VARIABLES  FLTRvar.
>
> * I get rid of the filter variable since I do not need it anymore.
>
> FLIP VARIABLES=ALL.
>
> * This sets back the original variables (cases in the first FLIP) to their
> status of variable.
>
> DELETE VARIABLES  CASE_LBL.
>
> * I delete the CASE_LBL created by the second FLIP.
>
> * Then I save the resulting data file.
>
> * ALWAYS SAVE THE ORIGINAL DATA  so you can always come back to it.
>
>  
>
> Best regards – Dominic
>
>  
>
> *********************************************
>
> Dominic Lusinchi
>
> Far West Research Consulting
>
> Applied Statistics – Social Research – Sociology
>
> San Francisco, California

> dominic@

>
> 1-415-664-3032
>
> CV: http://www.farwestresearch.com/staff/dl/dlcv.html
>
> *********************************************
>
>  
>
> ===================== To manage your subscription to SPSSX-L, send a
> message to

> LISTSERV@.UGA

>  (not to SPSSX-L), with no body text except the command. To leave the
> list, send the command SIGNOFF SPSSX-L For a list of commands to manage
> subscriptions, send the command INFO REFCARD
>
>  
>
>  
>
>
> =====================
> To manage your subscription to SPSSX-L, send a message to

> LISTSERV@.UGA

>  (not to SPSSX-L), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD





-----
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/syntax-to-share-tp5730975p5730985.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

 

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: syntax to share

Dominic Lusinchi
In reply to this post by Dominic Lusinchi

Thank you, David and Jon. Well said. Fortunately, when I’ve had to deal with a data file in the millions, I did not have to delete hundreds of variables. Cheers – Dominic

P.S. Thanks for the acronymish lesson…

 

From: Jon K Peck [mailto:[hidden email]]
Sent: Thursday, November 19, 2015 7:00 AM
Subject: Re: syntax to share

 

There are no limits on the number of variables that can be handled (other than available memory), but I can promise that with 1M variables, it will be slow.  And FLIPping a dataset of that size is surely a bad idea.

 


Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621

 

 

----- Original message -----
From: David Marso <[hidden email]>
Sent by: "SPSSX(r) Discussion" <[hidden email]>
To: [hidden email]
Cc:
Subject: Re: [SPSSX-L] syntax to share
Date: Thu, Nov 19, 2015 8:04 AM
 

FWIW = For What It's Worth: a FLA (Four Letter Acronym).
Just an abbreviation for a commonly used expression.  Sort of like WTF,
RTFM, LOL,  ROFLMAO ...
I'm not saying that SPSS can't handle 1M variables (It likely can, I have
never tried it... don't tempt me).
The question is whether you want it to or need it to?  How much time do you
have on your hands?
Or more to the point, can your computer handle 1M variables?  Do you want it
to or need it to?


Dominic Lusinchi wrote


> You see folks what I mean by “real programmers”! Thank you, Jignesh,
> David, Kirill and Bruce!! Had to lookup “FWIW”. I’m not up on… what kind
> of speak is that BTW?
>
> I do have a question: are you saying that SPSS can’t handle 1M variables?
> If I understand the SPSS convention in naming variables, with 1M variables
> it would go: VAR0000001….VAR1000000.
>
> – Dominic
>
>  
>
> From:  &lt;mailto:

> [hidden email]

> &gt;

> [hidden email]

>  [ &lt;mailto:

> [hidden email]

> &gt; mailto:

> [hidden email]

> ] On Behalf Of Jignesh Sutar
> Sent: Wednesday, November 18, 2015 11:18 AM
> To: Dominic Lusinchi
> Cc:  &lt;mailto:

> [hidden email]

> &gt;

> [hidden email]

> Subject: Re: syntax to share
>
>  
>
> Preceding commands? I meant subsequent commands after any FLIP command.
>
>  
>
> On 18 November 2015 at 19:14, Jignesh Sutar &lt;

> jsutar@

> &gt; wrote:
>
> Well done! This is very much a common procedure that I personally use
> frequently also.
>
>
> Just some tips...
>
>  
>
> You should look into working with DATASET NAMEs as when you flip a dataset
> a new (unnamed) dataset is created and preceding commands are intended to
> run on that dataset so your code should really should name that dataset
> and activate it. In this way you can run the entire syntax without any
> issues (else you have to step through it and in between select/activate
> the appropriate dataset, which I suspect is what you are doing to get this
> to work).
>
> Secondly, the code "COMPUTE FLTRvar=NVALID(var1 TO varN)." requires input
> from the user, again which isn't ideal, and often there are easy
> workaround to avoid this.
>
>  
>
> You could also wrap and save this into a macro also (or even create a
> custom extension command), so it can be called upon by a single line than
> have to repeat all those lines of code. I shall let you figure out how to
> achieve that.
>
>  
>
> Here's a revised version of your code, that you may want to play with and
> try understand it's benefits in practice:
>
>  
>
>  
>
> Ohhh and by the way, note, there is a
> spssaux2.FindEmptyVariables(delete=True) procedure which is dependent on
> python achieving this very same task.
>
>  
>
>  
>
> DATA LIST FREE / A B C D.
>
> BEGIN DATA
>
> 1 2 3 4 5 6 7 8 9 10 11 12
>
> END DATA.
>
> COMPUTE C=$SYSMIS.
>
> DATASET NAME DSSIM.
>
>  
>
>  
>
> FLIP VARIABLES=ALL.
>
> DATASET NAME dsFlip1.
>
> DATASET ACTIVATE dsFlip1.
>
> NUMERIC MyEndVar(F1.0).
>
> COMPUTE FLTRvar=NVALID(var001 TO MyEndVar).
>
> FILTER OFF.
>
> USE ALL.
>
> SELECT IF (FLTRvar > 0).
>
> EXECUTE.
>
> DELETE VARIABLES  FLTRvar.
>
> FLIP VARIABLES=ALL.
>
> DATASET NAME dsFlip2.
>
> DATASET ACTIVATE dsFlip2.
>
> SELECT IF CASE_LBL<>"MyEndVar".
>
> EXECUTE.
>
> DELETE VARIABLES  CASE_LBL.
>
>  
>
>  
>
>  
>
>  
>
> On 18 November 2015 at 18:40, Dominic Lusinchi &lt;

> dominic@

> &gt; wrote:
>
> Dear fellow SPSS users,
>
>  
>
> I wanted to share some data management syntax I used to delete variables.
> I’m sure that for experienced users, it’s old news, and that the real
> programmers among you might even be able to improve on it. I am using
> version 21 for this exercise.
>
> The problem I faced was a large data set with many empty variables, i.e.
> variables that had no values, questions in the survey that were answered
> by no one. I wanted a data file with only the variables I would work with.
> Of course, you could use DELETE VARIABLES (or even delete manually); but
> that means that you need to go through every variable to determine which
> are “empty” and which are not. That could be very tedious and time
> consuming, not to mention prone to error. So here is the syntax I used
> (the succinct version first; the explained version second):
>
>  
>
> FLIP VARIABLES=ALL.
>
> COMPUTE FLTRvar=NVALID(var1 TO varN).
>
> FILTER OFF.
>
> USE ALL.
>
> SELECT IF (FLTRvar > 0).
>
> EXECUTE.
>
> DELETE VARIABLES  FLTRvar.
>
> FLIP VARIABLES=ALL.
>
> DELETE VARIABLES  CASE_LBL.
>
>  
>
> FLIP VARIABLES=ALL.
>
> * Cases become variables and variables become cases; so if you have 100
> cases, the new data set will have 100 variables from VAR001 to VAR100.
>
> * Also FLIP creates a new string variable CASE_LBL whose values are the
> names of the variables in the original data set.
>
> * CAUTION: FLIP SETS ALL STRING VARIABLES in the original data set TO
> SYSMIS .
>
> COMPUTE FLTRvar=NVALID(var1 TO varN).
>
> * The FLTRvar using NVALID will tell me which of the variables (now cases)
> in the original data set are empty, which means they are set to “0”.
>
> * NVALID tells us how many variables have valid values, i.e. are not
> missing; so if there 5 variables and all have values (none missing) NVALID
> returns a “5”.
>
> * Replace “var1” and “varN” with the names of your variables.
>
> FILTER OFF.
>
> USE ALL.
>
> SELECT IF (FLTRvar > 0).
>
> EXECUTE.
>
> * The SELECT IF deletes all the empty variables (now cases) from the
> original data set that have “0” on FLTRvar.
>
> DELETE VARIABLES  FLTRvar.
>
> * I get rid of the filter variable since I do not need it anymore.
>
> FLIP VARIABLES=ALL.
>
> * This sets back the original variables (cases in the first FLIP) to their
> status of variable.
>
> DELETE VARIABLES  CASE_LBL.
>
> * I delete the CASE_LBL created by the second FLIP.
>
> * Then I save the resulting data file.
>
> * ALWAYS SAVE THE ORIGINAL DATA  so you can always come back to it.
>
>  
>
> Best regards – Dominic
>
>  
>
> *********************************************
>
> Dominic Lusinchi
>
> Far West Research Consulting
>
> Applied Statistics – Social Research – Sociology
>
> San Francisco, California

> dominic@

>
> 1-415-664-3032
>
> CV: http://www.farwestresearch.com/staff/dl/dlcv.html
>
> *********************************************
>
>  
>
> ===================== To manage your subscription to SPSSX-L, send a
> message to

> [hidden email]

>  (not to SPSSX-L), with no body text except the command. To leave the
> list, send the command SIGNOFF SPSSX-L For a list of commands to manage
> subscriptions, send the command INFO REFCARD
>
>  
>
>  
>
>
> =====================
> To manage your subscription to SPSSX-L, send a message to

> [hidden email]

>  (not to SPSSX-L), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD





-----
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/syntax-to-share-tp5730975p5730985.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
 


===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD