Method to Select Specified Number of Variables from Dataset by Measure

standwu91
Hello. I am fairly new to SPSS and this forum, but I have this project I am
working on and I need some assistance.

In particular, I am trying to find an easy way to write a piece of code that
allows me to use only X variables from my data set, with their measurement
level as the criterion. I am already able to select all the variables through
SPSSINC SELECT VARIABLES, but I would also like to select a specified
number of variables (e.g. 3 numeric variables).

While I understand I can click and choose, I was hoping for some syntax
that I can later apply to other aspects of my project.

Thank you so much.



--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Re: Method to Select Specified Number of Variables from Dataset by Measure

Jon Peck
The SPSSINC SELECT VARIABLES extension command does not provide a way to limit the number of variables within the selection criteria other than through the specific-names list box. This could be done by writing a macro that creates another macro from the first n variables in the list generated by SELECT VARIABLES. However, what would you want to do if there are more than three (or your chosen limit) qualifying variables? You could take the first 3, perhaps, and SELECT VARIABLES gives you some control over the order of the list, but that probably isn't sufficient for this purpose.
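
As a rough illustration of the "first n qualifying variables" idea, here is a minimal Python sketch. The helper names first_n_by_measure and dataset_variables and the demo data are hypothetical. dataset_variables assumes the SPSS Python integration (spss.GetVariableCount, spss.GetVariableName, spss.GetVariableMeasurementLevel) and only runs inside SPSS; the rest is plain Python. Note that it filters on measurement level (nominal, ordinal, scale), so selecting by numeric versus string type would need a different check.

```python
def first_n_by_measure(variables, measure, n):
    """Return the first n variable names whose measurement level matches.

    variables is a list of (name, level) pairs in file order.
    """
    matches = [name for name, level in variables if level == measure]
    return matches[:n]


def dataset_variables():
    """Collect (name, level) pairs from the active dataset.

    Assumption: this only works inside SPSS, where the spss module exists.
    """
    import spss
    return [(spss.GetVariableName(i), spss.GetVariableMeasurementLevel(i))
            for i in range(spss.GetVariableCount())]


# Made-up data so the sketch can be tried outside SPSS.
demo = [("id", "nominal"), ("age", "scale"), ("income", "scale"),
        ("region", "nominal"), ("height", "scale"), ("weight", "scale")]
picked = first_n_by_measure(demo, "scale", 3)
print(" ".join(picked))
```

Inside SPSS, the joined names could be passed to spss.SetMacroValue to define a macro. Taking the first n in file order is only one possible selection rule.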

--
Jon K Peck
[hidden email]

FW: Method to Select Specified Number of Variables from Dataset by Measure

John F Hall

Should have gone to list, not just Jon.

 

How about changing the measurement level of the other variables to nominal and leaving your selected variables as numeric? You can always save under a different file name, or not save the changes, when exiting SPSS.

 

John F Hall MA (Cantab) Dip Ed (Dunelm)
[Retired academic survey researcher]

Email: [hidden email]
Website: Journeys in Survey Research
Course: Survey Analysis Workshop (SPSS)
Research: Subjective Social Indicators (Quality of Life)

 

Re: Method to Select Specified Number of Variables from Dataset by Measure

standwu91
In reply to this post by Jon Peck
Thank you for your insight.

I was also wondering whether SPSS can extract variables that are not
adjacent in the file and have nothing in common in their names or labels.
For instance, I want to select variables 1, 3, 9, and 15.

If not, is there a script, or Python or R code, that could help with this
process?

The end goal is to generate all combinations of variables within the data set. In particular, this would include every combination from 3 variables up to n variables.

EDIT: Thought I was finished typing up this message but that was clearly not the case.



Re: FW: Method to Select Specified Number of Variables from Dataset by Measure

Jon Peck
In reply to this post by John F Hall
That doesn't seem to solve the problem. The user might just as well define the macro manually in that case. But to automate this, they would need some selection rule for the case when there are more than 3 (n) eligible variables. Also, it is bad practice IMO to alter variable properties with the intention that the change be temporary, since the change becomes permanent if the file is later saved for some other purpose.

Re: Method to Select Specified Number of Variables from Dataset by Measure

Jon Peck
In reply to this post by standwu91
I am interpreting "extract variables" to mean creating a macro that lists the selected variables and can be used in places in syntax where variable lists are accepted.  To define macros this way requires using Python programmability.

Here is a solution.  I will also send the file as an attachment to standwu91, since the list sometimes mangles indentation.
Of course, a macro based on variable names could be defined directly without the use of Python.

begin program.
import spss

def varmacro(varnumbers, macroname="!vars"):
    """Create a macro listing variable names for specified numbers

    varnumbers is a list of variable numbers to select counting from 1 for the first variable.
    The numbers refer to the order in the file.
    macroname is the name for the macro and defaults to !vars."""

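    # GetVariableName is zero-based, so shift the one-based numbers down by one.
    varnames = " ".join(spss.GetVariableName(v-1) for v in varnumbers)
    spss.SetMacroValue(macroname, varnames)
    # The parenthesized form works with both Python 2 and Python 3.
    print("Macro: %s\nVariables: %s" % (macroname, varnames))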
end program.

* usage example.
* The program block above must be run at least once in a session before 
* generating a macro as below.
* Note that the list of variable numbers must be enclosed in square brackets.

begin program.
varmacro([1, 3, 5], "!somevars")
end program.

Re: Method to Select Specified Number of Variables from Dataset by Measure

David Marso
Administrator
In reply to this post by standwu91
"The end goal is to attempt to generate all combinations of variables ranging
from ..."

I believe you forgot something here?
Please finish the thought.

You could use MATRIX with nested loops to create a file of variable names
and then Python to generate macro calls.
OTOH I don't know what you actually want to do.



-----
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Re: Method to Select Specified Number of Variables from Dataset by Measure

Art Kendall
What is the purpose of the exercise? WHY?

Do you just want to test some syntax that you will later use on all of the
variables?

Do you want a random sample?



-----
Art Kendall
Social Research Consultants
Re: Method to Select Specified Number of Variables from Dataset by Measure

standwu91
The reason I need this is to run tests on all combinations so that I
can find the best model specification to use.

While I have my design for the model, I just have a problem gathering
all variable combinations through a looped extraction method.
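
Before looping over every combination, it may help to see how many there are. The following is a minimal sketch in plain Python (math.comb needs Python 3.8 or later); the function name count_subsets and the example variable counts are assumptions for illustration only.

```python
from math import comb


def count_subsets(n_vars, min_size=3):
    """Count all subsets of n_vars variables that have at least min_size members."""
    return sum(comb(n_vars, k) for k in range(min_size, n_vars + 1))


# Hypothetical data set sizes; the count roughly doubles with each added variable.
for n in (10, 20, 30):
    print(n, count_subsets(n))
```

With more than a couple of dozen variables, the number of models to fit runs into the millions, which is one practical argument for restricting the subset sizes.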





Re: Method to Select Specified Number of Variables from Dataset by Measure

Bruce Weaver
Administrator
Do you mean that you want to use all-possible-subsets (aka best-subsets)
regression?

If so, see these slides:

http://dev1.education.uconn.edu/m3c/assets/File/Automatic%20Linear%20Modeling%20-%20Yang%281%29.pdf

I now feel as if I've handed you a loaded gun.  Therefore, I URGE you to
also see Frank Harrell's comment about all-possible-subsets regression in
this Stata FAQ on stepwise regression (with stepwise broadly defined):

 
https://www.stata.com/support/faqs/statistics/stepwise-regression-problems/

Here is Frank's final comment:  "All possible subsets" regression solves
none of these problems.

Re the issue of "phantom degrees of freedom" and over-fitting, see Mike
Babyak's nice article.  

  https://www.cs.vu.nl/~eliens/sg/local/theory/overfitting.pdf

HTH.



--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
Re: Method to Select Specified Number of Variables from Dataset by Measure

Art Kendall
As Bruce pointed out, all-possible-subsets regression is a highly dubious
procedure.

Perhaps if you describe the overall project and the questions you are
trying to answer in more detail, list members will be able to suggest less
questionable approaches.

What are your dependent variables? Are they continuous variables?

What are the predictor variables? Are they continuous? Interval-ish? Are
they some sort of repeated measure: Likert or other summative scale items,
time points, conditions, etc.? Are there logical subsets of variables?

What are the cases?
Are they a random sample, a convenient bunch of cases, stratified, etc.?








Re: Method to Select Specified Number of Variables from Dataset by Measure

standwu91
In reply to this post by Bruce Weaver
Thank you for the information.

I think I should clarify that I am not doing regression modeling. I am doing
this for clustering to determine the best clustering model.



Re: Method to Select Specified Number of Variables from Dataset by Measure

David Marso
Administrator
Here is a very simple macro for generating ALL triplets of variable names
from the active file:

DEFINE !GetTriplets ()
MATRIX.
GET Data/FILE=*/NAMES=vars.
LOOP #=1 TO NCOL(Data)-2.
LOOP ##=#+1 TO NCOL(Data)-1.
LOOP ###=##+1 TO NCOL(Data).
SAVE {Vars(#),Vars(##),Vars(###)} /OUTFILE * /VARIABLES V1 V2 V3 /STRINGS V1 V2 V3.
END LOOP.
END LOOP.
END LOOP.
END MATRIX.
!ENDDEFINE.
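
For comparison, the same enumeration can be sketched in Python with itertools.combinations, which also extends from triplets to every size from 3 up to n. This is only a sketch under stated assumptions: the function name variable_subsets and the variable names v1 to v5 are hypothetical, and inside SPSS the names would instead come from something like spss.GetVariableName.

```python
from itertools import combinations


def variable_subsets(names, min_size=3, max_size=None):
    """Yield every combination of names with min_size to max_size members."""
    if max_size is None:
        max_size = len(names)
    for size in range(min_size, max_size + 1):
        for combo in combinations(names, size):
            yield combo


# Hypothetical variable names standing in for the active file's variables.
names = ["v1", "v2", "v3", "v4", "v5"]
triplets = list(variable_subsets(names, 3, 3))   # triplets only
all_subsets = list(variable_subsets(names))      # sizes 3 through 5
print(len(triplets), len(all_subsets))
```

Because the function is a generator, each combination can be processed one at a time (for example, joined into a variable list for a macro) without holding the full set in memory.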


Re: Method to Select Specified Number of Variables from Dataset by Measure

Jon Peck
In reply to this post by standwu91
You might look at TwoStep Cluster, which includes a display of predictor importance, rather than running all possible cluster models separately. Deciding which cluster model is the best is an art, not a science. You might find the STATS CLUS SIL extension command (Analyze > Classify > Cluster Silhouettes) helpful in assessing the clusters.

Re: Method to Select Specified Number of Variables from Dataset by Measure

Art Kendall
In reply to this post by standwu91
Are you clustering variables or cases?

What are your criteria for best clustering model?

If you are clustering cases, depending on the nature of your data you MAY
want to create composite scores that hold the shared variance from your
variables.

All clustering methods are heuristics.  In the 70s while at the Statistics
Research Division of the US Census Bureau I developed an approach that finds
clusters based on using several clustering methods and similarity
coefficients.

Although the early-year archives of this discussion list have disappeared,
the archives since 1996 are still available. Search for "core clusters" in
the archives for this list.

What would you use a clustering solution for?

Again, a presentation of what you are trying to achieve, and of the data you
have to support that effort, would make it more likely that you get useful
suggestions.




