Wildcard

Wildcard

Jo Gulstad
Hello all,
Is there a "wildcard" option when selecting cases, such as "609.*" as in Access? That is, all statutes 609.xxxx will be selected?
Thank you!
 
 
Jo Gulstad
Research Analyst
Minnesota Department of Corrections
1450 Energy Park Drive, Suite 200
St Paul, MN 55108
651.361.7383


Re: Wildcard

ViAnn Beadle
Putting aside for the moment the things that you can do with Python, there
is no notion of a wildcard in SPSS transformations per se. Note, however,
that you can select cases using the INDEX function, which returns the
position of the first occurrence of a substring (0 if it is not found).
Here's a simple example:

COMPUTE var609=INDEX(stringvar, '609.').
FILTER by var609.

The FILTER command selects all cases for which the BY variable equals 1 or
greater (INDEX returns 0 when the substring is not found). If "609." can
occur somewhere other than the start of the string and you only want cases
where the match begins at position 1, then you'll have to take an extra step
and recode your filter variable to values 1 and 0, as in
RECODE var609 (1=1) (ELSE=0).
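
A minimal one-step sketch of that "starts with" case (stringvar and starts609
are illustrative names, not anything in your data) would be:

* Flag cases whose statute code begins with '609.'.
COMPUTE starts609 = (INDEX(stringvar, '609.') EQ 1).
FILTER BY starts609.
* Run whatever procedures you need, then turn the filter off.
FILTER OFF.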

Python provides much more powerful pattern matching but that isn't required
for this example.

Python: exporting a file to its string equivalent

Albert-Jan Roskam
Hi listers,

I am trying to learn Python and to practice I thought
it would be nice to write a program that converts all
numeric variables to string. This might come in handy when I work with
disease classification codes and export them to xls. I've noticed several
times that Excel has the annoying habit of 'abbreviating', e.g., code 00001
to 1, which could have a completely different meaning!

For now I want to convert F and N type variables to
their string equivalents. Maybe the code below is
overly complicated already, but I just wanted to try
the various Python commands.

I have some questions about the code I wrote:

Ad ## 1 ## Not all variables are converted to string; it's as if the
IF-ELIF jumps to the next step too soon. How come?

Ad ## 2 ## Minor nuisance: how can the list of
variables be numbered? I thought "enumerate" was the
way to do it, but maybe not in this case?

Ad ## 3 ## How does string replacement work in this
case? Why are the double backslashes needed? This
complicates things.


Could you help me with this? Any other general remarks?

Thanks in advance!

Albert-Jan

*******************.
** sample file
input program.
set seed = 12262007.
+  loop #i=1 to 100.
+     numeric n1 n2 n3 (n8.0).
+     compute f = abs(rnd(rv.normal(10,5))).
+     compute n1 = abs(rnd(rv.normal(10,5))).
+     compute n2 = abs(rnd(rv.normal(5,5))).
+     compute n3 = abs(rnd(rv.normal(0,5))).
+     end case.
+  end loop.
+ end file.
end input program.
exe.
save outfile = 'd:\temp\testit.sav'.

*******************.
** Actual program.

BEGIN PROGRAM.
import spss, os
def Num2String(file):
        """ Convert all numeric values of an SPSS sav file to string
        and write entire file to num2string.xls. Handy for e.g. disease codes
        because code '00001' would not be converted to '1' by Excel. """
        try:
                # open file and build tuple of numeric vars.
                spss.Submit("get file = %(myfile)s ." % {'myfile': file})
                numericvars = []
                for i in range(spss.GetVariableCount()):
                        if spss.GetVariableType(i) == 0:
                                numericvars.append(spss.GetVariableName(i))
                for j in range(len(numericvars)):
                        k = numericvars[j]
                        # convert F vars to string.
                        if spss.GetVariableFormat(j).find("F") == 0:       ## 1 ##
                                spss.Submit(r"""
                                string temp (a10).
                                compute temp = string(%(k)s,f8).
                                exe.
                                delete variables %(k)s.
                                rename variables (temp = %(k)s).
                                """ % locals())
                        # convert N vars to string.
                        elif spss.GetVariableFormat(j).find("N") == 0:
                                spss.Submit(r"""
                                string temp (a10).
                                compute temp = string(%(k)s,n8).
                                exe.
                                delete variables %(k)s.
                                rename variables (temp = %(k)s).
                                """ % locals())
                # copy variable labels to 'stringed' file and export to xls.
                spss.Submit("apply dictionary from %(myfile)s / varinfo varlabel." % {'myfile': file})
                spss.Submit(r"""save translate / outfile = 'd:\temp\num2string.xls'
                / type = xls / version = 8 / fieldnames / replace.""")
                print "The following numeric vars were converted to string format: "
                for x, y in enumerate(range(len(numericvars))):
                        print y, "\n".join(numericvars)  ## 2 ##
                        break
        # print something if file extension is not sav, and if file or path does not exist.
        except:
                if file[-4:-1] != 'sav':
                        print "File extension is " + str.upper(file[-4:-1])
                        "\n This is probably not an SPSS sav file."
                else:
                        if os.path.exists('d:\\temp\\employee data.sav') == False:      ## 3 ##
                                print r"Input path and/or file do not exist. Try again, dude."
# Num2String(r"'c:\program files\spss\employee data.sav'")
Num2String(r"'d:\temp\testit.sav'")
END PROGRAM.




Cheers!
Albert-Jan

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Did you know that 87.166253% of all statistics claim a precision of results that is not justified by the method employed? [HELMUT RICHTER]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~



Cherrypicking & publication bias

Albert-Jan Roskam
Hi again listers,

Another, completely different post this time. I was
wondering if you could recommend some texts about
"cherrypicking" and "publication bias". Cherry picking
may be defined as "the act of pointing at individual
cases or data that seem to confirm a particular
position, while ignoring a significant portion of
related cases or data that may contradict that
position". Publication bias refers to the "tendency
for researchers and editors to handle experimental
results that are positive (they found something)
differently from results that are negative (found that
something did not happen) or inconclusive."

In a time when the number of publications sometimes appears to be more
important than the actual *contents* of those papers, and where the
researcher's daily bread depends so heavily on how often s/he has a paper
accepted, those two phenomena may (in my view) become a serious threat to
science. The result would be a 'polished' version of reality, especially in
meta-analyses. I was wondering if any of you could recommend some reading
materials, quantifications, etc. (published or unpublished! ;-) about this.
I am aware of some papers, I believe in the Lancet and BMJ, that did not
find evidence for publication bias. One disclosure, however: these were
written by the editors of those journals!

Thanks in advance and merry x-mas!

Albert-Jan




Cheers!
Albert-Jan


Re: Cherrypicking & publication bias

Anthony Babinec
While not a central theme of the book,
"cherry-picking" comes up in Rex Kline's
book "Beyond Significance Testing." His
review of the misconceptions and misuses
surrounding p-values is very good.

Anthony Babinec
[hidden email]


Re: Wildcard

Richard Ristow
In reply to this post by Jo Gulstad
At 05:08 PM 12/20/2007, Jo Gulstad wrote:

>Is there a "wildcard" option when selecting cases, such as "609.*"
>as in Access? That is, all statutes 609.xxxx will be selected?

As ViAnn Beadle wrote,

>COMPUTE var609=INDEX(stringvar, '609.').
>FILTER by var609.
>
>The FILTER command selects all cases for which the BY variable
>equals 1 or greater.

Right. And you can get many other effects as well. I'm just giving the
tests; use them in SELECT IF commands, or to set filter variables that
you can use with FILTER commands. Like this:

Filtering:
COMPUTE   FilterV = <test>.
FILTER BY FilterV.

Select the cases you want, and *PERMANENTLY DELETE THE OTHERS*:
SELECT IF <test>.

Select the cases you want *FOR THE NEXT PROCEDURE ONLY*:
TEMPORARY.
SELECT IF <test>.
.................
Here are some tests you could use. (These are expressions, not
complete statements; insert them where "<test>" occurs, above):

The string '609.' occurs anywhere in your string (per ViAnn Beadle):
INDEX(string,'609.') GT 0

The string '609.1' occurs at the beginning of the string:
SUBSTR(string,1,5) EQ '609.1'

The string '609.1' starts at the first non-blank character in the string:
SUBSTR(LTRIM(string),1,5) EQ '609.1'
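
Putting one of these together with TEMPORARY, a minimal sketch might look
like this (here string is a placeholder for your statute variable, and
FREQUENCIES stands in for whatever procedure you actually want to run):

TEMPORARY.
SELECT IF SUBSTR(LTRIM(string),1,4) EQ '609.'.
FREQUENCIES VARIABLES=string.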


Re: Cherrypicking & publication bias

Albert-Jan Roskam
In reply to this post by Anthony Babinec
Hi all,

Thanks to everybody who responded to my mail!

Somebody sent me a very interesting article off-list. It's in PLoS
Medicine (open access):

Ioannidis JP (2005). Why most published research findings are false.
PLoS Med 2(8): e124. Epub 2005 Aug 30.

http://www.ncbi.nlm.nih.gov/sites/entrez?Db=pubmed&Cmd=ShowDetailView&TermToSearch=16060722&ordinalpos=4&itool=EntrezSystem2.PEntrez.Pubmed.Pubmed_ResultsPanel.Pubmed_RVDocSum

If that link is too l-o-n-g, try:
http://tinyurl.com/2te7ub

** PubMed Abstract:
There is increasing concern that most current
published research findings are false. The probability
that a research claim is true may depend on study
power and bias, the number of other studies on the
same question, and, importantly, the ratio of true to
no relationships among the relationships probed in
each scientific field. In this framework, a research
finding is less likely to be true when the studies
conducted in a field are smaller; when effect sizes
are smaller; when there is a greater number and lesser
preselection of tested relationships; where there is
greater flexibility in designs, definitions, outcomes,
and analytical modes; when there is greater financial
and other interest and prejudice; and when more teams
are involved in a scientific field in chase of
statistical significance. Simulations show that for
most study designs and settings, it is more likely for
a research claim to be false than true. Moreover, for
many current scientific fields, claimed research
findings may often be simply accurate measures of the
prevailing bias. In this essay, I discuss the
implications of these problems for the conduct and
interpretation of research.

PMID: 16060722 [PubMed - indexed for MEDLINE]


Cheers!!
Albert-Jan

