repeating analyses, excluding one case at a time

nina
Hello

I'd like to check the sensitivity of my analyses with regard to single cases. For example, if I were to conduct a simple t-test with 100 cases, which syntax would you recommend to conduct the test for 99 cases with one case excluded, then again with 99 cases but now a different case excluded, and so on?

Best
Nina

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: repeating analyses, excluding one case at a time

Jon Peck
In the case of a t test, the easiest way to get a compact representation of all the leave-one-out results is to formulate the problem as a regression (the group variable is the independent variable) and save the DfBeta variable and perhaps other similar variables.  You can then look at a summary of that variable.

  • DfBeta(s): The difference in beta value is the change in the regression coefficient that results from the exclusion of a particular case. A value is computed for each term in the model, including the constant.

Note also that bootstrapping is already doing something similar in spirit, but it won't show you all the individual results.
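
As a rough illustration of what the saved DfBeta variable represents, here is a minimal brute-force sketch in plain Python (outside SPSS). The data and the names group and salary are made up for the example, and the sign convention (full-sample coefficient minus leave-one-out coefficient) is my assumption about how the change is reported. With a 0/1 group variable the slope is just the difference between the two group means, which is why the regression formulation covers the t-test case.

```python
from statistics import mean

def slope(x, y):
    """OLS slope of y on x (simple regression with an intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def dfbeta_slope(x, y):
    """Change in the slope when each case is dropped in turn.

    Brute force: full-sample slope minus the slope re-estimated
    without case i (an unstandardized change, not a scaled one).
    """
    full = slope(x, y)
    changes = []
    for i in range(len(x)):
        x_without_i = x[:i] + x[i + 1:]
        y_without_i = y[:i] + y[i + 1:]
        changes.append(full - slope(x_without_i, y_without_i))
    return changes

# Hypothetical data: group is a 0/1 dummy, so the slope equals the
# difference between the two group means.
group = [0, 0, 0, 0, 1, 1, 1, 1]
salary = [30.0, 32.0, 31.0, 29.0, 40.0, 41.0, 39.0, 60.0]

changes = dfbeta_slope(group, salary)
# Index of the case whose removal moves the slope the most.
most_influential = max(range(len(changes)), key=lambda i: abs(changes[i]))
```

In SPSS itself the REGRESSION procedure computes these values without refitting the model for every case; the loop here is only meant to make the definition concrete.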

In the general case, doing all the leave-one-out computations is going to generate a lot of output, but you can do it by using a Python loop. For example:

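begin program.
import spss

# Number of cases in the active dataset.
n = spss.GetCaseCount()

for i in range(1, n + 1):
    # Flag every case except case i and filter on that flag.
    spss.Submit("""compute allbut1 = ($casenum ne %s).
filter by allbut1.""" % i)
    # Put whatever procedure you want to repeat here.
    spss.Submit("""
T-TEST GROUPS=minority(0 1)
  /VARIABLES=salary.
""")

# Turn the filter off again so later commands use all cases.
spss.Submit("filter off.")
end program.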
You would put whatever procedure you want where the T-TEST syntax is in the example. The code loops, omitting each case in turn and running the specified syntax. Note that the indentation shown in the example is important.
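
Since the loop produces one block of output per case, it may help to see what a compact summary of the same computation could look like. The following is only a sketch in plain Python, not SPSS output: it assumes an equal-variance (pooled) two-sample t statistic, and the data and the names group_a and group_b are invented for illustration.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(a, b):
    """Equal-variance two-sample t statistic for groups a and b."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))

def leave_one_out_t(a, b):
    """Return (group label, index, t) for every single-case deletion."""
    results = []
    for i in range(len(a)):
        results.append(("a", i, pooled_t(a[:i] + a[i + 1:], b)))
    for j in range(len(b)):
        results.append(("b", j, pooled_t(a, b[:j] + b[j + 1:])))
    return results

# Hypothetical data with one unusually large value in group_a.
group_a = [40.0, 41.0, 39.0, 60.0]
group_b = [30.0, 32.0, 31.0, 29.0]

results = leave_one_out_t(group_a, group_b)
t_values = [t for _, _, t in results]
t_range = (min(t_values), max(t_values))
# The deletion that changes the picture most shows up at the extremes.
largest = max(results, key=lambda r: r[2])
```

Looking at the range of the leave-one-out t values, and at which case produces the extremes, gives the same information as scanning all the individual T-TEST tables.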

On Sun, May 21, 2017 at 4:25 PM, Nina Lasek <[hidden email]> wrote:
Hello

I'd like to check the sensitivity of my analyses with regard to single cases. For example, if I were to conduct a simple t-test with 100 cases, which syntax would you recommend to conduct the test for 99 cases with one case excluded, then again with 99 cases but now a different case excluded, and so on?

Best
Nina

--
Jon K Peck
[hidden email]


Re: repeating analyses, excluding one case at a time

Rich Ulrich
In reply to this post by nina

A better idea for checking the possible influence of one or more outliers on a t-test is to start with FREQ to get all the z-scores and see how big the largest (absolute) z's are. Then, maybe, drop those cases. (See how much of the total variance they account for, if you want a guide to "over-influence".)


The biggest z's show which cases are going to have the biggest effect on both the mean and the variance. And you can see if there is more than one.
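
To make the z-score check concrete, here is a small sketch in plain Python under my own assumptions: the data are invented, and "share of the total variance" is read as each case's share of the total sum of squared deviations from the mean.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize values using the sample mean and sample SD."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def share_of_sum_of_squares(values):
    """Fraction of the total sum of squared deviations due to each case."""
    m = mean(values)
    squared = [(v - m) ** 2 for v in values]
    total = sum(squared)
    return [d / total for d in squared]

# Hypothetical salaries with one unusually large value.
salary = [30.0, 32.0, 31.0, 29.0, 40.0, 41.0, 39.0, 60.0]

z = z_scores(salary)
shares = share_of_sum_of_squares(salary)
# Case with the largest absolute z-score.
biggest = max(range(len(z)), key=lambda i: abs(z[i]))
```

In this made-up example a single case accounts for well over half of the total sum of squares, which is the kind of "over-influence" the check is meant to reveal.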


Tukey's formal jack-knife procedure is a bit more complicated than a simple "leave-one-out"; it has an extra computation at each step to estimate a corrected-t. I do not find even the jack-knife interesting for a t-test, but if I were going to do the tedious part of the work, I think I would generate the statistic that /some/ people might appreciate.
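
For readers who have not seen the jack-knife, the following is a minimal one-sample sketch in plain Python. I am assuming that the "extra computation at each step" refers to the usual pseudo-values, n times the full-sample statistic minus (n - 1) times the leave-one-out statistic; a two-sample version for a t-test would need more care than is shown here. The function name and data are made up.

```python
from math import sqrt
from statistics import mean, stdev

def jackknife(values, statistic):
    """Jack-knife estimate and standard error via pseudo-values."""
    n = len(values)
    full = statistic(values)
    pseudo = []
    for i in range(n):
        # Statistic recomputed with case i left out.
        left_out = statistic(values[:i] + values[i + 1:])
        # Pseudo-value for case i.
        pseudo.append(n * full - (n - 1) * left_out)
    estimate = mean(pseudo)
    std_error = stdev(pseudo) / sqrt(n)
    return estimate, std_error, pseudo

# Hypothetical data.
data = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]
estimate, std_error, pseudo = jackknife(data, mean)
```

For the mean, the pseudo-values are just the original observations, so the jack-knife standard error reduces to the ordinary standard error of the mean. The procedure only becomes informative for statistics that are not simple averages.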


--

Rich Ulrich




Re: repeating analyses, excluding one case at a time

Art Kendall
In addition to z-scores, use EXPLORE and check the data entry for what it labels "outliers". YMMV, but think of outliers as values that are suspicious and so should be checked.

I suggest using the procedures for checking anomalous values (EXPLORE, z-scores, visualizations, etc.) as part of data preparation and validation, before any actual tests are applied to the data.
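
As a rough idea of what a boxplot-style outlier flag does, here is a sketch in plain Python. As far as I know, EXPLORE's boxplot marks values more than 1.5 box lengths beyond the box as outliers and more than 3 box lengths as extremes; the sketch assumes those cutoffs, and it uses the quartiles from Python's statistics module, which need not match the hinges SPSS computes. The function name and data are invented.

```python
from statistics import quantiles

def flag_outliers(values, inner=1.5, outer=3.0):
    """Flag values lying far outside the interquartile box."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    box_length = q3 - q1
    flagged = []
    for index, v in enumerate(values):
        # Distance beyond the box (0 if the value is inside it).
        if v < q1:
            distance = q1 - v
        elif v > q3:
            distance = v - q3
        else:
            distance = 0.0
        if distance > outer * box_length:
            flagged.append((index, v, "extreme"))
        elif distance > inner * box_length:
            flagged.append((index, v, "outlier"))
    return flagged

# Hypothetical salaries with one unusually large value.
salary = [30.0, 32.0, 31.0, 29.0, 40.0, 41.0, 39.0, 60.0]
flagged = flag_outliers(salary)
```

A flagged value is a prompt to go back and check the data entry for that case, not a reason to drop it automatically.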

P.S. Did you double-enter or proofread your data?
Art Kendall
Social Research Consultants

Re: repeating analyses, excluding one case at a time

Empi
In reply to this post by Jon Peck
Dear Dr. Peck,

Checking the SPSSX-L archives, I fortunately found your code for
checking potentially influential cases by excluding them one at a time.
This is exactly what I need for a simple two-level regression using
MIXED in SPSS 25, with 20,000 respondents nested in 30 countries. I would
like to exclude one country per run (to check for the influence of single
countries), but I receive an error message. Could it be that SPSS 25 can't
read the code you developed in 2017? And, if time permits, could that code
be adapted? (I think the only difference might be to use a country-level
identifier instead of a person-level identifier.)

Best,
E.





Re: repeating analyses, excluding one case at a time

Jon Peck
I would need to know what the error message said and see the code you are running in order to provide any help on this.  Also, which procedure you are running.

I don't think there would be a version incompatibility issue.

You could send those materials directly to my email ([hidden email]).
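
In the meantime, here is a sketch of how the earlier loop might be adapted to drop one country at a time. It is not a tested solution for the poster's data: the helper leave_one_group_out, the variable names outcome, predictor and country, the numeric country codes 1 to 30, and the MIXED specification are all assumptions for illustration. The helper only builds the syntax text, so it can be inspected before anything is run.

```python
def leave_one_group_out(group_var, group_values, procedure_syntax):
    """Build one block of SPSS syntax per excluded group.

    TEMPORARY limits the SELECT IF to the next procedure, so the
    active dataset itself is never changed.
    """
    blocks = []
    for value in group_values:
        blocks.append(
            "TITLE 'Excluding %s = %s'.\n"
            "TEMPORARY.\n"
            "SELECT IF (%s NE %s).\n"
            "%s" % (group_var, value, group_var, value,
                    procedure_syntax.strip())
        )
    return blocks

# Hypothetical two-level model; the variable names are placeholders.
mixed_syntax = """
MIXED outcome WITH predictor
  /FIXED=predictor
  /RANDOM=INTERCEPT | SUBJECT(country) COVTYPE(VC)
  /PRINT=SOLUTION.
"""

# Assumes countries are coded as the numbers 1 to 30.
blocks = leave_one_group_out("country", range(1, 31), mixed_syntax)

# Inside SPSS the blocks could then be run with something like:
#   begin program.
#   import spss
#   for block in blocks:
#       spss.Submit(block)
#   end program.
```

Selecting on the country code rather than on $casenum is the main change from the case-level loop. If the country identifier is a string variable, the values in the SELECT IF would need to be quoted.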

On Sat, Jun 20, 2020 at 12:08 AM Empi <[hidden email]> wrote:
Dear Dr. Peck,

Checking the SPSSX-L archives, I fortunately found your code for
checking potentially influential cases by excluding them one at a time.
This is exactly what I need for a simple two-level regression using
MIXED in SPSS 25, with 20,000 respondents nested in 30 countries. I would
like to exclude one country per run (to check for the influence of single
countries), but I receive an error message. Could it be that SPSS 25 can't
read the code you developed in 2017? And, if time permits, could that code
be adapted? (I think the only difference might be to use a country-level
identifier instead of a person-level identifier.)

Best,
E.






--
Jon K Peck
[hidden email]


Re: repeating analyses, excluding one case at a time

Bruce Weaver
Administrator
Jon, just a request that you post your solution here so that all who are
interested (not just the OP) can benefit from it.  (You were likely going to
do that anyway, but better safe than sorry.)  

Cheers,
Bruce



Jon Peck wrote
> I would need to know what the error message said and see the code you are
> running in order to provide any help on this.  Also, which procedure you
> are running.
>
> I don't think there would be a version incompatibility issue.
>
> You could send those materials directly to my email (jkpeck@).






--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).