Hi,
I have SPSS 23 on Windows 7, without the Bootstrapping module installed. I have looked through various forums, and I am confused about how to bootstrap the interquartile range without that module.
Thanks!
Bryan Mac |
Administrator
This thread from a few years ago may give you some ideas.
http://spssx-discussion.1045642.n5.nabble.com/Sampling-WITH-replacement-td5618318.html HTH.
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
Administrator
Yes. Just generate the samples using the MATRIX code (the beginning part), with a SAVE before the END LOOP. Then use OMS with FREQ (under SPLIT FILE) with PERCENTILES 25 75, and mop up the mess from OMS.
------------------------
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Is this the syntax for what you are suggesting? I'm new to creating syntaxes in SPSS.
LOOP CASEID=1 TO N.
COMPUTE SAMPLES(CASEID)=DATA(RARRAY(CASEID)).
SAVE
OMS
FREQ 25 75
SPLIT FILE
OMSEND
END LOOP. |
Administrator
You need the top part too. Read up on the MATRIX command. Also look at OMS: it just captures output, and FREQ is what generates the percentiles. Procedures can't go inside loops. Long weekend for you.
On Sun, Sep 4, 2016 at 3:08 AM, Bryan Mac [via SPSSX Discussion] <[hidden email]> wrote:
> Is this the syntax for what you are suggesting? I'm new to creating syntaxes in SPSS.
In reply to this post by Bryan Mac
Although "bootstrap" has technical meanings, the term is often used loosely in practice.
Please explain what you are trying to accomplish, and what the term "bootstrap" means to you. An interquartile range is often considered a rather robust descriptive statistic. If you follow up on the suggestions made on this list, please report back how much difference the bootstrapping makes.
Art Kendall
Social Research Consultants |
I want to do case resampling and estimate the distribution of the sample mean. From my understanding, bootstrapping means estimating the mean of a large sample. In general, is the bootstrapped mean larger than the non-bootstrapped mean? Each time I increased the number of samples (i.e., 100, 200, etc.), the mean kept getting larger. I thought the bootstrapped mean would be close to the non-bootstrapped mean.
Also, here is the syntax with the suggestions included. However, I am not getting the interquartile range.
PRESERVE.
DEFINE BOOT (VAR !TOKENS(1) / NSAMP !TOKENS(1)/SPRINT !TOKENS(1) !DEFAULT(F) ).
PRESERVE.
SET MXLOOPS=200000.
*Replace NAR with Desired Variable Name*.
EXAMINE VARIABLES=NAR
  /PLOT NONE
  /STATISTICS DESCRIPTIVES
  /CINTERVAL 95
  /MISSING LISTWISE
  /NOTOTAL.
MATRIX.
GET DATA / VARIABLES !VAR / FILE *.
*N is the number of cases*.
COMPUTE N=NROW(DATA).
*Determine ranks of Median case(s)*.
COMPUTE CRIT={(N/2)+.5, (N/2)+.5 }.
DO IF N/2=TRUNC(N/2).
+ COMPUTE CRIT=CRIT + {-1.5,0.5}.
END IF.
COMPUTE Stats=MAKE(!NSAMP,3,0).
COMPUTE SAMPLES=MAKE(N,1,0).
LOOP SAMPLE=1 TO !NSAMP.
* Construct array of random Indexes (Data pointers).
COMPUTE RArray=TRUNC(UNIFORM(N,1)*N +1).
LOOP CASEID=1 TO N.
COMPUTE SAMPLES(CASEID)=DATA(RARRAY(CASEID)).
END LOOP.
OMS
  /SELECT TABLES
  /IF COMMANDS=['Frequencies'] SUBTYPES=['Frequencies'] LABELS=["Bootstrap"]
  /DESTINATION FORMAT=SAV NUMBERED=TableNumber_ OUTFILE='Bootstrap' VIEWER=YES.
FREQUENCIES VARIABLES=NAR
  /FORMAT=NOTABLE
  /PERCENTILES=25.0 75.0
  /STATISTICS=STDDEV VARIANCE RANGE MINIMUM MAXIMUM MEAN MEDIAN SKEWNESS SESKEW KURTOSIS SEKURT
  /ORDER=ANALYSIS.
omsend tag = ['Bootstrap'].
** Calculate Median **.
COMPUTE MEDSTAT=GRADE(SAMPLES).
COMPUTE Stats(SAMPLE,3) =0.
LOOP I=1 TO N.
LOOP J=1 TO 2.
DO IF MEDSTAT(I)=CRIT(J).
COMPUTE Stats(SAMPLE,3) =Stats(SAMPLE,3)+SAMPLES(I).
END IF.
END LOOP.
END LOOP.
COMPUTE Stats(SAMPLE,3)=Stats(SAMPLE,3)/2.
*Replace NAR with Desired Variable Name*.
* Generate Sum(NAR) and SUM(NAR**2).
COMPUTE Stats(SAMPLE,1)=CSUM(Samples)/N.
COMPUTE Stats(SAMPLE,2)=T(Samples)*Samples).
END LOOP.
* Calculate StdDev *.
COMPUTE Stats(:,2)=SQRT((Stats(:,2)-N*Stats(:,1))/(N-1)).
!IF (!SPRINT !EQ T) !THEN
PRINT Stats /TITLE "Individual Bootstrapped Sample Statistics" /CLABELS "Mean","SD","Median".
!IFEND
* Calculate Averages of Bootstrapped statistics *.
PRINT (CSUM(Stats)/!NSAMP) /TITLE="Averaged Bootstrapped Statistics" /CLABELS "Mean","StdDev","Median".
COMPUTE HCI=(CSUM(Stats)/!NSAMP) + 1.96/SQRT(N).
COMPUTE LCI=(CSUM(Stats)/!NSAMP) - 1.96/SQRT(N).
* Calculate 95% Confidence Interval of Bootstrapped statistics *.
PRINT LCI /TITLE="Lower Bound for 95% Confidence Interval of Bootstrapped Statistics" /CLABELS "Mean","StdDev","Median".
PRINT HCI /TITLE="Higher Bound for 95% Confidence Interval of Bootstrapped Statistics" /CLABELS "Mean","StdDev","Median".
END MATRIX.
!ENDDEFINE.
*Replace 100 with the desired sample*.
BOOT Var=NAR NSAMP=100 SPRINT=T.
RESTORE. |
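For reference, the textbook identities behind the "Calculate StdDev" step and the averaged bootstrap mean can be written as below. This is a sketch of the standard formulas, not something stated by anyone in the thread; the notation is introduced here for illustration, with x*_1, ..., x*_N denoting one bootstrap sample drawn with replacement from the N observed values and x-bar the mean of the original sample.

```latex
\[
\bar{x}^{*} = \frac{1}{N}\sum_{i=1}^{N} x^{*}_{i},
\qquad
s^{*} = \sqrt{\frac{\sum_{i=1}^{N} \left(x^{*}_{i}\right)^{2} - N\left(\bar{x}^{*}\right)^{2}}{N-1}},
\qquad
\mathrm{E}^{*}\!\left[\bar{x}^{*}\right] = \bar{x}.
\]
```

The standard deviation identity squares the mean before multiplying by N, which the posted `Stats(:,2)` line does not appear to do. The last identity says the bootstrap means are centred on the original sample mean, so an average that keeps growing as NSAMP increases would more likely point to a problem in the syntax than to a property of bootstrapping.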
Administrator
You cannot embed OMS/FREQ within the MATRIX ... END MATRIX block!
I wrote the base code you are using before SPSS had OMS. Your best bet is to create the samples in MATRIX (see the SAVE command) and then use SPLIT FILE with OMS and FREQ to get the biz done. That is the best suggestion I can offer, and it will work like a charm. Alternatively, shell out $P$$ cash for the Bootstrapping module (a waste, IMN$HO, for something this basic).
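A minimal sketch of the sample-generation step, assuming the active dataset holds the numeric variable NAR used elsewhere in this thread. The output file name boot_samples.sav, the saved variable name SAMPLE, the seed, and the NSAMP value are all placeholders chosen for illustration, and the syntax has not been run against the poster's data.

```spss
* Sketch only: boot_samples.sav, SAMPLE, the seed and NSAMP are placeholders.
* Each pass of the outer loop draws N row indexes with replacement.
* The selected values go into column 2 of BOOT, which SAVE appends to the file.
* MXLOOPS is raised as in the macro posted earlier in this thread.
SET SEED=20160904.
SET MXLOOPS=200000.
MATRIX.
GET DATA /VARIABLES=NAR /FILE=* /MISSING=OMIT /SYSMIS=OMIT.
COMPUTE N=NROW(DATA).
COMPUTE NSAMP=1000.
LOOP S=1 TO NSAMP.
COMPUTE IDX=TRUNC(UNIFORM(N,1)*N)+1.
COMPUTE BOOT=MAKE(N,2,S).
LOOP I=1 TO N.
COMPUTE BOOT(I,2)=DATA(IDX(I)).
END LOOP.
SAVE BOOT /OUTFILE='boot_samples.sav' /VARIABLES=SAMPLE,NAR.
END LOOP.
END MATRIX.
```

The result should be one long file with NSAMP blocks of N cases each, where SAMPLE identifies the replicate. No procedure runs inside the MATRIX block; the percentiles are computed afterwards on the saved file.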
Administrator
Very briefly, I have a swarm of alligators messing with me right now (ambiguous non delimited text data).
GET your data.
MATRIX.
GET your data...
LOOP generate ONE bootstrap sample...
SAVE...
END LOOP...
END MATRIX.
SPLIT FILE BY sample.
OMS ....
FREQ...
OMSEND.
parse the OMS, calculate, aggregate or whatever....
DONE!
Have fun. Read back a few msgs and you will find this same advice I posted about a month ago.
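A hedged sketch of the second half of that outline, assuming a file of stacked bootstrap samples already exists with a replicate identifier SAMPLE and the data variable NAR. The file names boot_samples.sav and boot_quartiles.sav and the OMS tag bootq are placeholders, not names from the thread.

```spss
* Sketch only: file names and the OMS tag are placeholders.
* FREQUENCIES runs once per replicate because of SPLIT FILE.
* OMS captures the Statistics tables into a data file.
GET FILE='boot_samples.sav'.
SORT CASES BY SAMPLE.
SPLIT FILE BY SAMPLE.
OMS
  /SELECT TABLES
  /IF COMMANDS=['Frequencies'] SUBTYPES=['Statistics']
  /DESTINATION FORMAT=SAV OUTFILE='boot_quartiles.sav' VIEWER=NO
  /TAG='bootq'.
FREQUENCIES VARIABLES=NAR
  /FORMAT=NOTABLE
  /PERCENTILES=25 75.
OMSEND TAG=['bootq'].
SPLIT FILE OFF.
```

The layout of the captured file depends on how SPSS structures the Statistics table, so it is worth opening boot_quartiles.sav before writing any parsing syntax. Each replicate's interquartile range is then its 75th percentile minus its 25th, and the spread of those values across replicates is the bootstrap distribution of the IQR.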
In reply to this post by Bryan Mac
What is the population you are sampling from?
Is it an actual population or an abstract population? What do you know or anticipate the shape of the population distribution to be? How is (are) the sample(s) drawn? What do visualizations of the sample(s) show? Is this an exercise to develop your understanding of the central limit theorem, or an application? Bootstrapping and similar methods are often used to get a handle on how well the sample mean approximates the population mean.
Art Kendall
Social Research Consultants |
If the user's time has any value, purchasing the bootstrap option is by far the cheapest option here. The bootstrapping code would look like this:
BOOTSTRAP
  /SAMPLING METHOD=SIMPLE
  /VARIABLES TARGET=salary
  /CRITERIA CILEVEL=95 CITYPE=PERCENTILE NSAMPLES=1000
  /MISSING USERMISSING=EXCLUDE.
EXAMINE VARIABLES=salary
  /PLOT NONE
  /STATISTICS DESCRIPTIVES
  /NOTOTAL.
I don't know the current cost of that option, but it is one of the less expensive ones, and it would then be available for other statistics, too. I'm not trying to sell anything, but the economics seem obvious.
On Wed, Sep 14, 2016 at 7:39 AM, Art Kendall <[hidden email]> wrote:
> What is the population you are sampling from? |