Hi all,
I managed to put together code (1) below. It's essentially a Monte Carlo discriminant function script that also uses OMS for output. It runs a given number of cases through discriminant function analysis, replicates this, say, 100 times, and sends the classification table via OMS to a new .sav file. This works great. However, I need to be able to select, say, 100 random cases from each of four groups. I can in effect do this with code (2). However, I've tried to incorporate this into code (1) and I'm getting a little out of my depth. I would still like to keep the loop and the compute #reps = 100 element (I think), as this in effect gives me the Monte Carlo function. I hope I've made sense. Could anyone help me modify code (1) below so that it stratifies on the group variable, please?

Regards
Andy

To summarise, I need to:
Get the file.
Select x random cases from each group (variable = behaviour_code).
Use these selected cases as the "selection variable" in discriminant function analysis.

Code (1)

get FILE= 'E:\S1.sav'.
set seed = 1234 .
* .
compute ident = $casenum .
save outfile 'c:\temp\bootdata.sav'.
input program .
* .
* .
compute #reps = 100 . /* number of samples desired.
compute #ssize = 250 . /* sample size desired
compute #psize = 10737 . /* size of population
* .
* .
loop samp=1 to #reps .
loop v = 1 to #ssize .
compute ident=trunc(rv.uniform(1,#psize +1) ) .
end case.
leave samp.
end loop.
end loop.
end file.
end input program .
exe .
sort cases by ident.
match files / file * / table 'c:\temp\bootdata.sav' / by ident .
sort cases by samp.
split file by samp.
* .
* Capture statistic of interest .
oms /select tables
 /if commands = ["discriminant"] subtypes = ["Classification Results"]
 /destination format = sav outfile = "c:\temp\results.sav"
 /tag = "results" .
* .
* Procedure of interest goes here .
discriminant
 /GROUPS=behaviour_code(1 4)
 /VARIABLES=hX hY hZ
 /ANALYSIS ALL
 /PRIORS EQUAL
 /STATISTICS=TABLE
 /CLASSIFY=NONMISSING POOLED.
* .
* End OMS .
omsend tag = ["blank"] .
omsend tag = ["results"] .
* Examine results .
get file = 'c:\temp\results.sav' .

Code (2)

COMPUTE tempvar=UNIFORM(10).
SORT CASES BY behaviour_code tempvar (A).
SPLIT FILE BY behaviour_code.
COMPUTE tempvar=1.
CREATE filter=CSUM(tempvar).
RECODE filter (1 thru 100=1) (3 thru highest=0).
SPLIT FILE OFF.
DISCRIMINANT
 /GROUPS=behaviour_code(1 4)
 /VARIABLES=hX hY hZ
 /SELECT=filter(1)
 /ANALYSIS ALL
 /PRIORS EQUAL
 /STATISTICS=TABLE CROSSVALID
 /CLASSIFY=NONMISSING POOLED.
EXECUTE.
|
Hi everyone, if anyone could offer any help, I would be extremely grateful. Kind regards, Andy.
|
Administrator
|
In reply to this post by Andy H
See: http://spssx-discussion.1045642.n5.nabble.com/Re-random-sample-of-cases-by-groups-td4693710.html#a4700558 --
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Thanks very much David. I'll get to work on it and see how I do.
|
In reply to this post by David Marso
In response to David's suggestion, I have been trying for some time to amend the code. I am no expert with syntax and consequently keep hitting a brick wall. Depending on which lines I add or remove, I get differing results, ranging from outright errors to the discriminant analysis using what I think are approximate sizes from each group. I was thinking some of my problems might be to do with weighting, as well as with implementing David's suggestion within the loop of the original code. The code pasted below is only one of the many different ways I have tried. Please forgive my ignorance when it comes to writing and understanding syntax. The original code (which I now realise was written by David some years ago) works well, and I also like the output (when not suppressed). I just need equal group sizes (randomly generated) to be used. I would be most grateful if anyone could suggest some appropriate code for this.

Kind regards
Andy

get FILE='E:\s1.sav'.
COMPUTE SCRAMBLE=UNIFORM(1).
SORT CASES BY Behaviour_code SCRAMBLE.
IF $CASENUM=1 OR (LAG(Behaviour_code) NE Behaviour_code) Counter=1.
IF MISSING(Counter) Counter=LAG(Counter)+1.
COMPUTE Keeper=Behaviour_code.
RECODE Keeper (1=48)(2=100)(3=150)(4=125).
*EXECUTE . /* Probably DON'T need EXE here. If you get odd results then remove *.
SELECT IF (Counter LE Keeper).
compute #reps = 100 . /* number of samples desired.
loop samp=1 to #reps .
end case.
leave samp.
end loop.
end file.
exe .
sort cases by samp.
split file by samp.
oms /select tables
 /if commands = ["discriminant"] subtypes = ["Classification Results"]
 /destination format = sav outfile = "c:\temp\results.sav"
 /tag = "results" .
discriminant
 /GROUPS=behaviour_code(1 4)
 /VARIABLES=hx hy hz
 /ANALYSIS ALL
 /PRIORS EQUAL
 /STATISTICS=TABLE
 /CLASSIFY=NONMISSING POOLED.
* End OMS .
omsend tag=["blank"] .
omsend tag = ["results"] .
* Examine results .
get file = 'c:\temp\results.sav' .
|
Administrator
|
Please see the BOLD stuff and ponder the original thread I cited earlier relative to your specific requirements!!
|
Thanks for taking the time with your suggestions, David. I have amended the RECODE element to RECODE Keeper (1=200)(2=200)(3=200)(4=200). However, my efforts at correcting my errors are proving quite unproductive. I suspect it's something to do with the element below. Any further help or pointers would be much appreciated.

Kind regards
Andy

compute #reps = 100 . /* number of samples desired.
compute #ssize = 200 . /* sample size desired
compute #psize = 10379 . /* size of population
* .
* .
loop samp=1 to #reps .
loop v = 1 to #ssize .
compute ident=trunc(rv.uniform(1,#psize +1) ) .
end case.
leave samp.
end loop.
end loop.
end file.
end input program .
exe .
sort cases by ident.
match files / file * / table 'c:\temp\bootdata.sav' / by ident .
sort cases by samp.
split file by samp.
|
Administrator
|
You need to study the code line by line with reference to the manual to understand it before trying to modify it. You also appear to be conflating two different -unrelated- sets of code neither of which you understand (e.g. what's with the end input program without an input program).
Hate to be curt, but sample code is simply sample code. You need to comprehend it to benefit from it. You mention 'errors' but do not provide the error text or context.
|
I am very grateful for your guidance. I have started to go through the manual as suggested. Please note, though, that much of this syntax world is extremely new to me. I did not come here asking from the start for someone to write the code for me outright. I have been happy to have a go and learn. Please, I'm asking nicely if someone would be so kind as to modify the code I have been using. Following on from the previous message, below is my latest effort with its error messages. I have been trying to solve this since last week. If someone could show me the correct code, this in itself would help accelerate my learning, as I would be able to see and understand where I have been going wrong.

Kind regards
Andy

Error:
47 Discriminant For split file samp=1.00, there is only one non-empty group and 1.000 (1 unweighted) cases that are valid. Not enough non-empty groups. Not enough weighted or unweighted cases. Analyses not executed.
56 OMSEnd Command OMSEND. There is no active OMS destination for tag blank.

COMPUTE tempvar=UNIFORM(10).
SORT CASES BY behaviour_code tempvar (A).
SPLIT FILE BY behaviour_code.
COMPUTE tempvar=1.
CREATE filter=CSUM(tempvar).
RECODE filter (1 thru 100=1) (3 thru highest=0).
SPLIT FILE OFF.
* .
compute ident = $casenum .
save outfile 'c:\temp\bootdata.sav'.
input program .
* .
* .
compute #reps = 100 . /* number of samples desired.
compute #psize = 10379 . /* size of population
* .
* .
loop samp=1 to #reps .
compute ident=trunc(rv.uniform(1,#psize +1) ) .
end case.
leave samp.
end loop.
end file.
end input program .
exe .
sort cases by ident.
match files / file * / table 'c:\temp\bootdata.sav' / by ident .
sort cases by samp.
split file by samp.
* .
* Capture statistic of interest .
oms /select tables
 /if commands = ["discriminant"] subtypes = ["Classification Results"]
 /destination format = sav outfile = "c:\temp\results.sav"
 /tag = "results" .
* .
* Procedure of interest goes here .
discriminant
 /GROUPS=behaviour_code(1 4)
 /VARIABLES=hx hy hz
 /ANALYSIS ALL
 /PRIORS EQUAL
 /STATISTICS=TABLE
 /CLASSIFY=NONMISSING POOLED.
* .
* End OMS .
omsend tag=["blank"] .
omsend tag = ["results"] .
* Examine results .
get file = 'c:\temp\results.sav' .
|
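A possible way to read the first message above, offered only as a sketch: it reports a single valid case in the split for samp=1.00, which is what one would expect once the inner loop over #ssize is dropped, because the remaining loop then writes one case per sample. Assuming the active dataset still holds samp and behaviour_code after the MATCH FILES step, a cross-tabulation of the two shows how many cases each group contributes to each replicate before any analysis is run.

```spss
* Diagnostic sketch only, assuming samp and behaviour_code exist in the active dataset.
SPLIT FILE OFF.
CROSSTABS
  /TABLES=samp BY behaviour_code
  /CELLS=COUNT.
```

If the design is working, every row of the resulting table (one per samp) should show the intended count in each of the four behaviour_code columns.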
Is anybody willing to help please?
|
Administrator
|
In reply to this post by Andy H
BACK UP to the beginning. Review the scope of your requirement. Examine your code. Comment it and verify each step actually is relevant. WTF is up with the INPUT PROGRAM? The code is all over the place and there is no focus. What I originally cited for you is ALL you need with appropriate mods which I will leave you to follow up with. Yes, using YOUR desired frequencies rather than what was used in the original is a good first step. Get rid of all the other crap.
|
I did change your suggested code with appropriate mods (as far as I know), as you can see below. The problem is that it does not work wherever I fit it into the original code. Believe me, I have tried what feels like a gazillion ways, each time trying to understand the logic.

COMPUTE SCRAMBLE=UNIFORM(1).
SORT CASES BY behaviour_code SCRAMBLE.
IF $CASENUM=1 OR (LAG(behaviour_code) NE behaviour_code) Counter=1.
IF MISSING(Counter) Counter=LAG(Counter)+1.
COMPUTE Keeper=behaviour_code.
RECODE Keeper (1=200)(2=200)(3=200)(4=200).
*EXECUTE . /* Probably DON'T need EXE here. If you get odd results then remove *.
SELECT IF (Counter LE Keeper).
|
I do not have the full context and do not know where this fits into the whole project, so I may be off base. I have not followed the full discussion but have a few remarks.

It appears you may be trying to bootstrap. If so, I suggest you click Help > Topics > Bootstrap. If you do not have that procedure available and you want to sample WITH replacement, check the recent archives for "Sampling WITH replacement used in bootstrapping, complex sampling etc. Demo syntax."

> IF $CASENUM=1 OR (LAG(behaviour_code) NE behaviour_code) Counter=1.

I get the impression you are fairly new to SPSS, so try to nip this habit in the bud: using a fundamental symbol with 2 different meanings. Use the equal sign exclusively as an assignment operator, NOT as a logical operator. Beginners often confuse themselves with this. It also slows your quality assurance review. Many people do better using the conventional operators GT GE NOT AND LT LE.

If $casenum EQ 1 ...

> IF $CASENUM=1 OR (LAG(behaviour_code) NE behaviour_code) Counter=1.
> IF MISSING(Counter) Counter=LAG(Counter)+1.

Would be more readable if you did

DO IF $CASENUM EQ 1 OR (LAG(behaviour_code) NE behaviour_code).
COMPUTE Counter=1.
ELSE.
COMPUTE Counter=LAG(Counter)+1.
END IF.

> COMPUTE Keeper=behaviour_code.
> RECODE Keeper (1=200)(2=200)(3=200)(4=200).

just

compute Keeper = 200.

Art Kendall
Social Research Consultants

On 3/12/2013 8:23 AM, Andy H wrote:
> I did change your suggested code with appropriate mods (as far as I know) as
> you can see below. The problem is it does not work where ever I fit it in
> the original code. Believe me, I have tried what feels like a gazillion ways
> each time trying to understand the logic.
>
> COMPUTE SCRAMBLE=UNIFORM(1).
> SORT CASES BY behaviour_code SCRAMBLE.
> IF $CASENUM=1 OR (LAG(behaviour_code) NE behaviour_code) Counter=1.
> IF MISSING(Counter) Counter=LAG(Counter)+1.
> COMPUTE Keeper=behaviour_code.
> RECODE Keeper (1=200)(2=200)(3=200)(4=200).
> *EXECUTE . /* Probably DON'T need EXE here. If you get odd results then remove *.
> SELECT IF (Counter LE Keeper).
|
Thank you for your input. I appreciate you giving some time to this. Unfortunately, I do not have access to the bootstrapping module. I have also played around with Complex Samples to generate some code to stratify by group. Again, however, I struggle to incorporate that into my original code. I have tried to make sense of what you said in your last post. Wherever I fit the stratify code into my original code, I still get unequal group sample sizes run through the discriminant command, or a variety of errors in the syntax editor. My skills in SPSS and syntax are something that I am very keen to build on in the future. However, at the moment time simply is not on my side.

I am asking this community of SPSS experts to please help me by showing clearly (to a keen beginner) how to get the following code to draw equal sample sizes from a group variable called "behaviour_code". Thank you kindly.

get FILE='E:\s1.sav'.
set seed = 1234 .
* .
compute ident = $casenum .
save outfile 'c:\temp\bootdata.sav'.
input program .
* .
* .
compute #reps = 100 . /* number of samples desired.
compute #ssize = 250 . /* sample size desired
compute #psize = 10379 . /* size of population
* .
* .
loop samp=1 to #reps .
loop v = 1 to #ssize .
compute ident=trunc(rv.uniform(1,#psize +1) ) .
end case.
leave samp.
end loop.
end loop.
end file.
end input program .
exe .
sort cases by ident.
match files / file * / table 'c:\temp\bootdata.sav' / by ident .
sort cases by samp.
split file by samp.
* .
* Suppress output .
oms /select all
 /if commands = ["discriminant"]
 /destination viewer = no
 /tag = "blank" .
* .
* Capture statistic of interest .
oms /select tables
 /if commands = ["discriminant"] subtypes = ["Classification Results"]
 /destination format = sav outfile = "c:\temp\results.sav"
 /tag = "results" .
* .
* Procedure of interest goes here .
discriminant
 /GROUPS=behaviour_code(1 4)
 /VARIABLES=hx hy hz
 /ANALYSIS ALL
 /PRIORS EQUAL
 /STATISTICS=TABLE
 /CLASSIFY=NONMISSING POOLED.
* .
* End OMS .
omsend tag=["blank"] .
omsend tag = ["results"] .
* Examine results .
get file = 'c:\temp\results.sav' .
|
Administrator
|
"how to get the following code to draw equal sample sizes from a group variable called "behaviour_code".
This was done for you roughly a week ago! The number of iterations of this thread has been exceeded. The solution has not converged!!!
"Time is not on my side" Well, time is precious. Most people here have other fish to fry. Maybe time for you to sit down with a very patient trainer to extricate the cranial blockage? Maybe others here will sacrifice more of their time. My last contribution to this thread.
Consult the fine manual WRT how DSC treats missing values?
|
"how to get the following code to draw equal sample sizes from a group variable called "behaviour_code.
This was done for you roughly a week ago!"

I'm sorry if my lack of understanding is annoying. I have simply come here asking politely for help with a solution from others who are more experienced. I have failed to come up with the solution myself, though not through lack of trying.

"Consult the fine manual WRT how DSC treats missing values?"

I do not know what this means. I have searched for the terms but get no results.

"Maybe time for you to sit down with a very patient trainer to extricate the cranial blockage?"

I have attempted to find someone familiar with syntax but found no one. That is why I am here searching for help. Please, I ask the community politely if someone would be so kind as to show how I should implement David's suggested code (1), or other code, in my original code (2), to allow equal sample sizes from the group variable (behaviour_code) to be repeatedly put through the DISCRIMINANT command.

(1) David's suggested code

> COMPUTE SCRAMBLE=UNIFORM(1).
> SORT CASES BY behaviour_code SCRAMBLE.
> IF $CASENUM=1 OR (LAG(behaviour_code) NE behaviour_code) Counter=1.
> IF MISSING(Counter) Counter=LAG(Counter)+1.
> COMPUTE Keeper=behaviour_code.
> RECODE Keeper (1=200)(2=200)(3=200)(4=200).
> *EXECUTE . /* Probably DON'T need EXE here. If you get odd results then remove *.
> SELECT IF (Counter LE Keeper).

(2) The original code

get FILE='E:\s1.sav'.
set seed = 1234 .
* .
compute ident = $casenum .
save outfile 'c:\temp\bootdata.sav'.
input program .
* .
* .
compute #reps = 100 . /* number of samples desired.
compute #ssize = 250 . /* sample size desired
compute #psize = 10379 . /* size of population
* .
* .
loop samp=1 to #reps .
loop v = 1 to #ssize .
compute ident=trunc(rv.uniform(1,#psize +1) ) .
end case.
leave samp.
end loop.
end loop.
end file.
end input program .
exe .
sort cases by ident.
match files / file * / table 'c:\temp\bootdata.sav' / by ident .
sort cases by samp.
split file by samp.
* .
* Suppress output .
oms /select all
 /if commands = ["discriminant"]
 /destination viewer = no
 /tag = "blank" .
* .
* Capture statistic of interest .
oms /select tables
 /if commands = ["discriminant"] subtypes = ["Classification Results"]
 /destination format = sav outfile = "c:\temp\results.sav"
 /tag = "results" .
* .
* Procedure of interest goes here .
discriminant
 /GROUPS=behaviour_code(1 4)
 /VARIABLES=hx hy hz
 /ANALYSIS ALL
 /PRIORS EQUAL
 /STATISTICS=TABLE
 /CLASSIFY=NONMISSING POOLED.
* .
* End OMS .
omsend tag=["blank"] .
omsend tag = ["results"] .
* Examine results .
get file = 'c:\temp\results.sav' .
|
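One way the pieces quoted in this thread might be combined is sketched below. It is untested against the poster's data and is not the method anyone in the thread confirmed. The idea is to stack 100 copies of the file (one per replicate), give every case a random number, rank that number within each replicate and group, and flag the first 100 ranks per group as the selection variable for DISCRIMINANT. The file name stacked.sav and the variables scramble, counter and filter are hypothetical names chosen for illustration. The sketch assumes every group has at least 100 cases, and it samples without replacement within each group, so it is a repeated stratified subsample and not the with-replacement bootstrap of the original code.

```spss
* Sketch under stated assumptions: stacked.sav, scramble, counter and filter are hypothetical names.
GET FILE='E:\s1.sav'.
SET SEED=1234.

* Write 100 copies of every case to a stacked file, each copy tagged with its replicate number samp.
LOOP samp = 1 TO 100.
XSAVE OUTFILE='c:\temp\stacked.sav'.
END LOOP.
EXECUTE.

GET FILE='c:\temp\stacked.sav'.

* Random order within each replicate and group, then flag the first 100 cases per group.
COMPUTE scramble = RV.UNIFORM(0,1).
RANK VARIABLES=scramble (A) BY samp behaviour_code
  /RANK INTO counter
  /PRINT=NO.
COMPUTE filter = (counter LE 100).

SORT CASES BY samp.
SPLIT FILE BY samp.

* Capture the classification table from each replicate.
OMS /SELECT TABLES
  /IF COMMANDS=["discriminant"] SUBTYPES=["Classification Results"]
  /DESTINATION FORMAT=SAV OUTFILE="c:\temp\results.sav"
  /TAG="results".

* Only cases with filter equal to 1 are used to build the functions in each replicate.
DISCRIMINANT
  /GROUPS=behaviour_code(1 4)
  /VARIABLES=hx hy hz
  /SELECT=filter(1)
  /ANALYSIS ALL
  /PRIORS EQUAL
  /STATISTICS=TABLE
  /CLASSIFY=NONMISSING POOLED.

OMSEND TAG=["results"].
SPLIT FILE OFF.
GET FILE='c:\temp\results.sav'.
```

With roughly 10,000 cases and 100 replicates the stacked file holds about a million rows, which trades disk space for simplicity. Different group sizes could be handled by comparing counter with a per-group target, in the spirit of the Keeper variable quoted above.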