Hi Mike,
Thanks for your response. I will try it and see how it goes. Thank you.

Sincerely,
Jialin Huang

On Tue, Jan 24, 2012 at 3:31 PM, Michael Palij <[hidden email]> wrote:
> I agree with David that this is a strange request and
In reply to this post by huang jialin
I'm still trying to work out exactly what it is you are trying to accomplish.
In the quoted message below, you described what kind of sample you want to achieve. In a later message, you said:

"What I am trying to do is to see the effects of range restriction. That is why I have the exact mean and sd."

Looking at the effect of restricted range suggests you are working with some kind of regression model. Is that right? It might help if you backed up a few steps, and started by telling us what kind of analysis you are doing in the first place, what you saw in the results that made you want to "see the effects of range restriction", etc. It may be that someone will suggest a different way of seeing those effects.

HTH.

p.s. - I agree with David's comments about this looking "fishy".
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING:
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
Bruce,
Thanks for your suggestion. I will try to make it clearer. I did a linear equating on two sets of scores (S1 and S2), which were collected from the same sample. The S1 distribution used for equating has m=17 and sd=5.1. The result (the look-up table of original scores and scaled scores) was applied to a larger sample with m=15 and sd=5.7, which has only S1. What I am trying to explore is whether the equating is affected by the smaller sample, since its mean and sd suggest that its range is restricted. However, there is no way to collect more data at the moment.

Thus, I am planning to pull out cases from the smaller sample with exact mean and sd as the larger one. Then, re-do the equating and check the differences between samples. Does it make sense now? What would you suggest I do?

Thank you very much.

Sincerely,
Jialin Huang

On Tue, Jan 24, 2012 at 4:55 PM, Bruce Weaver <[hidden email]> wrote:
> I'm still trying to work out exactly what it is you are trying to accomplish.
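For readers following along, the linear equating described in this post can be sketched in a few lines. This is only an illustration in Python rather than SPSS syntax, using the standard mean-and-SD form of linear equating; the function names and the toy scores are made up and are not the poster's data or procedure.

```python
from statistics import mean, stdev


def linear_equating(s1, s2):
    """Return (slope, intercept) that map S1 scores onto the S2 scale.

    Linear equating sets standardized scores equal:
    (x - mean1) / sd1 == (y - mean2) / sd2.
    """
    slope = stdev(s2) / stdev(s1)
    intercept = mean(s2) - slope * mean(s1)
    return slope, intercept


def equate(x, slope, intercept):
    """Convert one S1 score to its S2-scale equivalent."""
    return slope * x + intercept


# Made-up paired scores from a single sample (illustrative only).
s1 = [10, 12, 15, 17, 18, 20, 22, 25]
s2 = [30, 33, 41, 44, 47, 52, 55, 62]

slope, intercept = linear_equating(s1, s2)

# Look-up table of original scores and scaled scores.
lookup = {x: round(equate(x, slope, intercept), 1) for x in sorted(set(s1))}
```

Re-running `linear_equating` on a subsample and comparing the two slopes and intercepts is one way to see how much the conversion depends on the sample used.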
"Thus, I am planning to pull out cases from the smaller sample with exact mean and sd as the larger one. Then, re-do the equating and check the differences between samples."
I submit the following with some hesitation and do not plan to provide exact code, only a general description. Say you have two files: file1 and file2. File1 contains the sample whose distribution you wish to replicate using the cases in file2.

ADD FILES /FILE='file1' /IN=FLAG /FILE='file2'.
COMPUTE TAKEN=0.
**BARELY TESTED/ YMMV/ GOOD LUCK...
* Repeat the following code as needed*.
SORT CASES BY TAKEN Y.
COMPUTE GRAB=NOT(FLAG) AND LAG(FLAG).
CREATE GRABFROM=LEAD(GRAB,1).
COMPUTE TAKEN=TAKEN OR GRAB OR GRABFROM.

Gotta Go, It's Miller time!
--
HTH, David
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
A somewhat different approach, which will possibly work better for discrete (integer-valued) distributions.

Again, YMMV and I make *NO* guarantee re the validity or advisability of this approach since I really have no freaking idea of what the hell you are really up to, the specifics of your distributions etc... If you can't figure out what the code is doing then refer to the manual. If that fails, DELETE it and pretend that it doesn't exist!!!
---
* Simulated target distribution: 300 cases, M=15, SD=5.7.
input program.
loop I=1 to 300.
compute Y=RND(RV.NORMAL(15,5.7)).
end case.
end loop.
end file.
end input program.
SAVE OUTFILE "test.sav".
DESC Y.
FREQ Y.
* Table of required counts (N) for each value of Y.
AGGREGATE OUTFILE "testAGG.sav" / BREAK Y / N=N.
* Simulated source data: 1000 cases, M=17, SD=5.1.
input program.
loop I=1 to 1000.
compute Y=RND(RV.NORMAL(17,5.1)).
COMPUTE OTHERDAT=UNIFORM(1).
end case.
end loop.
end file.
end input program.
DESC Y.
FREQ Y.
* Put the source cases in random order within each value of Y.
COMPUTE SCRAMBLE=UNIFORM(1).
SORT CASES BY Y SCRAMBLE.
MATCH FILES / FILE * / IN=RAW / FILE="testAGG.sav" /IN=TAB/ BY Y.
* Running count of source cases within each value of Y.
COMPUTE COUNTER=SUM(LAG(COUNTER)*(Y EQ LAG(Y)), RAW).
COMPUTE ReqN=N.
IF SYSMIS(ReqN) ReqN=LAG(ReqN).
* Keep source cases until the required count for that Y is reached.
COMPUTE DRAWN=RAW AND COUNTER LE ReqN.
TEMPORARY.
SELECT IF DRAWN.
DESC Y.
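The frequency-matching idea in the post above (count how many cases each score value needs, shuffle the source cases within each value, keep the first few) can also be sketched outside SPSS. What follows is a rough Python illustration under made-up assumptions, not a translation of the syntax above: the function name, the `(case_id, score)` layout and the simulated data are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def draw_matching(source, target_scores, seed=0):
    """Draw cases from `source` without replacement so that the drawn
    score frequencies follow those of `target_scores` as far as possible.

    source: list of (case_id, score) pairs with integer scores.
    Returns (drawn_cases, shortfall); shortfall maps a score to the number
    of requested cases that the source could not supply.
    """
    rng = random.Random(seed)
    required = Counter(target_scores)   # required count per score value

    by_score = defaultdict(list)
    for case in source:
        by_score[case[1]].append(case)

    drawn, shortfall = [], {}
    for score, need in sorted(required.items()):
        pool = by_score.get(score, [])
        rng.shuffle(pool)               # random order within a score value
        drawn.extend(pool[:need])       # keep at most the required count
        if len(pool) < need:
            shortfall[score] = need - len(pool)
    return drawn, shortfall


# Toy data: 1000 source cases near M=17, SD=5.1; 300 target scores near M=15, SD=5.7.
rng = random.Random(1)
source = [(i, round(rng.gauss(17, 5.1))) for i in range(1000)]
target = [round(rng.gauss(15, 5.7)) for _ in range(300)]

drawn, shortfall = draw_matching(source, target)
```

Any entries in `shortfall` show where the source has too few cases at a given score, which is where the drawn sample's mean and SD will drift from the target.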
David,
Thanks all the same.

Jialin Huang

On Thu, Jan 26, 2012 at 3:38 AM, David Marso <[hidden email]> wrote:
> Somewhat different approach which will possibly work better for discrete
Similar idea to David's. Generate a file of 300 random numbers using COMPUTE Y = RV.NORMAL(15,5.7) as he has described (but no need to round). Then use the SPSS FUZZY command to pull out cases from your database that approximately match on Y.

This is just a suggestion, which I have not tried, but perhaps worth looking into. You will need to install Python and the SPSS FUZZY extension if you haven't already done so.

Garry Gelade

> huang jialin wrote
>
> Hi everyone,
>
> Thanks for your reply. Let me elaborate what I am planning to do.
>
> I have a dataset of 1000 cases, considering it as a population. M = 17, SD = 5.1. I am trying to pull out a sample size of roughly 300 cases, but the mean need to be around 15, and SD is around 5.7.
>
> I was wondering whether SPSS has any syntax that I can use. Your helps are very appreciated.
>
> Thank you again.
>
> Sincerely,
> Jialin Huang
>
> On Tue, Jan 24, 2012 at 11:54 AM, John F Hall <johnfhall@> wrote:
>
>> Should have said you do that in syntax.
>>
>> From data editor:
>>
>> File > New > Syntax
>>
>> . . to open a new syntax file. Write the command, but make sure you put a full stop (period) at the end of it, then press the green triangle etc.
>>
>> Email: johnfhall@
>> Website: www.surveyresearch.weebly.com <http://surveyresearch.weebly.com/>
>> Skype: surveyresearcher1
>> Phone: (+33) (0) 2.33.45.91.47
>>
>> From: John F Hall [mailto:[hidden email]]
>> Sent: 24 January 2012 18:43
>> To: 'huang jialin'; 'SPSSX-L@.UGA'
>> Subject: RE: sampling with fixed mean and SD
>>
>> You can sample in SPSS with:
>>
>> sample <n> from <N>
>>
>> where n is the sample size you want and N is the number of cases in the data set, or you can use:
>>
>> sample <p>
>>
>> where p is the proportion you want to sample expressed as a decimal.
>>
>> John Hall
>>
>> From: SPSSX(r) Discussion [mailto:[hidden email].UGA] On Behalf Of huang jialin
>> Sent: 24 January 2012 17:46
>> To: SPSSX-L@.UGA
>> Subject: sampling with fixed mean and SD
>>
>> Hi,
>>
>> I am planning to sample cases from a known dataset with fixed mean and SD. The sample size is from 300-500. The replacement is not allowed. Can I do it in SPSS? If so, how can I do it?
>>
>> Thank you for your attention.
>>
>> Sincerely,
>> Jialin Huang

--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/sampling-with-fixed-mean-and-SD-tp5315312p5432396.html
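As a rough picture of what "approximately match on Y" could mean, here is a small Python sketch of greedy nearest-value matching without replacement. It is not the FUZZY extension and makes no claim about how that command works internally; the function name, the tolerance value and the toy numbers are assumptions for illustration only.

```python
import bisect


def nearest_match(targets, source_scores, tolerance):
    """Greedily pair each target value with the closest unused source score.

    Returns a list of (target, matched_score or None). Each source score is
    used at most once; a target with no unused score within `tolerance`
    is left unmatched (None).
    """
    pool = sorted(source_scores)        # sorted copy; the input is not modified
    matches = []
    for t in targets:
        i = bisect.bisect_left(pool, t)
        # The closest remaining score is one of the two neighbours of position i.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(pool)]
        best = min(candidates, key=lambda j: abs(pool[j] - t), default=None)
        if best is not None and abs(pool[best] - t) <= tolerance:
            matches.append((t, pool.pop(best)))   # remove so it is not reused
        else:
            matches.append((t, None))
    return matches


# Hypothetical generated target values and source scores.
targets = [14.2, 9.8, 21.5, 15.1]
source_scores = [10, 14, 15, 15, 17, 30]

matches = nearest_match(targets, source_scores, tolerance=1.0)
```

Because the matching is greedy, the result depends on the order of the targets; a wider tolerance matches more cases but reproduces the target distribution less closely.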
Garry,
Thanks for your response.

Sincerely,
Jialin Huang

On Fri, Jan 27, 2012 at 8:57 AM, Garry Gelade <[hidden email]> wrote:
In reply to this post by Garry Gelade
The intention of my posting was to use the available data as the source table for the sampling ;-)
No sim needed!