|
Hello,
I have a dataset with cases in variable A, and I would like to randomly select 3 controls (in variable B) matched to the year (variable C) that each case was reported in. SPSS allows for random selection, but I am not sure how to select controls at random matched to the year for each case.

I would appreciate any insight and assistance on how to do this in SPSS.

Thanks much,
SM Cripe

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD |
|
SM,
Some things in your explanation don't make sense to me. Is variable A the id number? What is variable B about? What values does it take on? Variable C is the year, I got that.

Also, how are you using the word 'case'? Are you using it to refer to a record in the file, or to a record that is positive for a condition relative to a record that is negative for the condition, as in a case-control study? Please post back to the list.

>>I have a dataset with cases in variable A, and I would like to randomly select 3 controls (in variable B) matched to the year (variable C) that each case was reported in.

Gene Maguin |
|
Thank you, Gene, for your follow-up post. To clarify, yes, this is similar to a case-control study.

A correction: a closer look at the dataset revealed that variable A separates individuals not from the US (variable A=1 [case]) from individuals from the US: American Caucasians (variable A=2 {first control group}) and African Americans (variable A=3 {second control group}).

Variable B refers to country of origin for all individuals in the dataset (a numeric value for each country represented). The analysis will be conducted by country of origin.

For example, to compare characteristics of individuals from Peru (variable A=1 and variable B=Peru) with those of American Caucasians (control group 1) and with those of African Americans (control group 2), I would like to randomly select three Caucasians and three African Americans for each Peru-born case, matched by year of birth (variable C).

Any advice on how to execute this random selection with matching in SPSS would be greatly appreciated. Please let me know if there are further questions.

Thanks much,
SM

On Mon, 13 Oct 2008, Gene Maguin wrote:
> Some things in your explanation don't make sense to me. [...] Please post back to the list. |
|
One tool available with SPSS 16 or later is the extension command CASECTRL, which can be downloaded from SPSS Developer Central (www.spss.com/devcentral). You specify "supplier" and "demander" datasets and the list of keys that define the (exact) match. You can specify how many supplier matches you want for each demander, and there are various options for what sort of output to produce. When there are multiple matches for a demander case, CASECTRL picks randomly from supplier cases.

HTH,
Jon Peck |
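For readers without access to the extension, the matching idea described above can be sketched outside SPSS. The following is a minimal Python illustration, not CASECTRL syntax and not its actual implementation. The function name `match_controls`, the field names `id` and `birth_year`, and the rule that each supplier is used at most once are all assumptions made for the example.

```python
import random
from collections import defaultdict


def match_controls(demanders, suppliers, keys, n_matches, seed=None):
    """Return {demander id: [supplier ids]} using exact matching on `keys`.

    Assumption for this sketch: a supplier is handed out at most once.
    """
    rng = random.Random(seed)

    # Group suppliers by their key values, then shuffle each group so that
    # taking items from it amounts to a random pick.
    pool = defaultdict(list)
    for s in suppliers:
        pool[tuple(s[k] for k in keys)].append(s)
    for group in pool.values():
        rng.shuffle(group)

    matches = {}
    for d in demanders:
        available = pool[tuple(d[k] for k in keys)]
        # Take up to n_matches suppliers; fewer if the group runs short.
        picked = [available.pop() for _ in range(min(n_matches, len(available)))]
        matches[d["id"]] = [s["id"] for s in picked]
    return matches


# Hypothetical data: two cases and ten potential controls.
cases = [{"id": 1, "birth_year": 1970}, {"id": 2, "birth_year": 1971}]
controls = [{"id": 100 + i, "birth_year": 1970 + i % 2} for i in range(10)]
result = match_controls(cases, controls, ["birth_year"], 3, seed=42)
```

Here `result` maps each case id to three control ids drawn from the same birth year. A case with no eligible controls receives an empty list rather than raising an error.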
|
In reply to this post by Swee May Cripe
Swee May,
This seems to be pretty hard to do.

1) Is it correct to assume that for US whites (Caucasians) and US blacks, country (variable B) equals US?

2) Is it also correct to assume that there are no other matching variables? Not age, gender, rural-urban, etc.?

A specific answer to each question, please.

I'm sure something like this has been done, and maybe fairly recently, but nothing specific comes to mind.

Gene Maguin |
|
Gene,
I will respond to your questions below. I have also included Jon Peck's response below regarding the CASECTRL extension, which is available for version 16.0.1 and above. Unfortunately, for now, I do not have access to version 16.0.1.

On Tue, 14 Oct 2008, Gene Maguin wrote:

> Swee May,
>
> This seems to be pretty hard to do.
>
> 1) Is it correct to assume that for US whites (caucasians) and US blacks,
> country (variable B) equals US?

Yes, US Caucasians and US African Americans have the same country code in variable B.

> 2) Is it also correct to assume that there are no other matching variables?
> Not age, gender, rural-urban, etc?

Correct, there are no other matching variables.

> Specific answer to each question, please.
>
> I'm sure something like this has been done, and maybe kind of recently, but
> nothing specific comes to mind.
>
> Gene Maguin

Thanks much for your insight.

--Swee May

_____________________________________________
Date: Mon, 13 Oct 2008 16:28:16 -0500
From: "Peck, Jon" <[hidden email]>
To: [hidden email]
Subject: RE: Re: [SPSSX-L] Question about random selection with matching

One tool available with SPSS 16 or later is the extension command CASECTRL, which can be downloaded from SPSS Developer Central (www.spss.com/devcentral). You specify "supplier" and "demander" datasets and the list of keys that define the (exact) match. You can specify how many supplier matches you want for each demander, and there are various options for what sort of output to produce. When there are multiple matches for a demander case, CASECTRL picks randomly from supplier cases.

HTH,
Jon Peck |
|
In reply to this post by Swee May Cripe
At 12:58 PM 10/13/2008, Swee May Cripe wrote:
>I have a dataset with cases in variable [file?] A, and I would like
>to randomly select 3 controls (in variable [file?] B) matched to the
>year (variable C) that each case was reported in.

The easiest way is to sort the controls (file B) by year (or by whatever set of match variables you are using) and by a random quantity. Use AGGREGATE on file A to get the number of cases for each set of matching variables. Merge that with file B by the matching variables. Within each matching group, if you need k controls for the number of cases in file A, select the first k (which, remember, are in random order).

I think King Douglas first posted this solution. |
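The steps above (random sort, count cases per matching group, merge, keep the first few controls) can be illustrated outside SPSS. This is a minimal Python sketch of the logic only, not SPSS syntax. The function name `select_controls`, the field names `id` and `year`, and the reading of "k" as `controls_per_case` times the number of cases in the group are assumptions made for the example.

```python
import random
from collections import Counter


def select_controls(cases, controls, controls_per_case, seed=None):
    """Pick controls per matching year by sorting on (year, random number)."""
    rng = random.Random(seed)

    # Step 1 (the AGGREGATE step): number of cases in each matching year.
    cases_per_year = Counter(c["year"] for c in cases)

    # Step 2: give every control a random number and sort by year, then by
    # that number, so controls within a year are in random order.
    keyed = sorted(
        ((c["year"], rng.random(), c) for c in controls),
        key=lambda t: t[:2],
    )

    # Step 3 (the merge-and-select step): within each year, keep the first
    # controls_per_case * n_cases controls.
    selected = []
    position = Counter()
    for year, _, control in keyed:
        if position[year] < controls_per_case * cases_per_year[year]:
            selected.append(control)
        position[year] += 1
    return selected


# Hypothetical data: two cases in 1970, one in 1971, ten controls per year.
cases = [{"id": 1, "year": 1970}, {"id": 2, "year": 1970}, {"id": 3, "year": 1971}]
controls = [{"id": 100 + i, "year": 1970 + i % 2} for i in range(20)]
selected = select_controls(cases, controls, 3, seed=1)
```

With three controls per case, this keeps six controls from 1970 and three from 1971. Note that the approach yields a pool of controls per year rather than assigning specific controls to specific cases. For the two control groups in this thread, the same selection would be run once against each group.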
