Regression with sampling weights

Regression with sampling weights

Patsousasilva
I am trying to fit a logistic regression to survey data.

Because we did not know the size of the population, the sampling technique involved a random selection, so information on some variables may be biased. Specifically, since we wanted to know how many people were appointed by different governments, we collected data on appointments at the beginning and at the end of each governing period.

The problem is that for some governments we have ten months of data, and for others we have only six.

I now want to build a logistic regression model to predict appointments at different hierarchical levels, so I created a dummy variable for each hierarchical level.

I want to run the model, but I fear that some of the variables were oversampled because the number of months collected differs by government.
 
How do I correct for this? I know the model should compensate for the features of the sample, but I do not know how to do it.

I really hope you can help me!
Thank you!

Re: Regression with sampling weights

Poes, Matthew Joseph-2
Given that this is an SPSS listserv, I always feel bad doing this. However, as I understand it, the base version of SPSS does not include an advanced sampling module; it is an add-on, and I have no experience with it to speak of. Instead, I have relied on Stata and SAS for this. In general, I use SAS to create and analyze advanced survey samples.

What you are describing sounds like a two-stage stratified random probability sample. It also sounds like you don't yet have the population-to-sample mapping worked out. What you need to start with is a set of information that reflects your actual population. You then use those variables to create weights (in this case, weights within strata) that reflect the probability of each person being in the sample, given the population parameters.
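
As a rough illustration of the weighting idea above, here is a minimal Python sketch. It is not SPSS, Stata, or SAS syntax, and every figure in it is made up: the 48-month governing period, the stratum names gov_A and gov_B, the dummy predictor x, and the outcome y are all assumptions for illustration only. It computes one weight per government as months in the period divided by months collected, then fits a logistic regression with a single dummy predictor by maximizing the weighted log-likelihood.

```python
import math

# Hypothetical figures: months in each governing period (the population)
# and months for which appointments were actually collected (the sample).
PERIOD_MONTHS = {"gov_A": 48, "gov_B": 48}
COLLECTED_MONTHS = {"gov_A": 10, "gov_B": 6}

# Hypothetical records: (stratum, x, y), with x and y both 0/1 dummies.
ROWS = (
    [("gov_A", 0, 0)] * 3 + [("gov_A", 0, 1)] * 1
    + [("gov_A", 1, 1)] * 2 + [("gov_A", 1, 0)] * 1
    + [("gov_B", 0, 1)] * 2 + [("gov_B", 0, 0)] * 1
    + [("gov_B", 1, 0)] * 2 + [("gov_B", 1, 1)] * 1
)


def design_weights(period_months, collected_months):
    """Inverse-probability weight per stratum: population size / sample size."""
    return {g: period_months[g] / collected_months[g] for g in collected_months}


def weighted_logit(rows, weights, n_iter=25):
    """Fit P(y=1) = sigmoid(b0 + b1*x) by Newton-Raphson on the
    weighted log-likelihood. Returns the point estimates (b0, b1)."""
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = 0.0          # gradient components
        h00 = h01 = h11 = 0.0  # entries of the (negated) Hessian
        for stratum, x, y in rows:
            w = weights[stratum]
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            resid = w * (y - p)
            var = w * p * (1.0 - p)
            g0 += resid
            g1 += resid * x
            h00 += var
            h01 += var * x
            h11 += var * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + H^{-1} g, with the 2x2 inverse written out.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


weights = design_weights(PERIOD_MONTHS, COLLECTED_MONTHS)
b0, b1 = weighted_logit(ROWS, weights)
print(weights)
print(b0, b1)
```

Weights like these only adjust the point estimates. Standard errors from a plain weighted fit are generally not valid for a complex sample, which is one reason to rely on a dedicated survey procedure.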

I have seen people develop ways to fudge this approach and cobble through in SPSS. In my opinion, it's important to read the extant literature on probability sampling and to look at the approaches that have been developed, along with the strengths and weaknesses of each. In the end, I believe you will find that cobbling through is not the best approach, and that doing this with a canned program will ultimately yield better and more reliable results.

Matthew J Poes
Research Data Specialist
Center for Prevention Research and Development
University of Illinois
510 Devonshire Dr.
Champaign, IL 61820
Phone: 217-265-4576
email: [hidden email]


-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Patsousasilva
Sent: Wednesday, March 14, 2012 7:30 AM
To: [hidden email]
Subject: Regression with sampling weights

--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Regression-with-sampling-weights-tp5564470p5564470.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
