Good evening. I am using a dataset that comes with two weight variables: VariableX for projections to population totals, and VariableY for balancing in statistical models. Which weight should I use, or more to the point, when and why would I use balancing weights for statistical models? Thanks - cY |
What statistical procedures are you going to use? On Mon, Feb 4, 2019 at 5:55 PM Chao yawo <[hidden email]> wrote:
-- |
Various regressions (linear and logistic), plus descriptive statistics, chi-square tests, t-tests, and ANOVA.
Thanks, cy
On Mon, Feb 4, 2019 at 10:32 PM Jon Peck <[hidden email]> wrote:
Emmanuel F. Koku, PhD
Associate Professor
Department of Sociology
Drexel University
Philadelphia, PA 19104
[hidden email]
215 - 895 - 6144
|
In reply to this post by Chao yawo-2
Statistics supports several different kinds of weights, so the appropriate weight will depend on the procedure. Most procedures support frequency or replication weights. That means that the weight is interpreted as that number of identical cases. It is very important for procedures such as regression that the weights be normalized so that the total weight equals the number of cases. Otherwise, the degrees of freedom for tests will be wrong, perhaps wildly wrong.
Weights are often used to make the weighted sample more representative of the population from which it is drawn. For nonrandom samples, this is the domain of the CS (Complex Samples) procedures, where you supply information on the sampling plan to use for inference. CTABLES supports a simpler approximation known as effective base weighting.
Weighted Least Squares (WLS) uses entirely different weights to account for heteroscedasticity. That would not be either of the weights you refer to.
Balancing with weights is sometimes used to give more prominence to rare cases in building models with procedures such as SVM. The TREE procedure also supports importance weights for a similar purpose.
So which weights to use where is a complicated question. But one thing to keep in mind is that if weighting changes the results a lot, this is a sign of possible misspecification of the model. A model assumes that all responses have the same parameters for the explanatory variables. If results are sensitive to the weighting, this assumption is doubtful.
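A minimal sketch of the normalization point, written in Python rather than SPSS syntax. It rescales a set of weights so that they sum to the number of cases, and also computes the Kish effective sample size, (sum of w)^2 / (sum of w^2), which is assumed here to be the idea behind effective base weighting. The weight values and function names are made up for illustration; they do not come from the poster's dataset.

```python
def normalize_weights(weights):
    """Rescale weights so that they sum to the number of cases."""
    n = len(weights)
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w * n / total for w in weights]


def effective_base(weights):
    """Kish effective sample size: (sum w)^2 / sum(w^2).

    Assumption: this is the usual definition of an 'effective base';
    check the CTABLES documentation for the exact quantity it reports.
    """
    total = sum(weights)
    sum_of_squares = sum(w * w for w in weights)
    return total * total / sum_of_squares


# Hypothetical projection weights: each case stands for many people.
projection_weights = [1200.0, 800.0, 1500.0, 500.0]
normalized = normalize_weights(projection_weights)

print("raw total weight:       ", sum(projection_weights))
print("normalized total weight:", sum(normalized))
print("effective base:         ", effective_base(projection_weights))
```

If the projection weights were used as-is, a procedure that reads the weight as a case count would behave as if there were 4000 cases rather than 4, which is how the degrees of freedom go wrong. Rescaling changes the total but not the relative weights, so the effective base is the same before and after.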
|
Very good post, Jon.
I especially like, "if weighting changes the results a lot, this is a sign
of possible misspecification of the model."
One place where weighting can innocently change "results" does exist
within the modeling Jon mentioned. Epidemiologists may over-sample
the elderly (say) in order to measure deaths more accurately, and then
apply those to a standard population profile in order to estimate national
mortality rates (which will be far lower than the mortality in the aged sample).
- That is a matter of reporting /estimates/ rather than drawing conclusions
from /tests/. Be most wary when weighting affects the tests.
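The over-sampling example can be sketched as a direct standardization, again in Python. All counts and population shares below are hypothetical, chosen only so that the elderly make up half of the sample but a much smaller share of the standard population.

```python
# Hypothetical sample that deliberately over-samples the elderly.
sample = {
    "under_65": {"deaths": 2, "persons": 1000},
    "65_plus": {"deaths": 40, "persons": 1000},
}

# Hypothetical standard population profile (shares sum to 1).
standard_shares = {"under_65": 0.85, "65_plus": 0.15}


def crude_rate(groups):
    """Unweighted mortality rate in the sample as collected."""
    deaths = sum(g["deaths"] for g in groups.values())
    persons = sum(g["persons"] for g in groups.values())
    return deaths / persons


def standardized_rate(groups, shares):
    """Group-specific rates weighted by the standard population shares."""
    return sum(
        shares[name] * g["deaths"] / g["persons"]
        for name, g in groups.items()
    )


print("crude sample rate:", crude_rate(sample))
print("standardized rate:", standardized_rate(sample, standard_shares))
```

Here the weighting changes the reported estimate substantially, and legitimately so, because the sample was never meant to mirror the population. No test is involved, which is the distinction drawn above.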
--
Rich Ulrich