|
I have a question about survival analysis. We are looking at a data set in which employment status is recorded for clients at more than one time, for example on a quarterly basis over the period of a year. So we will have clients who: 1) remained employed for the entire duration of the study; 2) remained unemployed for the entire study; 3) changed from employed to unemployed at some point and remained unemployed thereafter; 4) changed from unemployed to employed at some point and remained employed thereafter; or 5) changed from one employment status to the other several times (oscillating status). There can be as many as 5 cases per client, sometimes more.

We want to use survival analysis because we have censored cases in the data set. Time is measured in days. Employment status is coded employed = 1, unemployed = 2. For now, we are not using any covariates in the study; we are just looking at time and employment status.

My question is how to arrange the data set. Survival analysis is typically used for things like patient survival, so it only takes into account the length of time to a patient's death, for example. Here, our variable (employment status) often changes. So, should we create one data set that includes only clients whose first status is employed and a second data set that includes only clients whose first status is unemployed? Thanks in advance. |
|
Apparently your subjects can fluctuate between employment and unemployment. Thus your model is not a simple "time to event" model, like a life table, in which the "event" is terminal.

The "time to event" may be counted since the latest "event" or since the start of observation. In the latter case, observations are left censored, since the subject may already have been in that state for some time when the study commences. In other words: people who are unemployed at the start of the study and become employed 3 months later have a "time to event" that appears to be 3 months but is actually longer, since they had been unemployed since some time before the start of the study. By contrast, people who become unemployed during the study and then find a new job have a perfectly defined duration of unemployment. Thus your model should include both left and right censored cases.

If you do not have any predictor, what is your "model"? You may have a simple "Markov-like" model in which people change (or do not change) from state "i" to state "j" in a given time period, and thus your task is computing the transition rates per unit of time. Or you may use a Kaplan-Meier estimate of the survival function, for example the probability of still being in the same state after a given time. Or you may use the initial state and the number of prior transitions (or the time since the latest transition) as predictors, and apply Cox regression to model the hazard of the next event.

Perhaps these random thoughts help you to think about how to proceed.

Hector

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of J Scelza
Sent: 25 September 2007 16:42
To: [hidden email]
Subject: survival analysis: organizing the data |
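The reply above suggests treating each stretch of employment or unemployment as its own duration. A minimal sketch of that arrangement follows, in Python rather than SPSS syntax, purely for illustration. It assumes a long-format input of `(client_id, day, status)` records, dates a transition at the first visit where the new status is observed (the true change happened somewhere in the preceding interval), and flags each client's first spell as left censored. The names `Spell`, `build_spells`, `transition_rates` and `kaplan_meier` are hypothetical, and the sample records are invented.

```python
from collections import defaultdict, namedtuple

# One row per spell: a stretch of time a client spends in one state.
Spell = namedtuple("Spell", "client state start stop event left_censored")

def build_spells(records):
    """records: iterable of (client_id, day, status); 1 = employed, 2 = unemployed."""
    by_client = defaultdict(list)
    for client, day, status in records:
        by_client[client].append((day, status))

    spells = []
    for client, visits in by_client.items():
        visits.sort()
        start_day, state = visits[0]
        first = True  # the first spell began before observation started
        for day, status in visits[1:]:
            if status != state:
                # Assumption: the change is dated at the visit where it is first seen.
                spells.append(Spell(client, state, start_day, day, True, first))
                start_day, state, first = day, status, False
        last_day = visits[-1][0]
        if last_day > start_day:
            # Still in this state at the last visit: right censored.
            spells.append(Spell(client, state, start_day, last_day, False, first))
    return spells

def transition_rates(spells):
    """Exits from each state divided by total days observed in that state."""
    exits, exposure = defaultdict(int), defaultdict(int)
    for s in spells:
        exposure[s.state] += s.stop - s.start
        exits[s.state] += s.event
    return {state: exits[state] / exposure[state] for state in exposure}

def kaplan_meier(spells):
    """Kaplan-Meier curve as a list of (duration, survival) at each event time."""
    data = sorted((s.stop - s.start, s.event) for s in spells)
    at_risk, surv, curve = len(data), 1.0, []
    i = 0
    while i < len(data):
        t = data[i][0]
        tied = [e for d, e in data if d == t]
        deaths = sum(tied)
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= len(tied)
        i += len(tied)
    return curve

# Invented quarterly records for three clients (illustration only).
records = [
    ("A", 0, 1), ("A", 90, 1), ("A", 180, 2), ("A", 270, 2),
    ("B", 0, 2), ("B", 90, 1), ("B", 180, 2), ("B", 270, 1),
    ("C", 0, 1), ("C", 90, 1), ("C", 180, 1), ("C", 270, 1),
]

spells = build_spells(records)
print(transition_rates(spells))

# One simple option: estimate unemployment duration only from spells whose
# start was observed, i.e. excluding the left-censored ones.
unemployed = [s for s in spells if s.state == 2 and not s.left_censored]
print(kaplan_meier(unemployed))
```

With one row per spell there is no need for two separate data sets: the `state` column separates employment spells from unemployment spells, and the `left_censored` flag lets you decide how to handle spells already in progress at the start. Dropping those spells, as in the last step, is only one choice and discards information.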
