Hi all,
Here is a statistical issue I'm hoping to get some feedback on from you.

I have done survival analysis on cross-sectional data with a large sample with questions covering past year behavior. There are 13 questions pertaining to activities A to M: did you engage in activity A over the past 12 months? Did you engage in activity B over the past 12 months? And so on, until activity M is covered. Answers range from 0=never to 8=daily. About 25% of the large sample has answered never. The rest vary. I also have information about the outcome of a disorder. I am trying to determine if the outcome (binomial) is related to frequency of activities A to M, and if so, how.

I used the 13 activity questions A to M and computed a new variable "Total-frequency". Total-frequency is the sum of the number of times an individual engaged in activities A to M. That is, if the person engaged in A once per day, it tallied at 365; if the same person also engaged in activity B once per month, it tallied 12. Then, the Total-frequency for that person becomes 365+12=377. The Total-frequency for cases ranged from 0 to over 4,200 activities for a 12 month period.

Life Tables: The Total-frequency variable was used as "time" and the disorder as "event", where event was the occurrence of the disorder. From this I can see, for example, that the higher the frequency of activities the higher the probability of having the disorder. I also see that despite the frequency and disorder association, the highest hazard rate occurred within the first interval (an interval covering between 1 to 99 activities, though the highest number of frequency in the data is 4,200).

My questions to you:

1- Would the "Total-frequency" constitute a legitimate method of conceptualizing total activities, and is using Total-frequency as a "time" variable appropriate in survival analysis?
2- Is there another way of looking at the 13 activities at once, in SPSS?
3- Would it be appropriate to continue with this logic and apply Cox regression and use explanatory variables as IV's?

I am concerned with the fact that the 13 activities are now examined jointly as if they were qualitatively equivalent when in fact they are not. Some activities are longer in duration even if engaged in only rarely, whereas others are much shorter in duration even if engaged in more frequently. Some require more effort. Some engender higher risk for the disorder. The assumption is that all activities are of the same "type". In another analysis, I have examined the types and found that "people" who engage in an assortment of combinations of these activities make for distinct groups (using latent class analysis).

Feedback would be much appreciated.

Neda

====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L. For a list of commands to manage subscriptions, send the command INFO REFCARD.
|
I do not think your Total Frequency variable could be used as a Time variable. For one thing, it is a measure of "intensity of time use" more than of time itself: used as a measure of time, it makes a person with more activities appear to have lived more "time" during those 12 months, while a person with no activity (25% of the sample) would have lived the whole year in an instant of no duration at all. Rather artificial, don't you think?

In fact, you do not have a time variable. You only know that those people did or did not engage in each activity, and with what frequency, over the past 12 months considered as a whole. For the purpose of analysis, the period of 12 months counts as a single instant preceding the present, assuming the outcome occurs in the present. From what you say, the outcome of the disorder may have happened at any moment during those long (or short) 12 months, and thus the activities may be antecedent or consequent to the outcome. Are you sure of the temporal order of the variables?

You may achieve what you want (did more intense activity influence the outcome?) by means of a number of multivariate analyses other than survival analysis. My personal recommendation: you may use the 13 activities (A to M) as 13 ordinal variables (ranked from daily to never or the reverse) as predictors of the outcome in a logistic regression, which is the one I would try first, or you may try the 13 variables converted into numeric (interval) measures of frequency (with values 365, 52, 12, or whatever), for the same type of analysis (logistic regression).

The accumulation of all activities into a single unweighted sum assumes that all activities have the same influence or weight, which is something you may be interested in testing: perhaps each additional instance of activity F has more influence than each additional instance of activity M. I suppose that some activities are more likely to be performed daily (say, reading the newspaper) while others are more likely to occur once a year (sending a Valentine card), so the raw frequencies are not equivalent.
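To make the point about the unweighted sum concrete, here is a rough sketch in Python (standard library only, not SPSS). Everything in it is an assumption for illustration: the data are synthetic, only three activities are used instead of 13, the answer codes 0 to 8 are treated as numeric predictors, and the "true" weights in TRUE_BETA are invented. It fits one logistic regression with a separate coefficient per activity and another on the plain sum, then compares them.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def solve(a, b):
    """Solve a small linear system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x

def fit_logistic(rows, y, n_iter=15):
    """Newton-Raphson logistic regression; returns [intercept, b1, b2, ...]."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(n_iter):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for xi, yi in zip(X, y):
            p = sigmoid(sum(b * v for b, v in zip(beta, xi)))
            w = p * (1.0 - p)
            for a in range(k):
                grad[a] += (yi - p) * xi[a]
                for c in range(k):
                    hess[a][c] += w * xi[a] * xi[c]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta

def log_likelihood(beta, rows, y):
    total = 0.0
    for r, yi in zip(rows, y):
        p = sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], r)))
        total += math.log(p) if yi else math.log(1.0 - p)
    return total

# Synthetic data (assumed, not from the survey): activity A matters a lot,
# B not at all, C a little. Codes run from 0 (never) to 8 (daily).
random.seed(0)
TRUE_BETA = [-2.0, 0.4, 0.0, 0.1]  # hypothetical intercept and weights
rows, y = [], []
for _ in range(2000):
    codes = [random.randint(0, 8) for _ in range(3)]
    z = TRUE_BETA[0] + sum(b * c for b, c in zip(TRUE_BETA[1:], codes))
    rows.append(codes)
    y.append(1 if random.random() < sigmoid(z) else 0)

# Model 1: one coefficient per activity. Model 2: a single unweighted sum.
separate = fit_logistic(rows, y)
sum_rows = [[sum(r)] for r in rows]
summed = fit_logistic(sum_rows, y)

ll_separate = log_likelihood(separate, rows, y)
ll_summed = log_likelihood(summed, sum_rows, y)
print("separate coefficients:", [round(b, 3) for b in separate])
print("sum-only coefficients:", [round(b, 3) for b in summed])
print("log-likelihoods:", round(ll_separate, 1), round(ll_summed, 1))
```

The sum-only model is the separate-predictor model with all activity coefficients forced to be equal, so comparing the two fits is one way to check whether the equal-weight assumption is tenable. In SPSS the same comparison could be run with the logistic regression procedure, entering the activities either individually or as the computed total.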
Some activities may carry more subjective effort (apologizing for past sins) than others (greeting some passing acquaintance), and should be weighted accordingly. So I'd prefer treating each kind of activity as a separate predictor.

Hope this helps.

Hector

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Neda Faregh
Sent: 13 November 2008 14:13
To: [hidden email]
Subject: a logic question regarding survival analysis
