Hi all,
I'm running a stepwise regression to identify which organizational practices on construction projects predict project cost growth. I have data for 115 projects, yet some organizational practices were not applicable on some projects (in a random fashion). The missing data is obviously purposeful and not due to respondents failing to fill in questionnaires, etc. SPSS automatically excludes cases with any missing values, or wants to substitute a value, so I end up with a regression being carried out on 10 projects, which is obviously not useful. Any suggestions for syntax to include all cases, or suggestions to rectify this problem?
Thanks in advance,
Hunna Watson
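To make the shrinkage concrete, here is a minimal Python sketch of listwise deletion, the default behaviour described above. The project rows, predictor names, and ratings are all hypothetical and exist only to show the mechanism: every extra predictor with blanks can only reduce the number of complete cases.

```python
# Hypothetical ratings: one dict per project, None marks a practice that
# was not applicable. Names and values are illustrative, not real data.
projects = [
    {"incentives": 4, "site_mgmt": 3, "contract_docs": 5},
    {"incentives": None, "site_mgmt": 2, "contract_docs": 4},
    {"incentives": 3, "site_mgmt": None, "contract_docs": 1},
    {"incentives": None, "site_mgmt": None, "contract_docs": 2},
    {"incentives": 5, "site_mgmt": 4, "contract_docs": 3},
]


def complete_cases(rows, predictors):
    """Keep only rows with a value for every listed predictor (listwise deletion)."""
    return [r for r in rows if all(r[p] is not None for p in predictors)]


def listwise_n(rows, predictors):
    """Number of cases that survive listwise deletion."""
    return len(complete_cases(rows, predictors))


all_predictors = ["incentives", "site_mgmt", "contract_docs"]
for k in range(1, len(all_predictors) + 1):
    used = all_predictors[:k]
    print(used, "->", listwise_n(projects, used), "usable projects")
```

With many predictors and scattered blanks, the same logic can leave only a handful of complete cases out of the full sample.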
Hunna,
Your use of the stepwise method for regression, instead of entering all the variables at once, is immaterial to your problem. What you need is some way of dealing with projects where some organizational practice does not apply. You do not give many details about the variables, but I imagine that each organizational practice might be a dummy variable, either present or absent. In that case, you may posit its effect on costs not as a result of "choosing or not choosing it when it is adequate to choose it" but as a result of its mere presence. A project may benefit from a practice if (a) the practice is applicable and (b) it is actually used; otherwise the project does not benefit from that practice. The absence of a practice may thus be a result of deliberate choice or of the impossibility of applying it, but in either case its effect would not be observed. In other words, you may (if your particular situation affords this interpretation) treat the "missing" cases as negative instances, as zeroes in the dummies, and proceed with the regression. If this road is not conceptually adequate, you're in trouble.
Hector
-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Hunna Watson
Sent: 25 July 2007 10:59
To: [hidden email]
Subject: stepwise regression how to include all cases despite missing data
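The recoding suggested above can be sketched as follows. This is an illustration in Python, not SPSS syntax, and it assumes the dummy-variable setup Hector imagines: 1 means the practice was used, 0 means it was applicable but not used, and a blank (None) means it was not applicable. The project labels and variable names are hypothetical. In SPSS itself the same step would normally be a RECODE of system-missing values (SYSMIS) to 0.

```python
# Hypothetical dummy-coded practices; None marks "not applicable".
projects = [
    {"project": "A", "incentive_plan": 1, "design_review": 1},
    {"project": "B", "incentive_plan": None, "design_review": 0},
    {"project": "C", "incentive_plan": 0, "design_review": None},
]


def missing_to_zero(rows, predictors):
    """Return copies of the rows with None replaced by 0 for the given predictors.

    This treats "not applicable" as "practice absent", which is only valid
    if that interpretation makes sense for the data at hand.
    """
    return [
        {**row, **{p: (0 if row[p] is None else row[p]) for p in predictors}}
        for row in rows
    ]


recoded = missing_to_zero(projects, ["incentive_plan", "design_review"])
for row in recoded:
    print(row)
```

The function returns new rows rather than overwriting the originals, so the distinction between "not applicable" and "not used" is still available if the zero interpretation later turns out to be inadequate.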
Hector,
When I worked at ACT, Inc., we often treated the student's school identifier this way, usually with great success. Granted, we had many thousands of cases to draw from. Hunna only has 115? She is going to run out of cases pretty quickly, don't you think?
Mark A. Davenport Ph.D.
Senior Research Analyst
Office of Institutional Research
The University of North Carolina at Greensboro
336.256.0395
[hidden email]
'An approximate answer to the right question is worth a good deal more than an exact answer to an approximate question.' --a paraphrase of J. W. Tukey (1962)
Hector Maletta <[hidden email]>
Sent by: "SPSSX(r) Discussion" <[hidden email]>
07/25/2007 10:15 AM
Please respond to Hector Maletta <[hidden email]>
To: [hidden email]
Subject: Re: stepwise regression how to include all cases despite missing data
In reply to this post by Hector Maletta
Hunna:
Even with a scale, the "missing" responses can be reinterpreted as saying "This practice was not effective in this case because, for one reason or another, it was not used". This is not quite clean conceptually, but it is your only choice unless you put up with working with 10 cases. The problem, apparently, is in the design of the questionnaire, which asks for the effectiveness of a practice that is not in universal use among the cases under analysis. In any case, a practice cannot have any effectiveness if it is not used, so I insist you can treat it as having zero effectiveness when it was not used.
On the other hand, since your variables seem to be many and your cases few, perhaps you should consider a more artisanal approach to identifying effective strategies instead of your all-in-one regression. With 87 predictors and 115 cases you don't have a chance, even without a single missing value.
Hector
_____
From: Hunna Watson [mailto:[hidden email]]
Sent: 25 July 2007 11:36
To: Hector Maletta
Subject: FW: Re: stepwise regression how to include all cases despite missing data
Thanks for your reply. I've just come on board this project in the past two weeks, and the data has been collected already. This is essentially what happened, though I'm simplifying it: respondents rated how effective the use of the strategy was for preventing cost growth in the form of work that had to be done again on the project. So I have data on a scale and no possibility for coding absent or present :S
Extra information: yes, I know all the horrible things about stepwise, but it is the only suitable method I can think of to answer the research questions. The research is very exploratory and the topic hasn't been examined before. Data has been collected on many different predictor variables (design-related sources, subcontractor sources, site management sources, contract documentation, the list goes on and on, up to a terrible 87 predictors). There are 115 projects, so each is a case if you like, and we want to first look at this data set (no options there), but after that we can merge it with another data set containing information on a further 160 projects. Some predictors weren't relevant to some projects. For instance, some didn't use incentives. We have ratings on scales of 1 to 5 (assessing raters' perceptions of the contribution of use of that method to costs), and we are seeking to predict costs from the predictor variables. If a method wasn't applicable, e.g., use of a particular incentive plan, it has been left blank on questionnaires. No logical ordering.
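The sample-size concern can be put in numbers with a short Python sketch. The figures 115, 87, and 160 come from the thread. The helper names are hypothetical, and the residual degrees of freedom assume an ordinary least-squares model with an intercept and all predictors entered at once, which is a simplification of what a stepwise procedure would actually fit.

```python
def cases_per_predictor(n_cases, n_predictors):
    """Ratio of cases to predictors."""
    return n_cases / n_predictors


def residual_df(n_cases, n_predictors):
    """Residual degrees of freedom for OLS with an intercept: n - k - 1."""
    return n_cases - n_predictors - 1


n_predictors = 87
n_current = 115               # projects in the current data set
n_merged = n_current + 160    # after merging the further 160 projects

for label, n in [("current", n_current), ("merged", n_merged)]:
    ratio = cases_per_predictor(n, n_predictors)
    df = residual_df(n, n_predictors)
    print(f"{label}: n={n}, cases per predictor={ratio:.2f}, residual df={df}")
```

Even after the merge there are only about three cases per predictor, which is why screening the predictors in smaller, substantively motivated groups may be more realistic than a single regression on all 87.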
