I'm trying to choose between two ways of building a regression model, namely between two ways of representing my dependent variable.

Here is the model that I am building: I'm looking for a relationship between the number of people who registered for a specific program (the y-value) and how they found out about the program (TV, radio, newspaper; the x-values). My two choices are to use the actual number of registrants, or to use a new measure that I've created, a ratio of new registrants to the total number of ads they were exposed to.

My intuition tells me that since I am looking at several months' worth of data, where the number of available ads varied significantly, it makes more sense to look at the ratio rather than the number of registrants. Is this correct, or am I forcing x and y to be too correlated?

Thank you,
Alina Sheyman
Both approaches are worth a try. Exploratory analysis should help you see how strongly each predictor is correlated with the response variable and with the other predictors. Moreover, model diagnostics (checks of the basic assumptions, F-test, t-statistics, confidence intervals) will help you decide which model is more statistically robust.

Having said that, a third approach is possible. If you define the outcome as binary (participant vs. non-participant) as a function of the different ad exposures, you will get a probability of participation, which is also interesting. You could then sell the idea of who is more likely to enroll in the service. For this you could use logistic regression.

Fermin Ornelas, Ph.D.
Management Analyst III, AZ DES
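As a rough illustration of the exploratory step suggested above, here is a minimal Python sketch that correlates each ad channel with both candidate outcomes (the raw registrant count and the registrants-per-ad ratio). All figures and names (`ads`, `registrants`, `pearson`) are made up for illustration and are not the poster's data; in SPSS the same check could be run from the correlation and regression menus.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical monthly ad counts per channel (invented for illustration).
ads = {
    "tv": [10, 14, 8, 20, 16, 12],
    "radio": [5, 9, 4, 11, 7, 6],
    "newspaper": [3, 2, 6, 4, 5, 3],
}
# Hypothetical monthly registrant counts (candidate outcome 1).
registrants = [120, 150, 95, 210, 170, 130]

# Total ads per month, then candidate outcome 2: registrants per ad.
total_ads = [sum(month) for month in zip(*ads.values())]
ratio = [r / t for r, t in zip(registrants, total_ads)]

# Correlation of each channel with each candidate outcome.
correlations = {
    channel: (pearson(counts, registrants), pearson(counts, ratio))
    for channel, counts in ads.items()
}
for channel, (r_count, r_ratio) in correlations.items():
    print(f"{channel:10s} r(count)={r_count:+.2f}  r(ratio)={r_ratio:+.2f}")
```

Because the ratio has total ads in its denominator, its correlation with any single channel is partly built in by construction, so the two columns are best read side by side rather than treated as independent evidence.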
In reply to this post by Alina Sheyman-3
How many months are we talking about here? If there are more than 20, you may need to apply time series analysis, i.e. link monthly registrants to monthly ad volume, or to lagged monthly ad volume. That would address your concern that ad quantity varied significantly over time.

What is your unit of analysis, e.g. individual, mall, week, or month?
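To make the lagged-ad-volume idea concrete, here is a small Python sketch that pairs each month's registrants with the ad volume from `lag` months earlier and fits a simple least-squares line. The helper names (`lagged_pairs`, `ols_slope_intercept`) and all numbers are hypothetical, and this is only a first look: a proper time series analysis would also have to deal with autocorrelation and seasonality.

```python
def lagged_pairs(ad_volume, registrants, lag):
    """Pair registrants in month t with ad volume in month t - lag."""
    if len(ad_volume) != len(registrants):
        raise ValueError("series must have the same length")
    if lag < 0 or lag >= len(ad_volume):
        raise ValueError("lag must be between 0 and len(series) - 1")
    return list(zip(ad_volume[: len(ad_volume) - lag], registrants[lag:]))


def ols_slope_intercept(pairs):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical monthly series (invented for illustration only).
monthly_ads = [18, 25, 18, 35, 28, 21, 30, 24]
monthly_registrants = [120, 150, 95, 210, 170, 130, 190, 160]

# Compare same-month ads (lag 0) with last month's ads (lag 1).
for lag in (0, 1):
    slope, intercept = ols_slope_intercept(
        lagged_pairs(monthly_ads, monthly_registrants, lag)
    )
    print(f"lag {lag}: registrants = {intercept:.1f} + {slope:.2f} * ads")
```

Each extra month of lag drops one observation, which is one reason a short series limits how many lags can be examined.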