All,
I am looking for a bit of advice on sampling methodology for a project I am about to start. This is a fairly long post, so ignore it if it is not of interest.

I have been given a project in which people's online usage is being tracked. At a basic level, a user will visit ('v') a website and then either leave or click ('c') on an advert. This is called a campaign, and there may be many running at any one time. Users per campaign may vary from a few thousand to in excess of 1,000,000 for long-running campaigns. Each user is given a unique ID, is tracked via a cookie, and can take part in any number of campaigns.

I expect the distribution of people clicking on an advert to be approximately Poisson (or negative binomial? Any input would be helpful). Part of my analysis would be to run some variety of Poisson regression in order to estimate the click-through rate for a specific campaign.

I will also want to test whether there are significant differences between campaigns and advert types, based on click-through rate. I expect to start with a t-test to see whether there are significant differences from one campaign to another (click-through rate, c/v), and then build a multi-way ANOVA to check differences between campaigns and advert types and to explore the interactions.

So, to my question of sampling. In order to create a sound methodology for this project, I can keep all responses from all users for up to a 3-month period. For my findings to be valid, should I track a sample of users across campaigns for 3 months and then run all tests against this selection, in order to build up my representative sample? There will be some dropout as people delete cookies from time to time, but I hope the pool I am selecting from is large enough that this should not be of great concern (maybe 20% dropout, i.e. deleted cookies).

Or is it legitimate to sample a number of people at random from a particular campaign and compare them against a number of people sampled at random from another campaign? Would this be valid when running a Poisson regression to model click rate for a particular campaign over a 3-month period? Or would running an ARIMA model be my best choice?

Any help with this methodology would be gratefully appreciated.

Mike
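To make the campaign-to-campaign comparison concrete, here is a minimal sketch. It treats each visit as ending in a click or not, so c out of v is a binomial count, and it compares two campaigns with a two-proportion z-test. Note that this is a swapped-in technique, not the t-test described above, and it assumes visits are independent, which may not hold when the same user appears in several campaigns. The function name and all counts are made up for illustration.

```python
import math


def two_proportion_z_test(clicks_a, visits_a, clicks_b, visits_b):
    """Compare the click-through rates c/v of two campaigns (two-sided z-test)."""
    rate_a = clicks_a / visits_a
    rate_b = clicks_b / visits_b
    # Pooled rate under the null hypothesis that both campaigns share one CTR.
    pooled = (clicks_a + clicks_b) / (visits_a + visits_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / visits_a + 1 / visits_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p_value


# Hypothetical counts, purely for illustration.
rate_a, rate_b, z, p = two_proportion_z_test(120, 10_000, 90, 10_000)
print(f"CTR A = {rate_a:.4f}, CTR B = {rate_b:.4f}, z = {z:.2f}, p = {p:.4f}")
```

With campaigns of a few thousand to a million users, even small differences in CTR can come out as statistically significant, so the size of the difference is probably worth reporting alongside the p-value.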
Sorry, I should probably mention how I was going to get a continuous variable to run the ANOVA models: by dividing the sample for a particular campaign into 300 subgroups and working out the average click-through ratio within each. This would give me values for my between-group and within-group variation. Is this valid?

Mike

-----Original Message-----
From: Michael Pearmain
Sent: 24 October 2006 11:14
To: [hidden email]
Subject: Sampling Questioin
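The subgroup idea from the follow-up (splitting one campaign's users into 300 groups and taking the click-through ratio of each) can be sketched as below. This is only an illustration on simulated data: the 30,000 users, the 2% true click rate, the random round-robin assignment, and the helper name `subgroup_ctrs` are all assumptions, not details from the project.

```python
import random
import statistics


def subgroup_ctrs(clicked, n_groups=300, seed=0):
    """Shuffle per-user click flags (0/1) and return the CTR of each subgroup."""
    flags = list(clicked)
    random.Random(seed).shuffle(flags)
    # Deal users round-robin so that group sizes differ by at most one.
    groups = [flags[i::n_groups] for i in range(n_groups)]
    return [sum(group) / len(group) for group in groups]


# Simulated campaign: 30,000 users with an assumed true CTR of 2% (made up).
rng = random.Random(42)
clicked = [1 if rng.random() < 0.02 else 0 for _ in range(30_000)]

ctrs = subgroup_ctrs(clicked)
overall = sum(clicked) / len(clicked)
group_size = len(clicked) / len(ctrs)

print(f"overall CTR           = {overall:.4f}")
print(f"mean of subgroup CTRs = {statistics.mean(ctrs):.4f}")
print(f"sd of subgroup CTRs   = {statistics.stdev(ctrs):.4f}")
# Under random assignment the spread is roughly sqrt(p * (1 - p) / group size).
print(f"binomial sd expected  = {(overall * (1 - overall) / group_size) ** 0.5:.4f}")
```

When users are assigned to subgroups at random, the spread of the subgroup ratios is driven mainly by the chosen group size, so the within-group variation fed into the ANOVA depends on the arbitrary choice of 300. That may be worth weighing against modelling the click counts directly.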