Sampling Question

Sampling Question

Mike P-5
All,

I am looking for a bit of advice on sampling methodology for a project I
am about to start. This is a fairly long post, so ignore it if it is not
of interest.

I have been given a project in which people's online usage is being
tracked. At a basic level, they visit ('v') a website and then either
leave or click ('c') on an advert. This is called a campaign, and there
may be many running at any one time. (Users per campaign may vary from a
few thousand to in excess of 1,000,000 for long-running campaigns.)

Each user is given a unique ID, is tracked via a cookie, and can take
part in any number of campaigns.

I expect the distribution of people clicking on an advert to be
approximately Poisson (or negative binomial? Any input would be
helpful). Part of my analysis would be to run some variety of Poisson
regression in order to estimate the click-thru rate for a specific
campaign.
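
As a rough illustration of the Poisson versus negative binomial question, one common first check is to compare the variance of the per-user click counts with their mean: a Poisson distribution has variance equal to its mean, so a ratio well above 1 hints at overdispersion and a negative binomial model. The sketch below is only that, a sketch; the function name `dispersion_index` and the example counts are made up for illustration and are not data from this project.

```python
from statistics import mean, variance


def dispersion_index(clicks_per_user):
    """Return the sample variance divided by the sample mean.

    Roughly 1 is consistent with a Poisson model; values well above 1
    suggest overdispersion (negative binomial may fit better).
    """
    m = mean(clicks_per_user)
    if m == 0:
        raise ValueError("mean is zero; dispersion index is undefined")
    return variance(clicks_per_user) / m


# Hypothetical per-user click counts for one campaign (made-up numbers).
example_counts = [0, 0, 1, 0, 2, 0, 0, 5, 0, 1]
ratio = dispersion_index(example_counts)
print(f"variance / mean = {ratio:.2f}")
```

This is a descriptive check only; a formal comparison would fit both models and compare them.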


I will also want to test whether there are significant differences
between campaigns and advert types (based on click-thru rate).

I expect to start with a t-test to see whether there are significant
differences from one campaign to another (click-thru rate, c/v), and then
build a multi-way ANOVA to check differences between campaigns and advert
types based on click-thru rate and to explore the interactions.
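
To make the campaign-to-campaign comparison concrete, here is a minimal sketch. Note that it uses a two-proportion z-test rather than the t-test mentioned above; that is a substitution on the assumption that c/v is a proportion of independent visits, which may not hold if the same users appear in both campaigns. The function name `two_proportion_z` and the counts are hypothetical.

```python
import math


def two_proportion_z(c1, v1, c2, v2):
    """Two-sided z-test for a difference in click-thru rate (c/v).

    c1, c2 are click counts and v1, v2 are visit counts for two campaigns.
    Returns (z statistic, two-sided p-value) using the pooled rate.
    """
    p1, p2 = c1 / v1, c2 / v2
    pooled = (c1 + c2) / (v1 + v2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / v1 + 1 / v2))
    if se == 0:
        raise ValueError("pooled rate is 0 or 1; the test is undefined")
    z = (p1 - p2) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 100 clicks in 1000 visits vs 50 clicks in 1000 visits.
z, p = two_proportion_z(100, 1000, 50, 1000)
print(f"z = {z:.2f}, p = {p:.5f}")
```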

So, to my question on sampling: in order to create a sound methodology
for this project, I can keep all responses from all users for up to a
3-month period.

In order for my findings to be valid, should I track a sample of users
across campaigns for 3 months and then run all tests against this
selection, in order to build up my representative sample? There will be
some drop-out as people delete cookies from time to time, but I hope that
the pool I select from is large enough that this should not be of great
concern (maybe 20% drop-out, i.e. deleted cookies).

OR

Is it legitimate to sample a number of people at random from a
particular campaign and compare them against a random sample of people
from another campaign? Would this be valid when running a Poisson
regression to model the click rate for a particular campaign over a
3-month period? Or would running an ARIMA model be my best choice?
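
For the second option, drawing a simple random sample of users from each campaign could look something like the sketch below. It assumes each campaign's user IDs are available as a list; `sample_users`, the ID format, and the sample size are all made up for illustration. It does not address whether the comparison itself is valid, for example when the same user appears in both campaigns.

```python
import random


def sample_users(user_ids, n, seed=None):
    """Draw n distinct user IDs uniformly at random, without replacement.

    Passing a seed makes the draw reproducible for later re-analysis.
    """
    rng = random.Random(seed)
    return rng.sample(list(user_ids), n)


# Hypothetical campaigns with made-up user IDs.
campaign_a = [f"user{i}" for i in range(5000)]
campaign_b = [f"user{i}" for i in range(3000, 9000)]

sample_a = sample_users(campaign_a, 500, seed=1)
sample_b = sample_users(campaign_b, 500, seed=2)

# Users who ended up in both samples are not independent observations.
overlap = set(sample_a) & set(sample_b)
print(f"{len(overlap)} users appear in both samples")
```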

Any help with this methodology would be gratefully received.

Mike

______________________________________________________________________
This email has been scanned by the MessageLabs Email Security System.
For more information please visit http://www.messagelabs.com/email
______________________________________________________________________

Re: Sampling Question

Mike P-5
Sorry, I should probably mention how I was going to get a continuous
variable for the ANOVA models: by dividing the sample for a particular
campaign into 300 subgroups and working out the average click-thru ratio
for each, thus giving me values for my between-group and within-group
variation.

Is this valid?
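
The subgroup idea described above can be sketched as follows, assuming each visit is recorded as 1 (click) or 0 (no click) and that visits are assigned to subgroups at random. The function name `subgroup_rates` and the example data are hypothetical, and the sketch only shows the mechanics, not whether the approach is statistically sound.

```python
import random


def subgroup_rates(outcomes, n_groups=300, seed=None):
    """Randomly split 0/1 click outcomes into n_groups subgroups.

    Returns the click-thru ratio (clicks / visits) of each subgroup.
    """
    if len(outcomes) < n_groups:
        raise ValueError("need at least one visit per subgroup")
    shuffled = list(outcomes)
    random.Random(seed).shuffle(shuffled)
    # Deal the shuffled visits round-robin so group sizes differ by at most 1.
    groups = [shuffled[i::n_groups] for i in range(n_groups)]
    return [sum(g) / len(g) for g in groups]


# Hypothetical campaign: 6000 visits, 300 of which ended in a click.
visits = [1] * 300 + [0] * 5700
rates = subgroup_rates(visits, n_groups=300, seed=42)
print(f"{len(rates)} subgroup ratios, overall mean {sum(rates) / len(rates):.3f}")
```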

Mike

-----Original Message-----
From: Michael Pearmain
Sent: 24 October 2006 11:14
To: [hidden email]
Subject: Sampling Questioin
