Channel: Machine Learning

An interesting data set, and I want your opinion

(Hope this is appropriate for this subreddit) Hey fellow redditors,

tl;dr: I have a data set for a marketing problem and I would like feedback on which models and overall direction would yield the most useful insight. Beyond the model choice itself, I am also interested in the reasoning behind it and any trade-offs.

Objective: To determine the effect of media type (Radio, Television, Cable) on online performance. For our purposes, online performance is a transaction count.

I have a "media spend" data set which looks like this:

  • Market: El Paso, Tx
  • Media: Radio
  • Commercial Code: (the commercial itself)
  • Date aired: to the minute
  • Length: Length of Ad

and an "online spend" data set which looks like this:

  • Market: El Paso, Tx
  • Date: to the hour
  • Visits: the number of visits the site received from El Paso in this hour
  • Bounces: the number of bounces the site received from El Paso in this hour
  • Count: our performance metric by hour by market
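Since the airings are recorded to the minute and the online data to the hour, the two sets have to be brought onto a common market-by-hour grid before any modelling. A minimal sketch of that alignment, assuming hypothetical field names (`aired`, `hour`, `visits`, `count`) and made-up rows, not the real schema or data:

```python
from collections import Counter
from datetime import datetime

# Made-up rows for illustration; field names and dates are assumptions.
airings = [
    {"market": "El Paso, Tx", "media": "Radio", "aired": "2015-03-02 14:07"},
    {"market": "El Paso, Tx", "media": "Radio", "aired": "2015-03-02 14:52"},
    {"market": "El Paso, Tx", "media": "Television", "aired": "2015-03-02 15:30"},
]
online = [
    {"market": "El Paso, Tx", "hour": "2015-03-02 14:00", "visits": 120, "count": 6},
    {"market": "El Paso, Tx", "hour": "2015-03-02 15:00", "visits": 150, "count": 9},
    {"market": "El Paso, Tx", "hour": "2015-03-02 16:00", "visits": 90, "count": 3},
]


def hour_key(stamp):
    """Truncate a 'YYYY-MM-DD HH:MM' timestamp to the start of its hour."""
    t = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
    return t.replace(minute=0)


# Count airings per (market, hour, media type).
ads_per_hour = Counter(
    (a["market"], hour_key(a["aired"]), a["media"]) for a in airings
)

# Attach one ad-count column per media type to every hourly online row.
media_types = ["Radio", "Television", "Cable"]
panel = []
for row in online:
    key = (row["market"], hour_key(row["hour"]))
    merged = dict(row)
    for m in media_types:
        merged[m.lower() + "_ads"] = ads_per_hour[key + (m,)]
    panel.append(merged)
```

Each row of `panel` is then one market-hour with the transaction count, the visits, and the number of airings per media type in that hour.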

Nuances about the data: there are 27 markets and we are interested in the effects per market. Some of the campaigns are designated as Hispanic (target audience). I will probably update this section as questions come in.

I have more data, but I am interested in this "core" set of features. Because the outcome is a count, I immediately went for a Poisson regression. I chose this route because I wanted to control for market size (Dallas > El Paso), and I could do that with the exposure term in the regression model (I used visits as the exposure). Above all, I wanted a simple, clean, interpretable solution that represents how well each advertisement did in its market. I also converted the count into a binary outcome by hour (was there a purchase or not) and used logit/probit regression to estimate how effective an ad was at producing a purchase, but I feel this method is subpar in terms of insight.
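To make the exposure idea concrete: with visits as exposure, the model is log E[count] = log(visits) + b0 + b1 * ad, so exp(b1) reads as a rate ratio (transactions per visit with an ad versus without). Below is a minimal sketch that fits this by Newton's method on made-up numbers with a single hypothetical binary "ad aired this hour" covariate. It is only meant to show where the offset enters, not to replace a GLM routine; in R, for example, `glm` accepts an `offset(log(visits))` term in the formula with `family = poisson`.

```python
import math

# Toy hourly rows (made up): (visits, aired_ad, transactions).
rows = [
    (100, 0, 5), (200, 0, 10),    # hours without an ad: 15 transactions / 300 visits
    (100, 1, 10), (200, 1, 20),   # hours with an ad:    30 transactions / 300 visits
]


def fit_poisson_offset(rows, iterations=25):
    """Fit log E[count] = log(visits) + b0 + b1 * ad by Newton's method."""
    total_y = sum(y for _, _, y in rows)
    total_v = sum(v for v, _, _ in rows)
    b0, b1 = math.log(total_y / total_v), 0.0  # start at the pooled rate
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for v, ad, y in rows:
            # The offset log(visits) enters with a fixed coefficient of 1.
            mu = v * math.exp(b0 + b1 * ad)
            g0 += y - mu            # gradient w.r.t. b0
            g1 += (y - mu) * ad     # gradient w.r.t. b1
            h00 += mu               # entries of the (negated) Hessian
            h01 += mu * ad
            h11 += mu * ad * ad
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H * step = gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


b0, b1 = fit_poisson_offset(rows)
rate_ratio = math.exp(b1)  # transactions per visit, ad hours vs. no-ad hours
```

On this toy data the baseline rate is 15/300 and the ad-hour rate is 30/300, so the fitted rate ratio should come out at 2 regardless of how many visits each hour had, which is the point of the offset.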

I am worried about the time component, though. Is this approach correct, given that someone who sees an ad will most likely not buy that instant, but later? I started down the dynlm (R package) route to look at lagged effects, but felt hesitant about that approach. I couldn't get a Poisson regression with exposure working quickly there, so I wanted to avoid a rabbit hole if there is a better alternative. It's also worth noting that in popular markets ads would often air at all hours of the day, and quite frequently.
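One way to let an ad's effect show up in later hours without a dedicated time-series package is to add lagged ad counts as extra regressors (a distributed-lag design) and feed them to the same count model. A minimal sketch of building those columns, assuming a hypothetical hourly ad-count series for one market and one media type; the maximum lag of 2 is arbitrary and the numbers are made up:

```python
def lagged_features(ad_counts, max_lag):
    """For each hour t, return [ads at t, ads at t-1, ..., ads at t-max_lag].

    Hours before the start of the series are treated as having zero ads,
    which is an assumption; dropping the first max_lag hours is an alternative.
    """
    features = []
    for t in range(len(ad_counts)):
        row = [ad_counts[t - k] if t - k >= 0 else 0 for k in range(max_lag + 1)]
        features.append(row)
    return features


# Made-up hourly ad counts for one market and one media type.
radio_ads = [2, 0, 0, 1, 0]
X = lagged_features(radio_ads, max_lag=2)
```

Each column then gets its own coefficient, so the model can put the effect at lag 1 or 2 rather than in the airing hour. In markets that advertise nearly every hour, these columns will be highly correlated, so the individual lag coefficients may be poorly identified there.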

Is it safe to use this kind of model even though I am technically violating the timing assumption noted above (that an ad's effect lands in the hour it airs)? What direction or models would make sense for this particular problem (estimating how effective each media type is)?

submitted by 139385703
