“How do we know if the model is right?” That’s one of the most important questions in MMM. MMM is different from a lot of other ML/AI applications. There are plenty of prediction problems that machine learning is very good at and that are easy to validate. For example: predicting what someone is most likely to purchase given the products in their basket. You can test whether algorithm A or algorithm B drove more purchases and roll out the winner. Unfortunately, marketing mix modeling doesn’t work that way.

In MMM, we don’t just care about predictive accuracy. We care about getting a correct read on the true incrementality of our marketing dollars. If we spend an extra $10,000 on a channel like Meta, how many additional conversions will that drive, regardless of what tracking reports? No one can go look up that number. That is the core problem we’re up against. So when we think about how to validate an MMM, here’s how we do it:

1 - Start with a causal model based on marketing science. When doing causal inference, you always need to start with theory. We start with a causal diagram outlining how marketing scientists believe marketing works in the real world, and we use that to determine our statistical model.

2 - Parameter recovery. If we generate a data set where we know the true Facebook marketing performance (because we created the data), can we run that simulated data through the model and get results back that match the truth?

3 - Compare model results against the truth where we have it. If we ran a lift test or some other high-quality experiment, we know the true parameters at that point in time, and we can check that the model’s results are close to them.

4 - Look for stability over time on different subsets of data. Next, we go back and run the model on different subsets of data (e.g., successively cutting off the last weeks of data). If we use different slices of the data, are the parameter estimates coming out of the model stable? If they’re not, we have reason to believe the model isn’t well specified and the parameters are randomly jumping around as we grab different slices of data.

5 - Look at out-of-sample forecasting accuracy. The theory here is that if we have truly captured the underlying causal model, we should be able to make accurate forecasts on data we haven’t observed yet. In fact, we should be able to make these accurate forecasts even when the marketing budget and channel mix change. So, when we predict the next 30 days, can we make that prediction accurately? Can we do that over and over again for the last 6 months? This is a necessary but not sufficient condition for having a good model: just because you can do out-of-sample prediction doesn’t mean you have good reads on incrementality, which is the thing we actually care about, and why we do the other steps in this process.
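Steps 2 and 4 are the easiest to show in code, so here is a minimal sketch in Python of what they can look like in practice, assuming a deliberately simplified linear MMM with two invented channels and no adstock, saturation, or seasonality. The channel names, coefficients, and noise levels are placeholders of mine, not anything from the post.

```python
# Minimal sketch of parameter recovery (step 2) and stability checks (step 4)
# on a toy linear MMM. Assumptions: two made-up channels, no adstock/saturation.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n_weeks = 156

# Simulated weekly spend for two invented channels.
tv = rng.gamma(2.0, 5_000, n_weeks)
search = rng.gamma(2.0, 3_000, n_weeks)

# KNOWN ground truth, because we are generating the data ourselves.
true_intercept, true_tv, true_search = 50_000, 1.8, 3.2
revenue = (true_intercept
           + true_tv * tv
           + true_search * search
           + rng.normal(0, 10_000, n_weeks))

X = sm.add_constant(np.column_stack([tv, search]))

# Step 2: parameter recovery -- do the estimates match the truth we baked in?
fit = sm.OLS(revenue, X).fit()
print("recovered:", np.round(fit.params, 2),
      "truth:", [true_intercept, true_tv, true_search])

# Step 4: stability -- refit on progressively shorter windows
# (drop the last 4, 8, ... 24 weeks) and watch whether the estimates stay put.
for cutoff in range(0, 25, 4):
    end = n_weeks - cutoff
    f = sm.OLS(revenue[:end], X[:end]).fit()
    print(f"drop last {cutoff:2d} weeks -> tv={f.params[1]:.2f}, search={f.params[2]:.2f}")
```

If the recovered coefficients land far from 1.8 and 3.2, or swing around as recent weeks are dropped, that is the model telling you something is misspecified.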
How to Overcome Statistical Challenges in Marketing Mix Modeling
Summary
Marketing mix modeling (MMM) helps businesses understand the effectiveness of their marketing efforts across different channels by analyzing data to estimate the impact of each one. Overcoming the statistical challenges in MMM means addressing issues like multicollinearity and outliers, and ensuring accurate predictions through rigorous testing and validation frameworks.
- Validate your model: Start with a solid causal model based on marketing theory, test it with simulated data, and compare results against known truths to ensure accuracy.
- Account for outliers: Recognize that even a small number of outlier data points can significantly distort insights, especially when combined with correlated variables, and take steps to identify and mitigate their impact.
- Utilize Bayesian methods carefully: Use prior predictive checks to refine your assumptions and align them with your understanding of marketing dynamics, while remaining flexible to adapt your priors when necessary.
Even just 1% of outlier observations can dramatically distort data science models, especially in media mix models (MMM), where multicollinearity is a common problem. If you’re building marketing models and want to make reliable business decisions, it’s critical to understand how model identification can be driven by even a small number of quirky, outlier observations, especially in the face of deep multicollinearity (correlated media inputs).

In this tutorial, I walk through:
✅ Generating synthetic data with known ground truth (revenue as a function of Google Ads and Social Media Ads)
✅ Running a baseline linear regression to verify the true relationship
✅ Adding just 1% of outliers (high Google / low Social, or vice versa) and showing the impact
✅ Introducing multicollinearity to demonstrate how the distortion becomes even more severe

🎥 Full walkthrough: https://lnkd.in/exXyr9JJ
#datascience #marketinganalytics #mediamixmodel #MMM #causalinference
[Video] Even scarce outliers can wreck your Media Mix Model (MMM), especially with multicollinearity
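The tutorial’s notebook lives behind the links above. As a rough, self-contained sketch of the same idea (my own synthetic numbers and contamination scheme, not the author’s), the experiment can be approximated like this:

```python
# Rough reconstruction of the experiment described above: correlated channel
# spend, a known ground-truth model, and 1% contamination with
# high-Google / low-Social rows. All numbers are illustrative placeholders.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 1_000

# Correlated media inputs (multicollinearity): Social spend tracks Google spend.
google = rng.gamma(2.0, 1_000, n)
social = 0.8 * google + rng.normal(0, 200, n)

# Known ground truth: revenue = 2*Google + 1*Social + noise.
revenue = 2.0 * google + 1.0 * social + rng.normal(0, 1_000, n)

def channel_coefs(g, s, y):
    """Fit OLS and return the two channel coefficients (intercept dropped)."""
    X = sm.add_constant(np.column_stack([g, s]))
    return sm.OLS(y, X).fit().params[1:]

print("baseline estimates:", np.round(channel_coefs(google, social, revenue), 2))

# Contaminate just 1% of rows: extreme Google spend, near-zero Social spend,
# and revenue that does NOT follow the true model.
k = n // 100
idx = rng.choice(n, size=k, replace=False)
google_c, social_c, revenue_c = google.copy(), social.copy(), revenue.copy()
google_c[idx] = google.max() * 3
social_c[idx] = 0.0
revenue_c[idx] = revenue.mean()

print("with 1% outliers:  ", np.round(channel_coefs(google_c, social_c, revenue_c), 2))
```

Because Google and Social move together, those ten contaminated rows are high-leverage points, and in runs of this sketch they can pull the Google estimate well away from the true value of 2.0 while the Social coefficient shifts to compensate.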
Bayesian solves many problems for marketing mix models. It also creates a few (TANSTAAFL, and all that). Perhaps the most obvious is that the priors selected matter, and selecting priors on MMM parameters (lag, carryover, coefficients) is not always very intuitive for stakeholders. Many Bayesian MMM practitioners will use relatively vague priors as a workaround, but this reduces (in my opinion, substantially reduces) the value Bayesian methods bring to MMM. It is also a bit of a cheat, in that sign-constrained effect estimates are actually a very informative prior but are often treated as 'uninformative.'

The best practice for using informative priors on those unintuitive parameters is a prior predictive simulation, where priors are selected and then model fitted values are generated from samples pulled from the prior distribution, without considering the likelihood of the observed data at all. pymc-marketing put together a nice notebook example of this recently (link in the comments).

But this does put Bayesian MMM in a bit of a pickle: if I set a prior based on my current knowledge, then run the prior predictive check and decide my prior must be wrong, can I adjust my prior? It's not really a prior belief anymore, in that I guess I didn't really believe it in the first place, yes?

I take this to be a reflection of the lack of intuition around how transformations and coefficients combine to create predictions; it's hard to keep track of the effects of adstock and saturation across many drivers. But I think it's also true that I don't actually have a prior belief about the carryover parameter. I have a prior belief about how long marketing takes to work, but not about the carryover parameter itself.

So, in my mind, the prior predictive check is a conversation I have between my understanding/beliefs about how marketing works AND the arithmetic of the model, to help me specify prior distributions that match my beliefs . . . and predictive performance (a bit). Sometimes that conversation can bring value to marketing decisions in and of itself, so put another check in the 'pro' column for Bayesian marketing mix IF you use prior predictive checks.
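To make that "conversation" concrete, here is a minimal prior predictive sketch in plain NumPy rather than the pymc-marketing example the post points to; the spend series, priors, and transformations are placeholders I picked for illustration.

```python
# A minimal prior predictive simulation for one channel, in plain NumPy
# (illustrative only -- not the pymc-marketing example referenced above).
# Draw adstock/saturation/coefficient parameters from candidate priors,
# push spend through the implied transformations, and look at the revenue
# curves those priors imply BEFORE the likelihood of the data is involved.
import numpy as np

rng = np.random.default_rng(0)
n_weeks, n_draws = 104, 500
spend = rng.gamma(2.0, 5_000, n_weeks)          # hypothetical weekly spend

def geometric_adstock(x, alpha, l_max=8):
    """Carryover: weighted sum of the last l_max weeks with geometric decay."""
    w = alpha ** np.arange(l_max)
    return np.convolve(x, w)[: len(x)]

def hill_saturation(x, lam):
    """Diminishing returns: maps spend into (0, 1)."""
    return x / (x + lam)

implied_revenue = np.empty((n_draws, n_weeks))
for d in range(n_draws):
    alpha = rng.beta(2, 5)                      # prior on carryover decay
    lam = rng.lognormal(np.log(10_000), 0.5)    # prior on half-saturation spend
    beta = rng.lognormal(np.log(50_000), 0.7)   # prior on channel effect size
    baseline = rng.normal(200_000, 30_000)      # prior on non-marketing baseline
    implied_revenue[d] = baseline + beta * hill_saturation(
        geometric_adstock(spend, alpha), lam
    )

# The "conversation": do these implied weekly revenue ranges look like a
# business I recognize? If not, revisit the priors (or the transformations).
lo, med, hi = np.percentile(implied_revenue, [5, 50, 95], axis=0)
print("typical weekly revenue implied by the priors:")
print(f"  5th pct ~ {lo.mean():,.0f}, median ~ {med.mean():,.0f}, 95th pct ~ {hi.mean():,.0f}")
```

If the implied revenue bands look nothing like a business you recognize, that is the cue to revisit the priors (or the transformations) before the observed data ever enters the picture.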