The biggest blocker to running an accurate media mix model (MMM) isn't the modeling. It's messy data.

To run an MMM you need:
- 2-3 years of daily spend data from ALL channels
- Conversion data (revenue, leads, transactions)
- Basic external factors (seasonality, promotions, etc.)

Yes, you can have more inputs, but those are the basics. The model itself is pretty straightforward too if you're slightly technical and willing to read the manual: PyMC, Meta's Robyn, and Google's Meridian are all open source and well-documented. But a lack of clean data is what will grind things to a halt.

The most common issues we see when running MMMs for mid-market and enterprise brands:

1. Missing data
- "We switched platforms last year and lost the exports"
- "We did a big one-off media buy, can't find the amount"
- "Our old agency has the Facebook data but won't share it"

2. Bad data structure
We want to be able to break down spend by channel, tactic, and funnel stage (at minimum). Your campaigns/ad sets/ads should have a consistent structure. Instead we see:
- Campaign names like "FB_Prosp_Q1" vs "Meta_Cold_Audience"
- Conversions tracked differently across sales channels
- No way to separate branded vs non-branded search
- Search/Display/PMax lumped into one big "Google" bucket

3. Access issues
- No one person or team has access to all the data or platform logins
- Data scattered across 15+ platforms
- Nobody knows who owns what

So, if you're thinking about running MMM, start by cleaning up and finding all your data. Otherwise, you'll just be paying agencies to organize it for you (if it's even possible).

Quick steps to avoid MMM data delays:
- Create a Google Sheet that lists all your media platforms (current and past)
- Note who in your org has access
- Audit campaign names and standardize them if needed (see the sketch after this post)
- Start exporting this data (spend, conversions, revenue)
- Automate the export to a warehouse (if you have the tech/know-how)

What's your biggest MMM data challenge? Drop it in the comments. We've run models for hundreds of mid-market and enterprise brands. By now, we've seen it all (well, most of it).
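As a concrete illustration of the campaign-name standardization step above, here is a minimal Python sketch that maps inconsistent campaign names onto a channel/tactic taxonomy and rolls spend up to the grain an MMM typically expects. The mapping rules, column names, and example values are assumptions for illustration, not a prescription from the post.

```python
import re
import pandas as pd

# Hypothetical mapping from naming fragments to a consistent taxonomy.
# Extend these patterns to cover your own historical campaign names.
CHANNEL_RULES = [
    (r"\b(fb|meta|facebook)\b", "meta"),
    (r"\b(google|pmax|search|display)\b", "google"),
    (r"\b(tt|tiktok)\b", "tiktok"),
]
TACTIC_RULES = [
    (r"\b(prosp|cold|acquisition)\b", "prospecting"),
    (r"\b(rtg|retarget|warm)\b", "retargeting"),
    (r"\b(brand|branded)\b", "branded"),
]

def classify(name: str, rules) -> str:
    """Return the first taxonomy label whose pattern matches the campaign name."""
    normalized = re.sub(r"[^a-z0-9]+", " ", name.lower())  # treat _ and - as separators
    for pattern, label in rules:
        if re.search(pattern, normalized):
            return label
    return "unmapped"  # flag for manual review

# Example: standardize a raw spend export (columns assumed: date, campaign, spend).
raw = pd.DataFrame({
    "date": ["2024-01-01", "2024-01-01"],
    "campaign": ["FB_Prosp_Q1", "Meta_Cold_Audience"],
    "spend": [1200.0, 800.0],
})
raw["channel"] = raw["campaign"].map(lambda c: classify(c, CHANNEL_RULES))
raw["tactic"] = raw["campaign"].map(lambda c: classify(c, TACTIC_RULES))

# Aggregate to the channel x tactic x day grain most MMM tools work from.
tidy = raw.groupby(["date", "channel", "tactic"], as_index=False)["spend"].sum()
print(tidy)
```

Anything that lands in the "unmapped" bucket is a candidate for manual review before the data goes anywhere near a model.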
Tips for Navigating Media Mix Modeling Challenges
Summary
Media mix modeling (MMM) helps businesses understand how various marketing channels contribute to outcomes like sales or leads. However, common challenges like messy data, inaccurate assumptions, or lack of data variation can hinder its accuracy and usefulness for decision-making.
- Start with clean data: Ensure you have 2-3 years of complete and well-structured historical data, including spend, conversions, and external factors like seasonality. Missing or fragmented data can derail the entire process.
- Validate with real-world insights: Don’t rely solely on statistical outputs; cross-check your model’s results with domain expertise and practical benchmarks to ensure they make sense.
- Create meaningful variations: Avoid flat or steady spending patterns in key channels, and use experiments like holdout tests to generate the necessary data fluctuations for more accurate modeling.
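To make the "flat or steady spending" point concrete, here is a small Python sketch that flags channels whose weekly spend barely varies and therefore gives the model little signal to work with. The column layout and the 0.15 cutoff are illustrative assumptions.

```python
import pandas as pd

def flag_flat_channels(spend: pd.DataFrame, cv_threshold: float = 0.15) -> pd.DataFrame:
    """Flag channels with low spend variation.

    Expects one column per channel and one row per week (illustrative layout).
    The coefficient of variation (std / mean) is a simple proxy for how much
    signal a channel's spend history carries; the 0.15 cutoff is an assumption,
    not a universal rule.
    """
    stats = pd.DataFrame({
        "mean_spend": spend.mean(),
        "std_spend": spend.std(),
    })
    stats["cv"] = stats["std_spend"] / stats["mean_spend"]
    stats["flat"] = stats["cv"] < cv_threshold
    return stats.sort_values("cv")

# Example with made-up weekly spend for three channels.
weekly_spend = pd.DataFrame({
    "meta": [10_000, 10_100, 9_900, 10_050],         # nearly constant -> weak signal
    "google_search": [8_000, 12_000, 5_000, 9_500],  # varied -> usable signal
    "tiktok": [2_000, 2_000, 2_000, 2_000],          # perfectly flat -> no signal
})
print(flag_flat_channels(weekly_spend))
```

Channels flagged here are the ones where a holdout or incrementality test is worth running before you trust any model-based attribution.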
-
I made my media mix model lie, and then I made it lie again.

My PyMC-based MMM had beautiful R-squared scores and impressive MAPEs. It even nailed the train-test splits. But guess what? The results were still completely misleading.

How could I tell? Because the outputs failed the sniff test. Channels known from real-world experience to drive revenue weren't showing up as impactful, and some minor channels were inflated beyond reality. Good-looking statistical measures don't guarantee an accurate reflection of your marketing reality, especially if your data isn't telling the whole story.

Here's what actually went wrong: my model lacked enough meaningful variation, or "signal", in key marketing channels. Without clear fluctuations in spend and impressions, even sophisticated Bayesian models like PyMC can't accurately infer each channel's true incremental impact. They end up spreading credit randomly or based on spurious correlations.

Here's what I do differently now: I always start client engagements with a signal audit. Specifically, this means:
- Reviewing historical spend patterns and ensuring sufficient spend variation across weeks or regions.
- Checking for collinearity between channels (e.g., Google branded and non-branded search), which can cause misleading attribution (see the sketch after this post).
- Identifying channels stuck in "steady state" spending; these need deliberate experimentation to create fluctuation.

Once the audit flags weak-signal channels, I run deliberate, controlled lift tests (such as holdout tests or incrementality experiments) to create the necessary data variation.

Only after these signal issues are fixed and lift tests integrated do I trust the model:
- I feed the experimental data into the model.
- I validate the model against domain knowledge, sanity-checking contributions with known benchmarks and incrementality test results.
- And only then do I let the model drive budgeting and channel allocation decisions.

Bottom line: great statistical fit isn't enough. Your model must pass both statistical tests and practical, real-world "sniff tests."
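As an illustration of the collinearity check in that signal audit, here is a minimal Python sketch that lists channel pairs whose spend series move together. The channel names, made-up data, and the 0.8 correlation cutoff are assumptions for the example, not values from the post.

```python
import numpy as np
import pandas as pd

def correlated_channel_pairs(spend: pd.DataFrame, threshold: float = 0.8) -> pd.DataFrame:
    """List channel pairs whose spend series are highly correlated.

    Highly correlated inputs (e.g., branded and non-branded search that always
    move together) make it hard for any regression-style MMM to separate their
    individual contributions. The 0.8 threshold is an illustrative assumption.
    """
    corr = spend.corr()
    cols = corr.columns
    pairs = []
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):
            r = corr.iloc[i, j]
            if abs(r) >= threshold:
                pairs.append({"channel_a": cols[i], "channel_b": cols[j], "corr": round(r, 2)})
    return pd.DataFrame(pairs)

# Example with made-up weekly spend: branded search tracks non-branded almost perfectly.
rng = np.random.default_rng(0)
non_branded = rng.uniform(5_000, 15_000, size=52)
weekly_spend = pd.DataFrame({
    "search_non_branded": non_branded,
    "search_branded": non_branded * 0.3 + rng.normal(0, 100, size=52),
    "meta": rng.uniform(2_000, 8_000, size=52),
})
print(correlated_channel_pairs(weekly_spend))
```

Pairs surfaced here are candidates for either merging into one variable or separating with a deliberate experiment, since the data alone cannot tell them apart.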
-
10 common mistakes I've seen MMM consultants make:

1. Measuring only in-sample accuracy, which encourages over-fitting and bad modeling practices. Modelers should focus on predictive accuracy on data never seen before (see the sketch after this list).
2. Assuming that marketing performance doesn't change over time. No one believes that marketing performance is constant over time, so why make that assumption?
3. Assuming seasonality is additive and doesn't interact with marketing performance. This will generate nonsensical results, like telling you to advertise sunscreen in the winter and not in the summer!
4. Using automated variable selection to account for multicollinearity. Automatic variable selection methods (including ridge regression and LASSO) don't make sense for MMM since they will "randomly" choose one of two correlated variables to get all of the credit.
5. Assuming that promotions and holidays are independent of marketing performance, rather than directly impacted by it.
6. Using hard-coded long-time-shift variables to account for "brand effects" that aren't actually based in reality. By "assuming" long time shifts for certain channels, they can force the model to assign far too much credit to that channel.
7. Allowing the analyst/modeler to make too many decisions that influence the final results. If the modeler is choosing adstock rates and which variables to include in the model, then your "final" model will not show you the true range of possibilities compatible with your data.
8. Assuming channels like branded search and affiliates are independent of other marketing activity rather than driven by it.
9. Only updating infrequently to avoid accountability: if your results are always out of date, then no one can hold the model accountable.
10. Forcing the model to show results that stakeholders want to hear instead of what they need to hear. With a sufficiently complex model, you can make the results say anything. Unfortunately, this doesn't help businesses actually improve their marketing spend.
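To make mistake 1 concrete, here is a minimal Python sketch of a time-based holdout check: fit on the earlier part of the series, then score on the most recent weeks the model never saw. A plain linear regression stands in for the MMM here, and the made-up data, column names, and 13-week holdout length are all illustrative assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Illustrative weekly data: media spend columns plus revenue (all made up).
rng = np.random.default_rng(1)
n_weeks = 104
data = pd.DataFrame({
    "meta": rng.uniform(5_000, 20_000, n_weeks),
    "google": rng.uniform(8_000, 25_000, n_weeks),
    "tv": rng.uniform(0, 50_000, n_weeks),
})
data["revenue"] = (
    1.8 * data["meta"] + 1.2 * data["google"] + 0.4 * data["tv"]
    + rng.normal(0, 10_000, n_weeks) + 50_000
)

# Time-based split: train on all but the last 13 weeks, evaluate on the held-out weeks.
holdout_weeks = 13
train, test = data.iloc[:-holdout_weeks], data.iloc[-holdout_weeks:]
features = ["meta", "google", "tv"]
model = LinearRegression().fit(train[features], train["revenue"])

def mape(actual: np.ndarray, predicted: np.ndarray) -> float:
    """Mean absolute percentage error."""
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100)

print("In-sample MAPE:", round(mape(train["revenue"].values, model.predict(train[features])), 1))
print("Holdout MAPE:  ", round(mape(test["revenue"].values, model.predict(test[features])), 1))
```

If the holdout error is dramatically worse than the in-sample error, the model is describing history rather than predicting the future, which is exactly the failure mode mistake 1 describes.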
-
Ever watched a market mix model bend reality to fit a senior exec's hunch? That's a bad prior at work.

In Bayesian MMM we start with beliefs (priors) and let data update them. Done right, priors guide us toward plausible answers fast. Done wrong, they blindfold the model and force it into bad answers.

Where it goes off the rails, in a few examples:
- Self-serving priors: an external party bakes in a high TV elasticity so the post-analysis screams "double your GRPs."
- Internal wish-casting: a BI team hard-codes "brand search drives 40% of sales" because it always has in last-click.

So how can you keep your MMM models honest?
- Interrogate the priors. Ask exactly which distributions are pinned down and why. "Industry benchmarks" without proof is not an answer.
- Stress-test them. Swap in weak priors and compare ROI swings. More than ±20%? Your prior is steering the ship (see the sketch after this post).
- Demand hold-out accuracy. A model that can't predict next month isn't worth your budget.

Bad priors are going to be the bane of the modern Bayesian MMM stack. Treat them like any other financial assumption: challenge them until they break or prove themselves. Models should be stable, fast, and subject to scrutiny. Anything less is going to turn MMM into MTA 2.0, and that's bad for everyone.
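As a rough illustration of that stress-test, here is a minimal PyMC sketch that fits the same toy one-channel regression twice, once with a tight "benchmark" prior on the TV coefficient and once with a weak one, then compares the posterior means. The data, prior values, and variable names are assumptions for the example; a real MMM would add adstock, saturation, seasonality, and more channels.

```python
import numpy as np
import pymc as pm

# Toy weekly data: TV spend and revenue with a true coefficient of ~0.5 (made up).
rng = np.random.default_rng(42)
tv_spend = rng.uniform(10_000, 60_000, size=104)
revenue = 0.5 * tv_spend + rng.normal(0, 5_000, size=104) + 20_000

def fit_with_prior(prior_mu: float, prior_sigma: float) -> float:
    """Fit a one-channel toy model with a given prior on the TV coefficient."""
    with pm.Model():
        beta_tv = pm.Normal("beta_tv", mu=prior_mu, sigma=prior_sigma)
        intercept = pm.Normal("intercept", mu=0, sigma=50_000)
        noise = pm.HalfNormal("noise", sigma=10_000)
        pm.Normal("obs", mu=intercept + beta_tv * tv_spend, sigma=noise, observed=revenue)
        idata = pm.sample(1000, tune=1000, chains=2, random_seed=0, progressbar=False)
    return idata.posterior["beta_tv"].mean().item()

tight = fit_with_prior(prior_mu=2.0, prior_sigma=0.05)  # "benchmark" prior insisting TV is huge
weak = fit_with_prior(prior_mu=0.0, prior_sigma=5.0)    # weakly informative prior
swing = abs(tight - weak) / abs(weak) * 100

print(f"Posterior mean (tight prior): {tight:.2f}")
print(f"Posterior mean (weak prior):  {weak:.2f}")
print(f"Swing: {swing:.0f}% -> a large swing means the prior, not the data, is steering results")
```

The same comparison can be run channel by channel on a full model; the point is simply that if loosening a prior moves the answer far more than the post's ±20% rule of thumb, the prior deserves the same scrutiny as any other financial assumption.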