This number is technically correct. So why doesn't anyone trust it?

This was one of the hardest lessons to learn early in my analytics career: data accuracy ≠ data trust.

You can build the cleanest model. You can double-check the SQL, audit the joins, QA the filters. And still, stakeholders say: "That number feels off." "I don't think that's right." "Let me check in Excel and get back to you."

Here's what's often really happening:

🔄 They don't understand where the number is coming from. If they can't trace it, they can't trust it. Exposing calculation steps or using drill-throughs can help.

📊 The metric name isn't aligned with what they think it means. You might call it Net Revenue. They think it's Net Revenue after refunds. Boom: misalignment.

📆 They forgot the filters they asked for. "Why are we only looking at this year?" → "Because you asked for YTD only, remember?" Keep context visible. Always.

🧠 They're comparing your number to what they expected, not to what's correct. And unfortunately, expectations are rarely documented.

🤝 You weren't part of the business process that generates the data. So when something looks odd, they assume it's a reporting issue, not a process or input issue.

Here's the kicker: sometimes, being accurate isn't enough. You also need to be understandable, explainable, and collaborative. That's when trust happens.

Have you ever been 100% confident in a metric, only to spend more time defending it than building it?

#PowerBI #AnalyticsLife #DataTrust #DAX #SQL #DataQuality #DataStorytelling
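The Net Revenue mismatch above is easy to reproduce. Here is a minimal sketch, with all figures and field names invented for the example, of how two reasonable definitions of the "same" metric diverge:

```python
# Hypothetical illustration: two teams compute "Net Revenue" differently.
# All figures and field names are invented for the example.

orders = [
    {"gross": 100.0, "discount": 10.0, "refund": 0.0},
    {"gross": 250.0, "discount": 25.0, "refund": 250.0},  # fully refunded
    {"gross": 80.0,  "discount": 0.0,  "refund": 20.0},
]

# Analyst's definition: gross minus discounts.
net_revenue = sum(o["gross"] - o["discount"] for o in orders)

# Stakeholder's mental model: also subtract refunds.
net_revenue_after_refunds = sum(
    o["gross"] - o["discount"] - o["refund"] for o in orders
)

print(net_revenue)                 # 395.0
print(net_revenue_after_refunds)  # 125.0
```

Both numbers are "technically correct"; the gap between them is exactly the definitional misalignment the post describes, which is why surfacing the calculation steps matters.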
Limits of trust in data quality
Summary
The limits of trust in data quality mark the point where having accurate or clean data is no longer enough: trust depends on whether the data meets people's expectations and is understandable in context. Even the highest-quality data can be doubted if users don't know where it comes from, how it's defined, or how it supports their decisions.
- Clarify data meaning: Make sure everyone understands what each metric represents and how it’s calculated to prevent confusion and misalignment.
- Document data changes: Keep a clear history of how and why data has been updated, so users can follow its journey and trust its reliability.
- Align with business needs: Regularly check that your data quality matches the requirements and expectations of your team or project, especially as your use cases evolve.
Here are a few simple truths about data quality:

1. Data without quality isn't trustworthy.
2. Data that isn't trustworthy isn't useful.
3. Data that isn't useful is low ROI.

Investing in AI while the underlying data is low ROI will never yield high-value outcomes. Businesses must put as much time and effort into the quality of their data as into the development of the models themselves.

Many people see data debt as just another form of technical debt: worth it to move fast and break things, after all. This couldn't be more wrong. Data debt is orders of magnitude worse than tech debt. Tech debt results in scalability issues, but the core function of the application is preserved. Data debt results in trust issues: the underlying data no longer means what its users believe it means.

Tech debt is a wall, but data debt is an infection. Once distrust seeps into your data lake, everything it touches will be poisoned. The poison works slowly at first, and data teams might be able to keep up manually with hotfixes and filters layered on top of hastily written SQL. But over time the spread will be so great and so deep that it becomes nearly impossible to trust any dataset at all. A single low-quality dataset is enough to corrupt thousands of data models and tables downstream. The impact is exponential.

My advice? Don't treat data quality as a nice-to-have, or something you can afford to 'get around to' later. By the time you start thinking about governance, ownership, and scale, it will already be too late, and there won't be much you can do besides burning the system down and starting over. What seems manageable now becomes a disaster later. The earlier you get a handle on data quality, the better.

If you even suspect the business may want to use the data for AI (or some other operational purpose), start thinking about the following:

1. What will the data be used for?
2. What are all the sources for the dataset?
3. Which sources can we control, and which can we not?
4. What are the expectations of the data?
5. How sure are we that those expectations will remain the same?
6. Who should own the data?
7. What does the data mean semantically?
8. If something about the data changes, how is that handled?
9. How do we preserve the history of changes to the data?
10. How do we revert to a previous version of the data or metadata?

If you can affirmatively answer all ten of those questions, you have a solid foundation of data quality for any dataset, and a playbook for managing scale as the use case or intermediary data changes over time. Good luck!

#dataengineering
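The ten questions above can even be made machine-checkable. Here is a minimal sketch, with illustrative field names (not any standard), of a dataset metadata record that only passes when every question has a non-empty answer:

```python
# A minimal sketch of the ten-question checklist as a machine-checkable
# metadata record. Field names and example values are illustrative.

REQUIRED_FIELDS = [
    "intended_use", "sources", "controlled_sources", "expectations",
    "expectation_stability", "owner", "semantics", "change_process",
    "change_history", "rollback_process",
]

def has_quality_foundation(metadata: dict) -> bool:
    """True only if every checklist question has a non-empty answer."""
    return all(metadata.get(field) for field in REQUIRED_FIELDS)

dataset_meta = {
    "intended_use": "churn model features",
    "sources": ["crm", "billing"],
    "controlled_sources": ["billing"],
    "expectations": ["no null customer_id"],
    "expectation_stability": "reviewed quarterly",
    "owner": "data-platform team",
    "semantics": "one row per customer per month",
    "change_process": "schema changes via PR review",
    "change_history": "tracked in a changelog table",
    "rollback_process": "time-travel to previous snapshot",
}

print(has_quality_foundation(dataset_meta))  # True
```

Dropping any single answer (say, the owner) flips the result to False, which is the point: an unanswered question is a gap in the foundation, not a detail to fill in later.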
-
If you can't trust your data, you can't trust your decisions.

Bad data is everywhere, and it's costly. Yet many businesses don't realise the damage until it's too late.

🔴 Flawed financial reports? Expect misguided forecasts and wasted budgets.
🔴 Duplicate customer records? Say goodbye to personalisation and marketing ROI.
🔴 Incomplete supply chain data? Prepare for delays, inefficiencies, and lost revenue.

Poor data quality isn't just an IT issue; it's a business problem.

❯ The Six Dimensions of Data Quality

To drive real impact, businesses must ensure their data is:

✓ Accurate: reflects reality to prevent bad decisions.
✓ Complete: no missing values that disrupt operations.
✓ Consistent: uniform across systems for reliable insights.
✓ Timely: up to date when you need it most.
✓ Valid: follows required formats, reducing compliance risks.
✓ Unique: no duplicates or redundant records that waste resources.

❯ How to Turn Data Quality into a Competitive Advantage

Rather than fixing bad data after the fact, organisations must prevent it:

✓ Make every team accountable: data quality isn't just IT's job.
✓ Automate governance: proactive monitoring and correction reduce costly errors.
✓ Prioritise data observability: identify issues before they impact operations.
✓ Tie data to business outcomes: measure the impact on revenue, cost, and risk.
✓ Embed a culture of data excellence: treat quality as a mindset, not a project.

❯ How Do You Measure Success?

The true test of data quality lies in outcomes:

✓ Fewer errors → higher operational efficiency
✓ Faster decision-making → reduced delays and disruptions
✓ Lower costs → savings from automated data quality checks
✓ Happier customers → higher CSAT and NPS scores
✓ Stronger compliance → lower regulatory risks

Quality data drives better decisions. Poor data destroys them.
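Several of the six dimensions can be measured directly. Here is a minimal sketch, using invented records and field names, of completeness, validity, and uniqueness scores on a toy customer list:

```python
import re

# Illustrative checks for three of the six dimensions (completeness,
# validity, uniqueness) on a toy customer list; records are invented.

customers = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},              # incomplete
    {"id": 3, "email": "not-an-email"},  # invalid format
    {"id": 1, "email": "a@example.com"}, # duplicate id
]

def completeness(rows):
    """Share of rows with a non-empty email."""
    return sum(1 for r in rows if r["email"]) / len(rows)

def validity(rows):
    """Share of rows whose email matches a basic format."""
    pattern = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
    return sum(1 for r in rows if pattern.match(r["email"])) / len(rows)

def uniqueness(rows):
    """Share of ids that are distinct."""
    ids = [r["id"] for r in rows]
    return len(set(ids)) / len(ids)

print(completeness(customers))  # 0.75
print(validity(customers))      # 0.5
print(uniqueness(customers))    # 0.75
```

Accuracy, consistency, and timeliness need external reference points (reality, other systems, a clock), which is why they are harder to automate than the three shown here.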
-
67% of senior leaders are prioritizing generative AI (GenAI) for their business within the next 18 months, and it's introducing huge potential risks to their organizations.

Since ChatGPT launched in November 2022, execs have become increasingly fixated on GenAI. Whether they're driven by competitive pressures, a desire to boost efficiency, or plain old hype, the race is on to implement GenAI for internal and external use cases. And instead of aiming for a strategic journey toward trustworthy AI, the goal is often just to get it up and running as fast as possible.

So they sideline the most important part of any AI-powered system: data quality and the data team that manages it.

This leads to a vicious cycle. Bad data, with enough nods of approval, becomes "good enough" data. And when this "good enough"-but-not-actually-good data goes into AI models over the data team's objections, garbage comes out. Trust is lost.

We've seen this mess unfold over and over, especially through the last decade's data science wave. Yet somehow, we still haven't put the spotlight on data quality. But now, with execs full speed ahead on AI, it's up to data teams to throw up the "yield" sign and make some changes, starting with:

• Implementing robust data validation processes to ensure accuracy and reliability from the get-go.
• Fostering a culture of data literacy, where questioning and verifying data sources becomes second nature.
• Establishing clear guidelines for data usage and model training to prevent the normalization of low-quality data inputs.

We need to fix our data, and now's a better time than ever. Because if we can't trust our data, how are we supposed to trust AI?

#dataengineering #dataquality #genai #ai
-
Data trust often drops as data quality improves. Sounds backwards, right?

When I first started working with data and AI at large enterprises more than a decade ago, one of the biggest blockers for data and AI adoption was the lack of trust in data. That is what got me into the world of data quality in the first place.

But here is what I did not expect:

👉 As companies became more data mature and improved their data foundations (including data quality), data trust across the organisation often dropped.

Should it not be the other way around? You would think that better data and increased data maturity would increase data trust. The reality is:

👉 Trust in data is not just about data quality. It is about whether the data, and its quality, meet the expectations and requirements of the data and AI use cases.

When organisations become more data mature, their use cases evolve. Typically, it looks like this:

1️⃣ Ad hoc analytics
2️⃣ Dashboards used by management
3️⃣ Data products
4️⃣ Data as a product
5️⃣ AI/ML in production

Advanced use cases such as data products and AI/ML in production require much higher data quality than the "simpler" use cases. And here is the big problem: the data quality requirement increases much faster than the underlying data quality improves. That is why organisational trust in data decreases as data maturity increases, even though the underlying data quality actually improves.

👉 For data leaders, here is the takeaway: to come out on top, you have to take data quality extremely seriously and proactively. Much more so than what is happening at the average enterprise right now. Data quality cannot be an afterthought.

Do you agree?
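The post's core claim, that required quality rises faster than delivered quality, can be sketched with invented numbers. The gap between the two lines is where distrust lives, and it widens even though the actual quality score keeps improving:

```python
# A toy illustration of the argument above: the quality a use case
# requires can rise faster than the quality a team actually delivers,
# so the gap (and with it distrust) widens even as quality improves.
# All numbers are invented for the example.

maturity_stages = ["ad hoc analytics", "dashboards", "data products",
                   "data as a product", "AI/ML in production"]
required_quality = [0.70, 0.80, 0.90, 0.95, 0.99]  # requirement rises fast
actual_quality   = [0.75, 0.80, 0.84, 0.87, 0.89]  # improves, but slower

for stage, req, actual in zip(maturity_stages, required_quality, actual_quality):
    gap = req - actual
    status = "trusted" if gap <= 0 else f"trust gap of {gap:.2f}"
    print(f"{stage}: {status}")
```

With these numbers, the early stages are trusted while every later stage shows a growing gap, which matches the counterintuitive pattern the post describes.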
-
"If nobody trusts your #CMDB, it doesn't matter how accurate it is."

Imagine having the world's most advanced medical scanner. It's fast, precise, 99% accurate. But if doctors don't trust the readings, they'll never use it to make life-saving decisions. That's what a clean but untrusted CMDB feels like in #IT operations.

You can have:
- 90% data accuracy
- All the CI relationships mapped
- Owners assigned
- Discovery tools humming...

But if:
⚠️ No one references it in Change, Incident, and Problem
⚠️ DevOps ignores it
⚠️ Support teams question the ownership

...it becomes shelfware with a heartbeat: impressive, but irrelevant.

Data quality ≠ data usability. Data isn't valuable until someone relies on it.

So how do you build trust in the CMDB? Not with tools, but with culture:
- Visibility: make it part of workflows, not a silo.
- Stewardship: assign owners who own and evolve the data.
- Accountability: align SLAs to CI health, not just ticket closure.
- Application: use it in Change risk scoring, Incident impact, and AIOps correlation.

Even the most intelligent #AI can't operate on data your people don't believe in. AI isn't just about ingesting data. It's about acting on trusted data.

So here's the question: who owns CMDB trust in your organization? Is it the tool, or the people behind it?

#CMDB #CSDM #AIOps #ServiceNow #DigitalTransformation #Data #ITOperations #Leadership #ITSM #Strategy
-
End-to-end data quality validation must combine data contract validation and business rule validation.

A lot has been said about data contracts. They should define the structure of the data and provide proof that the data was validated against its constraints. In theory, if you receive data covered by a defined data contract, you should trust its validity, since it has been tested.

There is one issue with this belief. You don't want to receive data affected by any data quality issue, but the issues can be very subtle. Perhaps a data steward established a rule requiring at least 1000 records per weekday in the transactions table. What if there were no records on a bank holiday?

The rule validating the daily count of transactions is a typical business rule that data stewards should own. Those tests are very valuable because they serve the primary purpose of data quality: ensuring the data is usable for its purpose. However, they should not break the data delivery when they are incorrectly defined.

Here lies the real difference between data contracts and business rules:

⚡ Data contracts ensure that the data is in the correct format so it can be ingested without transformations, filtering out corrupted records, or enriching data to add missing values.
⚡ Business rules validate that the data is usable for the various business processes where it is consumed. They can fail, but any failure should trigger an investigation.

By understanding the difference, we can lay out the end-to-end data quality process in data products, the ideal architecture:

👉 Data suppliers should validate that the data they share is not corrupted. The data consumer needs to revalidate only when the supplier cannot be trusted.
👉 The data platform should reevaluate the business rules defined by data stewards.
👉 If the data product shares datasets with downstream consumers, it should define a data contract that is validated on the published data.

#dataquality #datagovernance #dataengineering
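The contract-versus-business-rule split can be sketched in a few lines. All names and thresholds here are illustrative: the contract check raises and blocks delivery, while the steward's record-count rule only warns and flags the batch for investigation:

```python
import logging

logging.basicConfig(level=logging.WARNING)

# Illustrative sketch of the distinction above, with invented names:
# a contract violation blocks ingestion; a business-rule violation
# only flags the batch for investigation.

CONTRACT = {"txn_id": int, "amount": float}  # expected schema

def validate_contract(rows):
    """Hard gate: reject the delivery if any record breaks the schema."""
    for row in rows:
        for field, ftype in CONTRACT.items():
            if not isinstance(row.get(field), ftype):
                raise ValueError(f"contract violation in field {field!r}")

def check_business_rule(rows, min_records=1000):
    """Soft gate: a steward's rule of >= 1000 records per weekday.
    A miss (say, a bank holiday) warrants investigation, not a failure."""
    if len(rows) < min_records:
        logging.warning("only %d records; expected >= %d - investigate",
                        len(rows), min_records)
        return False
    return True

batch = [{"txn_id": 1, "amount": 9.99}, {"txn_id": 2, "amount": 4.50}]
validate_contract(batch)         # passes: the schema is correct
ok = check_business_rule(batch)  # warns: far below 1000 records
print(ok)                        # False
```

The asymmetry is the whole design: a schema failure means the data is unusable as-is, while a rule failure means a human should look before anyone acts on the batch.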
-
Is "good enough" data really good enough? For 88% of MOps pros, the answer is a resounding no.

Why? Because data hygiene is more than a technical checkbox. It's a trust issue. When your data is stale or inconsistent, it doesn't just hurt campaigns; it erodes confidence across the org. Sales stops trusting leads. Marketing stops trusting segmentation. Leadership stops trusting analytics. And once trust is gone, so is the ability to make bold, data-driven decisions.

Research shows that data quality is the #1 challenge holding teams back from prioritizing the initiatives that actually move the needle.

Think of it like a junk drawer: if you can't find what you need (or worse, what you find is wrong), you don't just waste time, you stop looking altogether.

So what do high-performing teams do differently?
→ They schedule routine maintenance.
→ They establish ownership: someone is accountable for data processes.
→ They invest in validation tools: automation reduces the manual grind.
→ They set governance policies: clean data only stays clean if everyone protects it.

Build a culture where everyone values accuracy, not just the Ops team. Because clean data leads to clearer decisions and a business that can finally operate with confidence.