The Data Fairy is Dead

Five data lineage myths that the masterclass dismantled

Nicola Askham has spent years helping organizations untangle their data. During a manual lineage exercise at one company, she traced a data quality issue upstream through multiple systems, only to hear a colleague snap: “What do you think it is? The Data Fairy lives in the data warehouse and magically makes all our data right?”

The joke landed because the belief behind it is widespread. Across industries, organizations still operate as if governance happens on its own and data arrives in the right place, at the right quality, with the right permissions. No one maps how it got there because everyone assumes someone else already did. Gartner’s 2025 CDAO survey found that governance ranked as the least effective capability among data and analytics teams, with only 55% of organizations rating their efforts as effective.1

At a recent Solidatus Data Masterclass, three practitioners dismantled the myths that keep this fairy tale alive. Nicola Askham is an independent data governance coach who has guided governance programs across dozens of enterprises. Philip Dutton is the CEO and co-founder of Solidatus, a data lineage platform built for complex enterprises, and brings nearly two decades of experience in financial services. Caleb Watkins is a Solidatus solutions engineer who builds lineage implementations for large organizations. Together, their conversation surfaced five misconceptions that most governance teams still carry and exposed why AI is about to stress-test every one of them.

1 – Data lineage is a documentation exercise

Data lineage tracks how data moves through an organization, from its origin through every transformation, system, and output. When organizations think of lineage as documentation, they get documentation: static diagrams that go stale the moment a pipeline changes. Nicola has watched teams produce lineage artifacts for an audit, file them away, and then start from scratch when the next review cycle arrives because everything has changed in the interim.

“Don’t treat data lineage just as documentation. Think of data lineage as infrastructure for your organization.” — Caleb Watkins, Solidatus

Philip sharpened the distinction during the session: “Lineage isn’t documentation. It’s part of an operating model of a really agile business.” Infrastructure implies something living, queryable, and load-bearing. It means a map that updates when the territory changes, one that people can query without filing a request and waiting weeks for an answer. According to Gartner, only 34% of organizations have connected governance to outcomes and business value.2 The rest are still producing artifacts that satisfy a compliance requirement but fail to inform the decisions that governance exists to support.
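To make "queryable" concrete, here is a minimal sketch of lineage held as a graph rather than a diagram. It is an illustration only, not how Solidatus or any other product stores lineage, and every system and dataset name in it is hypothetical.

```python
from collections import defaultdict

# Hypothetical lineage edges: (source, target) means "target is derived from source".
EDGES = [
    ("crm.customers", "warehouse.dim_customer"),
    ("billing.invoices", "warehouse.fact_revenue"),
    ("warehouse.dim_customer", "reports.revenue_dashboard"),
    ("warehouse.fact_revenue", "reports.revenue_dashboard"),
]


def build_index(edges):
    """Index edges in both directions so either question is a dictionary lookup."""
    feeds = defaultdict(set)   # node -> nodes it feeds directly
    fed_by = defaultdict(set)  # node -> nodes it is directly derived from
    for source, target in edges:
        feeds[source].add(target)
        fed_by[target].add(source)
    return feeds, fed_by


feeds, fed_by = build_index(EDGES)

# "What does the revenue dashboard depend on?" is answered immediately.
print(sorted(fed_by["reports.revenue_dashboard"]))
```

The point of the sketch is the shape: when a pipeline changes, the edge list changes and the same questions return current answers, which a filed-away diagram cannot do.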

2 – Data quality problems live where you find them

When a dashboard shows bad numbers, the instinct is to fix the dashboard. A report that fails validation gets investigated at the report level. Nicola described spending years doing exactly this kind of manual forensic work, tracing data quality issues through layers of systems and transformations: “Just because it’s come from the data warehouse doesn’t mean it’s right, because that’s come from somewhere else before that, and probably somewhere else before that.”

“It’s like the human body. You might have an ache in your foot, but that’s not necessarily where the problem is.” — Philip Dutton, Solidatus

Philip framed it as a systems problem: “Data isn’t bad. The quality of data isn’t bad. That’s a problem systemically, somewhere in the system that is creating that.” Without lineage tracing the data back to its origin, organizations patch the symptom and wait for it to recur. He also challenged the assumption that all data requires the same level of rigor: “All data doesn’t need to be at 100% quality. Sometimes you need to know close enough is good enough.” Lineage makes that judgment possible by revealing what each data element feeds and what downstream decisions depend on it.
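The "ache in your foot" idea can be sketched as an upstream walk over a lineage map. The example below assumes lineage is already available as a simple dictionary of each data element and its direct sources; the element names are hypothetical and the traversal is a plain breadth-first search, not a description of any vendor's method.

```python
from collections import deque

# Hypothetical map: each data element -> the elements it is directly derived from.
UPSTREAM = {
    "dashboard.monthly_revenue": ["warehouse.fact_revenue"],
    "warehouse.fact_revenue": ["staging.invoices", "staging.fx_rates"],
    "staging.invoices": ["billing.invoices"],
    "staging.fx_rates": ["vendor.fx_feed"],
    "billing.invoices": [],
    "vendor.fx_feed": [],
}


def trace_upstream(node, upstream):
    """Breadth-first walk that returns every ancestor of node in visit order."""
    seen, order, queue = {node}, [], deque([node])
    while queue:
        current = queue.popleft()
        for parent in upstream.get(current, []):
            if parent not in seen:
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


def origins(node, upstream):
    """Ancestors with no sources of their own: the candidate root causes."""
    return [n for n in trace_upstream(node, upstream) if not upstream.get(n)]


print(origins("dashboard.monthly_revenue", UPSTREAM))
```

Here the bad dashboard number leads back to two possible origins, the billing system and an external FX feed, which is where an investigation would start instead of at the dashboard.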

3 – If you own the data, you know where it goes

Nicola has heard one response more than any other when she asks someone to take on the data owner role.

“I think I am the data owner, Nicola, but until you can forensically prove where it is in the organization, I’m refusing to take on that role.” — A data owner, quoted by Nicola Askham

Versions of this pushback come up in nearly every engagement. Data owners are asked to take accountability for something they cannot see. When no one can show them where their data flows, what depends on it, and where it transforms, the role feels like liability without leverage.

The alternative is manual discovery, which Nicola described from experience: “I have to come to you, then you say talk to Caleb, then Caleb’s out of office, Hannah just ignores me for weeks. It just kind of drags on.” What should be a straightforward lookup becomes weeks of calendar chasing, all to answer a question that automated lineage can resolve in seconds. Gartner defines governance as “the specification of decision rights and an accountability framework.”3 But accountability without visibility is unenforceable. Data owners need lineage not as a reporting obligation but as the evidence base that makes their role actionable.
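The lookup a data owner is asking for can be sketched in a few lines once lineage exists in machine-readable form. The following assumes each lineage edge records a source, a target, and the transformation applied along the way; the datasets and transformations are invented for illustration.

```python
# Hypothetical edges: (source, target, transformation applied on the way).
EDGES = [
    ("hr.employees", "warehouse.dim_employee", "deduplicate"),
    ("warehouse.dim_employee", "reports.headcount", "aggregate by department"),
    ("warehouse.dim_employee", "ml.attrition_features", "anonymise and join"),
    ("finance.ledger", "reports.pnl", "aggregate by month"),
]


def downstream(asset, edges):
    """Return every (source, target, transformation) hop reachable from asset."""
    hops, frontier, seen = [], [asset], {asset}
    while frontier:
        current = frontier.pop()
        for source, target, transform in edges:
            if source == current:
                hops.append((source, target, transform))
                if target not in seen:
                    seen.add(target)
                    frontier.append(target)
    return hops


# The owner of hr.employees sees where the data flows and how it changes.
for source, target, transform in downstream("hr.employees", EDGES):
    print(f"{source} -> {target} ({transform})")
```

The result covers the three things the owner could not see before: where the data flows, what depends on it, and where it transforms. The unrelated finance flow is correctly left out.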

4 – Governance slows innovation down

Nicola named a fear that governance teams carry quietly: “Data governance is a slow process. It takes a long time to put in place. But the whole point of AI is it’s automating, making things faster, and we’re in danger of being the ones that go ‘no, stop all innovative things.’”

She was candid about it because she lives in that tension every day. When governance depends on manual lineage, manual classification, and manual approvals, it does slow things down. But the bottleneck isn’t governance itself. It’s the tooling that governance has been forced to rely on.

During the masterclass demo, Caleb used the Solidatus platform to scan a Power BI environment in approximately 2 seconds and a Snowflake database in roughly 30 seconds, automatically producing column-level lineage. Compare that to the weeks Nicola described spending on manual tracing. That speed gap points to an infrastructure problem, not a governance one, and better tooling now exists to close it.

“Data governance is the thing that helps us go faster. Helps us go faster safely.” — Philip Dutton, Solidatus

5 – AI governance is a separate problem

This may be the most consequential misconception on the list. At the Gartner Data & Analytics Summit in March 2026, analyst Sarah Turkaly warned that “data governance will be the single point of failure for organizations’ AI ambitions.”4 Gartner also predicts that by 2027, 60% of organizations will fail to realize the value they expect from AI because they haven’t integrated data governance with AI governance.5

Philip made the case that the integration starts with lineage: “We can’t automate what we don’t understand.” When Solidatus partnered to deconstruct open-source large language models, they found that most of the training data was publicly generated content rather than enterprise or scientific data. The question of whether an AI system’s outputs can be trusted starts with knowing what went in.

The urgency is both regulatory and operational. The EU AI Act begins enforcing requirements for high-risk AI systems in August 2026, with penalties reaching €35 million or 7% of global turnover.6 Nicola raised a gap that most existing governance frameworks have not addressed. AI systems can consume data without the data owner even knowing it. The governance structures built for human-driven processes were not designed for systems that autonomously ingest, transform, and act on data across the organization. Philip pointed to a near future where organizations may run “hundreds of thousands and millions of agents,” each consuming data, making decisions, and generating new data of its own. As Gartner analyst Mark Beyer has observed, “Agents can only learn from what is documented and present. No real-world context, only data.”7
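One way to picture the gap Nicola described is a check that compares what actually reads a dataset against what its owner has signed off. The sketch below assumes two inputs that an organization would have to supply itself: an ownership register and a list of observed reads (for example from access logs or a lineage scan). All names are hypothetical.

```python
# Hypothetical register: dataset -> its owner and the consumers the owner approved.
OWNER_REGISTER = {
    "crm.customers": {"owner": "Head of Sales Ops", "approved": {"reports.pipeline"}},
    "hr.employees": {"owner": "HR Director", "approved": {"reports.headcount"}},
}

# Hypothetical observed reads: (dataset, consumer).
OBSERVED_READS = [
    ("crm.customers", "reports.pipeline"),
    ("crm.customers", "agent.sales_assistant"),
    ("hr.employees", "reports.headcount"),
    ("hr.employees", "agent.recruiting_bot"),
    ("ops.tickets", "agent.support_triage"),
]


def unapproved_reads(register, reads):
    """List reads the owner never approved, or reads of data with no owner at all."""
    findings = []
    for dataset, consumer in reads:
        entry = register.get(dataset)
        if entry is None:
            findings.append((dataset, consumer, "no registered owner"))
        elif consumer not in entry["approved"]:
            findings.append((dataset, consumer, f"not approved by {entry['owner']}"))
    return findings


for dataset, consumer, reason in unapproved_reads(OWNER_REGISTER, OBSERVED_READS):
    print(f"{consumer} reads {dataset}: {reason}")
```

In this toy data, three agents surface as consumers nobody approved, including one reading a dataset that has no owner on record. With thousands of agents the same comparison only works if the list of observed reads is generated automatically.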

The governance teams that have spent years building lineage, enforcing accountability, and tracing data flows already have the foundation for this. The work isn’t starting over. It’s extending what they built.

Where to start

Near the end of the session, each speaker offered one piece of advice for organizations getting started with data lineage. Their answers converged on the same principle: start with something specific and let the results make the case for what comes next.

Nicola recommended starting with what’s broken: “Don’t boil the ocean. Focus on the essentials first. Is there something that’s broken and causing a problem in your organization right now? Look at that first.” In a recent blog post, she applied the same thinking to AI governance.8 Don’t lead with “we need governance first,” because that framing loses the room. Start with the AI initiative, understand what data it needs, and work backward.

Caleb grounded the advice in a principle: “Trustworthy AI is built on observable, explainable, and governed data pipelines.” Lineage is what makes those pipelines observable.

Philip offered the most forward-looking version: “Get started. You don’t have to start with trying to do everything. Start small with one use case that delivers real value to the business. I guarantee that it will unlock the next ‘what do we need to do and why are we doing it?’ And very quickly it will start to snowball.”

Watch the full Data Masterclass to hear the complete conversation and see automated lineage in action.

Ready to see what data lineage-as-infrastructure looks like for your organization? Request a demo.

1. Gartner. “5 Things That Cause Sleepless Nights for Heads of Governance.” Gartner Data & Analytics Summit, March 2026.

2. Turkaly, Sarah. “The Future of D&A Governance.” Gartner Data & Analytics Summit, March 2026.

3. Gartner. “5 Things That Cause Sleepless Nights for Heads of Governance.” Gartner Data & Analytics Summit, March 2026.

4. Gartner. “Gartner Data & Analytics Summit 2026 Orlando: Day 3 Highlights.” Gartner Newsroom, March 11, 2026.
https://www.gartner.com/en/newsroom/press-releases/2026-03-11-gartner-data-and-analytics-summit-2026-orlando-day-3-highlights

5. White, Andrew and Lauren Kornutick. “Data and Analytics Governance vs. AI Governance.” Gartner Data & Analytics Summit, March 2026.

6. EU Artificial Intelligence Act. “Implementation Timeline.”
https://artificialintelligenceact.eu/implementation-timeline/

7. Beyer, Mark. “Using Active Metadata to Support Data Agents for AI.” Gartner Data & Analytics Summit, March 2026.

8. Askham, Nicola. “Why Your Executives Need to Hear This Before Your Next AI Project.” NicolaAskham.com, March 2026.
https://www.nicolaaskham.com/blog/aianddatagovernance

Frequently asked questions

01.

What is data lineage and why does it matter for governance?

Data lineage maps how data moves through an organization, from its point of origin through every transformation, system, and output. It matters for governance because it provides the visibility that accountability depends on. Without lineage, data owners cannot verify where their data flows, quality teams cannot trace problems to their source, and compliance teams cannot demonstrate the integrity of regulatory reports. Gartner’s 2025 CDAO survey found that only 55% of organizations rate their governance efforts as effective, and the lack of lineage infrastructure is a primary reason.

02.

How is data lineage different from data documentation?

Documentation is a static artifact, typically produced for an audit or compliance filing and then shelved until the next review cycle. Lineage as infrastructure is living and continuously updated. It reflects the current state of data flows across the organization, can be queried on demand, and informs real-time decisions about quality, impact analysis, and change management. As Solidatus solutions engineer Caleb Watkins described it, lineage should function as “infrastructure for your organization,” not a set of diagrams that go stale the moment a pipeline changes.

03.

Why is data lineage important for AI governance?

AI systems consume data from across the organization, and their outputs are only as trustworthy as their inputs. Lineage provides the traceability needed to assess whether the data feeding an AI model is fit for purpose, correctly sourced, and governed. For a deeper look at this topic, see Four AI Governance Questions Your Data Catalog Cannot Answer. Gartner predicts that by 2027, 60% of organizations will fail to realize the value they expect from AI because they haven’t integrated data governance with AI governance. Lineage is the connective tissue between those two disciplines.

04.

Where should organizations start with data lineage?

The practitioners in the Solidatus Data Masterclass converged on the same advice: start small. Identify one use case that is broken, causing risk, or consuming disproportionate manual effort. Build lineage for that use case, demonstrate measurable value, and let the results make the case for expanding. Governance coach Nicola Askham recommends working backward from an AI initiative to understand what data it needs, rather than leading with “we need governance first.”

05.

How does automated lineage compare to manual lineage?

Manual lineage involves tracing data flows person by person, system by system, which can take weeks for a single question. During the Solidatus Data Masterclass, a live demo demonstrated automated scanning of a Power BI environment in approximately 2 seconds and of a Snowflake database in roughly 30 seconds, producing column-level lineage. Automated lineage also stays current as systems change, while manual documentation becomes outdated almost immediately after it is created.

06.

What does Gartner say about data governance and AI?

At the Gartner Data & Analytics Summit in March 2026, analyst Sarah Turkaly warned that “data governance will be the single point of failure for organizations’ AI ambitions.” Gartner research also shows that only 34% of organizations have connected governance to business outcomes. As AI adoption accelerates and organizations deploy autonomous agents at scale, the governance teams that have invested in lineage, accountability, and traceability are best positioned to extend their foundations into AI governance. Solidatus was named in the inaugural Gartner Magic Quadrant for Data and Analytics Governance Platforms in 2025.

Published on: April 7, 2026
