The Engineering of Trust: Why Metadata Isn’t Enough for AI
Every enterprise racing to deploy AI eventually hits the same moment. The board, the regulator, or a senior risk officer asks a simple question: can we trust this model? The data team turns to their catalog, clicks the lineage tab, and discovers what one enterprise CTO described as “a blank screen, or just two boxes and a really high-level picture. Maybe this system talks to this system.”
That blank screen is the governance gap that most organizations live with today. A data catalog can tell you that a dataset exists, label it, describe it, and help people find it. What it cannot tell you is whether that dataset is safe to feed into a model, where the data originally came from, how it was transformed as it moved through dozens of systems, or what breaks downstream if something changes upstream. The distinction between indexing data and mapping its complete journey is the difference between a library card and a blueprint. And when the question shifts from “what data do we have?” to “can we prove how our AI makes decisions?”, the library card is no longer sufficient.
Most enterprises believe they have solved data governance because they’ve invested in a data catalog. That belief is understandable. Catalogs were built to solve a real problem: helping organizations find, label, and understand their data assets. They serve as an internal marketplace where data scientists, analysts, stewards, and business users can discover what data exists, who owns it, how it should be used, and what it means.
The problem is that discovery and auditability are fundamentally different capabilities, built on materially different architectures. Catalogs index what exists, while lineage platforms map how data moves, how it transforms, and what depends on what as it flows across systems. In catalog-first platforms, lineage is often bolted onto an architecture designed for something else. In lineage-first platforms, the model of relationships between data assets is the primary object, and catalog metadata serves as context that enriches the map.
This architectural difference becomes visible the moment lineage hits a gap. In a catalog-first tool, a gap in lineage is a dead end: there is no modeling interface and no way to manually bridge what automation missed. As one CTO put it, “when you have a catalog-first lineage solution, when there’s a gap, it’s a fatal error. There’s a gap in the lineage, I can’t get it. I’m stuck.” That gap might not matter when governance is a periodic documentation exercise. It matters enormously when you need a complete, end-to-end picture of every data flow feeding an AI model.
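To make the architectural distinction concrete, here is a minimal sketch of a lineage-first data model. Everything in it is hypothetical illustration, not any vendor’s actual API: the point is that the relationship between assets is the primary object, catalog metadata is attached as enriching context, and a gap left by automated scanning can be closed by modeling the missing edge by hand rather than hitting a dead end.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Edge:
    """In a lineage-first model the relationship itself is the primary
    object: it records how data moves and transforms, not just that two
    assets exist."""
    source: str     # upstream attribute, e.g. "crm.customer.email"
    target: str     # downstream attribute
    transform: str  # how the value changes in transit
    origin: str     # "scanned" (found by automation) or "manual" (modeled by hand)

@dataclass
class LineageGraph:
    edges: list[Edge] = field(default_factory=list)
    # Catalog metadata enriches the nodes on the map; it is context, not the map.
    catalog: dict[str, dict] = field(default_factory=dict)

    def add_scanned(self, source: str, target: str, transform: str) -> None:
        self.edges.append(Edge(source, target, transform, "scanned"))

    def bridge_gap(self, source: str, target: str, transform: str) -> None:
        # The capability a catalog-first tool lacks: when automation misses
        # a hop, model the missing edge by hand instead of getting stuck.
        self.edges.append(Edge(source, target, transform, "manual"))

g = LineageGraph()
g.add_scanned("crm.customer.email", "dwh.contact.email", "lowercase + trim")
# Suppose the scanner cannot parse a legacy batch job: the hop is bridged
# manually and the end-to-end picture stays complete.
g.bridge_gap("dwh.contact.email", "ml.churn_model.email_domain", "split on '@'")
print(f"{len(g.edges)} edges, {sum(e.origin == 'manual' for e in g.edges)} manually modeled")
```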
The scale of this architectural mismatch is becoming visible in the market. Microsoft, which has its own native lineage capability within Purview, chose to partner with a specialist lineage vendor as its Data Lineage Integration Partner. Even the largest platform providers recognize that basic catalog lineage is insufficient for complex enterprise environments.
That architectural gap was manageable when governance meant periodic documentation and occasional audits. AI has changed the calculus because regulators are now writing lineage requirements into law, and the engineering reality of how models fail demands a level of traceability that catalogs were never designed to provide.
The EU AI Act now imposes specific transparency requirements for AI systems classified as “high-risk,” including those used in credit scoring, hiring decisions, insurance underwriting, and law enforcement. Organizations deploying these systems are required to provide information about the system’s data lineage and the logic behind its decisions. Separately, the European Central Bank’s May 2024 guidance on risk data aggregation specifies data lineage at the data attribute level as a minimum requirement of an effective data governance framework. (We’ll explore the BCBS 239 compliance strategy in detail in an upcoming piece.)
These regulatory requirements reflect a broader reality that McKinsey has identified in its research on AI trust: the benefits of AI can only be realized if it is deployed safely and with clear accountability.¹ Business leaders, board members, and risk officers are all asking the same underlying question, and most organizations cannot answer it with the tools they have today.
The engineering reality is just as pressing. When a production model starts returning biased or inaccurate outputs, someone needs to trace back through the data supply chain to find the root cause. Consider a scenario that has played out at more than one financial institution. A third-party data vendor changes its address format from three lines to two, a routine upstream schema change of the kind that happens constantly across large data ecosystems. But that address field was also a feature in a downstream machine learning model. The format change inadvertently broke the model’s ability to match and score records, and no one knew why until the outputs had already degraded. No catalog would have surfaced this dependency because catalogs index what exists, not what depends on what. With end-to-end lineage at the data attribute level, the dependency would have been visible, and the impact could have been assessed before the change reached production.
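As a rough sketch of the attribute-level impact analysis that would have caught this, consider a downstream walk over a lineage graph. The system and column names below are invented for illustration; real lineage graphs span thousands of such edges.

```python
from collections import deque

# Attribute-level lineage as a directed graph: each key is an upstream
# data attribute, each value lists the attributes derived from it.
# All system and column names here are hypothetical.
DOWNSTREAM = {
    "vendor_feed.address_line_3": ["staging.customer_address"],
    "staging.customer_address": ["features.address_match_key"],
    "features.address_match_key": ["models.credit_score.input"],
}

def impacted_by(attribute: str) -> list[str]:
    """Breadth-first walk over downstream edges to find every attribute
    affected by a change to `attribute`."""
    seen, queue, order = set(), deque([attribute]), []
    while queue:
        node = queue.popleft()
        for child in DOWNSTREAM.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order

# A vendor dropping address_line_3 immediately surfaces the model input
# that depends on it -- before the change reaches production.
print(impacted_by("vendor_feed.address_line_3"))
# ['staging.customer_address', 'features.address_match_key',
#  'models.credit_score.input']
```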
Any approach to governing AI should be measured against three capabilities. If your current tools cannot answer these questions about your own environment, the gap between your governance posture and your actual exposure is wider than you think.
Can you demonstrate the complete journey of your AI’s data?
This is the “prove test”. Can you show a regulator or auditor the complete path of every dataset feeding your AI, including all transformations, at the data attribute level? HSBC faced this challenge with cross-border data sharing compliance, where answering a single governance query required months of manual review. After deploying a lineage-based self-service solution, those same queries now return answers in minutes, with up to 90% efficiency gains and 400 operational staff redeployed to higher-value work.
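As a sketch of what the “prove test” asks for in practice, the snippet below walks upstream lineage recursively and prints the complete journey of one model input, transformation by transformation. All attribute names and transformations are hypothetical.

```python
# Upstream lineage: each attribute maps to the (parent, transformation)
# pairs it was derived from. Contents are illustrative only.
UPSTREAM = {
    "models.pd_score.income_band": [
        ("features.monthly_income", "bucket into 5 bands"),
    ],
    "features.monthly_income": [
        ("staging.payroll.gross_pay", "annual gross / 12"),
        ("staging.fx_rates.gbp_eur", "convert to EUR"),
    ],
}

def audit_trail(attribute: str, depth: int = 0) -> None:
    """Print the complete upstream journey of one model input,
    transformation by transformation -- the audit trail a regulator
    can actually inspect."""
    for parent, transform in UPSTREAM.get(attribute, []):
        print("  " * depth + f"{attribute} <- {parent}  [{transform}]")
        audit_trail(parent, depth + 1)

audit_trail("models.pd_score.income_band")
```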
Can you see when sensitive data flows where it shouldn’t?
BNY has built what their former Chief Data Officer, Eric Hirschhorn, calls a “data risk usage board.” Before any AI model reaches the proof-of-concept stage, a cross-functional team spanning ethics, privacy, the CDO office, AI, and legal reviews the data supply chain to ensure “the security and the privacy and the ethics follows the information regardless of the transformation.” This is the “protect test” in practice. Policy documents define what data is permissible for what use, but policies alone are unenforceable without lineage that maps actual data flows against the rules that govern them.
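A minimal sketch of the “protect test”: compare the flows your lineage map says actually happen against the classifications and destinations your policy prohibits. The policy rules and flow records below are illustrative assumptions, not BNY’s actual implementation.

```python
# Hypothetical policy: which data classifications may never reach which
# destination zones, regardless of how the data was transformed en route.
PROHIBITED = {
    ("pii", "external_ml_training"),
    ("pii", "offshore_analytics"),
}

# Each record pairs an attribute's classification with a zone its lineage
# shows it actually reaching. Contents are invented for illustration.
OBSERVED_FLOWS = [
    ("crm.customer.ssn", "pii", "internal_reporting"),
    ("crm.customer.ssn", "pii", "external_ml_training"),   # violation
    ("orders.total",     "public", "external_ml_training"),
]

def violations(flows):
    """Return every flow where the mapped reality breaks the written policy."""
    return [
        (attr, zone)
        for attr, classification, zone in flows
        if (classification, zone) in PROHIBITED
    ]

print(violations(OBSERVED_FLOWS))
# [('crm.customer.ssn', 'external_ml_training')]
```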
Can you simulate what happens before you make a change?
Before a major platform migration, one UK-based asset manager mapped 2,000 source tables, over 175,000 fields, and 99,000 data attribute flows across 41 systems, identifying every downstream dependency and potential breaking point before a single change was made. The capability that made this possible is bi-temporal lineage, which provides the ability to view past states and simulate future states of your data estate. It amounts to version control for your entire data architecture, and it is the “predict test” that most organizations cannot pass today. (We’ll go deeper on AI forensics and bi-temporal debugging in a future piece.)
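A minimal sketch of the bi-temporal idea, assuming each lineage edge carries a validity interval: the same query then answers both “what did the estate look like then?” and “what will it look like after the migration?”. The dates and system names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class TemporalEdge:
    """A lineage edge with a validity interval: past states are kept,
    and future states can be staged before any change is made."""
    source: str
    target: str
    valid_from: date
    valid_to: date | None = None   # None = still current, or planned

EDGES = [
    TemporalEdge("legacy.positions", "risk.var_input",
                 date(2019, 1, 1), date(2026, 6, 30)),
    # A planned migration, staged and inspectable before go-live:
    TemporalEdge("cloud.positions", "risk.var_input",
                 date(2026, 7, 1)),
]

def as_of(edges, when: date):
    """Snapshot of the data estate on a given date -- past or future."""
    return [
        e for e in edges
        if e.valid_from <= when and (e.valid_to is None or when <= e.valid_to)
    ]

print([e.source for e in as_of(EDGES, date(2025, 1, 1))])   # ['legacy.positions']
print([e.source for e in as_of(EDGES, date(2026, 8, 1))])   # ['cloud.positions']
```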
The institutions that have moved the furthest on this journey followed a recognizable pattern. They started with regulation and capabilities, not AI. LSEG described the EU’s Digital Operational Resilience Act as the trigger that helped them secure senior leadership support for a multi-year data lineage program.² BNY’s Hirschhorn describes a similar trajectory, moving “from defense to offense,” where building governance foundations created the platform they now use to govern AI outcomes. M&T Bank used lineage to connect business context to technical reality, transforming governance from a compliance mechanism into a strategic asset.
The common thread is that the lineage foundation built for compliance turned out to be the most valuable asset for AI readiness. These organizations did not build one system for regulatory reporting and another for AI governance. They built a single architectural layer that serves both.
For CDOs and CROs evaluating where their own organizations stand, three steps are worth considering.
1. Run the blank screen test.
Pick your most business-critical AI model, open your current governance tool, and try to view the complete lineage for its training data at the data attribute level. If you see gaps, high-level boxes, or nothing at all, you have identified your exposure. (A minimal sketch of this check as a coverage metric follows this list.)
2. Start with a regulatory use case.
Compliance budgets are already allocated, and BCBS 239, DORA, or EU AI Act readiness can serve as the on-ramp for building lineage infrastructure. The institutions that moved fastest started here.
3. Evaluate your architecture.
Ask your vendor whether their platform is catalog-first or lineage-first, and ask them to demonstrate end-to-end lineage for a single regulatory report at the column level for some non-standard sources and applications. The answer will tell you whether you can meet the governance demands that AI is creating. (In an upcoming piece, we’ll map the full technical and strategic differences between data lineage and metadata management.)
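As a concrete version of the blank screen test from step 1, the sketch below computes attribute-level lineage coverage for one model’s features: every feature either traces back to a registered source system or is flagged as a gap. The feature names and source registry are hypothetical.

```python
# Registered source systems of record; anything traceable must end here.
SOURCES = {"core_banking.txn.amount", "crm.customer.segment"}

# Documented parents of each model feature. An empty entry on a
# non-source attribute is a lineage gap. Names are illustrative.
UPSTREAM = {
    "features.avg_txn_amount": ["core_banking.txn.amount"],
    "features.segment_code": ["crm.customer.segment"],
    "features.bureau_score": [],   # gap: no documented origin
}

def traceable(attribute: str) -> bool:
    """True only if every upstream path ends at a registered source."""
    if attribute in SOURCES:
        return True
    parents = UPSTREAM.get(attribute, [])
    return bool(parents) and all(traceable(p) for p in parents)

features = list(UPSTREAM)
covered = [f for f in features if traceable(f)]
print(f"attribute-level lineage coverage: {len(covered)}/{len(features)}")
for f in features:
    if f not in covered:
        print(f"  gap: {f}")   # this is your 'blank screen' exposure
```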
The organizations deploying AI with confidence are the ones building the engineering layer underneath it. They started by asking a simple question: when someone clicks the lineage tab, what do they see? If the answer is still a blank screen, the distance between your governance posture and your actual risk is growing every day. The blueprint matters more than the library card. It always did. AI just made it urgent.
Want more insights on AI governance, data lineage, and regulatory readiness? Subscribe to our newsletter below.
1. Carlo Giovine and Roger Roberts, with Mara Pometti and Medha Bankhwal, “Building AI Trust: The Key Role of Explainability,” McKinsey, November 26, 2024, https://www.mckinsey.com/capabilities/quantumblack/our-insights/building-ai-trust-the-key-role-of-explainability.
2. Aliya Shibli, “LSEG’s Journey from Regulation to Revenue through Data,” The Banker, September 25, 2025, https://www.thebanker.com/content/18fd41d8-f187-4b3e-81fa-534797e24f8f.
Published on: February 19, 2026