What is Snowflake Horizon?

Snowflake Horizon is Snowflake’s built-in governance solution that unifies compliance, security, privacy, interoperability, and access capabilities, enabling customers to easily govern data, apps, and more across clouds, teams, partners, and customers.

Find Solidatus on Snowflake Horizon here.

Solidatus and Snowflake Horizon

Solidatus supercharges Snowflake’s governance capabilities with active metadata management and advanced visualization. With an end-to-end blueprint of your data landscape, you can understand its usage, define a control strategy, and make major time and cost savings. 

  • Enhance data observability 
  • Boost governance and compliance 
  • Highlight use of restricted data 
  • Reduce risk and improve operating efficiency 
  • Fast-track Snowflake migrations

With Solidatus you can:

Bring all the components of your data fabric into one, unified view

Solidatus’ powerful automation creates data blueprints covering your entire data landscape, including the sources upstream of, within and downstream of your Snowflake instance. By doing this, you can gain a clear vantage point from which to view your data ecosystem, answer the questions that matter most, and carry out strategic change.

Simplify and streamline data asset discovery, mapping, and monitoring in both hybrid multi-cloud and on-premise environments.

  • Create dynamic blueprints of a Snowflake instance, illustrating databases, schemas, tables, and views, displaying column-level dependencies and lineage, including between Snowflake databases. 
  • Boost data governance by integrating Snowflake governance rules such as tags, row access and data masking policies into your data blueprint. 
  • Derive unique insights by analyzing Snowflake governance rule application to tables, views and columns. 

Enhance data security

  • Highlight and investigate active Snowflake row access policies on relevant tables, accessing ownership and logic details, and easily visualize dynamic data masking policies across Snowflake databases. 
  • Rapidly determine if the table you’re using has access restrictions and ensure proper protection for dependent views. 
  • Identify and govern sensitive data movement across your data landscape, both upstream and downstream from Snowflake. 
  • Ensure consistent application of security policies across your data supply chain and enhance your data security insight and capability. 
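
One way to picture the dependent-view check described above is as a walk over lineage edges. The following is a minimal sketch in plain Python over hypothetical in-memory data; it does not use the Solidatus or Snowflake APIs, and every table, view and policy name in it is invented for illustration. It only flags downstream objects for review. Whether a flagged view actually needs its own policy depends on how policies are evaluated in your environment.

```python
from collections import deque

# Hypothetical lineage: each object maps to the objects built directly on it.
DOWNSTREAM = {
    "SALES.CUSTOMERS": ["SALES.V_CUSTOMER_ORDERS", "SALES.V_CUSTOMER_REGION"],
    "SALES.V_CUSTOMER_ORDERS": ["REPORTING.V_TOP_CUSTOMERS"],
    "SALES.ORDERS": ["SALES.V_CUSTOMER_ORDERS"],
}

# Hypothetical policy assignments (object name -> policy name).
ROW_ACCESS_POLICIES = {
    "SALES.CUSTOMERS": "rap_region",
    "SALES.V_CUSTOMER_REGION": "rap_region",
}


def dependents_to_review(table, downstream, policies):
    """Return objects downstream of `table` that have no policy of their own."""
    seen = {table}
    queue = deque([table])
    flagged = []
    while queue:
        current = queue.popleft()
        for child in downstream.get(current, []):
            if child in seen:
                continue  # already visited via another path
            seen.add(child)
            queue.append(child)
            if child not in policies:
                flagged.append(child)
    return sorted(flagged)


if __name__ == "__main__":
    for name in dependents_to_review("SALES.CUSTOMERS", DOWNSTREAM, ROW_ACCESS_POLICIES):
        print("review:", name)
```

The breadth-first walk keeps a `seen` set so that a view fed by several tables is only reported once.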

Leverage tailored business insights

  • Snowflake tags enhance tables and columns with predefined allowable code sets. Solidatus incorporates these allowable values into your data blueprint. 
  • Easily view all Snowflake tags and their application across Snowflake databases and downstream targets. 
  • Apply Solidatus rules and filters to your data blueprint to capture additional business insights and identify gaps, issues and exceptions. 

Collaborate across teams and automate your audit trail

Solidatus provides a complete audit trail of all changes, automatically maintained and always accessible, with powerful visualization to display all differences along a timeline. Coordinate, control and plan changes throughout your organization regardless of the type of system, the data in use, where it is or who owns it. 

Looking to the future

We’ll be enhancing the depth and breadth of Snowflake Horizon by enabling users to:

  • Ingest Snowflake roles and visualize access permissions to database objects to improve audit control and governance. 
  • Automatically categorize semantic and privacy information for the most-accessed tables – such as PII as Snowflake tags – and visualize them atop their organization’s data blueprint through Snowflake Classification.
  • Broaden the scope of our end-to-end data blueprint by directly ingesting Snowpipe data flow information. 

2023 shows no signs of slowing down for us and our partners, marked by a significant win at the Banking Tech Awards from FinTech Futures for Best Use of Tech in Business Lending. We take pride in this recognition, which reflects our successful and ongoing efforts to deliver effective services to our customers.

Our joint submission showed how HSBC supercharged their Wholesale Lending business with our next-gen version-controlled graph tech.

Within six months of launching, a small business team documented and modelled HSBC’s entire wholesale lending book, demonstrating traceability from source to consumption. The team has now successfully modelled 2,000 source tables with 80,000+ fields, and 20,000+ data linkages across 45 source systems used globally.

With this winter 2023 win behind us, we look forward to 2024 and what it will bring. If you want to learn more about our data sharing work with HSBC, you can find it all in our case study.

After a remarkable year filled with success and expanding our reach, we are happy to announce our win in the ‘Best Data Governance Solution’ category at the A-Team Data Management Insight Awards USA 2023.

This marks our third win in the ‘Best Data Governance Solution’ category – a testament to our transformative data governance capabilities.

Our data governance solution goes far beyond basic data management, adding different dimensions through powerful dynamic visualizations, and this award solidifies our position as a leader in this domain. In today’s complex digital landscape, having a visual representation of the connections that define and drive an organization can be the key differentiator between success and failure.

Solidatus CEO & Founder Philip Dutton

With high hopes for the year ahead, we anticipate 2024 will be just as rewarding, and we look forward to sharing these accomplishments with our colleagues, partners, clients, and industry peers.

For more information on this great awards ceremony, you can check out the A-Team’s dedicated showcase page: https://a-teaminsight.com/awards/data-management-insight-awards-usa/?section=winners

Urban planning is a lineage problem. Well, to be completely honest, at Solidatus, we’re convinced that all manner of things are lineage problems.

What does this have to do with data management? Well, bear with us, as the parallels we explore at the start of this blog post – the third and penultimate part in our lineage series – will resonate with anyone planning to alter or enhance their data estate in any way.

Before we dive in, if you need a primer on the concepts detailed in this article, be sure to read our guide to data lineage.

Let’s start by looking at an aerial view of two cities and their approach to their development. The first is Kolkata in India; the second is Shanghai in China.

Above: aerial view of Kolkata in India

Above: Shanghai in China before expansion (left); after expansion (right)

You can clearly see the differences in how the two cities planned their expansion. Kolkata has expanded into new territory and planned its road systems, fitting the various city zones around it, whereas Shanghai has just replaced the old with the new. SimCity, anyone?

It’s about the availability of resources – just like in SimCity, the urban planner has choices about what they can do – Do I have land? Do I have transport links? Do I have money? What’s most important? In both cases, there’s a river – a vital resource for most of the world’s great cities (New York has the Hudson, Rome has the Tiber, London the Thames, Paris the Seine…). So, building near to this really important asset is essential – nature is the best builder after all. Just as with the expansion of London to the south of the Thames, Kolkata had the option to expand into new territory on the opposite bank and could then plan for this. In the case of Shanghai, they clearly had built-up areas on both sides of their river, meaning a choice was needed on whether to bulldoze and replace. This is what they went with.

Why this is important

Is the project planned or under-planned? Note: very little is totally unplanned!

No one goes into any building project with no plan – they’re always building something. They can be qualified for this and have the right tools, they can consult others and try to make something that fits into the surrounding area’s plans – or they can just put up their building not knowing about utilities, the type of ground (Pisa, we’re looking at you) and hope for the best.

Let’s bring it a little closer to Solidatus’ UK HQ in London and the Thames.

This river was once much wider than it is now (four times wider, as it happens), and when London needed to expand, it exploited this space, most recently to create room for underground tube lines and sewerage (both in the 1860s and again in the 2020s). Below are a couple of images giving you an idea of the construction projects where the cut-and-cover techniques were used to expand the Victoria Embankment, first for the removal of waste – thank you, Joseph Bazalgette – and then for the commuting public. We have a new(ish) road above it as a result, and the north side of the river is that bit bigger.

Above: historical view of the Victoria Embankment on London’s River Thames in the UK

Above: detail of historical cut-and-cover work on the Victoria Embankment

Above: further historical construction work in London

In the planned city, it’s easier to see how to add in utilities, and how to modify things to bring in new systems. One side of Kolkata will be easier to modernize than the other. Shanghai was built with a modern plan and these things in mind.

London certainly wasn’t built like this. It is a city built on a city built on a city approaching the end of its second millennium. That is why it is so expensive and increasingly impossible to add in new facilities, as the environment is so difficult to upgrade.

Cities as enterprises

Let’s bring this back to the ground and think of a large enterprise as a city. It has buildings, utilities, people and so on. It has systems and controls; it has legacy and an aspiration to modernize. Like London, it has archaeology, and decisions made decades (or longer) ago have created the foundations upon which to build the new. Like London, it wants to be modern and be able to use the latest and greatest of everything. Fitting it all together is going to determine whether it will be a successful company that can be trusted and governed, indeed whether it can even survive.

We can see how decisions such as the height and width of tunnels determine the dimensions of a London tube train.

So it is with the size of data centres, the choice of technology, and the network connections and locations in an enterprise.

Bringing it back to data

As we know, how we connect everything in an organization is most aptly modelled by lineage. This is best used in the first instance as a planning tool by providing an understanding of how elements are related and how they make the system work. This information helps inform decisions about the architecture, processes, and systems needed to support the data.

It also provides insight into what changes need to be made when new data sources, applications or utilities are introduced. Lineage can also be used to identify potential risks associated with data and ensure that it is managed in accordance with policies and regulations. Finally, lineage can help identify opportunities for optimization, such as reducing redundant processing or combining multiple data sources into a single source.

Putting lineage-centred planning into practice in the data office

Lineage can be used for planning in a variety of ways. First, it can provide an overview of the entire system and identify any weak points or errors in the process. By understanding the full scope of the system, organizations can better plan out their strategy and make decisions about which areas need improvement. Additionally, lineage can be used to track changes over time and identify trends that could be useful when making decisions about future projects or initiatives.

Second, lineage can help organizations understand how their data is being used by different stakeholders. By understanding where data comes from, who has access to it, and how it is being utilized, organizations can develop strategies that are tailored towards specific groups of users. This will help them identify potential opportunities for growth or areas where they need to focus more attention on improving their processes or systems.

Finally, lineage can also help organizations plan out their resources more effectively and conduct the sort of impact analysis necessary for any transformation project. By understanding where data assets come from and how they are being used throughout the organization, teams can better allocate resources such as personnel or technology so that they are able to meet their goals more efficiently.
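
To make the idea of impact analysis concrete, here is a minimal sketch in plain Python. It assumes lineage is available as a simple list of (source, target) edges; the field names are hypothetical, and this is an illustration of the general technique, not a description of how Solidatus works internally. Following the edges downstream answers "what is affected if this changes?", while following them upstream answers "where does this come from?".

```python
# Hypothetical column-level lineage edges: (source, target).
EDGES = [
    ("crm.customer.address", "warehouse.dim_customer.address"),
    ("warehouse.dim_customer.address", "reports.mailing_list.address"),
    ("warehouse.dim_customer.address", "risk.kyc_check.address"),
    ("erp.invoice.amount", "reports.revenue.total"),
]


def trace(start, edges, direction="downstream"):
    """Collect every node reachable from `start` by following lineage edges."""
    if direction not in ("downstream", "upstream"):
        raise ValueError("direction must be 'downstream' or 'upstream'")

    # Build an adjacency map oriented in the requested direction.
    adjacency = {}
    for source, target in edges:
        if direction == "downstream":
            adjacency.setdefault(source, set()).add(target)
        else:
            adjacency.setdefault(target, set()).add(source)

    found = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in found:
                found.add(neighbour)
                stack.append(neighbour)
    found.discard(start)  # a cycle should not report the start node itself
    return found


if __name__ == "__main__":
    print(sorted(trace("crm.customer.address", EDGES)))
    print(sorted(trace("risk.kyc_check.address", EDGES, direction="upstream")))
```

Keeping both directions in one function reflects the point made above: the same map serves planning a change and understanding provenance.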

Insights for success

At its core, lineage provides us with a framework for understanding how certain events or ideas are connected to one another. This helps us see patterns in our life that may not be immediately obvious. For example, if you look at your family tree, you may notice that certain traits or interests run through generations. If you look at the history of an industry or field of study, you can identify key moments when certain innovations emerged and how they affected the development of that particular area.

By looking at lineage as a planning tool we can also gain insights into how different elements interact with each other in order to achieve success or failure. For instance, if we want to start a business, we need to consider all the factors involved such as market trends, customer preferences, competition etc., and then plan accordingly so that we maximize our chances of success. We can also use lineage to predict potential outcomes by looking at past successes or failures and using them as indicators for what might happen in the future.

Greenfield will always be easier, and if we plan with a lineage-first approach we’ll make a more sustainable environment. However, if we have legacy and stop to map it, then we can make the best of the situation we find ourselves in.

If Bazalgette were an enterprise architect, he might say: The principle in planning with lineage first was to divert the cause of the mischief to a locality where it can do no mischief.

The Happy CDO Project

We asked at the top of this article what this all has to do with data management, and lineage is a huge part of good data management. But it’s more than that. We think that lineage tools, better data management technology more generally, and methodology fit for the 2020s are central to being a happy CDO. These are core findings of proprietary research we recently commissioned, as discussed in a new white paper, Data Distress: Is the Data Office on the Brink of Breakdown? Part of The Happy CDO Project, this research highlights that 71% of the 300 senior data leaders in financial services in the US and UK who were surveyed have considered quitting their jobs as a result of a phenomenon that we define as ‘data distress’. This is just one of the findings that we explore – along with suggested remedies.

Data Distress: Is the Data Office on the Brink of Breakdown?

Senior global data leaders within banking and financial services firms are currently experiencing alarmingly high levels of data-related stress in the workplace, with 64% reporting that their data-related stress levels are sometimes or always high.

87% of those who reported any data-related stress, regardless of how often they feel it, say that it has affected their mental health and well-being, with 74% having taken sick days as a result, and 61% enduring an average of two to six nights of disrupted sleep per week.

And this anxiety has prompted 71% to consider quitting their jobs.

We call this phenomenon ‘data distress’, and these statistics are just a handful of the headline findings from new research recently commissioned by Solidatus.

The first major study of its kind, it involved 300 senior data leaders across the US and the UK in the financial services sector answering a series of questions on their levels of data-related stress and their views on the contributory factors.

Whitepaper

The assessment is bleak, and you can read about it in our whitepaper: Data Distress: Is the Data Office on the Brink of Breakdown? How US and UK Data Leaders in Banking and Financial Services are Facing Data Burnout. And the conversations we have day in, day out with practitioners – the ones that prompted us to commission this work in the first place – bear out our findings, so these will be very familiar pain points to people in our space.

But there’s hope; by quantifying it, as we do in the report, remedies have emerged.

Mistrust and the burden of regulations

So, what’s getting practitioners down?

A broad area we’ve defined as ‘data ambiguity and uncertainty’ appears to be the most significant cause of data-related stress, with 82% of respondents choosing at least one option from a range of answers that fall within this umbrella category.

A particular source of frustration is how time-consuming and stressful managing data for financial regulations is. The most commonly cited category of reason is one we designate ‘tech deficiencies’: 93% of respondents’ answers fell into this bucket. This doesn’t just cause data distress; it’s a huge commercial distraction.

“With tactical firefighting and fine avoidance being the default, productivity and opportunity discovery will be stifled,” said Philip Dutton, CEO and founder of Solidatus, adding that with “global banking estimated to be worth around $20 trillion per year, even as little as a 5% drop in strategic activity due to data distress represents a $1 trillion reduction in value”.

Better tech and governance

In our report, we identify how the right tech stack and a better approach to data governance are both key to unlocking the cure for data distress, increasing trust in your data and generally increasing practitioners’ levels of happiness. By heeding the advice in the report, you can:

  • Deliver data that you and your colleagues can trust and on which you can base decisions confidently;
  • Reveal business opportunities that might otherwise have been obscured in the attempt to demonstrate compliance with suboptimal systems; and
  • Reduce data distress.

Download the report to dig deeper into our quantitative research.

The Happy CDO Project

You’ll notice that we mention happiness above. We use this word more than in passing, as this research represents the first piece of activity in The Happy CDO Project, a new initiative from Solidatus.

We’re shining a light on the issues that matter to data leaders to help them to be successful, fulfilled and happy in their work.

By focusing on their challenges, we can highlight solutions and strategies to help CDOs, data leaders and data practitioners to be their best.

We look forward to sharing more on this project with you in the months ahead.

Data Distress: Is the Data Office on the Brink of Breakdown?

Data distress: 71% of senior data leaders in financial services polled on brink of quitting jobs

  • 74% of surveyed data leaders who experience workplace stress have had to take sick days as a result
  • 61% endure 2 – 6 nights of disrupted sleep per week
  • 33% cite ‘too many disparate and siloed data sources’ as the cause of data distress
  • “Data distress could be costing the global banking system hundreds of billions of dollars in lost productivity and missed opportunities,” says data management software CEO

27 July 2023, London and Houston: An alarming 71%[1] of surveyed senior global data leaders within financial services firms who experience stress in the workplace are ready to quit their job due to high levels of work-related stress, with 87%[2] saying it affects their mental health and well-being, according to new research by data management firm Solidatus.

With 64%[3] experiencing high levels of stress in the workplace, which causes more than three in five (61%) to suffer from between two and six nights of disrupted sleep per week, it appears that senior data leaders are in an intense state of anxiety and potentially heading for burnout.

Solidatus’ research, conducted in collaboration with market-research firm Censuswide, surveyed 300 senior data leaders across the UK and the US in the financial services sector.

Respondents cite the top three causes of data distress as:

  • too many disparate and siloed sources of data (33%);
  • having to establish the appropriate sources of data for a task in hand (31%); and
  • the risk of fines relating to data governance and regulatory compliance (31%).

Philip Dutton, CEO and Founder of Solidatus, warned that each of these factors compounds the others, resulting in a fundamental breakdown in trust, which drives even higher levels of stress.

He said: “Data has become the lifeblood of organizations, driving innovation and decision-making. However, the exponential growth of data, the atomization of data supply chains, the tsunami of regulation and the ever-increasing rate of change of business processes and systems has created almost unmanageable complexity. The resultant demands and pressures faced by data leaders have given rise to a mounting crisis: data distress. This is particularly acute in financial services, where our research found elevated levels of workplace data stress, which is having a significant impact on mental health. Left unchecked, this could have serious consequences across the organizations that rely on the expertise and leadership of these individuals.”

Impact on job satisfaction and performance

Data distress is taking its toll on the job satisfaction and performance of today’s financial services data leaders: 80%[4] said the high level of stress at work impacts their ability to do their job properly, rising to 86%[5] among 25- to 34-year-olds.

79% of respondents who have been at their company for 1-4 years agree[6] that this level of stress makes them want to leave their job, compared to 62% of those who have been in their job for 5+ years. This suggests that the shorter a respondent’s tenure, the stronger the urge to quit because of stress.

Tough regulatory requirements contributing to data distress

But despite 83%[7] feeling confident about their company’s ability to collate and report the right data for regulatory requirements, it still takes too much time. Almost a third (32%) of respondents say their team spends four to five hours per week managing data for financial regulations. Nearly three-quarters (73%) believe that up to half their time in this pursuit is wasted through inefficiencies, such as poor systems and data.

As a result, just under half (47%) say ‘lack of data management tools’ is one reason why managing data for financial regulation takes as long as it does and is so stressful. Over a third (34%) state that it’s because ‘our data sets are all in siloed systems’ and 32% believe ‘we don’t have a good view of our full data estate’.

Dutton added: “Lack of data trust, decreasing efficiencies, increases risk and ultimately creates data distress. With such a seismic shift in organizations’ data, and regulatory and change environments, a totally new operating model needs to be deployed, supported by modern data management tools designed to cope with infinitely connected and complex environments within financial organizations. Urgent action is needed to deal with data distress and protect the mental health of the custodians of organizations’ most valuable asset: their data.

“Organizations’ data management foundations and supporting structures need to be replaced to enable simplicity, transparency, and understanding to promote implicit data trust. Only then will data leaders operate in a sustainable environment free from data distress.

“What’s more, if the pressure to go faster continues to induce data distress, with tactical firefighting and fine avoidance being the default, productivity and opportunity discovery will be stifled. With global banking estimated to be worth around $20 trillion per year[8], even as little as a 5% drop in strategic activity due to data distress represents a $1 trillion reduction in value.”

Dannielle Haig is an independent Business Psychologist who coaches senior business leaders.

She said: “In today’s data-driven world, the abundance and chaos of information are having a severe impact on data leaders. The sheer volume, velocity, and variety of datasets available are overwhelming and taking a toll on their mental health and well-being.

“To navigate this data deluge, it is crucial for data leaders to prioritize self-care. By fostering a healthy work-life balance and seeking support with the right tools and techniques – which can increase their capacity to make better decisions that they’re more confident in and to de-risk – they can maintain their mental stability, manage their spiralling datasets and lead with optimism, clarity and resilience.”

-ENDS-

Notes to Editor

To find out more about the Solidatus ‘Data Distress’ report, please download it here: https://www.solidatus.com/resources/whitepapers/data-distress-is-the-data-office-on-the-brink-of-breakdown/

The research was conducted by Censuswide from 19.05.2023 – 26.05.2023. An online survey reached 308 Senior Data Leaders (CDOs etc.) in the financial services sector, aged 18+, across the UK and USA. Censuswide abides by and employs members of the Market Research Society which is based on the ESOMAR principles.

About Solidatus

Solidatus is an innovative data management solution that empowers organizations to connect and visualize their data relationships, simplifying how they identify, access, and understand them. With a sustainable data foundation in place, data-rich enterprises can meet regulatory requirements, drive digital transformation, capture business insights, and make better, less risky and more informed data-driven decisions. We provide solutions to several key areas of endeavor, including governance and regulatory compliance; data risk and controls; business integration; environment, social, governance (ESG); and data sharing. Our clients and investors include top-tier global financial services brands such as Citi and HSBC, healthcare, and retail organizations as well as government institutions.

www.solidatus.com

Media Contact:
E-mail: solidatus@babelpr.com
Telephone: +44 (0) 20 7434 5550

[1] 71% combines “Strongly agree” and “Somewhat agree”

[2] 87% combines “Significantly impacts” and “Somewhat impacts”

[3] 64% combines “Always high” and “Sometimes high”

[4] 80% combines “Significantly impacts” and “Somewhat impacts”

[5] 86% combines “Significantly impacts” and “Somewhat impacts”

[6] Agree combines “Strongly agree” and “Somewhat agree”

[7] 83% combines “Very confident” and “Somewhat confident”

[8] In September 2021, Investopedia said that “Market estimates project that by the end of 2021, the financial services market is likely to reach $22.5 trillion,” in an article entitled Financial Services: Sizing the Sector in the Global Economy. See also McKinsey’s Global Banking Annual Review, published December 2022.

The value of data and other reflections from attending last month’s Gartner® Data & Analytics Summit 2023 in London, U.K.

By Tina Chace, VP Product

In April, the latest chapter in my 10-year career in fintech product management began: I was appointed VP Product at Solidatus.

Though I’m new to this side of data and analytics (DA) governance, I had been that person who felt the pain when data flows, systems, data lineage and quality aren’t mapped; it leads to inefficiency or failure in your operations, and it’s my mission at Solidatus to help put this right for people who subscribe to our sophisticated software.

From previous roles, I also came equipped with experience of AML products, model risk management, and artificial intelligence (AI) and machine learning (ML).

So I was delighted when asked to fly in from New York to join the team at last month’s Gartner® Data & Analytics Summit in London.

That’s me on the far left of the picture above, accompanied by some of my colleagues, including our founder and CEO, Philip Dutton, second from the right.

But my main objective in attending was to listen to people from beyond the Solidatus stand. What was the mood music in this space? And what might I be able to do with it in my new role?

We’ll start with an overview of what I felt were the major themes. These observations are an amalgam of the various sessions I attended, from a wide variety of speakers and organizations.

Major themes

As a first-time attendee at a data and analytics conference, I observed that:

  • Articulating the value of data and governance projects and teams is still challenging. Putting real numbers on the impact of a governance project is difficult, and the recommendation was that financial influencers (such as enabling faster decision-making) should be touted as highly as direct tactical impact, such as headcount reduction. The best examples of getting buy-in from stakeholders outside the data office are real-world use cases which tell the story of value.
  • Culture and people are key to a successful data governance organization. There were many examples of success, where an inspirational data leader was able to align technologies, processes and people to achieve their outcomes. There were also warnings of potential pitfalls: purchasing a particular piece of technology isn’t enough if there are no trained data engineers to work in the product and end-users don’t adopt those tools.
  • As you’d expect, artificial intelligence was talked about everywhere. There seemed to be a proliferation of AI/ML/data science (DS) vendors, as well as more traditional vendors touting how AI/ML was powering their platforms. A significant proportion of sessions had AI as a topic as well. In my view, there are two ways to approach AI through a data and analytics lens: how data governance is a key part of a successful enterprise application of AI and ML, and how AI/ML can assist data and analytics governance. More on this later in this post.

Let’s expand on these points by rolling the first two into one discussion and finishing with the third on its own.

Data governance programs: the vision

The ideal state for a successful data governance program is for self-service data products and data governance to be owned by their various domains within a centralized framework – essentially, this dovetails with descriptions I saw of how data fabric architecture and data mesh operating models can be used in combination.

For the more practical data leader, a valuable stepping stone is just breaking the silos between domains to have an accurate map of their own data ecosystem. Our customer, Lewis Reeder from the Bank of New York Mellon, presented with Philip Dutton on the bank’s success in leveraging Solidatus’ metadata layered with business context to clearly visualize their data ecosystem.

One general impression I got from this topic as a whole is that the journey to the vision is best taken in reasonable, planned increments that each deliver value themselves. Understanding your data enables tactical things – such as making the data office operate more efficiently – but also underpins all other functions in the organization. Understanding where your data comes from enables functions such as KYC and AML programs, financial and ESG reporting, algorithmic trading functions, and of course feeding the ML models that help drive operational efficiency and automation in other domains.

The verdict on AI

As discussed in a session entitled The Enterprise Implications of ChatGPT and Generative AI on Monday 22nd May, Gartner has rated techniques such as LLMs (large language models) “not ready for prime time” but something to monitor and research.

I agree wholeheartedly with this judgement, based on my experience of trying to implement them for large and highly regulated institutions. The main limitations are the inability of enterprise customers to safely use and potentially contribute to the open-source datasets that drive models such as OpenAI’s ChatGPT, the effort and responsibility involved in creating and curating a private dataset while utilizing GPT-3/4 technologies to build your own LLMs, and the lack of a big enough use case to make these efforts worth the time and money.

The main use case demonstrated is to enable what I saw called “natural language questioning”. This allows business users to easily ask questions to assist them with self-service data governance like “which systems contain information on customer addresses?” or “what is the impact of adding a new field, tax ID number, to a customer entity?”. I’d describe this as search on steroids.

More traditional machine learning methods, such as classification, matching, data extraction, anomaly detection, and topic modeling, can be used to great effect to support data and analytics governance. Plus, the subject matter expertise will be easier to come by, and implementation is more straightforward.

Classification, for example, can be used to determine whether a system should be controlled by certain procedures based on its properties and relationships to other systems. Matching on both name and semantics can be used to suggest mappings between systems for both technical and business terms to assist in creating and enforcing a centralized data dictionary.
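To make the name-matching idea concrete, here is a minimal sketch in Python. It covers only the name-based half of the matching described above, using the standard library's `difflib`; semantic matching would need something more, such as embeddings, and is not shown. The field names, dictionary terms and the 0.6 threshold are all hypothetical, chosen purely for illustration.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lower-case a field name and turn common separators into spaces."""
    return name.lower().replace("_", " ").replace("-", " ").strip()


def name_similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def suggest_mappings(source_fields, target_terms, threshold=0.6):
    """For each source field, suggest the best-scoring target term,
    keeping only suggestions at or above the threshold."""
    suggestions = {}
    for src in source_fields:
        scored = [(name_similarity(src, tgt), tgt) for tgt in target_terms]
        best_score, best_target = max(scored)
        if best_score >= threshold:
            suggestions[src] = (best_target, round(best_score, 2))
    return suggestions


# Hypothetical technical field names and data dictionary terms.
crm_fields = ["cust_address", "tax_id_number", "signup_date"]
dictionary_terms = ["Customer Address", "Tax ID Number", "Account Open Date"]

suggestions = suggest_mappings(crm_fields, dictionary_terms)
for field, (term, score) in suggestions.items():
    print(f"{field} -> {term} (score {score})")
```

In a sketch like this, anything below the threshold is left unmapped for a data steward to review, which keeps the suggestions as an aid rather than an automatic decision.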

On the flip side, understanding your data ecosystem is key to a successful AI/ML project. From my experience in implementing ML products for regulated enterprise customers, we ran into many roadblocks that a well-mapped understanding of the available data and impacted systems would have solved. Understanding what datasets are available and the quality of the datasets to train your ML model is helpful; there were times when it took days or weeks to approve data to just test one of the ML models we implemented – even on prem. And there were many times when data that could have improved the automation rate of the ML model was excluded because the business stakeholders didn’t know or trust the quality of the data.

Finally, there were many scenarios where changes to upstream systems negatively impacted the performance of ML models and the customer didn’t find out until it had already flowed through.
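As a rough illustration of how a lineage map helps here, the sketch below walks a small dependency graph to list everything downstream of a system that is about to change. The system names and the graph itself are invented for the example, and a real lineage model would carry far more detail than a dictionary of edges.

```python
from collections import deque

# Hypothetical lineage: each key feeds the assets listed in its value.
lineage = {
    "crm": ["customer_master"],
    "customer_master": ["feature_store", "regulatory_report"],
    "feature_store": ["churn_model"],
    "churn_model": [],
    "regulatory_report": [],
}


def downstream_of(system, edges):
    """Breadth-first walk returning every asset reachable downstream
    of `system`, in the order it is discovered."""
    seen = {system}
    order = []
    queue = deque([system])
    while queue:
        current = queue.popleft()
        for nxt in edges.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


impacted = downstream_of("crm", lineage)
print(impacted)
```

With a map like this, the team that owns the ML model could be notified before a change to the upstream system is released, rather than after the effect has flowed through.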

Putting it into practice

It was great to see so many data leaders talking about their real use cases and success stories. But it was even more interesting to hear about their struggles and what they have learned while implementing their data governance programs.

There was a lot to learn from the panels and sessions but also from talking directly to practising data leaders about their own specific scenarios and pain points. The willingness to share knowledge with peers will surely drive the D&A industry forward as a whole.

I can’t wait to put these insights into practice in my new role.

GARTNER is a registered trademark and service mark of Gartner, Inc. and/or its affiliates in the U.S. and internationally and is used herein with permission. All rights reserved.

Written for Solidatus – a leading data lineage solutions provider.


Last month, Gartner® published its Market Guide for Active Metadata Management*.

We were delighted – but not surprised – to see that Solidatus was named a Representative Vendor in this Market Guide report. To us, active metadata is at the heart of everything we do.

But what is active metadata?

Gartner opinion

In the report, Gartner describes active metadata management as “a set of capabilities across multiple data management markets, led primarily by recent advancements in how metadata can be used for continuous analysis. Data and analytics leaders must consider the market evolution as transformational in all data-enabling technologies”.

In our opinion, this is a great overview, and we’d recommend you read the full report. Highlights include:

  • A strategic assumption that “[t]hrough 2024, organizations that adopt aggressive metadata analysis across their complete data management environment will decrease time to delivery of new data assets to users by as much as 70%”;
  • A market direction, which states that, “[o]verall, the metadata management software market grew at 21.6%, reaching $1.54 billion in U.S. dollars. This is one of the highest growing markets within data management software overall, following the DBMS market growth of 22%, although from a much smaller revenue base”; and
  • A market analysis that states that “[c]ollaborative utilization will require new ways to capture and visualize metadata (driven by data preparation for analytics). Included is the capability of rating, ranking, tagging of data and ability to communicate within the metadata solutions”.

But we think active metadata means slightly different things to different vendors.

In this short blog post, a prelude to a series of more detailed blog posts on this increasingly important subject, we summarize what active metadata means to Solidatus and its growing body of users.

The DNA of Solidatus

It took others to identify and name active metadata. But – as with DNA itself, which obviously existed before Watson and Crick discovered and named it in the 1950s – active metadata is, and has always been, in our DNA.

It’s what we do and it’s what underpins our technology, through whichever use case lens you view our data lineage solution.

It starts with metadata itself, which we’d define as a special kind of data that describes business processes, people, data and technology, and the relations between them, bringing context and clarity to the decisions that link them. Traditional examples include data catalogs and business glossaries.

This brings us to active metadata. We believe our definition resonates with Gartner’s: the way we see it, active metadata is the facility to reason about, visualize dynamically and gain continuous insight from information about data, data systems, business entities and business concepts, the relations between them, and stored knowledge about them.

How we make metadata active

So, what makes active metadata active and why is it so different from what went on before? At Solidatus, we’d answer these questions with four points:

  • Active metadata includes logical reasoning;
  • Active metadata offers a very dynamic form of visualization;
  • The information in active metadata is not just about the entities themselves, but about the connections between them; and
  • Active metadata should include stored knowledge. This is subtly different from other metadata, because it sits at a higher level, and offers more general, or more universal, information, such as business definitions.

The consequence of all of these is continuous insight. It’s more dynamic, it’s more complete, it’s based on context as well as content, and it respects standards.

It’s a whole different ballgame.

We’ll expand on these in future blog posts, but anyone familiar with Solidatus will immediately appreciate how we sit right in the centre of this space.

The wider context

We’ll finish by contextualizing active metadata, at least as we see it, in terms of what’s delivered, the attributes of an active metadata solution, and – crucially – the main areas for which it can be used.

What’s delivered
An active metadata solution:

  • Is embedded within an organization’s data and business practices;
  • Presents a continuous, coordinated, enterprise-wide capability; and
  • Provides monitoring, insight, alerts, recommendations and design.

Solution attributes

An active metadata analytics workflow:

  • Is integrated, managed and collaborative; and
  • Orchestrates inter-platform metadata assets and cross-platform data asset management.

What it can be used for

Active metadata assets are used to create insight solutions which, among other things, enhance:

  • Data integration;
  • Resource management;
  • Data quality management;
  • Data governance;
  • Corporate governance;
  • Regulatory control;
  • Risk management;
  • Digital transformation; and
  • ESG.

Above all, the benefits of good metadata capabilities boil down to: making business information complete, coherent, informed and logical; delivering faster, richer and deeper insight; keeping everything up to date; and making your processes reliable and responsive.

Watch this space for our detailed follow-up blog posts and, as ever, we encourage readers to request a demo of Solidatus.

To read more on this subject, see What is active metadata? and Mining value from active metadata.

*Gartner, “Market Guide for Active Metadata Management”, November 14, 2022.

GARTNER is a registered trademark and service mark of Gartner, Inc. and/or its affiliates in the U.S. and internationally and is used herein with permission. All rights reserved. Gartner does not endorse any vendor, product or service depicted in its research publications and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s Research & Advisory organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.

Quick Answer: What Is Active Metadata?

In the latest Gartner® report, find out what active metadata is, how to use it, and how to get started.


We all know that climate change is a critical topic for all of us. COP27 couldn’t be clearer on this. But organizations are having to learn how to report and manage their impact and influence on the planet, and this can be a complex process to manage.

A key part of reporting and managing climate change is to have a shared taxonomy so that you, your regulators, your suppliers, and your customers all understand your impact.

Sadly, in the case of climate change, that is not as simple as it would be in an ideal world because across the globe, regulators and governments have implemented a bewildering range of regulations, standards, frameworks and laws.

This means it can be difficult to be confident that everyone is speaking the same language.

To help overcome this, Solidatus is a contributing partner to the Open-Source Sustainable Finance Taxonomy Project.

The aim of this initiative is to create a global, open-source resource, available to all, that holds global taxonomies. It is intended to help all organizations better understand the requirements and to help standardize the language used around climate change.

Launched last month in Dublin, this resource is now available on GitHub to any organization wishing to understand a wide range of climate change and sustainable finance taxonomies, including ISSB accounting standards, the SFDR and TCFD frameworks, and a host of other key taxonomies such as NACE and NAICS industry categorizations. This resource will grow over time and includes cross-mappings between the key taxonomies.

Each of these taxonomies is available as a downloadable text file and is also represented as a Solidatus model, which allows users to visually:

  • Understand the structure in each taxonomy
  • Understand the relationships between different taxonomies
  • Focus in on the similarities, overlaps and differences between key taxonomies

These visualizations are available to view for all users. However, Solidatus clients can also download these models and import them within their Solidatus instance as a reusable resource and a key component in linking the taxonomies with their internal business processes.

This approach will deliver a platform to demonstrate to regulators and auditors how you are meeting climate change reporting requirements.

The links to these taxonomies are below.

Please contact Solidatus at hello@solidatus.com if you want to know how to best use them within your organization.

Organizations that wish to get involved in the open-source Sustainable Finance Taxonomy project can do so via the partner page on the GitHub site (link below). The more partners we have involved, the more valuable this shared resource will be.

Solidatus is proud to be doing its part in the fight against climate change by being a founding member of the First Global Project for Open-Source Sustainable Finance Taxonomy (‘OS-SFT’), and we look forward to working with you on this in the future.


Screenshot of the relationship between TCFD, EBA Pillar 3 and ISSB, highlighting the added value of linking topics across these taxonomies within Solidatus

Further reading

Open-Source Sustainable Finance Taxonomy
According to its GitHub page, the “objective of the project is to provide the marketplace with the following open-source, practical tools to advance user implementation of sustainable finance data systems into business operating models for new and evolving taxonomy frameworks, standards, regulations and laws”.

See https://github.com/FD-SustainableFinance/0-OS-SFT-OVERVIEW.

Open-Source Sustainable Finance Taxonomy Partner
See https://github.com/FD-SustainableFinance/06-COLLABORATORS-PARTNERS.

UN Climate Change report
See https://unfccc.int/news/climate-plans-remain-insufficient-more-ambitious-action-needed-now


Snowflake’s native governance capabilities give you the power to know, control and unlock your data. Snowflake created the Data Governance Accelerated Program for partners who can integrate and enhance these capabilities, and the team at Solidatus is proud to partner with it to add new levels of context and visualization to Snowflake. This enhances its governance capabilities, giving you more control and a deeper understanding of how data is used across your organization and where policies need to be applied. With this knowledge, you can better address regulatory requirements, drive digital transformation, capture business insights, understand data lineage, and make better, less risky and more informed data-driven decisions.

Key benefits of the Solidatus approach

As a Snowflake user, you have access to a suite of data governance controls such as access policies, data masking, object tagging and coarse-grained lineage. Solidatus’ data lineage enriches these capabilities by allowing you to create living blueprints that map how your data flows as it moves through your systems – both now and at other points in time. When metadata is ingested from Snowflake into Solidatus, we integrate any security policies, data masking, and tagging information as an extra dimension to your data flows. By understanding how and where these controls are being applied, you establish a trusted foundation for policy enforcement, and radically speed up cloud transformation projects by cutting discovery time, streamlining implementation, and mitigating future risk and costs.
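To illustrate what treating policies as an extra dimension of a data flow can reveal, here is a small, hedged sketch in Python. It is not Solidatus’ or Snowflake’s API: the column names, the lineage pairs and the set of masked columns are all hypothetical, and the check simply flags flows where a masked source column feeds a target that has no masking policy.

```python
# Hypothetical column-level lineage: (source column, target column) pairs.
flows = [
    ("raw.customers.email", "staging.customers.email"),
    ("staging.customers.email", "mart.marketing.contact_email"),
    ("raw.customers.country", "mart.marketing.country"),
]

# Hypothetical governance metadata: columns that carry a masking policy.
masked_columns = {"raw.customers.email", "staging.customers.email"}


def unmasked_flows(flows, masked):
    """Return flows where a masked source column feeds a target
    column that has no masking policy of its own."""
    return [(src, tgt) for src, tgt in flows if src in masked and tgt not in masked]


gaps = unmasked_flows(flows, masked_columns)
for src, tgt in gaps:
    print(f"Review: {src} is masked but flows into unmasked {tgt}")
```

A flagged flow is not necessarily a problem, since the target may be protected in another way, but it shows where a policy decision needs to be checked.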

To summarize, Snowflake with Solidatus will:

  • Cut costs by reducing the time spent managing and implementing resources.
  • Speed up your migration to Snowflake.
  • Provide needed visibility to reduce operational risk and improve operating efficiency.
  • Boost governance and regulatory compliance.
  • Establish controlled planning and implementation of change.
  • Monitor data sharing for compliance.
  • Visualize and contextualize data policies, regulations and other data assets present in Snowflake.