Data lineage tools

Collibra Data Lineage

Collibra Data Lineage automatically maps relationships between data to show how data flows from system to system and how data sets are built, aggregated, sourced and used, providing complete, end-to-end lineage visualization.

BI Tools lineage: Yes
Commercial: Commercial
Data migration tools lineage: Yes
Data warehouses lineage: Yes
ETLs: Yes
Free edition: No
Hadoop: No
NoSQL: Yes
Pipelines lineage: Yes
RDBMS: Yes

Octopai

Octopai Data Lineage XD is a complete, in-depth, and trustworthy automated lineage tool that offers three types of lineage in one easy-to-use platform: Cross-System Lineage (end-to-end lineage at the system level, from the entry point into the BI landscape all the way to reporting and analytics), End-to-End Column Lineage (column-to-column lineage between systems, from the entry point into the BI landscape all the way through to reporting and analytics), and Inner-System Lineage (column-level lineage within an ETL process, report, or database object).

BI Tools lineage: Yes
Commercial: Commercial
Data migration tools lineage: Yes
Data warehouses lineage: Yes
ETLs: Yes
Free edition: No
Hadoop: No
NoSQL: No
Pipelines lineage: No
RDBMS: Yes

MANTA

MANTA is a data lineage platform that automatically scans your data environment to build a powerful map of all data flows and deliver it through a native UI and other channels to both technical and non-technical users. It automatically scans every nook and cranny to get immediate, accurate, and up-to-date lineage.

BI Tools lineage: Yes
Commercial: Commercial
Data migration tools lineage: Yes
Data warehouses lineage: Yes
ETLs: Yes
Free edition: No
Hadoop: Yes
NoSQL: No
Pipelines lineage: Yes
RDBMS: Yes

Tokern

Tokern Lineage Engine is a fast, easy-to-use platform to collect, visualize, and analyze column-level data lineage in databases, data warehouses, and data lakes in AWS and GCP. You can use the API or library to access column-level lineage and to automate data quality triage, scanning and tagging of PII/PHI/sensitive data, programmatic monitoring and management of ACLs, data and ETL pipeline cleanup, and impact analysis.

BI Tools lineage: No
Commercial: Free
Data migration tools lineage: No
Data warehouses lineage: Yes
ETLs: Yes
Free edition: Yes
Hadoop: No
NoSQL: No
Pipelines lineage: Yes
RDBMS: Yes
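
Impact analysis over column-level lineage, as mentioned in the Tokern description, comes down to walking a dependency graph from a changed column to everything derived from it. The sketch below illustrates that idea only; it does not use Tokern's API, and the edge data and the `downstream` helper are hypothetical names chosen for the example.

```python
from collections import deque

# Hypothetical column-level lineage: source column -> columns derived from it.
EDGES = {
    "raw.orders.amount": ["staging.orders.amount"],
    "staging.orders.amount": ["mart.revenue.total", "mart.refunds.total"],
    "raw.orders.id": ["staging.orders.id"],
}


def downstream(column, edges):
    """Return every column reachable from `column`, in breadth-first order."""
    impacted = []
    visited = {column}
    queue = deque([column])
    while queue:
        current = queue.popleft()
        for child in edges.get(current, []):
            if child not in visited:
                visited.add(child)
                impacted.append(child)
                queue.append(child)
    return impacted


# Columns that would be affected by a change to raw.orders.amount.
IMPACTED = downstream("raw.orders.amount", EDGES)
```

Tracking visited columns keeps the traversal from looping if the lineage graph happens to contain a cycle.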

SolarWinds Database Mapper

SolarWinds Database Mapper (formerly SentryOne Document) delivers powerful documentation and data lineage analysis capabilities in a cloud or software solution. With Database Mapper, you can track data lineage with a visual display that clearly shows data dependencies across your environment.

BI Tools lineage: Yes
Commercial: Commercial
Data migration tools lineage: No
Data warehouses lineage: No
ETLs: No
Free edition: No
Hadoop: No
NoSQL: No
Pipelines lineage: No
RDBMS: Yes

SQLFlow

SQLFlow is a SQL data lineage tool that provides a visual representation of the overall flow of data. It offers automated SQL data lineage analysis across databases, ETL, business intelligence, cloud, and Hadoop environments by parsing SQL scripts and stored procedures. It enables impact analysis at a granular level, letting you drill down into table-, column-, and query-level lineage.

BI Tools lineage: Yes
Commercial: Commercial
Data migration tools lineage: No
Data warehouses lineage: Yes
ETLs: Yes
Free edition: No
Hadoop: Yes
NoSQL: No
Pipelines lineage: No
RDBMS: Yes
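
To give a feel for what deriving lineage by parsing SQL means, here is a minimal sketch of table-level lineage extraction. It is not SQLFlow's parser: it is a toy regex approach that assumes simple `INSERT INTO ... SELECT ... FROM/JOIN` statements, and the `table_lineage` helper and the sample script are hypothetical. A real tool needs a full SQL grammar to handle subqueries, CTEs, and dialect differences.

```python
import re

# Toy patterns: only plain "INSERT INTO <target>" and "FROM/JOIN <source>" forms.
TARGET_RE = re.compile(r"\binsert\s+into\s+([\w.]+)", re.IGNORECASE)
SOURCE_RE = re.compile(r"\b(?:from|join)\s+([\w.]+)", re.IGNORECASE)


def table_lineage(sql: str) -> dict[str, set[str]]:
    """Map each target table to the set of source tables that feed it."""
    lineage: dict[str, set[str]] = {}
    for statement in sql.split(";"):
        target = TARGET_RE.search(statement)
        if target is None:
            continue  # not an INSERT ... SELECT statement; ignored in this sketch
        sources = {name.lower() for name in SOURCE_RE.findall(statement)}
        lineage.setdefault(target.group(1).lower(), set()).update(sources)
    return lineage


SCRIPT = """
INSERT INTO staging.orders SELECT * FROM raw.orders;
INSERT INTO mart.revenue
SELECT o.id, c.region, o.amount
FROM staging.orders o JOIN staging.customers c ON o.customer_id = c.id;
"""

LINEAGE = table_lineage(SCRIPT)
```

Here `LINEAGE` links `mart.revenue` back to `staging.orders` and `staging.customers`, and `staging.orders` back to `raw.orders`, which is the kind of chain a lineage diagram visualizes.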

ASG Data Intelligence

ASG Data Intelligence (ASG DI) is the solution for data distrust. It is a metadata-driven platform that makes technical data “smarter” with end-to-end views of the data and its movements (data lineage) combined with business meaning and usage guardrails. It lets you visualize data flows mapped to business context, and it uniquely traces lineage by parsing code from data sources, applications, tools, and source code.

BI Tools lineage: Yes
Commercial: Commercial
Data migration tools lineage: No
Data warehouses lineage: Yes
ETLs: Yes
Free edition: No
Hadoop: Yes
NoSQL: Yes
Pipelines lineage: No
RDBMS: Yes

Global IDs

Global IDs Data Lineage provides automated analysis of the actual flow of data through your enterprise, enabling you to understand, in real time, where data originates, how it flows through the ecosystem, and how it is transformed en route.

BI Tools lineage: No
Commercial: Commercial
Data migration tools lineage: Yes
Data warehouses lineage: Yes
ETLs: Yes
Free edition: No
Hadoop: Yes
NoSQL: Yes
Pipelines lineage: No
RDBMS: Yes

Atlan

Atlan provides effortless data lineage & governance by letting you auto-construct data lineage & deploy best-in-class data access governance without compromising on data democratization. It automatically parses through your SQL query logs in your data warehouses and BI tools to create a visual view of data lineage.

BI Tools lineage: Yes
Commercial: Commercial
Data migration tools lineage: No
Data warehouses lineage: Yes
ETLs: Yes
Free edition: No
Hadoop: Yes
NoSQL: No
Pipelines lineage: Yes
RDBMS: Yes

Alteryx Connect

Alteryx Connect uses powerful search capabilities to find and reuse information contained in data files, databases, visualizations, dashboards, workflows, analytic apps, and more. It lets you automatically capture and visualize data lineage between assets, improving the overall quality and reliability of shared information between data, process, and people. You can get technical data lineage by loading metadata from source and target systems and interpreting Alteryx workflows.

BI Tools lineage: Yes
Commercial: Commercial
Data migration tools lineage: No
Data warehouses lineage: Yes
ETLs: Yes
Free edition: No
Hadoop: Yes
NoSQL: Yes
Pipelines lineage: No
RDBMS: Yes

Apache Atlas

Apache Atlas provides open metadata management and governance capabilities for organizations. It offers an intuitive UI for viewing the lineage of data as it moves through various processes, and for pre-defined and ad-hoc exploration of data by type, classification, attribute value, or free text. It also maintains a history of how a data source or explicit data was constructed and how it has evolved over time. Lineage can also be accessed and updated via REST APIs.

BI Tools lineage: No
Commercial: Free
Data migration tools lineage: No
Data warehouses lineage: Yes
ETLs: Yes
Free edition: Yes
Hadoop: Yes
NoSQL: No
Pipelines lineage: No
RDBMS: No
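
As a rough illustration of the REST access mentioned in the Apache Atlas description, the sketch below builds a lineage request URL. It assumes the Atlas v2 REST layout (`/api/atlas/v2/lineage/{guid}` with `direction` and `depth` query parameters) and a server on `localhost:21000`; check both against the documentation for your Atlas version. The `lineage_url` helper and the GUID are hypothetical.

```python
from urllib.parse import quote, urlencode

# Assumption: directions accepted by the Atlas v2 lineage endpoint.
VALID_DIRECTIONS = {"INPUT", "OUTPUT", "BOTH"}


def lineage_url(base_url, guid, direction="BOTH", depth=3):
    """Build the URL for fetching the lineage of the entity with this GUID."""
    if direction not in VALID_DIRECTIONS:
        raise ValueError(f"direction must be one of {sorted(VALID_DIRECTIONS)}")
    query = urlencode({"direction": direction, "depth": depth})
    path = f"/api/atlas/v2/lineage/{quote(guid, safe='')}"
    return f"{base_url.rstrip('/')}{path}?{query}"


# Hypothetical server address and entity GUID, for illustration only.
URL = lineage_url("http://localhost:21000", "hypothetical-guid-1234",
                  direction="INPUT", depth=2)
```

The resulting URL can be passed to any HTTP client together with your Atlas credentials; the block stops short of sending the request so that it runs without a live server.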

openAudit

By combining data lineage, audit log analysis, and other techniques, openAudit instantly shows on a single screen the operational sources of each piece of data and its end uses: who is viewing the data, how, and when. Because it jointly analyzes different data processing technologies, openAudit makes it possible to understand end-to-end data flows that span multiple technologies and to zoom in on the underlying code. It also handles lineage breaks caused by views and dynamic procedures.

BI Tools lineage: Yes
Commercial: Commercial
Data migration tools lineage: Yes
Data warehouses lineage: No
ETLs: Yes
Free edition: No
Hadoop: No
NoSQL: No
Pipelines lineage: No
RDBMS: Yes

Informatica Metadata Management

Informatica Metadata Manager is a web-based metadata management tool. You can view data lineage for objects in the Metadata Manager warehouse. Data lineage shows the origin of the data, describes the path, and shows how it arrives at the target. Use data lineage to analyze data flow and troubleshoot data transformation errors.

BI Tools lineage: Yes
Commercial: Commercial
Data migration tools lineage: Yes
Data warehouses lineage: Yes
ETLs: Yes
Free edition: No
Hadoop: Yes
NoSQL: Yes
Pipelines lineage: No
RDBMS: Yes

IBM Watson Knowledge Catalog

IBM Watson Knowledge Catalog is an open and intelligent data catalog for managing enterprise data that also helps you keep data lineage well structured and maintained. It lets you track where data originated and how it's consumed, increasing trust when accessing data across many sources and destinations.

BI Tools lineage: Yes
Commercial: Commercial
Data migration tools lineage: No
Data warehouses lineage: Yes
ETLs: Yes
Free edition: No
Hadoop: Yes
NoSQL: Yes
Pipelines lineage: Yes
RDBMS: Yes

MetaCenter

MetaCenter automates data lineage analysis across Databases, ETL, Business Intelligence, Cloud, and Hadoop environments. It lets you reduce data management costs by automating data lineage and impact analysis documentation.

BI Tools lineage: Yes
Commercial: Commercial
Data migration tools lineage: No
Data warehouses lineage: Yes
ETLs: Yes
Free edition: No
Hadoop: Yes
NoSQL: Yes
Pipelines lineage: No
RDBMS: Yes