SQL Lineage Tools

Microsoft Purview (formerly Azure Purview)

Microsoft Purview is a unified data governance solution; one of its capabilities is showing lineage between datasets created by data processes. It supports automated asset-level lineage for datasets and processes, and manual lineage lets you document lineage metadata, without writing any code, for sources where automation isn't yet supported.
Metadata collected in Microsoft Purview from enterprise data systems is stitched together to show end-to-end data lineage.

Automatic discovery: Yes
Data flow visualization: Yes
Environment: Online
Free edition: No
Metadata management: Yes
Version control integration: Yes

Informatica Data Lineage

The Informatica Data Lineage tool provides automated end-to-end data lineage with detailed and summary views of data movement across data pipelines. With Informatica, you can derive lineage from code in SQL scripts, stored procedures, and AI/ML code. It tracks data flow from the system level down to the column level for detailed impact analysis.
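
To make "column-level impact analysis" concrete, here is a minimal sketch of the idea: given lineage edges between columns, find everything downstream of a column you plan to change. This is not Informatica's API; the edge list, the column names, and the `downstream` helper are all hypothetical and exist only for illustration.

```python
from collections import deque

# Hypothetical column-level lineage edges: (upstream column, downstream column).
EDGES = [
    ("crm.customers.email", "staging.customers.email"),
    ("staging.customers.email", "mart.dim_customer.email"),
    ("staging.customers.email", "mart.email_campaign.recipient"),
    ("erp.orders.total", "mart.fact_sales.amount"),
]


def downstream(edges, start):
    """Return every column reachable from `start` by following lineage edges."""
    graph = {}
    for up, down in edges:
        graph.setdefault(up, []).append(down)

    seen = set()
    queue = deque([start])
    while queue:  # breadth-first walk over the lineage graph
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Columns that could be affected by a change to crm.customers.email.
print(sorted(downstream(EDGES, "crm.customers.email")))
```

A commercial tool builds the edge list automatically from scanned code and metadata; the traversal itself stays this simple.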

Automatic discovery: Yes
Data flow visualization: Yes
Environment: Online
Free edition: No
Metadata management: Yes
Version control integration: Yes

DoltHub

DoltHub is a modern, secure, always-on web GUI for managing databases in the Dolt ecosystem. Dolt is a SQL database that you can fork, clone, branch, merge, push, and pull, just like a Git repository, and it lets you explore the full history of your data directly from SQL. Dolt can also run as a versioned replica of an existing MySQL server, configured just like any other MySQL replica. This gives you most of the features of a version-controlled database without migrating from MySQL.
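
As a rough illustration of exploring history directly from SQL, the sketch below builds three queries against Dolt's history features: the `dolt_log` system table, a per-table `dolt_diff_<table>` system table, and an `AS OF` snapshot query. The helper `history_queries` and the `inventory` table are hypothetical, and the snippet only produces query strings; running them requires a Dolt SQL server and a MySQL-compatible client, which are not shown here.

```python
from __future__ import annotations

import re

# Conservative checks so table names and refs can be embedded in SQL text safely.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
_REF = re.compile(r"^[A-Za-z0-9_./~^-]+$")


def history_queries(table: str, ref: str = "main") -> dict[str, str]:
    """Build SQL strings for inspecting the history of `table` in Dolt."""
    if not _IDENT.match(table):
        raise ValueError(f"unsupported table name: {table!r}")
    if not _REF.match(ref):
        raise ValueError(f"unsupported ref: {ref!r}")
    return {
        # Commit history of the database.
        "commits": "SELECT commit_hash, committer, date, message "
                   "FROM dolt_log ORDER BY date DESC",
        # Row-level changes to the table across commits.
        "row_changes": f"SELECT * FROM dolt_diff_{table}",
        # The table as it existed at a branch, tag, or commit.
        "snapshot": f"SELECT * FROM {table} AS OF '{ref}'",
    }


for name, query in history_queries("inventory").items():
    print(f"-- {name}\n{query};")
```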

Automatic discovery: No
Data flow visualization: No
Environment: Online
Free edition: No
Metadata management: No
Version control integration: Yes

Fivetran

Fivetran is an automated data movement platform that moves data out of, into, and across your cloud data platforms. It lets you monitor data movement, logs, and status from connector extract to successful warehouse load and on through modeling, all in one data lineage graph (DLG). DLGs show the dependencies between your dbt models so that you can track the flow of data from your connectors to your destination.

Automatic discovery: No
Data flow visualization: Yes
Environment: Online
Free edition: No
Metadata management: No
Version control integration: No

Blindata

Blindata SQL Lineage helps you effortlessly track and manage data movements within your database. The SQL Lineage module uses schema metadata and extracted SQL statements to infer data flows and transformations, including standard database objects such as views and routines, query logs, and scripts generated by ELT tools.

Automatic discovery: Yes
Data flow visualization: Yes
Environment: Online
Free edition: No
Metadata management: Yes
Version control integration: No

SQLLineage

SQLLineage is a SQL lineage analysis tool powered by Python. Given a SQL command, SQLLineage tells you its source and target tables, so you don't have to worry about tokens, keywords, identifiers, and the other jargon used by SQL parsers.
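
To show what "source and target tables" means in practice, here is a deliberately naive sketch using only regular expressions from the Python standard library. It is not SQLLineage's implementation or API (SQLLineage relies on a real SQL parser); the `table_lineage` helper and the sample table names are hypothetical, and the patterns only cope with simple, unquoted table names.

```python
from __future__ import annotations

import re

# Toy patterns, not a SQL parser: they miss CTEs, quoted names, and comments.
_TARGET = re.compile(r"\b(?:insert\s+into|create\s+table)\s+([\w.]+)", re.IGNORECASE)
_SOURCE = re.compile(r"\b(?:from|join)\s+([\w.]+)", re.IGNORECASE)


def table_lineage(sql: str) -> tuple[set[str], set[str]]:
    """Return (source_tables, target_tables) for one simple SQL statement."""
    targets = set(_TARGET.findall(sql))
    # A table that is written to is reported as a target, not a source.
    sources = set(_SOURCE.findall(sql)) - targets
    return sources, targets


sql = (
    "INSERT INTO db1.orders_clean "
    "SELECT o.* FROM db2.orders o JOIN db2.customers c ON o.cid = c.id"
)
sources, targets = table_lineage(sql)
print("sources:", sorted(sources))
print("targets:", sorted(targets))
```

The gap between this sketch and real-world SQL (nested queries, CTEs, dialect differences) is exactly what a parser-based tool is meant to cover.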

Automatic discovery: No
Data flow visualization: No
Environment: On-premises
Free edition: Yes
Metadata management: No
Version control integration: No

OpenLineage

OpenLineage is an open platform for the collection and analysis of data lineage. It tracks metadata about datasets, jobs, and runs. Pipeline components (such as schedulers, warehouses, analysis tools, and SQL engines) can use a standard API to capture lineage events and send data about runs, jobs, and datasets to a compatible OpenLineage backend for further analysis.
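
The run, job, and dataset model can be sketched as a plain JSON payload. The example below assembles a run event by hand with the standard library instead of an OpenLineage client; the field names reflect my understanding of the run event structure and should be checked against the spec version you target. The `make_run_event` helper, the namespaces, the job name, and the `example.com` URLs are placeholders, not real endpoints.

```python
import json
import uuid
from datetime import datetime, timezone


def make_run_event(event_type, run_id, job, inputs, outputs, producer, schema_url):
    """Assemble a lineage event dict; `job` and datasets are (namespace, name) pairs."""
    def ref(pair):
        namespace, name = pair
        return {"namespace": namespace, "name": name}

    return {
        "eventType": event_type,                     # e.g. "START" or "COMPLETE"
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": run_id},                    # same id for all events of one run
        "job": ref(job),
        "inputs": [ref(d) for d in inputs],
        "outputs": [ref(d) for d in outputs],
        "producer": producer,                        # placeholder URI identifying the sender
        "schemaURL": schema_url,                     # placeholder; use the spec version you target
    }


event = make_run_event(
    event_type="COMPLETE",
    run_id=str(uuid.uuid4()),
    job=("example-scheduler", "daily_orders_load"),
    inputs=[("example-warehouse", "raw.orders")],
    outputs=[("example-warehouse", "mart.fact_orders")],
    producer="https://example.com/my-pipeline",
    schema_url="https://example.com/placeholder-schema",
)
print(json.dumps(event, indent=2))
```

A backend that receives a series of such events can connect `raw.orders` to `mart.fact_orders` through the `daily_orders_load` job, which is how table-to-table lineage emerges from individual runs.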

Automatic discovery: Yes
Data flow visualization: No
Environment: On-premises
Free edition: Yes
Metadata management: Yes
Version control integration: No