Data intelligence tools for Apache Hive

Data intelligence tools are the artificial intelligence and machine learning tools that companies use to analyze data and transform it into information that is valuable and relevant for improving their operations.

Dataedo

Dataedo is an on-premises data catalog & metadata management tool. It allows you to catalog, document, and understand your data with a data dictionary, business glossary, and ERDs. It reads your schema and lets you easily describe each data element with descriptions, business-friendly aliases, and custom fields. It features a data community module, which allows you to crowdsource knowledge about data from everyone in your organization.

Data Catalog: Yes
Data Lineage: Yes
Data Profiling: Yes

Collibra Catalog

Collibra Catalog empowers business users to quickly discover, understand, contribute, and govern the data that matters so they can generate impactful insights that drive business value. It also allows data stewards to certify datasets so that business users can trust the data that they use in their analysis.

Data Catalog: Yes
Data Lineage: Yes
Data Profiling: Yes

Alation Data Catalog

Alation pioneered the data catalog market and is now leading its evolution into a platform for a broad range of data intelligence solutions including data search & discovery, data governance, stewardship, analytics, and digital transformation. Thanks to its powerful Behavioral Analysis Engine, inbuilt collaboration capabilities, and open interfaces, Alation combines machine learning with human insight to successfully tackle even the most demanding challenges in data and metadata management.

Data Catalog: Yes
Data Lineage: Yes
Data Profiling: Yes

Informatica Enterprise Data Catalog

Informatica Enterprise Data Catalog is a machine learning-based data catalog that lets you classify and organize data assets across any environment to maximize data value and reuse, and provides a metadata system of record for the enterprise. It automatically scans and catalogs data across the enterprise, indexing it for enterprise-wide discovery using simple, Google-like search.

Data Catalog: Yes
Data Lineage: Yes
Data Profiling: Yes

IBM Watson Knowledge Catalog

IBM Watson® Knowledge Catalog is an open and intelligent data catalog for managing enterprise data and AI model governance, quality, and collaboration. It enables you to organize, define, and manage enterprise data to provide the right context to drive value across imperatives ranging from regulatory compliance to data monetization.

Data Catalog: Yes
Data Lineage: Yes
Data Profiling: Yes

Talend Data Catalog

Talend Data Catalog automatically crawls, profiles, organizes, links, and enriches all your metadata. It makes it easy to search and access data, then verify its validity before sharing it with peers. Up to 80% of the information associated with the data is documented automatically and kept up to date through smart relationships and machine learning, continually delivering the most meaningful data to the user.

Data Catalog: Yes
Data Lineage: Yes
Data Profiling: Yes

Ataccama Metadata Management & Data Catalog

Ataccama ONE Data Catalog is an AI-powered metadata management module. It is a central store for all of your metadata, whether imported from other sources, crowdsourced, or automatically captured in continuous data discovery processes.

Data Catalog: Yes
Data Lineage: Yes
Data Profiling: Yes

Apache Atlas

Apache Atlas provides open metadata management and governance capabilities for organizations to build a catalog of their data assets, classify and govern these assets, and provide collaboration capabilities around these data assets for data scientists, analysts, and the data governance team.

Data Catalog: Yes
Data Lineage: Yes
Data Profiling: No

Alteryx Connect

Alteryx Connect is a social data cataloging and data exploration platform for the enterprise. The powerful data cataloging provided by Alteryx Connect centralizes business terms and definitions, metrics, and information assets for maximum consistency, discoverability, and collaboration.

Data Catalog: Yes
Data Lineage: Yes
Data Profiling: Yes

Truedat

Truedat is an open source data cataloging and governance tool that allows you to quickly unify and explore combined metadata from different sources in the same interface. It enables you to organize and enrich information through configurable workflows and to monitor data governance activity.

Data Catalog: Yes
Data Lineage: Yes
Data Profiling: Yes

Io-Tahoe

Io-Tahoe is an enterprise smart data discovery and AI-driven data catalog product that enables enterprises to accelerate to next-generation data management practices, radically improving data governance and regulatory compliance. Population of the data catalog is automated by using artificial intelligence and leveraging the discovery functionality and natural language analysis to automatically tag data.

Data Catalog: Yes
Data Lineage: Yes
Data Profiling: Yes

Global IDs

The Global IDs Data Catalog automates the linking of logical business data models to physical data assets, keeps the metadata up to date, and scales with the size of your enterprise, from small to very large.

Data Catalog: Yes
Data Lineage: Yes
Data Profiling: Yes

Atlan

Atlan is a modern, cloud-native data catalog. Its ease of use and intuitive interface enable diverse personas, including engineers, data stewards, and business users, to discover, understand, and trust data. Atlan leverages machine learning and a bots ecosystem to automate documentation and stewardship tasks such as automatic data profiling, data quality alerts, and glossary tagging. It is built on an Open API architecture and has a pay-as-you-go pricing model, making it a good fit for teams of all sizes.

Data Catalog: Yes
Data Lineage: Yes
Data Profiling: Yes

Data intelligence tools improve the understanding of data, which leads to better decision making. They can bring other benefits to a company as well, including:

• Providing correct context for datasets.
• Improving data quality.
• Increasing data accessibility.
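
As a rough illustration of what the "Data Profiling" capability listed for each tool generally refers to, the sketch below computes a few basic per-column statistics (row count, null count, distinct values, most common value). It is not how any of the tools above implement profiling; the function names `profile_column` and `profile_table` and the sample rows are hypothetical, and the rows merely stand in for data that would come from a Hive table.

```python
from collections import Counter


def profile_column(values):
    """Return basic profiling statistics for one column of values."""
    # Treat None as a missing (null) value.
    non_null = [v for v in values if v is not None]
    counts = Counter(non_null)
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(counts),
        "most_common": counts.most_common(1)[0][0] if counts else None,
    }


def profile_table(rows):
    """Profile every column of a table given as a list of dicts."""
    # Collect every column name that appears in any row.
    columns = sorted({key for row in rows for key in row})
    return {
        col: profile_column([row.get(col) for row in rows])
        for col in columns
    }


# Hypothetical sample rows standing in for a Hive table.
sample_rows = [
    {"customer_id": 1, "country": "PL"},
    {"customer_id": 2, "country": None},
    {"customer_id": 3, "country": "PL"},
]

report = profile_table(sample_rows)
for column, stats in report.items():
    print(column, stats)
```

Statistics like these are what surface data quality problems, such as a column that is unexpectedly full of nulls, before the data is used in analysis.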

In an evolving world of technology, data intelligence is crucial for a company's growth. It can help the company understand consumer preferences as well as the effectiveness of its investments. Through data analysis, the company can see which areas can be optimized and identify approaches that might be more beneficial in the future.

To help you find the right tool for your organization, we have put together this list of the best data intelligence tools.