Data dictionary tools for Apache Hive

List of tools that enable the design and building of data dictionaries.

A data dictionary is a set of important information about the data used within an organization (metadata). This information includes the names, definitions, and attributes of data elements, as well as the owners and creators of assets. Data dictionary tools provide insight into the meaning and purpose of data elements. They add useful details about the scope and characteristics of data elements, as well as the rules for their usage and application.
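To make the description above concrete, here is a minimal sketch of what a data dictionary entry could look like when stored as plain records and exported to CSV. The field layout (table, column, data_type, definition, owner), the ColumnEntry class, and the sample rows are assumptions made for illustration; they are not the format of any tool listed here.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class ColumnEntry:
    """One data dictionary row describing a single column (hypothetical layout)."""
    table: str
    column: str
    data_type: str
    definition: str
    owner: str


def to_csv(entries):
    """Serialize dictionary entries to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=[f.name for f in fields(ColumnEntry)],
        lineterminator="\n",
    )
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))
    return buffer.getvalue()


# Sample entries; table and owner names are invented for the example.
entries = [
    ColumnEntry("sales.orders", "order_id", "bigint", "Unique identifier of an order", "sales-team"),
    ColumnEntry("sales.orders", "order_ts", "timestamp", "Time the order was placed", "sales-team"),
]
dictionary_csv = to_csv(entries)
print(dictionary_csv)
```

In practice, the tools below fill in the technical fields (table, column, data type) by scanning the source, so that people only have to add definitions and ownership.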

Dataedo

Dataedo allows you to connect to and scan metadata from multiple sources and build a data dictionary automatically in a couple of minutes.

Desktop/Cloud: Desktop
ER Diagram: Yes
Export: HTML, MS Excel, PDF
Metadata stored in: Documentation repository/file
Commercial: Commercial
Free edition: No
Notable features: ER diagrams, metadata repository, schema change tracking, organizing with modules, documenting missing FKs, custom fields, description suggestions, documentation progress tracking, rich text with images
Runs on: (for desktop): Mac OS, Windows

SolarWinds Database Mapper

SolarWinds Database Mapper (formerly SentryOne Document) offers complete documentation for SQL Server databases and BI tools, including SSIS, SSRS, SSAS, Oracle, Hive, Tableau, Informatica, and Excel. In addition, it allows editing of object descriptions, and documentation can be customized for different audiences so that users see only the information most relevant to their role.

Desktop/Cloud: Desktop
ER Diagram: Yes
Export: CHM, HTML, MS Word, RTF
Metadata stored in: Documentation repository/file
Commercial: Commercial
Free edition: No
Notable features: Lineage Analysis and Impact Analysis, images in documentation
Runs on: (for desktop): Windows

DbSchema

DbSchema helps you design, document, and manage SQL and NoSQL databases. It is an intuitive designer for complex databases that lets you edit tables or columns directly in the layout by double-clicking them.

Desktop/Cloud: Desktop
ER Diagram: Yes
Export: HTML, PDF
Metadata stored in: Database metadata
Commercial: Commercial
Free edition: No
Notable features: Entity relationship diagram, Reverse engineer schema from database, Relational data browse, SQL editor
Runs on: (for desktop): Linux, Mac OS, Windows

ER/Studio

ER/Studio Data Architect helps to easily reverse- and forward-engineer, compare and merge, and visually document data assets across multiple platforms and data sources. The Data Dictionary is a feature of ER/Studio that allows the sharing of many objects including domains, defaults, rules, and attachments. Using the Data Dictionary you can enforce standards, promote reuse, and build a common framework across all models.

Desktop/Cloud: Desktop
ER Diagram: Yes
Export: HTML
Metadata stored in: Documentation repository/file
Commercial: Commercial
Free edition: No
Notable features: Data modeler, Entity relationship diagram, Model sharing
Runs on: (for desktop): Windows

Alation Data Catalog

The Alation data dictionary defines and describes technical data terms. Data terms can be database schemas, tables, or columns. Once connected to data sources, Alation automatically indexes data and populates catalog pages. For example, a column catalog page shows the technical column name, a business title, the data type, and popularity. Additional context can be added to the data dictionary for shared understanding across the organization.

Desktop/Cloud: Cloud
ER Diagram: Yes
Export: MS Excel
Metadata stored in: -
Commercial: Commercial
Free edition: No
Notable features: ML auto-suggested business glossary terms
Runs on: (for desktop): -

Ataccama Metadata Management & Data Catalog

The Ataccama Data Catalog & Business Glossary tool automatically maps terms to real data sources in the Data Catalog during profiling, ensuring the Data Catalog is always up to date and synced with the Business Glossary.

Desktop/Cloud: Desktop
ER Diagram: No
Export: CSV, MS Excel, XML
Metadata stored in: Program repository
Commercial: Commercial
Free edition: Yes
Notable features: Automated Mapping of Business Terms, Up-to-date Business Glossary, Data Discovery on Multiple Sources
Runs on: (for desktop): Windows

Atlan

Atlan's data dictionary allows you to document databases, data warehouses, data lakes, and BI tools in one easy interface. It uses automation to generate pre-configured critical data quality metrics and to help propagate column descriptions through your data ecosystem. Using Atlan's interface, you can easily update table and column descriptions, assign owners and stewards, and attach a powerful readme to every object. Atlan also allows you to capture relationships such as primary key and foreign key relationships, lineage, and more.

Desktop/Cloud: Cloud
ER Diagram: No
Export: -
Metadata stored in: Graph Database
Commercial: Commercial
Free edition: No
Notable features: Automated data dictionary, column level search, visual frequency, versioned data dictionary
Runs on: (for desktop): -

A key function of data dictionary tools is to give users the ability to document data. Equally important is the ability to create a collection of multiple repositories based on different system engines. For a better understanding of the data, some tools allow visualization of the data structure using ERDs (entity-relationship diagrams).
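The idea of collecting several repositories into one documented collection can be sketched as follows. This is only an illustration under stated assumptions: the source names ("hive", "postgres"), the record layout, and the helper functions merge_repositories and search are hypothetical and do not reflect how any listed tool is implemented.

```python
def merge_repositories(repositories):
    """Combine per-source column metadata into one catalog.

    Entries are keyed by (source, table, column), so columns with the same
    name in different engines stay distinct.
    """
    merged = {}
    for source, columns in repositories.items():
        for col in columns:
            merged[(source, col["table"], col["column"])] = col
    return merged


def search(catalog, term):
    """Return the sorted keys whose column name or description contains the term."""
    term = term.lower()
    return sorted(
        key
        for key, col in catalog.items()
        if term in col["column"].lower() or term in col.get("description", "").lower()
    )


# Invented sample metadata from two different engines.
repositories = {
    "hive": [
        {"table": "orders", "column": "customer_id", "data_type": "bigint",
         "description": "Buyer reference"},
        {"table": "orders", "column": "amount", "data_type": "decimal(10,2)",
         "description": "Order total"},
    ],
    "postgres": [
        {"table": "customers", "column": "id", "data_type": "integer",
         "description": "Customer primary key"},
    ],
}

catalog = merge_repositories(repositories)
matches = search(catalog, "customer")
print(matches)
```

Keying entries by source as well as by table and column is one simple way to keep a single collection while still recording which engine each element comes from.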

From the organization's point of view, a community module within a data dictionary tool proves useful. It facilitates proper information flow and lets members of the organization share opinions on specific objects.

Nowadays, data discovery and understanding are crucial for proper organizational performance. There are many benefits to using data dictionaries, such as:
• avoiding data inconsistency problems,
• introducing a unified nomenclature for the project,
• making data searchable and understandable,
• creating a single source of truth about data from different repositories.

The prepared list includes simple, open-source data dictionaries as well as more advanced software.