Data dictionary tools for Apache Hive
A list of tools for designing and building data dictionaries.
A data dictionary is a set of key information about the data used within an organization (metadata). This information includes the names, definitions, and attributes of data elements, as well as the owners and creators of data assets. Data dictionary tools provide insight into the meaning and purpose of data elements. They add useful details about the scope and characteristics of data elements, as well as the rules for their usage and application.
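As a rough illustration of what a data dictionary record holds, the minimal sketch below models one column entry with a name, data type, definition, and owner. It assumes input rows shaped like the (col_name, data_type, comment) output of Hive's `DESCRIBE` statement. The names `ColumnEntry`, `build_entries`, and `undocumented`, as well as the sample table and rows, are hypothetical and hardcoded so that the example runs without a Hive connection.

```python
from dataclasses import dataclass


@dataclass
class ColumnEntry:
    """One data dictionary record for a single column (hypothetical structure)."""
    table: str
    name: str
    data_type: str
    definition: str = ""  # business description of the column
    owner: str = ""       # person or team responsible for the data


def build_entries(table, describe_rows, owner=""):
    """Turn (col_name, data_type, comment) rows into dictionary entries."""
    entries = []
    for col_name, data_type, comment in describe_rows:
        entries.append(
            ColumnEntry(
                table=table,
                name=col_name.strip(),
                data_type=data_type.strip(),
                definition=(comment or "").strip(),  # a missing comment becomes ""
                owner=owner,
            )
        )
    return entries


def undocumented(entries):
    """Return the names of columns that still lack a definition."""
    return [e.name for e in entries if not e.definition]


# Hypothetical sample rows standing in for real DESCRIBE output.
sample_rows = [
    ("customer_id", "bigint", "Unique customer identifier"),
    ("signup_ts", "timestamp", None),
    ("country", "string", "ISO 3166-1 alpha-2 country code"),
]

entries = build_entries("sales.customers", sample_rows, owner="data-platform")
print(undocumented(entries))  # prints ['signup_ts']
```

The tools listed below automate this kind of collection and add editing, export, and diagramming on top. The sketch only shows the shape of the underlying record.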
Dataedo
Dataedo allows you to connect to multiple sources, scan their metadata, and build a data dictionary automatically in a couple of minutes.
Desktop/Cloud: | Desktop |
---|---|
ER Diagram: | |
Export: | HTML, MS Excel, PDF |
Metadata stored in: | Documentation repository/file |
Commercial: | Commercial |
Free edition: | |
Notable features: | ER diagrams, metadata repository, schema change tracking, organizing with modules, documenting missing FKs, custom fields, description suggestions, documentation progress tracking, rich text with images |
Runs on: (for desktop): | Mac OS, Windows |
SolarWinds Database Mapper
SolarWinds Database Mapper (formerly SentryOne Document) offers complete documentation for SQL Server databases and BI tools, including SSIS, SSRS, SSAS, Oracle, Hive, Tableau, Informatica, and Excel. In addition, it allows object descriptions to be edited, and documentation can be customized for different audiences so that users see only the information most relevant to their role.
Desktop/Cloud: | Desktop |
---|---|
ER Diagram: | |
Export: | CHM, HTML, MS Word, RTF |
Metadata stored in: | Documentation repository/file |
Commercial: | Commercial |
Free edition: | |
Notable features: | Lineage Analysis and Impact Analysis, include images |
Runs on: (for desktop): | Windows |
DbSchema
DbSchema helps you design, document, and manage SQL and NoSQL databases. It is an intuitive designer for complex databases that lets you edit tables or columns directly in the layout by double-clicking them.
Desktop/Cloud: | Desktop |
---|---|
ER Diagram: | |
Export: | HTML, PDF |
Metadata stored in: | Database metadata |
Commercial: | Commercial |
Free edition: | |
Notable features: | Entity relationship diagram, Reverse engineer schema from database, Relational data browse, SQL editor |
Runs on: (for desktop): | Linux, Mac OS, Windows |
ER/Studio
ER/Studio Data Architect helps to easily reverse- and forward-engineer, compare and merge, and visually document data assets across multiple platforms and data sources. The Data Dictionary is a feature of ER/Studio that allows the sharing of many objects including domains, defaults, rules, and attachments. Using the Data Dictionary you can enforce standards, promote reuse, and build a common framework across all models.
Desktop/Cloud: | Desktop |
---|---|
ER Diagram: | |
Export: | HTML |
Metadata stored in: | Documentation repository/file |
Commercial: | Commercial |
Free edition: | |
Notable features: | Data modeler, Entity relation diagram, Model sharing |
Runs on: (for desktop): | Windows |
Alation Data Catalog
The Alation data dictionary defines and describes technical data terms, such as database schemas, tables, or columns. Once connected to data sources, Alation automatically indexes data and populates catalog pages. For example, a column catalog page shows the technical column name, a business title, the data type, and popularity. Additional context can be added to the data dictionary to build a shared understanding across the organization.
Desktop/Cloud: | Cloud |
---|---|
ER Diagram: | |
Export: | MS Excel |
Metadata stored in: | - |
Commercial: | Commercial |
Free edition: | |
Notable features: | ML auto-suggested business glossary terms |
Runs on: (for desktop): | - |
Ataccama Metadata Management & Data Catalog
The Ataccama Data Catalog & Business Glossary tool automatically maps business terms to real data sources in the Data Catalog during profiling, keeping the Data Catalog up to date and in sync with the Business Glossary.
Desktop/Cloud: | Desktop |
---|---|
ER Diagram: | |
Export: | CSV, MS Excel, XML |
Metadata stored in: | Program repository |
Commercial: | Commercial |
Free edition: | |
Notable features: | Automated Mapping of Business Terms, Up-to-date Business Glossary, Data Discovery on Multiple Sources |
Runs on: (for desktop): | Windows |
Atlan
Atlan's data dictionary allows you to document databases, data warehouses, data lakes, and BI tools in one interface. It uses automation to generate pre-configured critical data quality metrics and to help propagate column descriptions through your data ecosystem. Using Atlan's interface, you can update table and column descriptions, assign owners and stewards, and attach a readme to every object. Atlan also allows you to capture relationships such as primary key and foreign key relationships, lineage, and more.
Desktop/Cloud: | Cloud |
---|---|
ER Diagram: | |
Export: | - |
Metadata stored in: | Graph Database |
Commercial: | Commercial |
Free edition: | |
Notable features: | Automated data dictionary, column level search, visual frequency, versioned data dictionary |
Runs on: (for desktop): | - |
The key function of data dictionary tools is to let users document data. Equally important is the ability to gather multiple repositories, built on different database engines, into a single collection. To make the data easier to understand, some tools can visualize the data structure using ERDs (Entity-Relationship Diagrams).
From the organization's point of view, a community module within a data dictionary tool is also useful. It supports a proper flow of information and lets members of the organization share opinions on specific objects.
Nowadays, data discovery and understanding have become crucial to an organization's performance. Data dictionaries offer many benefits. They:
• help avoid data inconsistencies,
• allow a unified nomenclature to be introduced across a project,
• make data searchable and understandable,
• create a single source of truth about data from different repositories.
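Two of the benefits above, searchability and a single source of truth across repositories, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how any of the listed tools work internally: the functions `merge_sources` and `search`, the repository names, and the sample descriptions are all hypothetical.

```python
def merge_sources(*sources):
    """Combine per-repository metadata into one dictionary.

    Each source is a (source_name, columns) pair, where columns maps
    (table, column) to a description. Keys in the result are
    (source_name, table, column), so identical names never collide.
    """
    combined = {}
    for source_name, columns in sources:
        for (table, column), description in columns.items():
            combined[(source_name, table, column)] = description
    return combined


def search(dictionary, term):
    """Case-insensitive search over column names and descriptions."""
    term = term.lower()
    return sorted(
        key
        for key, description in dictionary.items()
        if term in key[2].lower() or term in description.lower()
    )


# Hypothetical metadata from two repositories on different engines.
hive = ("hive", {
    ("sales.orders", "cust_id"): "Customer identifier",
    ("sales.orders", "amount"): "Order total in USD",
})
crm = ("crm_db", {
    ("accounts", "customer_id"): "Customer identifier",
    ("accounts", "region"): "Sales region",
})

catalog = merge_sources(hive, crm)
matches = search(catalog, "customer")
for source, table, column in matches:
    print(f"{source}: {table}.{column}")
```

Searching descriptions as well as names is what lets `cust_id` in one repository and `customer_id` in another surface under the same query, which is the practical value of a unified nomenclature.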
The prepared list includes simple, open-source data dictionaries as well as more advanced software.