Data profiling tools for Presto DB

Data profiling tools let you analyze, monitor, and review data in existing databases to surface critical insights. Data profiling can help organizations improve data quality and decision-making by identifying potential problems early and addressing them before they escalate.

Talend Data Fabric

Talend Data Fabric combines data integration, integrity, and governance in a single, unified platform. Talend Data Fabric's capabilities allow you to extract, process, and profile data from virtually any source to your data warehouse. Data profiling lets you quickly identify data quality issues, discover hidden patterns, and spot anomalies through summary statistics and graphical representations.

Access control: No
Commercial: Commercial
Desktop/Cloud: Desktop
Excel workbooks: Yes
Flat files: Yes
Free edition: No
Metadata identification: Yes
NoSQL sources: Yes
Runs on: (for desktop): Mac OS, Windows
Sensitive data discovery: No
SQL sources: Yes
Statistics of data: -
Tagging data: Yes

Informatica Data Profiling

Informatica’s data profiling solution, Data Explorer, is available in two editions—Standard and Advanced—that employ powerful data profiling capabilities to scan every single data record, from any source, to find anomalies and hidden relationships. It works regardless of complexity or of the relationship between your data sources.

Access control: No
Commercial: Commercial
Desktop/Cloud: Cloud
Excel workbooks: Yes
Flat files: Yes
Free edition: No
Metadata identification: Yes
NoSQL sources: Yes
Runs on: (for desktop): -
Sensitive data discovery: No
SQL sources: Yes
Statistics of data: Avg, Max, Min, Stdev
Tagging data: Yes

Alation Data Catalog

Alation’s data profiling capabilities help reduce the time spent in the data exploration phase. With Alation’s data profile, data consumers have the metrics they need to easily discern the quality of any data object. Alation displays important characteristics, statistics, and numerical graphs about the data — enabling data scientists and data engineers to quickly take action. The data profiling now also includes new charts and customizations.

Access control: No
Commercial: Commercial
Desktop/Cloud: Cloud
Excel workbooks: No
Flat files: Yes
Free edition: No
Metadata identification: Yes
NoSQL sources: Yes
Runs on: (for desktop): -
Sensitive data discovery: Yes
SQL sources: Yes
Statistics of data: -
Tagging data: Yes

Atlan

Atlan automatically profiles your data to identify missing values, outliers, and other data anomalies. Data profiles are fully configurable: admins can schedule data profile updates and run profiles on random or stratified samples or with custom filters. Atlan's data profile is an open ecosystem, allowing teams to import data quality metrics for key measures, such as timeliness, from external systems like data pipeline tools or from other internal tools or frameworks.

Access control: Yes
Commercial: Commercial
Desktop/Cloud: Cloud
Excel workbooks: Yes
Flat files: No
Free edition: No
Metadata identification: Yes
NoSQL sources: No
Runs on: (for desktop): -
Sensitive data discovery: Yes
SQL sources: Yes
Statistics of data: Avg, Stdev
Tagging data: Yes

Using data profiling tools can lead to higher-quality, more reliable data and eliminate errors that add costs to data-driven projects. Eliminating these costly errors involves processes such as:

• Collecting descriptive statistics.
• Collecting data types, length and recurring patterns.
• Tagging data with keywords, descriptions or categories.
• Performing data quality assessment.
• Discovering metadata and assessing its accuracy.
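
To make the first two processes concrete, here is a minimal Python sketch of a column profile computed over values already loaded into memory. It is not how any of the tools listed here work internally; the function names (`value_pattern`, `profile_column`) and the sample data are hypothetical and chosen only for illustration.

```python
import re
import statistics
from collections import Counter


def value_pattern(text):
    """Map digits to 9 and letters to A so that similar formats group together."""
    pattern = re.sub(r"[0-9]", "9", text)
    return re.sub(r"[A-Za-z]", "A", pattern)


def profile_column(values):
    """Collect basic descriptive statistics, types, lengths and patterns."""
    non_null = [v for v in values if v is not None]
    profile = {
        "count": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "types": sorted({type(v).__name__ for v in non_null}),
    }

    # Numeric statistics (bool is excluded because it is a subclass of int).
    numbers = [
        v for v in non_null
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    if numbers:
        profile["min"] = min(numbers)
        profile["max"] = max(numbers)
        profile["avg"] = statistics.mean(numbers)
        if len(numbers) > 1:  # sample standard deviation needs two values
            profile["stdev"] = statistics.stdev(numbers)

    # Length range and the most frequent formats for text values.
    strings = [v for v in non_null if isinstance(v, str)]
    if strings:
        lengths = [len(s) for s in strings]
        profile["min_length"] = min(lengths)
        profile["max_length"] = max(lengths)
        profile["patterns"] = Counter(
            value_pattern(s) for s in strings
        ).most_common(3)

    return profile


# Hypothetical sample data: one missing value and one malformed ZIP code.
zip_profile = profile_column(["90210", "10001", None, "1000A", "10001"])
amounts_profile = profile_column([10, 20, 30, None])
```

In the ZIP code sample, the pattern counts separate the dominant all-digit format from the single value containing a letter, which is the kind of recurring-pattern check a profiling tool would flag for review.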

The most efficient way of handling the data profiling process is to automate it with a data management solution. We prepared a list of data profiling tools that help you analyze your data and identify issues.
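
As a rough illustration of what automating a profile against Presto might look like, the Python sketch below builds a single SQL statement that gathers the statistics named in the tool attributes (Avg, Max, Min, Stdev) plus row and distinct counts. The helper `build_profile_query` and the table and column names are hypothetical, and the table name is assumed to come from a trusted source. The aggregate functions used (`count`, `approx_distinct`, `min`, `max`, `avg`, `stddev`) are standard Presto functions, but the sketch only generates the query text; running it through a Presto client and scheduling it are left out.

```python
def build_profile_query(table, column):
    """Return a Presto SQL statement that profiles one numeric column.

    `table` is inserted as-is, so it must be a trusted, fully qualified name.
    """
    # Presto quotes identifiers with double quotes; embedded quotes are doubled.
    quoted = '"' + column.replace('"', '""') + '"'
    return (
        "SELECT count(*) AS row_count, "
        f"count({quoted}) AS non_null_count, "
        f"approx_distinct({quoted}) AS approx_distinct_count, "
        f"min({quoted}) AS min_value, "
        f"max({quoted}) AS max_value, "
        f"avg({quoted}) AS avg_value, "
        f"stddev({quoted}) AS stdev_value "
        f"FROM {table}"
    )


# Hypothetical catalog, schema, table and column names.
query = build_profile_query("hive.sales.orders", "amount")
print(query)
```

Computing all metrics in one statement means the table is scanned once, which matters when profiling large tables through a query engine.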