Data profiling tools

Dataedo

Dataedo is a metadata management & data catalog tool with a data profiling feature. It allows you to use sample data to learn what data is stored in your data assets. You can browse min, max, average and median values, see top values, as well as value and row distribution to understand the data better before using it.

Access control: Yes
Commercial: Commercial
Desktop/Cloud: Desktop
Excel workbooks: Yes
Flat files: Yes
Free edition: No
Metadata identification: Yes
NoSQL sources: Yes
Runs on: (for desktop): Windows
Sensitive data discovery: Yes
SQL sources: Yes
Statistics of data: Avg,Max,Min,Stdev
Tagging data: -

Atlan

Atlan automatically profiles your data to identify missing values, outliers, and other data anomalies. Data profiles are fully configurable: admins can schedule data profile updates and run profiles on random or stratified samples or on custom filters. Atlan's data profiles are also open, so teams can import data quality metrics, such as timeliness, from external systems like data pipeline tools or from other internal tools and frameworks.

Access control: Yes
Commercial: Commercial
Desktop/Cloud: Cloud
Excel workbooks: Yes
Flat files: No
Free edition: No
Metadata identification: Yes
NoSQL sources: No
Runs on: (for desktop): -
Sensitive data discovery: Yes
SQL sources: Yes
Statistics of data: Avg,Stdev
Tagging data: Yes
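
To make "missing values and outliers" concrete, here is a minimal sketch of that kind of check in plain Python. It is not Atlan's implementation: the function name `find_anomalies`, the use of `None` for a missing value, and the interquartile-range (IQR) rule with a multiplier of 1.5 are all assumptions chosen for illustration.

```python
import statistics

def find_anomalies(values, k=1.5):
    """Count missing values and flag outliers with the IQR rule (an assumed technique)."""
    # Assumption: a missing value is represented as None.
    present = [v for v in values if v is not None]
    # quantiles(n=4) returns the three quartile cut points: Q1, median, Q3.
    q1, _, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return {
        "missing": sum(v is None for v in values),
        "outliers": [v for v in present if v < low or v > high],
    }

# Hypothetical sample column.
order_totals = [20, 22, None, 19, 21, 23, 250, 18, None]
report = find_anomalies(order_totals)
print(report)
```

On this sample the two `None` entries are counted as missing and 250 is the only value outside the IQR fences. A real profiler would typically run such checks on a sample of each column, as the description above mentions.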

Datamartist

Datamartist is an easy-to-use data profiling tool for analyzing format, types, completeness, and value counts. It helps you understand data quality issues clearly and quickly. It also automates data profiling tasks, which can be used to create a periodic snapshot of key data quality metrics, letting an organization track and report on data quality.

Access control: No
Commercial: Commercial
Desktop/Cloud: Desktop
Excel workbooks: Yes
Flat files: Yes
Free edition: No
Metadata identification: No
NoSQL sources: No
Runs on: (for desktop): Windows
Sensitive data discovery: No
SQL sources: Yes
Statistics of data: -
Tagging data: Yes

Global IDs Data Profiling Suite

Global IDs Data Profiling Suite is a data discovery and profiling tool that automates the discovery of data assets, automates data profiling, and provides an active inventory of all data assets.

Access control: No
Commercial: Commercial
Desktop/Cloud: Desktop
Excel workbooks: Yes
Flat files: Yes
Free edition: No
Metadata identification: Yes
NoSQL sources: Yes
Runs on: (for desktop): Linux
Sensitive data discovery: Yes
SQL sources: Yes
Statistics of data: -
Tagging data: Yes

Experian Pandora for Data Profiling

Experian Pandora for Data Profiling helps teams focus on fixing data errors by enabling business users to conduct profiling analysis and relationship discovery quickly. It automatically discovers broken keys, orphaned records, and thousands of content quality issues using the fault detection features of Experian's data management platform.

Access control: No
Commercial: Commercial
Desktop/Cloud: Desktop
Excel workbooks: Yes
Flat files: Yes
Free edition: No
Metadata identification: Yes
NoSQL sources: No
Runs on: (for desktop): Windows
Sensitive data discovery: No
SQL sources: Yes
Statistics of data: -
Tagging data: No
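
An "orphaned record" in the description above is a child row whose foreign key points at no existing parent row. The sketch below shows the idea in plain Python under stated assumptions: rows are dictionaries, and the names `find_orphans`, `customers`, and `orders` are hypothetical. It does not reflect how Experian Pandora performs the check.

```python
def find_orphans(child_rows, parent_rows, foreign_key, primary_key):
    """Return child rows whose foreign key matches no parent primary key."""
    # Build the set of known parent keys once so each lookup is O(1).
    parent_keys = {row[primary_key] for row in parent_rows}
    return [row for row in child_rows if row[foreign_key] not in parent_keys]

# Hypothetical tables represented as lists of dictionaries.
customers = [{"id": 1}, {"id": 2}]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 3},  # no customer with id 3
    {"order_id": 12, "customer_id": 2},
]

orphans = find_orphans(orders, customers, "customer_id", "id")
print(orphans)
```

Here only order 11 is reported, because no customer has `id` 3.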

Datiris Profiler

Datiris Profiler is an intuitive data profiling tool. Its key features include cross-table analysis, domain validation, pattern analysis, conditional profiling, and a command-line interface, among others. With batch profiling, you can queue up profiling jobs and spend your time analyzing data instead of gathering it.

Access control: No
Commercial: Commercial
Desktop/Cloud: Desktop
Excel workbooks: Yes
Flat files: Yes
Free edition: No
Metadata identification: Yes
NoSQL sources: No
Runs on: (for desktop): Windows
Sensitive data discovery: No
SQL sources: Yes
Statistics of data: -
Tagging data: Yes

IBM InfoSphere Information Analyzer

IBM InfoSphere Information Analyzer provides data profiling and analysis to accurately evaluate the content and structure of your data for consistency and quality. It uses a reusable rules library and supports multi-level evaluations by rule, record, and pattern. It also facilitates the management of exceptions to established rules, helping you identify data inconsistencies, redundancies, and anomalies and make inferences about the best choices for structure.

Access control: Yes
Commercial: Commercial
Desktop/Cloud: Desktop
Excel workbooks: Yes
Flat files: Yes
Free edition: No
Metadata identification: Yes
NoSQL sources: Yes
Runs on: (for desktop): Linux,Windows
Sensitive data discovery: Yes
SQL sources: Yes
Statistics of data: Avg,Stdev
Tagging data: Yes

Oracle Data Profiling and Data Quality for Data Integrator

Oracle Data Profiling is a data investigation and quality monitoring tool. It allows business users to assess the quality of their data through metrics, to discover or infer rules based on this data, and to monitor the evolution of data quality over time.

Access control: No
Commercial: Free
Desktop/Cloud: Desktop
Excel workbooks: No
Flat files: Yes
Free edition: Yes
Metadata identification: Yes
NoSQL sources: No
Runs on: (for desktop): Linux,Windows
Sensitive data discovery: No
SQL sources: Yes
Statistics of data: -
Tagging data: No

Open Source Data Quality and Profiling

Open Source Data Quality and Profiling is an open source project dedicated to data quality and data preparation solutions. The project is developing a high-performance integrated data management platform intended to handle data integration, data profiling, data quality, data preparation, dummy data creation, metadata discovery, anomaly discovery, data cleansing, reporting, and analytics.

Access control: No
Commercial: Free
Desktop/Cloud: Desktop
Excel workbooks: Yes
Flat files: No
Free edition: Yes
Metadata identification: Yes
NoSQL sources: Yes
Runs on: (for desktop): Mac OS,Windows
Sensitive data discovery: No
SQL sources: Yes
Statistics of data: -
Tagging data: No

Acuate Data Profiling

Acuate Data Profiling tool enables you to find data problems quickly. It lets you select from a range of profiling rules to analyze your data. It can analyze data by character type or by word and lets you dive deeper into the details by drilling down into the data.

Access control: No
Commercial: Commercial
Desktop/Cloud: Desktop
Excel workbooks: Yes
Flat files: Yes
Free edition: No
Metadata identification: Yes
NoSQL sources: No
Runs on: (for desktop): Windows
Sensitive data discovery: No
SQL sources: No
Statistics of data: -
Tagging data: No
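
Analyzing data "by character type", as described above, usually means reducing each value to a pattern and counting how often each pattern occurs. The following is a minimal sketch of that general technique, not Acuate's implementation. The mapping (digits to `9`, letters to `A`, everything else kept as-is) and the names `char_pattern` and `pattern_counts` are assumptions for illustration.

```python
from collections import Counter

def char_pattern(value):
    """Map each character to a type symbol: digit -> 9, letter -> A, other -> itself."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A")
        else:
            out.append(ch)  # keep spaces and punctuation visible in the pattern
    return "".join(out)

def pattern_counts(values):
    """Count how many values share each character-type pattern."""
    return Counter(char_pattern(v) for v in values)

# Hypothetical postcode column; the last value has a letter O typed instead of a zero.
postcodes = ["90210", "10001", "SW1A 1AA", "9021O"]
counts = pattern_counts(postcodes)
print(counts.most_common())
```

Rare patterns, such as `9999A` in this sample, are the ones worth drilling into, since they often point to typos or mixed formats.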

Data Ladder

Data Ladder’s DataMatch Enterprise offers one of the easiest-to-use data profiling tools on the market. It quickly provides enough metadata to construct a cogent profile analysis of data quality, and it quantifies the scope and depth of the add-ons needed to make the project successful. After profiling, it proceeds to data matching, cleansing, deduplication, and standardization, and finally to data validation.

Access control: No
Commercial: Commercial
Desktop/Cloud: Desktop
Excel workbooks: Yes
Flat files: Yes
Free edition: No
Metadata identification: Yes
NoSQL sources: Yes
Runs on: (for desktop): Windows
Sensitive data discovery: No
SQL sources: Yes
Statistics of data: Avg,Max,Min
Tagging data: No

Ataccama ONE

Ataccama ONE lets you discover, analyze, and understand critical patterns in your data. You can see data domains and data quality highlights for each attribute. You can re-profile data in one click and check whether the problems you identified were fixed. You can also select as many tables in a data source as you need and profile them all in one click.

Access control: No
Commercial: Commercial
Desktop/Cloud: Desktop
Excel workbooks: Yes
Flat files: Yes
Free edition: Yes
Metadata identification: Yes
NoSQL sources: No
Runs on: (for desktop): Windows
Sensitive data discovery: Yes
SQL sources: Yes
Statistics of data: -
Tagging data: Yes

Aperture Data Studio

Aperture Data Studio is an easy-to-use data management suite that helps you quickly profile data to understand its deficiencies, an essential first step before cleansing, joining, and validating data. It profiles the complete data set and audits every step in readiness for statutory reporting, improving the transparency of data and processes and reducing the risk of compliance initiatives.

Access control: No
Commercial: Commercial
Desktop/Cloud: Desktop
Excel workbooks: Yes
Flat files: Yes
Free edition: No
Metadata identification: No
NoSQL sources: Yes
Runs on: (for desktop): Windows
Sensitive data discovery: No
SQL sources: Yes
Statistics of data: Avg,Stdev
Tagging data: Yes

DataRobot Data Prep

DataRobot Data Prep enables both novice and expert users to quickly and interactively explore, profile, clean, enrich and shape diverse data into AI assets ready for machine learning model development and deployment. It offers a visually interactive user interface that presents data in familiar tabular or spreadsheet style with no coding required. DataRobot provides profiles for every record and feature, including how many values are unique or missing and the statistical mean, standard deviation, median, minimum value, and maximum value.

Access control: Yes
Commercial: Commercial
Desktop/Cloud: Cloud
Excel workbooks: Yes
Flat files: Yes
Free edition: No
Metadata identification: Yes
NoSQL sources: Yes
Runs on: (for desktop): -
Sensitive data discovery: No
SQL sources: Yes
Statistics of data: Avg,Max,Min,Stdev
Tagging data: Yes
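
The per-feature statistics listed in the description above (unique and missing counts, mean, standard deviation, median, minimum, and maximum) can be sketched with Python's standard library. This is an illustrative example only, not DataRobot's code: the function name `profile_column`, the sample data, and the use of `None` for a missing value are assumptions.

```python
import statistics

def profile_column(values):
    """Summarize one numeric column; None marks a missing value (an assumption)."""
    present = [v for v in values if v is not None]
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "unique": len(set(present)),
        "mean": statistics.mean(present),
        "stdev": statistics.stdev(present),  # sample standard deviation; needs 2+ values
        "median": statistics.median(present),
        "min": min(present),
        "max": max(present),
    }

# Hypothetical sample column.
ages = [34, 45, None, 29, 45, 52]
profile = profile_column(ages)
for name, value in profile.items():
    print(f"{name}: {value}")
```

The "Statistics of data" attribute in each listing refers to summaries of this kind. Tools differ in which ones they expose.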

Melissa Data Profiler

Melissa Data Profiler analyzes data before it’s merged into your warehouse, then helps ensure consistent data quality once it’s there. It lets you identify data quality issues, monitor improvements over time, and utilize reference data to determine if your input is consistent with expected data. It can also determine if the input data is consistently fielded using the data contained in the entire record to analyze the context of data.

Access control: No
Commercial: Commercial
Desktop/Cloud: Desktop
Excel workbooks: Yes
Flat files: Yes
Free edition: No
Metadata identification: Yes
NoSQL sources: No
Runs on: (for desktop): Linux,Mac OS,Windows
Sensitive data discovery: No
SQL sources: Yes
Statistics of data: Avg,Stdev
Tagging data: No