Waterline Data AI-Driven Data Catalog

Faster Tagging, Smarter Insights

Help your organization become more data driven

Waterline Data catalog software simplifies access and promotes collaboration, allowing an organization to use its data more intelligently regardless of its users’ technical acumen.

Enable Sensitive Data Discovery

Reduce the impact of a potential breach and minimize the risk of fines by using Waterline’s Data Catalog to understand where your sensitive data resides, so you can meet regulatory mandates such as GDPR and CCPA.

Discover ALL of your Data across the Enterprise

Get insight into your data assets, including “dark” data, for greater visibility, compliance and redundancy control. Understand the data’s unique characteristics, including its origin, where it is used, and how it is joined with other data.

Automated Tagging and Classification

Waterline Data software leverages AI, machine learning and patented fingerprinting technology to automate the discovery, classification, and management of your enterprise data.

  • Speed time-to-insight
  • Discover and tag your sensitive data using hundreds of prepackaged baseline rules
  • Reduce the amount of time needed for tagging by 80%
  • Derive value from trusted data in days rather than weeks or months
  • Easily push insights and build custom views from your catalog into third-party applications with Waterline’s open APIs

Waterline Data’s patented fingerprinting technology is the industry’s only fingerprinting system that combines AI, machine learning, and best-in-class crowdsourcing to fuel your critical data-driven business needs.

Build Trust in your Data Through Collaboration, Reviews and Recommendations


Increase knowledge of your data estate through collaboration. With comments, threaded conversations and ratings, get insights and guidance from your colleagues on how they are using data throughout your enterprise. Learn which data sets your colleagues find valuable for their use cases and evaluate data based on ratings and reviews from your peers.

  • View and contribute rich content descriptions and images to showcase and describe your data resources
  • Quickly understand the quality of a specific data resource: who the experts are, how many times it has been used, and what users are saying in their reviews
  • Easily engage distributed experts and the community to ask questions about a specific data resource
  • Get insight into how one data resource is either used along with, or joined to, other data resources
  • Create crowdsourced ratings and reviews that, combined with objective profiling metadata, provide users a view into data quality and usefulness

Accelerate Digital Initiatives with Cloud-Native Support

Waterline Data Catalog helps enterprises move to the cloud, enabling them to build applications quickly while improving quality and reducing risk. It is a responsive, scalable, AI-driven data catalog that runs anywhere: public or private cloud, or hybrid on-premises and cloud infrastructure.

Cross-Domain Deployments

  • A distributed agent allows customers to establish a single cloud catalog across multiple sites and lines of business, independent of where the source data is located.
  • Runs on any cloud-native infrastructure such as AWS, Azure or GCP and can profile diverse data lakes, relational databases and other data stores.

Multi-Cluster Support

  • Enables customers to comply with their local regulatory requirements and provides an optional metadata-only view outside of a restricted jurisdiction – allowing enterprises to catalog data around the globe without violating data residency mandates.

Run Anywhere with Containerization

  • Built on containers and architected specifically to run in the elastic, distributed environments of modern cloud computing platforms. Containers are lightweight to run and extremely quick to start, and their portability allows the catalog to run anywhere: public or private cloud, or hybrid on-premises and cloud infrastructure.
  • Full support for Docker and Kubernetes makes it easier for the catalog to scale based on demand.

DataOps Dashboard

Waterline’s DataOps Dashboard serves as the operational hub, continuously monitoring catalog tasks so you can see what portion of your data estate has been processed, tagged, and identified as sensitive.

The dashboard enables users to immediately understand the macro risk of their data estate and view specific files that have been tagged as sensitive, helping expedite the identification, remediation and documentation processes.

It also helps you answer operational questions that drive adoption, such as who is using the catalog, which department they are in, and how many searches they have run, questions they have asked, and comments they have made.

DataOps Dashboard delivers:

  • Continuous, incremental monitoring, either scheduled or lambda-based
  • Automated sensitive data detection
  • Automated data quality monitoring and assessments

Data Objects

Join data objects across the enterprise without writing a line of SQL. Our data catalog software fingerprints data objects so you can find the right data object and eliminate any guesswork. Easily join data from across the enterprise and get insights into metrics about new data objects.

  • Rapidly discover possibilities to leverage enterprise data to drive insights
  • Immediately find joinable assets that contain the data you are interested in
  • Instantly test possible join options from systems that have never been joined
  • Easily profile join results for instant insight into the potential usefulness of the new object, all without the usual guesswork, delay, and happenstance discovery
  • Efficiently package the new objects to generate views that can be written to Hive

We leverage patented fingerprinting technology to instantly present possible join options. The catalog profiles your join results so analysts can assess the semantics and completeness of the data object, including the object’s fields, field tags, field profiles, and row count. Analysts can package these new data objects to write them to Hive.
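
As a rough illustration of what "finding joinable assets" can mean, the sketch below scores pairs of columns by how much their distinct values overlap. This is not Waterline's patented fingerprinting, whose internals are not described here. The function names, the Jaccard measure, the 0.5 threshold, and the sample tables are all assumptions made for the example.

```python
from itertools import product


def jaccard(a, b):
    """Jaccard similarity of two value collections (0.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def join_candidates(left, right, threshold=0.5):
    """Return (left_col, right_col, score) for column pairs whose distinct
    values overlap by at least `threshold`, best matches first.

    `left` and `right` map column names to lists of values.
    """
    scored = [
        (lc, rc, jaccard(lv, rv))
        for (lc, lv), (rc, rv) in product(left.items(), right.items())
    ]
    return sorted(
        (s for s in scored if s[2] >= threshold),
        key=lambda s: s[2],
        reverse=True,
    )


# Hypothetical tables from two systems that have never been joined.
orders = {"cust_id": [1, 2, 3, 4], "amount": [10, 20, 30, 40]}
customers = {"id": [1, 2, 3, 5], "age": [31, 45, 27, 52]}

candidates = join_candidates(orders, customers)
```

Here `candidates` would suggest joining `orders.cust_id` to `customers.id`, since three of the five distinct values are shared. A production system would likely compare compact value summaries rather than full value sets.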

Data Rationalization Dashboard

The Data Rationalization Dashboard enables you to eliminate clutter in your cloud migrations and identify potential privacy risks for cleaner searches and a healthier data estate.

  • Rapidly declutter your data for search and identify files with privacy risks
  • Dramatically reduce application licensing and storage costs
  • Accelerate cloud migration and data rationalization projects with a true view of redundant data

Traditional data catalogs return messy search results with redundant data, exposing users to security risks from rogue files. Waterline Data’s Rationalization Dashboard, powered by our patented fingerprinting technology, allows you to spot multiple copies of data and compromised files. Users can reach the approved master version of data while navigating around redundant data. They can spot copies that are no longer identical. Businesses can effectively root out redundant data to reduce licensing and storage costs and identify bad copies in order to surface the data that matters.
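
To make the idea of spotting redundant copies concrete, here is a minimal sketch that groups files by a SHA-256 digest of their content. It is a simplified stand-in, not the fingerprinting technology described above. The function name and the sample paths are hypothetical, and file contents are passed in as bytes to keep the example self-contained.

```python
import hashlib
from collections import defaultdict


def find_redundant_copies(files):
    """Group file paths by the SHA-256 digest of their content.

    `files` maps a path to its content as bytes. Only groups with two or
    more byte-identical files are returned, each as a sorted list of paths.
    """
    groups = defaultdict(list)
    for path, content in files.items():
        groups[hashlib.sha256(content).hexdigest()].append(path)
    return sorted(sorted(paths) for paths in groups.values() if len(paths) > 1)


# Hypothetical files: two identical copies and one edited copy.
files = {
    "/lake/sales/2019.csv": b"id,total\n1,100\n",
    "/tmp/export/sales_copy.csv": b"id,total\n1,100\n",
    "/home/ann/sales_edit.csv": b"id,total\n1,999\n",
}

duplicates = find_redundant_copies(files)
```

An exact hash only catches byte-identical copies. The edited file above is not grouped with the others, so detecting copies that have drifted apart would need a similarity-based approach rather than this one.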

Business Rules Engine

Our new business rules engine enables you to write rules that leverage tags for data quality monitoring, among other use cases.

  • Create an automated, compliant data estate for data-driven decisions and self-service analytics
  • Examine tags to uncover the quality of the data and evaluate tags against business rules
  • Business rules combine data and metadata conditions to perform data quality checks, monitor regulatory compliance targets, perform metadata quality checks, and programmatically tag data

It’s difficult to ascertain data quality in large data lakes. Traditional data quality tools are difficult to deploy and rely on manual and unmanaged tags and policy directives. Waterline Data AI-Driven Data Catalog’s intuitive business rules engine makes it easy to apply tags for data quality monitoring.

The catalog software incrementally evaluates and tags new data sets and rows to keep your metadata fresh. For example, new data is automatically scanned for sensitive data and tagged appropriately, so you can implement tag-based security policies. The catalog creates smart tags that the business rules engine can use to power directives and data policies and to deliver data quality metrics.
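
As an illustration of a rule that combines a metadata condition with a data condition and then tags data programmatically, consider the sketch below. It does not use Waterline's actual rule syntax or API, which this page does not show. The `CatalogField` class, the `apply_rule` function, the tag names, and the threshold are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogField:
    """Hypothetical stand-in for a cataloged field with its values and tags."""
    name: str
    values: list
    tags: set = field(default_factory=set)


def null_ratio(f):
    """Fraction of values that are missing (0.0 for an empty field)."""
    return sum(v is None for v in f.values) / len(f.values) if f.values else 0.0


def apply_rule(f, required_tag, max_null_ratio, result_tag):
    """Add `result_tag` when both conditions hold.

    Metadata condition: the field already carries `required_tag`.
    Data condition: its null ratio exceeds `max_null_ratio`.
    Returns True if the rule fired.
    """
    if required_tag in f.tags and null_ratio(f) > max_null_ratio:
        f.tags.add(result_tag)
        return True
    return False


# Hypothetical field that was previously tagged as an email address.
email = CatalogField("email", ["a@x.com", None, None, "b@y.org"], {"Email Address"})
fired = apply_rule(email, "Email Address", 0.25, "Low Quality")
```

Half of the sample values are missing, so the rule fires and the field gains the "Low Quality" tag, which a downstream policy could then act on.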

Data Lineage

Our new data lineage tool makes it easy to understand data lineage, both inferred and imported, so you know where data sets came from and how they are used.

  • 10x increase in data lineage accuracy
  • Intuitively ascertain data lineage, both inferred and imported
  • Infer missing lineage through fingerprinting, or import lineage directly from the source with REST APIs and connectors for Apache Atlas and Cloudera Navigator

Waterline Data’s patented fingerprinting technology looks at lineage relationships, meta tags, time stamps, and many other criteria to infer data lineage. This dramatically reduces the time needed to develop lineage as new data is added and existing data is used, changed, and joined. New data is incrementally fingerprinted without the system re-scoring the entire object.

Waterline Data AI-Driven Data Catalog establishes user trust in the data through lineage visibility and addresses regulatory compliance requirements.

Integrates Seamlessly With All Of Your Data Systems

Integrates with cloud, on-premises or hybrid data environments including:

Supported Data Sources

Introducing the Waterline 2019 Product Launch