Automated Tagging and Classification
Waterline Data software leverages AI, machine learning and patented fingerprinting technology to automate the discovery, classification, and management of your enterprise data.
- Speed time-to-insight
- Discover and tag your sensitive data
- Reduce manual tagging of data by more than 80%
- Derive value from trusted data in days rather than weeks or months
Waterline Data’s patented fingerprinting technology is the industry’s only fingerprinting system that combines AI, machine learning, and best-in-class crowdsourcing to fuel your critical data-driven business needs.
Join data objects across the enterprise without writing a line of SQL. Our data catalog software fingerprints data objects so you can find the right data object and eliminate any guesswork. Easily join data from across the enterprise and get insights into metrics about new data objects.
- Rapidly discover possibilities to leverage enterprise data to drive insights
- Immediately find joinable assets that contain the data you are interested in
- Instantly test possible join options from systems that have never been joined
- Easily profile join results to assess the potential usefulness of the new object, all without the usual guesswork, time, and happenstance discovery
- Efficiently package the new objects to generate views that can be written to Hive
We leverage patented fingerprinting technology to instantly present possible join options. The catalog profiles your join results so analysts can assess the semantics and completeness of the data object, including the objects’ fields, field tags, field profiles, and row count. Analysts can package these new data objects to write them to Hive.
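As an illustration of how fingerprint-based join discovery can work in general, here is a minimal Python sketch. It is not Waterline Data’s patented method: the `fingerprint`, `containment`, and `join_candidates` names, the value-set fingerprint, and the 0.8 threshold are all assumptions made for this example. It treats each column's fingerprint as its set of normalized distinct values and proposes a join wherever one column's values are largely contained in another's.

```python
from typing import Dict, Iterable, List, Set, Tuple


def fingerprint(values: Iterable[object]) -> Set[str]:
    """Hypothetical fingerprint: the set of normalized distinct values."""
    return {str(v).strip().lower() for v in values if v is not None}


def containment(a: Set[str], b: Set[str]) -> float:
    """Share of the smaller fingerprint's values that also appear in the other."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def join_candidates(
    columns: Dict[str, List[object]], threshold: float = 0.8
) -> List[Tuple[str, str, float]]:
    """Return column pairs whose fingerprints overlap enough to suggest a join key."""
    prints = {name: fingerprint(vals) for name, vals in columns.items()}
    names = sorted(prints)
    found = []
    for i, left in enumerate(names):
        for right in names[i + 1:]:
            score = containment(prints[left], prints[right])
            if score >= threshold:
                found.append((left, right, round(score, 2)))
    # Strongest candidates first.
    return sorted(found, key=lambda c: -c[2])


if __name__ == "__main__":
    # Illustrative column names and values only.
    sample = {
        "crm.customer_id": [1, 2, 3, 4],
        "orders.cust_id": ["1", "2", "3", " 4 ", "5"],
        "hr.badge": [900, 901],
    }
    for left, right, score in join_candidates(sample):
        print(f"{left} <-> {right}: {score}")
```

A production system would work from compact sketches of the values rather than full value sets, but the same idea applies: comparing fingerprints is far cheaper than trial joins across systems.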
Data Rationalization Dashboard
The Data Rationalization Dashboard enables you to eliminate clutter in your cloud migrations and identify potential privacy risks for cleaner searches and a healthier data estate.
- Rapidly declutter your data for search and identify files with privacy risks
- Dramatically reduce application licensing and storage costs
- Accelerate cloud migration and data rationalization projects with a true view of redundant data
Traditional data catalogs return messy search results with redundant data, exposing users to security risks from rogue files. Waterline Data’s Rationalization Dashboard, powered by our patented fingerprinting technology, allows you to spot multiple copies of data and compromised files. Users can reach the approved master version of data while navigating around redundant data. They can spot copies that are no longer identical. Businesses can effectively root out redundant data to reduce licensing and storage costs and identify bad copies in order to surface the data that matters.
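To make the idea of spotting redundant and diverged copies concrete, the following Python sketch groups files by a content hash. This is a simplified stand-in, not the dashboard's actual fingerprinting: the function names, the file paths, and the use of SHA-256 over whole files are assumptions chosen for illustration.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def content_hash(data: bytes) -> str:
    """Hash of the full file contents (assumed stand-in for a fingerprint)."""
    return hashlib.sha256(data).hexdigest()


def group_copies(files: Dict[str, bytes]) -> List[List[str]]:
    """Group paths whose contents are byte-for-byte identical."""
    groups = defaultdict(list)
    for path, data in files.items():
        groups[content_hash(data)].append(path)
    return sorted(sorted(paths) for paths in groups.values() if len(paths) > 1)


def diverged_copies(files: Dict[str, bytes], master: str, copies: List[str]) -> List[str]:
    """Return known copies whose contents no longer match the approved master."""
    master_hash = content_hash(files[master])
    return sorted(p for p in copies if content_hash(files[p]) != master_hash)


if __name__ == "__main__":
    # Hypothetical paths and contents.
    estate = {
        "/lake/master/customers.csv": b"id\n1\n",
        "/lake/tmp/customers_copy.csv": b"id\n1\n",
        "/lake/old/customers_stale.csv": b"id\n1\n2\n",
    }
    print(group_copies(estate))
    print(diverged_copies(
        estate,
        "/lake/master/customers.csv",
        ["/lake/tmp/customers_copy.csv", "/lake/old/customers_stale.csv"],
    ))
```

An exact hash only catches identical copies; recognizing near-duplicates that have drifted needs a similarity-preserving fingerprint, which is outside the scope of this sketch.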
Business Rules Engine
Our new business rules engine enables you to write rules that leverage tags for data quality monitoring, among other use cases.
- Create an automated, compliant data estate for data-driven decisions and self-service analytics
- Examine tags to uncover the quality of the data and evaluate tags against business rules
- Business rules combine data and metadata conditions to perform data quality checks, monitor regulatory compliance targets, perform metadata quality checks, and programmatically tag data
It’s difficult to ascertain data quality in large data lakes. Traditional data quality tools are difficult to deploy and rely on manual and unmanaged tags and policy directives. Waterline Data AI-Driven Data Catalog’s intuitive business rules engine makes it easy to apply tags for data quality monitoring.
The catalog software incrementally evaluates and tags new data sets and rows to keep your metadata fresh. For example, new data is automatically scanned for sensitive data and tagged appropriately, so you can implement tag-based security policies. The catalog creates smart tags that the business rules engine can use to power directives and data policies and to deliver data quality metrics.
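A rule that combines a metadata condition (a tag) with a data condition (a quality check) and then tags the result programmatically might look like the sketch below. The `Field` and `Rule` classes, the `PII` and `DQ_FAILED` tag names, and the 10% null threshold are hypothetical and do not reflect the product's rule syntax.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass
class Field:
    name: str
    tags: Set[str]
    values: List[object]


@dataclass
class Rule:
    name: str
    applies_to_tag: str             # metadata condition: field must carry this tag
    check: Callable[[Field], bool]  # data condition: returns True when the field passes
    tag_on_failure: str             # tag added programmatically when the check fails


def null_ratio(field: Field) -> float:
    """Fraction of values that are missing."""
    if not field.values:
        return 0.0
    return sum(v is None for v in field.values) / len(field.values)


def apply_rules(fields: List[Field], rules: List[Rule]) -> Dict[str, List[str]]:
    """Evaluate every applicable rule and tag fields that fail; return failures by field."""
    failures: Dict[str, List[str]] = {}
    for field in fields:
        for rule in rules:
            if rule.applies_to_tag in field.tags and not rule.check(field):
                field.tags.add(rule.tag_on_failure)
                failures.setdefault(field.name, []).append(rule.name)
    return failures


if __name__ == "__main__":
    email = Field("email", {"PII"}, ["a@example.com", None, None, "b@example.com"])
    notes = Field("notes", set(), [None, None])
    completeness = Rule(
        "pii_completeness", "PII", lambda f: null_ratio(f) <= 0.1, "DQ_FAILED"
    )
    print(apply_rules([email, notes], [completeness]))
    print(sorted(email.tags))
```

Because the rule is keyed on a tag rather than a column name, it applies automatically to any newly scanned field that receives the same tag.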
Our new data lineage tool makes it easy to understand data lineage, both inferred and imported, so you know where data sets came from and how they are used.
- 10x increase in data lineage accuracy
- Intuitively ascertain data lineage, both inferred and imported
- Infer missing lineage through fingerprinting or import lineage directly from the source with REST APIs and connectors for Apache Atlas and Cloudera Navigator
Waterline Data’s patented fingerprinting technology looks at lineage relationships, meta tags, time stamps, and many other criteria to infer data lineage. This dramatically reduces the time needed to develop lineage as new data is added and existing data is used, changed, and joined. New data is incrementally fingerprinted without the system re-scoring the entire object.
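The general idea of inferring lineage from fingerprints and time stamps can be sketched as follows. This is an assumption-laden illustration, not the product's algorithm: the `Dataset` class, the value-set fingerprint, and the 0.9 overlap threshold are invented for the example. A data set is proposed as a parent when it was created earlier and its fingerprint covers most of the child's values.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Dataset:
    name: str
    created: datetime
    fingerprint: FrozenSet[str]  # hypothetical: set of distinct values


def overlap(child: Dataset, parent: Dataset) -> float:
    """Fraction of the child's values that also appear in the candidate parent."""
    if not child.fingerprint:
        return 0.0
    return len(child.fingerprint & parent.fingerprint) / len(child.fingerprint)


def infer_parent(
    child: Dataset, candidates: List[Dataset], threshold: float = 0.9
) -> Optional[Dataset]:
    """Pick the most recent older data set whose fingerprint covers the child."""
    scored = [
        (overlap(child, c), c) for c in candidates if c.created < child.created
    ]
    scored = [(s, c) for s, c in scored if s >= threshold]
    if not scored:
        return None
    # Prefer higher overlap, then the most recently created candidate.
    return max(scored, key=lambda sc: (sc[0], sc[1].created))[1]


if __name__ == "__main__":
    raw = Dataset("raw", datetime(2020, 1, 1), frozenset("abcd"))
    staged = Dataset("staged", datetime(2020, 1, 5), frozenset("abc"))
    report = Dataset("report", datetime(2020, 1, 10), frozenset("ab"))
    parent = infer_parent(report, [raw, staged])
    print(parent.name if parent else None)
```

Inferred links of this kind are probabilistic, which is why they are best combined with lineage imported directly from source systems.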
Waterline Data AI-Driven Data Catalog establishes user trust in the data through lineage visibility and addresses regulatory compliance requirements.
Leverage the Tribal Knowledge of your Organization
With comments, collaboration, and ratings, get insights from your peers on how they are using data throughout your enterprise. Learn which data sets your colleagues find valuable for their use cases and evaluate data based on ratings and reviews from your peers.
- Evaluate data based on ratings and reviews from your peers
- Add annotations and view other users’ comments to capture tribal knowledge and establish trusted data sources
- Easily push insights to external systems and applications with our open APIs
- Build custom views of your catalog in third-party applications. Create external data quality applications that leverage tags from the catalog
- Create subjective crowdsourced ratings and reviews that, combined with objective profiling metadata, provide users a view into data quality and usefulness