One Solution Across Multiple Organizations
You’ll know the who, what, permissions and integrations across your organization. Waterline Data Catalog automatically and incrementally “fingerprints” all data and infers data lineage by analyzing actual data values for relational, cloud and Hadoop data.
Use machine learning to automatically suggest tags and match data fingerprints to business glossary terms. This process diminishes human error and improves metadata management.
Analysts and data stewards accept or reject suggested tags, while machine learning fine-tunes the tagging process and improves the matching algorithm, yielding dramatic time savings over manual tagging.
Map your compliance policies to your data assets: acceptable use, legal holds, expiry
Gain visibility over disparate data and into dark data
Stay ahead of changing compliance requirements by having all of your organization’s data in one central repository.
A properly implemented data catalog is the foundation for successful governance and compliance
Search for relevant data through Waterline Data’s intuitive GUI. Now discovering data and assessing its quality are as easy as shopping online.
Better and Faster Insight
Through Waterline’s collaborative platform, peer reviews and broad visibility of all data sources, business professionals achieve better and faster insights.
You have thousands of datasets with millions of distinct data fields across your company, and that number is growing every day. Manually documenting your data isn’t an option! Waterline Data automatically catalogs all your data assets (Hadoop, cloud, relational, etc.) so you can spend more time using data and less time looking for it.
Reduces manual tagging of data by over 80%. Waterline Data Fingerprinting™ combines big data analysis, machine learning and human curation to automatically catalog data and data lineage at scale
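As a rough illustration of the fingerprint-and-match idea described above, here is a minimal Python sketch. It is not Waterline's algorithm: it assumes a fingerprint is just the set of normalized distinct values in a column sample, and it uses Jaccard similarity as a stand-in for the matching step. The names `Fingerprint`, `fingerprint`, `similarity`, `suggest_tags` and the 0.5 threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fingerprint:
    """Hypothetical column fingerprint: the distinct values seen in a sample."""
    values: frozenset


def fingerprint(sample):
    # Normalize values so formatting differences do not hide a match.
    return Fingerprint(frozenset(str(v).strip().lower() for v in sample))


def similarity(a, b):
    # Jaccard similarity: shared values divided by all values seen.
    union = a.values | b.values
    if not union:
        return 0.0
    return len(a.values & b.values) / len(union)


def suggest_tags(column, tagged, threshold=0.5):
    """Return (tag, score) pairs for tagged fingerprints that resemble `column`."""
    scored = [(tag, similarity(column, fp)) for tag, fp in tagged.items()]
    return sorted(
        [(tag, score) for tag, score in scored if score >= threshold],
        key=lambda pair: pair[1],
        reverse=True,
    )


# Fingerprints of columns that already carry a tag (assumed example data).
known = {
    "us_state": fingerprint(["CA", "NY", "TX", "WA"]),
    "currency": fingerprint(["USD", "EUR", "GBP"]),
}
new_column = fingerprint(["ca", "ny ", "TX", "OR"])
suggestions = suggest_tags(new_column, known)
```

In this toy data the new column shares three of five distinct values with the `us_state` fingerprint, so only that tag is suggested. A real system would work from richer statistics than raw value overlap.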
Native Big Data, Native Cloud Storage
Works natively on Hadoop and Spark to easily scale to handle all your data. Directly connects to cloud storage like Amazon S3, Azure Blob Storage and Google Cloud Storage
Data stewards accept or reject automatically suggested tags, and the system learns from each decision, fine-tuning and improving the matching algorithm
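One simple way to picture this accept/reject feedback loop is sketched below in Python. It is an assumption-laden toy, not the product's learning method: it keeps per-tag accept and reject counts and scales a suggestion score by a smoothed acceptance rate. The class `TagFeedback` and its methods are hypothetical names.

```python
from collections import defaultdict


class TagFeedback:
    """Hypothetical store of steward decisions, keyed by tag name."""

    def __init__(self):
        self.accepted = defaultdict(int)
        self.rejected = defaultdict(int)

    def record(self, tag, accepted):
        # Count one steward decision for this tag.
        if accepted:
            self.accepted[tag] += 1
        else:
            self.rejected[tag] += 1

    def trust(self, tag):
        # Laplace-smoothed acceptance rate: 0.5 when there is no feedback yet.
        a, r = self.accepted[tag], self.rejected[tag]
        return (a + 1) / (a + r + 2)

    def adjust(self, tag, score):
        # Down-weight suggestions for tags that stewards often reject.
        return score * self.trust(tag)


feedback = TagFeedback()
for _ in range(3):
    feedback.record("email", accepted=True)
for _ in range(2):
    feedback.record("phone", accepted=False)
```

After these decisions, suggestions for `email` keep most of their score while suggestions for `phone` are discounted, so frequently rejected tags surface less often.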
Any Data Source
Works seamlessly across a wide variety of data sources (relational, files, Hadoop, cloud, etc.) because you never know where the most important data is located
You’re a business professional, and when you have questions, you need reliable answers. But where is the right data, and whom do you ask? Waterline Data consolidates your tribal data knowledge and makes it easy to share with others so you and your colleagues can quickly find the data you need.
Easy-to-use web search interface designed specifically for the business user to search a catalog of trusted, curated data
Easy to integrate
Search directly from your existing data wrangling and visualization tools integrated through our REST APIs
Ratings and Reviews
Add annotations and view the comments of other users to capture tribal knowledge and establish trusted data sources
Automatically propagates data tags so users can easily find similar data
Govern your data with agility
Data governance isn’t one-size-fits-all. We provide the appropriate level of governance for whatever type of data is being managed.
Simplifies data governance by delivering a truly scalable, automated and repeatable process for identifying sensitive data, capturing data lineage, and ensuring proper data use and access
GDPR governance modules generate specific reports that highlight the location and proper use of data subject to GDPR, demonstrating proper compliance processes to regulators
Data access control
User and role management ensures proper data access for sensitive data and integrates directly with Apache Ranger and Cloudera Sentry to enable tag-based access control
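The idea behind tag-based access control can be sketched in a few lines of Python. This is a conceptual illustration only and does not use the Apache Ranger or Cloudera Sentry APIs: the `POLICIES` mapping, the role names and the `can_access` function are hypothetical, and it assumes a dataset is readable only if the user holds an allowed role for every restricted tag on it.

```python
# Hypothetical policy table: tag -> roles allowed to read data carrying it.
POLICIES = {
    "pii": {"data_steward", "compliance"},
    "financial": {"finance", "compliance"},
}


def can_access(user_roles, dataset_tags, policies=POLICIES):
    """Return True if the user's roles satisfy every restricted tag."""
    roles = set(user_roles)
    for tag in dataset_tags:
        allowed = policies.get(tag)
        # Tags without a policy are treated as unrestricted in this sketch.
        if allowed is not None and not (roles & allowed):
            return False
    return True


analyst_sees_pii = can_access({"analyst"}, {"pii"})
compliance_sees_all = can_access({"compliance"}, {"pii", "financial"})
```

Because the policy is attached to the tag rather than to each dataset, newly cataloged data that receives the `pii` tag is covered without writing a new rule.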
Usage Audit and Monitoring
Auditing provides full traceability on how all users have tagged, curated, commented, and searched for data within their data catalog