Waterline Data Brings Automated Data Cataloging to Hortonworks Data Platform through Integration with Apache Atlas

Mountain View, Calif.—June 28, 2016—Waterline Data, The Smart Data Catalog Company, today announces the integration of the company’s Smart Data Catalog with Apache Atlas within Hortonworks Data Platform (HDP). This announcement is being made from the Hadoop Summit being held in San Jose, June 28-30.

The Apache Atlas project provides a data governance framework and capabilities for Hadoop that effectively address many compliance requirements. With the addition of Waterline Data’s Smart Data Catalog, Apache Atlas users can replace manual tagging of metadata with an automated process that rapidly classifies the data assets in their data lake, including new data even as it’s created. Unlike catalogs that scan historical SQL logs, Waterline Data automatically catalogs every field of data in the data lake while capturing and learning from tribal knowledge.

HDP is the industry’s only true secure, enterprise-ready open source Apache Hadoop distribution based on a centralized architecture (YARN). HDP addresses the complete needs of data-at-rest, powers real-time customer applications and delivers robust analytics that accelerate decision making and innovation.

“We are very excited that Waterline Data has integrated their automated smart data cataloging capabilities with Apache Atlas, which brings added value to Waterline Data and Hortonworks users,” said Matt Morgan, Vice President of Product and Alliance Marketing, Hortonworks. “This helps customers rapidly organize their data lake, enabling more secure, compliant and optimal use of their data through Atlas.”

With this announcement, Waterline Data has now earned the Governance Ready badge. The company previously earned HDP Certification and YARN integration certification.

This new integration allows joint Waterline Data and Hortonworks customers to:

  • Accelerate data discovery, governance and time to value through smart data discovery capabilities
  • Provide data engineers, data scientists and business analysts with secure self-service access to trusted, high quality data for faster understanding and use
  • Automatically update Atlas with all the metadata Waterline Data uncovers
  • Facilitate data compliance and trust by discovering sensitive data and data lineage

Furthermore, as part of the company’s integration with Apache Atlas via HDP, Waterline Data will begin importing the data lineage information captured in Apache Atlas.

“No data lake can be opened up without proper data governance,” said Alex Gorelik, CEO of Waterline Data. “If compliance isn’t assured, the data simply isn’t usable. That’s why our new integration with Apache Atlas is so significant. As soon as organizations begin to realize they can replace manual tagging with rapid, automatic cataloging, we expect to see a dramatic rise in the adoption and expanded use of Hadoop.”

About Waterline Data

Waterline Data accelerates data discovery, governance and time to value through an industry-only Smart Data Catalog that instantly automates the cataloging of all data lake assets, including the ability to capture and learn from tribal knowledge. The company is led by a team of enterprise data management veterans, funded by top venture and corporate investors, including Menlo Ventures, Jackson Square Ventures, Partech Ventures, and Infosys, and implemented in large enterprises around the globe. Founded in 2013, the company is headquartered in Mountain View, California. For more, visit

Media Contacts:

Chris Blake
MSR Communications
Phone: 1-415-989-9000