Big Data Scott Whitney September 28, 2018

Strata Data Conference: It’s a Wrap!

Another Strata Data Conference has come and gone. As I walked the floor last week, I noticed a new air of excitement.

As tech trends go, big data has certainly taken its hits, with some even chalking the whole thing up to unwarranted hype. But while the term “big data” may have lost its shine, data is still the all-important ingredient in many of the big trends everyone’s talking about now: AI, machine learning, IoT… As Gartner Research VP Mike Walker recently said in discussing the latest Gartner Hype Cycle report, “Big data hasn’t been a thing on the hype curve for years, but there are plenty of big data technologies.”

This was exactly the vibe I got at Strata, where traffic seemed to be a bit down from previous years, but the data professionals, company heads, analysts and other industry experts I met seemed to agree we’re on the cusp of something big. The companies that attended this year, the organizations that continue to make big investments in data, whether under the guise of machine learning or AI, will be the ones that soon score the biggest returns. These innovators learned there’s a difference between throwing everything into the data swamp and properly managing a well-oiled data lake. They’ve evolved with innovative new technologies that allow them to tackle many of the difficult challenges that appeared along their data journeys (and continue to sideline many of those early data dreamers). Hadoop has been pulled down from the stratosphere, closer to the ground where it’s more in line with realistic expectations. These companies are automating. They’re getting a handle on governance. They’re investing and innovating in machine learning and AI. And those journeys are now starting to lead to sunnier paths.

Waterline Data is a big part of this next phase. By allowing businesses to replace manual tagging of metadata with an automated process through our industry-leading data catalog, we’re helping them classify all the data assets in their data lake much more rapidly. This is incredibly important because otherwise, as Gartner has said, 90% of data lakes are useless. There is simply too much captured information for organizations to wade through. That’s why at Strata, MapR announced its collaboration with Waterline Data to deliver a data catalog for machine learning-driven AI and analytics. That’s why Waterline Data was named to the Constellation ShortList for Data Cataloging for the third straight year. That’s why at the Gartner Catalyst Conference last month, Gartner research director Sanjeev Mohan told attendees you need automated data cataloging capabilities; otherwise, you can’t work with data you don’t know you have.

There’s plenty more in store. Thanks to everyone at O’Reilly Media and Strata Data. See you next time!