Earlier this year, Waterline Data teamed with Arcadia Data, StreamSets and Trifacta to launch MakeBigDataWork to help you get the most from your Big Data.
MakeBigDataWork has since hosted a number of educational webinars led by respected industry analysts and other Big Data experts. In addition to delivering best practices and other information, these sessions have also gleaned some interesting data from attendees themselves through polls conducted by the webinar leads.
What did we find? That many organizations can probably relate to the Carpenters song, “We’ve Only Just Begun.”
While the Big Data revolution is well underway, and plenty of new technologies are available to help organizations make the most of their data, only 12% of the enterprises we surveyed say Big Data makes up at least 60% of their overall data. Most businesses (68%) say no more than 40% of their data is Big Data.
You may feel your business is in the same boat, but before you start thinking you’re behind the times, keep in mind that not all data needs to be Big Data. As Eckerson Group’s David Wells said in his recent webinar, “Flipping Analytics on Its Head (How Big Data Self-Service Analytics Really Works)”: “Traditional data is a part of Big Data. In many ways Big Data is an unfortunate name, because it tends to concentrate on the size of data, the volume of data, when the really interesting things about this data world are the variety and the speed. We don’t kick relational data. We don’t kick flat files out of the equation. They are part of the variety of data. So too are geospatial databases. So too are documents store and the other forms of NoSQL databases.”
He’s right. It is the variety of data and the sources of data that make data truly interesting. Businesses may have data coming in from social media, ecommerce, and other sources, but it doesn’t replace traditional enterprise data. It enriches our enterprise data. And the traditional enterprise data, in turn, provides context for the Big Data. Without it, much of the Big Data would be meaningless.
Something else we found:
Most businesses (52%) are still in the early stages of their Big Data journeys, with half of this group still asking questions like “What’s a data lake?” and the other half piloting their first use case. About a third of enterprises have their data lakes in production and are working on end-user self-service analytics. As for the data lake journey itself, 42% say they’re still gathering info, 30% say their lakes are partially implemented, and 14% say they’re fully implemented. Another 14% have admitted to having created a data swamp!
Another area where businesses aren’t making the headway one might expect is the cloud: 66% of the businesses we polled say 25% or less of their data resides in the cloud, while 23% say up to 50%. This suggests that Big Data in the cloud is still a relatively new phenomenon for businesses. Many are struggling with the fact that not all of their data is going into one place, or even into a single Hadoop or cloud storage system. For many, the complexity of data storage is getting worse, not better.
But while our polling data suggests the Big Data train has been a slow-moving one, it’s moving nonetheless. The same data that may come off as negative to some also reflects an undeniably bright outlook: a very large majority of organizations have made at least some progress in rolling out their plans for Big Data.
Yes, there are many roads to choose. And yes, many businesses start out walkin’. But soon, as Karen Carpenter once sang, they’ll learn to run.