“Be vewy vewy quiet. I’m hunting data!”
If you work with data, chances are you’ve uttered that embellished Elmer Fudd line at least once. And why not? After spending 80% of the analytics lifecycle on data acquisition and prep, you do start to feel like the hapless hunter forever unable to catch the prized wascally wabbit. And even when you do find him, as in Elmer’s case, there’s always some kind of holdup. The wabbit always finds some way to confuse the hunter.
In our world, the wabbit is of course that needle-in-the-haystack data that can be used to deliver value to the organization. It’s hard enough to sort through the structured data you’ve been authorized to access without having to ask the data professionals in IT for permission to access new sources of data. What about all the semi-structured and unstructured data that hasn’t been processed simply because there is too much of it to determine what the organization has and who can access it? How do you know it’s trusted? How do you know where it came from? How do you know it hasn’t been replicated, modified or corrupted?
You could liken the problem to a candle that won’t burn at either end: neither starting point works on its own. If you employ a “data first” approach, you’ll waste tons of time and money trying to manually make sense of all that legacy data that’s growing larger by the second. But organizations that start from the challenge or opportunity at hand and go looking for the data to match can also waste tons of time and money trying to locate and access data that may or may not be there.
What we at Waterline do is enable both approaches with data catalog solutions that automate the discovery, governance and usage of data. But, unlike catalogs that only scan historical SQL logs, we combine data profiling and machine learning with peer ratings and reviews to ensure the technology is always kept in check by human expertise. Automated results are curated through subject matter expert review. Data stewards and business analysts can officially accept or reject a tag at any time.
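To make that human-in-the-loop idea concrete, here is a minimal sketch of how automatically suggested tags might be curated by reviewers. It is an illustration only, not Waterline’s actual API: the names `TagSuggestion`, `Status` and `trusted_tags`, the sample fields and the confidence scores are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    SUGGESTED = "suggested"   # proposed automatically, not yet reviewed
    ACCEPTED = "accepted"     # confirmed by a steward or analyst
    REJECTED = "rejected"     # overruled by a steward or analyst


@dataclass
class TagSuggestion:
    field_name: str           # column or field that was profiled
    tag: str                  # proposed business term, e.g. "email"
    confidence: float         # hypothetical profiling score between 0 and 1
    status: Status = Status.SUGGESTED

    def review(self, accept: bool) -> None:
        """Record a human decision; calling it again revises the decision."""
        self.status = Status.ACCEPTED if accept else Status.REJECTED


def trusted_tags(suggestions):
    """Return only the tags that a human reviewer has accepted."""
    return [s.tag for s in suggestions if s.status is Status.ACCEPTED]


# Hypothetical output of an automated profiling pass
suggestions = [
    TagSuggestion("cust_email", "email", 0.97),
    TagSuggestion("col_17", "phone_number", 0.55),
]
suggestions[0].review(accept=True)    # steward confirms the automated tag
suggestions[1].review(accept=False)   # steward rejects a poor match
print(trusted_tags(suggestions))
```

The point of the sketch is that the automated score only proposes a tag; a tag counts as trusted once a person has accepted it, and that decision can be changed later.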
That trust, in turn, supports tighter control over data access and provisioning. With Waterline’s automated tagging built into the regular project workflow to kick-start the initial identification of data, organizations can more readily find critical data and convert it into actionable business intelligence on an ongoing basis.
Check out this webinar with Intelligent Business Strategies’ Michael Ferguson, who talks more about how smart data catalog solutions can help your Elmer Fudds catch their wabbits.