Big Data Todd Goldman October 18, 2017

Self-Service Analytics is Not Enough. We Need Self-Service Data!

The idea of “self-service” analytics really took off when Tableau was founded in 2003 with the idea that business analysts shouldn’t have to go to IT to do basic business intelligence.  Fundamentally, every time they had to ask someone in IT to generate a report for them, the request got stuck in a massive queue of pre-existing IT requests.  The idea was formed that business analysts should be able to “self-serve” their own data visualizations to tell “data stories”.  And if the Tableau conference in Vegas last week was any indication, they succeeded wildly in delivering on that vision.

But as with any good technology product, the solution to one problem only served to generate a new problem (or opportunity, depending on your point of view).  Actually, in this case, it generated a new set of problems that require expanding the self-service concept to the data itself.

So let’s start back at the top.  Imagine you are a business analyst and you are using Tableau or some other similar product to free you from the shackles of IT.  The first thing you do, of course, is analyze the heck out of the data sources you work with on a regular basis.  But eventually you run into a problem where you have to combine multiple data sources.  Well, back in the day, when the idea of self-service analytics first started, hitting this problem put you right back in the situation you started in.  You had to go back and get in the queue for someone from IT to write some ETL code for you to combine those data sources and land them in a new table before you could go back to your data visualization tool.
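For readers who have never seen what that "ETL code" amounts to, here is a minimal sketch in Python using only the standard library.  Everything in it is an assumption made for illustration: the two sample data sources, the `combine_and_land` function, and the `orders_by_region` table are hypothetical names, not taken from any particular product.  It simply joins two sources on a shared key and lands the result in a new table that a visualization tool could then query.

```python
import sqlite3

# Hypothetical sample data standing in for two separate data sources.
orders = [
    {"customer_id": 1, "amount": 120.0},
    {"customer_id": 2, "amount": 75.5},
    {"customer_id": 1, "amount": 30.0},
]
customers = [
    {"customer_id": 1, "region": "West"},
    {"customer_id": 2, "region": "East"},
]


def combine_and_land(orders, customers, conn):
    """Join orders to customers on customer_id and land the rows in a new table."""
    # Build a lookup from the second source so each order can find its region.
    region_by_id = {c["customer_id"]: c["region"] for c in customers}
    rows = [
        (o["customer_id"], region_by_id[o["customer_id"]], o["amount"])
        for o in orders
        if o["customer_id"] in region_by_id  # inner join: skip unmatched orders
    ]
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders_by_region "
        "(customer_id INTEGER, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders_by_region VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


# An in-memory database stands in for wherever the combined table would live.
conn = sqlite3.connect(":memory:")
landed = combine_and_land(orders, customers, conn)

# The kind of query a visualization tool might run against the landed table.
totals = dict(
    conn.execute("SELECT region, SUM(amount) FROM orders_by_region GROUP BY region")
)
```

None of this is hard, which is exactly the point: the pain was never the code itself, it was waiting in line for someone else to write it.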

For a few years you suffered again in this situation and then a bunch of self-service ETL tools appeared.  Some were SaaS ETL tools and others were entirely new creations called Data Wrangling tools.  Solutions like Trifacta and Paxata lead the pack in this space today.  Now all of a sudden you no longer had to wait for IT to do basic data quality and data integration.  Self-service had progressed to the next level of sophistication.  And that lasted a few years until the next problem appeared on the horizon.

Then, one day, some random executive asks you to generate a report that combines not only data you are familiar with, but also data that you will have to get from some other department or business unit.  The problem is that you don’t even know where to find this data.  So you start asking around, and your request ends up with some data steward who tells you that she thinks she can help you, but only after she gets back from vacation and gets through the 20 other executive-level requests.  And voila, you are back in the exact same position you were in before.  In the queue, waiting on someone else.

Now the next level in self-service has arrived: the ability to go to a data catalog and search for the exact data you need, without having to count on the so-called “tribal knowledge” of people within the organization who happen to know where to find it.  Ultimately this will evolve as well, with integration up and down the self-service stack.  A business analyst will be able to search for data without asking for help, compare potential data sources, and choose the data they need for their project based on profiling information that presents quality statistics as well as ratings and reviews.  They can then use a data wrangling tool to integrate it themselves, and finally use a data visualization tool to create pretty charts and graphs.  All with the goal of enabling true end-to-end data self-service.

Now if only I could get fries with that!