
Big Data Smack Down: Round 1 – Best of Breed Stacks vs. All-in-One Solutions

Posted on February 16th, 2017 | Todd Goldman

Ladies and gentlemen, boys and girls of all ages, welcome to the World Data Federation’s giant cage match! In the corner to your left we have the snaggle-toothed veterans (Informatica, IBM, and Oracle) dragging their legacy architectures into the big data age with so-called “end-to-end solutions” and only “one throat to choke.” In the other corner, to your right, we have the young, scrappy up-and-comers (Waterline, Trifacta, Arcadia, and StreamSets) building their platforms from scratch to run natively on Hadoop and Spark, with modern REST API architectures that allow for easy integration to create a custom “best-of-breed” big data stack!

 

Ding! Ding! Ding! The first bell rings and the wily veterans strike first with their ability to provide a complete end-to-end solution that allows you to ingest, discover, catalog, govern, wrangle, and analyze your data from a single vendor. But wait, they have just tripped over their own feet, having forgotten that their solutions were cobbled together from technologies they have acquired over the last ten years. Now we see that they have tied themselves up in the ropes even further with an installation process that requires many different dedicated servers to be administered.

 

Ding! Ding! Ding! Now here come the challengers: young startup companies that are laser-focused on solving the same new problems the incumbents claim to solve, including the ability to ingest, discover, catalog, govern, wrangle, and analyze your data. Installation is trivial because they are all built to run on Hadoop and Spark, so the question remains: can the challengers work together? It looks like they struggle against the veterans at first, not knowing quite how to partner to solve customer problems. But, lucky for the startups, they have coaches who are experienced system integrators with expertise in implementing complex systems. Once the coaches step in and suggest the use of RESTful APIs, they are able to get the startups to work together easily and solve customer problems!

 

We’ve seen this wrestling match between legacy vendors and startup technology disruptors many times before. In the end, the startup technology usually comes out on top in one of two ways: (1) the startup becomes the next Tableau or Splunk and goes public, or (2) it gets acquired (the list of acquired companies is too long to include here) and becomes part of a legacy vendor’s stack. Simple math says acquisition is the more likely outcome, but either way the technology will still be around to solve customer problems.

 

In the end, as a consumer of both business intelligence and data management software, you have a choice: either go with the legacy vendor because you’re already using their technology stack and it fits in well with your existing architecture, or acquire technology from new vendors who are innovating at a faster pace and will either get acquired or go public. I am obviously biased because I work at an early-stage firm (note: we are past the “startup” phase), but current technology does make it pretty easy to integrate new startup technology into your existing environment. Either way, the latest cage match of best-of-breed startups vs. legacy end-to-end solutions is taking place in the world of data management and integration as we speak. So pull up a chair and enjoy the show.