Data discovery is quickly becoming mainstream, and almost every company is deploying it now. We haven’t seen that yet with search-based business intelligence. We haven’t hit that tipping point yet, but I think that’s something to watch. And at the lower right of this slide you have the big database, and that’s this notion that, okay, I am dealing with maybe tens of millions or hundreds of millions of rows, maybe in some cases low billions, but that’s really pushing it.
I think I have seen a couple of installations that are that big. The original data discovery approach works there, but what if I am talking about billions and billions of rows, or petabytes of data? Is an in-memory architecture really going to cut it? At least some vendors, like Karmasphere, Datameer, and Platfora, are promoting the idea of putting a lot of the data into Hadoop and, instead of in-memory, using more of a distributed file system, essentially the open source Hadoop system, as a way to store large amounts of detailed data, structured and unstructured, and analyze that. So that’s another vector that we may see innovation on.
There is other innovation happening with data discovery. There is a notion of what has been called smart pattern discovery, where it’s actually the software that mashes the data together, finds the interesting columns or variables, the interesting insights in the data, and then visualizes them.
So it relies a lot less on having smart humans using these tools. So there is a lot of innovation coming in the data discovery space. The trend is clearly going along these lines of unstructured data, more variety, and larger volumes, and that is where we are going to see some significant innovation.
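To make the smart pattern discovery idea a bit more concrete, here is a minimal sketch of one way software could surface "interesting" column pairs on its own: score every pair of numeric columns by the absolute value of their Pearson correlation and rank them. This is an illustration only, not how any particular vendor's product works. The function names (`pearson`, `rank_column_pairs`) and the sample `sales` table are made up for the example, and real tools would use far richer measures than a simple correlation.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column carries no pattern; treat it as uncorrelated.
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_column_pairs(table):
    """Return (score, col_a, col_b) tuples, strongest relationship first.

    `table` maps a column name to a list of numbers (hypothetical layout).
    """
    scored = [
        (abs(pearson(table[a], table[b])), a, b)
        for a, b in combinations(sorted(table), 2)
    ]
    return sorted(scored, reverse=True)


# Made-up sample data, for illustration only.
sales = {
    "ad_spend": [10, 20, 30, 40, 50],
    "revenue": [12, 24, 33, 46, 55],
    "store_id": [3, 1, 4, 1, 5],
}

ranking = rank_column_pairs(sales)
top_score, top_a, top_b = ranking[0]
print(f"Most interesting pair: {top_a} vs {top_b} (|r| = {top_score:.3f})")
```

In this toy table the software would put the ad spend and revenue pair at the top of the list and could chart it automatically, without a human having to guess which columns to compare.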
Now there is just one more point I want to make on this slide, and that’s what the mega vendors are doing. Clearly I think all the mega vendors have been disrupted by challengers like us, and I think they are working on a plan to get involved in this space. I think they recognize the need to have data discovery.
You look at SAS with their Visual Analytics. This essentially is the innovator’s dilemma, which is that the entrenched vendor knows that there is something new coming along, but their whole business model is set up for selling the old stuff. That is hard to switch. And SAS has been very clear that SAS Visual Analytics is the new architecture, and that they are slowly moving more and more in that direction.
I think some of the older tools, their coding tools, will become more like legacy tools as SAS Visual Analytics advances. This analytics platform again leverages large amounts of in-memory processing with interactive visualization capabilities. So clearly visualization and in-memory data discovery tools will become a more and more dominant piece of their analytics portfolio.
Then there is Microsoft with Power Pivot, and it’s actually an entire Power BI suite, not just Power Pivot anymore. It’s also Power Query and Power View. Those three tools map almost directly to the left-hand side of this slide, where Power Query is number one, Power Pivot is number two, and Power View is number three, intuitive navigation.
The reason I am singling out Microsoft here is just that a lot of people who do this kind of work are Excel users anyway. And again, a great way to get around the innovator’s dilemma problem that Microsoft faces is to actually bring out this new innovation with what they have been trying to sell anyway. They have always been trying to sell Excel.
They have been built around selling Excel. So by making data discovery a free Excel plug-in and also integrating it with SharePoint, it just naturally fits into their existing sales structure. We are just starting to see more and more competitive situations. It took a while. I think it was the dependence on Office 2010 that slowed it down.