Mark Flaherty: Today we are talking about Big Data. It has become a very popular topic in business intelligence, and for obvious reasons: there are all kinds of sources of data now that challenge us. It’s not just the amount of data but also the type of data that is really giving organizations a run for their money in terms of being able to leverage this information and get some value from it.
It's certainly a new challenge to the enterprise in general and it’s kind of an exciting one as well. There are new opportunities to get more value out of our business analytics by getting Big Data into our environments and that’s the cornerstone of what we will be talking about today.
So let me dive in and cover my agenda just a little bit. I am going to help everybody who is listening in today get a better understanding of some of the characteristics of Big Data and some of its sources, and then look at some of the technologies and methodologies that you might want to have in place, or be thinking about, in order to leverage it and get value from it.
And then we will figure out exactly what types of things you need to have in place, or at least some of the things that I am suggesting you should have in place, to get the most from it.
And then of course we are going to talk a little bit about some of the more advanced sides of it, around the real-time value of Big Data, and of course an exploration of “Aha!” moments, which is one of my favorite topics. So let me dive right in and start talking a little bit about the characteristics of Big Data. I am going to start out with the definition, which may seem a little surprising to you all, but this is kind of new territory. There is a lot of debate going on about exactly how to define it and what exactly it is.
So I am going to share a generic kind of formula here that you might want to consider when you are looking at Big Data. Generally speaking, working with big data sets, either because of their size or their complexity, tends to exceed the ability of the traditional BI tools or commonly used BI technologies that you may have in place for analyzing data, for managing information, or for running processes. When Big Data gets introduced to your environment, some of these traditional technologies either falter or get stretched a little bit beyond their normal capabilities.
We are talking large data sets here; in many cases hundreds of terabytes to petabytes is very common when we are discussing Big Data. And one of the things that you might want to think about that makes it somewhat unique is that it has a lot of variety to it; it comes from a lot of different locations. The volume is extremely large and the velocity of the information is extremely fast. And some of this is good and some of it is difficult to manage, but overall these are some of the characteristics that you might see when you are looking at Big Data.
As I mentioned, it's generally widely distributed, and oftentimes it's semi-structured or completely unstructured, which comes with its own set of challenges when you are trying to run analytics or get information out of that type of data. And oftentimes it is best if it's leveraged in a real-time environment. Certainly you can get extremely high value from large data when you are looking at it in a real-time environment. So from an understanding or a definition standpoint, that’s probably one of the best ways to look at Big Data. When we start talking about the sources, it comes from many different places. Machine data is probably one of the largest contributors to the Big Data challenge, and some of those things might be sensor or system information, or RFID-type implementations, that are dumping an awful lot of information into your systems.