Big Data was one of the hottest buzzwords back in 2015. Companies were well underway with their investments in getting better answers from their massive amounts of data. But in that same year, Gartner canceled its Hype Cycle for Big Data, and many articles about the demise of Big Data began to appear. To understand whether it has lived up to its hype, we should first understand the differences between Big Data and the new buzzwords that have arisen, such as Fast Data.
Big Data can be a confusing concept, and the explosive growth of data from the internet, computers, mobile phones, and the Internet of Things makes it even more confusing. Consider Google’s definition of Big Data:
“extremely large data sets that may be analyzed computationally to reveal patterns, trends, and associations, especially relating to human behavior and interactions.”
We start to understand that Big Data requires not just the collection and storage of data, but also better ways to put that data to efficient and effective use within your organization. The rise of Massively Parallel Processing (MPP) databases and MapReduce technologies has made what once seemed impossible, the historical analysis of petabytes of data on ordinary, or commodity, hardware, an everyday reality.
But a Big Data strategy doesn’t end there. Once your massive data sets are being captured and stored in a data warehouse, it’s what you do with that data that will help your organization find new revenue streams or new markets to focus on.
Once the volume of data that a company collects has been strategically organized, better decisions need to be made based on data as it is collected in real time. A Big Data strategy loses its benefit if the data cannot be acted upon at the time it is collected. That data has an ongoing purpose: it is a representation of what is happening now, a real-time status update.
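To make the MapReduce idea a little more concrete, here is a minimal single-machine sketch in Python. It is only an illustration of the map, shuffle, and reduce steps, not how any particular framework implements them. The sales records, store names, and function names are all hypothetical, and each inner list merely stands in for data that would live on a separate commodity machine.

```python
from collections import defaultdict

# Hypothetical sales records as (store, amount) pairs. Each inner list
# stands in for the slice of data held by one machine in a cluster.
partitions = [
    [("north", 120.0), ("south", 80.0)],
    [("north", 30.0), ("east", 50.0)],
    [("south", 20.0), ("east", 10.0)],
]


def map_record(record):
    """Map step: turn one raw record into a (key, value) pair."""
    store, amount = record
    return store, amount


def shuffle(pairs):
    """Shuffle step: group all values that share the same key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_values(key, values):
    """Reduce step: collapse the values for one key into a single total."""
    return key, sum(values)


def run_job(parts):
    """Run map, shuffle, and reduce in sequence over every partition."""
    mapped = [map_record(record) for part in parts for record in part]
    grouped = shuffle(mapped)
    return dict(reduce_values(key, values) for key, values in grouped.items())


sales_by_store = run_job(partitions)
print(sales_by_store)
```

In a real cluster the map and reduce steps would run in parallel on many machines, which is what makes the analysis of very large historical data sets practical.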
The difference between Big Data and Fast Data is the velocity at which the data is analyzed. For instance, a department store will analyze historical sales and inventories in its data warehouse to determine what products to stock for the upcoming Back to School season. Amazon, by comparison, can perform real-time analysis to provide recommendations based on many characteristics of a customer, including what products they have viewed within the last five minutes.
Technologies are quickly emerging to support the ever-increasing velocity at which decisions are made from real-time data. Message queueing, Message Oriented Middleware (MOM), and streaming services are among the technologies at the forefront of fast data. A few of those services are noted below:
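The contrast between the two approaches can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a description of how any retailer actually builds its systems: the product names, the `top_products_batch` function, and the `RecentViews` class are all hypothetical, and timestamps are assumed to be plain seconds that arrive in non-decreasing order.

```python
from collections import Counter, deque

WINDOW_SECONDS = 5 * 60  # the "last five minutes" from the example above


def top_products_batch(sales_history, n=2):
    """Big Data style: scan the full sales history and rank products."""
    counts = Counter()
    for product, units in sales_history:
        counts[product] += units
    return [product for product, _ in counts.most_common(n)]


class RecentViews:
    """Fast Data style: keep only the views inside a sliding time window."""

    def __init__(self, window_seconds=WINDOW_SECONDS):
        self.window_seconds = window_seconds
        self._events = deque()  # (timestamp, product), oldest first

    def _expire(self, now):
        # Drop events that are older than the window.
        while self._events and now - self._events[0][0] > self.window_seconds:
            self._events.popleft()

    def record(self, timestamp, product):
        self._events.append((timestamp, product))
        self._expire(timestamp)

    def recent_products(self, now):
        """Return distinct products viewed in the window, newest first."""
        self._expire(now)
        seen = []
        for _, product in reversed(self._events):
            if product not in seen:
                seen.append(product)
        return seen


# Batch: decide what to stock from historical unit sales.
history = [("backpack", 40), ("notebook", 95), ("pencils", 60), ("backpack", 30)]
stock_plan = top_products_batch(history)

# Streaming: react to what one customer has looked at recently.
views = RecentViews()
views.record(0, "laptop")
views.record(200, "headphones")
views.record(400, "laptop sleeve")
recent = views.recent_products(now=420)
```

The batch function needs the whole history before it can answer, while the sliding window updates its answer with every new event and forgets anything older than five minutes.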
- Apache Kafka - Open-source stream processing platform;
- Akka Streams - Open-source stream processing;
- Amazon Kinesis - Amazon data-stream processing solution;
- ActiveMQ - Open-source message broker with a JMS client in Java;
- RabbitMQ - Open-source message broker written in Erlang, with AMQP support;
- JBoss AMQ - Lightweight MOM;
- Oracle Tuxedo - Middleware message platform;
- Sonic MQ - messaging system platform by
Leveraging real-time data to immediately find the best course of action is the key to fast data.
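The message-queueing pattern behind the services listed above can be illustrated with Python's standard library alone. The sketch below is an in-process stand-in, not the API of Kafka, RabbitMQ, or any other broker: a producer puts events on a queue and a consumer acts on each one as soon as it arrives. The inventory events, the restock threshold of 5, and all function names are assumptions made up for the example.

```python
import queue
import threading

RESTOCK_THRESHOLD = 5  # hypothetical business rule for this example


def producer(events, channel):
    """Publish each event to the queue, then signal that no more will come."""
    for event in events:
        channel.put(event)
    channel.put(None)  # sentinel value tells the consumer to stop


def consumer(channel, decisions):
    """Act on each event as soon as it is received."""
    while True:
        event = channel.get()
        if event is None:
            break
        if event["stock"] < RESTOCK_THRESHOLD:
            decisions.append(f"restock {event['sku']}")
        else:
            decisions.append(f"ok {event['sku']}")


def run_pipeline(events):
    """Wire one producer to one consumer through a bounded queue."""
    channel = queue.Queue(maxsize=100)
    decisions = []
    worker = threading.Thread(target=consumer, args=(channel, decisions))
    worker.start()
    producer(events, channel)
    worker.join()
    return decisions


decisions = run_pipeline([
    {"sku": "A1", "stock": 12},
    {"sku": "B2", "stock": 3},
])
print(decisions)
```

A real message broker adds what this sketch leaves out, such as delivery across machines, persistence, and multiple consumers, but the core idea is the same: decisions are made per event rather than after the data has been warehoused.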
While Big Data has faded as a buzzword and Gartner canceled its Hype Cycle, it remains a strong investment that companies will continue to make. Though no longer the buzzword it once was, it ushered in a leap in technological advancement, namely more MPP databases and re-imagined message queuing technologies, that is opening up new ways to make data insightful and actionable.
Contact Rythmos today for a free data assessment.