Why is everybody excited about big data?
Well, strictly speaking, not everyone is excited about big data, but many are, and there are good reasons to be excited about the benefits and opportunities it offers. But what is big data? Search the internet, and there are almost as many definitions as there are providers of data analytics on the market. One point on which everyone agrees is that big data involves lots and lots of data. Big data captures a quintessential characteristic of the digital world: the accelerated creation of digital information. This information is easily created and stored, from a significant and growing number of online transactions, to our use of smartphones, to all types of connected telemetry and telematics equipment in M2M, and now to the Internet of Things. The scale and volume of data certainly make it big.
The second point is about speed. M2M and the Internet of Things have been about connecting devices, people and systems. With advances in technology, connectivity has improved significantly. Reaching rural areas through cellular networks has become a cost-efficient approach in communications, and continuous improvements in short-range technologies and broadband have offered new opportunities for increased and extended data traffic. What this increased level of connectivity has started to provide is near real-time data. Big data is not just about scale but, more importantly, about having real-time data available for immediate insight and action.
Speed is a game changer
The speed, or velocity, of data in big data is actually multi-dimensional. Data is captured in real time from a range of connected devices and fed back to platforms and supporting systems. This is traditionally what happened as part of business intelligence solutions.
The change lies in the processing and analysis of data, which for many applications now takes place as data ‘travels’ through the systems. Concepts such as complex event processing are mentioned in this context. Speed, meaning the real-time processing and feedback loop of data, becomes an integral part of the entire process rather than simply referring to the transfer rate of data.
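To make the idea of analysing data as it ‘travels’ a little more concrete, here is a minimal Python sketch. It is not a complex event processing engine, only an illustration of the principle: each reading is checked against a small rolling window the moment it arrives, rather than being stored first and analysed later. The function name, window size, threshold and sample values are all assumptions made for illustration.

```python
from collections import deque
from statistics import mean


def detect_spikes(readings, window=3, threshold=5.0):
    """Yield (index, value) for each reading that exceeds the mean of the
    previous `window` readings by more than `threshold`.

    Hypothetical helper: the name and default values are illustrative only.
    """
    recent = deque(maxlen=window)  # holds only the last `window` readings
    for index, value in enumerate(readings):
        # Only compare once the window is full, so early readings never alert.
        if len(recent) == window and value - mean(recent) > threshold:
            yield index, value
        recent.append(value)


# Made-up temperature readings arriving one at a time.
stream = [20.0, 20.5, 21.0, 20.8, 30.2, 21.1]
alerts = list(detect_spikes(stream))
print(alerts)
```

Because the function is a generator, it can be fed from a live source just as easily as from a list, and it keeps only the last few readings in memory.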
Take a manufacturing plant as an example. Connected production equipment, storage and transport systems will be sharing data as envisaged in the connected industry, not just for remote monitoring and management purposes but to embed greater degrees of autonomy and automation through intelligence based on the real-time sharing and processing of data. Taken to another level, speed becomes a game changer when considering applications with mission-critical or life-saving elements. Whether it is lives being saved through connected medical devices, or rescue services being guided safely through burning buildings before all systems fail, the speed at which this data is processed, analysed and fed back into the system makes big data a significant opportunity area, if only for that one reason.
Structure and situation add to the definition
There are two remaining characteristics that make big data what it is. In the Internet of Things, data that we produce will not always ‘fit’ into the uniform fields and columns of traditional databases. Data will come in all kinds of formats and structures including audio files, images, random text messages, tweets, likes, and videos. To process and manage this data, new data processing, storage, and analysis methods have led the way, and introduced us to terms like Hadoop, MongoDB, MapReduce, and Splunk to name a few of the tools needed to manage data analytics.
Big data is about managing, storing and analysing data in real time, and turning that data into actionable insight. As a result of the scale and structural diversity of the data managed, these analytics platforms need to support complex and functionally rich applications, deliver software extensibility and agility, and, first and foremost, deliver scalable and robust complex event processing.
No, I have not forgotten situation. This is the final important piece of the big data picture. With all this information streaming through the systems, interacting with other pieces of data, and enabling a single unified view of the data to emerge, there is a significant opportunity to identify trends and patterns which may previously have gone undetected. For some advocates of big data, this is the holy grail. But for every piece of data generated (say, a single thermostat in a boiler showing a higher than normal temperature), several situational characteristics matter: which device the data came from, where that device was located, under which circumstances the data was logged (e.g. time, outside temperature, etc.), within what context the data can be qualified (for example, whether other nearby devices noted similar events), and whether the data was delivered in the proper sequence.
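These situational characteristics can be pictured as fields travelling alongside each reading. The Python sketch below is one possible way to represent them and to run two of the checks described above: whether data arrived in the proper sequence, and whether a nearby device noted a similar event. All class, field and function names, the device identifiers and the values are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """One reading plus its situational context (all fields are assumed)."""
    device_id: str      # which device the data came from
    location: str       # where that device was located
    timestamp: float    # when the data was logged, in seconds
    sequence: int       # per-device counter used to check delivery order
    temperature: float  # the measured value


def in_sequence(readings):
    """True if every device's sequence numbers strictly increase on arrival."""
    last_seen = {}
    for r in readings:
        if r.sequence <= last_seen.get(r.device_id, -1):
            return False
        last_seen[r.device_id] = r.sequence
    return True


def corroborated(reading, others, limit, max_gap=60.0):
    """True if a different device at the same location also exceeded `limit`
    within `max_gap` seconds of `reading`."""
    return any(
        o.device_id != reading.device_id
        and o.location == reading.location
        and abs(o.timestamp - reading.timestamp) <= max_gap
        and o.temperature > limit
        for o in others
    )


# Made-up readings from two boilers in the same plant.
readings = [
    Reading("boiler-1", "plant-a", 0.0, 1, 95.0),
    Reading("boiler-2", "plant-a", 30.0, 1, 93.0),
    Reading("boiler-1", "plant-a", 60.0, 2, 96.0),
]
print(in_sequence(readings))
print(corroborated(readings[0], readings, limit=90.0))
```

A high reading that is corroborated by a neighbouring device probably deserves different treatment from an isolated one, which is exactly the kind of judgement that situational context makes possible.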
The promise of big data
M2M and the Internet of Things have arrived. Industries, products, devices and systems are connecting every day. Machina Research's working definition of big data is the “process through which heterogeneous data is collected, processed, stored and actuated in real-time for either of two purposes: i) to enhance and create intelligent flows, or ii) to perform correlation analysis whereby seemingly tenuous relationships are identified”. Big data, so defined, has started to deliver new and significant areas of improvement across many business areas, and in the ‘Subnets of Things.’ The aim of this definition is to move the discussion away from the individual parts of big data (scale, structure, etc.) and to recognise that the significance of the data is derived from the analytics process which is core to big data.
The real challenge and opportunity in big data lies not in the vast amounts of data created and stored, or in the analytics processes and tools that are applied. It lies in identifying the significance (value) of that data within the context of business services and commercial propositions. As we move beyond the mechanical analysis of data, focusing on significance will be key.
Where do you think the promise of big data lies?