The world is awash in data, and 90 percent of it was created in the last two years.1 In fact, every day we create 2.5 quintillion bytes of data,2 and that number is growing exponentially. The explosive growth of the Internet of Things (IoT) promises to add to this data glut, with 40 percent of all data expected to come from sensors by 2020.3 Today, a jet engine may generate 1 terabyte of data in a single flight,4 and a major global retailer collects 2.5 petabytes of customer data each hour.5 Yet 99.5 percent of all this data is never used or analyzed.6
IoT is creating an “analytics imperative”—a mandate to turn more of this data into actionable insights that drive operational value and business agility.
In the first wave of the Internet, we moved the data to the analytics. This works well for large volumes of historical data—such as an oil company batch processing years of seismic data to apply to new extraction techniques. It also works for low-bandwidth, not-so-real-time applications such as connected vending machines that send just a few packets to the cloud when an item needs restocking.
Today, however, the Internet of Everything (IoE) has enabled a plethora of real-time, high-data-rate applications that require a new approach, which we call “fog.” By extending cloud capabilities to the edge of the network, fog computing moves analytics to the source of the data, enabling real-time processing and instantaneous action. Rather than moving massive amounts of raw data, the fog system sorts and indexes the data locally, and sends just alerts and exceptions back to the cloud.
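To make this concrete, here is a minimal Python sketch of the idea (the names and threshold are hypothetical, and this is not a Cisco API): the fog node keeps and summarizes the raw readings locally, and only a compact summary plus any exceptions travel to the cloud.

```python
# A minimal sketch of fog-style local processing, with hypothetical names
# (not Cisco's Fog Data Services API). The fog node summarizes raw readings
# at the edge and forwards only a small summary and any exceptions upstream.

def summarize_at_edge(readings, max_normal=120.0):
    """Reduce a window of raw sensor readings to a summary plus exceptions."""
    exceptions = [r for r in readings if r > max_normal]   # policy-defined anomalies
    return {
        "count": len(readings),
        "min": min(readings),
        "max": max(readings),
        "avg": sum(readings) / len(readings),
        "exceptions": exceptions,                           # only these matter upstream
    }

raw_window = [70.2, 71.0, 69.8, 130.5, 70.4]   # raw data stays at the edge
to_cloud = summarize_at_edge(raw_window)        # a few hundred bytes go to the cloud
print(to_cloud)
```

Instead of streaming every reading upstream, the cloud receives a small summary per window, while the full-resolution data remains available at the edge if it is ever needed.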
By moving analytics to the data, sensors in the transportation infrastructure can identify approaching emergency vehicles and immediately adjust traffic lights to help them increase both speed and safety. Or, an oil and gas company can use temperature and acoustic sensing to detect abnormal conditions and react immediately to prevent a blowout.
[Chart: Among survey respondents, the move toward fog computing has begun.]
The move toward fog analytics is well underway. In a recent survey of IT and operational technology (OT) professionals, 37 percent of respondents said that within three years, “most” IoT data will be processed locally, at the edge of the network.
Fog analytics will require a flexible network architecture, where some elements, such as policy, reside in the cloud while real-time data processing functions move to the edge. Less time-sensitive data can still go to the cloud for long-term storage and historical analysis. Other requirements include standardization of device and data interfaces, integration with the cloud, and a scalable policy infrastructure.
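As a rough illustration of that split, here is a hypothetical policy structure in Python (not a standard or product schema): rules are authored centrally in the cloud, the fog node enforces them, and anything not marked time-sensitive still flows to the cloud for storage and analysis.

```python
# Illustrative sketch only (hypothetical structure, not a standard schema):
# policy is authored in the cloud, the fog node enforces it, and data that is
# not time-sensitive is forwarded to the cloud for long-term storage.

cloud_policy = {
    # metric name -> where it is handled and why
    "turbine_vibration": {"handle_at": "edge", "reason": "real-time shutdown decisions"},
    "vending_inventory": {"handle_at": "cloud", "reason": "restocking is not time-critical"},
}

def route(metric, policy):
    """Return the processing tier for a metric, based on cloud-defined policy."""
    rule = policy.get(metric, {"handle_at": "cloud"})   # default: send upstream
    return rule["handle_at"]

for metric in ("turbine_vibration", "vending_inventory", "unknown_metric"):
    print(metric, "->", route(metric, cloud_policy))
```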
In the future, capabilities such as streaming analytics to handle continuously arriving data, machine learning to improve IoT application performance over time, and data visualization will rise in importance. In the meantime, Cisco recently announced Fog Data Services software that you can use today to build scalable IoT data solutions. In my next blog, we’ll go further into the fog, with a look at IoT applications running at the edge of the network.
References:
1 “Big Data and What it Means,” U.S. Chamber of Commerce Foundation
2 http://www-01.ibm.com/software/data/bigdata/what-is-big-data.html
3 “Trillions of Sensors Feed Big Data,” Signal Online, February 1, 2014
4 “If You Think Big Data’s Big Now, Just Wait,” TechCrunch, August 10, 2014
5 http://www.adweek.com/socialtimes/big-data-infographic/489509
6 “Big Data, Bigger Digital Shadows, and Biggest Growth in the Far East,” IDC, February 2013
The success of IoT will depend more on metadata management (device classification, device properties, device connectivity, device context) than on the devices themselves. These factors drive IoT application cost and deployment time.
Ron,
I agree: standardizing interfaces, metadata structures, and metadata management is key to scaling IoT. OIC and other forums are working on it as we speak.
I want to understand how analytics can be moved to the data. An example to elaborate on this would help. Thank you.
Great question. When we move analytics to the data, we are running rule engines as close to the source of the data as possible: the fog node at the edge of the network. So, rather than sending the huge volume of raw data collected by sensors at the edge, we can set policy in the cloud that is implemented in the fog node. Let’s say we have a sensor monitoring the temperature of a piece of equipment. We can set a policy that we only care about a temperature reading above 120 degrees. The fog node then indexes all the packets coming from the sensor and sends just the exceptions to the cloud, alerting the system to excessive heat. Or, you could set the policy so that a high temperature triggers an immediate shutdown of the equipment. By moving the analytics capabilities to the data, fog computing generates insight (the equipment is too hot) at the source of the problem and triggers action based on policy from the cloud.
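To make that concrete, here is a minimal Python sketch of such a rule engine (the names are hypothetical, not a product API): the policy arrives from the cloud, and for each reading the fog node either indexes it locally, forwards an exception, or triggers a local shutdown.

```python
# A minimal sketch of the rule engine described above, with hypothetical names
# (not a product API). The policy comes from the cloud; the fog node evaluates
# each reading and either alerts the cloud or acts locally, per the policy.

policy = {"metric": "temperature", "threshold": 120, "action": "alert"}  # or "shutdown"

def evaluate(reading, policy, alerts, actuator):
    """Apply the cloud-defined rule to one sensor reading at the fog node."""
    if reading <= policy["threshold"]:
        return "indexed_locally"                      # normal data never leaves the edge
    if policy["action"] == "shutdown":
        actuator("power_off")                         # act immediately, even if offline
        return "shutdown_triggered"
    alerts.append({"metric": policy["metric"], "value": reading})  # exception to cloud
    return "alert_sent"

alerts = []
for temp in (98, 101, 124):
    print(temp, "->", evaluate(temp, policy, alerts, actuator=print))
print("alerts queued for the cloud:", alerts)
```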
Another example can be found at road crossings. If you monitor congestion and want to adjust traffic lights accordingly, then you should analyze the video right there and use that data to adjust the lights in real time, even if the network is down. So you do the analytics at the edge of the network and do not have to transfer the data to the cloud to be analyzed.
Great piece of content!
An aside on the donut charts: the one on the left does not reflect 37% (at least to me).
cheers,
Boyan
Perhaps the solution for managing big data is better information retrieval across all variables.