Big Data – A Simplistic Illustration!
Data is indispensable in today’s business and technology world. The term Big Data came into wide use at the beginning of the 21st century, and every technology giant now makes use of Big Data technologies. In simple terms, it refers to vast, voluminous data sets that may be structured, unstructured or semi-structured. This massive amount of data is produced every day by business entities and users of social media. Big Data Analytics is the process of examining these large data sets to gain in-depth knowledge and insights from the patterns in the data and their interrelationships.
What is Big Data?
Big data has been discussed in the digital world as well as in the business world with equal fervor. The enthusiasm around the term stems primarily from its potential use in key decision-making processes. SAS Institute, creator of one of the leading statistical suites, describes big data as “the large volume of data – both structured and unstructured – that inundates a business on a day-to-day basis”. However, what matters most is not the volume of data itself but the insights that help the business make better decisions at the various levels of the organization.
On the face of it, neither the volume of data nor the analytics element is really new. For many years, enterprise organizations have accumulated growing stores of data. Some have also run analytics on that data to gain business value from it. For example, the oil and gas industry has for decades used very large data sets with high-performance computing (HPC) to model underground reserves from seismic data. Organizations have also long used OLAP (Online Analytical Processing) tools on top of data warehouses to interrogate large historical data sets and create business value.
The concept of big data gained momentum in the early 2000s when industry analyst Doug Laney articulated the now-mainstream definition of big data as the three V’s:
- Volume: Organizations collect data from a variety of sources, including business transactions, smart (IoT) devices, industrial equipment, videos, social media and more.
- Velocity: With the growth of the IoT, data streams into businesses at an unprecedented speed and must be handled in a timely manner. RFID tags, sensors and smart meters are driving the need to deal with these torrents of data in near-real time.
- Variety: Data comes in all types of formats, including structured numeric data, unstructured text documents, emails, video, audio, stock ticker data and financial transactions.
Over time, other dimensions of large data sets have become relevant from a usage perspective. These dimensions are ‘Variability’, ‘Veracity’ and ‘Value’.
- Variability: The flow of big data is unpredictable, primarily because of the multitude of data dimensions resulting from disparate data types and sources. It’s challenging, but businesses need to know when something is trending on social media, and how to manage daily, seasonal and event-triggered peak data loads. Variability can also refer to the inconsistent speed at which big data is loaded into a database.
- Veracity: Veracity refers to the quality and trustworthiness of data. In other words, the data should be accurate and relevant to the business entity. Since data comes from varied sources, businesses need to connect and correlate relationships, hierarchies and multiple data linkages. Otherwise, their data can quickly spiral out of control.
- Value: This dimension is arguably the most important of all. The other attributes of big data are meaningless if no business value is generated from the data sets. Substantial value can be found in big data, including understanding customers better, identifying target customers and the product and service customization they seek, optimizing processes, and improving business performance.
Why is Big Data important?
The insights that come out of big data have enormous potential in the business world. Slicing and dicing big data with analytics tools provides varied perspectives, which help organizations identify new business opportunities, develop better products and services, understand customers’ perceptions of those products and services, and respond faster to customer demands. That, in turn, leads to smarter business moves, more efficient operations, higher profits and happier customers.
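To make "slicing and dicing" concrete, here is a minimal sketch in plain Python. The sample records, the field names (region, product, quarter, revenue) and the helper functions slice_rows and roll_up are all hypothetical and chosen only for illustration. A real analytics tool would perform the same kind of operation over far larger data sets.

```python
from collections import defaultdict

# Hypothetical sales records; every field name and value is illustrative only.
sales = [
    {"region": "North", "product": "A", "quarter": "Q1", "revenue": 100},
    {"region": "North", "product": "B", "quarter": "Q1", "revenue": 150},
    {"region": "South", "product": "A", "quarter": "Q1", "revenue": 80},
    {"region": "South", "product": "B", "quarter": "Q2", "revenue": 120},
    {"region": "North", "product": "A", "quarter": "Q2", "revenue": 90},
]


def slice_rows(rows, **fixed):
    """Keep only the rows whose fields match every given value (a 'slice')."""
    return [r for r in rows if all(r[k] == v for k, v in fixed.items())]


def roll_up(rows, dimension, measure="revenue"):
    """Sum a measure over one dimension (a simple aggregate view)."""
    totals = defaultdict(int)
    for r in rows:
        totals[r[dimension]] += r[measure]
    return dict(totals)


# Two perspectives on the same data: revenue by region, then by product in Q1.
print(roll_up(sales, "region"))
print(roll_up(slice_rows(sales, quarter="Q1"), "product"))
```

The same records answer different business questions depending on which dimension is fixed and which is aggregated. That is the sense in which analytics provides "varied perspectives" on one data set.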
Big Data analytics involves the use of analytics techniques such as machine learning, data mining, deep learning, natural language processing, and statistics. The data is extracted, prepared and blended to provide in-depth analysis for the businesses. Data analytics involves qualitative as well as quantitative techniques to improve business productivity and profits.
The demand for big data analytics tools and techniques is rising because business organizations find enormous benefits in understanding their data sets in terms of interrelationships, patterns and other facets of the data. These analytics tools help generate meaningful information for making better business decisions. For instance, an online taxi service provider such as Uber uses data analytics for price optimization and to create a superior customer experience based on real-time parameters, including weather conditions, customer demand at a given hour, and varied customer needs at any point in time.
Advances in technology have brought enormous growth in the field of Big Data analytics. This has led to the use of big data in multiple industries such as banking, healthcare, energy, technology, retail and manufacturing. Banking is often seen as one of the fields making the heaviest use of Big Data Analytics. The education sector is also making use of data analytics in a big way. Data analytics also opens new options for research, leading to innovations that benefit society.
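As a rough illustration of pricing driven by real-time parameters, the sketch below computes a fare multiplier from current demand, driver availability and weather. It is not Uber's actual algorithm. The function name surge_multiplier, the 1.2 weather factor and the 3.0 cap are all assumptions made up for this example.

```python
def surge_multiplier(open_requests, available_drivers, bad_weather=False, cap=3.0):
    """Toy fare multiplier based on demand, supply and weather (illustrative only)."""
    if available_drivers <= 0:
        # No supply at all: charge the maximum allowed multiplier.
        return cap
    # The multiplier grows with the demand/supply ratio but never drops below 1.0.
    multiplier = max(1.0, open_requests / available_drivers)
    if bad_weather:
        # Assumed uplift for bad weather; the 1.2 factor is arbitrary.
        multiplier *= 1.2
    return round(min(multiplier, cap), 2)


# Quiet period, busy period, and busy period in bad weather.
print(surge_multiplier(50, 100))
print(surge_multiplier(150, 100))
print(surge_multiplier(150, 100, bad_weather=True))
```

A production system would presumably learn such factors from historical data rather than hard-code them. The point here is only that live inputs feed directly into the pricing decision.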
Big Data Management Challenges
Big data challenges include storing and analyzing large, rapidly growing, diverse data stores, and then deciding the best way of handling that data.
The most obvious challenge associated with big data is storing and analyzing all that information. Much of the data is unstructured, meaning that it does not fit neatly into a conventional relational database. Moreover, documents, photos, audio, videos and other unstructured data can be difficult to handle and analyze. On the management and analysis side, enterprises are using tools like NoSQL databases, Hadoop, Spark, big data analytics software, business intelligence applications, artificial intelligence and machine learning to help them comb through their big data stores and find the insights their business needs.
Organizations don’t just want to store big data; they want to use it to attain business goals, including reducing operational expenses, implementing data-driven decision-making processes, innovating, responding faster, and creating new products, services and business models.
In order to develop, manage and run the applications that generate insights, organizations need to hire and retain big data professionals with the requisite skills. The demand for big data experts has therefore been growing steadily. Organizations need higher budgetary allocations for hiring people with data analytics skills and for re-skilling their non-core or surplus employees. Some organizations opt for off-the-shelf analytics solutions and/or machine learning capabilities. These tools help organizations achieve their big data goals even if they do not have many big data experts on their payroll.
Big data comes from varied sources, including enterprise applications, social media streams, email systems and employee-created documents. Relating all that data to the business and reconciling it is one of the biggest challenges, and creating reports from it can also be incredibly difficult. A variety of data integration tools are designed to make the integration process simpler.
Often within an organization, similar pieces of data come from different sources. For example, the daily sales figure can come from the ecommerce system as well as from the ERP system, and the two numbers may not match. In another instance, a hospital’s electronic health record (EHR) system may show a patient address that doesn’t match the address recorded in a partner’s system. When organizations face such challenges in data validation and data governance, they may have to invest in solutions designed to improve governance and ensure the accuracy of their big data storage systems.
People-related issues arising out of big data management cannot be discounted. These issues can be due to insufficient organizational alignment, lack of adoption and understanding among middle management, and resistance to change in the working culture of the organization.
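The sales mismatch described above can be detected with a simple reconciliation check. The following is a minimal sketch, assuming each system can export its daily totals as a mapping from date to amount. The function name reconcile_daily_sales, the tolerance parameter and the sample figures are hypothetical.

```python
from decimal import Decimal


def reconcile_daily_sales(ecommerce, erp, tolerance=Decimal("0.01")):
    """Compare per-day totals from two systems and list the disagreements."""
    mismatches = []
    # Walk every day that appears in either system.
    for day in sorted(set(ecommerce) | set(erp)):
        a = ecommerce.get(day)
        b = erp.get(day)
        if a is None or b is None:
            mismatches.append((day, a, b, "missing in one system"))
        elif abs(a - b) > tolerance:
            mismatches.append((day, a, b, "totals differ"))
    return mismatches


# Made-up daily totals exported from the two systems.
ecommerce_totals = {
    "2024-01-01": Decimal("1200.00"),
    "2024-01-02": Decimal("980.50"),
    "2024-01-03": Decimal("450.00"),
}
erp_totals = {
    "2024-01-01": Decimal("1200.00"),
    "2024-01-02": Decimal("975.50"),
}

for row in reconcile_daily_sales(ecommerce_totals, erp_totals):
    print(row)
```

Decimal is used instead of float so that currency amounts compare exactly. A check like this only flags the disagreements. Deciding which system is authoritative remains a data governance question.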
Big Data Benefits
Companies increasingly use Big Data to outperform their peers. Organizations have realized the benefits of better business insights, which can be delivered in real time. Hidden patterns within the data become visible as previously unknown correlations, which eventually reveal relevant market trends, customer preferences and other useful business information. In most industries, existing competitors and new entrants alike will use strategies derived from the analyzed data to create value.
Analytics on the relevant data helps an organization spot market trends well ahead of its competitors, giving it a competitive edge. Big data can facilitate this by, for example, scanning and analyzing social media feeds and newspaper reports. Big data also helps the organization carry out health checks on its customers, suppliers and other stakeholders so as to minimize business risks.





