
Understanding the 5 Vs of Big Data

Twenty years ago, the phrase “Big Data” was unknown to most business leaders, but today it is all but impossible to run an organization successfully without some understanding of data science. Fortunately, today’s executives can take a data science course to learn the field’s fundamentals, including the five Vs, which are the main characteristics of Big Data:

Volume

There is a reason that the name Big Data includes the adjective “big.” Big Data tends to involve data in overwhelming volumes, much more than an individual business leader could manage without considerable support. In 2012, at the beginning of the Big Data boom, companies were capable of collecting over 3 million megabytes of data every day, and that figure has doubled about every 40 months. Today, internet users generate more than 1.145 trillion megabytes of data per day, though some estimates put the figure dramatically higher, closer to 7.7 decillion megabytes per day, once dark data (data that is collected but never analyzed) is counted. If companies collect even a fraction of all that data, they have a sizeable amount of information to use in decision-making. Volume, then, is the defining trait of Big Data: the more relevant data a business holds, the more it has to draw on.
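
To make the "doubles about every 40 months" claim concrete, here is a minimal Python sketch of the arithmetic. The function name and the time spans are illustrative assumptions, and the starting value is simply the 2012 figure quoted above. The other figures in this section come from different estimates, so they should not be expected to line up under this simple model.

```python
def projected_daily_volume(base_mb_per_day: float,
                           months_elapsed: float,
                           doubling_months: float = 40.0) -> float:
    """Project a daily data volume that doubles every `doubling_months` months."""
    return base_mb_per_day * 2 ** (months_elapsed / doubling_months)


# Starting point: the 2012 figure quoted in the text (megabytes per day).
base = 3_000_000

# Print the projection at a few illustrative points in time.
for years in (0, 5, 10):
    volume = projected_daily_volume(base, years * 12)
    print(f"{years:>2} years on: {volume:,.0f} MB/day")
```

The point of the sketch is the shape of the curve rather than the exact numbers: under steady doubling, each additional 40 months adds as much data as all the growth before it.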

Velocity

Velocity refers to the speed of data accumulation, or how fast data can flow from its sources (e.g., social media, customer surveys, third-party cookies) into business databases. Old data is of little use to companies. These days, trends can shift in a matter of hours, so businesses need to be sure that the data they rely on is as close to real-time as possible. This is why the velocity of Big Data matters: the faster data moves, the sooner businesses can make pertinent decisions, and the more competitive they can be in their market. Many data-driven business leaders believe that data velocity is more important than volume, especially because too much data can slow decision-making to a crawl if a business doesn’t have systems in place to assist with data management and analysis.

Veracity

Data isn’t always accurate, and even when it is accurate, it isn’t always interpreted correctly. It is possible to search a data set for patterns until something turns up, and to identify all manner of patterns that aren’t relevant or true. This phenomenon, called data dredging, data butchery or p-hacking depending on the field, isn’t helpful to business leaders, who need to know that their decisions are informed by high-quality, reliable data and analysis. Hence the importance of veracity, which is a combination of the quality, consistency and certainty of Big Data.
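
The risk of data dredging can be illustrated with a small simulation. The Python sketch below is an assumption-laden toy rather than a description of any real analysis: it generates a target series and many candidate "features" that are all pure random noise, then reports the strongest correlation it can find. The function names, sample size and feature count are hypothetical choices made for illustration.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def strongest_spurious_correlation(n_samples=20, n_features=200, seed=0):
    """Return (best, average) absolute correlation between noise and noise."""
    rng = random.Random(seed)
    # The "outcome" we pretend to explain is itself just random noise.
    target = [rng.gauss(0, 1) for _ in range(n_samples)]
    correlations = []
    for _ in range(n_features):
        # Each candidate feature is independent noise, unrelated to the target.
        feature = [rng.gauss(0, 1) for _ in range(n_samples)]
        correlations.append(abs(pearson(feature, target)))
    return max(correlations), sum(correlations) / len(correlations)


best, average = strongest_spurious_correlation()
print(f"average |r| = {average:.2f}, strongest |r| = {best:.2f}")
```

Because none of the features has any real relationship to the target, whatever "strongest" pattern the search finds is an artifact of trying many candidates on a small sample, which is exactly the kind of result veracity checks are meant to catch.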

Variety

There are three types of data: structured, semi-structured and unstructured. Structured data is clearly defined; it is easy to search and analyze thanks to a predefined structure, which makes it an advantageous form for businesses. Unstructured data is the exact opposite: data stored in a variety of native formats that require extra effort for data scientists to parse. Semi-structured data falls somewhere between the two. More than 80 percent of enterprise data is unstructured, so businesses that analyze only their structured data are leaving most of their information untapped. Different data sources produce different types of data, and it is useful for companies to collect a variety of data to ensure they are seeing the complete picture.
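
The difference between the three types is easiest to see side by side. The Python sketch below pulls the same order total out of a structured, a semi-structured and an unstructured source. All of the sample records, field names and the customer message are invented for illustration.

```python
import csv
import io
import json
import re

# Structured: fixed columns, so a field is simply addressed by name.
structured = "order_id,customer,total\n1001,Ana,19.50\n"
rows = list(csv.DictReader(io.StringIO(structured)))
structured_total = float(rows[0]["total"])

# Semi-structured: self-describing keys, but fields can be nested or missing.
semi_structured = (
    '{"order_id": 1001, "customer": {"name": "Ana"}, '
    '"items": [{"price": 19.5}]}'
)
record = json.loads(semi_structured)
semi_total = sum(item["price"] for item in record.get("items", []))

# Unstructured: free text, so the value must be found by pattern matching.
unstructured = "Hi, this is Ana. I was charged $19.50 for order 1001 and it arrived late."
match = re.search(r"\$(\d+(?:\.\d+)?)", unstructured)
unstructured_total = float(match.group(1)) if match else None

print(structured_total, semi_total, unstructured_total)
```

The structured lookup is a single named field, the semi-structured one needs some knowledge of the nesting, and the unstructured one depends on a fragile pattern that would miss a differently worded message. That growing effort is why unstructured data so often goes unused.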

Value

Value refers to the usability of data to a business. In and of itself, data is usually not particularly helpful to business leaders, especially those who are untrained in data science and unequipped to use data in their decision-making. Only when data is converted into a useful form through analysis does Big Data gain value. However, businesses also need to be certain that they aren’t wasting their time collecting and analyzing data that will never be valuable to decision-makers; for example, a tea company probably doesn’t need to know the color of shoes worn by its customers. Value is perhaps the most important V and the one that most business leaders get wrong in their excitement to collect and harness Big Data.

Businesses need Big Data to compete in the 21st century. By paying attention to the five Vs, business leaders can collect more, better data to guide their decision-making and benefit their business.
