Big Data has become a buzzword in the IT industry. Everyone talks about it, often to impress others without really knowing what it means, and it is frequently used out of context as a marketing gimmick. This article aims to explain what Big Data really is and how it can be useful in solving problems. IBM data scientists break Big Data into four dimensions: volume, variety, velocity and veracity. But there are more aspects than these, and Big Data can be described by several characteristics.
Physics and mathematics can give us the exact distance from the East Coast of the USA to the West Coast, accurate to about a yard. This is a phenomenal achievement and has been applied to various technologies in our daily lives. But the challenge arises when the data is not static: it changes constantly, at a rate and in volumes too enormous to determine in real time. The only way we can process such data is with computers.
Volume is the size of the data; it determines the value and potential of the data under consideration, and whether it can actually be considered Big Data at all. Variety means the category to which the data belongs, an essential fact for the analysts working closely with it: knowing the variety lets them use the data effectively to their advantage and so preserves its value. Velocity refers to how fast the data is generated and processed in order to be useful.
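The link between velocity and volume can be made concrete with a back-of-the-envelope calculation. The figures below (event rate, record size) are hypothetical, chosen purely for illustration:

```python
# Sketch: how quickly velocity (records per second) turns into volume.
# All figures here are hypothetical, chosen only for illustration.

EVENTS_PER_SECOND = 50_000   # velocity: records arriving each second
BYTES_PER_EVENT = 2_000      # average size of one record
SECONDS_PER_DAY = 86_400

bytes_per_day = EVENTS_PER_SECOND * BYTES_PER_EVENT * SECONDS_PER_DAY
terabytes_per_day = bytes_per_day / 1e12

print(f"Daily volume: {terabytes_per_day:.2f} TB")  # 8.64 TB at these rates
```

Even a modest per-record size adds up to terabytes per day at a high enough velocity, which is exactly when traditional single-machine processing stops being an option.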
“One analogy for Big Data analysis is to compare your data to a large lake… Trying to get an accurate size of this lake down to the last gallon or ounce is virtually impossible… Now let’s assume that you have built a big water counting machine… You feed all of the water in the lake through your big water counting machine, and it tells you the number of ounces of water in the lake… for that point in time.”
One of the major reasons we need Big Data is prediction and analysis. One of the best examples of Big Data in action is the Large Hadron Collider experiment, in which about 150 million sensors deliver data 40 million times per second. After filtering out and declining to record more than 99.999% of these streams, about 100 collisions of interest per second remain. Another important example is Facebook, which handles over 50 billion user photos.
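The LHC filtering figures above can be sanity-checked with simple arithmetic. The exact discard fraction is an assumption on my part; the text only says "more than 99.999%", and 99.99975% is one value consistent with that:

```python
# Sanity check of the LHC filtering numbers quoted above.
# The exact discard fraction is an assumption (the text says only
# "more than 99.999%"); 99.99975% is one value consistent with it.

COLLISION_EVENTS_PER_SECOND = 40_000_000  # data arrives 40 million times/s
DISCARD_FRACTION = 0.9999975              # assumed filter rate

kept_per_second = COLLISION_EVENTS_PER_SECOND * (1 - DISCARD_FRACTION)
print(round(kept_per_second))  # roughly 100 events of interest per second
```

The point of the exercise is the scale of the reduction: even after throwing away all but a few parts per million of the stream, a meaningful number of events per second still has to be stored and analyzed.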