MIST 7500: Big Data – Executive Summary 1

We are in the midst of a data revolution. By some widely cited estimates, 90% of the data in the world today was created in the last two years. The most successful IT companies of the past decade, such as Facebook, Google, and Amazon, are organizations that have prioritized the collection and management of data to improve both their products and their relationships with their customers.

The topic of this Executive Summary is the technology surrounding Big Data. Our company has been collecting transactional data since its inception, but there are other kinds of information that we should be storing, at volumes far beyond what our current systems can handle. These include customer feedback, data from social networks such as Twitter and Facebook, server logs, and transaction logs.

How is this going to help us? Take social networking data as an example: how helpful would it be for our marketing department to know the demographics behind certain products, so that we could develop materials targeted at a particular geographic location?

Would it be advantageous if we could mobilize our support group as soon as we are alerted that customers are posting on Twitter about a bug in one of our products?

Would we improve employee satisfaction if we could schedule shifts and vacation time based on data from server logs? What if we could map out the periods of high transaction volume within a year and rotate our staff accordingly, so that everyone is able to use their vacation time?

Our current relational databases serve our transactional data well, but data such as the above is better stored and analyzed with a different kind of system. This is where big data technologies such as Apache Hadoop come in.

Hadoop is an open-source implementation of MapReduce, the same programming model that helped Google succeed in analyzing its search data. Hadoop allows distributed computations on very large data sets by splitting the work across many machines. The larger the data set, the greater this technology's advantage over relational systems. Notable companies such as Yahoo, Amazon, HP and IBM are already using Hadoop to run large distributed computations.
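
To make the MapReduce idea concrete, here is a minimal single-machine sketch in plain Python. It is not Hadoop code and does not use any Hadoop API; it only imitates the three stages (map, shuffle, reduce) that a Hadoop job would spread across many machines. The product names and sample posts are hypothetical, chosen to mirror the social media example above.

```python
from collections import defaultdict

# Hypothetical product names, used only for illustration.
PRODUCTS = {"widget", "gadget"}


def map_post(post):
    """Map step: emit a (product, 1) pair for each product mention in one post."""
    return [(word, 1) for word in post.lower().split() if word in PRODUCTS]


def shuffle(pairs):
    """Shuffle step: group the emitted values by key (here, by product name)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(grouped):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def count_mentions(posts):
    """Run map, shuffle, and reduce in sequence on a list of posts."""
    pairs = [pair for post in posts for pair in map_post(post)]
    return reduce_counts(shuffle(pairs))


# Made-up sample posts standing in for social media data.
posts = [
    "the widget crashed again",
    "love my new gadget",
    "widget bug after update",
]
print(count_mentions(posts))  # {'widget': 2, 'gadget': 1}
```

Because each post is mapped independently and each product is reduced independently, both steps can run in parallel on separate machines, which is the property Hadoop exploits at scale.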

In closing, our company will need to be at the forefront of Big Data technology if we want to be successful. Harnessing big data will open up many opportunities for us to improve our product catalog, customer satisfaction, and employee retention.
