Today, in the second post in a series on big data and data mining, I’ll be discussing MapReduce, a strategy for handling large amounts of data quickly by exploiting the power of many computers working in parallel.

The original implementation of MapReduce, along with the name, came out of Google. MapReduce originally referred to the proprietary technology Google used to handle the huge quantities of data generated by crawling the World Wide Web. As the ideas behind the technique became better known, other implementations, such as Hadoop, were created and are now available to the world at large.
