This document discusses processing large datasets with Python and Hadoop. It begins with an example of finding the highest temperature from a climate dataset using a map-reduce approach. Next, it provides code examples for implementing map-reduce in pure Python, with Hadoop Streaming, and with the Dumbo library. The document then discusses using Amazon Elastic MapReduce for running Hadoop jobs on AWS. It poses a question about how to implement breadth-first search as a map-reduce algorithm and ends with an example of using MongoDB's map-reduce functionality.
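As a taste of the pure-Python approach mentioned above, here is a minimal sketch of finding the highest temperature per year with `map` and `reduce`; the comma-separated "year,temperature" line format and the sample `records` are assumptions made for illustration, not the document's actual climate data.

```python
from functools import reduce

# Hypothetical input lines of the form "year,temperature" (format assumed for illustration).
records = [
    "1949,78",
    "1949,111",
    "1950,0",
    "1950,22",
]

def mapper(line):
    """Turn one raw input line into a (year, temperature) pair."""
    year, temp = line.strip().split(",")
    return year, int(temp)

def reducer(maxima, pair):
    """Keep the highest temperature seen so far for each year."""
    year, temp = pair
    maxima[year] = max(maxima.get(year, temp), temp)
    return maxima

# Map every line to a pair, then reduce the pairs to one maximum per year.
highest = reduce(reducer, map(mapper, records), {})
print(highest)  # {'1949': 111, '1950': 22}
```

The same mapper and reducer logic carries over to Hadoop Streaming, where the mapper reads lines from stdin and the reducer receives the emitted key-value pairs grouped by key.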