This document discusses data mining and knowledge discovery. It defines data mining as the process of extracting implicit, potentially useful information from large datasets. Knowledge discovery is the broader process of which mining is one step; it comprises data cleaning, integration, selection, transformation, mining, pattern evaluation, and knowledge representation. The document gives examples of the large volumes of data collected from sources such as business transactions, scientific data, medical records, videos, games, text, and the World Wide Web. It also describes how data mining can be applied to different data sources and formats, including flat files, relational databases, and data warehouses.
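
To make the knowledge discovery steps concrete, the following is a minimal sketch of the seven stages applied to a toy dataset. Everything in it is an assumption made for illustration and does not come from the document: the purchase records in `STORE_A` and `STORE_B`, the function names, the choice of frequent item pairs as the mining task, and the `min_support` threshold.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: purchase records from two separate sources.
STORE_A = [
    {"id": 1, "items": "bread,milk"},
    {"id": 2, "items": "bread,butter,milk"},
    {"id": 3, "items": None},  # missing value, dropped during cleaning
]
STORE_B = [
    {"id": 4, "items": "butter,milk"},
    {"id": 5, "items": "bread,milk"},
]


def clean(records):
    """Data cleaning: drop records whose item list is missing or empty."""
    return [r for r in records if r.get("items")]


def integrate(*sources):
    """Data integration: combine several sources into one collection."""
    return [r for source in sources for r in source]


def select(records):
    """Data selection: keep only the field relevant to the analysis."""
    return [r["items"] for r in records]


def transform(item_strings):
    """Data transformation: turn comma-separated strings into item sets."""
    return [frozenset(s.split(",")) for s in item_strings]


def mine(transactions):
    """Data mining: count how often each pair of items occurs together."""
    counts = Counter()
    for t in transactions:
        counts.update(combinations(sorted(t), 2))
    return counts


def evaluate(pair_counts, n_transactions, min_support):
    """Pattern evaluation: keep pairs whose support meets the threshold."""
    return {
        pair: count / n_transactions
        for pair, count in pair_counts.items()
        if count / n_transactions >= min_support
    }


def represent(patterns):
    """Knowledge representation: render patterns as readable lines."""
    ordered = sorted(patterns.items(), key=lambda kv: (-kv[1], kv[0]))
    return [f"{a} & {b}: support {s:.2f}" for (a, b), s in ordered]


def run_pipeline(min_support=0.5):
    records = integrate(clean(STORE_A), clean(STORE_B))
    transactions = transform(select(records))
    patterns = evaluate(mine(transactions), len(transactions), min_support)
    return represent(patterns)


if __name__ == "__main__":
    for line in run_pipeline():
        print(line)
```

With the toy data above, four records survive cleaning, and two item pairs meet the default threshold of 0.5. A real pipeline would read from flat files, a relational database, or a data warehouse rather than in-memory lists, and would likely use a more scalable mining algorithm than exhaustive pair counting.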