This paper introduces techniques for data cleaning and data extraction. It reviews the current state of domestic and international research in both areas and considers their future development. For data cleaning, it explains the basic principles, the framework models, the need for and objectives of cleaning, the testing methods, and the cleaning tools. For data extraction, it introduces static data capture, log file capture, database generator capture, date and time capture, file comparison capture, and source application capture. Finally, it compares the advantages and disadvantages of these extraction techniques and discusses which to use in real-life situations.