Speech recognition technology is spreading with personal digital assistants such as smartphones. However, the recognition rate declines in places with multiple voices and considerable noise. We have therefore been studying a lip-reading technique that recognizes the content of an utterance from images of lip movement. Based on this research, we created a database of utterances by Japanese television announcers and English teachers for use in utterance training in Japanese and English. Applying the technology we developed, we also propose a method of utterance training that uses specific equipment. First, we compared the student's utterance with data in the lip movement database. Second, we evaluated the effectiveness of the utterance training equipment.