Existing research on zero-shot learning focuses mainly on the mapping between the visual features of images and the semantic features of each class. However, the features themselves strongly affect the final classification result. In particular, the semantic representations of similar categories can lie close to one another, so the distinction among those categories is not clear-cut. In addition, features become redundant when the categories cover a wider span. Therefore, to obtain more discriminative and finer-grained semantic features, this paper proposes a model based on a correlated dual autoencoder framework. Although one autoencoder is built for visual features and the other for semantic features, the two are coupled rather than independent. Specifically, the visual features are encoded and added to the encoded semantic features, and the sum is then decoded. Finally, the decoded semantic features are added to the original semantic features to complete and enrich them. Experiments were carried out on the AwA and CUB datasets, and higher accuracy was achieved in the final classification.
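The data flow described above (encode both feature types, add the two codes, decode the sum, then add the result to the original semantic features) can be illustrated with a minimal NumPy sketch. This is not the implementation used in the paper: the linear layers, the tanh activation, the dimensions, the random untrained weights, and all names (`encode`, `complete_semantics`, `w_visual`, `w_semantic`, `w_decode`) are assumptions made only for illustration. The sketch also assumes that each row pairs an image's visual feature with the semantic vector of its class.

```python
import numpy as np


def encode(features, weights):
    """Hypothetical one-layer encoder: linear map followed by tanh."""
    return np.tanh(features @ weights)


def complete_semantics(visual, semantic, w_visual, w_semantic, w_decode):
    """Sketch of the coupled forward pass described in the abstract."""
    h_visual = encode(visual, w_visual)        # encoded visual features
    h_semantic = encode(semantic, w_semantic)  # encoded semantic features
    fused = h_visual + h_semantic              # the two codes are added
    decoded = fused @ w_decode                 # decode the sum into semantic space
    return semantic + decoded                  # add to the original semantic features


# Toy dimensions and random, untrained weights (assumptions for illustration only).
rng = np.random.default_rng(0)
n_samples, d_visual, d_semantic, d_latent = 5, 8, 4, 3

visual = rng.normal(size=(n_samples, d_visual))
semantic = rng.normal(size=(n_samples, d_semantic))
w_visual = rng.normal(scale=0.1, size=(d_visual, d_latent))
w_semantic = rng.normal(scale=0.1, size=(d_semantic, d_latent))
w_decode = rng.normal(scale=0.1, size=(d_latent, d_semantic))

completed = complete_semantics(visual, semantic, w_visual, w_semantic, w_decode)
print(completed.shape)  # (5, 4): same shape as the original semantic features
```

Because the two encoders share one latent space and one decoder, the visual code directly shapes the decoded semantic features, which is one way to read the statement that the autoencoders are not independent. Training objectives and the final classifier are outside the scope of this sketch.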