With the rapid advancement of wearable devices and Internet of Things (IoT) technologies, the volume of sensor data generated by edge devices has surged. These data are crucial for advancing IoT applications, including health status monitoring, abnormal behavior detection, and environmental monitoring. However, traditional centralized learning requires uploading data to a central server, raising security and privacy concerns and hindering the practical use of the data. Federated learning (FL) offers a solution by enabling collaborative model training on IoT devices without transferring raw data off the local device. In practice, however, the data generated by edge devices are often highly heterogeneous, making it difficult for a single global FL model to accurately capture local data distributions and leading to significant performance degradation. In addition, uneven edge device resources and limited bandwidth can cause transmission delays or interruptions, undermining deployment feasibility. To address these issues, we propose PFedKD, a novel personalized FL algorithm based on knowledge distillation that enhances model generalization and reduces communication overhead in heterogeneous IoT data environments. PFedKD constructs a public dataset of unlabeled pseudo data to distill knowledge from each client and trains personalized models that fit local data distributions, keeping the public dataset small while improving performance. During communication, only logits and class prototypes are transmitted, ensuring high communication efficiency. Sharpness-aware minimization (SAM) is introduced into local model training to improve generalization. Additionally, we design a weighting mechanism based on evaluating client sample quality, which improves knowledge aggregation and model personalization. Extensive experiments demonstrate that PFedKD significantly outperforms state-of-the-art baselines in both learning performance and communication efficiency.
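To make the local-training component concrete, the following is a minimal PyTorch sketch of one sharpness-aware minimization update in the sense of Foret et al. (2021): ascend to the locally worst-case weights within an L2 ball of radius `rho`, compute the gradient there, then update the original weights. The function name `sam_step`, the default `rho`, and the surrounding loop are illustrative assumptions, not the paper's implementation.

```python
import torch

def sam_step(model, loss_fn, x, y, optimizer, rho=0.05):
    """Illustrative SAM update (assumed setup, not PFedKD's exact variant)."""
    optimizer.zero_grad()
    # First forward/backward pass: gradient at the current weights w
    loss = loss_fn(model(x), y)
    loss.backward()

    # Ascent step: perturb weights toward the locally worst-case direction
    with torch.no_grad():
        grads = [p.grad for p in model.parameters() if p.grad is not None]
        grad_norm = torch.norm(torch.stack([g.norm(p=2) for g in grads]), p=2)
        scale = rho / (grad_norm + 1e-12)
        eps = []
        for p in model.parameters():
            if p.grad is None:
                eps.append(None)
                continue
            e = p.grad * scale
            p.add_(e)            # w -> w + eps (worst-case neighbor)
            eps.append(e)

    # Second pass: gradient of the loss at the perturbed weights w + eps
    optimizer.zero_grad()
    loss_fn(model(x), y).backward()

    # Restore the original weights, then apply the SAM gradient
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            if e is not None:
                p.sub_(e)        # w + eps -> w
    optimizer.step()
    return loss.item()
```

In a client's local loop, such a step would replace the plain `loss.backward(); optimizer.step()` update (e.g., `sam_step(model, torch.nn.functional.cross_entropy, x, y, optimizer)` per batch); seeking flat minima in this way is what the abstract refers to as optimizing generalization during local training.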