Federated Learning (FL) has emerged as a promising collaborative learning paradigm that enables training machine learning models across decentralized devices while keeping the training data local to preserve user privacy. However, the heterogeneity of both decentralized training data and distributed computing resources poses significant challenges to the design of effective and efficient FL schemes. Most existing solutions either tackle only a single type of heterogeneity, or cannot fully support model heterogeneity with low communication overhead, fast convergence, and good interpretability. In this paper, we present CloREF, a novel rule-based collaborative learning framework that allows devices in FL to use completely different local learning models, catering to both data and resource heterogeneity. In CloREF, each rule is represented as a linear model, which provides good interpretability. Each participating device chooses a local model and trains it on its local data. The decision boundary of each trained local model is then approximated by a set of rules, which effectively bridges the gap arising from model heterogeneity. All participating devices then collaborate, via evolutionary optimization, to select an optimal set of rules as the global model, effectively fusing the knowledge acquired by all local models. Experimental results on both synthetic and real-world datasets demonstrate that the rules generated by our method mimic the behaviors of various learning models with high fidelity ($>0.95$ in most tests), and that CloREF achieves competitive accuracy, AUC, and communication overhead compared with both the best-performing model trained centrally and several state-of-the-art model-heterogeneous federated learning schemes.