Multimodal representation learning has attracted significant attention across various fields, yet it faces challenges when modalities are missing in real-world applications. Existing solutions are confined to specific scenarios, such as a single missing modality or modalities missing only at test time, which restricts their applicability. To address the more general scenario of uncertain missing modalities in both training and testing, we propose RMRU, a unified framework that projects each modality's representation into a shared subspace, enabling any missing modality to be reconstructed within a single model. We further design an interaction refinement module that uses cross-modal attention to enhance these reconstructions, which is particularly beneficial when complete-modality data is limited. In addition, we introduce an iterative training strategy that alternately trains different modules to effectively exploit both complete and incomplete modality data. Experimental results on four benchmark datasets demonstrate the superiority of RMRU over existing baselines, particularly under high missing-modality rates. Notably, RMRU can be broadly applied to diverse scenarios, regardless of modality types and quantities.
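The following is a minimal PyTorch sketch of the core idea described above: projecting each available modality into a shared subspace and reconstructing a missing modality via cross-modal attention. The layer sizes, module names, single learnable query per modality, and multi-head attention configuration are illustrative assumptions, not the paper's actual RMRU architecture.

```python
import torch
import torch.nn as nn


class SharedSubspaceReconstructor(nn.Module):
    """Illustrative sketch: shared-subspace projection plus cross-modal
    attention for reconstructing missing modalities (assumed design)."""

    def __init__(self, modality_dims, shared_dim=128):
        super().__init__()
        # One projector per modality maps raw features into the shared subspace.
        self.projectors = nn.ModuleDict(
            {name: nn.Linear(dim, shared_dim) for name, dim in modality_dims.items()}
        )
        # Cross-modal attention refines the estimate of a missing modality
        # using the representations of the available ones.
        self.attn = nn.MultiheadAttention(shared_dim, num_heads=4, batch_first=True)
        # Learnable query tokens stand in for each potentially missing modality.
        self.queries = nn.ParameterDict(
            {name: nn.Parameter(torch.randn(1, 1, shared_dim)) for name in modality_dims}
        )

    def forward(self, available):
        """available: dict of modality name -> (batch, dim) feature tensors."""
        shared = {m: self.projectors[m](x) for m, x in available.items()}
        context = torch.stack(list(shared.values()), dim=1)  # (batch, n_avail, shared_dim)
        reconstructed = {}
        for m in self.projectors:
            if m in shared:
                continue  # modality is present; nothing to reconstruct
            query = self.queries[m].expand(context.size(0), -1, -1)
            out, _ = self.attn(query, context, context)
            reconstructed[m] = out.squeeze(1)
        return shared, reconstructed


# Example usage with the text modality missing (feature dimensions are arbitrary).
model = SharedSubspaceReconstructor({"audio": 74, "vision": 35, "text": 300})
batch = {"audio": torch.randn(8, 74), "vision": torch.randn(8, 35)}
shared, recon = model(batch)
print(recon["text"].shape)  # torch.Size([8, 128])
```

Because every modality is mapped into the same subspace, the same attention module can reconstruct any subset of missing modalities, which is what allows a single unified model to cover arbitrary missing-modality patterns in both training and testing.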