In recent years, device-free passive localization based on Wi-Fi channel state information (CSI) has emerged as a prominent technique for indoor positioning. However, nonlinear interactions and signal superposition among multiple targets, together with occlusion and shadowing effects, make multitarget device-free passive localization a substantial challenge. In this article, we propose a device-free passive multitarget indoor localization approach based on multilabel learning (MLL) and unsupervised domain adaptation, denoted MLDA-MultiLoc. MLDA-MultiLoc segments the localization area into multiple training-point regions, reformulating the multitarget problem as a multilabel classification task. It employs a fusion representation model that exploits the spatio-temporal redundancy of CSI amplitude and phase, mapping these features into a unified representation domain. This model is optimized to enhance the discriminative power of the fusion fingerprint (HDFF) by maximizing spatial metrics. Because multiple targets influence CSI nonlinearly, MLDA-MultiLoc incorporates a fusion generation network that synthesizes multitarget fingerprints from multiple single-target fingerprints, creating virtual samples for multitarget scenarios. These virtual samples are used to train a deep learning-based multilabel classifier, whose parameters are optimized with MLL. Furthermore, MLDA-MultiLoc introduces an unsupervised domain adaptation technique built on a meta-learning dual-stream structure. This technique bridges the gap between virtual and real fingerprint samples, enabling accurate multitarget localization in complex, dynamic indoor settings. Extensive experiments in real-world indoor environments show that MLDA-MultiLoc outperforms existing state-of-the-art systems.
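
To make the multilabel reformulation concrete, the following minimal sketch shows how a set of target positions could be encoded as a multi-hot label over grid regions, and how per-region classifier scores could be decoded back into occupied regions. It is an illustration only, not the MLDA-MultiLoc implementation: the rectangular grid layout, the function names (`region_index`, `multi_hot_label`, `decode`), the area dimensions, and the 0.5 decision threshold are all assumptions introduced here.

```python
def region_index(x, y, area_w, area_h, n_cols, n_rows):
    """Map a position (in metres) to the index of the grid region containing it.

    Assumes a rectangular area split into n_rows x n_cols equal regions
    (a hypothetical layout chosen for illustration).
    """
    col = min(int(x / area_w * n_cols), n_cols - 1)
    row = min(int(y / area_h * n_rows), n_rows - 1)
    return row * n_cols + col


def multi_hot_label(targets, area_w, area_h, n_cols, n_rows):
    """Encode any number of target positions as one multi-hot label vector."""
    label = [0] * (n_rows * n_cols)
    for x, y in targets:
        label[region_index(x, y, area_w, area_h, n_cols, n_rows)] = 1
    return label


def decode(scores, threshold=0.5):
    """Return the regions whose score reaches the (assumed) threshold."""
    return [i for i, s in enumerate(scores) if s >= threshold]


# Two targets in a 4 m x 2 m area divided into 2 rows x 4 columns.
label = multi_hot_label([(0.5, 0.5), (3.2, 1.7)],
                        area_w=4.0, area_h=2.0, n_cols=4, n_rows=2)
occupied = decode(label)
```

Under this encoding the number of targets does not need to be known in advance: a classifier outputs one independent score per region, and every region whose score passes the threshold is reported as occupied.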