Weakly supervised object detection (WSOD) has attracted significant attention in recent years because it trains object detectors from image-level annotations alone, greatly reducing the labor and financial cost of fine-grained labeling. However, the absence of instance-level annotations leads to two failure modes: detections that cover only part of an object (partial regions) and objects that are not detected at all (missing instances). We attribute these mainly to two issues: 1) noisy instances in the training samples confuse the detector, and 2) global salient information is missing, so low-confidence regions receive little attention. To address both issues, we propose an instance dual-optimization framework called IDO. First, we propose an instance-wise selection strategy (IWSS) based on curriculum learning, which denoises instances and improves the robustness of the model. Second, we design CAM-generated spatial attention (CGSA) to optimize instance features. Without introducing additional hyperparameters, CGSA supplements regions of low class confidence with global salient information, which helps the model cover a more complete region of each target and identify targets that would otherwise be neglected. Finally, we empirically demonstrate that our method achieves results comparable to those of state-of-the-art methods on PASCAL VOC 2007, PASCAL VOC 2012, and MS COCO.
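To make the idea of curriculum-based instance selection concrete, the following is a minimal sketch and not the IWSS procedure itself, whose details are not given here. It assumes each candidate instance has a scalar confidence score, treats high-scoring instances as "easy", and admits a linearly growing fraction of instances as training progresses. The function name `select_instances`, the linear schedule, and the `min_keep` parameter are all hypothetical choices made for illustration.

```python
import math


def select_instances(scores, epoch, total_epochs, min_keep=1):
    """Return the indices of instances kept at a given epoch.

    Hypothetical curriculum schedule (an assumption, not the paper's IWSS):
    the kept fraction grows linearly from 1/total_epochs to 1.0, and the
    highest-scoring ("easiest") instances are admitted first.
    """
    if total_epochs <= 1:
        fraction = 1.0
    else:
        fraction = min(1.0, (epoch + 1) / total_epochs)

    # Number of instances to keep, bounded by min_keep and the pool size.
    keep = max(min_keep, math.ceil(fraction * len(scores)))
    keep = min(keep, len(scores))

    # Rank instances from most to least confident, keep the top ones,
    # and return their indices in the original order.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:keep])


if __name__ == "__main__":
    proposal_scores = [0.9, 0.2, 0.6, 0.4]
    for epoch in range(4):
        print(epoch, select_instances(proposal_scores, epoch, total_epochs=4))
```

Under this assumed schedule, early epochs train only on the most confident instances, and lower-confidence (potentially noisy) instances enter later, once the detector is more stable.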