Accurate short-term precipitation prediction at high spatial resolution is crucial for effective urban water management, flood warning, and mitigation. However, conventional numerical weather models often suffer from systematic errors and spatiotemporal biases caused by an inadequate understanding of many processes and by unrealistic parameterizations. In recent years, deep learning techniques have gained popularity as tools for precipitation forecasting and early risk warning. To support deep learning for precipitation forecasting and flood warning, this paper introduces a large-scale multimodal Geo dataset. The dataset incorporates spatially connected features and real-world climate data, enabling the prediction of extreme precipitation. It comprises Multi-Radar/Multi-Sensor System (MRMS), High-Resolution Rapid Refresh (HRRR), and Geostationary Operational Environmental Satellite (GOES) data, together with local hydrological data from the United States Geological Survey (USGS), providing a diverse array of information sources. Combining these multi-source data within the proposed multimodal Geo scope can improve prediction accuracy over uni-modal data and yields high accuracy in predicting heavy rain when paired with a Transformer model, which offers the opportunity for more efficient urban water management and improved disaster response strategies. By providing a comprehensive view of environmental conditions, the dataset enables a deeper understanding of precipitation patterns and facilitates effective mitigation efforts.