Engineering structures often exhibit time-varying characteristics during their service life because of environmental erosion or changing working conditions. Time-varying modal identification therefore plays a crucial role in assessing their health condition. However, only the response is measurable, which makes the identification challenging. Recently, methods based on variational mode decomposition (VMD) have been applied to output-only modal identification (OMI) of time-varying structures. Nevertheless, these methods can neither obtain time-varying mode shapes nor identify overlapped or wideband modes. This paper develops a short-time method based on multivariate VMD (MVMD) to address these problems. First, long non-stationary vibration signals are divided into short chunks, within which the modes show narrowband characteristics. Then, MVMD is applied to these short-time data chunks to extract multivariate modes. Finally, the instantaneous frequencies are recovered from the center frequencies, and the mode shapes are solved from the extracted multivariate modes. A series of numerical and experimental examples demonstrates that the short-time MVMD (STMVMD) offers good noise robustness, the ability to decompose overlapped modes, and the ability to track the modes of time-varying structures.
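The three steps above can be illustrated with a minimal sketch. It is not the STMVMD algorithm: MVMD itself is not implemented, and a spectral peak and a singular value decomposition stand in for the per-chunk center frequency and mode shape of a single narrowband mode. All function names, the chunk length, the hop size, and the synthetic three-channel chirp are assumptions made for illustration.

```python
import numpy as np


def split_into_chunks(x, chunk_len, hop):
    """Split a (n_channels, n_samples) array into overlapping short-time chunks.

    Returns (chunks, starts); chunks has shape (n_chunks, n_channels, chunk_len).
    """
    n_samples = x.shape[1]
    starts = np.arange(0, n_samples - chunk_len + 1, hop)
    chunks = np.stack([x[:, s:s + chunk_len] for s in starts])
    return chunks, starts


def chunk_frequency_and_shape(chunk, fs):
    """Stand-in for MVMD on one chunk that contains a single narrowband mode.

    Frequency: peak of the channel-summed power spectrum (assumption, not an
    MVMD center frequency). Shape: dominant left singular vector of the chunk,
    scaled so that its largest-magnitude entry equals +1.
    """
    window = np.hanning(chunk.shape[1])
    spectrum = np.fft.rfft(chunk * window, axis=1)
    power = (np.abs(spectrum) ** 2).sum(axis=0)
    freqs = np.fft.rfftfreq(chunk.shape[1], d=1.0 / fs)
    f_center = freqs[np.argmax(power)]

    u, _, _ = np.linalg.svd(chunk, full_matrices=False)
    shape = u[:, 0]
    shape = shape / shape[np.argmax(np.abs(shape))]
    return f_center, shape


def track_mode(x, fs, chunk_len, hop):
    """Return chunk-center times, per-chunk frequencies, and per-chunk shapes."""
    chunks, starts = split_into_chunks(x, chunk_len, hop)
    results = [chunk_frequency_and_shape(c, fs) for c in chunks]
    centers = (starts + chunk_len / 2) / fs
    freqs = np.array([r[0] for r in results])
    shapes = np.array([r[1] for r in results])
    return centers, freqs, shapes


# Synthetic example (assumed): one mode sweeping from 10 Hz to 15 Hz in 20 s,
# observed on three channels with a fixed mode shape.
fs = 200.0
t = np.arange(0, 20, 1 / fs)
f_inst = 10 + 0.25 * t
phase = 2 * np.pi * np.cumsum(f_inst) / fs
true_shape = np.array([1.0, 0.6, -0.3])
x = np.outer(true_shape, np.cos(phase))

centers, freqs, shapes = track_mode(x, fs, chunk_len=400, hop=200)
```

In this noise-free, single-mode setting the recovered shape matches the assumed one in every chunk, and the per-chunk frequency follows the sweep to within the 0.5 Hz spectral resolution of a 2 s chunk. Overlapped modes are the case where such a simple stand-in would be expected to fail and a multivariate decomposition becomes necessary.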