Coded caching can create coded multicasting and, thus, significantly accelerates content delivery in broadcast channels with receiver caches. While the original delivery scheme in coded caching multicasts each coded message sequentially, it is not optimal for multiple-input multiple-output (MIMO) broadcast channels. This paper aims to investigate the full spatial multiplexing gain in multi-antenna coded caching by concurrently transmitting all coded messages. In specific, we propose to treat the content delivery as the transmission problem with general message sets where all possible messages are present, each with different length and intended for different user set. We first obtain inner and outer bounds of the degrees of freedom (DoF) region of a K-user (M, N) broadcast channel with general message sets, with M and N being the number of transmit and receive antennas, respectively. Then for any given set of coded messages, we find its minimum normalized delivery time (NDT) by searching the optimal DoF tuple in the DoF regions. The obtained minimum NDT is optimal at antenna configuration M/N is an element of (0, 1] boolean OR [ K, infinity) and is within a multiplicative gap of M/N to optimum at M/N is an element of (1, K). Our NDT results can be evaluated for any user demand with both centralized and decentralized cache placements.