A Multi-User MIMO (MU-MIMO) Access Point (AP) can obtain a capacity gain by transmitting to multiple clients simultaneously. This technique requires Channel State Information (CSI) at the transmitting AP, which uses it to set antenna gains and phases so that beamforming enables simultaneous reception. The AP must also select both the mode (the number of transmit antennas and the collective number of receive antennas) and the user set prior to transmission. While the ideal mode and user selection is a function of CSI, CSI must be estimated through an overhead-intensive channel sounding process. We design, implement, and evaluate the Pre-sounding User and Mode selection Algorithm (PUMA), a method for mode and user selection prior to channel sounding. We show that, even without CSI, PUMA (i) exploits theoretical properties of MU-MIMO system scaling with respect to mode, (ii) characterizes the relative cost of each potential mode, and (iii) estimates the per-stream transmission rate and aggregate throughput in each mode for a potential user set. Once PUMA has selected the mode and user group, the chosen protocol's channel sounding method is applied to the intended user subset only, and the transmission is carried out. We show that, on average, PUMA selects a mode and group whose aggregate rate is within 3% of the saturation throughput that would have been achieved by sounding all users (which would require significant additional overhead). Moreover, we show that PUMA obtains 30% higher aggregate throughput than the best fixed-mode policy that uses the maximum number of available transmit and receive antennas.
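To make the trade-off between mode, sounding cost, and aggregate throughput concrete, the following is a minimal illustrative sketch and not PUMA itself. It assumes single-antenna users, only the long-term average SNR of each user (no CSI), and the textbook zero-forcing scaling for i.i.d. Rayleigh channels, under which the expected per-stream SNR is multiplied by (M - K + 1) / K when M transmit antennas serve K streams. The function names, the airtime parameters, and the strongest-users-first grouping rule are hypothetical choices made only for this example.

```python
import math


def estimate_aggregate_rate(avg_snrs, m_tx, k, data_us, sounding_us_per_user):
    """Estimate aggregate spectral efficiency (bit/s/Hz) for k streams.

    Assumption: zero-forcing scaling (m_tx - k + 1) / k applied to each
    user's average linear SNR; this is a stand-in model, not PUMA's.
    """
    # Hypothetical grouping rule: serve the k users with highest average SNR.
    group = sorted(avg_snrs, reverse=True)[:k]
    scale = (m_tx - k + 1) / k
    per_stream = [math.log2(1.0 + scale * snr) for snr in group]
    # Sounding cost grows with the number of users that must be sounded.
    airtime_fraction = data_us / (data_us + k * sounding_us_per_user)
    return sum(per_stream) * airtime_fraction


def select_mode(avg_snrs, m_tx, data_us=2000.0, sounding_us_per_user=200.0):
    """Return (best_k, estimated_rate) over all feasible stream counts."""
    best_k, best_rate = None, float("-inf")
    for k in range(1, min(m_tx, len(avg_snrs)) + 1):
        rate = estimate_aggregate_rate(
            avg_snrs, m_tx, k, data_us, sounding_us_per_user
        )
        if rate > best_rate:
            best_k, best_rate = k, rate
    return best_k, best_rate


if __name__ == "__main__":
    # Four users at 30 dB and four users at about -5 dB, with a 4-antenna AP.
    print(select_mode([1000.0] * 4, m_tx=4))
    print(select_mode([0.3] * 4, m_tx=4))
```

Under these assumptions, the sketch favors using all four streams when every user has a high average SNR and falls back to a single stream when SNRs are low, which illustrates why a fixed policy that always uses the maximum number of antennas can be suboptimal.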