We investigate the advantages and disadvantages of different loopback buffer architectures for optical switches and compare their performance via simulation. The simulation results show that, even without virtual output queuing, head-of-line blocking can be alleviated by wavelength parallelism when each separate queue in a loopback buffer has multiple transmitters. Furthermore, the proposed two-level flow control eliminates packet drops at the switch, resolves the rate mismatch due to output queuing at the switch outputs, and ensures that congestion at a hotspot port does not affect the performance of non-congested ports.