Person re-identification is a vital module of the tracking-by-detection framework for online multi-object tracking. Despite recent advances in both multi-object tracking and person re-identification, little attention has been given to integrating these technologies into a robust multi-object tracker. In this work, we integrate modern state-of-the-art re-identification models and modeling techniques into the basic tracking-by-detection framework and benchmark them on heavily occluded scenes to understand their effect. We hypothesize that temporal modeling is crucial for training robust re-identification models, because such models are conditioned on sequences that contain occlusions. Alongside traditional image-based re-identification methods, we analyze temporal modeling methods used in video-based re-identification tasks. We also train re-identification models with different embedding methods, including triplet loss, and analyze their effect. We benchmark the re-identification models on the challenging MOT20 dataset, which contains crowded scenes with various occlusions. We provide a thorough assessment of modern re-identification modeling methods and demonstrate that they are effective for multi-object tracking. Compared to baseline methods, these models provide more robust re-identification, as evidenced by fewer identity switches and improvements in MOTA, IDF1, and other metrics.
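For readers unfamiliar with the quantities named above, the following is a minimal sketch of the triplet loss and of the MOTA and IDF1 metrics in their commonly used forms. It is illustrative only and is not the implementation used in this work: the function names, the margin value of 0.3, the choice of Euclidean distance, and the example counts are all assumptions made for this sketch.

```python
import math


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Hinge-style triplet loss on Euclidean distances between embeddings.

    The margin of 0.3 is an assumed placeholder, not a value from this work.
    """
    d_ap = math.dist(anchor, positive)  # anchor to same-identity sample
    d_an = math.dist(anchor, negative)  # anchor to different-identity sample
    return max(0.0, d_ap - d_an + margin)


def mota(false_negatives, false_positives, id_switches, num_gt):
    """MOTA = 1 - (FN + FP + IDSW) / GT, with counts summed over all frames."""
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


def idf1(idtp, idfp, idfn):
    """IDF1 = 2 * IDTP / (2 * IDTP + IDFP + IDFN)."""
    return 2 * idtp / (2 * idtp + idfp + idfn)


if __name__ == "__main__":
    # Made-up embeddings and counts, for illustration only.
    print(triplet_loss([0.0, 0.0], [0.1, 0.0], [1.0, 1.0]))
    print(mota(false_negatives=10, false_positives=5, id_switches=5, num_gt=100))
    print(idf1(idtp=80, idfp=10, idfn=30))
```

Because identity switches enter the MOTA numerator directly and IDF1 rewards identity-consistent matches, a more discriminative re-identification embedding would be expected to move both metrics, which is why they are reported together with the identity switch count.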