26 references in total
- [1] Krizhevsky A, Sutskever I, Hinton G E., ImageNet classification with deep convolutional neural networks, Communications of the ACM, 60, 6, pp. 84-90, (2017)
- [2] Girshick R, Donahue J, Darrell T, et al., Rich feature hierarchies for accurate object detection and semantic segmentation, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 580-587, (2014)
- [3] Girshick R., Fast R-CNN, Proceedings of the IEEE International Conference on Computer Vision, pp. 1440-1448, (2015)
- [4] Ren S, He K, Girshick R, et al., Faster R-CNN: Towards real-time object detection with region proposal networks, IEEE Transactions on Pattern Analysis and Machine Intelligence, 39, 6, pp. 1137-1149, (2017)
- [5] He K, Gkioxari G, Dollar P, et al., Mask R-CNN, IEEE Transactions on Pattern Analysis and Machine Intelligence, 42, 2, pp. 386-397, (2020)
- [6] Redmon J, Divvala S, Girshick R, et al., You only look once: Unified, real-time object detection, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779-788, (2016)
- [7] Liu W, Anguelov D, Erhan D, et al., SSD: Single shot MultiBox detector, Proceedings of the European Conference on Computer Vision, pp. 21-37, (2016)
- [8] Lin T Y, Goyal P, Girshick R, et al., Focal loss for dense object detection, IEEE Transactions on Pattern Analysis and Machine Intelligence, 42, 2, pp. 318-327, (2020)
- [9] Dosovitskiy A, Beyer L, Kolesnikov A, et al., An image is worth 16x16 words: Transformers for image recognition at scale [EB/OL]