X-MAS: Extremely Large-Scale Multi-Modal Sensor Dataset for Outdoor Surveillance in Real Environments

Cited by: 4
Authors
Noh, DongKi [1, 2]
Sung, Changki [1 ]
Uhm, Teayoung [3 ]
Lee, WooJu [1 ]
Lim, Hyungtae [1 ]
Choi, Jaeseok [4 ]
Lee, Kyuewang [5 ]
Hong, Dasol [1 ]
Um, Daeho [5 ]
Chung, Inseop [4 ]
Shin, Hochul [6 ]
Kim, MinJung [7 ]
Kim, Hyoung-Rock [2 ]
Baek, SeungMin [2 ]
Myung, Hyun [1 ]
Affiliations
[1] Korea Adv Inst Sci & Technol KAIST, Sch Elect Engn, Daejeon 34141, South Korea
[2] Adv Robot Lab, LG Elect, Seoul 06772, South Korea
[3] Korea Inst Robot & Technol Convergence KIRO, Pohang 37666, South Korea
[4] Seoul Natl Univ SNU, Dept Intelligence & Informat, Seoul 08826, South Korea
[5] SNU, Automat & Syst Res Inst ASRI, Dept Elect & Comp Engn, Seoul 08826, South Korea
[6] Elect & Telecommun Res Inst ETRI, Daejeon 34129, South Korea
[7] Korea Adv Inst Sci & Technol, Kim Jaechul Grad Sch Artificial Intelligence, Daejeon 34141, South Korea
Keywords
Surveillance; Robots; Task analysis; Cameras; Videos; Multimodal sensors; Robot vision systems; Dataset; field robot; multi-modal perception; surveillance robot; ANOMALY DETECTION; OBJECT TRACKING
DOI
10.1109/LRA.2023.3236569
Chinese Library Classification (CLC) number
TP24 [Robotics]
Discipline classification codes
080202; 1405
Abstract
In the robotics and computer vision communities, surveillance tasks including human detection, tracking, and motion recognition with a camera have been studied extensively. Deep learning algorithms are widely used for these tasks, as in other computer vision tasks. However, existing public datasets are insufficient for developing learning-based methods that handle surveillance in outdoor and extreme situations such as harsh weather and low illuminance conditions. Therefore, we introduce a new large-scale outdoor surveillance dataset named eXtremely large-scale Multi-modAl Sensor dataset (X-MAS), containing more than 500,000 image pairs and first-person view data annotated by well-trained annotators. Moreover, a single pair contains multi-modal data (e.g., an IR image, an RGB image, a thermal image, a depth image, and a LiDAR scan). To the best of our knowledge, this is the first large-scale first-person view outdoor multi-modal dataset focusing on surveillance tasks. We present an overview of the proposed dataset with statistics and present methods of exploiting our dataset with deep learning-based algorithms.
Pages: 1093-1100
Page count: 8
Related papers
47 in total
  • [1] [Anonymous], 2011, P 2011 JOINT ACM WOR
  • [2] Aytar Y, 2011, IEEE I CONF COMP VIS, P2252, DOI 10.1109/ICCV.2011.6126504
  • [3] Unsupervised Domain Adaptation by Domain Invariant Projection
    Baktashmotlagh, Mahsa
    Harandi, Mehrtash T.
    Lovell, Brian C.
    Salzmann, Mathieu
    [J]. 2013 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2013, : 769 - 776
  • [4] Bewley A, 2016, IEEE IMAGE PROC, P3464, DOI 10.1109/ICIP.2016.7533003
  • [5] Bhardwaj Ritika, 2021, Proceedings of International Conference on Artificial Intelligence and Applications. ICAIA 2020. Advances in Intelligent Systems and Computing (AISC 1164), P583, DOI 10.1007/978-981-15-4992-2_55
  • [6] UAV-AdNet: Unsupervised Anomaly Detection using Deep Neural Networks for Aerial Surveillance
    Bozcan, Ilker
    Kayacan, Erdal
    [J]. 2020 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2020, : 1158 - 1164
  • [7] GridNet: Image-Agnostic Conditional Anomaly Detection for Indoor Surveillance
    Bozcan, Ilker
    Le Fevre, Jonas
    Pham, Huy X.
    Kayacan, Erdal
    [J]. IEEE ROBOTICS AND AUTOMATION LETTERS, 2021, 6 (02): : 1638 - 1645
  • [8] Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation
    Chen, Liang-Chieh
    Zhu, Yukun
    Papandreou, George
    Schroff, Florian
    Adam, Hartwig
    [J]. COMPUTER VISION - ECCV 2018, PT VII, 2018, 11211 : 833 - 851
  • [9] Chen XYL, 2019, IEEE INT C INT ROBOT, P4530, DOI 10.1109/IROS40897.2019.8967704
  • [10] Choi Y, 2016, 2016 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS 2016), P223, DOI 10.1109/IROS.2016.7759059