Automated Human Use Mapping of Social Infrastructure by Deep Learning Methods Applied to Smart City Camera Systems

Citations: 5
Authors
Sun, Peng [1]
Draughon, Gabriel [2]
Hou, Rui [2]
Lynch, Jerome P. [3]
Affiliations
[1] Univ Cent Florida, Dept Civil Environm & Construct Engn, Orlando, FL 32816 USA
[2] Univ Michigan, Dept Civil & Environm Engn, Ann Arbor, MI 48105 USA
[3] Univ Michigan, Dept Elect Engn & Comp Sci, Dept Civil & Environm Engn, Ann Arbor, MI 48105 USA
Funding
National Science Foundation (USA);
Keywords
Computer vision; Deep learning; Mask R-CNN; Public open space; Human sensing; Social infrastructure;
DOI
10.1061/(ASCE)CP.1943-5487.0000998
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
With the emergence of the smart city, there is a growing need for scalable methods that sense how humans interact with and use infrastructure, in order to model social behaviors relevant to designing sustainable and resilient built environments. Cyber-physical system (CPS) frameworks used to monitor and automate infrastructure systems in smart cities can be extended to sense people and so better understand how they use infrastructure systems, including social infrastructure (e.g., parks, markets). This paper adopts convolutional neural network (CNN) architectures to automate the detection and spatiotemporal mapping of people using camera data, forming a cyber-physical-social system (CPSS) for smart cities. The Mask region-based convolutional neural network (Mask R-CNN) detector was adopted and tailored to identify and segment human subjects in real time from camera images at an average speed of 7 frames per second. The Mask R-CNN framework was trained end to end using the Objects in Public Open Spaces (OPOS) image data set, which includes classified segmentations of people in public spaces. A two-dimensional/three-dimensional (2D-3D) lifting algorithm based on a monocular camera calibration model was also employed to accurately position detected people in space. Finally, a Hungarian assignment algorithm based on association metrics extracted from detected people was used to assign people to spatiotemporal trajectories. To demonstrate the proposed framework, this study used the Detroit riverfront parks to study how people utilize community parks, which are a form of social infrastructure. The Mask R-CNN detector is shown to be precise in detecting and classifying the behavior of people in parks, with mean average precision well above 85% for all class types defined in the OPOS library. The framework is also shown to be effective in spatially mapping the various uses of park furnishings, leading to better management of parks.
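The 2D-3D lifting step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the calibration model's details, so everything below is an assumption made for illustration: an ideal pinhole camera with no lens distortion, a flat ground plane at z = 0, and a detected person represented by a single foot pixel. The function name `lift_pixel_to_ground` and all parameter names are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Matrix3 = List[List[float]]


def lift_pixel_to_ground(
    u: float, v: float,
    fx: float, fy: float, cx: float, cy: float,
    R: Matrix3, C: Sequence[float],
) -> Tuple[float, float]:
    """Back-project pixel (u, v) onto the world ground plane z = 0.

    Assumptions (illustrative, not from the paper):
      - ideal pinhole camera, no lens distortion
      - R maps world to camera coordinates: X_cam = R (X_world - C)
      - C is the camera center in world coordinates, with C[2] > 0
    """
    # Viewing-ray direction in the camera frame (depth normalized to 1).
    d_cam = ((u - cx) / fx, (v - cy) / fy, 1.0)

    # Rotate the ray into the world frame: d_world = R^T d_cam.
    d_world = [sum(R[k][i] * d_cam[k] for k in range(3)) for i in range(3)]

    if abs(d_world[2]) < 1e-12:
        raise ValueError("viewing ray is parallel to the ground plane")

    # Solve C_z + s * d_z = 0 for the ray parameter s.
    s = -C[2] / d_world[2]
    if s <= 0:
        raise ValueError("ground intersection lies behind the camera")

    return (C[0] + s * d_world[0], C[1] + s * d_world[1])


# Hypothetical setup: camera 10 m above the origin, looking straight down.
# Rows of R are the camera axes expressed in world coordinates.
R_down = [[1.0, 0.0, 0.0],
          [0.0, -1.0, 0.0],
          [0.0, 0.0, -1.0]]
C_cam = (0.0, 0.0, 10.0)

# A foot pixel 500 px to the right of the principal point.
x, y = lift_pixel_to_ground(1460.0, 540.0, 1000.0, 1000.0, 960.0, 540.0,
                            R_down, C_cam)
```

In this hypothetical setup the pixel lands about 5 m from the point directly below the camera along the world x-axis. Positions lifted this way in consecutive frames could then feed a frame-to-frame assignment step such as the Hungarian algorithm the abstract names.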
Pages: 21