Automated Human Use Mapping of Social Infrastructure by Deep Learning Methods Applied to Smart City Camera Systems

Cited by: 5

Authors:

Sun, Peng [1]

Draughon, Gabriel [2]

Hou, Rui [2]

Lynch, Jerome P. [3]

Affiliations:

[1] Univ Cent Florida, Dept Civil Environm & Construct Engn, Orlando, FL 32816 USA

[2] Univ Michigan, Dept Civil & Environm Engn, Ann Arbor, MI 48105 USA

[3] Univ Michigan, Dept Elect Engn & Comp Sci, Dept Civil & Environm Engn, Ann Arbor, MI 48105 USA

Source:

JOURNAL OF COMPUTING IN CIVIL ENGINEERING | 2022 / Vol. 36 / No. 04

Funding:

U.S. National Science Foundation;

Keywords:

Computer vision; Deep learning; Mask R-CNN; Public open space; Human sensing; Social infrastructure;

DOI:

10.1061/(ASCE)CP.1943-5487.0000998

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

With the emergence of the smart city, there is a growing need for scalable methods that sense how humans interact and use infrastructure in order to model social behaviors relevant to designing sustainable and resilient built environments. Cyber-physical system (CPS) frameworks used to monitor and automate infrastructure systems in smart cities can be extended to sense people to better understand how they use infrastructure systems including social infrastructure (e.g., parks, markets). This paper adopts convolutional neural network (CNN) architectures to automate the detection and spatiotemporal mapping of people using camera data to form a cyber-physical-social system (CPSS) for smart cities. The Mask region based convolutional neural network (R-CNN) detector was adopted and tailored to identify and segment human subjects in real time using camera images with an average speed of 7 frames per second. The Mask R-CNN framework was trained end to end using the Objects in Public Open Spaces (OPOS) image data set that includes classified segmentations of people in public spaces. A two-dimensional/three-dimensional (2D-3D) lifting algorithm based on a monocular camera calibration model was also employed to accurately position detected people in space. Finally, a Hungarian assignment algorithm based on association metrics extracted from detected people was used to assign people to spatiotemporal trajectories. To demonstrate the proposed framework, this study used the Detroit riverfront parks to study how people utilize community parks, which are a form of social infrastructure. The Mask R-CNN detector is proven precise in detecting and classifying the behavior of people in parks with mean average precision well above 85% for all class types defined in the OPOS library. The framework is also shown to be effective in spatially mapping the various uses of park furnishings, leading to better management of parks.
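
The frame-to-frame assignment step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the association metrics, so the cost used here (Euclidean distance between ground-plane positions, in meters) is an assumption, as are the names `association_cost`, `assign_detections`, and the `max_cost` gate. For brevity the sketch searches all permutations exhaustively instead of running the Hungarian algorithm; both return a minimum-cost one-to-one assignment, but exhaustive search is only practical for a handful of people. In practice a solver such as `scipy.optimize.linear_sum_assignment` would likely be used.

```python
from itertools import permutations
from math import hypot


def association_cost(track_xy, detection_xy):
    """Hypothetical metric: distance between a track's last position and a detection."""
    return hypot(track_xy[0] - detection_xy[0], track_xy[1] - detection_xy[1])


def assign_detections(tracks, detections, max_cost=2.0):
    """Return {track_index: detection_index} with minimum total cost.

    Exhaustive search stands in for the Hungarian algorithm (assumption:
    only a few tracks and detections per frame). Pairs whose cost exceeds
    max_cost are dropped after the optimum is found.
    """
    cost = [[association_cost(t, d) for d in detections] for t in tracks]
    n_t, n_d = len(tracks), len(detections)
    if n_t <= n_d:
        # Every track gets a distinct detection.
        candidates = (list(zip(range(n_t), p)) for p in permutations(range(n_d), n_t))
    else:
        # Every detection gets a distinct track.
        candidates = (list(zip(p, range(n_d))) for p in permutations(range(n_t), n_d))
    best = min(candidates, key=lambda pairs: sum(cost[i][j] for i, j in pairs))
    return {i: j for i, j in best if cost[i][j] <= max_cost}


# Two existing trajectories and three new detections (positions in meters).
tracks = [(0.0, 0.0), (5.0, 5.0)]
detections = [(5.2, 4.9), (0.1, -0.1), (20.0, 20.0)]
matches = assign_detections(tracks, detections)
```

Here `matches` pairs each track with its nearest detection, and the far-away third detection stays unmatched, so it could start a new trajectory.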


Pages: 21
