Deep Learning for Fall Detection: Three-Dimensional CNN Combined With LSTM on Video Kinematic Data

Cited by: 222

Authors:

Lu, Na [1,2]

Wu, Yidan [1]

Feng, Li [3]

Song, Jinbo [1]

Affiliations:

[1] Xi An Jiao Tong Univ, Syst Engn Inst, Xian 710049, Shaanxi, Peoples R China

[2] Beijing Adv Innovat Ctr Intelligent Robots & Syst, Beijing 100081, Peoples R China

[3] Xi An Jiao Tong Univ, Affiliated Hosp, Second Med Imaging Dept, Hosp Xian 9, Xian 710004, Shaanxi, Peoples R China

Source:

IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS | 2019 / Vol. 23 / No. 01

Funding:

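China Postdoctoral Science Foundation; Specialized Research Fund for the Doctoral Program of Higher Education; National Natural Science Foundation of China;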

Keywords:

Activity recognition; convolutional neural network; deep learning; fall detection; visual attention; CONVOLUTIONAL NETWORKS; NEURAL-NETWORKS; SYSTEM;

DOI:

10.1109/JBHI.2018.2808281

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Fall detection is an important public healthcare problem. Timely detection could enable instant delivery of medical service to the injured. A popular nonintrusive solution for fall detection is based on videos obtained through ambient cameras; the corresponding methods usually require a large dataset to train a classifier and tend to be sensitive to image quality. However, real fall data are hard to collect, so simulated falls are recorded to construct the training dataset, which limits its size. To address these problems, a three-dimensional convolutional neural network (3-D CNN) based method for fall detection is developed, which uses only video kinematic data to train an automatic feature extractor and could circumvent the large fall dataset that deep learning solutions otherwise require. A 2-D CNN can only encode spatial information, whereas the employed 3-D convolution can extract motion features from the temporal sequence, which is important for fall detection. To further locate the region of interest in each frame, a long short-term memory (LSTM) based spatial visual attention scheme is incorporated. The sports dataset Sports-1M, which contains no fall examples, is employed to train the 3-D CNN, which is then combined with the LSTM to train a classifier on a fall dataset. Experiments have verified the proposed scheme on a fall detection benchmark, with accuracy as high as 100%. Superior performance has also been obtained on other activity databases.
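
The two mechanisms named in the abstract, 3-D convolution over a frame sequence and soft spatial attention, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The toy clip, the temporal-difference kernel, and the function names `conv3d_valid` and `soft_spatial_attention` are assumptions made for illustration, and the attention scores are passed in by hand, whereas in the described scheme they would come from the LSTM.

```python
import numpy as np

def conv3d_valid(clip, kernel):
    """Naive 'valid' 3-D cross-correlation of a (T, H, W) clip with a (t, h, w) kernel."""
    T, H, W = clip.shape
    t, h, w = kernel.shape
    out = np.zeros((T - t + 1, H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            for k in range(out.shape[2]):
                out[i, j, k] = np.sum(clip[i:i + t, j:j + h, k:k + w] * kernel)
    return out

def soft_spatial_attention(features, scores):
    """Combine (N, D) location features using softmax(scores) as weights.

    `scores` is a stand-in: here it is supplied directly rather than
    predicted by a recurrent network.
    """
    weights = np.exp(scores - scores.max())  # subtract max for numerical stability
    weights /= weights.sum()
    return weights @ features, weights

# Hypothetical toy clip: a bright 2x2 square moving down one row per frame.
clip = np.zeros((4, 6, 6))
for f in range(4):
    clip[f, f:f + 2, 2:4] = 1.0

# A kernel spanning two frames (frame[t+1] - frame[t]) responds to motion,
# which a purely spatial 2-D kernel applied per frame cannot encode.
temporal_kernel = np.array([-1.0, 1.0]).reshape(2, 1, 1)
motion = conv3d_valid(clip, temporal_kernel)

# The same square held still produces no temporal response.
static_clip = np.repeat(clip[:1], 4, axis=0)
no_motion = conv3d_valid(static_clip, temporal_kernel)

# Attention over four locations with one-hot features; the last score dominates.
context, weights = soft_spatial_attention(np.eye(4), np.array([0.0, 0.0, 0.0, 10.0]))
```

In this sketch the moving square yields a nonzero response while the static copy yields all zeros, and the attention weights sum to one and concentrate on the highest-scoring location.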


Pages: 314-323

Number of pages: 10
