Bag-of-words representation for biomedical time series classification

Cited by: 106
Authors
Wang, Jin [1, 3]
Liu, Ping [2]
She, Mary F. H. [1, 3]
Nahavandi, Saeid [1]
Kouzani, Abbas [4]
Affiliations
[1] Deakin Univ, Ctr Intelligent Syst Res, Waurn Ponds 3217, Australia
[2] Univ S Carolina, Dept Comp Sci, Columbia, SC 29205 USA
[3] Deakin Univ, Inst Frontier Mat, Waurn Ponds 3217, Australia
[4] Deakin Univ, Sch Engn, Waurn Ponds 3217, Australia
Keywords
Bag of words; Codebook construction; k-Means clustering; EEG; ECG; Human identification; Recognition
DOI
10.1016/j.bspc.2013.06.004
Chinese Library Classification number
R318 [Biomedical engineering]
Discipline classification code
0831
Abstract
Automatic analysis of biomedical time series such as electroencephalogram (EEG) and electrocardiogram (ECG) signals has attracted great interest in the biomedical engineering community because of its important applications in medicine. In this work, a simple yet effective bag-of-words representation, originally developed for text document analysis, is extended to biomedical time series. As in the bag-of-words model used in the text document domain, the proposed method treats a time series as a text document and extracts local segments from the time series as words. The biomedical time series is then represented as a histogram of codewords, each entry of which is the number of times a codeword appears in the time series. Although the temporal order of the local segments is ignored, the bag-of-words representation is able to capture high-level structural information because both local and global structural information are well utilized. The performance of the bag-of-words model is validated on three datasets extracted from real EEG and ECG signals. The experimental results demonstrate that the proposed method is not only insensitive to parameters of the bag-of-words model, such as local segment length and codebook size, but also robust to noise. © 2013 Elsevier Ltd. All rights reserved.
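The pipeline described in the abstract (local segments as words, a codebook, and a histogram of codeword counts) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the sliding-window stride, the plain Lloyd's k-means, the Euclidean nearest-codeword assignment, the synthetic sine-wave signals, and all function names (`extract_segments`, `build_codebook`, `assign`, `bow_histogram`) are assumptions made for illustration. Only the overall idea and the use of k-means clustering for codebook construction come from the record above.

```python
import numpy as np


def extract_segments(series, seg_len, stride=1):
    """Slide a window over a 1-D series and return the local segments as rows."""
    series = np.asarray(series, dtype=float)
    starts = range(0, len(series) - seg_len + 1, stride)
    return np.array([series[s:s + seg_len] for s in starts])


def assign(segments, codebook):
    """Index of the nearest codeword (Euclidean distance) for each segment."""
    dists = np.linalg.norm(segments[:, None, :] - codebook[None, :, :], axis=2)
    return dists.argmin(axis=1)


def build_codebook(segments, n_codewords, n_iter=20, seed=0):
    """Plain k-means (Lloyd's algorithm); the cluster centres act as codewords."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(segments), size=n_codewords, replace=False)
    codebook = segments[idx].copy()
    for _ in range(n_iter):
        labels = assign(segments, codebook)
        for k in range(n_codewords):
            members = segments[labels == k]
            if len(members) > 0:  # keep the old centre if a cluster empties
                codebook[k] = members.mean(axis=0)
    return codebook


def bow_histogram(series, codebook, seg_len, stride=1):
    """Represent one series as codeword counts; temporal order is discarded."""
    labels = assign(extract_segments(series, seg_len, stride), codebook)
    return np.bincount(labels, minlength=len(codebook))


if __name__ == "__main__":
    # Synthetic stand-ins for biomedical signals (assumption, not real EEG/ECG).
    rng = np.random.default_rng(0)
    t = np.arange(512)
    train = [np.sin(2 * np.pi * f * t / 512) + 0.1 * rng.standard_normal(512)
             for f in (4, 8, 16)]
    all_segments = np.vstack(
        [extract_segments(x, seg_len=32, stride=4) for x in train]
    )
    codebook = build_codebook(all_segments, n_codewords=8)
    print(bow_histogram(train[0], codebook, seg_len=32, stride=4))
```

The resulting histogram has one entry per codeword and can be passed to any standard classifier. In this sketch, the segment length (`seg_len`) and codebook size (`n_codewords`) are the two parameters the abstract reports the method to be insensitive to.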
Pages: 634-644
Number of pages: 11