Large-scale data analysis on aviation accident database using different data mining techniques

Cited by: 6
Authors
Christopher, A. B. Arockia [1]
Vivekanandam, V. Shunmughavel [2]
Anderson, A. B. Antony [3]
Markkandeyan, S. [4]
Sivakumar, V. [5]
Affiliations
[1] VSB Engn Coll, Dept Informat Technol, Karur, Tamil Nadu, India
[2] VSB Engn Coll, Dept Comp Sci & Engn, Karur, Tamil Nadu, India
[3] PVP Coll Technol & Engn Women, Dept Comp Sci & Engn, Dindigul, Tamil Nadu, India
[4] RVS Coll Engn, Dept IT, Dindigul, Tamil Nadu, India
[5] VSB Engn Coll, Dept Elect & Elect Engn, Karur, Tamil Nadu, India
Keywords
data mining; feature selection; classification techniques; decision tree; aviation
DOI
10.1017/aer.2016.107
Chinese Library Classification number
V [Aviation, Aerospace]
Discipline classification code
08; 0825
Abstract
Data mining is an iterative process in which progress is defined by discovery through either automatic or manual methods. A data cleaning procedure is proposed to improve the quality of classification tasks in the knowledge discovery process by taking into account both redundant and conflicting data. The redundancy check is performed on the original dataset and the resultant dataset is preserved. This resultant dataset is then checked for conflicting data and, if any are found, they are corrected and updated in the original aircraft dataset. The updated dataset is then classified using a variety of classifiers such as Bayes, functions, lazy, MISC, rules and decision trees. The performance of the updated dataset on these classifiers is examined, and the results show a significant improvement in classification accuracy after redundancy and conflicts are removed. Once the corrected conflicts are updated in the original dataset and the performance of the classifiers is evaluated, great improvement is observed. This paper aims to address how data mining techniques can be used to understand complex system accidents in the aviation domain. Decision trees are considered to be one of the most powerful and popular approaches in knowledge discovery and data mining. The objective is to develop a classification model for aviation risk investigation and reduction using a decision tree induction method that enhances the ability to form decision trees and thereby proves that the classification accuracy of decision trees is greater. Different feature selectors are used in this study to reduce the number of initial attributes.
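The cleaning procedure described in the abstract (redundancy check, then conflict check, then correction written back to the original dataset) could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how conflicting records are corrected, so majority voting over the original records is an assumption made here, and the function names and the field names (`phase`, `weather`, `severity`) are hypothetical.

```python
from collections import Counter, defaultdict


def feature_key(row, label_key):
    """Hashable key built from every attribute except the class label."""
    return tuple(sorted((k, v) for k, v in row.items() if k != label_key))


def remove_redundant(rows):
    """Redundancy check: drop exact duplicate records, keeping first occurrences."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))
    return unique


def find_conflicts(unique_rows, label_key):
    """Conflict check: identical attribute values that carry different labels."""
    labels = defaultdict(set)
    for row in unique_rows:
        labels[feature_key(row, label_key)].add(row[label_key])
    return {key for key, seen in labels.items() if len(seen) > 1}


def correct_original(rows, conflicts, label_key):
    """Write corrections back to the original records.

    ASSUMPTION: a conflicting record takes the most frequent label observed
    for its attribute values in the original data (majority vote). The
    abstract does not specify the correction rule.
    """
    votes = defaultdict(Counter)
    for row in rows:
        votes[feature_key(row, label_key)][row[label_key]] += 1
    updated = []
    for row in rows:
        key = feature_key(row, label_key)
        fixed = dict(row)
        if key in conflicts:
            fixed[label_key] = votes[key].most_common(1)[0][0]
        updated.append(fixed)
    return updated


# Hypothetical toy records, for illustration only.
records = [
    {"phase": "landing", "weather": "fog", "severity": "fatal"},
    {"phase": "landing", "weather": "fog", "severity": "fatal"},
    {"phase": "landing", "weather": "fog", "severity": "minor"},
    {"phase": "cruise", "weather": "clear", "severity": "minor"},
]
unique_records = remove_redundant(records)
conflicting = find_conflicts(unique_records, "severity")
updated_records = correct_original(records, conflicting, "severity")
```

The updated records would then be passed to whichever classifiers are being compared. Majority voting is only one possible correction rule; manual review or discarding the conflicting records are alternatives.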
Pages: 1849-1866
Number of pages: 18