Statistical Feature Selection From Massive Data in Distribution Fault Diagnosis

Cited by: 35
Authors
Cai, Yixin [1]
Chow, Mo-Yuen [1]
Lu, Wenbin [2]
Li, Lexin [2]
Affiliations
[1] N Carolina State Univ, Dept Elect & Comp Engn, Raleigh, NC 27695 USA
[2] N Carolina State Univ, Dept Stat, Raleigh, NC 27695 USA
Funding
U.S. National Science Foundation
Keywords
Akaike's information criteria; classification; fault cause identification; feature selection; hypothesis test; LASSO; logistic regression; power distribution systems; smart grid; stepwise regression; CAUSE IDENTIFICATION; ADAPTIVE LASSO; ALGORITHM; MODELS;
DOI
10.1109/TPWRS.2009.2036924
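Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract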
Selecting proper features to identify the root cause is a critical step in distribution fault diagnosis. Power engineers usually select features based on experience. However, engineers cannot be familiar with every local system, especially in fast-growing regions. With advancing information technologies and more powerful sensors, utilities can collect much more data on their systems than before. This trend will be even more pronounced in the anticipated Smart Grid environments. To help power engineers select features from the massive data collected, this paper reviews two popular feature selection methods, 1) hypothesis testing and 2) stepwise regression, and introduces two more, 3) stepwise selection by Akaike's Information Criterion and 4) LASSO/ALASSO. These four methods are compared in terms of their model requirements, data assumptions, and computational cost. Using real-world datasets from Progress Energy Carolinas, this paper also evaluates these methods and compares fault diagnosis performance by accuracy, probability of detection, and false alarm ratio. This paper also discusses the advantages and limitations of each method for distribution fault diagnosis.
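The three performance measures named in the abstract can be illustrated with a minimal sketch. The abstract does not define them, so the formulas below are assumptions based on common usage: accuracy = (TP + TN) / N, probability of detection = TP / (TP + FN), and false alarm ratio = FP / (TP + FP). The function name `diagnosis_metrics` and the toy labels are hypothetical and are not taken from the paper.

```python
import math


def diagnosis_metrics(y_true, y_pred, positive=1):
    """Return accuracy, probability of detection (POD), and false alarm ratio (FAR).

    Assumed definitions (not stated in the abstract):
      accuracy = (TP + TN) / N
      POD      = TP / (TP + FN)
      FAR      = FP / (TP + FP)
    """
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # fault cause correctly identified
        elif pred == positive:
            fp += 1  # false alarm
        elif truth == positive:
            fn += 1  # missed fault cause
        else:
            tn += 1  # correctly rejected

    total = tp + fp + fn + tn
    # Return NaN instead of raising when a denominator is zero.
    return {
        "accuracy": (tp + tn) / total if total else math.nan,
        "pod": tp / (tp + fn) if (tp + fn) else math.nan,
        "far": fp / (tp + fp) if (tp + fp) else math.nan,
    }


# Toy labels: 1 = fault attributed to the cause of interest, 0 = any other cause.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
print(diagnosis_metrics(y_true, y_pred))
```

Reporting probability of detection and false alarm ratio alongside accuracy is useful when the fault cause of interest is rare, because accuracy alone can look high for a classifier that seldom predicts that cause.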
Pages: 642-648
Page count: 7