Robust weighted regression via PAELLA sample weights

Cited by: 0
Authors
Castejón-Limas M. [1]
Alaiz-Moreton H. [2]
Fernández-Robles L. [1]
Alfonso-Cendón J. [1]
Fernández-Llamas C. [1]
Sánchez-González L. [1]
Pérez H. [1]
Affiliations
[1] Department of Mechanical, Computer Science and Aerospace Engineering, Universidad de León, León
[2] Department of Electrical, Systems and Automatic Engineering, Universidad de León, León
Keywords
Multilayer perceptron; Outlier detection; PAELLA; Robust regression; Weighted regression
DOI
10.1016/j.neucom.2019.03.108
Abstract
This paper reports the usage of the occurrence vector provided by the PAELLA algorithm in the context of robust regression. PAELLA was originally conceived as an outlier detection and data cleaning technique. A novel approach is to use this algorithm not for discarding outliers but to generate information related to the reliability of the observations recorded in the dataset. This approach proves to provide successful results when compared to traditional common practice such as outlier removal. A set of experiments using a contrived difficult artificial dataset are described using both neural networks and classical polynomial fitting. Finally, a successful comparison of our approach to two state-of-the-art algorithms proves the benefits of using the PAELLA algorithm in the context of robust regression. © 2019 Elsevier B.V.
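The idea summarized in the abstract, down-weighting unreliable observations instead of discarding them, can be illustrated with a minimal sketch for the polynomial-fitting case. This is not the authors' implementation and it does not run PAELLA: the `sample_weights` vector below is a hand-made, hypothetical stand-in for a normalized occurrence vector, and the helper `weighted_polyfit` and the synthetic data are assumptions made for illustration only.

```python
import numpy as np

def weighted_polyfit(x, y, sample_weights, degree):
    """Polynomial fit minimizing sum_i s_i * (y_i - p(x_i))**2."""
    # np.polyfit multiplies each residual by w before squaring,
    # so the square root of the sample weights is passed here.
    return np.polyfit(x, y, degree, w=np.sqrt(sample_weights))

# Synthetic data: a clean line y = 2x + 1 with five contaminated observations.
x = np.linspace(0.0, 1.0, 50)
y = 2.0 * x + 1.0
outlier_idx = np.arange(40, 45)
y[outlier_idx] += 5.0

# Hypothetical reliability scores (assumption): 1.0 for trusted observations,
# a small value for suspicious ones. In the paper's setting these would be
# derived from the PAELLA occurrence vector instead of being set by hand.
sample_weights = np.ones_like(x)
sample_weights[outlier_idx] = 0.05

true_coef = np.array([2.0, 1.0])  # [slope, intercept], highest degree first
plain_coef = np.polyfit(x, y, 1)  # ordinary least squares on all points
robust_coef = weighted_polyfit(x, y, sample_weights, 1)

plain_err = np.abs(plain_coef - true_coef).max()
robust_err = np.abs(robust_coef - true_coef).max()
print("unweighted max coefficient error:", plain_err)
print("weighted max coefficient error:  ", robust_err)
```

In this sketch the contaminated points still take part in the fit, but with a much smaller influence than the trusted ones, which is the contrast with outright outlier removal that the abstract draws.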
Pages: 325-333
Number of pages: 8
Related papers
17 in total
  • [1] Menendez C., Ordieres J.B., Ortega F., Importance of information pre-processing in the improvement of neural network results, Expert Syst., 13, 2, pp. 95-103, (1996)
  • [2] Ordieres J.B., Vergara E.P., Capuz R.S., Salazar R.E., Neural network prediction model for fine particulate matter (PM2.5) on the US–Mexico border in El Paso (Texas) and Ciudad Juárez (Chihuahua), Environ. Model. Softw., 20, 5, pp. 547-559, (2005)
  • [3] Salazar-Ruiz E., Ordieres J.B., Vergara E.P., Capuz-Rizo S.F., Development and comparative analysis of tropospheric ozone prediction models using linear and artificial intelligence-based models in Mexicali, Baja California (Mexico) and Calexico, California (US), Environ. Model. Softw., 23, 8, pp. 1056-1069, (2008)
  • [4] Gong B., Ordieres-Mere J., Prediction of daily maximum ozone threshold exceedances by preprocessing and ensemble artificial intelligence techniques: case study of Hong Kong, Environ. Model. Softw., 84, pp. 290-303, (2016)
  • [5] Bing G., Ordieres-Mere J., Cabrera C.B., Prediction models for ozone in metropolitan area of Mexico City based on artificial intelligence techniques, Int. J. Inf. Decis. Sci., 7, 2, pp. 115-139, (2015)
  • [6] Lv Z., Chirivella J., Gagliardo P., Bigdata oriented multimedia mobile health applications, J. Med. Syst., 40, 5, (2016)
  • [7] Ordieres-Mere J., Martinez-de Pison-Ascacibar F.J., Gonzalez-Marcos A., Ortiz-Marcos I., Comparison of models created for the prediction of the mechanical properties of galvanized steel coils, J. Intell. Manuf., 21, 4, pp. 403-421, (2010)
  • [8] Gonzalez-Marcos A., Alba-Elias F., Castejon-Limas M., Ordieres-Mere J., Development of neural network-based models to predict mechanical properties of hot dip galvanised steel coils, Int. J. Data Min. Model. Manag., 3, 4, pp. 389-405, (2011)
  • [9] Ordieres J., Lopez L.M., Bello A., Garcia A., Intelligent methods helping the design of a manufacturing system for die extrusion rubbers, Int. J. Comput. Integr. Manuf., 16, 3, pp. 173-180, (2003)
  • [10] Dasu T., Johnson T., Exploratory Data Mining and Data Cleaning, (2003)