Relational Data Pre-Processing Techniques for Improved Securities Fraud Detection

Authors
Fast, Andrew [1]
Friedland, Lisa [1]
Maier, Marc [1]
Taylor, Brian [1]
Jensen, David [1]
Goldberg, Henry G. [2]
Komoroske, John [2]
Affiliations
[1] University of Massachusetts, Department of Computer Science, Amherst, MA 01003 USA
[2] National Association of Securities Dealers, Washington, DC 20006 USA
Funding
National Science Foundation (USA)
Keywords
Fraud detection; data pre-processing; statistical relational learning; normalization; relational probability trees;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Commercial datasets are often large, relational, and dynamic. They contain many records of people, places, things, events and their interactions over time. Such datasets are rarely structured appropriately for knowledge discovery, and they often contain variables whose meanings change across different subsets of the data. We describe how these challenges were addressed in a collaborative analysis project undertaken by the University of Massachusetts Amherst and the National Association of Securities Dealers (NASD). We describe several methods for data preprocessing that we applied to transform a large, dynamic, and relational dataset describing nearly the entirety of the U.S. securities industry, and we show how these methods made the dataset suitable for learning statistical relational models. To better utilize social structure, we first applied known consolidation and link formation techniques to associate individuals with branch office locations. In addition, we developed an innovative technique to infer professional associations by exploiting dynamic employment histories. Finally, we applied normalization techniques to create a suitable class label that adjusts for spatial, temporal, and other heterogeneity within the data. We show how these pre-processing techniques combine to provide the necessary foundation for learning high-performing statistical models of fraudulent activity.
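The abstract does not say how professional associations were inferred from employment histories. The following is only an illustrative sketch of one plausible reading, not the authors' method: link two individuals when they worked at the same firm during overlapping years, and weight the link by the total years of overlap. The function name `co_worker_links`, the `(person, firm, start_year, end_year)` record layout, and the `min_overlap` threshold are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def co_worker_links(stints, min_overlap=1):
    """Return {(person_a, person_b): total overlapping years at shared firms}.

    stints: iterable of (person, firm, start_year, end_year), end inclusive.
    This record layout is an assumption made for illustration only.
    """
    # Group employment stints by firm so only co-located people are compared.
    by_firm = defaultdict(list)
    for person, firm, start, end in stints:
        by_firm[firm].append((person, start, end))

    links = defaultdict(int)
    for rows in by_firm.values():
        for (p1, s1, e1), (p2, s2, e2) in combinations(rows, 2):
            if p1 == p2:
                continue  # two stints of the same person at one firm
            # Number of calendar years both stints cover (inclusive bounds).
            overlap = min(e1, e2) - max(s1, s2) + 1
            if overlap >= min_overlap:
                links[tuple(sorted((p1, p2)))] += overlap
    return dict(links)


# Hypothetical toy data, not drawn from the NASD dataset.
stints = [
    ("A", "X", 2000, 2003),
    ("B", "X", 2002, 2005),
    ("C", "X", 2004, 2006),
    ("A", "Y", 2004, 2006),
    ("B", "Y", 2006, 2007),
]
links = co_worker_links(stints)
print(links)
```

In this toy data, A and B overlap at two different firms, so their link gets a larger weight than a single shared stint would. A weight like this could serve as an edge attribute in a relational model, though how the paper actually scores such associations is not stated here.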
Pages: 941 / +
Page count: 2