Relational Data Pre-Processing Techniques for Improved Securities Fraud Detection

Cited by: 0
Authors
Fast, Andrew [1 ]
Friedland, Lisa [1 ]
Maier, Marc [1 ]
Taylor, Brian [1 ]
Jensen, David [1 ]
Goldberg, Henry G. [2 ]
Komoroske, John [2 ]
Affiliations
[1] Univ Massachusetts, Dept Comp Sci, Amherst, MA 01003 USA
[2] Natl Associat Secur Dealers, Washington, DC 20006 USA
Funding
National Science Foundation (USA);
Keywords
Fraud detection; data pre-processing; statistical relational learning; normalization; relational probability trees;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Commercial datasets are often large, relational, and dynamic. They contain many records of people, places, things, events and their interactions over time. Such datasets are rarely structured appropriately for knowledge discovery, and they often contain variables whose meanings change across different subsets of the data. We describe how these challenges were addressed in a collaborative analysis project undertaken by the University of Massachusetts Amherst and the National Association of Securities Dealers (NASD). We describe several methods for data preprocessing that we applied to transform a large, dynamic, and relational dataset describing nearly the entirety of the U.S. securities industry, and we show how these methods made the dataset suitable for learning statistical relational models. To better utilize social structure, we first applied known consolidation and link formation techniques to associate individuals with branch office locations. In addition, we developed an innovative technique to infer professional associations by exploiting dynamic employment histories. Finally, we applied normalization techniques to create a suitable class label that adjusts for spatial, temporal, and other heterogeneity within the data. We show how these pre-processing techniques combine to provide the necessary foundation for learning high-performing statistical models of fraudulent activity.
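As a rough illustration of what inferring professional associations from employment histories might look like, the sketch below links two people when their tenures overlapped at several employers. It is not the authors' technique, which the abstract does not specify; the record layout, the `min_shared` threshold, and all names and data are hypothetical assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical record layout: (person, employer, start_year, end_year),
# where end_year is exclusive. None of this comes from the paper.


def shared_employers(histories):
    """Map each pair of people to the employers where their tenures overlapped."""
    by_employer = defaultdict(list)
    for person, employer, start, end in histories:
        by_employer[employer].append((person, start, end))

    pairs = defaultdict(set)
    for employer, stints in by_employer.items():
        for (p1, s1, e1), (p2, s2, e2) in combinations(stints, 2):
            # Two stints overlap when the later start precedes the earlier end.
            if p1 != p2 and max(s1, s2) < min(e1, e2):
                pairs[tuple(sorted((p1, p2)))].add(employer)
    return pairs


def inferred_associations(histories, min_shared=2):
    """Keep only pairs that overlapped at `min_shared` or more employers."""
    return {
        pair: employers
        for pair, employers in shared_employers(histories).items()
        if len(employers) >= min_shared
    }


# Toy data, invented for illustration only.
histories = [
    ("ann", "firm_a", 2000, 2003),
    ("bob", "firm_a", 2001, 2003),
    ("ann", "firm_b", 2003, 2006),
    ("bob", "firm_b", 2003, 2005),
    ("cat", "firm_b", 2004, 2006),
]
associations = inferred_associations(histories)
```

Requiring overlap at more than one employer is a deliberate choice in this sketch: a single shared employer is common by chance in a large firm, whereas repeated co-employment is a stronger hint that two people move together.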
Pages: 941 / +
Page count: 2