A Supervised Learning Model for High-Dimensional and Large-Scale Data

Cited by: 25
Authors
Peng, Chong [1]
Cheng, Jie [2]
Cheng, Qiang [1]
Affiliations
[1] Southern Illinois Univ, Dept Comp Sci, Carbondale, IL 62901 USA
[2] Univ Hawaii, Dept Comp Sci & Engn, Hilo, HI 96720 USA
Funding
National Science Foundation (USA)
Keywords
Discriminative regression; supervised learning; classification; high dimension; large-scale data; Newton method
DOI
10.1145/2972957
Chinese Library Classification (CLC) Number
TP18 [Artificial intelligence theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
We introduce a new supervised learning model using a discriminative regression approach. The model estimates a regression vector that represents the similarity between a test example and the training examples while integrating class information into the similarity estimation. This distinguishes our model from standard regression models and locally linear embedding approaches, and makes our method suitable for supervised learning problems in high-dimensional settings. The model is easily extended to account for nonlinear relationships and is applicable to general data, both high- and low-dimensional. Its objective function is convex, and we provide two optimization algorithms for it. These two optimization approaches yield two scalable solvers with mathematically provable linear time complexity. Experimental results verify the effectiveness of the proposed method on various kinds of data. For example, compared with several widely used classifiers, our method shows comparable performance on low-dimensional data and superior performance on high-dimensional data; the linear solvers also obtain promising performance on large-scale classification.
Pages: 23
Related Papers
50 in total
  • [31] Text Relevance Analysis Method over Large-Scale High-Dimensional Text Data Processing
    Wang, Ling
    Ding, Wei
    Zhou, Tie Hua
    Ryu, Keun Ho
    COMPUTATIONAL COLLECTIVE INTELLIGENCE (ICCCI 2015), PT I, 2015, 9329 : 371 - 379
  • [32] High-dimensional large-scale mixed-type data imputation under missing at random
    Liu, Wei
    Li, Guizhen
    Zhou, Ling
    Luo, Lan
    SCIENCE CHINA-MATHEMATICS, 2025, 68 (04) : 969 - 1000
  • [33] High-dimensional large-scale mixed-type data imputation under missing at random
    Wei Liu
    Guizhen Li
    Ling Zhou
    Lan Luo
    Science China(Mathematics), 2025, 68 (04) : 969 - 1000
  • [34] An Interactive Visual Testbed System for Dimension Reduction and Clustering of Large-scale High-dimensional Data
    Choo, Jaegul
    Lee, Hanseung
    Liu, Zhicheng
    Stasko, John
    Park, Haesun
    VISUALIZATION AND DATA ANALYSIS 2013, 2013, 8654
  • [35] Multi-aspect visual analytics on large-scale high-dimensional cyber security data
    Chen, Victor Y.
    Razip, Ahmad M.
    Ko, Sungahn
    Qian, Cheryl Z.
    Ebert, David S.
    INFORMATION VISUALIZATION, 2015, 14 (01) : 62 - 75
  • [36] Data Independent Method of Constructing Distributed LSH for Large-Scale Dynamic High-Dimensional Indexing
    Gu, Xiaoguang
    Zhang, Lei
    Zhang, Dongming
    Zhang, Yongdong
    Li, Jintao
    Bao, Ning
    2012 IEEE 14TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS & 2012 IEEE 9TH INTERNATIONAL CONFERENCE ON EMBEDDED SOFTWARE AND SYSTEMS (HPCC-ICESS), 2012 : 564 - 571
  • [37] Copula-Based Anomaly Scoring and Localization for Large-Scale, High-Dimensional Continuous Data
    Horvath, Gabor
    Kovacs, Edith
    Molontay, Roland
    Novaczki, Szabolcs
    ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2020, 11 (03)
  • [38] An efficient similarity join approach on large-scale high-dimensional data using random projection
    Ma, Youzhong
    Zhang, Ruiling
    Jia, Shijie
    Zhang, Yongxin
    Meng, Xiaofeng
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2019, 31 (20)
  • [39] A fast classification strategy for SVM on the large-scale high-dimensional datasets
    Li, I-Jing
    Wu, Jiunn-Lin
    Yeh, Chih-Hung
    PATTERN ANALYSIS AND APPLICATIONS, 2018, 21 (04) : 1023 - 1038
  • [40] LARGE-SCALE PARALLEL SIMULATION OF HIGH-DIMENSIONAL AMERICAN OPTION PRICING
    Chang Hong-xu
    Lu Zhong-hua
    Chi Xue-bin
    DCABES 2009: THE 8TH INTERNATIONAL SYMPOSIUM ON DISTRIBUTED COMPUTING AND APPLICATIONS TO BUSINESS, ENGINEERING AND SCIENCE, PROCEEDINGS, 2009 : 127 - 132