A Supervised Learning Model for High-Dimensional and Large-Scale Data

Cited by: 25
Authors
Peng, Chong [1]
Cheng, Jie [2]
Cheng, Qiang [1]
Affiliations
[1] Southern Illinois Univ, Dept Comp Sci, Carbondale, IL 62901 USA
[2] Univ Hawaii, Dept Comp Sci & Engn, Hilo, HI 96720 USA
Funding
U.S. National Science Foundation
Keywords
Discriminative regression; supervised learning; classification; high dimension; large-scale data; NEWTON METHOD; CLASSIFICATION;
DOI
10.1145/2972957
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We introduce a new supervised learning model using a discriminative regression approach. The model estimates a regression vector that represents the similarity between a test example and the training examples while seamlessly integrating class information into the similarity estimation. This distinguishes our model from usual regression models and locally linear embedding approaches, rendering our method suitable for supervised learning problems in high-dimensional settings. Our model is easily extensible to account for nonlinear relationships and is applicable to general data, including both high- and low-dimensional data. The objective function of the model is convex, and two optimization algorithms are provided for it. These two optimization approaches induce two scalable solvers with mathematically provable linear time complexity. Experimental results verify the effectiveness of the proposed method on various kinds of data. For example, compared with several widely used classifiers, our method shows comparable performance on low-dimensional data and superior performance on high-dimensional data; the linear solvers also obtain promising performance on large-scale classification.
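The abstract does not state the model's objective function, so the following is only a rough sketch of the general idea it describes: express a test example as a regularized regression over the training examples, then use class labels to decide which class explains the example best. It is a generic ridge-based, class-wise-residual classifier, not the authors' discriminative regression model or either of their solvers. The function names, the penalty `lam`, and the residual-based decision rule are all assumptions made for illustration.

```python
import numpy as np

def fit_representation(X_train, x_test, lam=0.1):
    """Ridge regression of x_test onto the training examples.

    Solves min_w ||x_test - X_train.T @ w||^2 + lam * ||w||^2 in closed form.
    ASSUMPTION: a plain ridge penalty; the paper's objective is not given here.
    """
    n = X_train.shape[0]
    gram = X_train @ X_train.T      # (n, n) inner products between training examples
    rhs = X_train @ x_test          # (n,) inner products with the test example
    return np.linalg.solve(gram + lam * np.eye(n), rhs)

def classify(X_train, y_train, x_test, lam=0.1):
    """Assign the class whose training examples best reconstruct x_test.

    ASSUMPTION: class information enters only through class-wise residuals,
    which is a simplification of what the abstract describes.
    """
    w = fit_representation(X_train, x_test, lam)
    best_label, best_residual = None, np.inf
    for label in np.unique(y_train):
        mask = y_train == label
        # Reconstruct x_test using only the coefficients of this class.
        recon = X_train[mask].T @ w[mask]
        residual = np.linalg.norm(x_test - recon)
        if residual < best_residual:
            best_label, best_residual = label, residual
    return best_label

# Toy data: class 0 lies near the first axis, class 1 near the second.
X_train = np.array([[1.0, 0.0, 0.0],
                    [0.9, 0.1, 0.0],
                    [0.0, 1.0, 0.0],
                    [0.0, 0.9, 0.1]])
y_train = np.array([0, 0, 1, 1])

pred_a = classify(X_train, y_train, np.array([1.0, 0.05, 0.0]))
pred_b = classify(X_train, y_train, np.array([0.05, 1.0, 0.0]))
```

Solving the n-by-n system directly costs cubic time in the number of training examples, so this sketch does not reflect the linear-time solvers the abstract claims; it only illustrates the similarity-vector idea on a toy problem.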
Pages: 23
Related papers
50 entries in total
  • [21] AliGater: a framework for the development of bioinformatic pipelines for large-scale, high-dimensional cytometry data
    Ekdahl, Ludvig
    Arrizabalaga, Antton Lamarca
    Ali, Zain
    Cafaro, Caterina
    de Lapuente Portilla, Aitzkoa Lopez
    Nilsson, Bjorn
    NEURO-ONCOLOGY ADVANCES, 2023, 5 (01)
  • [22] Large-Scale Automatic Feature Selection for Biomarker Discovery in High-Dimensional OMICs Data
    Leclercq, Mickael
    Vittrant, Benjamin
    Martin-Magniette, Marie Laure
    Boyer, Marie Pier Scott
    Perin, Olivier
    Bergeron, Alain
    Fradet, Yves
    Droit, Arnaud
    FRONTIERS IN GENETICS, 2019, 10
  • [23] Grid-based indexing and search algorithms for large-scale and high-dimensional data
    Yang, Chuanfu
    Li, Zhiyang
    Qu, Wenyu
    Liu, Zhaobin
    Qi, Heng
    2017 14TH INTERNATIONAL SYMPOSIUM ON PERVASIVE SYSTEMS, ALGORITHMS AND NETWORKS & 2017 11TH INTERNATIONAL CONFERENCE ON FRONTIER OF COMPUTER SCIENCE AND TECHNOLOGY & 2017 THIRD INTERNATIONAL SYMPOSIUM OF CREATIVE COMPUTING (ISPAN-FCST-ISCC), 2017, : 46 - 51
  • [24] Supervised model-based visualization of high-dimensional data
    Kontkanen, Petri
    Lahtinen, Jussi
    Myllymäki, Petri
    Silander, Tomi
    Tirri, Henry
    Intelligent Data Analysis, 2000, 4 (3-4) : 213 - 227
  • [25] Batched Large-scale Bayesian Optimization in High-dimensional Spaces
    Wang, Zi
    Gehring, Clement
    Kohli, Pushmeet
    Jegelka, Stefanie
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 84, 2018, 84
  • [26] Parallel algorithms for clustering high-dimensional large-scale datasets
    Nagesh, H
    Goil, S
    Choudhary, A
    DATA MINING FOR SCIENTIFIC AND ENGINEERING APPLICATIONS, 2001, 2 : 335 - 356
  • [27] Distributed Methods for High-dimensional and Large-scale Tensor Factorization
    Shin, Kijung
    Kang, U.
    2014 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2014, : 989 - 994
  • [28] High-Dimensional Signature Compression for Large-Scale Image Classification
    Sanchez, Jorge
    Perronnin, Florent
    2011 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2011, : 1665 - 1672
  • [29] ZeitZeiger: supervised learning for high-dimensional data from an oscillatory system
    Hughey, Jacob J.
    Hastie, Trevor
    Butte, Atul J.
    NUCLEIC ACIDS RESEARCH, 2016, 44 (08) : e80
  • [30] BSSReduce an O(|U|) Incremental Feature Selection Approach for Large-Scale and High-Dimensional Data
    Gong, Ke
    Wang, Yong
    Xu, Maozeng
    Xiao, Zhi
    IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2018, 26 (06) : 3356 - 3367