Exact Algorithms for Two Quadratic Euclidean Problems of Searching for the Largest Subset and Longest Subsequence

被引:1
作者
Kel'manov, Alexander [1 ,2 ]
Khamidullin, Sergey [1 ]
Khandeev, Vladimir [1 ,2 ]
Pyatkin, Artem [1 ,2 ]
机构
[1] Sobolev Inst Math, 4 Koptyug Ave, Novosibirsk 630090, Russia
[2] Novosibirsk State Univ, 2 Pirogova St, Novosibirsk 630090, Russia
来源
LEARNING AND INTELLIGENT OPTIMIZATION, LION 12 | 2019年 / 11353卷
基金
俄罗斯基础研究基金会; 俄罗斯科学基金会;
关键词
Euclidean space; Largest set; Longest subsequence; Quadratic variation; NP-hard problem; Integer coordinates; Exact algorithm; Fixed space dimension; Pseudopolynomial time; TIME-SERIES DATA; CLUSTER-ANALYSIS;
D O I
10.1007/978-3-030-05348-2_28
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
The following two strongly NP-hard problems are considered. In the first problem, we need to find in the given finite set of points in Euclidean space the subset of largest size such that the sum of squared distances between the elements of this subset and its unknown centroid (geometrical center) does not exceed a given percentage of the sum of squared distances between the elements of the input set and its centroid. In the second problem, the input is a sequence (not a set) and we have some additional constraints on the indices of the elements of the chosen subsequence under the same restriction on the sum of squared distances as in the first problem. Both problems can be treated as data editing problems aimed to find similar elements and removal of extraneous (dissimilar) elements. We propose exact algorithms for the cases of both problems in which the input points have integer-valued coordinates. If the space dimension is bounded by some constant, our algorithms run in a pseudopolynomial time. Some results of numerical experiments illustrating the performance of the algorithms are presented.
引用
收藏
页码:326 / 336
页数:11
相关论文
共 27 条
[1]   Approximation polynomial algorithm for the data editing and data cleaning problem [J].
Ageeva A.A. ;
Kel’manov A.V. ;
Pyatkin A.V. ;
Khamidullin S.A. ;
Shenmaier V.V. .
Pattern Recognition and Image Analysis, 2017, 27 (03) :365-370
[2]   FINDING K-POINTS WITH MINIMUM DIAMETER AND RELATED PROBLEMS [J].
AGGARWAL, A ;
IMAI, H ;
KATOH, N ;
SURI, S .
JOURNAL OF ALGORITHMS, 1991, 12 (01) :38-56
[3]  
Aggarwal C.C., 2015, Data mining: The Textbook, DOI [10.1007/978-3-319-14142-8, DOI 10.1007/978-3-319-14142-8]
[4]  
BISHOP C. M., 2006, Pattern recognition and machine learning, DOI [DOI 10.1117/1.2819119, 10.1007/978-0-387-45528-0]
[5]  
De Waal T., 2011, Handbook of Statistical Data Editing and Imputation, DOI 10.1002/9780470904848
[6]  
Farcomeni A., 2015, Robust methods for data reduction
[7]   A review on time series data mining [J].
Fu, Tak-chung .
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2011, 24 (01) :164-181
[8]  
Goodfellow I, 2016, ADAPT COMPUT MACH LE, P1
[9]   Cluster analysis and mathematical programming [J].
Hansen, P ;
Jaumard, B .
MATHEMATICAL PROGRAMMING, 1997, 79 (1-3) :191-215
[10]  
Hastie T., 2009, ELEMENTS STAT LEARNI, DOI 10.1007/978-0-387-84858-7