Clustering-Based Predictive Analytics to Improve Scientific Data Discovery

Cited by: 1
Authors
Devarakonda, Ranjeet [1]
Kumar, Jitendra [1]
Prakash, Giri [1]
Affiliations
[1] Oak Ridge Natl Lab, Environm Sci Div, Oak Ridge, TN 37830 USA
Source
2020 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA) | 2020
Keywords
clustering; content-based filtering; collaborative filtering; data recommended system; data discovery
DOI
10.1109/BigData50022.2020.9377797
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Given the sheer volume of scientific data archived within the data-intensive projects at the US Department of Energy's Oak Ridge National Laboratory, finding precisely what data we are looking for may not be a trivial task; conversely, we may also miss a more prominent data product. To address such issues, we propose improving the data discovery system and using data analytics methods to comprehend what specific users might be interested in based on their physiological state, search patterns, and past data usage history. This work's primary goal is to prune the complexity, increase the visibility of popular data products, and direct users toward the data that best meet their needs. The proposed algorithm constructs a user profile based on the user's explicit or implicit interactions with the system, such as items they are currently looking at on-site and the key metadata mappings related to the data set. The pattern is then used to build a training data set, which will help find relevant data to recommend to the user.
Pages: 5658-5661
Number of pages: 4