Clustering-Based Predictive Analytics to Improve Scientific Data Discovery

Cited by: 1
Authors
Devarakonda, Ranjeet [1]
Kumar, Jitendra [1]
Prakash, Giri [1]
Affiliations
[1] Oak Ridge Natl Lab, Environm Sci Div, Oak Ridge, TN 37830 USA
Source
2020 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA) | 2020
Keywords
clustering; content-based filtering; collaborative filtering; data recommended system; data discovery;
DOI
10.1109/BigData50022.2020.9377797
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Given the sheer volume of scientific data archived within the data-intensive projects at the US Department of Energy's Oak Ridge National Laboratory, finding precisely what data we are looking for may not be a trivial task; conversely, we may also miss a more prominent data product. To address such issues, we propose improving the data discovery system and using data analytics methods to comprehend what specific users might be interested in based on their physiological state, search patterns, and past data usage history. This work's primary goal is to prune the complexity, increase the visibility of popular data products, and direct users toward the data that best meet their needs. The proposed algorithm constructs a user profile based on the user's explicit or implicit interactions with the system, such as items they are currently looking at on-site and the key metadata mappings related to the data set. The pattern is then used to build a training data set, which will help find relevant data to recommend to the user.
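The profile-then-recommend idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it substitutes plain content-based filtering with cosine similarity over metadata keywords, and the catalog, data set identifiers, and function names (`CATALOG`, `build_profile`, `cosine`, `recommend`) are all hypothetical.

```python
from collections import Counter
from math import sqrt

# Hypothetical catalog: data set id -> metadata keywords (illustrative only).
CATALOG = {
    "ds_precip": ["precipitation", "radar", "atmosphere"],
    "ds_cloud": ["cloud", "radar", "atmosphere"],
    "ds_soil": ["soil", "carbon", "ecosystem"],
    "ds_flux": ["carbon", "flux", "ecosystem"],
}


def build_profile(viewed_ids, catalog):
    """Sum the metadata keywords of every data set the user interacted with."""
    profile = Counter()
    for ds_id in viewed_ids:
        profile.update(catalog[ds_id])
    return profile


def cosine(a, b):
    """Cosine similarity between two sparse keyword-count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(viewed_ids, catalog, top_n=2):
    """Rank unseen data sets by similarity to the user's profile."""
    profile = build_profile(viewed_ids, catalog)
    scores = {
        ds_id: cosine(profile, Counter(keywords))
        for ds_id, keywords in catalog.items()
        if ds_id not in viewed_ids
    }
    # Highest score first; ties are broken alphabetically by id.
    return sorted(scores, key=lambda ds_id: (-scores[ds_id], ds_id))[:top_n]
```

In this toy setup, a user who has viewed only `ds_precip` would be pointed first to `ds_cloud`, because the two share the `radar` and `atmosphere` keywords. A clustering step, as the title suggests, could group users or data sets before scoring, but that is outside this sketch.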
Pages: 5658-5661
Page count: 4
Related papers
50 in total
[21]   A Learning Automata-Based Approach to Improve the Scalability of Clustering-Based Recommender Systems [J].
Taghipour, Sara ;
Torkestani, Javad Akbari ;
Nazari, Sara .
CYBERNETICS AND SYSTEMS, 2024, 55 (07) :1562-1593
[22]   Detecting Data Accuracy Issues in Textual Geographical Data by a Clustering-based Approach [J].
Pellegrino, Maria Angela ;
Postiglione, Luca ;
Scarano, Vittorio .
CODS-COMAD 2021: PROCEEDINGS OF THE 3RD ACM INDIA JOINT INTERNATIONAL CONFERENCE ON DATA SCIENCE & MANAGEMENT OF DATA (8TH ACM IKDD CODS & 26TH COMAD), 2021, :208-212
[23]   A Clustering-based Recommendation System [J].
Wu, Shaofei .
PROCEEDINGS OF 2008 INTERNATIONAL PRE-OLYMPIC CONGRESS ON COMPUTER SCIENCE, VOL I: COMPUTER SCIENCE AND ENGINEERING, 2008, :328-330
[24]   Automated Clustering for Data Analytics [J].
Byrnes, Paul E. .
JOURNAL OF EMERGING TECHNOLOGIES IN ACCOUNTING, 2019, 16 (02) :43-58
[25]   Let's Summarize Scientific Documents! A Clustering-Based Approach via Citation Context [J].
Mishra, Santosh Kumar ;
Saini, Naveen ;
Saha, Sriparna ;
Bhattacharyya, Pushpak .
NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS (NLDB 2021), 2021, 12801 :330-339
[26]   Clustering-Based Federated Learning for Enhancing Data Privacy in Internet of Vehicles [J].
Jin, Zilong ;
Wang, Jin ;
Zhang, Lejun .
KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, 2024, 18 (06) :1462-1477
[27]   Efficient clustering-based data aggregation techniques for wireless sensor networks [J].
Jung, Woo-Sung ;
Lim, Keun-Woo ;
Ko, Young-Bae ;
Park, Sang-Joon .
WIRELESS NETWORKS, 2011, 17 (05) :1387-1400
[28]   ClubCF: A Clustering-Based Collaborative Filtering Approach for Big Data Application [J].
Hu, Rong ;
Dou, Wanchun ;
Liu, Jianxun .
IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING, 2014, 2 (03) :302-313
[29]   Unsupervised Clustering-Based Analysis of the Measured Eye-Tracking Data [J].
Ivanova, Lenka ;
Laco, Miroslav ;
Benesova, Wanda .
FOURTEENTH INTERNATIONAL CONFERENCE ON MACHINE VISION (ICMV 2021), 2022, 12084
[30]   An Interactive Clustering-Based Visualization Tool for Air Quality Data Analysis [J].
Ashouri, Mahsa ;
Phoa, Frederick Kin Hing ;
Chen, Chun-Houh ;
Shmueli, Galit .
AEROSOL AND AIR QUALITY RESEARCH, 2023, 23 (12)