Integrated Digital Library System for Long Documents and their Elements

Cited by: 4
Authors
Chekuri, Satvik [1]
Chandrasekar, Prashant [2]
Banerjee, Bipasha [1]
Park, Sung Hee [1]
Masrourisaadat, Nila [1]
Ahuja, Aman [1]
Ingram, William A. [1]
Fox, Edward A. [1]
Affiliations
[1] Virginia Tech, Blacksburg, VA 24061 USA
[2] Univ Mary Washington, Fredericksburg, VA USA
Source
2023 ACM/IEEE Joint Conference on Digital Libraries (JCDL), 2023
Keywords
Digital Library; Information System; Information Retrieval; Deep Learning; NLP; Recommender Systems; Workflow
DOI
10.1109/JCDL57899.2023.00012
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
We describe a next-generation integrated Digital Library (DL) system that addresses the numerous goals associated with long documents such as Electronic Theses and Dissertations (ETDs). Our extensible workflow-centric design supports a variety of users/personas (e.g., researchers, curators, and experimenters) who can benefit from improved access to ETDs and the content buried therein. Our approach leverages natural language processing, deep learning, information retrieval, and software engineering methods. The services cover ingesting, storing, curating, analyzing, detecting, extracting, classifying, summarizing, topic modeling, browsing, searching, retrieving, recommending, visualizing/reporting, and interacting with ETDs and derivative text/image-based elements/objects. Workflows connect the services and their APIs, along with UI-based access. We believe our approach can guide others to combine tailored user support, research, and education by way of extensible DLs.
Pages: 13-24
Page count: 12