Towards Understanding Contracts Grammar: A Large Language Model-based Extractive Question-Answering Approach

Cited by: 0
Authors
Rejithkumar, Gokul [1]
Anish, Preethu Rose [1]
Ghaisas, Smita [1]
Affiliations
[1] TCS Res, Pune, India
Source
32ND IEEE INTERNATIONAL REQUIREMENTS ENGINEERING CONFERENCE, RE 2024 | 2024
Keywords
text extraction; deep learning; natural language processing; large language models; question-answering; token classification; text-to-text generation; prompting; empirical research
DOI
10.1109/RE59067.2024.00037
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Software Engineering (SE) contracts play a pivotal role in Information Technology Outsourcing (ITO) projects. The obligations in SE contracts are known to be a useful source for deriving software requirements, thereby contributing to the overall Software Development Life Cycle (SDLC). Making sense of contractual obligations is an important first step in successfully executing software projects. This includes building compliant systems, meeting delivery deadlines, avoiding heavy penalties, and steering clear of expensive litigation. In this work, we present an approach to capture the essence of a contractual clause by extracting its Contracts Grammar. Through an exploratory study, we first identify the constituents of Contracts Grammar. Subsequently, we experiment with multiple approaches for the automated extraction of these constituents, including extractive question-answering, token classification, text-to-text generation, prompting, and regular expressions. The question-answering based approach performed best, achieving a high average ROUGE-L score of 0.81 together with faster inference times. The work presented in this paper is part of the Contracts Governance System (CGS) and is in the process of deployment within a large IT vendor organization.
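The ROUGE-L score quoted in the abstract measures overlap between an extracted span and a reference span through their longest common subsequence (LCS). The following is a minimal sketch of that metric, not the authors' evaluation code: the whitespace tokenization, the lowercasing, the balanced F1 variant, and the function names `lcs_length` and `rouge_l_f1` are all assumptions made for illustration, since the abstract does not state these details.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    # Row-by-row dynamic programming; only the previous row is kept.
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """ROUGE-L as a balanced F1 over LCS-based precision and recall.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    ref_tokens = reference.lower().split()
    cand_tokens = candidate.lower().split()
    if not ref_tokens or not cand_tokens:
        return 0.0
    lcs = lcs_length(ref_tokens, cand_tokens)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand_tokens)  # share of the candidate that matches
    recall = lcs / len(ref_tokens)      # share of the reference that is covered
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical clause fragments, invented for this example.
    reference = "the vendor shall deliver the software"
    candidate = "vendor shall deliver software"
    print(round(rouge_l_f1(reference, candidate), 2))
```

In this invented example the LCS has 4 tokens, so precision is 4/4 and recall is 4/6, which gives an F1 of 0.8. Published ROUGE implementations differ in tokenization, stemming, and the weighting of recall against precision, so scores from this sketch are not directly comparable to the reported 0.81.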
Pages: 310-320
Number of pages: 11
Related papers
50 in total
  • [31] Developing and Pre-Processing a Dataset using a Rhetorical Relation to Build a Question-Answering System based on an Unsupervised Learning Approach
    Dutta, Ashit Kumar
    Sait, Abdul Rahaman Wahab
    Keshta, Ismail Mohamed
    Elhalles, Abheer
    INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2021, 21 (11): : 199 - 206
  • [32] OpenMedLM: prompt engineering can out-perform fine-tuning in medical question-answering with open-source large language models
    Maharjan, Jenish
    Garikipati, Anurag
    Singh, Navan Preet
    Cyrus, Leo
    Sharma, Mayank
    Ciobanu, Madalina
    Barnes, Gina
    Thapa, Rahul
    Mao, Qingqing
    Das, Ritankar
    SCIENTIFIC REPORTS, 2024, 14 (01):
  • [34] Slit Lamp Report Generation and Question Answering: Development and Validation of a Multimodal Transformer Model with Large Language Model Integration
    Zhao, Ziwei
    Zhang, Weiyi
    Chen, Xiaolan
    Song, Fan
    Gunasegaram, James
    Huang, Wenyong
    Shi, Danli
    He, Mingguang
    Liu, Na
    JOURNAL OF MEDICAL INTERNET RESEARCH, 2024, 26
  • [35] Text Matching in Insurance Question-Answering Community Based on an Integrated BiLSTM-TextCNN Model Fusing Multi-Feature
    Li, Zhaohui
    Yang, Xueru
    Zhou, Luli
    Jia, Hongyu
    Li, Wenli
    ENTROPY, 2023, 25 (04)
  • [36] Large language model-based evolutionary optimizer: Reasoning with elitism
    Brahmachary, Shuvayan
    Joshi, Subodh M.
    Panda, Aniruddha
    Koneripalli, Kaushik
    Sagotra, Arun Kumar
    Patel, Harshil
    Sharma, Ankush
    Jagtap, Ameya D.
    Kalyanaraman, Kaushic
    NEUROCOMPUTING, 2025, 622
  • [37] Question Answering based Clinical Text Structuring Using Pre-trained Language Model
    Qiu, Jiahui
    Zhou, Yangming
    Ma, Zhiyuan
    Ruan, Tong
    Liu, Jinlin
    Sun, Jing
    2019 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2019, : 1596 - 1600
  • [38] KFEX-N: A table-text data question-answering model based on knowledge-fusion encoder and EX-N tree decoder
    Tao, Ye
    Liu, Jiawang
    Li, Hui
    Cao, Wenqian
    Qin, Xiugong
    Tian, Yunlong
    Du, Yongjie
    NEUROCOMPUTING, 2024, 593
  • [39] Enhancing Embedding Performance through Large Language Model-based Text Enrichment and Rewriting
    Harris, Nicholas
    Butani, Anand
    Hashmy, Syed
    ADVANCES IN ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING, 2024, 4 (02): : 2358 - 2368
  • [40] Intelligent question answering for water conservancy project inspection driven by knowledge graph and large language model collaboration
    Yang, Yangrui
    Chen, Sisi
    Zhu, Yaping
    Liu, Xuemei
    Pan, Shifeng
    Wang, Xin
    LHB-HYDROSCIENCE JOURNAL, 2024, 110 (01)