Data Augmentation with ChatGPT for Assessing Subject Alignment

Authors
Kontoghiorghes, Louisa [1]
Colubi, Ana [1]
Affiliations
[1] Kings Coll London, London, England
Keywords
Text mining; Pre-trained transformer; Tf-idf; Hypothesis testing; Data augmentation; TEXT;
DOI
10.1007/978-3-031-65993-5_26
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
As a statistical tool, topic modeling requires replication. Topics are identified as distributions over a set of words, and documents are described as mixtures of latent topics. Sometimes, text information is available only through a single document that covers well-defined subjects; without replication, the intuitive representation of the document as a mixture of latent topics cannot be derived. Nevertheless, the available document can potentially generate more knowledge (not yet available) that could assist in employing compelling text mining tools. The proposal is to use ChatGPT to mimic the process of generating such knowledge. To illustrate the approach, a research proposal is used as the initial document. The aim is to verify whether a given piece of research aligns with the subjects of the research proposal by combining text mining and statistical tools. A second case study analyses the alignment of a novel's chapters with the Gothic genre.
Pages: 217-224
Number of pages: 8