A Study on Hierarchical Text Classification as a Seq2seq Task

Cited by: 0
Authors
Torba, Fatos [1, 2]
Gravier, Christophe [2]
Laclau, Charlotte [3]
Kammoun, Abderrhammen [1]
Subercaze, Julien [1]
Affiliations
[1] AItenders, St Etienne, France
[2] CNRS, Lab Hubert Curien, UMR 5516, St Etienne, France
[3] Inst Polytech Paris, Telecom Paris, Paris, France
Source
ADVANCES IN INFORMATION RETRIEVAL, ECIR 2024, PT III | 2024, Vol. 14610
Keywords
Hierarchical text classification; generative model; reproducibility
DOI
10.1007/978-3-031-56063-7_20
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With the progress of generative neural models, Hierarchical Text Classification (HTC) can be cast as a generative task. In this case, given an input text, the model generates the sequence of predicted class labels taken from a label tree of arbitrary width and depth. Treating HTC as a generative task introduces multiple modeling choices. These choices range from the order in which the class tree is visited, which defines the order in which tokens are generated, to whether decoding is constrained to labels that respect the previous level's predictions, to the choice of the pre-trained Language Model itself. Each HTC model therefore differs from the others not only in its architecture but also in the modeling choices that were made. Prior contributions lack transparent modeling choices and open implementations, which makes it hard to assess whether model performance stems from architectural or modeling decisions. For these reasons, this paper analyzes the impact of different modeling choices, along with common model errors and successes for this task. The analysis is based on an open framework accompanying this paper that can facilitate the development of future contributions in the field by providing datasets, metrics, an error analysis toolkit, and the capability to readily test various modeling choices for a given model.
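Two of the modeling choices named in the abstract, the tree visiting order and constrained decoding, can be illustrated with a minimal Python sketch. This is not the paper's framework: the toy label tree `TREE` and the helpers `linearize` and `constrained_pick` are hypothetical names introduced here for illustration, and the scores stand in for whatever a generative model would assign to candidate labels.

```python
from collections import deque

# Hypothetical toy label tree (parent -> children); not taken from the paper.
TREE = {
    "ROOT": ["Science", "Sports"],
    "Science": ["Physics", "Biology"],
    "Sports": ["Tennis"],
    "Physics": [], "Biology": [], "Tennis": [],
}

def linearize(gold, tree, order="dfs", root="ROOT"):
    """Return the gold labels in the order a BFS or DFS traversal meets them.

    The resulting list is one possible target sequence for a seq2seq model.
    """
    frontier = deque([root])
    target = []
    while frontier:
        # BFS takes from the left (queue); DFS takes from the right (stack).
        node = frontier.popleft() if order == "bfs" else frontier.pop()
        if node != root and node in gold:
            target.append(node)
        children = tree[node]
        # For DFS, push children reversed so the leftmost child is visited first.
        frontier.extend(children if order == "bfs" else reversed(children))
    return target

def constrained_pick(scores, prev, tree, root="ROOT"):
    """Pick the best-scoring label among the children of the previous prediction.

    `scores` maps label -> score (assumed to come from a model). Labels that
    are not children of `prev` are never considered, which is the constraint.
    """
    candidates = tree[root if prev is None else prev]
    if not candidates:
        return None  # leaf reached: nothing left to generate on this path
    return max(candidates, key=lambda c: scores.get(c, float("-inf")))

gold = {"Science", "Physics", "Sports", "Tennis"}
bfs_target = linearize(gold, TREE, order="bfs")
dfs_target = linearize(gold, TREE, order="dfs")

# "Tennis" has the highest raw score but is not a child of "Science".
scores = {"Tennis": 0.9, "Physics": 0.5, "Biology": 0.2}
picked = constrained_pick(scores, "Science", TREE)
```

With the same gold labels, the breadth-first order groups labels by level while the depth-first order keeps each root-to-leaf path contiguous, so the two choices yield different target sequences for the same document. The constrained pick ignores the globally best-scoring label when it is not a child of the previous prediction.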
Pages: 287-296
Number of pages: 10