Maximum Lexical Cohesion for Fine-Grained News Story Segmentation

Authors
Liu, Zihan [1 ]
Xie, Lei [1 ]
Feng, Wei [2 ]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, Xian 710072, Peoples R China
[2] City Univ Hong Kong, Sch Creat Media, Hong Kong, Hong Kong, Peoples R China
Source
11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2 | 2010
Funding
National Natural Science Foundation of China;
Keywords
story segmentation; KL-divergence; lexical cohesion; word weighting; dynamic programming; spoken document segmentation; spoken document retrieval; TEXT;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We propose a maximum lexical cohesion (MLC) approach to news story segmentation. Unlike sentence-dependent lexical methods, our approach is able to detect story boundaries at finer word/subword granularity, and thus is more suitable for speech recognition transcripts which have no sentence delimiters. The proposed segmentation goodness measure takes account of both lexical cohesion and a prior preference of story length. We measure the lexical cohesion of a segment by the KL-divergence from its word distribution to an associated piecewise uniform distribution. Taking account of the uneven contributions of different words to a story, the cohesion measure is further refined by two word weighting schemes, i.e. the inverse document frequency (IDF) and a new weighting method called difference from expectation (DFE). We then propose a dynamic programming solution to exactly maximize the segmentation goodness and efficiently locate story boundaries in polynomial time. Experimental results show that our MLC approach outperforms several state-of-the-art lexical methods.
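The dynamic programming idea in the abstract can be illustrated with a minimal sketch. This is not the authors' formulation: the record does not give the piecewise uniform reference distribution, the IDF/DFE weights, or the story-length prior, so the code below substitutes a plain uniform distribution over the vocabulary and a constant per-segment penalty. The names `segment_score`, `segment`, and `length_penalty` are hypothetical.

```python
import math
from collections import Counter


def segment_score(tokens, vocab_size, length_penalty):
    """Stand-in goodness of one segment (an assumption, not the paper's measure).

    Cohesion is the KL-divergence from the segment's word distribution to a
    uniform distribution over the vocabulary, scaled by segment length.
    A constant penalty per segment plays the role of a length prior.
    """
    n = len(tokens)
    counts = Counter(tokens)
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    kl_to_uniform = math.log(vocab_size) - entropy
    return n * kl_to_uniform - length_penalty


def segment(tokens, length_penalty=2.0):
    """Return (interior boundary positions, best total score).

    best[end] holds the best score of any segmentation of tokens[:end];
    back[end] holds the start of the last segment in that segmentation.
    The search is exact for this additive score and runs in polynomial time.
    """
    n = len(tokens)
    vocab_size = len(set(tokens))
    best = [-math.inf] * (n + 1)
    back = [0] * (n + 1)
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(end):
            score = best[start] + segment_score(
                tokens[start:end], vocab_size, length_penalty
            )
            if score > best[end]:
                best[end] = score
                back[end] = start
    # Walk the back-pointers from the end to recover segment end positions.
    ends = []
    pos = n
while_guard = None  # placeholder removed below
```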
Pages: 1301 / +
Page count: 2