Machine Learning Based Text Summarization for Turkish News

Cited by: 2
Authors
Kartal, Yavuz Selim [1]
Kutlu, Mucahid [1]
Affiliations
[1] TOBB Ekon & Teknol Univ, Bilgisayar Muhendisligi Bolumu, Ankara, Turkey
Source
2020 28TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU) | 2020
Keywords
Text Summarization; Machine Learning
DOI
10.1109/siu49456.2020.9302096
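Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification codes
0808; 0809
Abstract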
In this paper, we propose an automatic text summarization model for Turkish news articles based on machine learning. The model uses sentence position, speech expression, the presence of named entities and statements, term frequency, and title similarity as features. We construct and share a new dataset for Turkish text summarization. Our experiments show that every feature we use improves the performance of the system and that the model outperforms a baseline based on latent semantic analysis.
Pages: 4
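
The abstract names several sentence-level features without defining them. The sketch below illustrates how three of them (sentence position, term frequency, and title similarity) might be computed before being passed to a classifier. The function names and the exact feature definitions are assumptions made for illustration, not the definitions used in the paper; the named-entity and speech-expression features are omitted because they need resources the abstract does not specify.

```python
import re
from collections import Counter


def tokenize(text):
    # Split on Unicode word characters first, then lowercase each token.
    # Caveat: str.lower() is not Turkish-locale-aware (dotted/dotless I),
    # so a real system would need proper Turkish normalization.
    return [tok.lower() for tok in re.findall(r"\w+", text)]


def sentence_features(title, sentences):
    """Return one feature dict per sentence (hypothetical definitions)."""
    title_tokens = set(tokenize(title))
    tokenized = [tokenize(s) for s in sentences]
    doc_counts = Counter(tok for toks in tokenized for tok in toks)
    total = sum(doc_counts.values()) or 1
    n = len(sentences)

    features = []
    for i, toks in enumerate(tokenized):
        uniq = set(toks)
        # Position: 1.0 for the first sentence, decreasing toward the last.
        position = 1.0 - i / n
        # Term frequency: mean relative document frequency of the tokens.
        tf = sum(doc_counts[t] for t in toks) / (len(toks) * total) if toks else 0.0
        # Title similarity: Jaccard overlap of sentence and title token sets.
        union = uniq | title_tokens
        title_sim = len(uniq & title_tokens) / len(union) if union else 0.0
        features.append(
            {"position": position, "term_frequency": tf, "title_similarity": title_sim}
        )
    return features


if __name__ == "__main__":
    demo = sentence_features(
        "Ankara hosts summit",
        ["Ankara hosts a summit today.", "Leaders will meet.", "The summit ends Friday."],
    )
    for row in demo:
        print(row)
```

Each dict could serve as one row of a feature matrix for a supervised sentence classifier; which learning algorithm to use is left open here, since the abstract does not name one.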