Metadata Generation for Multi-Text Classification in Structured Data

Cited by: 0
Authors
Trejo, Karla [1]
Garcia, Pere [1]
Puyol-Gruart, Josep [1]
Affiliations
[1] IIIA CSIC, UAB Campus, E-08193 Bellaterra, Catalonia, Spain
Source
ARTIFICIAL INTELLIGENCE RESEARCH AND DEVELOPMENT | 2019 / Vol. 319
Keywords
text analysis; text mining; data formatting; multi-text classification; topology; metadata; structured data;
DOI
10.3233/FAIA190154
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In today's information-saturated world, text analysis has become an indispensable resource for extracting useful data from massive amounts of text. A large portion of this information is unstructured. This has created a need for methodologies, such as Named Entity Recognition (NER), Part-of-Speech (PoS) Tagging, N-grams, and Term Frequency - Inverse Document Frequency (TF-IDF), which can read and understand information based on its meaning, context and linguistic cohesion. However, these approaches on their own fall short when applied to already structured data. This paper proposes generating metadata that can simultaneously provide situational information from structured text data. The abstraction of text as a "group of concepts" can boost the relevance of a word in a collection of documents, which allows a more refined separation of classes and better performance in multi-text classification tasks.
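As a rough illustration of the claim that situational metadata can boost a word's relevance, the sketch below compares plain TF-IDF weights with weights computed after each word is tagged with the field it came from. This is not the authors' method: the field-prefix tagging scheme, the toy records, and the helper names (tf_idf, plain_tokens, tagged_tokens) are all assumptions made for illustration, and the TF-IDF variant is the unsmoothed textbook form.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document (unsmoothed TF-IDF)."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return weights


# Hypothetical structured records: each maps a field name to its text.
records = [
    {"title": "apple pie", "brand": "acme"},
    {"title": "pear tart", "brand": "apple"},
]


def plain_tokens(record):
    # Ignore the structure: pool the words of every field.
    return [w for text in record.values() for w in text.split()]


def tagged_tokens(record):
    # Assumed metadata scheme: prefix each word with its field name.
    return [f"{field}:{w}" for field, text in record.items() for w in text.split()]


plain = tf_idf([plain_tokens(r) for r in records])
tagged = tf_idf([tagged_tokens(r) for r in records])

print(plain[0]["apple"])         # "apple" occurs in both records, so its IDF is zero
print(tagged[0]["title:apple"])  # the tagged term is unique to the first record
```

In this toy case the word "apple" appears in every record and carries no weight under plain TF-IDF, while the tagged terms "title:apple" and "brand:apple" each identify a single record and receive a positive weight.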
Pages: 417-421
Page count: 5