Towards Multilingual Automated Classification Systems

Cited by: 2
Authors
Musaev, Aibek [1]
Pu, Calton [2]
Affiliations
[1] Univ Alabama, Dept Comp Sci, Tuscaloosa, AL 35487 USA
[2] Georgia Inst Technol, Sch Comp Sci, Atlanta, GA 30332 USA
Funding
National Science Foundation (USA)
DOI
10.1109/ICDCS.2017.208
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
In this paper, we propose and evaluate three approaches for automated classification of texts in over 60 languages without the need for a manually annotated dataset in those languages. All approaches are based on the randomized Explicit Semantic Analysis method, using multilingual Wikipedia articles as their knowledge repository. We evaluate the proposed approaches by classifying a Twitter dataset in English and Portuguese into relevant and irrelevant items with respect to landslides as a natural disaster; the highest F1-score achieved is 0.93. These approaches can be used in applications that require multilingual classification, including multilingual disaster reporting that uses social media to improve coverage and increase confidence. As an illustration, we present a demonstration that combines data from physical sensors and social networks to detect landslide events reported in English and Portuguese.
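The general idea behind Explicit Semantic Analysis can be sketched in a few lines. The example below is a toy illustration only, not the authors' implementation: it uses plain ESA without the randomization step named in the abstract, and the concept texts, the `SEED` example, the 0.5 threshold, and all function names are assumptions made for this sketch. Each text is mapped to a vector of concept weights, and sharing concept keys across languages stands in for Wikipedia interlanguage links, so an English seed example can be compared with a Portuguese text.

```python
import math
from collections import Counter

# Toy stand-ins for Wikipedia articles (assumption: real ESA uses full articles).
# The same concept key in both languages mimics an interlanguage link.
CONCEPTS = {
    "en": {
        "Landslide": "landslide slope soil rock debris rain collapse",
        "Election": "election vote landslide victory candidate party",
        "Music": "song album band landslide lyrics guitar",
    },
    "pt": {
        "Landslide": "deslizamento encosta terra chuva lama desabamento",
        "Election": "eleição voto candidato partido vitória",
        "Music": "música álbum banda canção guitarra",
    },
}


def build_index(articles):
    """Return {word: {concept: tf-idf weight}} for one language."""
    counts = {c: Counter(text.split()) for c, text in articles.items()}
    n = len(articles)
    index = {}
    for concept, tf in counts.items():
        for word, freq in tf.items():
            df = sum(1 for other in counts.values() if word in other)
            idf = math.log((1 + n) / (1 + df)) + 1.0  # smoothed idf
            index.setdefault(word, {})[concept] = freq * idf
    return index


INDEX = {lang: build_index(arts) for lang, arts in CONCEPTS.items()}


def esa_vector(text, index):
    """Map a text to concept space by summing the weights of its words."""
    vec = Counter()
    for word in text.lower().split():
        for concept, weight in index.get(word, {}).items():
            vec[concept] += weight
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse concept vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


# Hypothetical English seed text describing the relevant class.
SEED = esa_vector("landslide rain soil slope collapse", INDEX["en"])


def is_relevant(text, lang, seed=SEED, threshold=0.5):
    """Label a text relevant if its concept vector is close to the seed."""
    return cosine(esa_vector(text, INDEX[lang]), seed) >= threshold


if __name__ == "__main__":
    for lang, tweet in [
        ("pt", "chuva forte causa deslizamento de terra na encosta"),
        ("pt", "nova música da banda"),
        ("en", "landslide victory for the party in the election"),
    ]:
        print(lang, is_relevant(tweet, lang), tweet)
```

Because the comparison happens in concept space rather than word space, no labeled Portuguese data is needed in this sketch; only the concept articles must exist in both languages.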
Pages: 2333-2337
Page count: 5