Towards Open-Domain Twitter User Profile Inference

Authors
Wen, Haoyang [1]
Xiao, Zhenxin [1]
Hovy, Eduard H. [1, 2]
Hauptmann, Alexander G. [1]
Affiliations
[1] Carnegie Mellon Univ, Language Technol Inst, Pittsburgh, PA 15213 USA
[2] Univ Melbourne, Sch Comp & Informat Syst, Melbourne, Vic, Australia
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Twitter user profile inference utilizes information from Twitter to predict user attributes (e.g., occupation, location), which is controversial because of its usefulness for downstream applications and its potential to reveal users' private information. Therefore, it is important for researchers to determine the extent of profiling in a safe environment to facilitate proper use and make the public aware of the potential risks. In contrast to existing approaches that cover a limited set of attributes, we explore open-domain Twitter user profile inference. We conduct a case study where we collect publicly available WikiData public figure profiles and use diverse WikiData predicates for profile inference. After removing sensitive attributes, our data contains over 150K public figure profiles from WikiData, over 50 different attribute predicates, and over 700K attribute values. We further propose a prompt-based generation method, which can infer values that are implicitly mentioned in the Twitter information. Experimental results show that the generation-based approach can infer more comprehensive user profiles than baseline extraction-based methods, but limitations remain before it can be applied in real-world use. We also enclose a detailed ethical statement for our data, potential benefits and risks from this work, and our efforts to mitigate the risks.
Pages: 3172-3188
Page count: 17
Related papers
50 in total
  • [1] Open-domain extraction of future events from Twitter
    Kunneman, Florian
    Van den Bosch, Antal
    NATURAL LANGUAGE ENGINEERING, 2016, 22 (05) : 655 - 686
  • [2] Towards Open-Domain Topic Classification
    Ding, Hantian
    Yang, Jinrui
    Deng, Yuqian
    Zhang, Hongming
    Roth, Dan
    NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES: PROCEEDINGS OF THE DEMONSTRATIONS SESSION, 2022, : 90 - 98
  • [3] Towards an Open-Domain Dialog System
    Gao, Jianfeng
    PROCEEDINGS OF THE 2019 ACM SIGIR INTERNATIONAL CONFERENCE ON THEORY OF INFORMATION RETRIEVAL (ICTIR'19), 2019, : 1 - 1
  • [4] Towards Open-Domain Semantic Role Labeling
    Croce, Danilo
    Giannone, Cristina
    Annesi, Paolo
    Basili, Roberto
    ACL 2010: 48TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2010, : 237 - 246
  • [5] Estimating User Interest from Open-Domain Dialogue
    Inaba, Michimasa
    Takahashi, Kenichi
    19TH ANNUAL MEETING OF THE SPECIAL INTEREST GROUP ON DISCOURSE AND DIALOGUE (SIGDIAL 2018), 2018, : 32 - 40
  • [6] Profile Consistency Identification for Open-domain Dialogue Agents
    Song, Haoyu
    Wang, Yan
    Zhang, Wei Nan
    Zhao, Zhengyu
    Liu, Ting
    Liu, Xiaojiang
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 6651 - 6662
  • [7] Towards Open-domain Vision and Language Understanding with Wikimedia
    Semedo, David
    WEB CONFERENCE 2021: COMPANION OF THE WORLD WIDE WEB CONFERENCE (WWW 2021), 2021, : 591 - 593
  • [8] Towards Boosting the Open-Domain Chatbot with Human Feedback
    Lu, Hua
    Bao, Siqi
    He, Huang
    Wang, Fan
    Wu, Hua
    Wang, Haifeng
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023, : 4060 - 4078
  • [9] Towards Multilingual Automatic Open-Domain Dialogue Evaluation
    Mendonca, John
    Lavie, Alon
    Trancoso, Isabel
    24TH MEETING OF THE SPECIAL INTEREST GROUP ON DISCOURSE AND DIALOGUE, SIGDIAL 2023, 2023, : 130 - 141
  • [10] Towards Building an Open-Domain Corpus for Arabic Reading Comprehension
    Biltawi, Mariam
    Awajan, Arafat
    Tedmori, Sara
    EDUCATION EXCELLENCE AND INNOVATION MANAGEMENT: A 2025 VISION TO SUSTAIN ECONOMIC DEVELOPMENT DURING GLOBAL CHALLENGES, 2020, : 2329 - 2342