Embedding User Behavioral Aspect in TF-IDF like Representation

Cited by: 3
Authors
Pradhan, Ligaj [1]
Zhang, Chengcui [1]
Bethard, Steven [2]
Chen, Xin [3]
Affiliations
[1] Univ Alabama Birmingham, Dept Comp Sci, Birmingham, AL 35294 USA
[2] Univ Arizona, Sch Informat, Tucson, AZ USA
[3] Governors State Univ, Div Sci Math & Tech, Chicago, IL USA
Source
IEEE 1ST CONFERENCE ON MULTIMEDIA INFORMATION PROCESSING AND RETRIEVAL (MIPR 2018) | 2018
Keywords
TF-IDF; topic modeling; user-concerns; user behavior; rating prediction;
DOI
10.1109/MIPR.2018.00061
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Term Frequency-Inverse Document Frequency (TF-IDF) computes a weight for each word in a document. The weight increases in proportion to the number of times the word appears in that document but is counterbalanced by the number of times it occurs in the collection of documents. TF-IDF is the state of the art for computing relevance scores between documents. However, it is based on statistical learning alone and does not directly capture the conceptual content of the text or the behavioral aspects of the writer. In this work we therefore show how relatively low-dimensional user behavioral vectors, extracted from the same text as the TF-IDF vectors, can be used to improve the performance of TF-IDF. We extract User-Concerns embedded in user reviews and append them to TF-IDF vectors to train a deep rating prediction model. Our experiments show that adding such conceptual knowledge can significantly enhance the performance of TF-IDF vectors while adding very little complexity.
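The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the common weighting tf x log(N / df), where df is the number of documents containing the term. The `concern_vectors` values are hypothetical placeholders for User-Concern vectors, since this record does not describe how those are extracted. The deep rating prediction model is not shown.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return (vocabulary, vectors) with weights tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)  # raw term counts for this document
        vectors.append([tf[t] * math.log(n_docs / df[t]) for t in vocab])
    return vocab, vectors


reviews = [
    "great food great service",
    "slow service cold food",
    "great price",
]
vocab, tfidf = tf_idf_vectors(reviews)

# Hypothetical low-dimensional behavioral vectors, one per review
# (for example, weights over three concerns). Values are made up.
concern_vectors = [
    [0.6, 0.4, 0.0],
    [0.5, 0.5, 0.0],
    [0.0, 0.0, 1.0],
]

# Append each behavioral vector to the matching TF-IDF vector.
augmented = [t + c for t, c in zip(tfidf, concern_vectors)]
```

Each row of `augmented` holds the TF-IDF weights followed by the extra behavioral dimensions, so the added input size equals the length of the behavioral vector.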
Pages: 262-267
Page count: 6