Towards the Discovery of Influencers to Follow in Micro-Blogs (Twitter) by Detecting Topics in Posted Messages (Tweets)

Cited by: 6
Authors
Ali, Mubashir [1]
Baqir, Anees [2]
Psaila, Giuseppe [1]
Malik, Sayyam [2]
Affiliations
[1] Univ Bergamo, Dept Management Informat & Prod Engn, I-24129 Bergamo, Italy
[2] Univ Sialkot, Fac Comp & IT, Sialkot 51040, Pakistan
Source
APPLIED SCIENCES-BASEL | 2020 / Vol. 10 / Issue 16
Keywords
social media; micro-blogs (Twitter); towards recommending influencers based on topic classification; investigation framework; comparison of various techniques for topic classification; cost-benefit function; CLASSIFICATION;
DOI
10.3390/app10165715
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Micro-blogs, such as Twitter, have become important tools for sharing opinions and information among users. Messages concerning any topic are posted daily. A message posted by a given user reaches all the users who have decided to follow her/him. Some users post many messages because they aim to be recognized as influencers, typically on specific topics. How can a user discover influencers concerned with her/his interests? Micro-blog apps and web sites lack a functionality that recommends influencers to users on the basis of the content of posted messages. In this paper, we envision such a scenario and identify the problem that constitutes the basic building block for developing a recommender of (possibly influencer) users: training a classification model on messages labeled with topical classes, so that the model can be used to classify unlabeled messages and let the hidden topic they talk about emerge. Specifically, the paper reports the investigation we performed to demonstrate the suitability of our idea. To perform the investigation, we developed an investigation framework that exploits various patterns for extracting features from messages (labeled with topical classes) in conjunction with the classifiers most commonly used for text classification problems. By means of the investigation framework, we were able to perform a large pool of experiments that allowed us to evaluate all the combinations of feature patterns with classifiers. By means of a cost-benefit function called "Suitability", which combines accuracy with execution time, we were able to demonstrate that a technique for discovering topics from messages that is suitable for the application context is available.
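The experimental loop described in the abstract (every feature pattern paired with every classifier, each pair scored on accuracy and execution time) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the two feature patterns, the two toy classifiers, the four training messages, and in particular the `suitability` formula, which is a hypothetical stand-in because the abstract does not state how the paper's "Suitability" function is defined.

```python
import math
import time
from collections import Counter, defaultdict

# Two illustrative feature patterns (assumptions, not the paper's patterns).
def unigrams(text):
    return text.lower().split()

def bigrams(text):
    tokens = text.lower().split()
    pairs = [" ".join(p) for p in zip(tokens, tokens[1:])]
    return pairs or tokens  # fall back to tokens for one-word messages

class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""
    def fit(self, docs, labels):
        self.priors = Counter(labels)
        self.counts = defaultdict(Counter)
        for feats, label in zip(docs, labels):
            self.counts[label].update(feats)
        self.vocab = {f for c in self.counts.values() for f in c}
        return self

    def predict(self, feats):
        def score(label):
            total = sum(self.counts[label].values()) + len(self.vocab)
            s = math.log(self.priors[label])  # unnormalized log prior
            for f in feats:
                s += math.log((self.counts[label][f] + 1) / total)
            return s
        return max(sorted(self.priors), key=score)

class OverlapClassifier:
    """Picks the class whose training features overlap the message most."""
    def fit(self, docs, labels):
        self.seen = defaultdict(set)
        for feats, label in zip(docs, labels):
            self.seen[label].update(feats)
        return self

    def predict(self, feats):
        wanted = set(feats)
        return max(sorted(self.seen), key=lambda c: len(self.seen[c] & wanted))

def suitability(accuracy, seconds):
    # Hypothetical cost-benefit score: rewards accuracy, penalizes run time.
    # The paper's actual "Suitability" definition is not given in the abstract.
    return accuracy / (1.0 + seconds)

FEATURES = {"unigrams": unigrams, "bigrams": bigrams}
CLASSIFIERS = {"naive_bayes": NaiveBayes, "overlap": OverlapClassifier}

def run_grid(train, test):
    """Evaluate every (feature pattern, classifier) combination."""
    results = []
    for fname, extract in FEATURES.items():
        for cname, make in CLASSIFIERS.items():
            start = time.perf_counter()
            model = make().fit([extract(t) for t, _ in train],
                               [y for _, y in train])
            hits = sum(model.predict(extract(t)) == y for t, y in test)
            seconds = time.perf_counter() - start
            accuracy = hits / len(test)
            results.append({"features": fname, "classifier": cname,
                            "accuracy": accuracy, "seconds": seconds,
                            "suitability": suitability(accuracy, seconds)})
    return results

# Made-up labeled messages, for illustration only.
TRAIN = [("the team won the match", "sports"),
         ("great goal in the final match", "sports"),
         ("new phone with fast processor", "tech"),
         ("the processor update improves the phone", "tech")]
TEST = [("the match was great", "sports"), ("fast phone update", "tech")]

if __name__ == "__main__":
    for row in sorted(run_grid(TRAIN, TEST), key=lambda r: -r["suitability"]):
        print(row)
```

Ranking the combinations by a single score that folds execution time into accuracy is what lets a fast, slightly less accurate combination win over a slow one; the exact trade-off depends entirely on the formula chosen, which here is only a placeholder.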
Pages: 28