Bert-Based Latent Semantic Analysis (Bert-LSA): A Case Study on Geospatial Data Technology and Application Trend Analysis

Cited by: 6
Authors
Cheng, Quanying [1, 2]
Zhu, Yunqiang [1, 3]
Song, Jia [1, 3]
Zeng, Hongyun [4]
Wang, Shu [1]
Sun, Kai [1]
Zhang, Jinqu [5]
Affiliations
[1] Chinese Acad Sci, Inst Geog Sci & Nat Resources Res, State Key Lab Resources & Environm Informat Syst, Beijing 100101, Peoples R China
[2] Univ Chinese Acad Sci, Coll Resources & Environm, Beijing 100049, Peoples R China
[3] Jiangsu Ctr Collaborat Innovat Geog Informat Reso, Nanjing 210023, Peoples R China
[4] Yunnan Univ, Sch Earth Sci, Kunming 650500, Yunnan, Peoples R China
[5] South China Normal Univ, Sch Comp Sci, Guangzhou 510000, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2021 / Vol. 11 / Issue 24
Funding
National Natural Science Foundation of China
Keywords
trend analysis; topic modeling; Bert; geospatial data technology and application; bibliometric analysis
DOI
10.3390/app112411897
Chinese Library Classification number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Geospatial data is an indispensable resource for research and applications in many fields. The technologies and applications related to geospatial data are constantly advancing, so identifying the technologies and applications among them helps to foster and fund further innovation. Through topic analysis, new research hotspots can be discovered by understanding the whole development process of a topic. At present, the main methods for determining topics are peer review and bibliometrics; however, they only review relevant literature or perform simple frequency analysis. This paper proposes a new topic discovery method that combines word embeddings from the pre-trained model Bert with a spherical k-means clustering algorithm, and uses the similarity between literature and topics to assign literature to different topics. The proposed method was applied to 266 publications related to geospatial data from the past five years. First, a trend analysis of technologies and applications related to geospatial data in several leading countries was conducted according to the number of publications. Then, the coherence of the topics produced by the proposed method and by the existing method PLSA (Probabilistic Latent Semantic Analysis) was evaluated using two similar coherence indicators (i.e., U-Mass and NPMI). The results show that the proposed method reveals text content well, identifies development trends, and produces more coherent topics, and that Bert-LSA outperforms PLSA overall on both NPMI and U-Mass. The method is not limited to trend analysis of the data in this paper; it can also be used for topic analysis of other types of text.
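The clustering and assignment steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy 3-dimensional vectors stand in for Bert embeddings, and the function names (`spherical_kmeans`, `assign_documents`) and parameters are hypothetical. It only shows the general idea of spherical k-means (cosine similarity, centroids projected back onto the unit sphere) followed by assigning each document to its most similar topic centroid.

```python
import math
import random

def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def cosine(u, v):
    """For unit vectors, cosine similarity is simply the dot product."""
    return sum(a * b for a, b in zip(u, v))

def spherical_kmeans(vectors, k, iters=20, seed=0):
    """Cluster vectors by cosine similarity; returns (centroids, labels)."""
    rng = random.Random(seed)
    data = [normalize(v) for v in vectors]
    centroids = rng.sample(data, k)  # initialize from k distinct data points
    labels = [0] * len(data)
    for _ in range(iters):
        # Assignment step: each vector goes to its most similar centroid.
        labels = [max(range(k), key=lambda j: cosine(x, centroids[j]))
                  for x in data]
        # Update step: sum the members, then project back onto the unit sphere.
        for j in range(k):
            members = [x for x, lab in zip(data, labels) if lab == j]
            if members:  # keep the previous centroid if a cluster is empty
                centroids[j] = normalize([sum(col) for col in zip(*members)])
    return centroids, labels

def assign_documents(doc_vectors, centroids):
    """Assign each document to the topic with the most similar centroid."""
    docs = [normalize(d) for d in doc_vectors]
    return [max(range(len(centroids)), key=lambda j: cosine(d, centroids[j]))
            for d in docs]

# Toy "embeddings" (hypothetical values standing in for Bert vectors).
word_vectors = [
    [1.0, 0.1, 0.0], [0.9, 0.2, 0.0], [0.8, 0.0, 0.1],  # first direction
    [0.0, 0.1, 1.0], [0.1, 0.0, 0.9], [0.0, 0.2, 0.8],  # second direction
]
centroids, labels = spherical_kmeans(word_vectors, k=2)

# Two toy document vectors, each close to one of the two directions.
doc_topics = assign_documents([[0.7, 0.1, 0.1], [0.1, 0.1, 0.9]], centroids)
```

In a real pipeline the toy vectors would be replaced by embeddings from a pre-trained model, and the number of topics `k` would need to be chosen, for example by comparing coherence scores across several values.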
Pages: 13