Bert-Based Latent Semantic Analysis (Bert-LSA): A Case Study on Geospatial Data Technology and Application Trend Analysis

Cited by: 6
Authors
Cheng, Quanying [1 ,2 ]
Zhu, Yunqiang [1 ,3 ]
Song, Jia [1 ,3 ]
Zeng, Hongyun [4 ]
Wang, Shu [1 ]
Sun, Kai [1 ]
Zhang, Jinqu [5 ]
Affiliations
[1] Chinese Acad Sci, Inst Geog Sci & Nat Resources Res, State Key Lab Resources & Environm Informat Syst, Beijing 100101, Peoples R China
[2] Univ Chinese Acad Sci, Coll Resources & Environm, Beijing 100049, Peoples R China
[3] Jiangsu Ctr Collaborat Innovat Geog Informat Reso, Nanjing 210023, Peoples R China
[4] Yunnan Univ, Sch Earth Sci, Kunming 650500, Yunnan, Peoples R China
[5] South China Normal Univ, Sch Comp Sci, Guangzhou 510000, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2021 / Vol. 11 / Issue 24
Funding
National Natural Science Foundation of China;
Keywords
trend analysis; topic modeling; Bert; geospatial data technology and application; BIBLIOMETRIC ANALYSIS;
DOI
10.3390/app112411897
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703 ;
Abstract
Geospatial data is an indispensable data resource for research and applications in many fields. The technologies and applications related to geospatial data are constantly advancing and updating, so identifying the technologies and applications among them will help foster and fund further innovation. Through topic analysis, new research hotspots can be discovered by understanding the whole development process of a topic. At present, the main methods to determine topics are peer review and bibliometrics; however, they merely review relevant literature or perform simple frequency analysis. This paper proposes a new topic discovery method that combines a word embedding method based on the pre-trained model Bert with a spherical k-means clustering algorithm, and uses the similarity between literature and topics to assign literature to different topics. The proposed method was applied to 266 pieces of literature related to geospatial data published over the past five years. First, a trend analysis of technologies and applications related to geospatial data in several leading countries was conducted according to the number of publications. Then, the consistency of the proposed method and the existing method PLSA (Probabilistic Latent Semantic Analysis) was evaluated using two similar consistency evaluation indicators (i.e., U-Mass and NPMI). The results show that the proposed method can effectively reveal text content, determine development trends, and produce more coherent topics, and that the overall performance of Bert-LSA is better than that of PLSA as measured by NPMI and U-Mass. This method is not limited to trend analysis using the data in this paper; it can also be used for the topic analysis of other types of texts.
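The clustering and assignment steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the short toy vectors stand in for Bert word embeddings, and the function names (`spherical_kmeans`, `assign_documents`), the farthest-first initialisation, and the rule of assigning each document to its single most similar topic are all assumptions made for illustration. Only the standard library is used.

```python
import math

def normalize(v):
    """Scale a vector to unit length; spherical k-means works on the unit sphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def cosine(u, v):
    """For unit vectors, cosine similarity is simply the dot product."""
    return sum(a * b for a, b in zip(u, v))

def spherical_kmeans(vectors, k, iterations=20):
    """Cluster vectors by cosine similarity; returns (centroids, labels)."""
    points = [normalize(v) for v in vectors]
    # Assumed initialisation: start from the first point, then repeatedly add
    # the point least similar to the centroids chosen so far (deterministic).
    centroids = [points[0]]
    while len(centroids) < k:
        farthest = min(points, key=lambda p: max(cosine(p, c) for c in centroids))
        centroids.append(farthest)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its most similar centroid.
        labels = [max(range(k), key=lambda j: cosine(p, centroids[j])) for p in points]
        # Update step: average the members, then project back onto the sphere.
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if not members:
                updated.append(centroids[j])  # keep an empty cluster's centroid
                continue
            mean = [sum(col) / len(members) for col in zip(*members)]
            updated.append(normalize(mean))
        if updated == centroids:
            break  # converged
        centroids = updated
    return centroids, labels

def assign_documents(doc_vectors, centroids):
    """Assign each document vector to the topic centroid it is most similar to."""
    return [
        max(range(len(centroids)), key=lambda j: cosine(normalize(d), centroids[j]))
        for d in doc_vectors
    ]

# Hypothetical 3-dimensional "embeddings"; real Bert vectors have hundreds of dimensions.
toy_word_vectors = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.0], [0.0, 1.0, 0.1], [0.1, 0.9, 0.0]]
toy_doc_vectors = [[0.8, 0.3, 0.0], [0.2, 0.7, 0.1]]

centroids, labels = spherical_kmeans(toy_word_vectors, k=2)
doc_topics = assign_documents(toy_doc_vectors, centroids)
print(labels, doc_topics)
```

The only difference from ordinary k-means is that every vector and every centroid is re-normalised to unit length, so the nearest centroid is the one with the largest cosine similarity rather than the smallest Euclidean distance.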
Pages: 13
Related papers
50 in total
  • [41] BERT-Based Natural Language Processing of Drug Labeling Documents: A Case Study for Classifying Drug-Induced Liver Injury Risk
    Wu, Yue
    Liu, Zhichao
    Wu, Leihong
    Chen, Minjun
    Tong, Weida
    FRONTIERS IN ARTIFICIAL INTELLIGENCE, 2021, 4
  • [42] Application of semantic and lexical analysis to technology forecasting by trend analysis - thematic clusters in separation processes
    Sitarz, Robert
    Kraslawski, Andrzej
    22 EUROPEAN SYMPOSIUM ON COMPUTER AIDED PROCESS ENGINEERING, 2012, 30 : 437 - 441
  • [43] A Study of Sentiment Analysis Algorithms for Agricultural Product Reviews Based on Improved BERT Model
    Cao, Ying
    Sun, Zhexing
    Li, Ling
    Mo, Weinan
    SYMMETRY-BASEL, 2022, 14 (08):
  • [44] An enhanced guided LDA model augmented with BERT based semantic strength for aspect term extraction in sentiment analysis
    Venugopalan, Manju
    Gupta, Deepa
    KNOWLEDGE-BASED SYSTEMS, 2022, 246
  • [45] Fine-Grained Sentiment Analysis of Arabic COVID-19 Tweets Using BERT-Based Transformers and Dynamically Weighted Loss Function
    Alturayeif, Nora
    Luqman, Hamzah
    APPLIED SCIENCES-BASEL, 2021, 11 (22):
  • [46] Sentiment analysis method of consumer comment text based on BERT and hierarchical attention in e-commerce big data environment
    Chang, Wanjun
    Zhu, Mingdong
    JOURNAL OF INTELLIGENT SYSTEMS, 2023, 32 (01)
  • [47] Applicability Analysis and Ensemble Application of BERT with TF-IDF, TextRank, MMR, and LDA for Topic Classification Based on Flood-Related VGI
    Du, Wenying
    Ge, Chang
    Yao, Shuang
    Chen, Nengcheng
    Xu, Lei
    ISPRS INTERNATIONAL JOURNAL OF GEO-INFORMATION, 2023, 12 (06)
  • [48] Diesel Engine Maintenance System, Using Trend Analysis of Histroical Data (Case Study)
    Sayyah, Ali
    Nikbakht, Mehrdad
    Sayyah, Moones
    Btht, Baharudin
    Ismail, N.
    MANUFACTURING SCIENCE AND TECHNOLOGY, PTS 1-8, 2012, 383-390 : 6755 - +
  • [49] Development and case study of trend analysis software based on FACT-Graph
    Saga, Ryosuke
    Tsuji, Hiroshi
    Miyamoto, Takao
    Tabata, Kuniaki
    ARTIFICIAL LIFE AND ROBOTICS, 2010, 15 (02) : 234 - 238
  • [50] Bibliometric Analysis of the Application of Blockchain Technology for Data Security: a Case Study of Global Mission Services' Digital Platform
    Bizumuremyi, Yves
    Mabanza, Ntima
    Masinde, Muthoni
    2022 IST-AFRICA CONFERENCE, 2022,