Predicting innovative firms using web mining and deep learning

Cited by: 29
Authors
Kinne, Jan [1, 2, 3]
Lenz, David [3, 4]
Affiliations
[1] ZEW Ctr European Econ Res, Dept Econ Innovat & Ind Dynam, Mannheim, Germany
[2] Univ Salzburg, Dept Geoinformat Z GIS, Salzburg, Austria
[3] Istari Ai, Mannheim, Germany
[4] Justus Liebig Univ, Dept Econometr & Stat, Giessen, Germany
Source
PLOS ONE | 2021 / Volume 16 / Issue 04
Keywords
PATENT STATISTICS; NEURAL-NETWORKS
DOI
10.1371/journal.pone.0249071
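Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification codes
07; 0710; 09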
Abstract
Evidence-based STI (science, technology, and innovation) policy making requires accurate indicators of innovation in order to promote economic growth. However, traditional indicators from patents and questionnaire-based surveys often lack coverage, granularity, and timeliness and may involve high data collection costs, especially when conducted at a large scale. Consequently, they struggle to provide policy makers and scientists with the full picture of the current state of the innovation system. In this paper, we propose a first approach to generating web-based innovation indicators which may have the potential to overcome some of the shortcomings of traditional indicators. Specifically, we develop a method to identify product innovator firms at a large scale and at very low cost. We use traditional firm-level indicators from a questionnaire-based innovation survey (German Community Innovation Survey) to train an artificial neural network classification model on labelled (product innovator/no product innovator) web texts of surveyed firms. Subsequently, we apply this classification model to the web texts of hundreds of thousands of firms in Germany to predict whether they are product innovators or not. We then compare these predictions to firm-level patent statistics, survey extrapolation benchmark data, and regional innovation indicators. The results show that our approach produces reliable predictions and has the potential to be a valuable and highly cost-efficient addition to the existing set of innovation indicators, especially due to its coverage and regional granularity.
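
The train-then-apply workflow described in the abstract can be illustrated with a minimal sketch. This is not the authors' model: a bag-of-words logistic regression is swapped in for their artificial neural network so that the example runs with the Python standard library only, and all texts, firm identifiers, and function names below are invented for illustration.

```python
import math
from collections import Counter

def tokenize(text):
    # Simplistic whitespace tokenizer; real web texts would need more cleaning.
    return text.lower().split()

def predict_proba(text, weights, bias):
    # Probability that the firm behind `text` is a product innovator.
    counts = Counter(tokenize(text))
    z = bias + sum(weights.get(w, 0.0) * c for w, c in counts.items())
    z = max(-30.0, min(30.0, z))  # clamp to keep math.exp well-behaved
    return 1.0 / (1.0 + math.exp(-z))

def train(texts, labels, epochs=200, lr=0.1):
    # Stochastic gradient descent on the logistic loss (stand-in for the
    # neural network classifier mentioned in the abstract).
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for text, y in zip(texts, labels):
            err = predict_proba(text, weights, bias) - y
            for w, c in Counter(tokenize(text)).items():
                weights[w] = weights.get(w, 0.0) - lr * err * c
            bias -= lr * err
    return weights, bias

# Hypothetical survey-labelled web texts (1 = product innovator, 0 = not).
labelled = [
    ("we launched a new product line with novel sensor technology", 1),
    ("our research team developed a new patented prototype", 1),
    ("family bakery offering fresh bread and cakes daily", 0),
    ("tax consulting and bookkeeping services for local clients", 0),
]
# Hypothetical web texts of firms that were not surveyed.
unlabelled = {
    "firm_a": "startup launched a novel product based on new technology",
    "firm_b": "local bakery with fresh bread and cakes",
}

texts, labels = zip(*labelled)
weights, bias = train(texts, labels)
probs = {firm: predict_proba(t, weights, bias) for firm, t in unlabelled.items()}
for firm, p in probs.items():
    print(firm, "innovator" if p > 0.5 else "non-innovator", round(p, 3))
```

The key design point mirrored here is that survey labels are needed only for the training firms; once fitted, the classifier can score any firm that has a website, which is what gives the approach its coverage.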
Pages: 18