Machine learning and statistical models for analyzing multilevel patent data

Authors
Sunyun Qi
Yu Zhang
Hua Gu
Fei Zhu
Meiying Gao
Hongxiao Liang
Qifeng Zhang
Yanchao Gao
Affiliations
[1] Zhejiang Provincial Center for Medical Science Technology and Education Development
[2] Leuven Statistics Research Centre, Faculty of Science, KU Leuven (Katholieke Universiteit Leuven)
[3] Department of Public Utilities Management, Faculty of Humanities and Management, Zhejiang Chinese Medical University
Abstract
A recent surge in patent applications among public hospitals in China has attracted significant research interest. A country’s healthcare innovation capacity can be measured by its number of patents. This paper explores the link between the number of patents and ten independent variables. Multicollinearity was detected using a variable selection method and removed using LASSO regression. Poisson and negative binomial models were proposed to analyze the patent data. Three goodness-of-fit tests (the Pearson test, the deviance test, and the DHARMa non-parametric dispersion test) were conducted to assess whether each model fits the data well. After agglomerative hierarchical clustering revealed four clusters, these two models were replaced by a negative binomial mixed model. The likelihood ratio test was used to determine which model is more appropriate, and the results show that the negative binomial mixed model outperforms both the Poisson model and the negative binomial model. Three variables have a significantly positive relationship with the number of patents in Chinese tertiary public hospitals: the number of health technicians per 10,000 people, financial expenditure on science and technology, and the number of patent applications per 10,000 health personnel.
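The Poisson versus negative binomial comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the counts are made up, the models are intercept-only (the paper uses covariates and later a mixed model), and all function names are hypothetical. It computes a Pearson dispersion ratio and a likelihood ratio test, assuming the NB2 parameterization (variance = mu + alpha * mu^2) and the common convention of halving the chi-square p-value because alpha = 0 lies on the boundary of the parameter space.

```python
import math

def poisson_loglik(y, mu):
    """Log-likelihood of counts y under Poisson(mu)."""
    return sum(k * math.log(mu) - mu - math.lgamma(k + 1) for k in y)

def negbin_loglik(y, mu, alpha):
    """Log-likelihood under NB2: variance = mu + alpha * mu**2."""
    r = 1.0 / alpha  # "size" parameter
    return sum(
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * math.log(r / (r + mu)) + k * math.log(mu / (r + mu))
        for k in y
    )

def pearson_dispersion(y):
    """Pearson chi-square divided by residual degrees of freedom
    for an intercept-only Poisson fit; values well above 1
    suggest overdispersion."""
    n = len(y)
    mu = sum(y) / n
    chi2 = sum((k - mu) ** 2 / mu for k in y)
    return chi2 / (n - 1)

def fit_alpha(y, mu, lo=1e-6, hi=100.0, iters=100):
    """Golden-section search on log(alpha) that maximizes the
    NB2 log-likelihood with mu held at the sample mean."""
    g = (math.sqrt(5) - 1) / 2
    a, b = math.log(lo), math.log(hi)
    for _ in range(iters):
        c = b - g * (b - a)
        d = a + g * (b - a)
        if negbin_loglik(y, mu, math.exp(c)) > negbin_loglik(y, mu, math.exp(d)):
            b = d  # maximum lies in [a, d]
        else:
            a = c  # maximum lies in [c, b]
    return math.exp((a + b) / 2)

# Hypothetical patent counts per hospital (illustrative only).
counts = [0, 1, 0, 3, 12, 2, 0, 7, 25, 1, 4, 0, 9, 2, 40, 5]

mu_hat = sum(counts) / len(counts)  # MLE of the mean in both models
dispersion = pearson_dispersion(counts)
alpha_hat = fit_alpha(counts, mu_hat)

ll_pois = poisson_loglik(counts, mu_hat)
ll_nb = negbin_loglik(counts, mu_hat, alpha_hat)

# Likelihood ratio statistic; chi-square(1) tail halved (boundary assumption).
lr_stat = 2.0 * (ll_nb - ll_pois)
p_value = 0.5 * math.erfc(math.sqrt(max(lr_stat, 0.0) / 2.0))

print(f"Pearson dispersion ratio: {dispersion:.2f}")
print(f"Estimated alpha: {alpha_hat:.3f}")
print(f"LR statistic: {lr_stat:.2f}, p-value: {p_value:.3g}")
```

A dispersion ratio well above 1 together with a small p-value would favor the negative binomial model over the Poisson model for these counts. Extending the comparison to a negative binomial mixed model requires a dedicated mixed-model fitting library and is outside this sketch.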
Related papers
  • [1] Machine learning and statistical models for analyzing multilevel patent data
    Qi, Sunyun
    Zhang, Yu
    Gu, Hua
    Zhu, Fei
    Gao, Meiying
    Liang, Hongxiao
    Zhang, Qifeng
    Gao, Yanchao
    SCIENTIFIC REPORTS, 2023, 13 (01)
  • [2] Analyzing Medical Data by Using Statistical Learning Models
    Mariani, Maria C.
    Biney, Francis
    Tweneboah, Osei K.
    MATHEMATICS, 2021, 9 (09)
  • [3] Multilevel statistical models and the analysis of experimental data
    Behm, Jocelyn E.
    Edmonds, Devin A.
    Harmon, Jason P.
    Ives, Anthony R.
    ECOLOGY, 2013, 94 (07) : 1479 - 1486
  • [4] Data-Driven Computational Neuroscience: Machine Learning and Statistical Models
    Kreinovich, Vladik
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2021, 41 (01) : 2513 - 2514
  • [5] Analyzing Longitudinal Data Using Machine Learning with Mixed-Effects Models
    Yigit, Pakize
    Ahmed, Syed Ejaz
    EIGHTEENTH INTERNATIONAL CONFERENCE ON MANAGEMENT SCIENCE AND ENGINEERING MANAGEMENT, ICMSEM 2024, 2024, 215 : 633 - 646
  • [6] A Method for Analyzing the Performance Impact of Imbalanced Binary Data on Machine Learning Models
    Zheng, Ming
    Wang, Fei
    Hu, Xiaowen
    Miao, Yuhao
    Cao, Huo
    Tang, Mingjing
    AXIOMS, 2022, 11 (11)
  • [7] Machine Learning Models for Statistical Analysis
    Grebovic, Marko
    Filipovic, Luka
    Katnic, Ivana
    Vukotic, Milica
    Popovic, Tomo
    INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY, 2023, 20 (3A) : 505 - 514
  • [8] Marine Data Prediction: An Evaluation of Machine Learning, Deep Learning, and Statistical Predictive Models
    Ali, Ahmed
    Fathalla, Ahmed
    Salah, Ahmad
    Bekhit, Mahmoud
    Eldesouky, Esraa
    COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2021, 2021 (2021)
  • [9] A METHOD OF ANALYZING STATISTICAL DATA BY CODING ON AN ADDING MACHINE
    Jordan, Robert
    AMERICAN JOURNAL OF PUBLIC HEALTH, 1926, 16 (02) : 123 - 125
  • [10] Enhancing Multilevel Models Through Supervised Machine Learning
    Kilian, Pascal
    Kelava, Augustin
    QUANTITATIVE PSYCHOLOGY, IMPS 2023, 2024, 452 : 145 - 154