Machine learning and statistical models for analyzing multilevel patent data

Cited by: 0
Authors
Sunyun Qi
Yu Zhang
Hua Gu
Fei Zhu
Meiying Gao
Hongxiao Liang
Qifeng Zhang
Yanchao Gao
Affiliations
[1] Zhejiang Provincial Center for Medical Science Technology and Education Development
[2] KU Leuven (Katholieke Universiteit Leuven), Leuven Statistics Research Centre, Faculty of Science
[3] Zhejiang Chinese Medical University, Department of Public Utilities Management, Faculty of Humanities and Management
Abstract
A recent surge of patent applications among public hospitals in China has attracted significant research interest. A country’s healthcare innovation capacity can be measured by its number of patents. This paper explores the link between the number of patents and ten independent variables. Multicollinearity was detected using the variable selection method and removed using LASSO regression. The Poisson model and the negative binomial model were proposed to analyze the patent data. Three goodness-of-fit tests, the Pearson test, the deviance test, and the DHARMa non-parametric dispersion test, were conducted to assess model fit. After agglomerative hierarchical clustering revealed four clusters, these two models were replaced by the negative binomial mixed model. The likelihood ratio test was used to determine which model is more appropriate, and the results reveal that the negative binomial mixed model outperforms both the Poisson model and the negative binomial model. Three variables, the number of health technicians per 10,000 people, financial expenditure on science and technology, and the number of patent applications per 10,000 health personnel, have a significantly positive relationship with the number of patents in Chinese tertiary public hospitals.
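The model comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' analysis: the counts below are hypothetical, the models are intercept-only (no covariates, no random effects), and the helper names `poisson_loglik`, `negbin_loglik`, and `fit_alpha` are invented for this example. It computes the Pearson dispersion statistic for a Poisson fit and a likelihood ratio test of Poisson against a negative binomial (NB2) model, using only the Python standard library.

```python
import math

# Hypothetical overdispersed counts (illustrative only, not the paper's data).
counts = [0, 1, 0, 3, 12, 2, 0, 7, 25, 1, 4, 0, 16, 2, 9, 0, 1, 30, 5, 3]


def poisson_loglik(y, mu):
    """Log-likelihood of an intercept-only Poisson model with mean mu."""
    return sum(k * math.log(mu) - mu - math.lgamma(k + 1) for k in y)


def negbin_loglik(y, mu, alpha):
    """Log-likelihood of an intercept-only NB2 model, Var(Y) = mu + alpha * mu**2."""
    r = 1.0 / alpha  # size parameter
    return sum(
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * math.log(r / (r + mu)) + k * math.log(mu / (r + mu))
        for k in y
    )


def fit_alpha(y, mu, lo=-10.0, hi=5.0, tol=1e-8):
    """Maximize the NB2 log-likelihood over log(alpha) by golden-section search."""
    phi = (math.sqrt(5) - 1) / 2

    def f(t):
        return negbin_loglik(y, mu, math.exp(t))

    a, b = lo, hi
    while b - a > tol:
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if f(c) > f(d):
            b = d  # maximum lies in [a, d]
        else:
            a = c  # maximum lies in [c, b]
    return math.exp((a + b) / 2)


n = len(counts)
mu_hat = sum(counts) / n  # MLE of the mean under both intercept-only models

# Pearson dispersion statistic: values well above 1 suggest overdispersion.
pearson_chi2 = sum((k - mu_hat) ** 2 / mu_hat for k in counts)
dispersion = pearson_chi2 / (n - 1)

# Likelihood ratio test: Poisson is the alpha -> 0 limit of NB2.
alpha_hat = fit_alpha(counts, mu_hat)
ll_pois = poisson_loglik(counts, mu_hat)
ll_nb = negbin_loglik(counts, mu_hat, alpha_hat)
lr_stat = 2.0 * (ll_nb - ll_pois)

# Chi-square(1) tail probability, halved because alpha = 0 lies on the
# boundary of the parameter space (a common convention, assumed here).
p_value = 0.5 * math.erfc(math.sqrt(max(lr_stat, 0.0) / 2.0))

print(f"Pearson dispersion: {dispersion:.2f}")
print(f"alpha_hat: {alpha_hat:.3f}, LR statistic: {lr_stat:.2f}, p-value: {p_value:.3g}")
```

A full analysis along the lines of the abstract would add the covariates and a cluster-level random effect, which is beyond this sketch.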
Related papers
50 in total
  • [31] Analyzing and Addressing Data-driven Fairness Issues in Machine Learning Models used for Societal Problems
    Pendyala, Vishnu S.
    Kim, HyungKyun
    2023 INTERNATIONAL CONFERENCE ON COMPUTER, ELECTRICAL & COMMUNICATION ENGINEERING, ICCECE, 2023
  • [32] Statistical models for e-learning data
    Figini, Silvia
    Giudici, Paolo
    STATISTICAL METHODS AND APPLICATIONS, 2009, 18 (02) : 293 - 304
  • [33] Statistical Inference, Learning and Models in Big Data
    Franke, Beate
    Plante, Jean-Francois
    Roscher, Ribana
    Lee, En-Shiun Annie
    Smyth, Cathal
    Hatefi, Armin
    Chen, Fuqi
    Gil, Einat
    Schwing, Alexander
    Selvitella, Alessandro
    Hoffman, Michael M.
    Grosse, Roger
    Hendricks, Dieter
    Reid, Nancy
    INTERNATIONAL STATISTICAL REVIEW, 2016, 84 (03) : 371 - 389
  • [34] Statistical models for e-learning data
    Silvia Figini
    Paolo Giudici
    Statistical Methods and Applications, 2009, 18 : 293 - 304
  • [35] Statistical data integration using multilevel models to predict employee compensation
    Erciulescu, Andreea L.
    Opsomer, Jean D.
    Schneider, Benjamin J.
    CANADIAN JOURNAL OF STATISTICS-REVUE CANADIENNE DE STATISTIQUE, 2023, 51 (01) : 312 - 326
  • [36] Statistical approaches to identifying significant differences in predictive performance between machine learning and classical statistical models for survival data
    Nasejje, Justine B.
    Whata, Albert
    Chimedza, Charles
    PLOS ONE, 2022, 17 (12)
  • [37] Analyzing Road Accident Data using Machine Learning Paradigms
    Nandurge, Priyanka A.
    Dharwadkar, Nagaraj V.
    2017 INTERNATIONAL CONFERENCE ON I-SMAC (IOT IN SOCIAL, MOBILE, ANALYTICS AND CLOUD) (I-SMAC), 2017 : 604 - 610
  • [38] A method for analyzing complex structured data with elements of machine learning
    Mandrikova, B. S.
    Computer Optics, 2022, 46 (03) : 506 - 512
  • [39] A Bayesian perspective of statistical machine learning for big data
    Rajiv Sambasivan
    Sourish Das
    Sujit K. Sahu
    Computational Statistics, 2020, 35 : 893 - 930
  • [40] How Big Data changes Statistical Machine Learning
    Bottou, Leon
    PROCEEDINGS 2015 IEEE INTERNATIONAL CONFERENCE ON BIG DATA, 2015 : 1 - 1