Natural language processing to identify the creation and impact of new technologies in patent text: Code, data, and new measures

Cited by: 112
Authors
Arts, Sam [1]
Hou, Jianan [1]
Gomez, Juan Carlos [2]
Affiliations
[1] Katholieke Univ Leuven, Fac Econ & Business, Dept Management Strategy & Innovat, Korte Nieuwstr 33, B-2000 Antwerp, Belgium
[2] Univ Guanajuato, Dept Elect Engn, Campus Irapuato Salamanca, Carretera Salamanca, Salamanca, Mexico
Keywords
Natural language processing; Patent; Novelty; Impact; Breakthrough; Award; KNOWLEDGE SPILLOVERS; CITATIONS; NOVELTY; ORIGINALITY; INDICATORS; INNOVATION
DOI
10.1016/j.respol.2020.104144
Chinese Library Classification (CLC) number
C93 [Management Science]
Discipline classification code
12; 1201; 1202; 120202
Abstract
We develop natural language processing techniques to identify the creation and impact of new technologies in the population of U.S. patents. We validate the new techniques and their improvement over traditional metrics based on patent classification and citations in two case-control studies. First, we collect patents linked to awards such as the Nobel prize and the National Inventor Hall of Fame. These patents likely cover radically new technologies with a major impact on technological progress and patenting. Second, we identify patents granted by the United States Patent and Trademark Office but simultaneously rejected by both the European and Japanese patent office. Such patents arguably lack novelty or cover small incremental advances over prior art and should have little impact on technological progress. We provide open access to code, data, and new measures for all utility patents granted by the USPTO up to May 2018 (see https://zenodo.org/record/3515985, DOI: 10.5281/zenodo.3515985).
Pages: 13