Machine learning and natural language processing on the patent corpus: Data, tools, and new measures

Cited by: 48
Authors
Balsmeier, Benjamin [1]
Assaf, Mohamad [2, 3]
Chesebro, Tyler [4]
Fierro, Gabe [4]
Johnson, Kevin [4]
Johnson, Scott [4]
Li, Guan-Cheng [2]
Lueck, Sonja [5]
O'Reagan, Doug [2]
Yeh, Bill [4]
Zang, Guangzheng [4]
Fleming, Lee [2]
Affiliations
[1] Univ Luxembourg, Ctr Res Econ & Management, Esch Sur Alzette, Luxembourg
[2] Univ Calif Berkeley, Coleman Fung Inst Engn Leadership, Berkeley, CA 94720 USA
[3] Amer Univ Beirut, Dept Elect & Comp Engn, Beirut, Lebanon
[4] Univ Calif Berkeley, Elect Engn & Comp Sci, Berkeley, CA USA
[5] Univ Paderborn, Dept Econ, Paderborn, Germany
Funding
National Science Foundation (USA)
Keywords
database; disambiguation; machine learning; natural language processing; patent; social networks; NETWORKS;
DOI
10.1111/jems.12259
Chinese Library Classification (CLC) number
F [Economics]
Discipline classification code
02
Abstract
Drawing upon recent advances in machine learning and natural language processing, we introduce new tools that automatically ingest, parse, disambiguate, and build an updated database using U.S. patent data. The tools identify unique inventor, assignee, and location entities mentioned on each granted U.S. patent from 1976 to 2016. We describe data flow, algorithms, user interfaces, descriptive statistics, and a novelty measure based on the first appearance of a word in the patent corpus. We illustrate an automated coinventor network mapping tool and visualize trends in patenting over the last 40 years. Data and documentation can be found at https://console.cloud.google.com/launcher/partners/patents-public-data.
Pages: 535-553
Page count: 19
Related papers
50 in total
  • [41] Natural language processing and machine learning to assist radiation oncology incident learning
    Mathew, Felix
    Wang, Hui
    Montgomery, Logan
    Kildea, John
    JOURNAL OF APPLIED CLINICAL MEDICAL PHYSICS, 2021, 22 (11): 172-184
  • [42] Natural Language Processing and Machine Learning Applications For Assessment and Evaluation in Education: Opportunities and New Approaches
    Yilmaz, Kubra
    Deniz, Kaan Zulfikar
    JOURNAL OF MEASUREMENT AND EVALUATION IN EDUCATION AND PSYCHOLOGY-EPOD, 2024, 15 (04): 421-445
  • [43] An intelligent patent recommender adopting machine learning approach for natural language processing: A case study for smart machinery technology mining
    Trappey, Amy
    Trappey, Charles V.
    Hsieh, Alex
    TECHNOLOGICAL FORECASTING AND SOCIAL CHANGE, 2021, 164
  • [44] Arabic natural language processing and machine learning-based systems
    Larabi Marie-Sainte S.
    Alalyani N.
    Alotaibi S.
    Ghouzali S.
    Abunadi I.
    IEEE Access, 2019, 7: 7011-7020
  • [45] RESEARCH ON THE TEXT CLASSIFICATION BASED ON NATURAL LANGUAGE PROCESSING AND MACHINE LEARNING
    Chen Keming
    Zheng Jianguo
    JOURNAL OF THE BALKAN TRIBOLOGICAL ASSOCIATION, 2016, 22 (03): 2484-2494
  • [46] Detecting Phishing Attacks Using Natural Language Processing and Machine Learning
    Peng, Tianrui
    Harris, Ian G.
    Sawa, Yuki
    2018 IEEE 12TH INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING (ICSC), 2018: 300-301
  • [47] Towards Machine Learning Fairness Education in a Natural Language Processing Course
    Bobesh, Samantha Jane
    Miller, Tyler
    Newman, Pax
    Liu, Yudong
    Elglaly, Yasmine N.
    PROCEEDINGS OF THE 54TH ACM TECHNICAL SYMPOSIUM ON COMPUTER SCIENCE EDUCATION, VOL 1, SIGCSE 2023, 2023: 312-318
  • [48] Extracting Biomarker Information applying Natural Language Processing and Machine Learning
    Islam, Md Tawhidul
    Shaikh, Mostafa
    Nayak, Abhaya
    Ranganathan, Shoba
    2010 4TH INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICAL ENGINEERING (ICBBE 2010), 2010
  • [49] Analysis of Breakdown Reports Using Natural Language Processing and Machine Learning
    Ahmed, Mobyen Uddin
    Bengtsson, Marcus
    Salonen, Antti
    Funk, Peter
    INTERNATIONAL CONGRESS AND WORKSHOP ON INDUSTRIAL AI 2021, 2022: 40-52
  • [50] Detecting hate crimes through machine learning and natural language processing
    Salazar, Ana Ortiz
    POLICE PRACTICE AND RESEARCH, 2024