Using text mining to retrieve information about circular economy

Cited by: 27
Authors
Spreafico, Christian [1]
Spreafico, Matteo [1]
Affiliations
[1] Univ Bergamo, Dept Management Informat & Prod Engn, Via Marconi 5, I-24044 Dalmine, BG, Italy
Keywords
Circular economy; Text mining; Dependency patterns; Patents
DOI
10.1016/j.compind.2021.103525
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
This paper proposes a text mining method to automatically retrieve knowledge from patents on how to recycle and reuse a waste. The main novelties are the introduction of a set of specific dependency patterns and of a partially revised TRIZ (Russian acronym for "Theory of Inventive Problem Solving") ontology to classify the retrieved information. The proposed dependency patterns were manually extracted from a sample pool of patents about waste recycling and reuse. The retrieved information is classified into six classes: (1) what transformations can be carried out on the waste, (2) what technologies can be used to carry out these transformations, (3) what products can be obtained by transforming the waste, (4) what functions can be carried out by the waste, (5) with which technologies, and (6) on which entities. An automatic implementation of the proposed method, involving a manual check of the retrieved results, was tested through a case study about wood chip recycling and reuse. Compared to the dependency patterns from the literature, the proposed ones retrieved 28% more pertinent information. This result mainly depends on the better ability of the proposed patterns to discriminate the relevant sentences from which to extract information (+40% compared to the other patterns). The automatic classification of the information was also performed correctly: in almost every class, precision and recall were higher than 60% and on average equal to 90%. © 2021 Elsevier B.V. All rights reserved.
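The abstract reports per-class precision and recall for the automatic classification. As a minimal sketch of how such figures are usually computed, and not the authors' implementation, the snippet below scores a predicted labelling against a manually checked one. The function name `precision_recall`, the class labels, and the toy data are all hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from collections import Counter


def precision_recall(gold, predicted):
    """Return {class: (precision, recall)} for two parallel label lists."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1          # correct assignment to class g
        else:
            fp[p] += 1          # item wrongly assigned to class p
            fn[g] += 1          # item of class g that was missed
    scores = {}
    for c in sorted(set(gold) | set(predicted)):
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        scores[c] = (prec, rec)
    return scores


# Hypothetical toy data: manually checked labels vs. automatic labels.
gold = ["transformation", "technology", "product", "product", "function"]
predicted = ["transformation", "technology", "product", "function", "function"]

scores = precision_recall(gold, predicted)
for label, (prec, rec) in scores.items():
    print(f"{label}: precision={prec:.2f}, recall={rec:.2f}")
```

In this toy case one "product" item is mislabelled as "function", so "product" loses recall while "function" loses precision; the other two classes are unaffected.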
Pages: 17