Multi-Label Data Fusion to Support Agricultural Vulnerability Assessments

Cited by: 3
Authors
Lopez, Ivan Dario [1]
Figueroa, Apolinar [2]
Corrales, Juan Carlos [1]
Affiliations
[1] Univ Cauca Tulcan, Telemat Engn Grp, Popayan 190003, Colombia
[2] Univ Cauca Tulcan, Environm Studies Grp, Popayan 190003, Colombia
Keywords
Agriculture; Data integration; Meteorology; Stakeholders; Data models; Production; Atmospheric modeling; Climate vulnerability assessment; climate change; crop production; data processing; data fusion; machine learning; multi-label classification; multi-label dataset; sustainable agriculture; DATA INTEGRATION; CLASSIFICATION; SENTINEL-2; LANDSAT-8; NETWORK; IMPACT; YIELD;
DOI
10.1109/ACCESS.2021.3089665
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Identifying crop species and varieties adaptable to climate change impacts is one of the main aspects of climate vulnerability assessments. This estimation involves processing, integrating, and analyzing many information sources to provide accurate and timely responses. However, designing this evaluation, examining the information gathered, and reaching agreement among all stakeholders and experts often require considerable time, money, and people. In this study, we propose a data fusion strategy to support climate vulnerability assessments by identifying the adaptability of crops in a territory in the short term. This strategy follows the Joint Directors of Laboratories' data fusion model guidelines. It was evaluated and validated through a case study in Colombia's upper Cauca river basin. For this purpose, we identified Climate, Soil, Water Quality, Productive Alliances, and Production as the most relevant data sources to be integrated. Using metrics such as Mean IR, SCUMBLE, and TCS, among others, we evaluated the combined datasets according to their theoretical complexity. The adaptability of crops in a territory was addressed as a multi-label learning problem, assessing the performance of different multi-label classification and multi-view multi-label classification models with both test and actual data. Comparing the predicted crops with the actual ones, we obtained a 98% similarity without considering crop ranking, using the Binary Relevance approach with the Random Forest and XGBoost algorithms. Using a more exhaustive test that takes crop order into account, we obtained a maximum similarity of 67% by applying Binary Relevance with Random Forest.
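The abstract names Mean IR as one of the metrics used to characterize the fused datasets. As a minimal sketch of what that metric measures, the snippet below computes the per-label imbalance ratio (IRLbl) and its mean (MeanIR) as they are commonly defined in the multi-label literature (Charte et al., 2015). The function names and the toy crop labels are hypothetical and are not taken from the paper's data or code.

```python
from collections import Counter


def irlbl(label_sets):
    """Per-label imbalance ratio: count of the most frequent label
    divided by the count of each label (1.0 for the majority label)."""
    counts = Counter(label for labels in label_sets for label in set(labels))
    max_count = max(counts.values())
    return {label: max_count / n for label, n in counts.items()}


def mean_ir(label_sets):
    """MeanIR: average of IRLbl over all labels that occur at least once."""
    ratios = irlbl(label_sets)
    return sum(ratios.values()) / len(ratios)


# Hypothetical toy dataset: each instance carries a set of crop labels.
samples = [
    {"coffee", "maize"},
    {"coffee"},
    {"coffee", "beans"},
    {"maize"},
]

print(irlbl(samples))
print(round(mean_ir(samples), 3))
```

Higher MeanIR values indicate a more imbalanced label distribution. Labels that never occur are left out of this sketch, since their ratio would be undefined.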
Pages: 88313-88326
Page count: 14