Multi-Dimensional Dataset of Open Data and Satellite Images for Characterization of Food Security and Nutrition

Cited by: 10
Authors
Restrepo, David S. [1]
Perez, Luis E. [1]
Lopez, Diego M. [1]
Vargas-Canas, Rubiel [2]
Osorio-Valencia, Juan Sebastian [3]
Affiliations
[1] Univ Cauca, Telemat Dept, Telemat Engn Res Grp, Popayan, Colombia
[2] Univ Cauca, Instrumentat & Control Res Grp, Dept Phys, Dynam Syst, Popayan, Colombia
[3] Univ Washington, Dept Global Hlth, Seattle, WA USA
Keywords
data mining; food security; machine learning; remote sensing; satellite imagery; dataset; yield estimation
DOI
10.3389/fnut.2021.796082
Chinese Library Classification (CLC) number
R15 [Nutritional hygiene, food hygiene]; TS201 [Basic science]
Discipline classification code
100403
Abstract
Background: Nutrition is one of the main factors affecting a person's development and quality of life. From a public health perspective, food security is an essential social determinant for promoting healthy nutrition. Food security embraces four dimensions: physical availability of food, economic and physical access to food, food utilization, and the sustainability of the three preceding dimensions. Addressing all four dimensions together is vital. Surprisingly, most prior work has focused on a single dimension of food security: the physical availability of food.
Objective: The paper proposes a multi-dimensional dataset of open data and satellite images to characterize food security in the Department of Cauca, Colombia.
Methods: The food security dataset integrates multiple open data sources; therefore, the Cross-Industry Standard Process for Data Mining (CRISP-DM) methodology was used to guide the construction of the dataset. The sources include population and agricultural censuses, nutrition surveys, and satellite images.
Results: An open multidimensional dataset for the Department of Cauca, with 926 attributes and 9 rows (each row representing a municipality), was assembled from multiple sources in Colombia. Machine learning models were then used to characterize food security and nutrition in the Department of Cauca. The food security index calculated for Cauca using a linear regression model (Mean Absolute Error of 0.391) is 57.444 on a scale from 0 to 100, with 100 being the best score. In addition, an approach for extracting four features (Agriculture, Habitation, Road, Water) from satellite images was tested; the ResNet50 model trained from scratch performed best, with a macro-accuracy, macro-precision, macro-recall, and macro-F1-score of 91.7, 86.2, 66.91, and 74.92%, respectively.
Conclusion: The study shows how the CRISP-DM methodology can be used to create an open public health data repository. Furthermore, this methodology could be generalized to other types of problems requiring the creation of a dataset. In addition, the use of satellite images presents an alternative for places where data collection is challenging. The proposed model and methodology, based on open data, offer a low-cost and effective solution that decision-makers, especially in developing countries, could use to support food security planning.
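The macro-averaged scores reported in the abstract can be illustrated with a minimal sketch. It assumes the common definition of macro-averaging (compute each metric per label, then take the unweighted mean across labels); the abstract does not state exactly how the authors averaged. The function names and the toy labels below are hypothetical and are not taken from the dataset.

```python
from typing import Dict, List, Sequence

# The four satellite-image features named in the abstract.
LABELS = ["Agriculture", "Habitation", "Road", "Water"]


def binary_scores(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for a single binary label."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    accuracy = (tp + tn) / len(y_true)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def macro_scores(y_true: List[List[int]], y_pred: List[List[int]]) -> Dict[str, float]:
    """Unweighted mean of the per-label scores (assumed macro-averaging)."""
    per_label = []
    for j in range(len(LABELS)):
        true_col = [row[j] for row in y_true]
        pred_col = [row[j] for row in y_pred]
        per_label.append(binary_scores(true_col, pred_col))
    return {k: sum(s[k] for s in per_label) / len(per_label) for k in per_label[0]}


# Hypothetical toy data: one row per image, one column per label in LABELS.
y_true = [[1, 0, 0, 1], [1, 1, 0, 0], [0, 1, 1, 0],
          [1, 0, 1, 0], [0, 0, 0, 1], [1, 1, 0, 0]]
y_pred = [[1, 0, 0, 1], [1, 0, 0, 0], [0, 1, 1, 0],
          [1, 0, 0, 0], [0, 0, 0, 1], [1, 1, 1, 0]]

macro = macro_scores(y_true, y_pred)
for name, value in macro.items():
    print(f"macro-{name}: {value:.4f}")
```

Under this definition, macro-F1 is the mean of the per-label F1 scores, so it generally differs from the harmonic mean of macro-precision and macro-recall. That is consistent with the reported figures: the harmonic mean of 86.2 and 66.91 is roughly 75.3, whereas the reported macro-F1 is 74.92.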
Pages: 13