Food Volume Estimation Based on Deep Learning View Synthesis from a Single Depth Map

Cited: 51
Authors
Lo, Frank P. -W. [1 ]
Sun, Yingnan [2 ]
Qiu, Jianing [2 ]
Lo, Benny [1 ]
Affiliations
[1] Imperial Coll London, Dept Surg & Canc, Hamlyn Ctr, London SW7 2AZ, England
[2] Imperial Coll London, Dept Comp, Hamlyn Ctr, London SW7 2AZ, England
Funding
Bill & Melinda Gates Foundation
Keywords
dietary assessment; volume estimation; mHealth; deep learning; view synthesis; image rendering; 3D reconstruction
DOI
10.3390/nu10122005
Chinese Library Classification (CLC)
R15 [Nutrition and Food Hygiene]; TS201 [Basic Science]
Discipline Classification Code
100403
Abstract
An objective dietary assessment system can help users understand their dietary behavior and enable targeted interventions to address underlying health problems. Accurately quantifying dietary intake requires measuring the portion size, i.e., the food volume. Previous research on volume estimation has mostly relied on model-based or stereo-based approaches, which either demand manual intervention or require users to capture multiple frames from different viewing angles, a tedious process. In this paper, a deep-learning-based view synthesis approach is proposed to reconstruct the 3D point cloud of a food item and estimate its volume from a single depth image. A dedicated neural network takes a depth image captured from one viewing angle and predicts the depth image that would be captured from the corresponding opposite viewing angle. The complete 3D point cloud is then reconstructed by fusing the initial data points with the synthesized points through the proposed point cloud completion and Iterative Closest Point (ICP) algorithms. Furthermore, a database of depth images of food items captured from different viewing angles is constructed via image rendering and used to validate the proposed neural network. The methodology is then evaluated by comparing the volume estimated from the synthesized 3D point cloud against the ground-truth volume of the object items.
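The final step described in the abstract, recovering a volume from the fused 3D point cloud, can be illustrated with a minimal sketch. This is not the authors' implementation; it assumes a dense, roughly convex cloud and uses the convex hull from SciPy as a volume proxy:

```python
# Hypothetical sketch (not the paper's code): approximate the volume
# enclosed by a fused 3D point cloud via its convex hull.
import numpy as np
from scipy.spatial import ConvexHull

def estimate_volume(points: np.ndarray) -> float:
    """Return the convex-hull volume of an (N, 3) point cloud."""
    return ConvexHull(points).volume

# Toy check: the 8 corners of a unit cube enclose a volume of 1.
cube = np.array([[x, y, z] for x in (0, 1) for y in (0, 1) for z in (0, 1)],
                dtype=float)
print(estimate_volume(cube))  # -> 1.0
```

A convex hull overestimates the volume of concave food shapes; the paper instead integrates over the completed depth surfaces, so this sketch is only a rough stand-in for that stage.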
Pages: 20