MVImgNet: A Large-scale Dataset of Multi-view Images

Cited by: 31
Authors
Yu, Xianggang [1 ,2 ]
Xu, Mutian [1 ,2 ]
Zhang, Yidan [1 ,2 ]
Liu, Haolin [1 ,2 ]
Ye, Chongjie [1 ,2 ]
Wu, Yushuang [1 ,2 ]
Yan, Zizheng [1 ,2 ]
Zhu, Chenming [1 ,2 ]
Xiong, Zhangyang [1 ,2 ]
Liang, Tianyou [1 ,2 ]
Chen, Guanying [1 ,2 ]
Cui, Shuguang [1 ,2 ]
Han, Xiaoguang [1 ,2 ]
Affiliations
[1] CUHKSZ, FNii, Shenzhen, People's Republic of China
[2] CUHKSZ, SSE, Shenzhen, People's Republic of China
Source
2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2023
Funding
National Key Research and Development Program of China
DOI
10.1109/CVPR52729.2023.00883
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract
Being data-driven is one of the most iconic properties of deep learning algorithms. The birth of ImageNet [24] drove a remarkable trend of 'learning from large-scale data' in computer vision. Pretraining on ImageNet to obtain rich universal representations has been shown to benefit various 2D visual tasks and has become a standard in 2D vision. However, because collecting real-world 3D data is laborious, there is not yet a generic dataset that serves as a counterpart of ImageNet in 3D vision, so how such a dataset would impact the 3D community remains unexplored. To remedy this, we introduce MVImgNet, a large-scale dataset of multi-view images, which is highly convenient to collect by shooting videos of real-world objects in daily life. It contains 6.5 million frames from 219,188 videos covering objects from 238 classes, with rich annotations of object masks, camera parameters, and point clouds. The multi-view attribute endows our dataset with 3D-aware signals, making it a soft bridge between 2D and 3D vision. We conduct pilot studies to probe the potential of MVImgNet on a variety of 3D and 2D visual tasks, including radiance field reconstruction, multi-view stereo, and view-consistent image understanding, where MVImgNet demonstrates promising performance, leaving many possibilities for future exploration. In addition, via dense reconstruction on MVImgNet, we derive a 3D object point cloud dataset called MVPNet, covering 87,200 samples from 150 categories, with a class label on each point cloud. Experiments show that MVPNet can benefit real-world 3D object classification while posing new challenges to point cloud understanding. MVImgNet and MVPNet will be made public, in the hope of inspiring the broader vision community.
Pages: 9150-9161
Page count: 12
Related papers
50 in total
  • [21] Towards high-resolution large-scale multi-view stereo
    Hiep, Vu Hoang
    Keriven, Renaud
    Labatut, Patrick
    Pons, Jean-Philippe
    CVPR: 2009 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOLS 1-4, 2009, : 1430 - 1437
  • [22] Auto-Weighted Multi-View Clustering for Large-Scale Data
    Wan, Xinhang
    Liu, Xinwang
    Liu, Jiyuan
    Wang, Siwei
    Wen, Yi
    Liang, Weixuan
    Zhu, En
    Liu, Zhe
    Zhou, Lu
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 8, 2023, : 10078 - +
  • [23] CTM - A Model for Large-Scale Multi-View Tweet Topic Classification
    Kulkarni, Vivek
    Leung, Kenny
    Haghighi, Aria
    2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, NAACL-HLT 2022, 2022, : 247 - 258
  • [24] Large-scale online multi-view graph neural network and applications
    Li, Zhao
    Xing, Yuying
    Huang, Jiaming
    Wang, Haobo
    Gao, Jianliang
    Yu, Guoxian
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2021, 116 : 145 - 155
  • [25] The methods for improving large-scale multi-view clustering efficiency: a survey
    Yang, Zengbiao
    Tan, Yihua
    ARTIFICIAL INTELLIGENCE REVIEW, 2024, 57 (06)
  • [26] Confidence-Based Large-Scale Dense Multi-View Stereo
    Li, Zhaoxin
    Zuo, Wangmeng
    Wang, Zhaoqi
    Zhang, Lei
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 (29) : 7176 - 7191
  • [27] A Scalable Algorithm for Large-Scale Unsupervised Multi-View Partial Least Squares
    Wang, Li
    Li, Ren-Cang
    IEEE TRANSACTIONS ON BIG DATA, 2020, 8 (04) : 1073 - 1083
  • [28] Large-scale multi-view clustering via matrix factorization of consensus graph
    Yang, Zengbiao
    Tan, Yihua
    Yang, Tao
    PATTERN RECOGNITION, 2024, 155
  • [29] Joint Camera Clustering and Surface Segmentation for Large-scale Multi-view Stereo
    Zhang, Runze
    Li, Shiwei
    Fang, Tian
    Zhu, Siyu
    Quan, Long
    2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, : 2084 - 2092
  • [30] Efficient Supervised Discrete Multi-View Hashing for Large-Scale Multimedia Search
    Lu, Xu
    Zhu, Lei
    Li, Jingjing
    Zhang, Huaxiang
    Shen, Heng Tao
    IEEE TRANSACTIONS ON MULTIMEDIA, 2020, 22 (08) : 2048 - 2060