PreCNet: Next-Frame Video Prediction Based on Predictive Coding

Cited by: 7

Authors:

Straka, Zdenek [1]

Svoboda, Tomas [1]

Hoffmann, Matej [1]

Affiliations:

[1] Czech Tech Univ, Fac Elect Engn, Dept Cybernet, Prague 12135, Czech Republic

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | 2024 / Vol. 35 / No. 08

Keywords:

Deep neural networks; next-frame video prediction; predictive coding; self-supervised learning; RESPONSE PROPERTIES; MODEL; RECOGNITION;

DOI:

10.1109/TNNLS.2023.3240857

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Predictive coding, currently a highly influential theory in neuroscience, has not yet been widely adopted in machine learning. In this work, we transform the seminal model of Rao and Ballard (1999) into a modern deep learning framework while remaining maximally faithful to the original schema. The resulting network we propose (PreCNet) is tested on a widely used next-frame video prediction benchmark, which consists of images from an urban environment recorded from a car-mounted camera, and achieves state-of-the-art performance. Performance on all measures (MSE, PSNR, SSIM) was further improved when a larger training set (2M images from BDD100k) was used, pointing to the limitations of the KITTI training set. This work demonstrates that an architecture carefully based on a neuroscience model, without being explicitly tailored to the task at hand, can exhibit exceptional performance.
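For readers unfamiliar with two of the measures named in the abstract, the following is a minimal sketch of how MSE and PSNR can be computed between a predicted frame and the actual next frame. It is not taken from the paper or its code: the function names `mse` and `psnr`, the flat-list frame representation, and the assumed pixel range of [0, 1] (`max_value=1.0`) are illustrative assumptions. SSIM is omitted because it requires windowed local statistics.

```python
import math


def mse(pred, target):
    """Mean squared error between two equally sized flat pixel sequences."""
    if len(pred) != len(target) or not pred:
        raise ValueError("frames must be non-empty and the same size")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def psnr(pred, target, max_value=1.0):
    """Peak signal-to-noise ratio in dB; infinite when the frames are identical.

    max_value is the assumed maximum pixel intensity (1.0 for frames
    scaled to [0, 1], 255 for 8-bit frames).
    """
    err = mse(pred, target)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)


# Hypothetical 2x2 grayscale frames, flattened row by row.
predicted = [0.0, 0.5, 1.0, 0.5]
actual = [0.0, 0.5, 0.5, 0.5]

print(mse(predicted, actual))
print(psnr(predicted, actual))
```

Lower MSE and higher PSNR both indicate a prediction closer to the actual frame.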

Pages: 10353-10367

Page count: 15

