A Sensorimotor Perspective on Contrastive Multiview Visual Representation Learning

Cited by: 0

Authors:

Laflaquiere, Alban [1]

Affiliations:

[1] AI Lab, SoftBank Robotics Europe, F-75015 Paris, France

Source:

IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS | 2022 / Vol. 14 / Issue 02

Keywords:

Task analysis; Visualization; Robot sensing systems; Training; Machine learning; Semantics; Deep learning; Artificial perception; contrastive multiview learning; representation learning; sensorimotor; unsupervised learning; CORTEX; EXPERIENCE; MODULATION; TOPOLOGY; AGENTS;

DOI:

10.1109/TCDS.2021.3086267

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The contrastive multiview visual representation learning (CMVRL) framework has recently gained a lot of traction in the unsupervised representation learning literature. Combining a simple data augmentation strategy and a contrastive learning objective, it has been able to generate representations that compare favorably to their supervised counterparts on common downstream visual tasks. The theoretical understanding of this empirical success is currently an active area of research. In this article, we propose a sensorimotor perspective on the various components of the framework. We show how it can be interpreted as building representations that geometrically embed the stable semantic content that a situated agent experiences on short spatiotemporal scales when actively exploring its environment. We also discuss the relevance of the approach in light of contemporary active, dynamical, and hierarchical theories of perception. Finally, we extrapolate this sensorimotor perspective to outline promising future research directions that could push the state of the art further and help better understand how an autonomous agent could develop useful visual representations in an unsupervised fashion.


Pages: 269-278

Page count: 10

