MACHINE LEARNING, COMPRESSION, AND IMAGE QUALITY

Cited by: 5

Authors:

Israel, Steven A. [1]

Irvine, John M. [2]

Tate, Steven [3]

Affiliations:

[1] Charles Stark Draper Lab Inc, 1943 Isaac Newton Sq East, Reston, VA 20190 USA

[2] Mitre Corp, 202 Burlington Rd, Bedford, MA 01730 USA

[3] Charles Stark Draper Lab Inc, 555 Technol Sq, Cambridge, MA 02139 USA

Source:

2020 IEEE APPLIED IMAGERY PATTERN RECOGNITION WORKSHOP (AIPR): TRUSTED COMPUTING, PRIVACY, AND SECURING MULTIMEDIA | 2020

Keywords:

NIIRS; image quality; compression; machine learning; latent space; autoencoder;

DOI:

10.1109/AIPR50011.2020.9425256

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The current drive toward incorporating machine learning (ML) models within automated closed-loop workflows renews existing issues for image quality prediction; specifically, maintaining the operator-in-the-loop's trust in system operation. This paper reviews methods to assess information content in the reconstruction process to maintain trust in closed-loop workflows. We structure commonly used ML architectures in the form of Autoencoders. Autoencoders are commonly studied for image compression and reconstruction. For manual exploitation, the interpretability of an image indicates the potential intelligence value of the data. Historically, the National Imagery Interpretability Rating Scale (NIIRS) has been the standard for quantifying the intelligence potential based on image analysis by human observers. Empirical studies have demonstrated that spatial resolution is the dominant predictor of the NIIRS level of an image and that compression to 1 bit per pixel can maintain that NIIRS level. However, with modern ML that digests images rather than extracted features, what is the corresponding size of the latent space required to maintain the NIIRS levels? To those ends, we will operate on moderate size images, 480x480 pixels, to provide realistic generalizable estimates over those experiments against similar sprite size (< 100x100 pixels) images.
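To make the abstract's question about latent-space size concrete, the following is a minimal arithmetic sketch, not the authors' method. It takes the two figures stated in the abstract (480x480 pixels and 1 bit per pixel) and computes how many latent values would fit in the same bit budget. The function names and the candidate storage precisions (32, 16, and 8 bits per latent value) are assumptions introduced here for illustration; the abstract does not specify how the latent space is stored.

```python
def bit_budget(width, height, bits_per_pixel=1.0):
    """Total number of bits for an image compressed to the given rate."""
    return int(width * height * bits_per_pixel)


def latent_dims(budget_bits, bits_per_value):
    """Number of latent values that fit in the budget at a given precision."""
    return budget_bits // bits_per_value


# Figures from the abstract: 480x480 image, 1 bit per pixel.
budget = bit_budget(480, 480, 1.0)

# Hypothetical storage precisions for the latent vector (assumption).
for name, bits in [("32-bit", 32), ("16-bit", 16), ("8-bit", 8)]:
    print(f"{name} latent values within {budget} bits: {latent_dims(budget, bits)}")
```

Under these assumptions the budget is 230,400 bits (28,800 bytes), which corresponds to 7,200 latent values at 32 bits each. Whether a latent space of that size actually preserves the NIIRS level is the empirical question the paper poses; this calculation only fixes the scale.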

Pages: 7

References (9 total):

[1] Dosovitskiy A., 160202644V2 ARXIV, P14

[2] Irvine J., 2012, VIDEO COMPRESSION

[3] Irvine J.M., 2003, Encyclopedia of Optical Engineering, V1, P1442

[4] Irvine, John M.; Aviles, Ana Ivelisse; Cannon, David M.; Fenimore, Charles; Haverkamp, Donna S.; Israel, Steven A.; O'Brien, Gary; Roberts, John. Developing an interpretability scale for motion imagery. Optical Engineering, 2007, 46(11)

[5] Leachtenauer J.C., 2001, SURVEILLANCE RECONNA

[6] Lecun, Y; Bottou, L; Bengio, Y; Haffner, P. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 1998, 86(11): 2278-2324

[7] Makhzani Alireza, 2016, P INT C LEARN REPR I

[8] O'Brien G., 2007, SPIE DEF SEC S 6546, V6546

[9] Wang, Z; Bovik, AC; Sheikh, HR; Simoncelli, EP. Image quality assessment: From error visibility to structural similarity. IEEE Transactions on Image Processing, 2004, 13(4): 600-612
