Representation of features as images with neighborhood dependencies for compatibility with convolutional neural networks

Cited by: 78

Authors:

Bazgir, Omid [1]

Zhang, Ruibo [1]

Dhruba, Saugato Rahman [1]

Rahman, Raziur [1]

Ghosh, Souparno [2,3]

Pal, Ranadip [1]

Affiliations:

[1] Texas Tech Univ, Dept Elect & Comp Engn, 1012 Boston Ave, Lubbock, TX 79409 USA

[2] Texas Tech Univ, Dept Math & Stat, 1108 Mem Circle, Lubbock, TX 79409 USA

[3] Univ Nebraska, Dept Stat, 3310 Holdrege St, Lincoln, NE 68503 USA

Source:

NATURE COMMUNICATIONS | 2020 / Volume 11 / Issue 01

Funding:

U.S. National Institutes of Health (NIH);

Keywords:

DIMENSIONALITY REDUCTION; DRUG-SENSITIVITY; PREDICTION; EXTENSIONS; REGRESSION; DISCOVERY; EIGENMAPS; BIAS;

DOI:

10.1038/s41467-020-18197-y

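Chinese Library Classification (CLC):

O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];

Discipline classification codes: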

07 ; 0710 ; 09 ;

Abstract:

Deep learning with Convolutional Neural Networks (CNNs) has shown great promise in image-based classification and enhancement but is often unsuitable for predictive modeling using features without spatial correlations. We present a feature representation approach termed REFINED (REpresentation of Features as Images with NEighborhood Dependencies) to arrange high-dimensional vectors in a compact image form conducive to CNN-based deep learning. We consider the similarities between features to generate a concise feature map in the form of a two-dimensional image by minimizing the pairwise distance values following a Bayesian Metric Multidimensional Scaling approach. We hypothesize that this approach enables embedded feature extraction and, integrated with CNN-based deep learning, can boost the predictive accuracy. We illustrate the superior predictive capabilities of the proposed framework as compared to state-of-the-art methodologies in drug sensitivity prediction scenarios using synthetic datasets, drug chemical descriptors as predictors from NCI60, and both transcriptomic information and drug descriptors as predictors from GDSC.
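The core idea in the abstract, placing similar features near each other in a two-dimensional image, can be illustrated with a minimal sketch. This is not the authors' procedure: it swaps in classical (non-Bayesian) multidimensional scaling and a simple rank-based grid assignment, and all function names (`feature_distance`, `classical_mds`, `assign_to_grid`, `to_image`) are hypothetical names chosen for illustration.

```python
import numpy as np

def feature_distance(X):
    """Euclidean distance between standardized feature columns of X (samples x features)."""
    Z = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)
    G = Z.T @ Z
    sq = np.diag(G)[:, None] + np.diag(G)[None, :] - 2.0 * G
    return np.sqrt(np.maximum(sq, 0.0))

def classical_mds(D, dim=2):
    """Classical MDS (assumption: a stand-in for the Bayesian metric MDS in the abstract)."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                  # double-centered squared distances
    vals, vecs = np.linalg.eigh(B)
    idx = np.argsort(vals)[::-1][:dim]           # largest eigenvalues first
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))

def assign_to_grid(coords, side):
    """Give each feature a unique pixel: sort by x into rows, then by y within each row.
    Assumption: a simple heuristic, not the assignment step used in the paper."""
    order = np.argsort(coords[:, 0], kind="stable")
    pixel = np.full((side, side), -1, dtype=int)  # -1 marks an unused pixel
    for r in range(side):
        row = order[r * side:(r + 1) * side]
        row = row[np.argsort(coords[row, 1], kind="stable")]
        pixel[r, :len(row)] = row
    return pixel

def to_image(x, pixel):
    """Map one sample's feature vector x onto the pixel layout."""
    img = np.zeros(pixel.shape)
    mask = pixel >= 0
    img[mask] = x[pixel[mask]]
    return img

# Toy data: 50 samples with 16 features, arranged on a 4 x 4 image.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 16))
pixel = assign_to_grid(classical_mds(feature_distance(X)), side=4)
image = to_image(X[0], pixel)
```

Each sample then becomes a small image whose pixel layout is shared across samples, so a CNN can in principle exploit local neighborhoods of related features. The grid side length and the distance measure are free choices in this sketch.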

Pages: 13
