Predicting membrane protein types by the LLDA algorithm

Cited by: 115
Authors
Wang, Tong [1]
Yang, Jie [1]
Shen, Hong-Bin [1, 2]
Chou, Kuo-Chen [1, 2]
Affiliations
[1] Shanghai Jiao Tong Univ, Inst Image Proc & Pattern Recognit, Shanghai 200030, Peoples R China
[2] Gordon Life Sci Inst, San Diego, CA 92130 USA
Funding
National Natural Science Foundation of China;
Keywords
LLDA; membrane protein types; linear dimensionality reduction; PsePSSM; high dimension disaster; PCA; LDA;
DOI
10.2174/092986608785849308
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010; 081704;
Abstract
Membrane proteins are generally classified into the following eight types: (1) type I transmembrane, (2) type II, (3) type III, (4) type IV, (5) multipass transmembrane, (6) lipid-chain-anchored membrane, (7) GPI-anchored membrane, and (8) peripheral membrane (K. C. Chou and H. B. Shen: BBRC, 2007, 360: 339-345). Knowing the type of an uncharacterized membrane protein often provides useful clues to its biological function and to its interactions with other molecules in a biological system. With the explosion of protein sequences generated in the Post-Genomic Age, an automated method for determining membrane protein type is urgently needed. The PsePSSM (Pseudo Position-Specific Score Matrix) descriptor was recently proposed by Chou and Shen (Biochem. Biophys. Res. Comm. 2007, 360, 339-345) to represent a protein sample. Its advantage is that it combines evolutionary information with sequence-correlation information. However, incorporating all these effects into one descriptor may cause the "high dimension disaster", in which the descriptor has too many components relative to the number of available samples. Chou and Shen addressed this problem with a fusion approach. Here, a completely different approach, LLDA (Local Linear Discriminant Analysis), is introduced to extract the key features from the high-dimensional PsePSSM space. The resulting dimension-reduced descriptor vector is a compact representation of the original high-dimensional vector. Our jackknife and independent dataset test results indicate that the LLDA approach is very promising for complicated problems in biological systems, such as predicting membrane protein type.
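The two steps named in the abstract, building a high-dimensional descriptor and then projecting it into a low-dimensional discriminant space, can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration and is not taken from this record: the function names `pse_pssm` and `lda_fit` are hypothetical, the descriptor form (per-column means of a row-standardized PSSM plus squared-difference terms at lags 1 to `max_lag`) is one common way such a descriptor is written, and the projection is classical Fisher LDA, swapped in as a stand-in for the paper's localized variant (LLDA), which is not reproduced here.

```python
import numpy as np

def pse_pssm(pssm, max_lag):
    """Hypothetical PsePSSM-style descriptor from an (L, 20) score matrix.

    Returns a vector of length 20 * (1 + max_lag): the 20 column means,
    followed by 20 lagged squared-difference terms for each lag.
    Assumes the sequence length L is greater than max_lag.
    """
    pssm = np.asarray(pssm, dtype=float)
    # Standardize each row (sequence position) over its 20 scores.
    std = pssm.std(axis=1, keepdims=True)
    std[std == 0] = 1.0  # avoid division by zero for constant rows
    norm = (pssm - pssm.mean(axis=1, keepdims=True)) / std
    parts = [norm.mean(axis=0)]
    for lag in range(1, max_lag + 1):
        # Mean squared difference between positions i and i + lag, per column.
        diff = norm[:-lag] - norm[lag:]
        parts.append((diff ** 2).mean(axis=0))
    return np.concatenate(parts)

def lda_fit(X, y, n_components):
    """Classical Fisher LDA (not LLDA): returns a (d, n_components) projection."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    d = X.shape[1]
    mean_all = X.mean(axis=0)
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        delta = (mc - mean_all)[:, None]
        Sb += len(Xc) * (delta @ delta.T)
    # The pseudo-inverse is used because Sw is typically singular when the
    # descriptor dimension exceeds the number of samples.
    evals, evecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(-evals.real)[:n_components]
    return evecs[:, order].real

# Usage on synthetic data (random matrices, not real PSSMs).
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 10)  # 3 classes, 10 samples each
descriptors = np.stack(
    [pse_pssm(rng.normal(loc=c, size=(50, 20)), max_lag=3) for c in labels]
)
W = lda_fit(descriptors, labels, n_components=2)
reduced = descriptors @ W  # compact representation of each sample
```

With 3 classes, classical LDA yields at most 2 useful discriminant directions, which is why `n_components=2` is used in the usage lines; the 80-component descriptors are reduced to 2 components per sample.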
Pages: 915-921
Page count: 7