Membrane proteins are essential components of cell membranes and play a crucial role in many cellular processes. Because their complex structures and properties make them difficult to analyze experimentally, computational methods have emerged as a powerful tool for studying them. Traditional features for protein sequence analysis, based on amino acid types, composition, and pair composition, are limited in their ability to capture higher-order sequence patterns. More recently, multiple sequence alignments (MSAs) and pre-trained language models (PLMs) have been used to generate features from protein sequences. However, the substantial computational resources required to generate MSA-based features can be a major bottleneck for many applications. Several methods and tools, including heuristics and approximate algorithms, have been developed to accelerate MSA generation and reduce its computational cost. In addition, PLMs such as BERT have shown great potential for generating informative embeddings for protein sequence analysis. In this review, we provide an overview of traditional and more recent methods for generating features from protein sequences, with a particular focus on MSAs and PLMs. We highlight the advantages and limitations of these approaches and discuss the methods and tools developed to address the computational challenges of feature generation. Overall, these advances in computational methods and tools offer a promising avenue for gaining deeper insight into the function and properties of membrane proteins, with potentially significant implications for drug discovery and personalized medicine.
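As a concrete illustration of the traditional features mentioned above, the following minimal sketch computes amino acid composition (20 values) and pair, or dipeptide, composition (400 values) for a single sequence. The function names, the choice to skip non-standard residues, and the normalization by the number of valid residues or pairs are assumptions made for illustration; they are not taken from the review.

```python
from itertools import product

# The 20 standard amino acids in a fixed (alphabetical) order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in the same fixed order.
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(seq):
    """Return the fraction of each standard residue (20 values).

    Assumption: non-standard symbols (e.g. 'X') are skipped.
    """
    counts = dict.fromkeys(AMINO_ACIDS, 0)
    for residue in seq.upper():
        if residue in counts:
            counts[residue] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def pair_composition(seq):
    """Return the fraction of each adjacent residue pair (400 values).

    Assumption: pairs containing a non-standard symbol are skipped.
    """
    seq = seq.upper()
    counts = dict.fromkeys(PAIRS, 0)
    for first, second in zip(seq, seq[1:]):
        pair = first + second
        if pair in counts:
            counts[pair] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(PAIRS)
    return [counts[p] / total for p in PAIRS]


if __name__ == "__main__":
    example = "MKTLLLTLVVVTIVCLDLGYT"  # hypothetical sequence fragment
    aac = amino_acid_composition(example)
    dpc = pair_composition(example)
    print(len(aac), len(dpc))
```

Both vectors have a fixed length regardless of sequence length, which is convenient for classical classifiers. Because they only count residues and adjacent pairs, they discard longer-range order, which is the limitation in capturing higher-order sequence patterns that the abstract refers to.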
Affiliation: Indian Inst Technol, Dept Biotechnol, Prot Bioinformat Lab, Madras 600036, Tamil Nadu, India
Affiliation: Natl Inst Adv Ind Sci & Technol, Computat Biol Res Ctr, Koto Ku, Tokyo 1350064, Japan
Gromiha, MM
Suwa, M
Affiliation: UCL, Dept Comp Sci, Gower St, London WC1E 6BT, England
Affiliation: Francis Crick Inst, Biomed Data Sci Lab, London, England
Kandathil, Shaun M.
Greener, Joe G.
Jones, David T.