-
1.
Bioinformatics Approach to Identify Novel AMPK Targets.
Gongol, B, Marin, T, Johnson, DA, Shyy, JY
Methods in molecular biology (Clifton, N.J.). 2018;:99-109
Abstract
In silico analysis of Big Data is a useful tool to identify putative kinase targets as well as nodes of signaling cascades that are difficult to discover by traditional single-molecule experimentation. Systems approaches that use a multi-tiered investigational methodology have been instrumental in advancing our understanding of cellular mechanisms that result in phenotypic changes. Here, we present a bioinformatics approach to identify AMP-activated protein kinase (AMPK) target proteins on a proteome-wide scale and an in vitro method for preliminary validation of these targets. This approach offers an initial screening for the identification of AMPK targets that can be further validated using mutagenesis and molecular biology techniques.
-
2.
PSPEL: In Silico Prediction of Self-Interacting Proteins from Amino Acids Sequences Using Ensemble Learning.
Li, JQ, You, ZH, Li, X, Ming, Z, Chen, X
IEEE/ACM transactions on computational biology and bioinformatics. 2017;(5):1165-1172
Abstract
Self-interacting proteins (SIPs) play an important role in various aspects of the structural and functional organization of the cell. Detecting SIPs is one of the most important issues in current molecular biology. Although a large amount of SIP data has been generated by experimental methods, wet-laboratory approaches are both time-consuming and costly. In addition, they yield high false-negative and false-positive rates. Thus, there is a great need for in silico methods to predict SIPs accurately and efficiently. In this study, a new sequence-based method is proposed to predict SIPs. The evolutionary information contained in the Position-Specific Scoring Matrix (PSSM) is extracted from proteins with known sequences. Then, the features are fed to an ensemble classifier to distinguish self-interacting from non-self-interacting proteins. When applied to Saccharomyces cerevisiae and human SIP data sets, the proposed method achieves high accuracies of 86.86 and 91.30 percent, respectively. Our method also performs well when compared with the SVM classifier and previous methods. Consequently, the proposed method can be considered a promising new tool for predicting SIPs.
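The pipeline described in this abstract (PSSM, then fixed-length features, then an ensemble classifier) can be illustrated with a minimal sketch. The column-mean feature and the majority-vote rule below are generic stand-ins chosen for illustration, not the descriptor or ensemble used by the authors; the function names `pssm_to_features` and `majority_vote` are hypothetical.

```python
import math
from collections import Counter


def pssm_to_features(pssm):
    """Collapse an L x 20 PSSM into one fixed-length vector.

    Assumption for illustration: each score is squashed to (0, 1) with a
    logistic function and every column is averaged over the sequence length,
    so proteins of different lengths map to vectors of the same size.
    """
    if not pssm:
        raise ValueError("PSSM must contain at least one row")
    n_cols = len(pssm[0])
    scaled = [[1.0 / (1.0 + math.exp(-score)) for score in row] for row in pssm]
    return [sum(row[j] for row in scaled) / len(scaled) for j in range(n_cols)]


def majority_vote(labels):
    """Combine base-classifier outputs by taking the most frequent label."""
    return Counter(labels).most_common(1)[0][0]
```

In use, each base classifier would be trained on the feature vectors and `majority_vote` would merge their predicted labels for a query protein; an odd number of base classifiers avoids ties.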
-
3.
Detection of Interactions between Proteins by Using Legendre Moments Descriptor to Extract Discriminatory Information Embedded in PSSM.
Wang, YB, You, ZH, Li, LP, Huang, YA, Yi, HC
Molecules (Basel, Switzerland). 2017;(8)
Abstract
Protein-protein interactions (PPIs) play a very large part in most cellular processes. Although a great deal of research has been devoted to detecting PPIs through high-throughput technologies, these methods are clearly expensive and cumbersome. Compared with the traditional experimental methods, computational methods have attracted much attention because of their good performance in detecting PPIs. In our work, a novel computational method named PCVM-LM is proposed, which combines the probabilistic classification vector machine (PCVM) model and Legendre moments (LMs) to predict PPIs from amino acid sequences. The improvement mainly comes from using the LMs to extract discriminatory information embedded in the position-specific scoring matrix (PSSM), combined with the PCVM classifier to perform the prediction. The proposed method was evaluated on Yeast and Helicobacter pylori datasets with five-fold cross-validation experiments. The experimental results show that the proposed method achieves high average accuracies of 96.37% and 93.48%, respectively, which are much better than those of other well-known methods. To further evaluate the proposed method, we also compared it with the state-of-the-art support vector machine (SVM) classifier and other existing methods on the same datasets. The comparison results clearly show that our method is better than the SVM-based method and other existing methods. The promising experimental results show the reliability and effectiveness of the proposed method, which can be a useful decision support tool for protein research.
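As a rough illustration of how Legendre moments can summarize a PSSM, the sketch below treats the matrix as a two-dimensional image on the square [-1, 1] x [-1, 1] and approximates each moment with a midpoint sum. The grid mapping, the order cutoff `max_order`, and the function names are assumptions made for this example; the abstract does not specify how the authors discretize or select moments.

```python
def legendre(n, x):
    """Evaluate the Legendre polynomial P_n(x) with Bonnet's recurrence."""
    if n == 0:
        return 1.0
    prev, curr = 1.0, x
    for k in range(1, n):
        prev, curr = curr, ((2 * k + 1) * x * curr - k * prev) / (k + 1)
    return curr


def legendre_moments(matrix, max_order):
    """Return {(p, q): moment} for all orders with p + q <= max_order.

    Each cell (i, j) is placed at the centre of its pixel in [-1, 1]^2, so the
    double integral becomes a midpoint Riemann sum over the matrix entries.
    """
    rows, cols = len(matrix), len(matrix[0])
    xs = [(2 * i + 1) / rows - 1 for i in range(rows)]
    ys = [(2 * j + 1) / cols - 1 for j in range(cols)]
    moments = {}
    for p in range(max_order + 1):
        for q in range(max_order + 1 - p):
            total = sum(
                legendre(p, xs[i]) * legendre(q, ys[j]) * matrix[i][j]
                for i in range(rows)
                for j in range(cols)
            )
            moments[(p, q)] = (2 * p + 1) * (2 * q + 1) * total / (rows * cols)
    return moments
```

The resulting moment values form a fixed-length vector regardless of sequence length, which is what allows them to be passed to a classifier.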
-
4.
Protein-protein interactions: scoring schemes and binding affinity.
Gromiha, MM, Yugandhar, K, Jemimah, S
Current opinion in structural biology. 2017;:31-38
Abstract
Protein-protein interactions mediate several cellular functions, which can be understood from the information obtained using the three-dimensional structures of protein-protein complexes and binding affinity data. This review focuses on computational aspects of predicting the best native-like complex structure and binding affinities. The first part covers the prediction of protein-protein complex structures and the advantages of conformational searching and scoring functions in protein-protein docking. The second part is devoted to various aspects of protein-protein interaction thermodynamics, such as databases for binding affinities and other thermodynamic parameters, computational methods to predict the binding affinity using either the three-dimensional structures of complexes or amino acid sequences, and change in binding affinities of the complexes upon mutations. We provide the latest developments on protein-protein docking and binding affinity studies along with a list of available computational resources for understanding protein-protein interactions.
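For orientation, the thermodynamic quantities mentioned in this abstract are linked by the standard relation ΔG = RT ln(Kd), and the effect of a mutation is usually reported as ΔΔG = ΔG(mutant) - ΔG(wild type). The sketch below assumes a 1 M standard state, a default temperature of 298.15 K, and energies in kcal/mol; the function names are illustrative and do not come from the review.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # gas constant in kcal / (mol K)


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Binding free energy in kcal/mol from a dissociation constant in mol/L."""
    if kd_molar <= 0:
        raise ValueError("dissociation constant must be positive")
    return GAS_CONSTANT_KCAL * temperature_k * math.log(kd_molar)


def mutation_ddg(kd_wild_type, kd_mutant, temperature_k=298.15):
    """Change in binding free energy upon mutation; positive means weaker binding."""
    return (binding_free_energy(kd_mutant, temperature_k)
            - binding_free_energy(kd_wild_type, temperature_k))
```

Under these assumptions a nanomolar complex corresponds to roughly -12.3 kcal/mol, and a tenfold loss of affinity to a ΔΔG of about +1.4 kcal/mol.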
-
5.
PCVMZM: Using the Probabilistic Classification Vector Machines Model Combined with a Zernike Moments Descriptor to Predict Protein-Protein Interactions from Protein Sequences.
Wang, Y, You, Z, Li, X, Chen, X, Jiang, T, Zhang, J
International journal of molecular sciences. 2017;(5)
Abstract
Protein-protein interactions (PPIs) are essential for the processes of most living organisms. Thus, detecting PPIs is extremely important for understanding the molecular mechanisms of biological systems. Although a large amount of PPI data has been generated by high-throughput technologies for a variety of organisms, the whole interactome is still far from complete. In addition, the high-throughput technologies for detecting PPIs have some unavoidable defects, including time consumption, high cost, and high error rate. In recent years, with the development of machine learning, computational methods have been broadly used to predict PPIs and can achieve good prediction rates. In this paper, we present PCVMZM, a computational method based on a Probabilistic Classification Vector Machines (PCVM) model and a Zernike moments (ZM) descriptor for predicting PPIs from protein amino acid sequences. Specifically, the ZM descriptor is used to extract protein evolutionary information from the Position-Specific Scoring Matrix (PSSM) generated by the Position-Specific Iterated Basic Local Alignment Search Tool (PSI-BLAST). Then, the PCVM classifier is used to infer the interactions among proteins. When applied to PPI datasets of Yeast and H. pylori, the proposed method achieves average prediction accuracies of 94.48% and 91.25%, respectively. To further evaluate the performance of the proposed method, the state-of-the-art support vector machines (SVM) classifier is used and compared with the PCVM model. Experimental results on the Yeast dataset show that the performance of the PCVM classifier is better than that of the SVM classifier. The experimental results indicate that our proposed method is robust, powerful and feasible, and that it can be used as a helpful tool for proteomics research.
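The Zernike moments descriptor named in this abstract can be sketched from its textbook definition: a radial polynomial R_nm(rho) multiplied by a complex exponential, summed over the cells that fall inside the unit disk. How the authors map a PSSM onto the disk is not stated here, so the pixel-centre mapping below, like the function names, is an assumption made for illustration.

```python
import cmath
import math


def radial_poly(n, m, rho):
    """Zernike radial polynomial R_nm(rho); requires |m| <= n and n - |m| even."""
    m = abs(m)
    if m > n or (n - m) % 2:
        raise ValueError("need |m| <= n and n - |m| even")
    total = 0.0
    for s in range((n - m) // 2 + 1):
        coeff = ((-1) ** s * math.factorial(n - s)
                 / (math.factorial(s)
                    * math.factorial((n + m) // 2 - s)
                    * math.factorial((n - m) // 2 - s)))
        total += coeff * rho ** (n - 2 * s)
    return total


def zernike_moment(matrix, n, m):
    """Approximate the complex Zernike moment Z_nm of a 2-D matrix.

    Cells are placed at pixel centres in [-1, 1]^2; cells outside the unit
    disk are ignored. The magnitude abs(Z_nm) is the usual feature value.
    """
    rows, cols = len(matrix), len(matrix[0])
    pixel_area = (2.0 / rows) * (2.0 / cols)
    total = 0j
    for i in range(rows):
        y = (2 * i + 1) / rows - 1
        for j in range(cols):
            x = (2 * j + 1) / cols - 1
            rho = math.hypot(x, y)
            if rho > 1.0:
                continue
            theta = math.atan2(y, x)
            total += matrix[i][j] * radial_poly(n, m, rho) * cmath.exp(-1j * m * theta)
    return (n + 1) / math.pi * total * pixel_area
```

Collecting `abs(zernike_moment(pssm, n, m))` over a chosen set of valid (n, m) pairs yields a fixed-length feature vector that can be passed to a classifier.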
-
6.
Genetically encoded releasable photo-cross-linking strategies for studying protein-protein interactions in living cells.
Yang, Y, Song, H, He, D, Zhang, S, Dai, S, Xie, X, Lin, S, Hao, Z, Zheng, H, Chen, PR
Nature protocols. 2017;(10):2147-2168
Abstract
Although protein-protein interactions (PPIs) have crucial roles in virtually all cellular processes, the identification of more transient interactions in their biological context remains challenging. Conventional photo-cross-linking strategies can be used to identify transient interactions, but these approaches often suffer from high background due to the cross-linked bait proteins. To solve the problem, we have developed membrane-permeable releasable photo-cross-linkers that allow for prey-bait separation after protein complex isolation and can be installed in proteins of interest (POIs) as unnatural amino acids. Here we describe the procedures for using two releasable photo-cross-linkers, DiZSeK and DiZHSeC, in both living Escherichia coli and mammalian cells. A cleavage after protein photo-cross-linking (CAPP) strategy based on the photo-cross-linker DiZSeK is described, in which the prey protein pool is released from a POI after affinity purification. Prey proteins are analyzed using mass spectrometry or 2D gel electrophoresis for global comparison of interactomes from different experimental conditions. An in situ cleavage and mass spectrometry (MS)-label transfer after protein photo-cross-linking (IMAPP) strategy based on the photo-cross-linker DiZHSeC is also described. This strategy can be used for the identification of cross-linking sites to allow detailed characterization of PPI interfaces. The procedures for photo-cross-linker incorporation, photo-cross-linking of interaction partners and affinity purification of cross-linked complexes are similar for the two photo-cross-linkers. The final section of the protocol describes prey-bait separation (for CAPP) and MS-label transfer and identification (for IMAPP). After plasmid construction, the CAPP and IMAPP strategies can be completed within 6 and 7 d, respectively.
-
7.
Mapping Protein-Protein Interaction Using High-Throughput Yeast 2-Hybrid.
Lopez, J, Mukhtar, MS
Methods in molecular biology (Clifton, N.J.). 2017;:217-230
Abstract
A tremendous asset to the analysis of protein-protein interactions is the yeast-2-hybrid (Y2H) method. The Y2H assay is a heterologous system that is expanding network biology knowledge via in vivo investigations of binary protein-protein interactions. Traditionally, the Y2H protocol entails the mating or co-transformation of yeast in solid agar media followed by visual analysis for diploid selection. Having played a key role in identifying protein-protein interactions for nearly three decades in a wide range of biological systems, the Y2H system assays the interaction between two proteins of interest, which results in the reconstitution and/or activation of a transcription factor, allowing a reporter gene to be transcribed. Overall, the Y2H method takes advantage of two factors: (1) the auxotrophic yeast requires expression of the reporter gene to grow in media purposefully designed to lack one or more essential amino acids, and (2) the DNA-binding (DB) domain of the transcription factor GAL4 is unable to initiate transcription unless it is physically associated with an activating domain (AD). The DB and AD are each fused to a protein of interest, and these proteins must interact with each other to reconstitute the transcription factor and activate the reporter gene. The applications of Y2H are broad, spanning fields such as drug discovery and clinical trials for human disease, including cancer and neurodegenerative disease, and extending even into synthetic biology applications and cellular engineering. This chapter begins with an introduction to the fundamental mechanics of Y2H utilizing a genetically engineered strain of yeast, proceeds with an in-depth look at the different types of Y2H, and then focuses on the GAL4-based Y2H system for mapping protein-protein interactions.
We will then provide a step-by-step protocol for the Y2H experimentation preceded by a materials listing while simultaneously including key notes throughout the entire experimental process of biological-mechanistic and historical understandings of the steps.
-
8.
Using structural knowledge in the protein data bank to inform the search for potential host-microbe protein interactions in sequence space: application to Mycobacterium tuberculosis.
Mahajan, G, Mande, SC
BMC bioinformatics. 2017;(1):201
Abstract
BACKGROUND A comprehensive map of the human-M. tuberculosis (MTB) protein interactome would help fill the gaps in our understanding of the disease, and computational prediction can aid and complement experimental studies towards this end. Several sequence-based in silico approaches tap the existing data on experimentally validated protein-protein interactions (PPIs); these PPIs serve as templates from which novel interactions between pathogen and host are inferred. Such comparative approaches typically make use of local sequence alignment, which, in the absence of structural details about the interfaces mediating the template interactions, could lead to incorrect inferences, particularly when multi-domain proteins are involved. RESULTS We propose leveraging the domain-domain interaction (DDI) information in PDB complexes to score and prioritize candidate PPIs between host and pathogen proteomes based on targeted sequence-level comparisons. Our method picks out a small set of human-MTB protein pairs as candidates for physical interactions, and the use of functional meta-data suggests that some of them could contribute to the in vivo molecular cross-talk between pathogen and host that regulates the course of the infection. Further, we present numerical data for Pfam domain families that highlights interaction specificity on the domain level. Not every instance of a pair of domains, for which interaction evidence has been found in a few instances (i.e. structures), is likely to functionally interact. Our sorting approach scores candidates according to how "distant" they are in sequence space from known examples of DDIs (templates). Thus, it provides a natural way to deal with the heterogeneity in domain-level interactions. CONCLUSIONS Our method represents a more informed application of local alignment to the sequence-based search for potential human-microbial interactions that uses available PPI data as a prior. 
Our approach is somewhat limited in its sensitivity by the restricted size and diversity of the template dataset, but, given the rapid accumulation of solved protein complex structures, its scope and utility are expected to keep steadily improving.
-
9.
Advances in protein complex analysis by chemical cross-linking coupled with mass spectrometry (CXMS) and bioinformatics.
Tran, BQ, Goodlett, DR, Goo, YA
Biochimica et biophysica acta. 2016;(1):123-9
Abstract
For the analysis of protein-protein interactions and protein conformations, cross-linking coupled with mass spectrometry (CXMS) has become an essential tool in recent years. A variety of cross-linking reagents are used to covalently link interacting amino acids to identify protein-binding partners. The spatial proximity of cross-linked amino acid residues is used to elucidate structural models of protein complexes. The main challenges for mapping protein-protein interactions are the low stoichiometry and low frequency of cross-linked peptides relative to unmodified linear peptides, as well as the accurate and efficient matching of spectra to the corresponding peptide sequences with low false discovery rates when identifying the site of the cross-link. We evaluate the current state of chemical cross-linking and mass spectrometry applications, with special emphasis on the recent development of informatics data processing and analysis tools that help manage the complexity of interpreting CXMS data. This article is part of a Special Issue entitled: Physiological Enzymology and Protein Functions.
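The use of spatial proximity to test structural models can be sketched as a simple distance check: each cross-linked residue pair should lie within a maximum span in a candidate model. The coordinate layout, the residue identifiers, and the `max_dist` threshold below are assumptions for illustration; the appropriate cutoff depends on the cross-linking reagent and is not given in the abstract.

```python
import math


def check_crosslinks(coords, crosslinks, max_dist):
    """Split cross-links into those a model satisfies and those it violates.

    coords: dict mapping a residue id to its (x, y, z) position in angstroms.
    crosslinks: iterable of (residue_a, residue_b) pairs identified by MS.
    max_dist: largest allowed distance for a linked pair, chosen by the user.
    """
    satisfied, violated = [], []
    for a, b in crosslinks:
        distance = math.dist(coords[a], coords[b])
        bucket = satisfied if distance <= max_dist else violated
        bucket.append((a, b, distance))
    return satisfied, violated
```

A model with many violated links is a poor candidate, which makes a check of this kind a cheap filter when ranking alternative arrangements of a complex.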
-
10.
Sequence-based prediction of protein-protein interactions using weighted sparse representation model combined with global encoding.
Huang, YA, You, ZH, Chen, X, Chan, K, Luo, X
BMC bioinformatics. 2016;(1):184
Abstract
BACKGROUND Proteins are important molecules that participate, often in pairs, in virtually every aspect of cellular function within an organism. Although high-throughput technologies have generated a considerable amount of protein-protein interaction (PPI) data for various species, experimental methods are both time-consuming and expensive. In addition, they are usually associated with high rates of both false positive and false negative results. Accordingly, a number of computational approaches have been developed to effectively and accurately predict protein interactions. However, most of these methods typically perform worse when other biological data sources (e.g., protein structure information, protein domains, or gene neighborhood information) are not available. Therefore, there is an urgent need to develop effective computational methods for predicting PPIs using protein sequence information alone. RESULTS In this study, we present a novel computational model combining a weighted sparse representation based classifier (WSRC) and global encoding (GE) of amino acid sequences. Two kinds of protein descriptors, composition and transition, are extracted to represent each protein sequence. On the basis of this feature representation, a novel weighted sparse representation based classifier is introduced to predict the protein interaction class. When the proposed method was evaluated with the PPI data of S. cerevisiae, Human and H. pylori, it achieved high prediction accuracies of 96.82%, 97.66% and 92.83%, respectively. Extensive experiments were performed for cross-species PPI prediction, and the prediction accuracies were also very promising. CONCLUSIONS To further evaluate the performance of the proposed method, we then compared its performance with the method based on support vector machine (SVM). The results show that the proposed method achieved a significant improvement.
Thus, the proposed method is a very efficient method to predict PPIs and may be a useful supplementary tool for future proteomics studies.
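The composition and transition descriptors mentioned in this abstract can be illustrated on a binary characteristic sequence, in which each residue is marked 1 if it belongs to a chosen residue group and 0 otherwise. The sketch below is a simplified reading of that idea; the residue group passed in, the normalization, and the function names are assumptions, and the full global encoding scheme (its residue classes and sub-sequence splits) is not reproduced here.

```python
def binary_encode(sequence, group):
    """Mark each residue 1 if it belongs to `group`, else 0."""
    return [1 if residue in group else 0 for residue in sequence]


def composition(bits):
    """Fraction of positions that carry a 1."""
    if not bits:
        raise ValueError("sequence must not be empty")
    return sum(bits) / len(bits)


def transition(bits):
    """Fraction of adjacent position pairs where the bit changes (0->1 or 1->0)."""
    if len(bits) < 2:
        raise ValueError("need at least two positions")
    changes = sum(1 for a, b in zip(bits, bits[1:]) if a != b)
    return changes / (len(bits) - 1)
```

Repeating these two measurements over several residue groupings, and over sub-sequences of the protein, yields a fixed-length numeric vector that a classifier can consume.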