The Role of Bioinformatics in Studying G-Protein-Coupled Receptors

GPCRs are the foremost receptor superfamily among drug targets: more than 30% of the drugs on the market act on them. Nevertheless, experimentally determined structural and functional data on GPCRs remain scarce. Encouragingly, bioinformatics methods have made it possible to explore the structure and function of GPCRs computationally.

G-Protein-Coupled Receptors

G-protein-coupled receptors (GPCRs) are a diverse family of membrane receptors that relay a wide range of extracellular signals into the cell. A typical GPCR comprises an extracellular N-terminus, 7 transmembrane α-helices (TM1-TM7), 3 extracellular loops (ECL1-ECL3), 3 intracellular loops (ICL1-ICL3), and an intracellular C-terminus, and it carries out many biological functions through its interplay with other signaling pathways. GPCRs are key drug targets across a wide range of conditions, including depression, pain, obesity, asthma, anxiety, hypertension, cancer, cardiovascular diseases, Parkinson's disease, and diabetes. Current counts list 739 GPCR drugs for pain management, 486 for asthma treatment, and 480 for hypertension, and more than 6,600 GPCR drugs are at various stages of development. By mode of action, these drugs are classified as agonists, antagonists, inverse agonists, modulators, and others. Given this potential for drug development and basic research, GPCRs have attracted substantial attention from the scientific community.

Bioinformatics Research on GPCRs

The amino acid sequence of a protein determines its three-dimensional structure, which in turn shapes its biological role. Determining the three-dimensional structure of GPCRs is therefore central to understanding how they function. Because GPCRs are membrane proteins with seven transmembrane helices, they are difficult to crystallize. Determining their spatial structure by X-ray diffraction remains arduous, and capturing their dynamic structure in solution by nuclear magnetic resonance is equally demanding.

Nonetheless, statistical analysis has shown that GPCRs typically share a conserved transmembrane helical structure with distinctive sequence features, which makes the positions of their transmembrane helices amenable to prediction by bioinformatics methods. Current bioinformatics research on GPCRs focuses on three areas: the identification of GPCR proteins, the prediction of GPCR transmembrane regions, and the prediction of GPCR function and drug ligand binding sites.

Identification of GPCR Proteins

Identifying GPCR proteins by sequence similarity relies on the conservation of functional sequences. Sequence alignment tools such as BLAST are used to search non-redundant nucleotide sequences, expressed sequence tags (ESTs), and protein sequences for potential GPCR sequences. When a novel sequence is sufficiently similar to known GPCR sequences, its transmembrane regions are examined to confirm it as a GPCR, and in some cases this reveals new GPCR subfamilies. The drawback of this method is that the sheer volume of sequences to be searched imposes a substantial computational burden, which in turn affects the accuracy of the results.
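The similarity test at the heart of this approach can be illustrated with a small local-alignment sketch. It is not BLAST: BLAST uses a seed-and-extend heuristic and substitution matrices, whereas the code below runs an exhaustive Smith-Waterman alignment with a simple match/mismatch score. The function names, scoring values, and the `min_fraction` cutoff are illustrative assumptions.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score between a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def screen_candidates(query: str, references: dict, min_fraction: float = 0.5):
    """Return (name, score) for references that resemble the query.

    A reference counts as a hit when its local score reaches min_fraction
    of the query's self-alignment score (an assumed, arbitrary cutoff).
    """
    self_score = smith_waterman_score(query, query)
    hits = []
    for name, seq in references.items():
        score = smith_waterman_score(query, seq)
        if self_score > 0 and score / self_score >= min_fraction:
            hits.append((name, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical toy sequences, not real GPCR fragments.
    known = {"ref_a": "MNGTEGPNFYVPFS", "ref_b": "QQQQQQQQ"}
    print(screen_candidates("MNGTEGPNFYVPFS", known))
```

The quadratic cost of each pairwise comparison also shows why searching very large sequence collections is computationally expensive and why heuristic tools are used in practice.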

Another approach uses prediction software to find all possible open reading frames (ORFs) that could encode GPCRs. Known GPCR sequences are removed from the existing protein sequence collection, and the remaining uncharacterized ORF sequences are compiled into a database. This database is then searched with BLAST using a predefined set of known GPCR sequences. This approach also has limitations: ORF prediction software is only moderately accurate, and the method depends heavily on it. Moreover, the predicted ORFs directly affect the interpretation of transmembrane helical regions, which adds a further complication.
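The ORF step can be sketched in a few lines. This is a minimal illustration, assuming the simplest possible ORF definition (an ATG followed in frame by a stop codon) and an exact-match filter standing in for the removal of known sequences; real ORF predictors use far richer models. All names and the `min_codons` default are assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of an uppercase or lowercase DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def find_orfs(dna: str, min_codons: int = 100):
    """Return (start, end, frame) for each ATG...stop ORF on the given strand.

    start is the index of the ATG and end is one past the stop codon.
    min_codons counts codons from the ATG up to, but not including, the stop.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(dna) - 2, 3):
            codon = dna[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos
            elif start is not None and codon in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3, frame))
                start = None
    return orfs


def novel_orfs(dna: str, known_sequences: set, min_codons: int = 100):
    """ORF nucleotide sequences from both strands, minus those already known."""
    candidates = []
    for strand in (dna.upper(), reverse_complement(dna)):
        for start, end, _frame in find_orfs(strand, min_codons):
            candidates.append(strand[start:end])
    return [orf for orf in candidates if orf not in known_sequences]


if __name__ == "__main__":
    # Hypothetical toy input; min_codons is lowered so the short ORF is kept.
    print(novel_orfs("CCATGAAATTTGGGTAACC", known_sequences=set(), min_codons=3))
```

The remaining ORFs would then be passed to a similarity search against known GPCR sequences, so any ORF missed or mis-delimited at this stage is carried into every later step.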

Prediction of GPCR Transmembrane Regions

Accurately separating the transmembrane and non-transmembrane regions of a GPCR is essential for understanding its biological function. In the mid-1980s, researchers found that the loops flanking transmembrane segments on the cytoplasmic side are enriched in positively charged amino acids. In the early 1990s, the TopPred tool combined this "positive charge inside" principle with hydrophobicity analysis and significantly improved the accuracy of transmembrane region prediction. The MEMSAT tool, introduced in the mid-1990s, used the relative frequency with which each amino acid occurs in the transmembrane core, inside and outside the membrane, and at the ends of transmembrane segments, compared with transmembrane proteins overall. These preferences, combined with a dynamic programming algorithm, were used to predict the transmembrane regions of proteins.
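The combination of hydrophobicity analysis and the "positive charge inside" principle can be sketched as follows. This is not the TopPred algorithm: it uses the Kyte-Doolittle hydropathy scale with a 19-residue window and a 1.6 cutoff as commonly used defaults, and the function names and the 15-residue flank length are assumptions made for illustration.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 19):
    """Mean hydropathy of each window, indexed by window start."""
    # Non-standard residues are scored as 0.0 (an assumption of this sketch).
    scores = [KYTE_DOOLITTLE.get(aa, 0.0) for aa in seq.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def candidate_segments(seq: str, window: int = 19, threshold: float = 1.6):
    """Merge overlapping high-hydropathy windows into (start, end) spans."""
    segments = []
    for i, value in enumerate(hydropathy_profile(seq, window)):
        if value < threshold:
            continue
        if segments and i <= segments[-1][1]:
            segments[-1] = (segments[-1][0], i + window)
        else:
            segments.append((i, i + window))
    return segments


def flank_positive_charges(seq: str, segment, flank: int = 15):
    """Count K and R in the flanks before and after a segment."""
    start, end = segment
    before = seq[max(0, start - flank):start].upper()
    after = seq[end:end + flank].upper()
    return (before.count("K") + before.count("R"),
            after.count("K") + after.count("R"))


if __name__ == "__main__":
    # Hypothetical toy sequence: charged flank, hydrophobic core, acidic flank.
    toy = "K" * 20 + "L" * 20 + "D" * 20
    for segment in candidate_segments(toy):
        print(segment, flank_positive_charges(toy, segment))
```

Under the "positive charge inside" principle, the flank with more K and R residues is the better candidate for the intracellular side. Merged windows overestimate the segment boundaries, so the spans should be read as approximate.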

Similarly, the TMHMM prediction tool uses a hidden Markov model (HMM) built from the amino acid distributions observed in known transmembrane proteins at several locations: the two ends of the transmembrane region, the transmembrane core, the intracellular and extracellular loops, extended loops, and regions far from the membrane. By calculating the probability that each residue lies in a transmembrane, intracellular, or extracellular region, the tool predicts the transmembrane regions of transmembrane proteins. The HMMTOP prediction tool, also based on an HMM, uses a model with five states: the transmembrane core region, the intracellular and extracellular loops, and the intracellular and extracellular helix tails. It predicts transmembrane regions and topology by finding the assignment of states that shows the largest differences in amino acid distribution among them.
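How an HMM assigns each residue to a region can be shown with a toy Viterbi decoder. The model below is far simpler than TMHMM or HMMTOP: it has four states, three residue classes, and hand-picked probabilities that are illustrative assumptions rather than values estimated from real proteins. Two membrane states are used so that the path must alternate between the inside and the outside.

```python
import math

HYDROPHOBIC = set("AILMFVWC")
POSITIVE = set("KR")

STATES = ("inside", "tm_in_to_out", "outside", "tm_out_to_in")
START = {"inside": 0.5, "outside": 0.5}
TRANS = {  # missing entries mean probability 0
    "inside": {"inside": 0.9, "tm_in_to_out": 0.1},
    "tm_in_to_out": {"tm_in_to_out": 0.9, "outside": 0.1},
    "outside": {"outside": 0.9, "tm_out_to_in": 0.1},
    "tm_out_to_in": {"tm_out_to_in": 0.9, "inside": 0.1},
}
MEMBRANE_EMIT = {"hydrophobic": 0.85, "positive": 0.02, "other": 0.13}
EMIT = {
    "inside": {"hydrophobic": 0.20, "positive": 0.25, "other": 0.55},
    "outside": {"hydrophobic": 0.25, "positive": 0.05, "other": 0.70},
    "tm_in_to_out": MEMBRANE_EMIT,
    "tm_out_to_in": MEMBRANE_EMIT,
}


def residue_class(aa: str) -> str:
    aa = aa.upper()
    if aa in HYDROPHOBIC:
        return "hydrophobic"
    return "positive" if aa in POSITIVE else "other"


def _log(p: float) -> float:
    return math.log(p) if p > 0 else float("-inf")


def viterbi(seq: str):
    """Most probable state for each residue under the toy model."""
    if not seq:
        return []
    obs = [residue_class(aa) for aa in seq]
    score = {s: _log(START.get(s, 0.0)) + _log(EMIT[s][obs[0]]) for s in STATES}
    back = []
    for o in obs[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            prev, best = max(
                ((p, score[p] + _log(TRANS[p].get(s, 0.0))) for p in STATES),
                key=lambda item: item[1],
            )
            new_score[s] = best + _log(EMIT[s][o])
            pointers[s] = prev
        score = new_score
        back.append(pointers)
    state = max(score, key=score.get)
    path = [state]
    for pointers in reversed(back):  # trace the best path backwards
        state = pointers[state]
        path.append(state)
    return path[::-1]


def collapse(path):
    """Turn a per-residue path into (state, start, end) segments."""
    segments = []
    for i, state in enumerate(path):
        if segments and segments[-1][0] == state:
            segments[-1] = (state, segments[-1][1], i + 1)
        else:
            segments.append((state, i, i + 1))
    return segments


if __name__ == "__main__":
    # Hypothetical toy sequence with one hydrophobic stretch.
    print(collapse(viterbi("KRKRKSGSGS" + "L" * 20 + "DSGSGSDSGS")))
```

Because the inside state emits positively charged residues more readily than the outside state, the decoder also orients the helix, which mirrors how topology predictors combine hydrophobicity with charge bias.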

Prediction of GPCR Function and Drug Ligand Binding Sites

Molecular docking of a GPCR with a drug ligand requires a known three-dimensional structure of the receptor. The ligand molecule is placed into the active site of the GPCR according to spatial, shape, and property complementarity, forming a receptor-ligand complex. Receptor-structure-based virtual screening uses molecular docking to automatically match the receptor binding pocket against the three-dimensional structures of the small molecules in a compound database. A scoring function, either derived from a molecular force field or empirical, then scores the docking pose of each molecule, and the group of compounds that interact best with the receptor is selected for biological activity testing. This approach significantly reduces the cost and difficulty of finding lead compounds.
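The score-and-rank step of virtual screening can be sketched with a deliberately crude force-field term. The code assumes that poses have already been generated by some docking program and scores them with a single 12-6 Lennard-Jones term applied to every atom pair; real scoring functions add electrostatics, hydrogen bonding, desolvation, and atom-type-specific parameters. The function names and the `epsilon` and `r_min` values are illustrative assumptions.

```python
import math


def lj_energy(distance: float, epsilon: float = 0.2, r_min: float = 3.8) -> float:
    """12-6 Lennard-Jones energy with a minimum of -epsilon at r_min."""
    ratio = r_min / max(distance, 1e-6)  # guard against overlapping atoms
    return epsilon * (ratio ** 12 - 2 * ratio ** 6)


def score_pose(pocket_atoms, ligand_atoms) -> float:
    """Sum pairwise energies between pocket and ligand atoms (lower is better)."""
    return sum(lj_energy(math.dist(p, l))
               for p in pocket_atoms for l in ligand_atoms)


def rank_ligands(pocket_atoms, poses: dict, top_n: int = 10):
    """Keep each ligand's best pose and return the top_n lowest scores.

    poses maps a ligand name to a list of poses; each pose is a list of
    (x, y, z) atom coordinates in the receptor's frame.
    """
    best = {name: min(score_pose(pocket_atoms, pose) for pose in ligand_poses)
            for name, ligand_poses in poses.items()}
    return sorted(best.items(), key=lambda item: item[1])[:top_n]


if __name__ == "__main__":
    # Hypothetical one-atom pocket and ligands, for illustration only.
    pocket = [(0.0, 0.0, 0.0)]
    poses = {
        "lig_a": [[(3.8, 0.0, 0.0)]],   # sits at the energy minimum
        "lig_b": [[(1.0, 0.0, 0.0)]],   # clashes with the pocket atom
    }
    for name, score in rank_ligands(pocket, poses):
        print(name, round(score, 3))
```

Only the top-ranked compounds would go forward to biological activity testing, which is where the savings in cost and effort come from.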

Copyright © Amerigo Scientific. All rights reserved.