Machine-Learning in Drug Discovery

Virtual high-throughput screening (vHTS) can be used in drug discovery either to replace experimental HTS or to preselect a subset of compounds for screening, reducing cost and time. The two main approaches to vHTS are structure-based and ligand-based screening. Structure-based screening does not need any known active molecules, but does need either a crystal structure or a good homology model of the target protein. Ligand-based screening, on the other hand, does not need any knowledge of the target protein; instead, it identifies common patterns and features of known active molecules by creating a 2D fingerprint or 3D pharmacophore.
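As a rough illustration of the 2D fingerprint idea, the sketch below ranks a small library by Tanimoto similarity to known actives. It is a minimal example under stated assumptions: fingerprints are represented as plain sets of "on" bit indices, and the compound names and bit values are hypothetical toy data, not output from any real fingerprinting tool.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity of two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 0.0
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def rank_library(actives, library):
    """Score each library compound by its best similarity to any known active."""
    scored = {
        name: max(tanimoto(fp, active) for active in actives)
        for name, fp in library.items()
    }
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy fingerprints (bit indices chosen arbitrarily for illustration).
known_actives = [{1, 4, 7, 9}, {1, 4, 8, 9}]
library = {
    "cmpd_A": {1, 4, 7, 9, 12},
    "cmpd_B": {2, 3, 5},
    "cmpd_C": {1, 4, 9},
}

ranking = rank_library(known_actives, library)
for name, similarity in ranking:
    print(f"{name}: {similarity:.2f}")
```

Compounds at the top of the ranking would be the ones preselected for experimental screening. A pure similarity ranking like this tends to favour close analogues of the known actives.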

Scientists at Imperial College London/Equinox Pharma have developed a new rule-based vHTS methodology (INDDEx) that exploits logic-based machine learning to enhance performance in ligand-based screening. INDDEx (Investigational Novel Drug Discovery by Example) is particularly good at identifying actives that are structurally distinct from the training set, making it useful for scaffold-hopping.

INDDEx learns easily interpretable qualitative logic rules from active ligands. These rules – in the form of ‘an active molecule requires fragment A and fragment B separated by a distance in Angstroms’ or ‘an active molecule requires the presence of fragment C’ or ‘an active molecule must NOT contain fragment D’ – give an insight into chemistry, relate molecular substructure to activity and can be used to guide the next steps of drug design chemistry. These qualitative rules are then weighted using Support Vector Machines (SVMs) to produce QSAR rules that can be used to generate novel in silico hits.
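The three rule shapes quoted above, and the idea of weighting them, can be sketched as follows. This is not the INDDEx implementation: the molecule representation, the distance window, the rule weights and the bias are all hypothetical values chosen for illustration. In the real method the weights would come from a trained SVM rather than being written by hand.

```python
# Each molecule is a dict holding the fragments it contains and the
# distances (in Angstroms) between pairs of fragments. Hypothetical format.

def has_fragment(frag):
    """Rule: an active molecule requires the presence of `frag`."""
    return lambda mol: frag in mol["fragments"]


def lacks_fragment(frag):
    """Rule: an active molecule must NOT contain `frag`."""
    return lambda mol: frag not in mol["fragments"]


def fragments_within(frag_a, frag_b, low, high):
    """Rule: `frag_a` and `frag_b` are separated by a distance in [low, high]."""
    def rule(mol):
        distance = mol["distances"].get(frozenset((frag_a, frag_b)))
        return distance is not None and low <= distance <= high
    return rule


# (rule, weight) pairs; weights and bias are made up, standing in for
# the coefficients of a linear SVM trained on rule outcomes.
WEIGHTED_RULES = [
    (fragments_within("A", "B", 4.0, 6.0), 1.2),
    (has_fragment("C"), 0.7),
    (lacks_fragment("D"), 0.9),
]
BIAS = -1.5


def score(mol):
    """Linear decision value: positive suggests 'active' under these toy weights."""
    return BIAS + sum(weight for rule, weight in WEIGHTED_RULES if rule(mol))


mol_1 = {"fragments": {"A", "B", "C"}, "distances": {frozenset(("A", "B")): 5.1}}
mol_2 = {"fragments": {"A", "D"}, "distances": {}}

print(score(mol_1), score(mol_2))
```

Because each feature is a readable rule, the contribution of every fragment or fragment pair to the final score can be inspected directly, which is the interpretability advantage the text describes.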

INDDEx has been shown to be a powerful new approach to virtual screening. Its strength lies in learning topological descriptors from multiple active compounds although, when considering scaffold hopping in isolation, INDDEx performs well even when there are only small numbers of active molecules to learn from. One very attractive feature of INDDEx is that the rules that are produced can be readily understood and used by medicinal chemists. The technology has been extensively validated and shown to outperform comparable approaches (J. Phys. Chem. B 2012, 116, 6732-6739). In a joint study between Equinox and Imperial on sirtuin 2 (an NAD-dependent histone deacetylase), INDDEx combined with structure-based docking was able to learn from only eight actives and identify a chemically novel hit that was experimentally validated to have an IC50 of 0.6 µM.

INDDEx has wide-scale applications including rescuing failed programmes, directing hit-to-lead programmes and scaffold-hopping. Furthermore, INDDEx has the potential to derive rules for off-target activities such as the hERG receptor.

If you would like to find out more about INDDEx and Equinox Pharma, please contact us.

SkelGen – A Newly Available Tool for Computational Drug Design

With the rapidly growing body of biostructural information, structure-based drug design has increased in importance and a variety of computational methods have found a place in the drug discovery toolkit.

The de novo design program, SkelGen, was developed by De Novo Pharmaceuticals based on research begun in the Department of Pharmacology at the University of Cambridge. SkelGen constructs candidate ligands by assembling small molecular fragments within a protein target such as an enzyme or receptor (usually derived from X-ray crystal data). When growing a ligand, SkelGen uses information coded in the fragments and within its algorithm to favour synthetically tractable molecules. SkelGen is able to explore around one trillion low molecular weight, drug-like molecules using a default set of 1600 fragments. Since the accessible chemical space is so large, the majority of designed molecules are novel and patentable.
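To give a feel for how a fragment set of this size leads to such a large search space, here is a back-of-envelope count. It is only an order-of-magnitude sketch and not how SkelGen enumerates molecules: it assumes a molecule is an ordered chain of k fragments with repetition allowed, and it ignores attachment points, connectivity rules, symmetry and any filtering against the binding site.

```python
N_FRAGMENTS = 1600  # default fragment set size quoted in the text


def ordered_assemblies(n_fragments, k):
    """Number of ordered chains of k fragments, repetition allowed (a crude upper-level count)."""
    return n_fragments ** k


counts = {k: ordered_assemblies(N_FRAGMENTS, k) for k in range(1, 5)}
for k, count in counts.items():
    print(f"{k} fragment(s): {count:.2e}")
```

Under these simplifying assumptions, three fragments give about 4 x 10^9 combinations and four give about 6.6 x 10^12, so a figure of around one trillion is plausible for molecules built from a handful of fragments.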

Whilst SkelGen can be run with minimal input, it also permits extensive control by the end-user, allowing the scientist to incorporate prior knowledge and insights into the drug design process. As well as completely de novo design, molecule generation can also be started from a user-defined fragment (for example, a low-affinity molecule identified by fragment-screening). SkelGen can also be used for scaffold hopping (chemotype switching) and focused library design.

Until recently SkelGen was only accessible through collaborations with De Novo Pharmaceuticals but is now available under both academic and commercial licenses. With these new licensing models, SkelGen can be a cost-effective (and accessible) tool for all scientists engaged in drug design. If you would like to find out more about SkelGen, please contact us.

Targeting Orphan Receptors for Multiple Sclerosis

ROR-alpha ligand binding domain
Crystal structure of the ligand-binding domain of RORα complexed with cholesterol sulfate. PDB ID=1S0X
Scientists at the Scripps Research Institute have reported on compounds that are able to suppress severity and disease progression in animal models of multiple sclerosis. The compounds, exemplified by SR1001, act by selectively suppressing a subset of T-helper cells characterised by their production of interleukin-17 (TH17 cells). TH17 cells have been implicated in a variety of autoimmune diseases including rheumatoid arthritis, multiple sclerosis, inflammatory bowel disease and systemic lupus erythematosus.

SR1001 selectively binds to two orphan nuclear receptors: retinoic acid receptor-related orphan receptors α and γt (RORα and RORγt). These receptors have indispensable roles in the development and function of TH17 cells, providing a mechanism for modulating one component of the immune system without general immunosuppression. The team reports that SR1001 induces a conformational change in the receptors that results in their reduced affinity for co-activators and increased affinity for co-repressors. The net result is inhibition of the receptors’ transcriptional activity.

SR1001 blocked the development of murine TH17 cells and inhibited cytokine production by differentiated murine and human TH17 cells. Although a drug is some way off, the team suggests that the results demonstrate the feasibility of targeting TH17 cells and the potential of such an approach for the treatment of autoimmune diseases.

The study is published in Nature.

Role for c-Abl in Parkinson’s Disease

Shaking Man
Image: Flickr - bearroast
The majority of Parkinson’s disease (PD) cases have no known cause, but have been associated with increased oxidative stress and mitochondrial dysfunction. Of the small proportion of hereditary cases, a number of defective genes have been identified including LRRK2 (PARK8), DJ-1 (PARK7), α-synuclein (SNCA) and parkin (PARK2). Mutations in parkin, which encodes an E3 ubiquitin ligase, are believed to interfere with the ability of parkin to clear the cell of its normal substrate proteins. Several substrates for parkin have been identified and shown to accumulate in the brain tissue of patients with hereditary PD.

Researchers at The University of Texas Health Science Center have now identified a link between the tyrosine kinase, c-Abl, and impaired parkin function. The scientists found that c-Abl was activated in cultured neuronal cells and the striatum of adult mice when subjected to oxidative and neuronal stress. They also identified parkin as a specific substrate for c-Abl and showed that tyrosine-phosphorylated parkin lost its ubiquitin ligase activity.

The c-Abl inhibitor imatinib (STI-571) was able to block the phosphorylation of parkin in vitro and in vivo, restoring ligase activity. Since there are several c-Abl inhibitors approved for the treatment of chronic myelogenous leukemia, tools are available to further explore the neuroprotective potential of c-Abl inhibition in sporadic PD.

The study is published in the Journal of Neuroscience.

A Histidine Kinase as a Target for Autoimmune Diseases

NDPK-B crystal structure
Crystal structure of the hexameric NDPK-B protein - PDB ID=3BBB
The nucleoside diphosphate kinases (NDPK) comprise a family of 10 members encoded by the Nme (non-metastatic cell) gene family. These kinases are capable of transferring the γ-phosphate of nucleoside triphosphates to nucleoside diphosphates, which is accomplished via a phospho-histidine intermediate. Since their discovery, the NDPKs have been shown to play a role in numerous cellular processes. Of the 10 members, NDPK-A and B (also known as NM23-H1 and NM23-H2 respectively) are ubiquitously expressed and account for >95% of NDPK activity in most cells.

NDPK-A and B regulate cellular processes through a variety of mechanisms including generation of nucleoside triphosphates, histidine phosphorylation, protein-protein interactions and regulation of downstream signalling pathways. Interestingly, the NDPKs are currently the only known histidine kinases found in mammals.

Despite sharing 88% sequence identity, NDPK-A and B have each been associated with specific functions. Nevertheless, there appears to be significant redundancy within the family. NDPK-A knockout mice have been reported to be phenotypically normal, with the exception of reduced birth weight and delayed mammary development. However, double knockout of NDPK-A and B results in stunted mice that die perinatally as a result of severe anaemia and abnormal erythroid development.

Now a team at New York University Medical Center have reported the mouse knockout of NDPK-B. Previously the team had shown that NDPK-B activates the K+ channel, KCa3.1, by phosphorylation of His358 in the KCa3.1 carboxy terminus. Since this activation is required for T-cell receptor stimulation of Ca2+ flux and proliferation of naïve human CD4+ T-cells, the team speculated that inhibition of NDPK-B could represent a target for therapy of autoimmune diseases.

The NDPK-B knockout mice were phenotypically normal at birth, with normal T and B cell development. However, KCa3.1 channel activity and cytokine production were defective in Th1 and Th2 cells (but normal in Th17 cells), confirming the importance of NDPK-B in T cell activation. The data support the concept of NDPK-B inhibition as a therapeutic strategy, although specificity for NDPK-B over the A isoform will be necessary. Given the degree of sequence conservation between the two isoforms, this could be a significant challenge.

The BMK1 Pathway in Oncology – a Road Less Travelled

Image: Flickr – Zach Chisholm
Of the four mammalian MAP kinase pathways (ERK1/2, JNK, p38 and BMK1), BMK1 is the least studied. BMK1 and ERK1/2 pathways are both activated by mitogens and oncogenic signals and are therefore implicated in tumorigenesis. Indeed, the ERK1/2 pathway has received significant attention for the development of chemotherapeutic drugs. Deregulated BMK1 activity has been associated with a variety of human malignancies including chemoresistance of breast tumours, metastasis of prostate tumour cells and tumour-associated angiogenesis. Conditional knockout of endothelial BMK1 in mice, however, led to lethal vascular instability, discouraging exploration of BMK1 as a therapeutic target.

XMD8-92 structure
A new study from scientists at the Scripps Research Institute has revealed more detail on the role of BMK1 in oncogenesis and suggests that BMK1 inhibition could be a viable therapeutic strategy. The study found that BMK1 is associated with the tumour suppressor, PML (promyelocytic leukemia protein), and suppresses its anti-cancer activity. In cellular studies, reduced expression of BMK1 resulted in induced expression of p21, a downstream effector of PML and modulator of cell proliferation.

The team’s serendipitous discovery of a selective inhibitor of BMK1, XMD8-92, permitted further studies in animal models. XMD8-92 significantly inhibited the growth of xenografted human tumours in mice, with no obvious adverse effects. More specifically, in contrast to the BMK1 conditional knockout studies, no vascular instability was observed in response to pharmacological inhibition of BMK1.

The study is published in Cancer Cell.

Speeding the Cellular Waste Disposal System

Garbage Trucks
Image: Flickr - Michael Kohli
The ubiquitin-proteasome system (UPS) is a critical element of the cellular machinery, responsible for removing unwanted proteins. Target proteins, which may be misfolded, oxidised or simply no longer required, are marked for degradation by attachment of ubiquitin chains. The ubiquitinated proteins are then recognised by the proteasome and subjected to proteolytic cleavage. Failure of the UPS leads to a build up of damaged or misfolded proteins that may result in cellular toxicity.

Accumulation of misfolded proteins is a feature of a number of human disorders including Alzheimer’s, Parkinson’s and Creutzfeldt-Jakob diseases. Upregulation of the UPS is therefore of interest for potential therapy of these disorders.

Usp14 crystal structure
Complex of Usp14 (grey spheres) with ubiquitin aldehyde (blue ribbon). PDB ID=2AYO

A team from Harvard Medical School has been investigating the role of Usp14, a de-ubiquitinating enzyme associated with the proteasome. They found that Usp14 is able to inhibit the proteasomal degradation of ubiquitin-protein conjugates both in vitro and in cells. A catalytically inactive variant of Usp14 had reduced inhibitory properties, suggesting that Usp14 mediates its effects by cleavage of ubiquitin from the substrate proteins.

The team identified a small-molecule inhibitor of Usp14 using high-throughput screening. Treatment of cultured cells with this compound enhanced the degradation of proteasome substrates associated with neurodegeneration. The compound also accelerated the degradation of oxidised proteins and improved cellular resistance to oxidative stress.

The study, published in Nature, sheds light on a poorly understood aspect of the UPS – the control of the speed of protein degradation. The authors suggest that inhibition of Usp14 may be a strategy to address a variety of human diseases where accumulation of aberrant proteins is a factor.

What’s the Structure of……??

For those who haven’t encountered it yet, chemicalize is a tool for adding chemical information to your web browsing – and it’s free! Provided by the chemistry software company ChemAxon under a Creative Commons license, chemicalize can convert chemical names to structures – either on a query basis or by converting an entire web page. In addition, ChemAxon have now added a page of calculated properties that can be accessed by clicking the generated 2D structure.

The tool will convert trivial chemical names such as saquinavir, as well as IUPAC names such as 7-chloro-1-methyl-5-phenyl-3H-1,4-benzodiazepin-2-one. Structure generation is based on ChemAxon’s name-to-structure software, although conversion of trivial names presumably relies on a database. I don’t know how comprehensive the database is, but it can certainly generate some interesting results when converting a web page – I hadn’t realised that trigger was a trivial name for something!

When a web page is converted, recognised structures are underlined in the text. Hovering over the underlined text produces a tooltip with the 2D structure of the molecule, and this can be clicked to visit the calculated properties page. The properties page provides a variety of useful information including logP, rotatable bond count, pKa etc. The layout of the properties page can also be adjusted by the user, with some standard layouts provided for medicinal or synthetic chemists.

You can see the chemicalized version of this post here and visit chemicalize for further information.

Low-Fat or Low-Carb?

Image: Flickr – malias
There has long been debate about the relative merits of a low-carbohydrate diet, as popularised by Atkins, compared to the more traditional low-fat approach to weight loss. A low-carbohydrate diet has also been anecdotally associated with adverse effects on health.

A newly published clinical study, led by researchers at the Center for Obesity Research and Education at Temple University, Philadelphia, has now shown remarkably little difference between the two regimes. The study followed over 300 subjects randomly assigned to either diet over a two-year period and, importantly, combined the diets with comprehensive behavioural treatment.

In the low-carb group, carbohydrate intake was limited to 20 g/d for 3 months in the form of low–glycemic index vegetables with unrestricted consumption of fat and protein. After 3 months, participants were allowed to increase their carbohydrate intake (5 g/d per wk) until a stable and desired weight was achieved. The low-fat diet consisted of limited energy intake (1200 to 1800 kcal/d) with less than 30% of the calories derived from fat. For the behavioural treatment, each participant attended group sessions weekly for the first 20 weeks of the study, every other week for the next 20 weeks, and once every other month for the remainder of the study. In each session, participants discussed topics such as goal setting, self-monitoring, and limiting triggers to overeating.
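The two diet prescriptions can be turned into concrete daily numbers with a little arithmetic. The sketch below is illustrative only and rests on assumptions not stated in the study summary: fat is taken to provide 9 kcal per gram (the standard energy factor), "3 months" is treated as 12 weeks, and the carbohydrate allowance is assumed to keep rising by 5 g/d every week, whereas in the study the increase stopped once a stable, desired weight was reached.

```python
KCAL_PER_GRAM_FAT = 9   # standard energy factor for fat (assumption, not from the study)
INDUCTION_WEEKS = 12    # assumption: "3 months" treated as 12 weeks


def max_fat_grams(kcal_per_day, fat_fraction=0.30):
    """Upper bound on daily fat (g) if less than fat_fraction of energy comes from fat."""
    return kcal_per_day * fat_fraction / KCAL_PER_GRAM_FAT


def carb_allowance(week, start_g=20, step_g=5):
    """Daily carbohydrate allowance (g/d) in a given study week, counting from week 1."""
    if week <= INDUCTION_WEEKS:
        return start_g
    return start_g + step_g * (week - INDUCTION_WEEKS)


for kcal in (1200, 1800):
    print(f"{kcal} kcal/d -> under {max_fat_grams(kcal):.0f} g fat per day")

for week in (1, 12, 13, 16):
    print(f"week {week}: {carb_allowance(week)} g carbohydrate per day")
```

Under these assumptions, the low-fat arm corresponds to less than roughly 40 to 60 g of fat per day, while a low-carb participant would be at 40 g/d of carbohydrate four weeks after the initial restriction ended.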

Although attrition was high at 2 years, there were no differences in weight, body composition, or bone mineral density between the groups at any time point. Weight loss was approximately 11 kg (11%) at 1 year and 7 kg (7%) at 2 years. The low-carbohydrate diet group had greater increases in high-density lipoprotein cholesterol (“good” cholesterol) levels at all time points, increasing by approximately 23% at 2 years, suggesting that a low-carb diet may have some cardiovascular benefit.

Gary Foster, Director of Temple’s Center for Obesity Research and Education and lead author of the study said:

When comparing these two popular weight loss plans, none of the existing research had included a comprehensive, long-term, behavioural support component. This research tells us that people wanting to manage their weight need to be less concerned with which diet they choose, and more concerned with incorporating behavioural changes into their plan.

The study is published in Annals of Internal Medicine.

Illuminating the Link between Bone and Metabolism

Bone Tree
Image: Flickr – Livin-Lively
Two back-to-back studies published in the July 23rd issue of Cell, one from Columbia University Medical Center and the other from Johns Hopkins researchers, further the hypothesis that metabolic control and bone remodelling are inextricably linked. Both studies point to osteocalcin, a hormone released by bone, as a key mediator of this link.

The Johns Hopkins study used a conditional knock-out in mice to specifically suppress the expression of the insulin receptor in osteoblasts, the bone-forming cells of the skeletal system. As the mutant mice aged they became fat, had elevated blood sugar, and were glucose intolerant and resistant to insulin, mirroring the picture of diabetes in humans. The researchers found that the mutant mice had fewer osteoblasts, reduced bone formation and lower levels of circulating undercarboxylated osteocalcin (the active form of the hormone). The study showed that signalling via the insulin receptor in osteoblasts suppressed Twist2, an inhibitor of osteoblast development, and enhanced expression of osteocalcin, a mediator of insulin sensitivity and secretion.

The Columbia study links the complete bone remodelling process to energy regulation. Osteocalcin is released from osteoblasts predominantly in an inactive, carboxylated form. The researchers demonstrated that insulin signalling in osteoblasts stimulates release of inactive osteocalcin and activates osteoclasts, which activate the osteocalcin via decarboxylation in a bone-resorption-dependent manner.

The studies clearly have potential impact on human therapy, although significant questions remain. As yet the receptor for undercarboxylated osteocalcin is unknown, so the mechanism by which the hormone stimulates insulin release is unclear. Further work will be necessary to understand the interplay between skeletal and metabolic homeostasis in humans.

The two papers are previewed in Cell.