Machine-Learning in Drug Discovery


Virtual high-throughput screening (vHTS) can be used in drug discovery to replace experimental HTS, or to preselect a subset of compounds for screening in order to reduce cost and time. The two main approaches to vHTS are structure-based and ligand-based screening. Structure-based screening does not need any known active molecules, but does need either a crystal structure or a good homology model of the target protein. Ligand-based screening, on the other hand, does not need any knowledge of the target protein; instead, it identifies common patterns and features of known active molecules by building a 2D fingerprint or 3D pharmacophore.
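To make the 2D-fingerprint idea concrete, here is a minimal sketch of generic fingerprint-based ranking using the Tanimoto coefficient. It is not the INDDEx method. The function names, the representation of a fingerprint as a set of on-bit indices, and the "best match to any known active" ranking are all assumptions made for illustration.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints.

    Each fingerprint is assumed to be a set of on-bit indices.
    """
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def rank_library(known_actives, library):
    """Rank library compounds by their best similarity to any known active.

    known_actives: list of fingerprints (sets of ints)
    library: dict mapping compound name -> fingerprint
    """
    scored = [
        (name, max(tanimoto(fp, active) for active in known_actives))
        for name, fp in library.items()
    ]
    # Highest similarity first
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

In practice the fingerprints would come from a cheminformatics toolkit rather than hand-written sets, and the top-ranked compounds would form the preselected subset sent for experimental screening.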

Scientists at Imperial College London and Equinox Pharma have developed a new rule-based vHTS methodology, INDDEx (Investigational Novel Drug Discovery by Example), that exploits logic-based machine learning to enhance performance in ligand-based screening. INDDEx is particularly good at identifying actives that are structurally distinct from the training set, which makes it useful for scaffold-hopping.

INDDEx learns easily interpretable qualitative logic rules from active ligands. These rules take forms such as ‘an active molecule requires fragment A and fragment B separated by a given distance in Angstroms’, ‘an active molecule requires the presence of fragment C’ or ‘an active molecule must NOT contain fragment D’. They give insight into the underlying chemistry, relate molecular substructure to activity, and can be used to guide the next steps of drug design chemistry. The qualitative rules are then weighted using Support Vector Machines (SVMs) to produce quantitative structure-activity relationship (QSAR) rules that can be used to generate novel in silico hits.
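The three rule shapes described above can be sketched as boolean tests on a molecule, combined into a weighted linear score. This is an illustrative toy only, not the INDDEx implementation: the `Molecule` class, the distance tolerance, the rule helpers, and the numeric weights and bias are all hypothetical. In the real method the weights would be learned by an SVM rather than written by hand.

```python
from dataclasses import dataclass, field


@dataclass
class Molecule:
    """Toy molecule: fragment labels plus fragment-pair distances in Angstroms."""
    name: str
    fragments: set
    distances: dict = field(default_factory=dict)  # {frozenset({"A", "B"}): 5.4}


def has_fragment(frag):
    """Rule: the molecule contains the fragment."""
    return lambda mol: frag in mol.fragments


def lacks_fragment(frag):
    """Rule: the molecule must NOT contain the fragment."""
    return lambda mol: frag not in mol.fragments


def pair_within(frag_a, frag_b, target, tolerance=1.0):
    """Rule: both fragments are present at roughly `target` Angstroms apart.

    The tolerance window is an assumption made for this sketch.
    """
    def rule(mol):
        d = mol.distances.get(frozenset((frag_a, frag_b)))
        return d is not None and abs(d - target) <= tolerance
    return rule


# (description, rule, weight); weights are made-up stand-ins for SVM output
RULES = [
    ("A and B about 5 Angstroms apart", pair_within("A", "B", 5.0), 1.2),
    ("contains C", has_fragment("C"), 0.7),
    ("does not contain D", lacks_fragment("D"), 0.9),
]
BIAS = -1.5  # hypothetical offset


def rule_vector(mol):
    """Binary feature vector: 1 where the rule fires, 0 otherwise."""
    return [1 if rule(mol) else 0 for _, rule, _ in RULES]


def score(mol):
    """Weighted sum of fired rules; a higher score means more likely active."""
    return BIAS + sum(w * x for (_, _, w), x in zip(RULES, rule_vector(mol)))
```

Because each feature corresponds to a readable rule, the score of any molecule can be traced back to the specific rules that fired, which is the interpretability property the text emphasises.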

INDDEx (Investigational Novel Drug Discovery by Example)


INDDEx has been shown to be a powerful new approach to virtual screening. Its strength lies in learning topological descriptors from multiple active compounds, although for scaffold hopping specifically it performs well even when only a small number of active molecules are available to learn from. One very attractive feature of INDDEx is that the rules it produces can be readily understood and used by medicinal chemists. The technology has been extensively validated and shown to outperform comparable approaches (J. Phys. Chem. B 2012, 116, 6732-6739). In a joint study by Equinox and Imperial on sirtuin 2 (an NAD-dependent histone deacetylase), INDDEx combined with structure-based docking learned from only eight actives and identified a chemically novel hit that was experimentally validated to have an IC50 of 0.6 µM.

INDDEx has wide-ranging applications, including rescuing failed programmes, directing hit-to-lead programmes and scaffold-hopping. Furthermore, INDDEx has the potential to derive rules for off-target activities, such as activity at the hERG channel.

If you would like to find out more about INDDEx and Equinox Pharma, please contact us.
