Candidate Targets Pipeline

Drugs, Antitoxins, and Vaccines

As part of NMPDR’s effort to provide web-based resources to scientists studying organisms that are potential agents of biowarfare or bioterrorism, or that cause emerging or re-emerging diseases, the NMPDR Targets Pipeline, currently in development, couples bioinformatic target identification with in silico docking analysis. A growing collection of candidate targets for therapeutic intervention is being generated from NMPDR sequence data, curated literature annotations, and structural data from the Protein Data Bank (PDB). Our goal is to computationally select sets of small organic compounds that, by interacting with these target proteins, have the potential to alter bacterial viability, virulence, or toxicity, or to stimulate the host immune response.

Reports of experimental evidence for essentiality, virulence, antibiotic susceptibility or resistance, surface expression, secretion, and toxicity have been annotated as "attributes" of protein sequences in the NMPDR. The curation effort represents a first pass through the literature, simply for proof of concept of the downstream analysis. Only a few reports of high-throughput screening for such characteristics in important pathogens have been encoded as attributes of the respective genes. We welcome suggestions of literature that should be curated. These annotated characteristics are used to identify target proteins for in silico screening.

The Candidates

Virtual Structural Proteomes

In addition to designated target proteins, which have both a PDB match and a curated annotation of a characteristic that implicates them as specific targets, all proteins with homologs in PDB have been compiled for one representative strain of each of the NMPDR organisms. These virtual structural proteomes, or homologous structural proteomes, are presented in tables with links to NMPDR protein pages and PDB. These tables provide a resource for target discovery.

Potential Targets

Tables of potential targets of therapeutic molecular interactions are automatically generated from the NMPDR database according to annotations and BLASTP matches to proteins in PDB, which have experimentally determined structures. These proteins are presented in separate tables for drug, antitoxin, and vaccine targets. Entries are specific for a particular pathogenic strain and include links to NMPDR protein pages, PDB and the literature.
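The table-building logic described above can be sketched in a few lines. This is a minimal illustration, not the NMPDR implementation: it assumes BLASTP results in the standard 12-column tabular format, and the protein IDs, PDB chain IDs, E-value cutoff, and the mapping from attribute to table (`TABLE_FOR_ATTRIBUTE`) are all hypothetical.

```python
from collections import defaultdict

# Hypothetical curated attributes keyed by protein ID (all IDs are made up).
ATTRIBUTES = {
    "prot_001": {"essential"},
    "prot_002": {"toxin"},
    "prot_003": {"surface"},
    "prot_004": set(),
}

# Assumed mapping from a curated attribute to the table it feeds.
TABLE_FOR_ATTRIBUTE = {"essential": "drug", "toxin": "antitoxin", "surface": "vaccine"}

# Made-up BLASTP hits against PDB in 12-column tabular format:
# query, subject, %identity, length, mismatches, gap opens,
# q.start, q.end, s.start, s.end, e-value, bit score
SAMPLE_HITS = [
    "prot_001\t1ABC_A\t62.5\t320\t120\t0\t1\t320\t5\t324\t1e-120\t410",
    "prot_001\t2XYZ_B\t40.0\t300\t180\t2\t10\t310\t1\t300\t1e-50\t190",
    "prot_002\t3DEF_A\t35.0\t150\t97\t1\t1\t150\t1\t150\t2e-20\t95.1",
    "prot_003\t4GHI_A\t25.0\t80\t60\t0\t1\t80\t1\t80\t0.5\t28.0",
    "prot_004\t5JKL_A\t90.0\t200\t20\t0\t1\t200\t1\t200\t1e-100\t380",
]


def best_pdb_hits(blast_lines, max_evalue=1e-10):
    """Return {protein: (pdb_id, bit_score)} keeping the top-scoring hit per protein."""
    best = {}
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:  # assumed significance cutoff
            continue
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


def build_target_tables(blast_lines, attributes):
    """Group proteins that have both a PDB hit and a curated attribute by table."""
    tables = defaultdict(list)
    for protein, (pdb_id, _) in sorted(best_pdb_hits(blast_lines).items()):
        for attr in sorted(attributes.get(protein, ())):
            table = TABLE_FOR_ATTRIBUTE.get(attr)
            if table:
                tables[table].append((protein, pdb_id))
    return dict(tables)


structural_proteome = sorted(best_pdb_hits(SAMPLE_HITS))
tables = build_target_tables(SAMPLE_HITS, ATTRIBUTES)
```

In this sketch, `structural_proteome` corresponds to the virtual structural proteome (every protein with a PDB homolog, annotated or not), while `tables` keeps only proteins that also carry a curated attribute.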

Broad spectrum targets for in silico and in vitro drug screening

A curated subset of proteins is presented in a table with links to both biochemical and structural data. Each protein in this subset is essential in S. aureus and/or S. pneumoniae, has homologs in many of the bacterial priority pathogens, has a homolog in PDB with an experimentally solved structure, and has been included in a subsystem by NMPDR curators. These proteins represent potential targets of broad-spectrum antibiotics and are the first set of targets to be used for computational molecular docking.
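The four selection criteria can be read as a simple conjunction of filters. The sketch below is illustrative only: the `Protein` record, the example entries, and the `min_pathogens` threshold standing in for "many" are assumptions, not values taken from NMPDR.

```python
from dataclasses import dataclass, field
from typing import Optional

REFERENCE_ORGANISMS = {"S. aureus", "S. pneumoniae"}


@dataclass
class Protein:
    """Hypothetical record holding the properties the criteria refer to."""
    name: str
    essential_in: set = field(default_factory=set)
    pathogens_with_homolog: set = field(default_factory=set)
    pdb_homolog: Optional[str] = None
    subsystem: Optional[str] = None


def is_broad_spectrum_candidate(p, min_pathogens=3):
    """Apply all four criteria; min_pathogens is an assumed cutoff for 'many'."""
    return (
        bool(p.essential_in & REFERENCE_ORGANISMS)
        and len(p.pathogens_with_homolog) >= min_pathogens
        and p.pdb_homolog is not None
        and p.subsystem is not None
    )


# Made-up example entries, one passing and four each failing one criterion.
many = {"org1", "org2", "org3", "org4"}
proteins = [
    Protein("protA", {"S. aureus"}, many, "1ABC", "Subsystem X"),
    Protein("protB", {"S. aureus", "S. pneumoniae"}, {"org1"}, "2DEF", "Subsystem Y"),
    Protein("protC", set(), many, "3GHI", "Subsystem Z"),
    Protein("protD", {"S. pneumoniae"}, many, None, "Subsystem W"),
    Protein("protE", {"S. pneumoniae"}, many, "4JKL", None),
]

candidates = [p.name for p in proteins if is_broad_spectrum_candidate(p)]
```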

Virtual Screening

Taking advantage of the high-performance computing capabilities at Argonne National Laboratory and the University of Chicago, we have deployed two automated docking systems (DOCK5 and AutoDock) to run in silico screens of small compound structures against the three-dimensional structures of selected targets. In silico screening involves computational molecular docking of a library of compounds against a protein structure, using an algorithm to compute the binding energy of each ligand to that structure. Preliminary results for a few targets have been used to optimize a novel method for compound selection.
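To make the idea of scoring a ligand pose against a protein structure concrete, here is a toy example. It is not the scoring function of DOCK5 or AutoDock: it assumes a single 12-6 Lennard-Jones-style term with one made-up atom type (`r0`, `eps`), and the coordinates are invented.

```python
import math


def pair_energy(r, r0=4.0, eps=1.0):
    """12-6 style pair term: minimum of -eps at r == r0, steeply repulsive below it."""
    ratio6 = (r0 / r) ** 6
    return eps * (ratio6 * ratio6 - 2.0 * ratio6)


def toy_binding_energy(ligand_atoms, protein_atoms):
    """Sum the pair term over every ligand-protein atom pair."""
    return sum(pair_energy(math.dist(a, b)) for a in ligand_atoms for b in protein_atoms)


def rank_poses(poses, protein_atoms):
    """Return (energy, pose_name) tuples sorted from most to least favorable."""
    return sorted((toy_binding_energy(atoms, protein_atoms), name) for name, atoms in poses.items())


# Invented coordinates (in arbitrary length units) for two protein atoms
# and three single-atom ligand poses.
protein_atoms = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
poses = {
    "good": [(2.0, 3.5, 0.0)],   # near the optimal distance from both atoms
    "clash": [(1.0, 0.0, 0.0)],  # overlaps a protein atom
    "far": [(2.0, 20.0, 0.0)],   # too distant to interact
}

ranked = rank_poses(poses, protein_atoms)
```

Lower energies indicate more favorable poses, so a screen keeps the compounds whose best pose has the most negative score.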

Potential drug compounds are taken from the ZINC database. ZINC is a free database of more than 4.6 million commercially available compounds, which are presented in ready-to-dock 3D formats with links to vendors. Compounds are thus available for both biochemical and virtual screening. Other sources of information about small chemical compounds that may be used as potential drugs include ChemDB, Super Drug Database, DrugBank, and Drugs@FDA.

The Pipeline

Identification of candidate targets and in silico screening follows these steps:

  1. Find published evidence for characteristics such as essential for viability, antibiotic target, antibiotic resistance factor, virulence factor, surface determinant, or toxin.
  2. Screen annotated functional roles and subsystem associations of genes with any of these characteristics to find those that are common among the pathogens but functionally or structurally distinct from the host.
  3. Compare candidate sequences to the Protein Data Bank (PDB) with BLASTP to find homologous proteins with experimentally solved structures.
  4. Compute the number and size of potential active sites on the PDB coordinates of the remaining candidates with the PASS (Putative Active Sites with Spheres) algorithm.
  5. Use DOCK5, configured for fast screening of large libraries, to compute the binding energy to the target protein for 1500 compounds randomly chosen from the ZINC database.
  6. Train a backpropagation neural network specific to the target, using the resulting docking energies and a multi-dimensional orthogonal vector describing the nine quantitative properties associated with these compounds.
  7. Use the neural net to predict the binding energies of all compounds in the ZINC database and select a set of compounds enriched for probable ligands.
  8. Use DOCK5 to compute the binding energy to the target protein for each compound in this enriched set of ligands.
  9. Create visualizations of top-ranked ligands bound to the protein to inspect interactions between ligand functional groups and key residues of the protein.
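
Steps 5 through 8 can be sketched as a sample, train, predict, and re-dock loop. The code below is a minimal illustration under stated assumptions, not the pipeline's implementation: `mock_dock` is a hypothetical stand-in for a DOCK5 run that returns invented energies, the compound library and its nine descriptors are random numbers, and the network size, learning rate, and sample sizes are arbitrary choices scaled down for the example.

```python
import math
import random

N_DESCRIPTORS = 9  # nine quantitative properties per compound, as in step 6


def mock_dock(descriptors):
    """Hypothetical stand-in for a docking run; returns a made-up energy."""
    weights = [0.9, -0.4, 0.7, 0.2, -0.6, 0.5, 0.1, -0.3, 0.8]
    linear = sum(w * x for w, x in zip(weights, descriptors))
    return -5.0 * linear - 2.0 * descriptors[0] * descriptors[2]


class TinyNet:
    """One tanh hidden layer and a linear output, trained by backpropagation."""

    def __init__(self, n_in, n_hidden, rng):
        # The last element of each weight list is the bias.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def _hidden(self, x):
        return [math.tanh(sum(w * v for w, v in zip(row, x)) + row[-1]) for row in self.w1]

    def predict(self, x):
        h = self._hidden(x)
        return sum(w * v for w, v in zip(self.w2, h)) + self.w2[-1]

    def train_step(self, x, y, lr):
        """One stochastic gradient step on the squared error for a single sample."""
        h = self._hidden(x)
        err = sum(w * v for w, v in zip(self.w2, h)) + self.w2[-1] - y
        for j, hj in enumerate(h):
            grad_h = err * self.w2[j] * (1.0 - hj * hj)  # uses w2[j] before its update
            self.w2[j] -= lr * err * hj
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * grad_h * xi
            self.w1[j][-1] -= lr * grad_h
        self.w2[-1] -= lr * err


def mse(net, xs, ys):
    return sum((net.predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def enrich(library, n_train=60, n_keep=20, epochs=200, lr=0.02, seed=0):
    rng = random.Random(seed)
    train_ids = rng.sample(sorted(library), n_train)      # step 5: random subset
    xs = [library[c] for c in train_ids]
    energies = [mock_dock(x) for x in xs]                 # step 5: dock the subset
    mean = sum(energies) / len(energies)
    std = math.sqrt(sum((e - mean) ** 2 for e in energies) / len(energies))
    ys = [(e - mean) / std for e in energies]             # standardize the targets
    net = TinyNet(N_DESCRIPTORS, 4, rng)
    before = mse(net, xs, ys)
    for _ in range(epochs):                               # step 6: train the network
        for x, y in zip(xs, ys):
            net.train_step(x, y, lr)
    after = mse(net, xs, ys)
    # Step 7: predict for the whole library; lower energy means better binding.
    ranked = sorted(library, key=lambda c: net.predict(library[c]))
    return ranked[:n_keep], before, after


lib_rng = random.Random(1)
library = {f"cmpd{i:04d}": [lib_rng.random() for _ in range(N_DESCRIPTORS)] for i in range(500)}
enriched, mse_before, mse_after = enrich(library)
redocked = {c: mock_dock(library[c]) for c in enriched}  # step 8: dock the enriched set
```

The point of the design is cost: only the random training subset and the final enriched set are docked, while the inexpensive network prediction covers the rest of the library.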

Topic revision: r9 - 15 Feb 2009 - 17:15:50 - TWiki Guest
Notice to NMPDR Users - The NMPDR BRC contract has ended and bacterial data from NMPDR has been transferred to PATRIC, a new consolidated BRC for all NIAID category A-C priority pathogenic bacteria. NMPDR was a collaboration among researchers from the Computation Institute of the University of Chicago, the Fellowship for Interpretation of Genomes (FIG), Argonne National Laboratory, and the National Center for Supercomputing Applications (NCSA) at the University of Illinois. NMPDR is funded by the National Institute of Allergy and Infectious Diseases, National Institutes of Health, Department of Health and Human Services, under Contract HHSN266200400042C. Banner images are copyright © Dennis Kunkel.