Candidate Targets Pipeline
Drugs, Antitoxins, and Vaccines
The NMPDR Targets Pipeline, currently in development, explores bioinformatic target identification
coupled with in silico docking analysis. It is part of NMPDR’s effort to provide web-based resources
to researchers studying organisms that are considered potential agents of biowarfare or bioterrorism,
or that cause emerging or re-emerging diseases.
A growing collection of candidate targets for therapeutic intervention is being generated from
NMPDR sequence data, curated literature annotations, and structural data from
the Protein Data Bank (PDB). Our goal is to computationally select
sets of small organic compounds that have the potential to alter bacterial viability, virulence,
toxicity, or stimulation of the host immune response by interacting with these target proteins.
Reports of experimental evidence for essentiality, virulence, antibiotic susceptibility or resistance,
surface expression, secretion, and toxicity have been annotated as "attributes" of protein sequences in the NMPDR.
The curation effort represents a first pass through the literature, simply for proof of concept of the
downstream analysis. Only a few reports of high-throughput screening for such characteristics in important
pathogens have been encoded as attributes of the respective genes. We welcome suggestions
of literature that should be curated. These annotated characteristics are used to identify target proteins for in silico screening.
In addition to designated target proteins, which have both a PDB match and a curated annotation of a characteristic that implicates them as specific targets, all proteins with homologs in PDB have been compiled for one representative strain of each of the NMPDR organisms. These virtual structural proteomes, or homologous structural proteomes, are presented in tables with links to NMPDR protein pages and PDB. These tables provide a resource for target discovery.
Tables of potential targets of therapeutic molecular interactions are automatically generated from the NMPDR database according to annotations and BLASTP matches to proteins in PDB, which have experimentally determined structures. These proteins are presented in separate tables for drug, antitoxin, and vaccine targets. Entries are specific for a particular pathogenic strain and include links to NMPDR protein pages, PDB and the literature.
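As a rough illustration of the BLASTP matching step, the sketch below picks the best PDB hit per query protein from BLAST tabular output (the standard 12-column `-outfmt 6` layout). The function name `best_pdb_hits`, the E-value and identity cutoffs, and the example identifiers are assumptions made for this example; the cutoffs actually used by the NMPDR pipeline are not stated here.

```python
import csv
import io

# Column order of BLAST+ tabular output (-outfmt 6).
BLAST_FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
                "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_pdb_hits(tabular_text, max_evalue=1e-10, min_identity=30.0):
    """Return {query_id: subject_id} for the highest-scoring acceptable hit.

    max_evalue and min_identity are illustrative assumptions, not NMPDR settings.
    """
    best = {}
    reader = csv.DictReader(io.StringIO(tabular_text),
                            fieldnames=BLAST_FIELDS, delimiter="\t")
    for row in reader:
        if float(row["evalue"]) > max_evalue or float(row["pident"]) < min_identity:
            continue  # hit too weak to suggest a usable structural homolog
        bits = float(row["bitscore"])
        current = best.get(row["qseqid"])
        if current is None or bits > current[1]:
            best[row["qseqid"]] = (row["sseqid"], bits)
    return {query: subject for query, (subject, _) in best.items()}


# Hypothetical hits: geneA has two acceptable matches, geneB only a weak one.
example = "\n".join([
    "\t".join(["geneA", "pdb_hit_1", "62.5", "240", "90", "0",
               "1", "240", "5", "244", "1e-80", "250"]),
    "\t".join(["geneA", "pdb_hit_2", "41.0", "200", "118", "2",
               "10", "209", "3", "202", "1e-30", "120"]),
    "\t".join(["geneB", "pdb_hit_3", "25.0", "100", "75", "1",
               "1", "100", "1", "100", "2e-5", "45"]),
])
hits = best_pdb_hits(example)
print(hits)
```

Proteins that survive such a filter and also carry a curated attribute would then be candidates for the drug, antitoxin, or vaccine tables.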
A curated subset of proteins that are essential in S. aureus and/or S. pneumoniae, have homologs in many of the bacterial priority pathogens, have a homolog in PDB with an experimentally solved structure, and have been included in a subsystem by NMPDR curators are presented in a table with links to both biochemical and structural data. These represent potential targets of broad-spectrum antibiotics and are the first set of targets to be used for computational molecular docking.
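The four selection criteria for this subset can be read as a simple conjunction. The sketch below is one way to express it; the `Candidate` record, its field names, the example proteins, and the `min_pathogens` threshold are all hypothetical, since the text says only that homologs occur in "many" priority pathogens.

```python
from dataclasses import dataclass, field

REFERENCE_ORGANISMS = {"S. aureus", "S. pneumoniae"}


@dataclass
class Candidate:
    name: str
    essential_in: set = field(default_factory=set)   # organisms with essentiality evidence
    homolog_in: set = field(default_factory=set)     # priority pathogens with a homolog
    has_pdb_homolog: bool = False                    # solved structure available via PDB
    subsystems: list = field(default_factory=list)   # curated subsystem memberships


def is_broad_spectrum_candidate(c, min_pathogens=5):
    """Apply all four criteria; min_pathogens is an assumed stand-in for 'many'."""
    return (bool(c.essential_in & REFERENCE_ORGANISMS)
            and len(c.homolog_in) >= min_pathogens
            and c.has_pdb_homolog
            and bool(c.subsystems))


# Hypothetical records for illustration only.
prot_a = Candidate("protA", {"S. aureus"},
                   {"org1", "org2", "org3", "org4", "org5"}, True, ["subsystem X"])
prot_b = Candidate("protB", {"S. pneumoniae"}, {"org1"}, True, ["subsystem Y"])

selected = [c.name for c in (prot_a, prot_b) if is_broad_spectrum_candidate(c)]
print(selected)
```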
Taking advantage of the specific high performance computing capabilities at Argonne National Laboratory
and the University of Chicago, we have deployed two automated docking systems (DOCK5 and AutoDock) to perform
in silico screens of small compound structures against the three-dimensional structures of
selected targets. In silico screening involves computational molecular docking of a library of
compounds against a protein structure using an algorithm to compute the binding energy of a ligand to a
protein structure. Preliminary results for a few targets have been used to optimize a novel method for compound selection.
Potential drug compounds are taken from the ZINC database. ZINC is a free database of more than 4.6 million commercially available compounds, which are presented in ready-to-dock, 3D formats with links to vendors. Compounds are thus available for biochemical or virtual screening. Other sources of information about small chemical compounds that may be used as potential drugs include ChemDB, Super Drug Database, DrugBank, and Drugs@FDA.
Identification of candidate targets and in silico screening follow these steps:
- Find published evidence for characteristics such as essentiality for viability, or a role as an antibiotic target, antibiotic resistance factor, virulence factor, surface determinant, or toxin.
- Screen annotated functional roles and subsystem associations of genes with any of these characteristics to find those that are common among the pathogens but functionally or structurally distinct from the host.
- Compare candidate sequences to the Protein Data Bank (PDB) with BLASTP to find homologous proteins with experimentally solved structures.
- Compute the number and size of potential active sites on the PDB coordinates of the remaining candidates with the PASS (Putative Active Sites with Spheres) algorithm.
- Use DOCK5, configured for fast screening of large libraries, to compute the binding energy of 1500 compounds randomly chosen from the ZINC database with the target protein.
- Train a back propagation neural network specific for the target with the resulting docking energies and a multi-dimensional orthologous vector describing the nine quantitative properties associated with these compounds.
- Use the neural net to predict the binding energies of all compounds in the ZINC database and select a set of compounds enriched for probable ligands.
- Use DOCK5 to compute the binding energy of each compound in this enriched set of ligands with the target protein.
- Create visualizations of top-ranked ligands bound to the protein to inspect interactions between ligand functional groups and key residues of the protein.
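The sample, train, predict, and re-dock steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: `mock_dock` is a hypothetical stand-in that returns a made-up energy rather than calling DOCK5, the one-hidden-layer network, its size, the learning rate, and the 500-compound library are arbitrary choices, and the nine properties are random numbers rather than real ZINC descriptors.

```python
import math
import random

N_PROPERTIES = 9  # the steps above mention nine quantitative properties per compound


def mock_dock(props):
    """Hypothetical stand-in for a docking run; returns a made-up energy."""
    return -sum((i + 1) * p for i, p in enumerate(props)) / 10.0


class TinyNet:
    """One-hidden-layer regression network trained by backpropagation (SGD)."""

    def __init__(self, n_in, n_hidden, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def _forward(self, x):
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        return hidden, sum(w * h for w, h in zip(self.w2, hidden)) + self.b2

    def predict(self, x):
        return self._forward(x)[1]

    def train_epoch(self, xs, ys, lr=0.01):
        """One pass of stochastic gradient descent; returns mean squared error."""
        total = 0.0
        for x, y in zip(xs, ys):
            hidden, out = self._forward(x)
            err = out - y
            total += err * err
            for j, h in enumerate(hidden):
                # Gradient through tanh, using w2[j] before it is updated.
                grad_h = err * self.w2[j] * (1.0 - h * h)
                self.w2[j] -= lr * err * h
                self.b1[j] -= lr * grad_h
                for i, xi in enumerate(x):
                    self.w1[j][i] -= lr * grad_h * xi
            self.b2 -= lr * err
        return total / len(xs)


def enriched_screen(library, n_sample, n_select, epochs=20, seed=0):
    """Dock a random sample, train on it, predict all, re-dock the best-predicted."""
    rng = random.Random(seed)
    sample_ids = rng.sample(sorted(library), min(n_sample, len(library)))
    xs = [library[cid] for cid in sample_ids]
    ys = [mock_dock(x) for x in xs]
    net = TinyNet(N_PROPERTIES, 6, rng)
    losses = [net.train_epoch(xs, ys) for _ in range(epochs)]
    # Lower (more negative) predicted energy is treated as a more probable ligand.
    by_prediction = sorted(library, key=lambda cid: net.predict(library[cid]))
    enriched = by_prediction[:n_select]
    ranked = sorted((mock_dock(library[cid]), cid) for cid in enriched)
    return losses, ranked


# Illustrative library: 500 hypothetical compounds with random properties.
_rng = random.Random(1)
library = {f"cmpd{i:04d}": [_rng.random() for _ in range(N_PROPERTIES)]
           for i in range(500)}
losses, ranked = enriched_screen(library, n_sample=150, n_select=20)
print(f"training loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
print("top-ranked ligand:", ranked[0])
```

The point of the design is cost: docking is expensive, so only the random sample and the enriched set are docked, while the cheap network prediction is applied to the whole library.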