The NMPDR offers a set of genome comparison tools for analyzing the genomic data in our database.
The Annotation Clearinghouse contains annotations from all the major databases as well as assertions submitted by experts. Selecting a genome in this tool will show you all the clearinghouse annotations for that genome.
The Homolog Spreadsheet Tool compares all proteins in the selected genomes to those in a reference genome. The results are presented in a table, with different genomes shown side-by-side in columns, and proteins in rows. The proteins are listed in order of their appearance in the selected reference genome.
The Signature Genes Tool is designed to locate genes related to a phenotype that is associated with one set of organisms (call this Set 1) but not with another (call this Set 2).
The search goes through the genes in one organism from Set 1, selected as the reference genome. For each gene in the reference genome, the tool determines which genomes in Set 1 and Set 2 contain a bidirectional best hit to that gene. It tabulates these and constructs a score from 0 to 1. A score of 1 means that the gene has a bidirectional best hit in every genome from Set 1 and no bidirectional best hit in any genome from Set 2.
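The scoring idea can be sketched in a few lines of Python. This is only an illustration: the exact formula used by the tool is not given here, so the sketch assumes one simple possibility, namely the fraction of Set 1 genomes with a bidirectional best hit multiplied by the fraction of Set 2 genomes without one. The function names (`signature_score`, `rank_candidates`) and the input layout are hypothetical, not part of the NMPDR software.

```python
def signature_score(hit_genomes, set1, set2):
    """Score one reference gene between 0 and 1.

    hit_genomes: set of genome IDs in which the gene has a bidirectional best hit.
    ASSUMED formula (not the tool's documented one):
    (fraction of Set 1 with a hit) * (fraction of Set 2 without a hit).
    """
    if not set1 or not set2:
        raise ValueError("both genome sets must be non-empty")
    present_in_set1 = sum(1 for g in set1 if g in hit_genomes) / len(set1)
    absent_from_set2 = sum(1 for g in set2 if g not in hit_genomes) / len(set2)
    return present_in_set1 * absent_from_set2


def rank_candidates(bbh_by_gene, set1, set2):
    """Return (score, gene) pairs, best score first, ties broken by gene name."""
    scored = [(signature_score(hits, set1, set2), gene)
              for gene, hits in bbh_by_gene.items()]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


# Made-up example data: two genomes in each set, three reference genes.
set1 = ["A", "B"]
set2 = ["X", "Y"]
bbh_by_gene = {
    "gene1": {"A", "B"},        # hit in all of Set 1, none of Set 2
    "gene2": {"A", "B", "X"},   # also hits one Set 2 genome
    "gene3": {"A"},             # hits only half of Set 1
}
ranking = rank_candidates(bbh_by_gene, set1, set2)
```

Under this assumed formula, a score of 1 is reached only when the gene has a hit in every Set 1 genome and in no Set 2 genome, which matches the description above. Intermediate scores depend on the formula chosen.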
The highest-scoring genes are then presented to you as a list of candidates to explore. The main shortcoming of the tool relates to our use of bidirectional best hits. If a gene has paralogs within a genome, a bidirectional best hit may not exist even in a genome that contains several clear homologs. As a result, we may miss genes with paralogs, and we may include genes that do not discriminate as well as their scores suggest. You should therefore treat each listed gene as a candidate to explore, and nothing more. The tool can now also be run using precomputed similarities rather than bidirectional best hits.
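The paralog problem can be seen in a small Python sketch. It assumes that similarities are available as a table of pairwise scores and that the best hit is simply the highest-scoring gene in the other genome. The gene names, scores, and the helper functions `best_hit` and `bidirectional_best_hit` are made up for illustration and are not part of the NMPDR software.

```python
def best_hit(gene, target_genome, sims, genome_of):
    """Return the highest-scoring gene in target_genome similar to `gene`, or None."""
    candidates = {other: score
                  for (query, other), score in sims.items()
                  if query == gene and genome_of[other] == target_genome}
    return max(candidates, key=candidates.get) if candidates else None


def bidirectional_best_hit(gene, target_genome, sims, genome_of):
    """Return the partner gene if the best hit is reciprocal, otherwise None."""
    hit = best_hit(gene, target_genome, sims, genome_of)
    if hit is None:
        return None
    reciprocal = best_hit(hit, genome_of[gene], sims, genome_of)
    return hit if reciprocal == gene else None


# Made-up data: the reference genome has two paralogs, both similar to tgt_b.
genome_of = {"ref_a1": "ref", "ref_a2": "ref", "tgt_b": "tgt"}
pair_scores = {("ref_a1", "tgt_b"): 200, ("ref_a2", "tgt_b"): 250}

# Store each similarity in both directions so lookups work from either gene.
sims = {}
for (a, b), score in pair_scores.items():
    sims[(a, b)] = score
    sims[(b, a)] = score

hit_for_a1 = bidirectional_best_hit("ref_a1", "tgt", sims, genome_of)
hit_for_a2 = bidirectional_best_hit("ref_a2", "tgt", sims, genome_of)
```

Here `ref_a1` has a clear homolog in the target genome, but the target gene's best hit back is the paralog `ref_a2`, so `ref_a1` has no bidirectional best hit and would be counted as absent from that genome.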
Topic revision: r4 - 15 Feb 2009 - 17:10:06 - TWiki Guest