How to View Features on the Annotation Overview Page
Now we want to explore the NMPDR environment for one target protein, so let's all get on the same page. From the results table of a keyword search for "2.7.6.3 listeria", click on the FIGid in the row corresponding to strain EGD-e (or go directly to the page by clicking here).
The tour of the Annotation Overview page and its associated tools will follow this workflow:
Annotation Overview
Details table
The Annotation Overview page lists details of the functional annotation of the focus peg, such as the NMPDR database ID, organism name, and current functional assignment for the selected protein. Gene annotation is an ongoing process, so viewing annotation details can greatly aid in discerning functional roles. You may view the history of the annotation, or open the Annotation Clearinghouse (ACH) to show assignments for essentially identical proteins. The ACH opens a table of the functional assignments for the focus protein in other databases, such as UniProt, KEGG, and SwissProt. The assigned functions should be the same in all cases because the amino acid sequences are the same, with any variation limited to the location of the start site. Links are provided to the NCBI taxonomy browser as well as to our genome browser, evidence page, and sequence page for the focus protein. The sequence page offers a choice of protein sequence, DNA sequence, or DNA sequence including any chosen number of nucleotides flanking the gene on either side. Sequences are provided in FASTA format for easy copy and paste into other programs.
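For readers who want to reproduce the "DNA sequence with flanking nucleotides" option offline, the following is a minimal Python sketch, not NMPDR's own code. The function names (`gene_with_flanks`, `to_fasta`), the 1-based inclusive coordinates, and the convention that a start greater than the stop marks a minus-strand gene are all assumptions made for illustration.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    comp = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(comp)[::-1]


def gene_with_flanks(contig, start, stop, flank=0):
    """Extract a gene plus `flank` nucleotides on either side.

    Assumed conventions (hypothetical, for illustration only):
    coordinates are 1-based and inclusive, and start > stop means
    the gene lies on the minus strand.
    """
    left, right = min(start, stop), max(start, stop)
    lo = max(1, left - flank)             # clip at the contig start
    hi = min(len(contig), right + flank)  # clip at the contig end
    region = contig[lo - 1:hi]
    # Report minus-strand genes in their own 5'-to-3' orientation.
    return reverse_complement(region) if start > stop else region


def to_fasta(identifier, seq, width=60):
    """Format one sequence as a FASTA record with fixed-width lines."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + identifier + "\n" + "\n".join(lines) + "\n"


# Toy contig; the identifier below is a placeholder, not a real FIGid.
contig = "AAACCCGGGTTT"
print(to_fasta("example_peg", gene_with_flanks(contig, 4, 6, flank=2)))
```

Clipping the flanks at the contig ends keeps the function from failing when a gene sits close to the edge of a contig; the returned sequence is simply shorter in that case.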
Support for the annotation is provided in an automatically generated statement that describes whether the focus protein plays a role in a curated subsystem, and provides links to supporting documentation in PubMed when available. Further evidence supporting the annotation is found on the evidence page.
Compare regions
Graphical View
The chromosomal region of the focus gene (top) is compared with the corresponding regions of four similar organisms in the default view. The graphic is centered on the focus gene, which is highlighted in red and numbered 1. Sets of genes with similar sequence are grouped with the same number and color. Genes whose relative position is conserved in at least four other species are functionally coupled and share gray background boxes. Non-homologous proteins and some non-protein features are also shown in gray. Mousing over an arrow displays information for that gene, and clicking on any arrow opens the Annotation Overview for the selected protein with the Compare Regions display refocused on it. The focus gene always points to the right, even if it is located on the minus strand. When the boundaries of features overlap, one is depicted offset below the line.
- Type "15" in the text box to increase the number of genomic regions shown to a total of 16 (including the focus). Click the button to update the graphic. Notice that the result shows only Listeria, because NMPDR presently contains 16 listerial genomes.
- Click the Advanced button to get more controls for Display Options. Click the radio button to collapse close genomes, then update the graphic again. Now only two listeria are listed among a much broader set of genomes. These are genomes in which homologs of the focus peg have a high degree of sequence similarity to it.
- Click Advanced Display Options again, and now click the radio button to pin the selection of genomes on Pairs of Co-localized Homologs (PCH pin) rather than on similarity to the focus protein. Update the graphic again. The display now shows genomes in which both the focus peg and the co-localized dihydropteroate synthase (EC 2.5.1.15) share a high degree of sequence similarity.
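The rule that the focus gene is centered and always points to the right can be illustrated with a short Python sketch. This is only an assumed model of the drawing logic, not the NMPDR implementation: the function name `center_on_focus`, the feature IDs, and the convention that start > stop marks the minus strand are hypothetical.

```python
def center_on_focus(features, focus_id):
    """Re-express features relative to the focus gene.

    features: dict mapping feature id -> (start, stop); start > stop is
    assumed to mean the minus strand. Returns a dict mapping feature id
    -> (rel_left, rel_right, direction), where direction is '>' or '<'
    as the arrow would be drawn.
    """
    f_start, f_stop = features[focus_id]
    mid = (f_start + f_stop) // 2   # center the window on the focus gene
    flip = f_start > f_stop         # a minus-strand focus mirrors the region
    drawn = {}
    for fid, (start, stop) in features.items():
        a, b = start - mid, stop - mid
        if flip:
            a, b = -a, -b           # mirror coordinates around the center
        direction = ">" if a <= b else "<"
        drawn[fid] = (min(a, b), max(a, b), direction)
    return drawn


# Made-up coordinates: the focus gene is on the minus strand.
region = {"focus": (500, 301), "up": (700, 550), "down": (100, 250)}
for fid, placement in center_on_focus(region, "focus").items():
    print(fid, placement)
```

Mirroring the whole region, rather than only the focus gene, keeps the relative order and orientation of the neighbors intact, which is what makes regions from different genomes visually comparable.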
Tabular View
In the tabular view, the focus peg is again highlighted in red. Rows in the table above and below the highlighted peg describe pegs that appear upstream and downstream of the peg of interest. Additional lines are provided for every organism and every feature displayed. Column headings are active, so you may filter the data shown by selecting an organism name or typing the name of a functional role. Columns list feature ID numbers and nucleotide coordinates for the start and stop of each gene, along with the lengths of the genes, location on the plus or minus strand, and the size of gaps or overlaps between neighbors. Two columns in the table are linked to the results of precomputed comparative analyses of functional coupling and clustering.
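The length, strand, and gap columns described above can be derived from start and stop coordinates alone. The Python sketch below shows one way to do so, assuming 1-based inclusive coordinates and that start > stop marks the minus strand; the function name `neighbor_gaps` and the sample features are hypothetical and are not taken from NMPDR.

```python
def neighbor_gaps(features):
    """Compute length, strand, and gap to the next feature.

    features: list of (id, start, stop) tuples on one contig.
    Returns rows of (id, length, strand, gap) ordered along the contig.
    A negative gap means the two neighbors overlap by that many bases;
    the last feature has no downstream neighbor, so its gap is None.
    """
    ordered = sorted(features, key=lambda f: min(f[1], f[2]))
    rows = []
    for i, (fid, start, stop) in enumerate(ordered):
        left, right = min(start, stop), max(start, stop)
        length = right - left + 1
        strand = "+" if start <= stop else "-"
        if i + 1 < len(ordered):
            next_left = min(ordered[i + 1][1], ordered[i + 1][2])
            gap = next_left - right - 1
        else:
            gap = None
        rows.append((fid, length, strand, gap))
    return rows


# Made-up features: "b" lies on the minus strand and overlaps "c".
for row in neighbor_gaps([("a", 1, 100), ("b", 150, 111), ("c", 145, 300)]):
    print(row)
```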
Proximity is most likely to have a functional basis when observed with high frequency and across a wide variety of organisms. For pairs of genes, this has been computed as a functional coupling score, fc-sc. The score takes into account the number of genomes in which the two genes are neighbors, and the phylogenetic distance between the genomes. The proteins shown with gray background shading in the graphical view are functionally coupled with the focus peg and will have a functional coupling score listed in the table. Its value is linked in the column headed "fc-sc." Click the score to open a table listing instances in which the scored protein is co-localized with the focus protein.
- Click on the score for the upstream protein, Dihydroneopterin aldolase (EC 4.1.2.25). This function is co-localized with the focus peg, and both play roles in the same subsystem, Folate Biosynthesis. The proteins share proximity in many different organisms.
- Go back to the tabular view of Compare Regions for HPPK in L.mo. EGD-e and this time click on the score for Cell division protein ftsH (EC 3.4.24.-). Notice that these two functions are proximal only in closely related genomes, so something other than shared function is likely driving the conservation of their location.
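The contrast between these two examples can be made concrete with a toy calculation. The exact formula behind fc-sc is not given here, so the Python sketch below is explicitly not the NMPDR score: it is a simplified stand-in that counts distinct clades, rather than individual genomes, in which two genes are neighbors, so that many closely related genomes count only once. The function name, genome labels, and clade assignments are all invented for illustration.

```python
def toy_coupling_score(neighbor_genomes, clade_of):
    """Count the distinct clades among genomes where two genes are neighbors.

    This is an illustrative stand-in, not the fc-sc formula: it rewards
    proximity seen across phylogenetically distant genomes and ignores
    repeated observations within one clade.
    """
    return len({clade_of[genome] for genome in neighbor_genomes})


# Invented genome labels and clade assignments (hypothetical data).
clade_of = {
    "listeria_1": "Listeria",
    "listeria_2": "Listeria",
    "listeria_3": "Listeria",
    "bacillus_1": "Bacillus",
    "staph_1": "Staphylococcus",
}

# A pair that is adjacent in distant genomes scores higher ...
broad_pair = ["listeria_1", "listeria_2", "bacillus_1", "staph_1"]
# ... than a pair that is adjacent only within one closely related group.
narrow_pair = ["listeria_1", "listeria_2", "listeria_3"]

print(toy_coupling_score(broad_pair, clade_of))
print(toy_coupling_score(narrow_pair, clade_of))
```

Under this simplification, a raw count of neighboring genomes would rate the two pairs almost equally, while the clade-aware count separates them, which is the intuition behind weighting by phylogenetic distance.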
The context of the focus peg or any of its neighbors may be preserved in other organisms in clusters made up of more or fewer other genes. Click the clusters button for the focus peg to see the number of other genomes with clusters involving this function. For proteins without functional coupling scores, try clicking the clusters button to find homologs that may be clustered with different functions in other genomes.
- Click on the clusters button for the focus (red highlighted) peg. The 8-protein cluster found in L.mo. EGD-e is detailed at the top of the page. Other organisms in which homologs of the focus peg are clustered are listed in a table, ordered by the size of the cluster.
Feature Evidence
To view a table of homologous sequences in other genomes, click on the link to Feature Evidence. The page returns a graphic and a table of homologous proteins in other organisms, precomputed using BLASTP and ranked by similarity. Other evidence supporting the annotation, such as subcellular localization and domain structure, both computed from the sequence, is also shown on the evidence page.
--
Leslie Mc Neil - 02 Dec 2008