Protein Encoding Gene
Protein encoding gene (PEG), protein coding sequence (CDS), and open reading frame (ORF) are nearly synonymous terms. In the FIG ID, the feature type for a protein encoding gene is peg.
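As a rough illustration of how the feature type can be read out of a FIG ID, the sketch below assumes the layout `fig|<genome>.<feature type>.<number>`. The entry above does not spell out this layout, and the IDs in the usage note are made up for illustration. The names `FigId`, `parse_fig_id`, and `is_peg` are hypothetical helpers, not part of any NMPDR tool.

```python
import re
from typing import NamedTuple


class FigId(NamedTuple):
    genome: str        # e.g. "83333.1" (taxon id and version, assumed layout)
    feature_type: str  # "peg" for a protein encoding gene
    number: int        # running index of the feature within the genome


# Assumed layout: fig|<digits>.<digits>.<type>.<digits>
_FIG_ID = re.compile(r"^fig\|(\d+\.\d+)\.([A-Za-z_]+)\.(\d+)$")


def parse_fig_id(fig_id: str) -> FigId:
    """Split a FIG ID into genome, feature type, and feature number."""
    match = _FIG_ID.match(fig_id)
    if match is None:
        raise ValueError(f"not a recognizable FIG ID: {fig_id!r}")
    return FigId(match.group(1), match.group(2), int(match.group(3)))


def is_peg(fig_id: str) -> bool:
    """True when the feature type field of the ID is 'peg'."""
    return parse_fig_id(fig_id).feature_type == "peg"
```

Under these assumptions, `is_peg("fig|83333.1.peg.4")` is true, while an ID whose feature type is `rna` is rejected.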
Protein encoding genes are commonly described as structural or regulatory. Structural genes encode enzymes and structural proteins. Regulatory genes encode proteins that regulate the expression of other genes, often by binding specifically to short DNA sequences.
PEGs may be labeled as hypothetical when there is an open reading frame (see codon) but either no experimental evidence that the encoded protein is expressed or no suggestion, by similarity, of the function of the encoded protein.
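To make "there is an open reading frame" concrete, here is a deliberately simplified sketch of an ORF scan. It is not the gene-calling method used by NMPDR. It assumes ATG as the only start codon, reads only the forward strand, and uses the three standard stop codons. Real bacterial gene callers also consider alternative starts and the reverse strand. The function name `find_orfs` and the `min_codons` parameter are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 2) -> list[tuple[int, int]]:
    """Return (start, end) positions of forward-strand ORFs.

    Positions are 0-based and end-exclusive; the span includes both the
    ATG start codon and the stop codon. min_codons counts every codon in
    the span, start and stop included.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the first ATG seen since the last stop
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)
```

An ORF found this way is only a candidate. Whether it is then called hypothetical depends on the evidence criteria described above, which a scan like this cannot assess.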
The table below shows the PEG counts for the NMPDR core organisms along with an indication of how many are hypothetical and how many are known.
| Group name | Genomes | Protein Encoding Genes (PEGs) | Named genes in subsystems | Named genes not in subsystems | Hypothetical genes in subsystems | Hypothetical genes not in subsystems | RNAs |
|---|---|---|---|---|---|---|---|
| Campylobacter | 15 | 27,372 | 12,905 | 8,942 | 358 | 5,691 | 1,043 |
| Chlamydiaceae | 10 | 10,112 | 4,439 | 2,627 | 58 | 3,235 | 419 |
| Haemophilus | 9 | 16,660 | 8,742 | 5,826 | 197 | 2,336 | 780 |
| Listeria | 16 | 51,610 | 22,933 | 17,040 | 992 | 11,658 | 1,685 |
| Mycoplasma | 12 | 8,900 | 4,027 | 2,948 | 40 | 2,343 | 508 |
| Neisseria | 5 | 11,319 | 4,809 | 3,462 | 134 | 1,679 | 372 |
| Staphylococcus | 17 | 44,572 | 20,304 | 13,096 | 1,456 | 11,524 | 1,995 |
| Streptococcus | 46 | 94,883 | 39,521 | 37,732 | 1,055 | 20,083 | 3,890 |
| Treponema | 2 | 3,826 | 1,171 | 1,547 | 10 | 1,197 | 134 |
| Ureaplasma | 13 | 8,522 | 3,654 | 2,482 | 47 | 2,425 | 583 |
| Vibrio | 20 | 79,814 | 35,035 | 27,596 | 1,152 | 17,951 | 3,925 |
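One way to compare the groups is the share of PEGs labeled hypothetical. The sketch below copies the PEG and hypothetical counts from the table and divides the two hypothetical columns by the PEG total. That choice of denominator is an assumption: in rows such as Campylobacter the four named and hypothetical columns do not add up exactly to the PEG column, so the result is best read as a rough indicator. The names `ROWS` and `hypothetical_share` are illustrative only.

```python
# (group, PEGs, hypothetical in subsystems, hypothetical not in subsystems)
ROWS = [
    ("Campylobacter", 27372, 358, 5691),
    ("Chlamydiaceae", 10112, 58, 3235),
    ("Haemophilus", 16660, 197, 2336),
    ("Listeria", 51610, 992, 11658),
    ("Mycoplasma", 8900, 40, 2343),
    ("Neisseria", 11319, 134, 1679),
    ("Staphylococcus", 44572, 1456, 11524),
    ("Streptococcus", 94883, 1055, 20083),
    ("Treponema", 3826, 10, 1197),
    ("Ureaplasma", 8522, 47, 2425),
    ("Vibrio", 79814, 1152, 17951),
]


def hypothetical_share(pegs: int, hyp_in: int, hyp_out: int) -> float:
    """Fraction of PEGs labeled hypothetical (assumed denominator: PEG total)."""
    return (hyp_in + hyp_out) / pegs


shares = {
    group: hypothetical_share(pegs, hyp_in, hyp_out)
    for group, pegs, hyp_in, hyp_out in ROWS
}

# Print the groups from the lowest to the highest hypothetical share.
for group, share in sorted(shares.items(), key=lambda item: item[1]):
    print(f"{group:15s} {share:.1%}")
```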