Subsystems are a generalization of the concept of pathways, and they have two components. First is a list of
functional roles that are united by any common process or biologically meaningful organizing principle. These are often metabolic pathways, but subsystems may also describe complex structures or phenotypes. Second is a spreadsheet, called a populated subsystem, which is a two-dimensional integration of biological functions with genome sequences. In the populated subsystem, functional roles are represented in columns, genomes are represented in rows, and cells of the spreadsheet are populated by the genes responsible for each function.
The table below shows a portion of the spreadsheet for the Arginine Biosynthesis subsystem. The ID and name of each participating organism are shown in the first two columns. The third column shows the ''variant code'', and the remaining columns correspond to the functional roles used to perform the subsystem's metabolic action, which in this case is the manufacture of arginine.
| Genome ID | Organism | Variant Code | argA/J | argB | argC | argD | argE | argF/I | argG | argH | ArgJ | ArgR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 64091.1 | Halobacterium sp. NRC-1 [A] | 0 | | | | | | 2495 | 2066 | 2065 | | |
| 257309.1 | Corynebacterium diphtheriae NCTC 13129 [B] | 1 | 1116 | 1117 | 1115 | 1118 | | 1119 | 1121 | 1122 | 1116 | 1120 |
| 196164.1 | Corynebacterium efficiens YS-314 [B] | 1 | 1527 | 1528 | 1526 | 1529 | | 1530 | 1532 | 1533 | 1527 | 1531 |
| 196627.4 | Corynebacterium glutamicum ATCC 13032 [B] | 1 | 1585 | 1586 | 1584 | 1587 | | 1588 | 1590 | 1591 | 1585 | 1589 |
| 83332.1 | Mycobacterium tuberculosis H37Rv [B] | 1 | 1655 | 1656 | 1654 | 1657 | | 1658 | 1660 | 1661 | 1655 | 1659 |
| 247156.1 | Nocardia farcinica IFM 10152 [B] | 1 | 1942, 3756 | 1943 | 1941 | 1944 | | 1945 | 1967 | 1968 | 1942 | 1946 |
| 266940.1 | Kineococcus radiotolerans SRS30216 [B] | 1 | 2472 | 2473 | 2471 | 2474 | 673 | 2475 | 4581 | 2477 | 2472 | 2476 |
| 281090.3 | Leifsonia xyli subsp. xyli str. CTCB07 [B] | 1 | 38 | 39 | 37 | 40 | | 41 | 1774 | 42 | 38 | |
| 100226.1 | Streptomyces coelicolor A3(2) [B] | 1 | 1546 | 1545 | 1547 | 1190, 1251 | 1544 | 5923 | 6971 | 960 | 1546 | 1543 |
| 206672.1 | Bifidobacterium longum NCC2705 [B] | 1 | 1002 | 1001 | 1003 | 1000 | | 999 | 997 | 996 | 1002 | 998 |
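The spreadsheet layout described above (genomes as rows, functional roles as columns, genes in the cells) can be sketched as a nested mapping. This is only an illustration: the names `ROLES`, `populated_subsystem`, `genes_for`, and `missing_roles` are hypothetical and are not part of any SEED API. The two rows are copied from the Arginine Biosynthesis excerpt.

```python
# Hypothetical in-memory layout for a populated subsystem.
# All names here are illustrative assumptions, not SEED's own API.

# Column order: the abbreviated functional roles of the subsystem.
ROLES = ["argA/J", "argB", "argC", "argD", "argE",
         "argF/I", "argG", "argH", "ArgJ", "ArgR"]

# genome ID -> {role abbreviation -> list of gene ID numbers}
# An absent role corresponds to an empty cell in the spreadsheet.
populated_subsystem = {
    "100226.1": {  # Streptomyces coelicolor A3(2)
        "argA/J": [1546], "argB": [1545], "argC": [1547],
        "argD": [1190, 1251], "argE": [1544], "argF/I": [5923],
        "argG": [6971], "argH": [960], "ArgJ": [1546], "ArgR": [1543],
    },
    "281090.3": {  # Leifsonia xyli subsp. xyli str. CTCB07
        "argA/J": [38], "argB": [39], "argC": [37], "argD": [40],
        "argF/I": [41], "argG": [1774], "argH": [42], "ArgJ": [38],
    },
}


def genes_for(genome_id, role):
    """Return the gene ID numbers in one cell (empty list if the cell is blank)."""
    return populated_subsystem.get(genome_id, {}).get(role, [])


def missing_roles(genome_id):
    """Return the roles, in column order, that have no gene in this genome's row."""
    row = populated_subsystem.get(genome_id, {})
    return [role for role in ROLES if not row.get(role)]
```

A cell holds a list because one role can be filled by more than one gene in a genome, as with argD in Streptomyces coelicolor A3(2).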
Each subsystem can have one or more variants, each of which uses a slightly different subset of the roles. The variant code specifies which of these variants is used by the specified genome. A variant code of 0 indicates that many of the functional roles are missing. A variant code of -1 indicates that the subsystem is not functional for the specified genome. Arginine Biosynthesis has only one real variant, so all the variant codes in the example are either 1 or 0.
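The meaning of the variant codes can be summarized in a small helper. This is a sketch based only on the two special codes described here. The function name `describe_variant` is hypothetical, and the sketch assumes that any other code simply names a variant.

```python
def describe_variant(code):
    """Describe a variant code, using only the meanings given in the text.

    Hypothetical helper; accepts the code as a string or an integer.
    """
    code = str(code).strip()
    if code == "0":
        return "many functional roles missing"
    if code == "-1":
        return "subsystem not functional in this genome"
    # Assumption: every other code identifies one of the subsystem's variants.
    return f"variant {code}"
```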
To save space, the role names are abbreviated and the individual genes are shown as ID numbers only. To construct a FigId from the ID number, append it to the genome ID, with the indicated gene type in between. A missing gene type means that the gene is a ProteinEncodingGene, or peg. So the FigId for the argB gene of Streptomyces coelicolor A3(2) is fig|100226.1.peg.1545.
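The construction of a FigId can be written as a one-line function. This is a minimal sketch inferred from the single example above. The name `make_fig_id` and its parameters are hypothetical, and the `fig|<genome ID>.<gene type>.<ID number>` pattern is assumed to hold for every gene type.

```python
def make_fig_id(genome_id, gene_number, gene_type="peg"):
    """Build a FigId from a genome ID, a gene ID number, and a gene type.

    Hypothetical helper. The default gene type is "peg" because a missing
    gene type means the gene is a protein-encoding gene.
    """
    return f"fig|{genome_id}.{gene_type}.{gene_number}"


# The argB gene of Streptomyces coelicolor A3(2), taken from the table.
arg_b = make_fig_id("100226.1", 1545)
```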
Each subsystem has one or more biologists assigned as curators. The curators use the subsystems to annotate new genes and to make the annotations of existing genes more consistent. This process has been going on since 2003. In 2007, the RapidAnnotationServer (RAST) went online. The RapidAnnotationServer uses subsystems to automatically call and annotate genes in new genomes.