FASTA format
FASTA format is a standard text format for encoding DNA or protein sequences. A FASTA file may contain one or more sequences in FASTA format. A single sequence is described by a title line followed by one or more data lines. The title line begins with a right angle bracket (>) followed by a label. The label ends at the first white space character; everything after that on the title line is considered a comment. The data lines begin right after the title line and contain the sequence characters in order. Each data line except the last should be exactly 60 letters long, although many programs allow some flexibility on that point.
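The layout described above can be sketched in a few lines of Python. This is only an illustration, not the code NMPDR uses: the function name format_fasta_record, its parameters, and the example label and comment are all hypothetical.

```python
def format_fasta_record(label, sequence, comment="", width=60):
    """Return one FASTA record as text, with data lines of `width` letters."""
    # The label ends at the first white space character, so it cannot contain any.
    if not label or any(ch.isspace() for ch in label):
        raise ValueError("label must be non-empty and contain no white space")
    if not sequence:
        raise ValueError("a record needs at least one data line")
    # Title line: right angle bracket, label, then an optional comment.
    title = ">" + label + (" " + comment if comment else "")
    # Every data line except possibly the last holds exactly `width` letters.
    data_lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join([title] + data_lines) + "\n"


# Hypothetical 80-letter sequence: one full data line plus a shorter last line.
record = format_fasta_record("example.1", "acgt" * 20, comment="hypothetical sequence")
print(record, end="")
```

The width is left as a parameter because, as noted above, many programs tolerate other line lengths.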
The box below shows a FASTA file containing multiple RNA genes from Listeria monocytogenes 10403S. In this case, the letters are DNA nucleotide codes, and the file extension would be either ".fasta" or ".fna" (for FASTA Nucleic Acid). When the sequences are amino acids, the file extension is either ".fasta" or ".faa" (for FASTA Amino Acid).
>fig|393133.3.rna.1
ggagaaatacccaagtccggctgaaggggacagactcgaaatctgttaggtggtgtatgc
cgcgccggggttcgaatccccgtttctccg
>fig|393133.3.rna.2
gggttgttagctcagttggtagagcagctgactcttaatcagcgggtcgggggttcgaaa
ccctcacaaccca
>fig|393133.3.rna.3
gcccatatagttaaacggatataacaagcccctcctaagggctagttcgtggttcgattc
cgcgtatgggcg
>fig|393133.3.rna.4
gccgctttagctcagttggtagagcacttccatggtaaggaaggggtcgtcggttcaaat
ccgacaagtggct
>fig|393133.3.rna.5
gtcctgatagctcagctggatagagcaacggccttctaagccgtcggtcgggggttcgaa
tccctctcaggacg
>fig|393133.3.rna.6
gagccgttagctcagttggtagagcatctgacttttaatcagagggtcgctggttcgaac
ccagcacggctca
>fig|393133.3.rna.7
gccggcttagctcagttggtagagcaactgatttgtaatcagtaggtcgcgagttcgact
cttgcagccggca
>fig|393133.3.rna.8
ggggaagtactcaagtggctgaagaggtgcccctgctaagggtataggtcgctcgcgcgg
cgcgagggttcaaatccctccttctccg
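A file like the one in the box above can be read back into records with a short parser. The following is a minimal sketch under the rules stated on this page (title line starts with a right angle bracket, label ends at the first white space, data lines are concatenated in order); the function name read_fasta is an assumption made for illustration, and the sample text is the first two records from the box.

```python
import io


def read_fasta(handle):
    """Yield (label, comment, sequence) for each record in an open text handle."""
    label, comment, chunks = None, "", []
    for line in handle:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            # A new title line closes the previous record, if there was one.
            if label is not None:
                yield label, comment, "".join(chunks)
            # Split the title at the first run of white space: label, then comment.
            parts = line[1:].split(None, 1)
            label = parts[0] if parts else ""
            comment = parts[1] if len(parts) > 1 else ""
            chunks = []
        elif line.strip() and label is not None:
            # Data line: append its letters to the current record.
            chunks.append(line.strip())
    if label is not None:
        yield label, comment, "".join(chunks)


sample = """>fig|393133.3.rna.1
ggagaaatacccaagtccggctgaaggggacagactcgaaatctgttaggtggtgtatgc
cgcgccggggttcgaatccccgtttctccg
>fig|393133.3.rna.2
gggttgttagctcagttggtagagcagctgactcttaatcagcgggtcgggggttcgaaa
ccctcacaaccca
"""

records = list(read_fasta(io.StringIO(sample)))
for label, comment, sequence in records:
    print(label, len(sequence))
```

For real work, an established library parser is usually a better choice than a hand-written one; the sketch only shows how the title and data lines fit together.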
To see (and optionally download) an individual gene in FASTA format, use the sequence link on the annotation overview page.
In addition, most NMPDR Search result pages allow you to download genes or locations in FASTA format, either as raw DNA or as translated protein sequences. If you are downloading them in DNA form, you can also specify a number of flanking positions on either side. For example, to include the 50 base pairs before and after each gene, type 50 into the small box next to nt in the search results activity box.
The listing below was obtained by requesting a FASTA download with a 50-nucleotide flanking width for the results of a search for luxR genes. The flanking nucleotides are shown in lower case; the nucleotides of the actual gene are shown in upper case. For brevity, only the first three genes are shown.
>fig|273036.3.peg.1759 [Staphylococcus aureus RF122] Two component transcriptional regulator VraR, LuxR family
ttcaggtacacgtatcgaggtgaaagcacctttaaataaggaggattcgtATGACGATTA
AAGTATTGTTTGTGGATGATCATGAAATGGTACGTATAGGAATTTCAAGTTATCTATCAA
CGCAAAGTGATATTGAAGTAGTTGGTGAAGGCGCTTCTGGTAAAGAAGCAATTGCCAAAG
CCCATGAGTTGAAGCCAGATTTAATTTTAATGGATTTACTTATGGATGACATGGATGGTG
TAGAAGCGACGACTCAGATTAAAAAAGATTTACCGCAAATTAAAGTATTAATGTTAACTA
GTTTTATTGAAGATAAAGAGGTATATCGTGCATTAGATGCAGGTGTCGATAGTTACATTT
TAAAAACAACAAGTGCAAAAGATATCGCCGATGCAGTTCGTAAAACTTCTAGAGGAGAAT
CTGTTTTTGAACCGGAAGTTTTAGTGAAAATGCGTAACCGTATGAAAAAGCGCGCAGAGT
TATATGAAATGCTTACAGAACGAGAAATGGAAATATTATTATTGATTGCGAAAGGTTACT
CAAATCAAGAAATTGCTAGTGCATCGCATATTACTATTAAAACGGTTAAGACACATGTGA
GTAACATTTTAAGTAAGTTAGAAGTGCAAGATAGAACACAAGCTGTTATCTATGCATTCC
AACATAATTTAATTCAATAGttcatatcgaattaagaaaagttacttacgccaatcacaa
tataacatca
>fig|93062.4.peg.393 [Staphylococcus aureus subsp. aureus COL] Two component transcriptional regulator VraR, LuxR family
ttcaggtacacgtatcgaggtgaaagcacctttaaataaggaggattcgtATGACGATTA
AAGTATTGTTTGTGGATGATCATGAAATGGTACGTATAGGAATTTCAAGTTATCTATCAA
CGCAAAGTGATATTGAAGTAGTTGGTGAAGGCGCTTCTGGTAAAGAAGCAATTGCCAAAG
CCCATGAGTTGAAGCCAGATTTAATTTTAATGGATTTACTTATGGATGACATGGATGGTG
TAGAAGCGACGACTCAGATTAAAAAAGATTTACCGCAAATTAAAGTATTAATGTTAACTA
GTTTTATTGAAGATAAAGAGGTATATCGTGCATTAGATGCAGGTGTCGATAGTTACATTT
TAAAAACAACAAGTGCAAAAGATATCGCCGATGCAGTTCGTAAAACTTCTAGAGGAGAAT
CTGTTTTTGAACCGGAAGTTTTAGTGAAAATGCGTAACCGTATGAAAAAGCGCGCAGAGT
TATATGAAATGCTTACAGAACGAGAAATGGAAATATTATTATTGATTGCGAAAGGTTACT
CAAATCAAGAAATTGCTAGTGCATCGCATATTACTATTAAAACGGTTAAGACACATGTGA
GTAACATTTTAAGTAAGTTAGAAGTGCAAGATAGAACACAAGCTGTTATCTATGCATTCC
AACATAATTTAATTCAATAGttcatatcgaattaagaaaagttacttacgccaatcacaa
tataacatca
>fig|359787.3.peg.2603 [Staphylococcus aureus subsp. aureus JH1] Two component transcriptional regulator VraR, LuxR family
ttcaggtacacgtatcgaggtgaaagcacctttaaataaggaggattcgtATGACGATTA
AAGTATTGTTTGTGGATGATCATGAAATGGTACGTATAGGAATTTCAAGTTATCTATCAA
CGCAAAGTGATATTGAAGTAGTTGGTGAAGGCGCTTCTGGTAAAGAAGCAATTGCCAAAG
CCCATGAGTTGAAGCCAGATTTAATTTTAATGGATTTACTTATGGAAGACATGGATGGTG
TAGAAGCGACGACTCAGATTAAAAAAGATTTACCGCAAATTAAAGTATTAATGTTAACTA
GTTTTATTGAAGATAAAGAGGTATATCGTGCATTAGATGCAGGTGTCGATAGTTACATTT
TAAAAACAACAAGTGCAAAAGATATCGCCGATGCAGTTCGTAAAACTTCTAGAGGAGAAT
CTGTTTTTGAACCGGAAGTTTTAGTGAAAATGCGTAACCGTATGAAAAAGCGCGCAGAGT
TATATGAAATGCTTACAGAACGAGAAATGGAAATATTATTATTGATTGCGAAAGGTTACT
CAAATCAAGAAATTGCTAGTGCATCGCATATTACTATTAAAACGGTTAAGACACATGTGA
GTAACATTTTAAGTAAGTTAGAAGTGCAAGATAGAACACAAGCTGTCATCTATGCATTCC
AACATAATTTAATTCAATAGttcgtatcgaattaagaaaagttacttacgccaatcacaa
tataacatca
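Because the flanking nucleotides in the listing above are in lower case and the gene itself is in upper case, the two can be separated again by letter case once the data lines of a record have been joined. The sketch below assumes exactly that convention; the function name split_flanks and the short example sequence are hypothetical and not taken from the download.

```python
import re


def split_flanks(sequence):
    """Split a sequence into (upstream flank, gene, downstream flank) by letter case."""
    # Lower-case letters on either side are flanks; the upper-case run is the gene.
    match = re.fullmatch(r"([a-z]*)([A-Z]+)([a-z]*)", sequence)
    if match is None:
        raise ValueError("expected lower-case flanks around one upper-case gene")
    return match.groups()


# Hypothetical sequence with 5 flanking nucleotides on each side.
upstream, gene, downstream = split_flanks("ttcgt" + "ATGACGATTAAATAG" + "ttcat")
print(len(upstream), len(gene), len(downstream))
```

A flank may come back shorter than requested, or empty, so the pattern allows flanks of any length, including none.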