How To Create GenBank Files with Biopython
Just as you should use a CSV library to generate CSV files, you should use a library to generate GenBank files.
In Python, you can use Biopython!
Here’s a small recipe to get you started. It creates a GenBank file from a sequence, and even includes an annotation.
from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord
from Bio.SeqFeature import SeqFeature, FeatureLocation

# Create a sequence
sequence_string = "ggggaaaattttaaaaccccaaaa"
sequence_object = Seq(sequence_string)

# Create a record
# Note: Bio.Alphabet was removed in Biopython 1.78, so the molecule type
# is now given through the 'molecule_type' annotation instead.
record = SeqRecord(sequence_object,
                   id='123456789',  # random accession number
                   name='Example',
                   description='An example GenBank file generated by BioPython',
                   annotations={'molecule_type': 'DNA'})

# Add annotation (0-based start, exclusive end: written as 4..12)
feature = SeqFeature(FeatureLocation(start=3, end=12), type='misc_feature')
record.features.append(feature)

# Save as GenBank file
with open('example.gb', 'w') as output_file:
    SeqIO.write(record, output_file, 'genbank')
The output:
LOCUS       Example                   24 bp    DNA              UNK 01-JAN-1980
DEFINITION  An example GenBank file generated by BioPython
ACCESSION   123456789
VERSION     123456789
KEYWORDS    .
SOURCE      .
  ORGANISM  .
            .
FEATURES             Location/Qualifiers
     misc_feature    4..12
ORIGIN
        1 ggggaaaatt ttaaaacccc aaaa
//
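To check that the generated GenBank text is well formed, you can parse it straight back with Biopython. The sketch below is not part of the original recipe. It assumes Biopython 1.78 or later (hence the 'molecule_type' annotation) and writes to an in-memory StringIO handle instead of example.gb; the names handle, parsed and parsed_feature are only illustrative.

```python
from io import StringIO

from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqFeature import FeatureLocation, SeqFeature
from Bio.SeqRecord import SeqRecord

# Rebuild the example record (assumption: Biopython >= 1.78,
# which needs the 'molecule_type' annotation for GenBank output).
record = SeqRecord(
    Seq("ggggaaaattttaaaaccccaaaa"),
    id="123456789",
    name="Example",
    description="An example GenBank file generated by BioPython",
    annotations={"molecule_type": "DNA"},
)
record.features.append(
    SeqFeature(FeatureLocation(start=3, end=12), type="misc_feature")
)

# Write to an in-memory handle instead of a file, then rewind it.
handle = StringIO()
SeqIO.write(record, handle, "genbank")
handle.seek(0)

# Parse the GenBank text back into a SeqRecord.
parsed = SeqIO.read(handle, "genbank")
parsed_feature = parsed.features[0]

print(parsed.name, len(parsed.seq))
print(parsed_feature.type,
      int(parsed_feature.location.start),
      int(parsed_feature.location.end))
```

The feature comes back with start 3 and end 12, because Biopython stores locations as 0-based, end-exclusive positions, while the GenBank file shows the same region as 4..12.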
Check out the really good Biopython documentation for more details!