Iron Chef SynBio

I was doing some next-generation sequencing (NGS) analysis over the weekend for the first time, so I had to get some of the common software tools, like PEAR and bowtie. Their official sites were hosted by SourceForge, but I didn’t want to download the binaries from SourceForge ’cause I’m paranoid about malware. So, I compiled them myself.

The process turned out to be super easy!

They all have git repos:

https://github.com/xflouris/PEAR

https://github.com/BenLangmead/bowtie

https://github.com/BenLangmead/bowtie2

For example, for bowtie, you can do:

git clone https://github.com/BenLangmead/bowtie.git
cd bowtie
make

For bowtie, you need libtbb installed, and for bowtie2, you need to compile with NO_TBB=1 (that is, run make NO_TBB=1 instead of plain make).

I’m pleasantly surprised because I remember the struggle of building open source projects when I was a young’un.

Just wanted to share!

Much like how you should use a csv library to generate csv files, you should use a library to generate GenBank files.
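To make the csv half of that comparison concrete, here is a minimal sketch using only Python's standard library. The example row is made up for illustration; the point is just that one comma inside a field is enough to break naive string joining.

```python
import csv
import io

# A made-up row; the second field contains a comma
row = ["pUC19", "cloning vector, high copy", 2686]

# Naive approach: the embedded comma silently turns three fields into four
naive_line = ",".join(str(field) for field in row)

# Library approach: csv.writer quotes the field that contains the delimiter
buffer = io.StringIO()
csv.writer(buffer).writerow(row)
library_line = buffer.getvalue().strip()

print(naive_line)
print(library_line)
```

GenBank files have formatting rules of their own (fixed columns, 1-based coordinates, line wrapping), which is the same kind of detail you want a library to handle for you.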

In Python, you can use Biopython!

Here’s a small recipe to get you started. It creates a GenBank file from a sequence, and even includes an annotation.

from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord
from Bio.SeqFeature import SeqFeature, FeatureLocation

# Create a sequence
# (Bio.Alphabet was removed in Biopython 1.78, so Seq now takes just the string)
sequence_string = "ggggaaaattttaaaaccccaaaa"
sequence_object = Seq(sequence_string)

# Create a record
record = SeqRecord(sequence_object,
                   id='123456789', # random accession number
                   name='Example',
                   description='An example GenBank file generated by BioPython')

# Biopython 1.78+ reads the molecule type from the annotations instead of an alphabet
record.annotations['molecule_type'] = 'DNA'

# Add annotation (start is 0-based and end is exclusive, as in Python slicing)
feature = SeqFeature(FeatureLocation(start=3, end=12), type='misc_feature')
record.features.append(feature)

# Save as GenBank file; the with block closes the file when done
with open('example.gb', 'w') as output_file:
    SeqIO.write(record, output_file, 'genbank')

The output:

LOCUS       Example                   24 bp    DNA              UNK 01-JAN-1980
DEFINITION  An example GenBank file generated by BioPython
ACCESSION   123456789
VERSION     123456789
KEYWORDS    .
SOURCE      .
  ORGANISM  .
            .
FEATURES             Location/Qualifiers
     misc_feature    4..12
ORIGIN
        1 ggggaaaatt ttaaaacccc aaaa
//
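One thing that may look odd: the recipe passes start=3 and end=12, but the file says 4..12. Biopython locations follow Python slicing (0-based, end-exclusive), while GenBank coordinates are 1-based and inclusive. Here is a rough sketch of that conversion in plain Python. The helper name to_genbank_location is made up for illustration and is not part of Biopython, and it ignores strands, fuzzy positions, and compound locations.

```python
def to_genbank_location(start: int, end: int) -> str:
    """Convert a 0-based, end-exclusive span to GenBank's 1-based, inclusive notation.

    Hypothetical helper for illustration only; not a Biopython function.
    """
    if not 0 <= start < end:
        raise ValueError("expected 0 <= start < end")
    if end - start == 1:
        # A single base is written as one number
        return str(end)
    # Only the start shifts by one; the exclusive end equals the inclusive 1-based end
    return f"{start + 1}..{end}"


sequence = "ggggaaaattttaaaaccccaaaa"
start, end = 3, 12

print(to_genbank_location(start, end))  # 4..12
print(sequence[start:end])              # the nine annotated bases
```

So the feature covers the same nine bases that sequence[3:12] would give you in Python.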

Check out the really good Biopython documentation for more details!