How to Download the Sequences of Lots of Human Genes

I wrote a similar post for downloading sequences of yeast genes.

Basically I needed to do some analysis on the upstream promoter region of about 1200 human genes.

I wasn’t gonna download them one-by-one, and I didn’t want to get a database dump of the whole genome.

Luckily, there’s UCSC’s Table Browser (hgTables)! (I love you guys)

So hit it up.

Paste in your list of genes.

Note: No commas at the end of lines
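
If your gene list came out of a spreadsheet or a script, it may have trailing commas or stray whitespace. Here's a minimal Python sketch for tidying it up before pasting. The function name `clean_gene_list` and the sample gene symbols are just my own placeholders, not anything UCSC provides.

```python
def clean_gene_list(raw_text):
    """Return one identifier per line, dropping trailing commas and blank lines."""
    cleaned = []
    for line in raw_text.splitlines():
        # Trim whitespace, then any trailing commas, then whitespace again.
        name = line.strip().rstrip(",").strip()
        if name:
            cleaned.append(name)
    return cleaned


# Made-up input: trailing commas, a blank line, and padded whitespace.
raw = "TP53,\nBRCA1,\n\n  MYC  \nEGFR,"
print("\n".join(clean_gene_list(raw)))
```

The printed lines are what you'd paste into the box, one identifier per line.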

Submit to get your sequences!

You can even toggle between many options, like genomic DNA or protein sequence, or a certain number of bases upstream and downstream of the gene.

The output comes as one giant FASTA file, so if you’re getting the sequences of a lot of genes, you’d better download the gzip-compressed version, ’cause it’ll take forever to load in your browser (it’s on the order of 100MB).
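
Once you have the gzipped file, you don't need to unzip it or load it all at once. Here's a rough sketch that streams it record by record using only the Python standard library. The function names and the file name `sequences.fa.gz` are my own assumptions; use whatever you called your download.

```python
import gzip


def read_fasta(handle):
    """Yield (header, sequence) pairs from an iterable of FASTA text lines."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    # Emit the final record.
    if header is not None:
        yield header, "".join(chunks)


def read_gzipped_fasta(path):
    """Stream records from a gzip-compressed FASTA file without unpacking it."""
    with gzip.open(path, "rt") as handle:
        yield from read_fasta(handle)
```

You'd use it like `for header, seq in read_gzipped_fasta("sequences.fa.gz"): ...`, which keeps only one record in memory at a time.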

Another note: you might get more than one sequence per gene, depending on what tables, groups, and tracks you select.
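
To see which genes came back with more than one sequence, you can group the records by whatever part of the header identifies the gene. This is only a sketch: the header layout depends on the table and track you picked, so the `key` function and the sample headers below are made up for illustration and will need adjusting to your actual output.

```python
from collections import defaultdict


def group_by_key(records, key=lambda header: header.split()[0]):
    """Group (header, sequence) pairs by a key derived from each header.

    The default key (first whitespace-separated token) is an assumption;
    pass your own function to match your real header format.
    """
    groups = defaultdict(list)
    for header, sequence in records:
        groups[key(header)].append(sequence)
    return dict(groups)


# Made-up records standing in for parsed FASTA entries.
records = [
    ("geneA_tx1 made-up description", "ACGT"),
    ("geneA_tx2 made-up description", "ACGTT"),
    ("geneB_tx1 made-up description", "GGCC"),
]

# Hypothetical key: everything before the first underscore.
by_gene = group_by_key(records, key=lambda h: h.split("_")[0])
for gene, seqs in by_gene.items():
    if len(seqs) > 1:
        print(gene, len(seqs))
```

From there you can decide per gene whether to keep all the sequences or pick one.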

Have fun!
