Tag Archives: Ruby

I recently switched from working in Python to working in Ruby, and the behavior of Ruby’s keyword arguments gave me a shock. In Ruby, you have to explicitly choose between positional and keyword arguments when you define a method. If a method takes keyword arguments, Ruby won’t treat your arguments as positional when you leave out the parameter names. It will raise an ArgumentError instead.

For example, in Python (examples from Stack Overflow):

def fn(a, b, c = 1):
    return a * b + c

print(fn(1, 2))                # prints 3, positional and default.
print(fn(1, 2, 3))             # prints 5, positional.
print(fn(c = 5, b = 2, a = 2)) # prints 9, named.
print(fn(b = 2, a = 2))        # prints 5, named and default.
print(fn(5, c = 2, b = 1))     # prints 7, positional and named.
print(fn(8, b = 0))            # prints 1, positional, named and default.

In Ruby, for a method with positional arguments:

def fn(a, b, c = 1)
  a * b + c
end

puts fn(1, 2)                # returns 3
puts fn(1, 2, 3)             # returns 5
puts fn(c: 5, b: 2, a: 2)    # ArgumentError, Wrong Number of Arguments
puts fn(b: 2, a: 2)          # ArgumentError, Wrong Number of Arguments
puts fn(5, c: 2, b: 1)       # TypeError, the hash { c: 2, b: 1 } is passed as b
puts fn(8, b: 0)             # TypeError, the hash { b: 0 } is passed as b

For methods with keyword arguments:

def fn(a:, b:, c: 1)
  a * b + c
end

puts fn(1, 2)                # ArgumentError, wrong number of arguments
puts fn(1, 2, 3)             # ArgumentError, wrong number of arguments
puts fn(c: 5, b: 2, a: 2)    # returns 9
puts fn(b: 2, a: 2)          # returns 5
puts fn(5, c: 2, b: 1)       # ArgumentError, wrong number of arguments
puts fn(8, b: 0)             # ArgumentError, wrong number of arguments
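
The two styles can also be combined in a single definition. The sketch below is a variation on the same fn, written only for illustration: a is positional, while b and c are keywords. Each parameter still has to be passed in the style it was declared with.

```ruby
# One required positional argument, one required keyword, one optional keyword.
def fn(a, b:, c: 1)
  a * b + c
end

puts fn(2, b: 2)        # prints 5
puts fn(2, c: 5, b: 2)  # prints 9

begin
  fn(2, 2)              # b was declared as a keyword, so it cannot be passed positionally
rescue ArgumentError => e
  puts "ArgumentError: #{e.message}"
end
```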

Here’s a neat article on Ruby 2 arguments from thoughtbot.

Here’s a short FAQ on parameters vs arguments from Python 3.


absee has undergone a lot of structural changes.

First, it only retrieves the information you want instead of returning all traces and called bases.
Second, it’s a class now, so you can hold onto multiple sequencing data.
Third, it now has quality scores.

% irb
>> require 'absee'
=> true
>> my_variable = ABSee.new()
=> #<ABSee:0x000001008599d0>
>> my_variable.read("/Users/Jenny/Desktop/my_sequence.ab1")
=> nil
>> my_variable.get_calledSequence()

Instance Methods

  • read(file_location)
    • returns nil
  • get_traceA()
    • returns an array with the trace data for adenine
  • get_traceG()
    • returns an array with the trace data for guanine
  • get_traceC()
    • returns an array with the trace data for cytosine
  • get_traceT()
    • returns an array with the trace data for thymine
  • get_calledSequence()
    • returns an array with the base-called sequence
  • get_qualityScores()
    • returns an array with the base-called quality scores
  • get_peakIndexes()
    • returns an array with indexes of the called sequence in the trace
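
As a rough sketch of how these arrays fit together, the example below looks up the trace height under each called base. It does not load the gem. The arrays are made-up sample data standing in for what the getters would return, peak_heights is a hypothetical helper, and it assumes the called sequence is an array of single-letter strings.

```ruby
# Hypothetical sample data standing in for the getter results from a real
# .ab1 file; the numbers are made up for illustration.
traces = {
  "A" => [0, 5, 90, 4, 1, 2, 3, 1, 0],
  "C" => [1, 2, 3, 2, 1, 80, 6, 2, 1],
  "G" => [0, 1, 2, 1, 0, 3, 2, 70, 4],
  "T" => [2, 1, 0, 1, 2, 1, 0, 1, 2]
}
called_sequence = ["A", "C", "G"]
peak_indexes = [2, 5, 7]

# Pair each called base with its peak position and read the trace height there.
# Bases without a matching trace (for example "N") yield nil.
def peak_heights(traces, called_sequence, peak_indexes)
  called_sequence.zip(peak_indexes).map do |base, index|
    trace = traces[base]
    trace ? trace[index] : nil
  end
end

p peak_heights(traces, called_sequence, peak_indexes)  # prints [90, 80, 70]
```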

[updated again as a Ruby class]

absee has an update!

Version 0.1.0.0 now encapsulates the methods in a Ruby module instead of exposing them as global functions.

Example usage:

% irb
>> require 'absee'
=> true
>> Absee.readAB("/Users/Jenny/Desktop/my_sequence.ab1")

It still returns six arrays (the trace values for ACGT, called sequence, and peak indexes).

More information can be found on my previous post.

Thanks go to Dan Cahoon for forking my absee on GitHub.

1. Introduction

absee is a friendly ABIF reader in Ruby.

Three years ago, I desperately needed to analyze the trace values from DNA sequencing chromatograms (in the form of ABIF files). To my frustration, none of the available ABIF readers exported raw data. Even today, while a lot of software can visualize ABIF files, very few programs allow for scripted inputs and custom manipulation of outputs. I wanted an ABIF reader that simply extracts the data and can be easily incorporated into other projects. Hence, I created absee.

absee is a Ruby gem. It has no GUI, no fluff. It simply reads an ABIF file and returns the values in six arrays: one array of trace data for each of A, C, G, and T at discrete intervals, the called sequence, and an array of peak indexes corresponding to the called sequence.

% irb
>> require 'absee'
=> true
>> readAB("my_sequence.ab1")

With a simple Ruby script, it can rapidly read and process many ABIF files and pipe the data into further downstream processing. absee is a very nifty tool, one that I wish I had three years ago. The above code works for versions earlier than 0.1.0.0.
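
A batch script along those lines might look like the sketch below. To keep it runnable without the gem or any .ab1 files, the reader is passed in as a callable. summarize_reads and fake_reader are hypothetical names, and the order of the six arrays (A, C, G, T, called sequence, peak indexes) is an assumption.

```ruby
# Sketch only. `reader` is any callable that takes a filename and returns
# six arrays; the order A, C, G, T, called sequence, peak indexes is assumed.
def summarize_reads(filenames, reader)
  filenames.map do |filename|
    _a, _c, _g, _t, called_sequence, peak_indexes = reader.call(filename)
    { file: filename, bases: called_sequence.length, peaks: peak_indexes.length }
  end
end

# Stand-in reader with made-up data, so the sketch runs without real files.
fake_reader = lambda do |_filename|
  [[1, 9, 1], [0, 0, 0], [0, 0, 0], [0, 0, 0], ["A"], [1]]
end

summarize_reads(["one.ab1", "two.ab1"], fake_reader).each do |summary|
  puts "#{summary[:file]}: #{summary[:bases]} bases, #{summary[:peaks]} peaks"
end
```

With the gem loaded, something like summarize_reads(Dir.glob("*.ab1"), method(:readAB)) should play the same role for versions earlier than 0.1.0.0.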

[update: new version as a Ruby Module]

2. Background

ABIF is a binary file format, usually with an .ab1 extension. It contains a trace value for each of A, C, G, and T at each point along an interval. Most ABIF viewing software interpolates between those points to display smooth, sinusoidal-looking curves.

ABIF files also contain estimated bases and peak indexes. DNA sequencing extracts a sequence from trace data by using a base-calling algorithm. The algorithm estimates a peak in the trace data and determines a called base for that peak. If peaks from more than one trace overlap and their values are sufficiently close, the algorithm may use N to denote uncertainty about the base for that peak, and lower the quality score. The sequence of called bases is the estimated DNA sequence corresponding to a chromatogram.
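
To make the idea concrete, here is a deliberately simplified toy, not the algorithm that sequencing software actually uses. It takes the four trace values at one peak, calls the base with the highest value, and falls back to N when the runner-up is too close. call_base and ratio_threshold are hypothetical names, and the threshold of 0.8 is arbitrary.

```ruby
# Toy base caller for illustration only; real base callers are far more involved.
# values_at_peak maps each base to its trace value at one peak position.
# ratio_threshold is a hypothetical cutoff for "sufficiently close".
def call_base(values_at_peak, ratio_threshold: 0.8)
  ranked = values_at_peak.sort_by { |_base, value| -value }
  (best_base, best_value), (_second_base, second_value) = ranked
  return "N" if best_value <= 0

  second_value.to_f / best_value >= ratio_threshold ? "N" : best_base
end

puts call_base({ "A" => 90, "C" => 3, "G" => 2, "T" => 0 })   # prints A
puts call_base({ "A" => 50, "C" => 48, "G" => 2, "T" => 0 })  # prints N
```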

3. Details

Converting the ABIF binary files to readable values was no small feat. Even with the file format's architecture documented, I still needed a little guidance. I found an open-source ABIF viewer years ago (now no longer available) and translated absee from its ABIF reader.

The primary method to call is readAB. It opens the ABIF file and checks the file type and version. Major ABIF versions greater than 1 are not supported, because their encodings may differ. If the check fails, readAB will return six empty arrays.

readAB(filename)

  • parameters:
    • filename: a string containing the filename (including the path and extension)
  • returns:
    • six arrays, which are trace data for A, C, G, T, called sequence, and peak indexes
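
The file type and version check described above can be sketched roughly as follows. This is not absee's actual source. It assumes the header starts with the 4-byte signature "ABIF" followed by a 2-byte big-endian version number (for example 101 for version 1.01), and supported_abif? is a hypothetical helper name.

```ruby
require "stringio"

# Sketch only: not absee's actual source.
# Assumes a 4-byte "ABIF" signature followed by a 2-byte big-endian version.
def supported_abif?(io)
  header = io.read(6)
  return false if header.nil? || header.bytesize < 6

  signature, version = header.unpack("a4n")
  # Integer division by 100 gives the major version (101 -> 1).
  signature == "ABIF" && version / 100 <= 1
end

# An in-memory stand-in for the first bytes of a version 1.01 file.
fake_file = StringIO.new("ABIF".b + [101].pack("n"))
puts supported_abif?(fake_file)  # prints true
```

For a file on disk, the same check could be run with File.open(filename, "rb") { |file| supported_abif?(file) }.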

There’s more documentation in absee’s YARD / RDoc pages, as well as in the source code on GitHub.

4. Source Code

The source code for version 0.0.2.3 can be found at the absee GitHub repository.