Title: | The R Genome Browser |
---|---|
Description: | Classes and methods to efficiently handle (slice, annotate, draw ...) genomic features (such as genes or transcripts), and an interactive interface to browse them. |
Authors: | Sylvain Mareschal |
Maintainer: | Sylvain Mareschal <[email protected]> |
License: | GPL (>= 3) |
Version: | 1.7.6 |
Built: | 2024-11-01 11:17:56 UTC |
Source: | https://github.com/maressyl/r.rgb |
These functions construct track.table-inheriting objects from free annotation files.
track.table.GTF(file, name = NA, attr = "split", features = "exon", quiet = FALSE, .chromosomes, ...)
track.exons.CCDS(file, name = "CCDS exons", ...)
track.exons.GENCODE(file, name = "GENCODE exons", extra = c("gene_id", "gene_name", "exon_id"), ...)
track.CNV.DGV(file, name = "DGV CNV", ...)
track.genes.NCBI(file, name = "NCBI genes", selection, ...)
track.bands.UCSC(file, name = "UCSC bands", ...)
file |
Single character value, the path to the raw file to parse. See the 'Details' section below. |
name |
Single character value, the |
attr |
To be passed to |
features |
To be passed to |
quiet |
To be passed to |
.chromosomes |
To be passed to the |
... |
Further arguments are passed to the class constructor, as a result most of the handled arguments are |
extra |
Character vector, names of optional columns to keep from the GENCODE GTF file. |
selection |
Character vector, filter to apply on the "group_label" column for NCBI genes. Raises an error with the possible values when missing. |
track.table.GTF
imports a "Gene Transfer Format" (GTF) file, as offered by the UCSC Table Browser (http://www.genome.ucsc.edu/cgi-bin/hgTables) for a large number of species. See the read.gtf manual for further details.
track.exons.CCDS
contains various transcripts from the "Consensus Coding DNA Sequence" project, currently only available for mouse and human (see the NCBI data repository at https://ftp.ncbi.nlm.nih.gov/pub/CCDS/, and look for a file named "CCDS_current.txt").
track.exons.GENCODE
contains various transcripts from the GENCODE project, currently only available for mouse and human (see the dedicated website at https://www.gencodegenes.org/). This function is intended to run on the GTF version of the data.
track.CNV.DGV
parses constitutive copy number variations from the current version of the Database of Genomic Variants, downloadable from http://dgv.tcag.ca/dgv/app/downloads using "DGV Variants" links. The whole database is dedicated to the human species only.
track.genes.NCBI
parses the gene list from the MapView project of the NCBI, for one of many species available at https://ftp.ncbi.nih.gov/genomes/MapView/. Select your species of interest, then browse "sequence", "current" and "initial_release" (when these directories are available, which is not the case for all species). Download the file named "seq_gene.md.gz". As many assemblies are included in the file, a first call to the function without "selection" is required, to list the available values. A second call with the appropriate assembly name will produce the desired track file.
track.bands.UCSC
produces a track of cytogenetic banding, as made available by the UCSC for many species at http://hgdownload.cse.ucsc.edu/downloads.html. Select the species and assembly version that suit your needs, and look for a file named "cytoBand.txt.gz" in the "Annotation database" section.
Returns a track.table-inheriting object (of class track.exons, track.CNV, track.genes or track.bands).
Sylvain Mareschal
Example of track.exons.CCDS
raw file (current human assembly) : https://ftp.ncbi.nlm.nih.gov/pub/CCDS/current_human/CCDS.current.txt
Example of track.exons.GENCODE
raw file (human assembly 'GRCh38') : ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_29/gencode.v29.annotation.gtf.gz
Example of track.CNV.DGV
raw file (human assembly 'hg19') : http://dgv.tcag.ca/dgv/docs/GRCh37_hg19_variants_2013-05-31.txt
Example of track.genes.NCBI
raw file (human assembly 'GRCh37') : https://ftp.ncbi.nih.gov/genomes/MapView/Homo_sapiens/sequence/BUILD.37.3/initial_release/seq_gene.md.gz
Example of track.bands.UCSC
raw file (human assembly 'hg19') : http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/cytoBand.txt.gz
track.fasta-constructors, Annotation
track.table-class, track.exons-class, track.CNV-class, track.genes-class or track.bands-class
# From the "How-to" vignette, section "Custom annotation tracks"
file <- system.file("extdata/Cosmic_ATM.gtf.gz", package="Rgb")
tt <- track.table.GTF(file)
"crossable"
Reference classes extending this virtual class must have a slice method, as a generic cross method based on it is provided.
Its only purpose is currently to add the cross method to "track.table", as "sliceable" does not guarantee that slice returns a data.frame, which crossable needs.
Class sliceable, directly.
Class drawable, by class sliceable, distance 2.
All reference classes extend and inherit methods from envRefClass.
The following fields are inherited (from the corresponding class):
cross(annotation, colname = , type = , fuzziness = , maxElements = , location = , precision = , quiet = )
:Add a new column computed from overlaps with another crossable object.
- annotation : other crossable object to compute overlap with.
- colname : single character value, the name of the new column to add to .self. If NULL or NA, the result will be returned rather than added to .self.
- type : single character value, either :
'cover', to compute coverage of 'annotation' elements for each .self element
'count', to count 'annotation' elements overlapping each .self element
'cytoband', to get cytogenetic coordinates from a cytoband annotation track
an 'annotation' column name, to list 'annotation' elements overlapping each .self element
- fuzziness : single integer value, to be added on each side of .self elements when computing overlaps.
- maxElements : single integer value, when more overlaps are found, lists are replaced by counts. Can be NA to disable this behavior.
- location : character vector, the 'chrom' / 'start' / 'end' .self columns to use for annotation.
- precision : single integer value from 1 to 4, amount of digits to consider in banding (type='cytoband').
- quiet : single logical value, whether to print progress messages or not.
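As a rough illustration of what type = 'count' and 'fuzziness' describe, the base R sketch below counts, for each element of one table, the elements of another table overlapping it on the same chromosome. The helper sketchCount() and both data frames are hypothetical, and treating boundaries as inclusive is an assumption; this is a plain restatement of the definitions above, not the package's implementation.

```r
# Hypothetical helper, NOT the Rgb implementation: counts 'annotation' rows
# overlapping each 'self' row, after widening 'self' by 'fuzziness' on each side.
# Assumption: coordinates are inclusive on both ends.
sketchCount <- function(self, annotation, fuzziness = 0L) {
  vapply(
    seq_len(nrow(self)),
    function(i) {
      sum(
        annotation$chrom == self$chrom[i] &
        annotation$start <= self$end[i] + fuzziness &
        annotation$end >= self$start[i] - fuzziness
      )
    },
    integer(1)
  )
}

# Made-up elements to annotate
self <- data.frame(
  chrom = c("1", "1", "2"),
  start = c(100L, 500L, 100L),
  end = c(200L, 600L, 200L),
  stringsAsFactors = FALSE
)

# Made-up annotation elements
annotation <- data.frame(
  chrom = c("1", "1", "2"),
  start = c(150L, 210L, 900L),
  end = c(180L, 300L, 950L),
  stringsAsFactors = FALSE
)

counts0 <- sketchCount(self, annotation)                    # strict overlaps
counts10 <- sketchCount(self, annotation, fuzziness = 10L)  # 10 bp tolerance
```

With these made-up coordinates, the second annotation element starts 10 bp after the first 'self' element ends, so it is only counted once a fuzziness of 10 is allowed.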
The following methods are inherited (from the corresponding class):
callParams (drawable)
callSuper (envRefClass)
check (drawable)
chromosomes (drawable)
copy (envRefClass)
defaultParams (sliceable)
draw (sliceable)
export (envRefClass)
field (envRefClass)
fix.param (drawable)
getChromEnd (sliceable)
getClass (envRefClass)
getName (drawable)
getParam (drawable)
getRefClass (envRefClass)
import (envRefClass)
initFields (envRefClass)
initialize (drawable)
setName (drawable)
setParam (drawable)
show (sliceable)
slice (sliceable)
trace (envRefClass)
untrace (envRefClass)
usingMethods (envRefClass)
Sylvain Mareschal
drawable, sliceable, track.table, cghRA.array
This function draws the background for the other track plotting functions.
draw.bg(start, end, ylab = "", ylab.horiz = FALSE, ysub = as.character(NA), mar = c(0.2, 5, 0.2, 1), xaxt = "s", yaxt = "n", yaxs = "r", ylim = c(0, 1), cex.lab = 1, cex.axis = 1, mgp = c(3, 1, 0), tck = NA, tcl = -0.5, xaxp = as.numeric(NA), yaxp = as.numeric(NA), bty = "o", las = 0, xgrid = TRUE, new = FALSE, bg = NA, bg.inner = NA, fg = "#000000", ...)
start |
Single integer value, the left boundary of the window, in base pairs. |
end |
Single integer value, the right boundary of the window, in base pairs. |
ylab |
The name of the Y axis. See |
ylab.horiz |
Single logical value, whether to print |
ysub |
Similar to |
mar |
A numerical vector of the form "c(bottom, left, top, right)" which gives the number of lines of margin to be specified on the four sides of the plot. See |
xaxt |
Whether to plot an X axis ("s") or not ("n"). See |
yaxt |
Whether to plot a Y axis ("s") or not ("n"). If no Y axis is drawn, |
yaxs |
Y axis style, "r" enlarges the Y limits by 4 percent on each side for a cleaner look, "i" does not. See |
ylim |
The Y axis limits as a numerical vector of the form "c(start, end)" of the plot. Note that start > end is allowed and leads to a "reversed axis". Use "NULL" to guess the axis range from the data. See |
cex.lab |
The relative character size of x and y axis labels (default: 1). See |
cex.axis |
The relative character size of x and y axis annotations (default: 1). See |
mgp |
Length 3 vector defining the distance between the plot area and respectively the Y axis label, Y axis annotations and Y axis line (default: 3, 1, 0). See |
tck |
The length of tick marks as a fraction of the smaller of the width or height of the plot (default: NA, meaning using |
tcl |
The absolute length of tick marks. Note that positive numbers put them inside the plot area (default: -0.5). See |
xaxp |
Length 3 vector defining the ticks on the X axis : X of first tick, X of last tick and number of intervals between them (default: NA). See |
yaxp |
Length 3 vector defining the ticks on the Y axis : Y of first tick, Y of last tick and number of intervals between them (default: NA). See |
bty |
A character string which determines the type of box drawn around plots. If bty is one of "o" (the default), "l", "7", "c", "u", or "]" the resulting box resembles the corresponding upper case letter. A value of "n" suppresses the box. See |
las |
The direction of both X and Y axis labels: 0 for labels parallel to the axes, 1 for horizontal labels, 2 for labels perpendicular to the axes and 3 for vertical labels. See |
xgrid |
Single logical value, whether to draw a grid on X axis or not. |
new |
Single logical value, whether to plot on top of previous track ( |
bg |
Single character value, defining the color of the background (margins included) as an English name or a hexadecimal code. Similar to |
bg.inner |
Single character value, defining the color of the background (margins excluded) as an English name or a hexadecimal code. |
fg |
Single character value, defining the color of the foreground (axes, labels...) as an English name or a hexadecimal code. Similar to |
... |
Not used, only ignores other arguments. |
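Several of these arguments mirror base graphics parameters. The sketch below does not call draw.bg(); it only uses base graphics on a null PDF device to show what the 'mar', 'yaxs' and 'ylim' descriptions above mean in practice. The window coordinates and label are made up for the illustration.

```r
# Base graphics only: shows the effect of 'mar', 'yaxs' and 'ylim' on an empty plot.
grDevices::pdf(NULL)                      # null device, nothing is written to disk
graphics::par(mar = c(0.2, 5, 0.2, 1))    # c(bottom, left, top, right), in lines

# yaxs = "r" : Y limits are enlarged by 4 percent on each side
graphics::plot(
  x = NA, y = NA,
  xlim = c(1e6, 2e6), ylim = c(0, 1),
  xaxs = "i", yaxs = "r",
  xlab = "", ylab = "demo", xaxt = "n", yaxt = "n", bty = "o"
)
usrRegular <- graphics::par("usr")        # c(x1, x2, y1, y2) actually used

# ylim with start > end : reversed Y axis, yaxs = "i" keeps the exact limits
graphics::plot(
  x = NA, y = NA,
  xlim = c(1e6, 2e6), ylim = c(1, 0),
  xaxs = "i", yaxs = "i",
  xlab = "", ylab = ""
)
usrReversed <- graphics::par("usr")

invisible(grDevices::dev.off())
```

Comparing usrRegular and usrReversed shows the 4 percent padding added by yaxs = "r" and the inverted Y range obtained when the first 'ylim' value is the larger one.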
Sylvain Mareschal
draw.boxes, draw.density, draw.hist, draw.pileup, draw.points, draw.seq, draw.steps
This function draws a slice of a track content, with a distinct box for each track element.
draw.boxes(slice, start, end, maxElements = 50, maxDepth = 100, label = TRUE, labelStrand = FALSE, labelCex = 1, labelSrt = 0, labelAdj = "center", labelOverflow = TRUE, labelFamily = "sans", labelColor = "#000000", fillColor = "#BBBBBB", border = "#666666", cex.lab = 1, spacing = 0.2, bty = "o", groupBy = NA, groupPosition = NA, groupSize = NA, groupLwd = 1, fg = "#000000", normalize.y = TRUE, ...)
slice |
A |
start |
Single integer value, the left boundary of the window, in base pairs. |
end |
Single integer value, the right boundary of the window, in base pairs. |
maxElements |
Single integer value, the maximum number of boxes on the plot (if exceeded, only the number of elements will be plotted). |
maxDepth |
Single integer value, the maximum number of box heights allowed on the plot to avoid overlaps (if exceeded, an error message will be plotted, turning |
label |
Single logical value, whether to print labels on boxes or not. |
labelStrand |
Single logical value, whether to add the strand at the end of labels or not. |
labelCex |
Single numeric value, character expansion factor for labels. |
labelSrt |
Single numeric value, string rotation angle for labels. |
labelAdj |
'left', 'right' or 'center', the horizontal adjustment of the labels on the boxes. |
labelOverflow |
Single logical value, whether to write labels on boxes too narrow to host them or not. |
labelFamily |
Single character value, the font family to use for labels ('serif', 'sans', 'mono' or 'Hershey'). 'serif' and 'sans' are not monospaced fonts, so label box sizes and collision handling might not work as expected with them. |
labelColor |
The color to use for box labels (as a name, an integer or a hexadecimal character description). It can alternatively be a function without argument, which returns a vector of as many colors as |
fillColor |
The color to fill boxes with (as a name, an integer or a hexadecimal character description). It can alternatively be a function without argument, which returns a vector of as many colors as |
border |
The color to use for box borders (as a name, an integer or a hexadecimal character description). It can alternatively be a function without argument, which returns a vector of as many colors as |
cex.lab |
The relative character size of x and y axis labels (default: 1). See |
spacing |
Single numeric value, the vertical spacing between boxes, in proportion of the box height. Can alternatively be a single character value pointing a column to use to provide per-box spacing. |
bty |
A character string which determines the type of box drawn around plots. If bty is one of "o" (the default), "l", "7", "c", "u", or "]" the resulting box resembles the corresponding upper case letter. A value of "n" suppresses the box. See |
groupBy |
Single character value, the name of a |
groupPosition |
Single character value, the name of an integer |
groupSize |
Single character value, the name of an integer |
groupLwd |
Single numeric value, the width of the line drawn to group elements. |
fg |
Single character value, defining the color of the foreground (axes, labels...) as an English name or a hexadecimal code. Similar to |
normalize.y |
Single logical value, whether to normalize Y axis to 0:1 or have a Y axis growing with the number of stacked boxes. Default behavior is to fit all drawn boxes in 0:1 (default |
... |
Further arguments to be passed to |
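To make the idea behind 'maxDepth' more concrete, here is one simple greedy way to stack overlapping boxes on distinct levels. The helper sketchLevels() is hypothetical and is not claimed to be the algorithm used by draw.boxes(); it also raises a regular R error when the depth limit is exceeded, whereas the documentation above states that the function plots an error message.

```r
# Hypothetical helper, NOT the Rgb algorithm: assigns each box to the lowest
# level on which it does not overlap a previously placed box.
sketchLevels <- function(start, end, maxDepth = 100L) {
  ord <- order(start, end)        # process boxes from left to right
  levelEnd <- numeric(0)          # right-most coordinate already used on each level
  level <- integer(length(start))
  for (i in ord) {
    free <- which(levelEnd < start[i])
    if (length(free) > 0L) {
      k <- free[1]                # reuse the lowest free level
    } else {
      k <- length(levelEnd) + 1L  # open a new level
      if (k > maxDepth) stop("maxDepth reached")
    }
    levelEnd[k] <- end[i]
    level[i] <- k
  }
  level
}

# Made-up boxes: the first, second and fourth overlap each other
boxLevels <- sketchLevels(
  start = c(100, 150, 300, 160),
  end = c(200, 250, 400, 500)
)
```

In this example three levels are needed, so the same call with maxDepth = 2L stops with an error.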
Sylvain Mareschal
draw.bg, draw.density, draw.hist, draw.pileup, draw.points, draw.seq, draw.steps
This function is similar to draw.points, but draws a 2D density plot of the points instead of the points themselves.
draw.density(slice, start, end, column = "value", cex.lab = 1, bty = "o", fg = "#000000", pal = grDevices::grey, border = NA, depth = 8, dpi = 7, bw.x = 0.005, bw.y = 0.2, precision = 1, skewing = 1.75, ...)
slice |
A |
start |
Single integer value, the left boundary of the window, in base pairs. |
end |
Single integer value, the right boundary of the window, in base pairs. |
column |
Single character value, the name of the |
cex.lab |
See |
bty |
See |
fg |
Single character value, defining the color of the foreground (axes, labels...) as an English name or a hexadecimal code. Similar to |
pal |
A function returning a set of colors when provided a set of intensities between 0 and 1 (typically |
border |
The color to use for polygon borders (as a name, an integer or a hexadecimal character description). Special value |
depth |
Single integer value, the amount of different ranges to simplify the density into before plotting (corresponds to the |
dpi |
Single numeric value, the Dots Per Inches resolution of the grid on which |
bw.x |
Single numeric value, the bandwidth to use for density estimation on the X (genomic) axis. Notice this value will be multiplied by the genomic width of the plotted window (in Mbp), to enforce a similar resolution at all zoom levels. |
bw.y |
Single numeric value, the bandwidth to use for density estimation on the Y axis. It was typically chosen for a Y axis ranging from -1 to 1, wider axes could require wider bandwidths. |
precision |
Single numeric value, providing a simpler way to control the sharpness of the density plot than setting |
skewing |
Single numeric value, defining how the color scale should be skewed toward small values. Higher |
... |
Further arguments to be passed to |
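The scaling rule stated for 'bw.x' can be written as a one-line computation. The helper effectiveBwX() below is hypothetical and only restates that rule (bandwidth multiplied by the window width in Mbp); the unit of the resulting bandwidth is not specified here and is left as-is.

```r
# Hypothetical helper restating the documented rule: the X bandwidth actually
# used is 'bw.x' multiplied by the width of the plotted window, in Mbp.
effectiveBwX <- function(bw.x, start, end) {
  bw.x * (end - start) / 1e6
}

narrow <- effectiveBwX(0.005, start = 10e6, end = 12e6)   # 2 Mbp window
wide <- effectiveBwX(0.005, start = 0, end = 100e6)       # 100 Mbp window
```

The same default 'bw.x' thus yields a bandwidth 50 times larger on the 100 Mbp window than on the 2 Mbp one, which is how a similar visual resolution is kept across zoom levels.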
Sylvain Mareschal
draw.bg, draw.boxes, draw.hist, draw.pileup, draw.points, draw.seq, draw.steps
This function draws a slice of a track content, with a distinct vertical bar for each track element.
draw.hist(slice, start, end, column = "value", fillColor = "#666666", border = "#666666", cex.lab = 1, origin = 0, bty = "o", fg = "#000000", ylim = NA, ...)
slice |
A |
start |
Single integer value, the left boundary of the window, in base pairs. |
end |
Single integer value, the right boundary of the window, in base pairs. |
column |
Single character value, the name of the |
fillColor |
The color to fill vertical bars with (as a name, an integer or a hexadecimal character description). It can alternatively be a function without argument, which returns a vector of as many colors as |
border |
The color to use for box borders (as a name, an integer or a hexadecimal character description). It can alternatively be a function without argument, which returns a vector of as many colors as |
cex.lab |
The relative character size of x and y axis labels (default: 1). See |
origin |
Single numeric value, the Y value of the horizontal side common to all boxes. Can also be the name of a |
bty |
A character string which determines the type of box drawn around plots. If bty is one of "o" (the default), "l", "7", "c", "u", or "]" the resulting box resembles the corresponding upper case letter. A value of "n" suppresses the box. See |
fg |
Single character value, defining the color of the foreground (axes, labels...) as an English name or a hexadecimal code. Similar to |
ylim |
Numeric vector of length two, defining the Y axis boundaries. Either or both of them can be NA, meaning the missing boundary will be inferred from the data to plot. |
... |
Further arguments to be passed to |
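The NA handling described for 'ylim' can be pictured with a small base R sketch. The helper resolveYlim() is hypothetical and only illustrates the principle of filling missing boundaries from the data range; whether draw.hist() also takes 'origin' into account when inferring boundaries is not covered here.

```r
# Hypothetical helper, not taken from the package: fills NA boundaries of
# 'ylim' with the corresponding boundary of the data range.
resolveYlim <- function(ylim, values) {
  ylim <- rep(as.numeric(ylim), length.out = 2L)   # a single NA becomes c(NA, NA)
  dataRange <- range(values, na.rm = TRUE)
  missingBound <- is.na(ylim)
  ylim[missingBound] <- dataRange[missingBound]
  ylim
}

values <- c(-0.4, 0.1, 1.3)                   # made-up values to plot
bothInferred <- resolveYlim(NA, values)       # both boundaries from the data
upperInferred <- resolveYlim(c(0, NA), values) # lower boundary fixed at 0
```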
Sylvain Mareschal
draw.bg, draw.boxes, draw.density, draw.pileup, draw.points, draw.seq, draw.steps
This function draws a slice of a sequence pileup, highlighting polymorphisms.
draw.pileup(slice, start, end, ylim = NA, bty = "o", label = TRUE, labelCex = 0.75, bases = c(A = "#44CC44", C = "#4444CC", G = "#FFCC00", T = "#CC4444"), maxRange = 500, cex.lab = 1, alphaOrder = 3, alphaMin = 0.1, fg = "#000000", ...)
slice |
An integer matrix of read counts, with nucleotides in rows and positions in columns. Both dimensions must be named. |
start |
Single integer value, the left boundary of the window, in base pairs. |
end |
Single integer value, the right boundary of the window, in base pairs. |
ylim |
See |
bty |
A character string which determines the type of box drawn around plots. If bty is one of "o" (the default), "l", "7", "c", "u", or "]" the resulting box resembles the corresponding upper case letter. A value of "n" suppresses the box. See |
label |
Single logical value, whether to print nucleotide on bars or not. |
labelCex |
Single numeric value, character expansion factor for labels. |
bases |
Named character vector, defining the color to use for each nucleotide. |
maxRange |
Single integer value, nothing will be plotted if the plot window is wider than this value (in bases). |
cex.lab |
The relative character size of x and y axis labels (default: 1). See |
alphaOrder |
Single numeric value, the order of the formula controlling the transparency. Increase this value to increase sensitivity to rare variants. |
alphaMin |
Single numeric value, the minimal intensity in the formula controlling the transparency (between 0 and 1). Perfectly homozygous positions will typically use this intensity of color. |
fg |
Single character value, defining the color of the foreground (axes, labels...) as an English name or a hexadecimal code. Similar to |
... |
Further arguments to be passed to |
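A 'slice' of the shape described above can be built by hand in base R. The read counts below are made up, and naming the columns after genomic positions is an assumption (the description only requires both dimensions to be named).

```r
# Made-up read counts: 4 nucleotides (rows) at 5 consecutive positions (columns)
pileup <- matrix(
  c(
    30L,  0L,  0L, 12L,  0L,   # A
     0L, 28L,  0L,  0L,  0L,   # C
     0L,  0L, 31L, 15L,  0L,   # G
     0L,  1L,  0L,  0L, 29L    # T
  ),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("A", "C", "G", "T"),      # nucleotides, matching the names of 'bases'
    as.character(1001:1005)     # assumption: columns named by genomic position
  )
)

# Sequencing depth at each position
depth <- colSums(pileup)
```

In this toy matrix the fourth position mixes A and G reads, which is the kind of polymorphic position the plot is meant to highlight.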
Sylvain Mareschal
draw.bg, draw.boxes, draw.density, draw.hist, draw.points, draw.seq, draw.steps
This function draws a slice of a track content, with a distinct point for each track element.
draw.points(slice, start, end, column = "value", pointColor = "#666666", cex.lab = 1, cex = 0.6, pch = "+", bty = "o", fg = "#000000", ...)
slice |
A |
start |
Single integer value, the left boundary of the window, in base pairs. |
end |
Single integer value, the right boundary of the window, in base pairs. |
column |
Single character value, the name of the |
pointColor |
The color to use for points (as a name, an integer or a hexadecimal character description). It can alternatively be a function without argument, which returns a vector of as many colors as |
cex.lab |
See |
cex |
See |
pch |
See |
bty |
See |
fg |
Single character value, defining the color of the foreground (axes, labels...) as an English name or a hexadecimal code. Similar to |
... |
Further arguments to be passed to |
Sylvain Mareschal
draw.bg, draw.boxes, draw.density, draw.hist, draw.pileup, draw.seq, draw.steps
This function draws a slice of a character vector, with labels and distinct colors for each nucleotide.
draw.seq(slice = NULL, start, end, bty = "o", labelCex = 0.75, bases = c(A = "#44CC44", C = "#4444CC", G = "#FFCC00", T = "#CC4444"), maxRange = 500, cex.lab = 1, fg = "#000000", ...)
slice |
Character vector, with a single letter per element. |
start |
Single integer value, the left boundary of the window, in base pairs. |
end |
Single integer value, the right boundary of the window, in base pairs. |
bty |
A character string which determines the type of box drawn around plots. If bty is one of "o" (the default), "l", "7", "c", "u", or "]" the resulting box resembles the corresponding upper case letter. A value of "n" suppresses the box. See |
labelCex |
Single numeric value, character expansion factor for labels. |
bases |
Named character vector, defining the color to use for each nucleotide (names have to be uppercase, |
maxRange |
Single integer value, nothing will be plotted if the plot window is wider than this value (in bases). |
cex.lab |
The relative character size of x and y axis labels (default: 1). See |
fg |
Single character value, defining the color of the foreground (axes, labels...) as an English name or a hexadecimal code. Similar to |
... |
Further arguments to be passed to |
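A 'slice' with one letter per element can be derived from an ordinary string in base R. The sequence below is made up, and the color lookup is only an illustration of how the named 'bases' vector maps letters to colors, not a description of the function's internals.

```r
# Made-up sequence, split into one upper case letter per element
sequence <- "acgTTGCA"
slice <- strsplit(toupper(sequence), split = "")[[1]]

# Default colors from the usage above, indexed by nucleotide name
bases <- c(A = "#44CC44", C = "#4444CC", G = "#FFCC00", T = "#CC4444")
colors <- unname(bases[slice])
```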
Sylvain Mareschal
draw.bg, draw.boxes, draw.density, draw.hist, draw.pileup, draw.points, draw.steps
This function draws each element sliced from a track as a separate podium, defined by several start and end genomic coordinates. This representation may prove useful to represent results of Minimal Common Regions from algorithms such as SRA or GISTIC (see the 'cghRA' package).
draw.steps(slice, start, end, startColumns = "start", endColumns = "end", maxDepth = 100, label = TRUE, labelStrand = FALSE, labelCex = 1, labelSrt = 0, labelAdj = "center", labelOverflow = TRUE, labelFamily = "sans", labelColor = "#000000", fillColor = "#BBBBBB", border = "#666666", cex.lab = 1, spacing = 0.1, bty = "o", fg = "#000000", ...)
slice |
A |
start |
Single integer value, the left boundary of the window, in base pairs. |
end |
Single integer value, the right boundary of the window, in base pairs. |
startColumns |
Character vector naming the columns in |
endColumns |
Character vector naming the columns in |
maxDepth |
Single integer value, the maximum number of box heights allowed on the plot to avoid overlaps (if exceeded, an error message will be plotted, turning |
label |
Single logical value, whether to print labels on boxes or not. |
labelStrand |
Single logical value, whether to add the strand at the end of labels or not. |
labelCex |
Single numeric value, character expansion factor for labels. |
labelSrt |
Single numeric value, string rotation angle for labels. |
labelAdj |
'left', 'right' or 'center', the horizontal adjustment of the labels on the boxes. |
labelOverflow |
Single logical value, whether to write labels on boxes too narrow to host them or not. |
labelFamily |
Single character value, the font family to use for labels ('serif', 'sans', 'mono' or 'Hershey'). 'serif' and 'sans' are not monospaced fonts, so label box sizes and collision handling might not work as expected with them. |
labelColor |
The color to use for box labels (as a name, an integer or a hexadecimal character description). It can alternatively be a function without argument, which returns a vector of as many colors as |
fillColor |
The color to fill boxes with (as a name, an integer or a hexadecimal character description). It can alternatively be a function without argument, which returns a vector of as many colors as |
border |
The color to use for box borders (as a name, an integer or a hexadecimal character description). It can alternatively be a function without argument, which returns a vector of as many colors as |
cex.lab |
The relative character size of x and y axis labels (default: 1). See |
spacing |
Single numeric value, the vertical spacing between boxes, in proportion of the box height. |
bty |
A character string which determines the type of box drawn around plots. If bty is one of "o" (the default), "l", "7", "c", "u", or "]" the resulting box resembles the corresponding upper case letter. A value of "n" suppresses the box. See |
fg |
Single character value, defining the color of the foreground (axes, labels...) as an English name or a hexadecimal code. Similar to |
... |
Further arguments to be passed to |
Sylvain Mareschal
draw.bg, draw.boxes, draw.density, draw.hist, draw.pileup, draw.points, draw.seq
"drawable"
Reference classes extending this virtual class must have a draw method, so their objects can be managed by tk.browse and browsePlot.
All reference classes extend and inherit methods from envRefClass.
name
:Custom name for the object, as a character
vector of length 1.
parameters
:A list
, storing object-specific parameters to use as draw
arguments.
callParams(chrom, start, end, ...)
:Called with draw() arguments, it returns the final argument list handling default and overloaded parameters.
- chrom, start, end, ... : arguments passed to draw().
check(warn = )
:Raises an error if the object is not valid, else returns TRUE
chromosomes()
:[Virtual method]
Returns the chromosome list as a vector. NULL is valid if not relevant, but should be avoided when possible.
defaultParams(...)
:Returns class-specific defaults for graphical parameters. Inheriting class should overload it to define their own defaults.
- ... : may be used by inheriting methods, especially for interdependent parameters.
draw(chrom, start = , end = , ...)
:[Virtual method]
Draws the object content corresponding to the defined genomic window, usually in a single plot area with coordinates in x and custom data in y.
Overloading methods should use .self$callParams(chrom, start, end, ...) to handle drawing parameters and NA coordinates in a consistent way.
- chrom : single integer, numeric or character value, the chromosomal location.
- start : single integer or numeric value, inferior boundary of the window. NA should refer to 0.
- end : single integer or numeric value, superior boundary of the window. NA should refer to .self$getChromEnd().
- ... : additional drawing parameters (precede but do not overwrite parameters stored in the object).
fix.param(parent = )
:Edit drawing parameters using a Tcl-tk GUI
- parent : tcltk parent frame for inclusion, or NULL.
getChromEnd(chrom)
:[Virtual method]
Returns as a single integer value the ending position of the object description of the given chromosome. NA (integer) is valid if not relevant, but should be avoided when possible.
- chrom : single integer, numeric or character value, the chromosomal location. NA is not required to be handled.
getName()
:'name' field accessor.
getParam(name, ...)
:Returns the parameter stored, or the default value if no custom value is stored for it.
- name : single character value, the name of the parameter to return.
- ... : to be passed to defaultParams(), especially for interdependent parameters.
initialize(name = , parameters = , ...)
setName(value)
:'name' field mutator.
setParam(name, value)
:Updates a parameter stored in the object.
- name : single character value, the name of the parameter to set.
- value : the new value to assign to the parameter (any type). If missing the parameter is discarded, thus returning to dynamic default value.
The following methods are inherited (from the corresponding class):
callSuper (envRefClass)
copy (envRefClass)
export (envRefClass)
field (envRefClass)
getClass (envRefClass)
getRefClass (envRefClass)
import (envRefClass)
initFields (envRefClass)
show (envRefClass, overloaded)
trace (envRefClass)
untrace (envRefClass)
usingMethods (envRefClass)
Sylvain Mareschal
Produces a drawable.list object, for tk.browse and browsePlot input.
drawable.list(files = character(0), objects = NULL, hidden = FALSE, warn = TRUE)
files |
Character vector, paths and names of the files holding the drawable objects (with .rdt or .rds extensions, see |
objects |
List of |
hidden |
Logical vector, whether to show tracks in |
warn |
Single logical value, to be passed to the appropriate |
A drawable.list object.
Sylvain Mareschal
drawable.list-class, drawable-class, tk.browse, browsePlot
"drawable.list"
The purpose of this class is to store and manage a collection of drawable objects. These collections are to be used by tk.browse and browsePlot as input.
Objects can be created by the drawable.list constructor, and edited / created using the tk.tracks Tcl-tk interface.
By default, the drawable.list$add()
method is only able to handle objects from drawable
-inheriting classes saved in RDT or RDS individual files. This can however be extended defining functions named drawableFromFile.EXTENSION
and drawableFromClass.CLASS
, in the global environment or a package. Such a function will take the same arguments as the drawable.list$add()
method, and will only have to return a drawable
-inheriting object.
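As an illustration of this naming convention, the following sketch defines a hypothetical handler for ".txt" files and a small dispatcher that looks it up from the file extension. Both `drawableFromFile.txt` and `loadByExtension` are assumptions made for this example, the returned object is a plain list standing in for a real drawable, and the lookup is not claimed to match the package's internal code.

```r
# Hypothetical handler for ".txt" files: same arguments as drawable.list$add(),
# returns the object to add (here a plain stand-in list, not a real drawable)
drawableFromFile.txt <- function(file, track = NULL, hidden = FALSE, ...) {
  list(name = basename(file), content = readLines(file))
}

# Illustrative dispatcher: builds the handler name from the file extension
# (assumption: this only mimics the naming convention)
loadByExtension <- function(file, ...) {
  extension <- tools::file_ext(file)
  handlerName <- paste0("drawableFromFile.", extension)
  if (!exists(handlerName, mode = "function")) {
    stop("No handler defined for extension '", extension, "'")
  }
  handler <- get(handlerName, mode = "function")
  handler(file = file, ...)
}

# Usage on a temporary file
tmp <- tempfile(fileext = ".txt")
writeLines(c("line 1", "line 2"), tmp)
loaded <- loadByExtension(tmp)
print(loaded$content)
```

A real handler would build and return an object of a drawable-inheriting class instead of a list.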
All reference classes extend and inherit methods from envRefClass
.
classes
:Read-only, returns a vector of objects
classes.
count
:Read-only, returns the length of objects
, as a single integer.
files
:Character vector, the paths where each drawable object is to be stored.
hidden
:Logical vector, whether each object is to be drawn or hidden in plots.
names
:Read-only, returns a vector of objects
'name' fields.
objects
:List of drawable
-inheriting objects.
add(file, track = , hidden = , ...)
:Add a track to the list.
- file : single character value, the path to the file containing the 'drawable' object to add.
- track : a 'drawable' object to add. If NULL, will be extracted from 'file'.
- hidden : single logical value, whether the track is to be shown on plots or hidden. This value can be changed later.
- ... : further arguments to be passed to drawableFromFile.EXTENSION or drawableFromClass.CLASS, if relevant.
check(warn = )
:Raises an error if the object is not valid, else returns TRUE
fix.files(parent = )
:Edit drawable list using a Tcl-tk GUI
- parent : tcltk parent frame for inclusion, or NULL.
fix.param(selection = , parent = )
:Edit drawing parameters using a Tcl-tk GUI
- selection : single integer value, the position of the track selected in the list.
- parent : tcltk parent frame for inclusion, or NULL.
get(index, what = )
:Returns a single 'what' from the series
- index : single numeric value, the position of the track to get.
- what : single character value, the field to be extracted.
getByClasses(classes, what = )
:Returns a subset of 'what' from the series, querying by class inheritance
- classes : character vector, the class names of the objects to get (inheriting classes are picked too).
- what : single character value, the field to be extracted.
getByNames(names, what = )
:Returns a subset of 'what' from the series, querying by track name
- names : character vector, the names of the objects to get.
- what : single character value, the field to be extracted.
getByPositions(positions, what = )
:Returns a subset of 'what' from the series, querying by position
- positions : integer vector, the positions of the objects to get.
- what : single character value, the field to be extracted.
getChromEnd(chrom)
:Returns as a single integer value the maximal ending position of the object descriptions of the given chromosome.
- chrom : single integer, numeric or character value, the chromosomal location.
initialize(files = , objects = , hidden = , ...)
moveDown(toMove)
:Increases the position of a track, switching position with the next one
- toMove : single numeric value, the position of the track to move.
moveUp(toMove)
:Decreases the position of a track, switching position with the previous one
- toMove : single numeric value, the position of the track to move.
remove(toRemove)
:Remove one or many tracks from the list
- toRemove : numeric vector, the positions of the tracks to remove.
The following methods are inherited (from the corresponding class):
callSuper (envRefClass)
copy (envRefClass)
export (envRefClass)
field (envRefClass)
getClass (envRefClass)
getRefClass (envRefClass)
import (envRefClass)
initFields (envRefClass)
show (envRefClass, overloaded)
trace (envRefClass)
untrace (envRefClass)
usingMethods (envRefClass)
Sylvain Mareschal
drawable.list
, drawable-class
, tk.browse
, browsePlot
This function searches an environment for drawable-class
inheriting objects.
findDrawables(varNames = NA, envir = globalenv())
varNames |
Character vector, the R expression(s) of potential |
envir |
The |
Objects are currently found if defined as individual variables, as parts of drawable.list
objects or inside standard R lists
. lists
are explored recursively, so lists
embedded into other lists
are explored too, whatever their depth.
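The recursive exploration can be sketched in base R as follows. This is only an illustration under stated assumptions: `fakeDrawable` and `findInList` are hypothetical helpers, and a simple S3 class named "drawable" stands in for the package's reference classes so that the sketch runs on its own.

```r
# Stand-in for a drawable-inheriting object (assumption: S3 class instead of reference class)
fakeDrawable <- function(name) structure(list(name = name), class = "drawable")

# Recursively collect the R expressions leading to 'drawable' objects nested in lists
findInList <- function(object, expr) {
  if (inherits(object, "drawable")) return(expr)
  if (is.list(object) && length(object) > 0) {
    found <- character(0)
    for (i in seq_along(object)) {
      found <- c(found, findInList(object[[i]], sprintf("%s[[%d]]", expr, i)))
    }
    return(found)
  }
  character(0)
}

# Environment to explore: one individual variable and one nested list
envir <- new.env()
envir$single <- fakeDrawable("A")
envir$nested <- list(1:3, list("x", fakeDrawable("B")))

exprs <- unlist(lapply(ls(envir), function(v) findInList(get(v, envir = envir), v)))
print(exprs)

# Each expression can be evaluated in 'envir' to get the object back
print(eval(parse(text = exprs[1]), envir)$name)
```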
Returns a character vector containing the R expression(s) to be evaluated in envir
to get the drawable-class
inheriting objects.
This vector carries an "envir" attribute containing the value passed to this function via the envir
argument.
Sylvain Mareschal
Whole list of GRCh37 human chromosome G-banding, from the UCSC repository.
data(hsBands)
track.table
object with 862 rows and the following columns : "name", "chrom", "strand", "start", "end", "stain".
University of California, Santa Cruz (genome.ucsc.edu)
8000 randomly chosen GRCh37 human genes, from the NCBI repository.
data(hsGenes)
data.frame
with 8000 rows and the following columns : "chrom", "start", "end" and "name".
National Center for Biotechnology Information (ftp.ncbi.nih.gov)
Functions to write a single refTable
object to a file, and to restore it.
saveRDT(object, file, compress = "gzip", compression_level = 6)
readRDT(file, version = FALSE)
object |
An object of class |
file |
A connection or the name of the file where the R object is saved to or read from. The '.rdt' file extension is recommended, but not mandatory. |
compress |
To be passed to |
compression_level |
To be passed to |
version |
Single logical value, whether to return the stored object or the version of the package used to store it. |
These functions mimic the saveRDS
and readRDS
system, without storing the class definition in the file (which can lead to about 100 KB of useless data and longer loading times). It is intended to manage all classes extending refTable
, but no guarantee is provided for classes with non-atomic slots (particularly environment-derived ones).
saveRDT
returns nothing, readRDT
returns the object stored in the file or a single character value (depends on the version
argument).
To avoid whole-environment copying, environments of function
slots are discarded.
Sylvain Mareschal
refTable-class
, saveRDS
, readRDS
This function parses a simple "Gene Transfer Format" (GTF2.2) into a data.frame, as distributed by the UCSC Table Browser.
As this format is an extension of the "General Feature Format" (GFF3), some backward compatibility can be expected but is not guaranteed.
read.gtf(file, attr = c("split", "intact", "skip"), features = NULL, quiet = FALSE)
file |
Single character value, the path and name of the GTF2 file to parse (possibly gzipped). |
attr |
Single character value, defining how to deal with attributes. "skip" discards the attributes data, "intact" does not process it and "split" adds a column for each attribute (identified by their names). |
features |
Character vector, if not |
quiet |
Single logical value, whether to send diagnostic messages or not. |
A data.frame
with the standard GTF2 columns. The "strand" column is converted to factor
, "?" are turned to NA
and "." are kept for features where stranding is not relevant (See the GFF3 specification).
Currently not implemented :
FASTA section and sequences (error raising)
Special character escaping (error raising)
Attribute quotes (kept)
Sections (all data pooled)
Meta data (ignored)
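To make the "split" behavior of the attr argument more concrete, here is a minimal base R sketch of how one attribute field can be turned into one value per attribute name. It is an illustration only, not the parser used by read.gtf: the attribute content is invented, and quotes are stripped here for readability although the list above states that read.gtf keeps them.

```r
# One GTF attribute field, as found in the 9th column (illustrative content)
attributes <- 'gene_id "ENSG01"; transcript_id "ENST01"; exon_number "2";'

# Split on semicolons, then drop surrounding blanks and empty pieces
pairs <- trimws(strsplit(attributes, ";", fixed = TRUE)[[1]])
pairs <- pairs[ nzchar(pairs) ]

# Separate each attribute name from its value
keys <- sub("^(\\S+)\\s+(.*)$", "\\1", pairs)
values <- sub("^(\\S+)\\s+(.*)$", "\\2", pairs)

# Quotes are removed in this sketch only (assumption, see lead-in)
values <- gsub('"', "", values, fixed = TRUE)

# One column per attribute, identified by its name
columns <- setNames(as.list(values), keys)
print(as.data.frame(columns, stringsAsFactors = FALSE))
```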
Sylvain Mareschal
GTF2.2 specification : http://mblab.wustl.edu/GTF22.html
GFF3 Sequence Ontology specification : http://www.sequenceontology.org/gff3.shtml
"refTable"
This class is similar to the data.frame
standard R class, following the object-oriented paradigm. Using the Reference Class system allows significant memory and time savings, making this class more suitable than data.frame
to handle large tabular data.
Objects can be created by two distinct means :
The refTable
constructor, similar to the data.frame
constructor. It imports a single data.frame
or a collection of vectors into the object, and checks immediately for validity.
The new
function (standard behavior for S4 and reference classes), which produces an empty object and does NOT check for validity. You can provide as arguments the values to use as the new object fields, if you know what you are doing.
All reference classes extend and inherit methods from envRefClass
.
refTable objects store data into their values
field as vectors. As values
is an environment, manipulating a refTable implies a pass-by-reference paradigm rather than the standard R pass-by-copy, i.e. data duplication (and so time and memory wasting) is widely reduced. As an example, updating a single cell in a data.frame
leads to the duplication of the whole table in memory ('before' and 'after' versions), while in a refTable the duplication is limited to the involved column.
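The difference between the two paradigms can be observed with base R alone. The sketch below does not use refTable: a plain environment stands in for the values field, and the function names `updateCopy` and `updateReference` are hypothetical.

```r
# A data.frame is copied when modified inside a function: the caller's table is unchanged
updateCopy <- function(tab) {
  tab$C1[1] <- 99L
  invisible(NULL)
}
dataFrame <- data.frame(C1 = 1:3)
updateCopy(dataFrame)
print(dataFrame$C1)

# An environment is shared: the same update is visible to the caller
updateReference <- function(env) {
  env$C1[1] <- 99L
  invisible(NULL)
}
values <- new.env()
values$C1 <- 1:3
updateReference(values)
print(values$C1)
```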
To facilitate column renaming, the vectors in values
are not named according to the user-level column names, but according to references stored in the colReferences
field (integers greater than 0 converted to characters). Renaming a column only updates the colNames
field and leaves the values
field alone, as the column reference does not change.
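This indirection can be sketched with plain R objects. The variables below only mimic the roles of the values, colReferences and colNames fields, and `getColumn` is a hypothetical accessor written for this example.

```r
# Columns are stored under stable references ("1", "2", ...), not under their names
values <- new.env()
colReferences <- c("1", "2")
colNames <- c("C1", "C2")
assign("1", 1:5, envir = values)
assign("2", letters[1:5], envir = values)

# Hypothetical accessor: translate a user-level name into its reference
getColumn <- function(name) {
  get(colReferences[ match(name, colNames) ], envir = values)
}

# Renaming only touches the name vector, the stored vector is neither moved nor copied
colNames[ colNames == "C2" ] <- "letters"
print(getColumn("letters"))
print(ls(values))
```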
Data extracted from refTable are usually returned as data.frame
, for a more comfortable R usage. The extraction mechanism handles data.frame
extraction mechanisms, and relies on the indexes
method to handle the others.
Rows and columns may be selected by a numeric vector, as for R data.frame
and vectors.
They also may be selected by a logical vector, defining for each row / column if it is to be selected (TRUE
) or not (FALSE
). Such vectors are recycled if not long enough to cover all the rows / columns.
A character vector defining the names of the rows / columns to select may also be used to extract data.
The NULL
value may be used to select all rows / columns.
An unevaluated expression, as returned by expression
or parse
may be used to select rows in the table environment. See 'examples'.
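Two of these selection modes, logical recycling and expression evaluation, can be sketched in base R. This is a simplified stand-in and not the indexes method itself: columns are stored here directly under their names in an environment, which is an assumption made to keep the example short.

```r
# Stand-in table stored as vectors in an environment (simplified 'values' field)
values <- new.env()
values$C1 <- 1:6
values$C2 <- letters[1:6]
rowCount <- 6L

# Logical selection is recycled to the row count: c(TRUE, FALSE) keeps odd rows
byLogical <- which(rep(c(TRUE, FALSE), length.out = rowCount))

# Expression selection is evaluated with the columns available as variables
expr <- expression(C1 %% 2 == 1)
byExpression <- which(eval(expr[[1]], envir = values))

print(byLogical)
print(byExpression)
print(values$C2[ byExpression ])
```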
colCount
:Single integer
value, the amount of columns in the table.
colIterator
:Single integer
value, last column reference used.
colNames
:Character
vector, the names of all columns (may be empty).
colReferences
:Character
vector, the column names in the values
environment.
rowCount
:Single integer
value, the amount of rows in the table.
rowNamed
:Single logical
value, whether row names should be considered or not.
rowNames
:Character
vector, the names of all rows (may be empty).
values
:An environment
storing the columns as vectors.
addColumn(content, name, after = )
:Adds a column in the table
- content : values to fill in the new column, as a vector.
- name : name of the new column, as character.
- after : where to add the column, as the index (numeric) or name (character) of the column on its left
addDataFrame(dataFrame)
:Adds a data.frame content to the refTable
- dataFrame : the data to add.
addEmptyRows(amount, newNames)
:Add rows filled with NA at the bottom of the table.
- amount : single integer value, the amount of rows to add.
- newNames : character vector, the names of the new rows. Ignored if the table is not row named.
addList(dataList, row.names)
:Adds a list content to the refTable
- dataList : the data to add.
- row.names : character vector with the names of the new rows.
addVectors(..., row.names)
:Adds vectors to the refTable
- ... : named vectors to add.
check(warn = )
:Raises an error if the object is not valid, else returns TRUE
coerce(j = , class, levels, ...)
:Coerces a single column to a different class
- j : column index (numeric) or name (character).
- class : name of the class to coerce 'j' to.
- levels : if 'class' is factor, the levels to use.
- ... : further arguments to be passed to the 'as' method (for atomics) or function (for other classes).
colOrder(newOrder, na.last = , decreasing = )
:Reorder the columns of the tables (duplication / subsetting are NOT handled)
- newOrder : new order to apply, as an integer vector or a character vector of column names.
delColumns(targets)
:Deletes one or more columns from the table
- targets : character vector, the name(s) of the column(s) to delete.
erase()
:Remove all the rows and columns in the table.
extract(i = , j = , drop = , asObject = )
:Extracts values into a data.frame or vector
- i : row selection, see indexes() for further details.
- j : column selection, see indexes() for further details.
- drop : if TRUE and querying a single column, will return a vector instead of a data.frame.
- asObject : if TRUE results will be served in the same class as the current object.
fill(i = , j = , newValues)
:Replaces values in a single column
- i : row indexes (numeric) or names (character). NULL or missing for all rows.
- j : column index (numeric) or name (character).
- newValues : vector of values to put in the object
getColCount()
:'colCount' field accessor.
getColNames()
:'colNames' field accessor.
getLevels(j = )
:Get levels of a factor column
- j : column index (numeric) or name (character).
getRowCount()
:'rowCount' field accessor.
getRowNames()
:'rowNames' field accessor.
indexes(i, type = )
:Checks row or column references and return numeric indexes
- i : reference to the rows or columns to select (NA not allowed), as :
- missing or NULL (all rows or columns)
- vector of numeric indexes to select
- vector of character indexes to select (if the dimension is named)
- vector of logical with TRUE on each value to select, FALSE otherwise
- expression object (as returned by e(...)), to be evaluated in the 'values' environment
initialize(rowCount = , rowNames = , rowNamed = , colCount = , colNames = , colReferences = , colIterator = , values = , ...)
metaFields()
:Returns a character vector of fields that do not directly depend on the tabular content, for cloning.
rowOrder(newOrder, na.last = , decreasing = )
:Reorder the rows of the tables (duplication / subsetting are handled)
- newOrder : new order to apply, as an integer vector of row indexes or a character vector of column names.
- na.last : to be passed to order(), if 'newOrder' is a column name vector.
- decreasing : to be passed to order(), if 'newOrder' is a column name vector.
setColNames(j, value)
:Replaces one or many column names.
- j : subset of columns to rename.
- value : new column names to use, as a character vector.
setLevels(j = , newLevels)
:Replaces the levels of a factor column
- j : column index (numeric) or name (character).
- newLevels : new levels to use, as a character vector.
setRowNames(value)
:Replaces the entire row names set.
- value : new row names to use in the table, as a character vector. NULL will disable row naming.
types(j = )
:Returns classes of selected columns
- j : column indexes or names (NULL for all columns)
The following methods are inherited (from the corresponding class):
callSuper (envRefClass)
copy (envRefClass, overloaded)
export (envRefClass)
field (envRefClass)
getClass (envRefClass)
getRefClass (envRefClass)
import (envRefClass)
initFields (envRefClass)
show (envRefClass, overloaded)
trace (envRefClass)
untrace (envRefClass)
usingMethods (envRefClass)
Sylvain Mareschal
# New empty refTable
tab <- new("refTable")
tab$addColumn(1:5, "C1")
tab$addColumn(letters[1:5], "C2")
tab$setRowNames(LETTERS[11:15])

# New filled refTable (same content)
tab <- refTable(C1=1:5, C2=letters[1:5], row.names=LETTERS[11:15])

# Whole table print
print(tab$extract())

# Data update
tab$fill(c(2,4), 2, c("B","D"))

# Data extraction
print(tab$extract(1:3))
print(tab$extract(c(TRUE, FALSE)))
print(tab$extract("K", "C1"))

# Expression-based extraction
expr <- expression(C1 %% 2 == 1)
print(tab$extract(expr))

# Table extension
tab$addEmptyRows(5L, LETTERS[1:5])
tab$fill(6:10, "C1", 6:10)
print(tab$extract())

# Filling from R objects
tab <- new("refTable")
print(tab$extract())
tab$addVectors(C1=1:5, C2=letters[1:5])
print(tab$extract())
tab$addList(list(C1=6:8, C3=LETTERS[6:8]))
print(tab$extract())

# Beware of recycling !
tab$addVectors(C1=9:15, C3=LETTERS[9:10])
print(tab$extract())
This function returns a new refTable
object from various arguments.
Notice the new()
alternative can be used to produce an empty object, setting only the fields not the content.
refTable(..., row.names, warn = TRUE)
... |
A |
row.names |
Character vector, the names of the rows for list or vector input. |
warn |
Single logical value, to be passed to the |
An object of class refTable
.
Sylvain Mareschal
# From vectors
tab <- refTable(colA=1:5, colB=letters[1:5])
print(tab$extract(3,))

# From list (recycling)
columns <- list(number=1, letters=LETTERS)
tab <- refTable(columns)
print(tab$extract())

# data.frame conversion
dataFrame <- data.frame(colA=1:5, colB=letters[1:5])
tab <- refTable(dataFrame)
print(tab$extract())
Given a set of segments defined by "chrom", "start", "end" and various data, it merges consecutive rows (sorted by "chrom" then "start") that share the same data. As an example, it is useful to merge consecutive regions of the genome sharing the same copy number after modeling, or to fill small gaps.
segMerge(segTable, on = names(segTable), fun = list(unique, start=min, end=max), group = NULL)
segTable |
A |
on |
Character vector, |
fun |
A |
group |
A vector with as many values as |
Returns a data.frame
similar to segTable
.
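The merging principle can be sketched in base R on a toy table. This is an illustration of the idea only, not the implementation of segMerge: the "copies" column and all values are invented, and the grouping relies on `cumsum()` over row-to-row changes.

```r
# Toy segments, already sorted by "chrom" then "start" (illustrative values)
segTable <- data.frame(
  chrom = c("1", "1", "1", "2", "2"),
  start = c(1L, 101L, 201L, 1L, 51L),
  end = c(100L, 200L, 300L, 50L, 90L),
  copies = c(2L, 2L, 3L, 3L, 3L),
  stringsAsFactors = FALSE
)

# A new group starts whenever "chrom" or "copies" differs from the previous row
n <- nrow(segTable)
changed <- segTable$chrom[-1] != segTable$chrom[-n] |
  segTable$copies[-1] != segTable$copies[-n]
group <- cumsum(c(TRUE, changed))

# Collapse each group: minimal start, maximal end, shared values kept once
merged <- data.frame(
  chrom = as.vector(tapply(segTable$chrom, group, unique)),
  start = as.vector(tapply(segTable$start, group, min)),
  end = as.vector(tapply(segTable$end, group, max)),
  copies = as.vector(tapply(segTable$copies, group, unique)),
  stringsAsFactors = FALSE
)
print(merged)
```

The per-column functions used here (unique, min, max) mirror the default value of the fun argument shown in the usage.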
Sylvain Mareschal
Given a set of segments defined by "chrom", "start", "end" and various data, it merges overlapping or adjacent rows.
segOverlap(segTable, fun = list(unique.default, start=min, end=max), factorsAsIntegers = TRUE)
segTable |
A |
fun |
A |
factorsAsIntegers |
Single logical value, whether to handle columns of class factor as integers or as is. Using |
Returns a data.frame
similar to segTable
.
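As a rough illustration of overlap merging, the base R sketch below groups overlapping or adjacent segments of a single chromosome. It is not the algorithm used by segOverlap: the coordinates are invented, and treating two segments as adjacent when one starts exactly one base after the other ends is an assumption of this example.

```r
# Toy segments on a single chromosome, sorted by "start" (illustrative values)
start <- c(1L, 50L, 101L, 300L, 350L)
end <- c(60L, 100L, 150L, 400L, 380L)

# A segment opens a new group when it begins after the furthest end seen so far, plus 1
# (so that adjacent segments, such as 100 / 101, are merged too)
furthest <- cummax(end)
newGroup <- c(TRUE, start[-1] > furthest[-length(furthest)] + 1L)
group <- cumsum(newGroup)

# Collapse each group to its minimal start and maximal end
merged <- data.frame(
  start = as.vector(tapply(start, group, min)),
  end = as.vector(tapply(end, group, max))
)
print(merged)
```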
Sylvain Mareschal
This function plots a drawable list on all the chromosomes side by side, with a constant X axis scale. It mainly defines a custom layout for a series of browsePlot
calls.
singlePlot(drawables, columns = 4, exclude = c("X", "Y"), add = c(5e6, 15e6), vertical = FALSE, capWidth = "1 cm", spacer = "1 cm", finalize = TRUE, cap.border = "black", cap.font.col = "black", cap.bg.col = NA, cap.adj = c(0.5, 0.5), cap.cex = 2, cap.font = 2, mar = c(0,0,0,0), bty = "n", xaxt = "n", xgrid = FALSE, yaxt = "n", ylab = "", ysub = "", ...)
drawables |
A |
columns |
Single integer value, column count in the chromosome layout. |
exclude |
Character vector, the names of the chromosomes to not plot. |
add |
Numeric vector of length 2, the margins to add before and after each chromosome, in base pairs. |
vertical |
Single logical value, whether to produce a plot showing chromosomes horizontally or vertically. Actually the figure will need to be manually rotated in the resulting file. |
capWidth |
Single value defining the width of the chromosome name caps, in a suitable format for |
spacer |
Single value defining the height of the inter-chromosome gaps, in a suitable format for |
finalize |
Single logical value, whether to fill unused space with a blank plot and return to default layout or allow further manual addition. If |
cap.border |
Single value defining the color of chromosome cap borders. See the |
cap.font.col |
Single value defining the color of chromosome names in the caps. See the |
cap.bg.col |
Single value defining the color of chromosome cap background. See the |
cap.adj |
Numeric vector of length 2 defining the X and Y adjustment of the chromosome names in the caps, ranging from 0 (left / bottom) to 1 (right / top). |
cap.cex |
Single value defining the character expansion factor for chromosome names in the caps. See the |
cap.font |
Single value defining the font type of chromosome names in the caps. See the |
mar |
Refer to ... described here-after. To disable the overriding and let each track define its own value as usual, set this parameter to |
bty |
Refer to ... described here-after. To disable the overriding and let each track define its own value as usual, set this parameter to |
xaxt |
Refer to ... described here-after. To disable the overriding and let each track define its own value as usual, set this parameter to |
xgrid |
Refer to ... described here-after. To disable the overriding and let each track define its own value as usual, set this parameter to |
yaxt |
Refer to ... described here-after. To disable the overriding and let each track define its own value as usual, set this parameter to |
ylab |
Refer to ... described here-after. To disable the overriding and let each track define its own value as usual, set this parameter to |
ysub |
Refer to ... described here-after. To disable the overriding and let each track define its own value as usual, set this parameter to |
... |
Further arguments will be passed to |
Sylvain Mareschal
browsePlot
, drawable.list
, drawable
"sliceable"
Reference classes extending this virtual class must have a slice
method, as a generic draw
method based on it is provided.
Class drawable
, directly.
All reference classes extend and inherit methods from envRefClass
.
The following fields are inherited (from the corresponding class):
slice(chrom, start, end)
:[Virtual method]
Extract elements in the specified window, in a format suitable to draw().
- chrom : single integer, numeric or character value, the chromosomal location. NA is not handled.
- start : single integer or numeric value, inferior boundary of the window. NA is not handled.
- end : single integer or numeric value, superior boundary of the window. NA is not handled.
The following methods are inherited (from the corresponding class):
callParams (drawable)
callSuper (envRefClass)
check (drawable)
chromosomes (drawable)
copy (envRefClass)
defaultParams (drawable, overloaded)
draw (drawable, overloaded)
export (envRefClass)
field (envRefClass)
fix.param (drawable)
getChromEnd (drawable, overloaded)
getClass (envRefClass)
getName (drawable)
getParam (drawable)
getRefClass (envRefClass)
import (envRefClass)
initFields (envRefClass)
initialize (drawable)
setName (drawable)
setParam (drawable)
show (drawable, overloaded)
trace (envRefClass)
untrace (envRefClass)
usingMethods (envRefClass)
Sylvain Mareschal
drawable
, crossable
, cghRA.array
subtrack
extracts lines from a data.frame
, list
or vector
collection within a single genomic window, defined by a chromosome name, a starting position and an ending position. As this is a common task in genome-wide analysis, this function relies on optimized C code in order to achieve good performance.
sizetrack
is very similar to subtrack
, but only counts lines without extracting the data.
istrack
checks if a collection of data is suitable for subtrack
and sizetrack
(See 'Track definition' for further details). As this operation is quite expensive and should be performed only once, it is up to users to check their data before subtracking.
istrack(...)
subtrack(...)
sizetrack(...)
... |
A collection of data to be considered as a single track. Named vectors are considered as single columns, For
|
The C code relies heavily on the ordering to quickly retrieve the elements that overlap the queried window. Elements entirely contained in the window are returned, as well as elements that only partially overlap it.
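The overlap rule described above can be written as a simple base R condition on a toy table. This sketch only shows which features are expected to be returned, it does not reproduce the optimized C code, and the feature names and coordinates are invented.

```r
# Toy features on one chromosome (illustrative values)
features <- data.frame(
  name = c("before", "leftOverlap", "inside", "rightOverlap", "after"),
  start = c(10L, 90L, 120L, 190L, 250L),
  end = c(50L, 110L, 150L, 220L, 300L),
  stringsAsFactors = FALSE
)

# Queried window
windowStart <- 100L
windowEnd <- 200L

# A feature overlaps the window if it starts before the window ends
# and ends after the window starts
overlapping <- features$start <= windowEnd & features$end >= windowStart
print(features[ overlapping , ])
```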
subtrack
returns a single data.frame
merging all columns provided, with the subset of rows corresponding to elements in the queried window. This data.frame
has no row name, and is a valid track (See 'Track definition' for further details).
sizetrack
returns a single integer
value corresponding to the count of rows in the queried window.
istrack
returns a single TRUE
value if the data collection provided is a valid track. Otherwise it returns a single FALSE
value, with a "why" attribute containing a single character string explaining the (first) condition that is not fulfilled.
A track is defined as a data.frame
with a variable amount of data (in columns) about a variable amount of features (in rows).
3 columns are mandatory, with restricted names and types :
- chrom : the chromosomal location of the feature, as integer or factor.
- start : the starting position of the feature on the chromosome, as integer.
- end : the ending position of the feature on the chromosome, as integer.
The track is supposed to be ordered by chromosome, then by starting position. When chromosomes are stored as factors
, they need to be numerically ordered by their internal codes (as the order
function does), not alphabetically by their labels.
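The difference between ordering by internal codes and ordering by labels can be checked with a small base R example. The chromosome names and level order below are illustrative choices.

```r
# Chromosome names as a factor, with levels declared in karyotype order
chrom <- factor(c("10", "2", "X", "1"), levels = c("1", "2", "10", "X"))

# order() on a factor relies on the internal integer codes, so levels drive the result
byCodes <- as.character(chrom[ order(chrom) ])

# Ordering the labels as character strings gives a different (unsuitable) result
byLabels <- sort(as.character(chrom), method = "radix")

print(byCodes)
print(byLabels)
```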
In order to guarantee good performance, chromosomes are to be indexed. As the rows are supposed to be ordered by chromosome, then by starting position (see 'Track definition'), remembering the starting or ending row of each chromosome can save huge amounts of computation time in large tracks.
The following specifications must be fulfilled :
It must be an integer
vector, with the last row index of each chromosome in the track indexed.
Values are to be ordered by chromosome, in the same way as the 'chrom' column.
For integer
'chrom', values are extracted by position (chromosome '1' is the first value ...).
For factor
'chrom', values are extracted by names (named with 'chrom' levels).
Chromosomes without data in the track must be described, with NA integer
values.
See the 'Example' section below for index computation.
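For a factor 'chrom' column, an index following these specifications can be sketched on a toy track in which one chromosome has no data. The track content is invented, and the sketch only checks the shape of the index, it does not call subtrack.

```r
# Toy track ordered by chromosome then start; chromosome "2" has no feature
track <- data.frame(
  chrom = factor(c("1", "1", "3", "3", "3"), levels = c("1", "2", "3")),
  start = c(10L, 50L, 5L, 20L, 80L),
  end = c(20L, 70L, 15L, 60L, 90L)
)

# Last row index of each chromosome, named by 'chrom' levels
# (tapply() leaves NA for levels without any row)
index <- tapply(seq_len(nrow(track)), track$chrom, max)
print(index)
```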
These three functions are proposed for generic usage on data.frame
, list
or vectors. The track.table
class implements more suitable slice
, size
and check
methods, and handles autonomously the indexing.
Sylvain Mareschal
# Exemplar data : subset of human genes
data(hsGenes)

# Track validity
print(istrack(hsGenes))
hsGenes <- hsGenes[ order(hsGenes$chrom, hsGenes$start) ,]
print(istrack(hsGenes))

# Chromosome index (factorial 'chrom')
index <- tapply(1:nrow(hsGenes), hsGenes$chrom, max)

# Factor chrom query
print(class(hsGenes$chrom))
subtrack("1", 10e6, 15e6, index, hsGenes)

# Row count
a <- nrow(subtrack("1", 10e6, 15e6, index, hsGenes))
b <- sizetrack("1", 10e6, 15e6, index, hsGenes)
if(a != b) stop("Inconsistency")

# Multiple sources
length <- hsGenes$end - hsGenes$start
subtrack("1", 10e6, 15e6, index, hsGenes, length)
subtrack("1", 10e6, 15e6, index, hsGenes, length=length)

# Speed comparison (x200 here)
system.time(
  for(i in 1:40000) {
    subtrack("1", 10e6, 15e6, index, hsGenes)
  }
)
system.time(
  for(i in 1:200) {
    hsGenes[ hsGenes$chrom == "1" & hsGenes$start <= 15e6 & hsGenes$end >= 10e6 ,]
  }
)

# Convert chrom from factor to integer
hsGenes$chrom <- as.integer(as.character(hsGenes$chrom))

# Chromosome index (integer 'chrom')
index <- rep(NA_integer_, 24)
tmpIndex <- tapply(1:nrow(hsGenes), hsGenes$chrom, max)
index[ as.integer(names(tmpIndex)) ] <- tmpIndex

# Integer chrom query
print(class(hsGenes$chrom))
subtrack(1, 10e6, 15e6, index, hsGenes)
The browsePlot
function produces a usual R plot from a drawable
inheriting object list, at specific coordinates in the genome.
The tk.browse
function summons a TCL-TK interface to navigate through the whole genome, relying on browsePlot
for the plotting.
The former may be called directly to automatically export views from the genome browser, the latter is more suited to an interactive browsing with frequent coordinate jumps.
tk.browse(drawables = drawable.list(), blocking = FALSE, updateLimit = 0.4, png.height = NA, png.res = 100, png.file = tempfile(fileext=".png"), panelWidth = "5 cm", panel = NA)
browsePlot(drawables, chrom = NA, start = NA, end = NA, customLayout = FALSE, xaxt = "s", xaxm = 1.5, panelWidth = "5 cm", panelSide = "left", panel = NA, ...)
drawables |
A |
blocking |
Single logical value, whether to wait for the interface window to be closed before unfreezing the R console. The |
updateLimit |
Single numeric value, minimal time (in seconds) between two image updates when move or zoom key are continuously pressed. This is used to limit zoom and move speed on fast computers. |
png.height |
Single integer value, the height of the display in pixels. Default value is to adapt the display to the size of the window, taller displays will require the user to use the scrollbar on the right of the display. |
png.res |
Single integer value, the resolution of the plot in pixels per inch. Passed to |
png.file |
Single character value, the path to the PNG file that is displayed in the main window. The default behavior is to hide it in a temporary location, however you can define this argument to have an easier access to the images displayed in Rgb (the image will be replaced each time Rgb refresh its display). |
panelWidth |
Single value, the width of the panel displays on the left of the tracks, if any is to be plotted. This is handled by the |
panelSide |
Single character value ("left" or "right"), the side on which to plot the panels, if any. Note that |
panel |
Single logical value, whether to force a panel to be displayed or to not be displayed. Default value of |
chrom |
Single character value, the chromosome to plot. |
start |
Single integer value, the left boundary of the window to plot. |
end |
Single integer value, the right boundary of the window to plot. |
customLayout |
Single logical value, whether to organize the various plots or not. If |
xaxt |
X axis showing (see |
xaxm |
Minimal bottom margin for the last track (see |
... |
Further arguments are passed through to the |
tk.browse
invisibly returns the drawables
argument (See the 'Typing R commands while browsing' section above).
browsePlot
invisibly returns the par
function output for the last track plot, as it is used by tk.browse
.
The upper left panel can be used to jump to specific coordinates, defined by a chromosome name, a starting position and an ending position (as floating point numerics in millions of base pairs).
The left and right arrow keys may be used to shift the window to the corresponding side. The page-up and page-down keys can be used to switch chromosome, without changing genomic numeric position.
The up and down arrow keys, as well as the vertical mouse wheel, may be used to zoom in or out on the current location.
A zoom can also be achieved with a mouse drag over the region to investigate: press and hold the left button at the position to use as the new left boundary, and release it at the position of the new right boundary.
The plot area size is defined by hscale
and vscale
. During interactive browsing, resize the browser window and use the "r" key to adjust the plot area to the window size.
tk.browse
returns the drawable.list
object it uses to store the currently browsed data. As it is a reference class object, the same memory location is shared by tk.browse
and the returned object, so updates (like track addition or edition) made by tk.browse
will impact the object in the R Command Line Interface (CLI), and updates made via R commands will impact the current tk.browse
session.
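The sharing described above is standard reference class behavior and can be illustrated without Rgb. The sketch below is a minimal base R illustration: `TrackList`, its `trackNames` field and its `add` method are hypothetical stand-ins, not part of the Rgb API.

```r
library(methods)

# Hypothetical stand-in for drawable.list: a reference class holding track names
TrackList <- setRefClass(
  "TrackList",
  fields = list(trackNames = "character"),
  methods = list(
    add = function(trackName) {
      # Fields are modified in place with <<-
      trackNames <<- c(trackNames, trackName)
      invisible(.self)
    }
  )
)

tracks <- TrackList$new(trackNames = "genes")

# Assignment does not copy a reference class object: both names point to the same data
alias <- tracks
alias$add("bands")
print(tracks$trackNames)   # the update made through 'alias' is visible here

# copy() is needed to obtain an independent object
snapshot <- tracks$copy()
snapshot$add("exons")
print(tracks$trackNames)   # unchanged by the update made on 'snapshot'
```

In the same way, the object returned by tk.browse and the one used by the interface are a single object, so `copy()` is the way to keep an untouched version before editing.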
Some sub-interfaces (like information pop-ups and track selection panels) may freeze the R command prompt while they are open; make sure that only the tk.browse
main window is open when typing R commands.
Notice that some operating systems (including Windows) prevent users from typing R commands while a Tcl-Tk window is open, or seem to be unstable while doing so. Setting blocking
to TRUE
will enforce this behavior, keeping users from typing commands while tk.browse
is running.
Sylvain Mareschal
singlePlot
, drawable.list
, drawable
This function provides a tcl-tk interface to convert RDT table files into CSV-like files, and to produce basic track.table
objects from such files.
tk.convert(blocking = FALSE)
blocking |
Single logical value, whether to wait for the interface window to be closed before unfreezing the R console. The |
Sylvain Mareschal
This function loads and edits the drawable.list
object that is to be passed to tk.browse
and browsePlot, using a Tcl-Tk interface.
tk.tracks(drawables = drawable.list(), parent = NULL)
drawables |
A previously built |
parent |
An optional |
Returns a drawable.list
object. Notice that if 'drawables' was provided, it will also be updated "in-place" (standard reference class behavior).
Sylvain Mareschal
tk.browse
, browsePlot
, drawable.list-class
, drawable-class
, findDrawables
These functions are components used to build interactive interfaces.
tk.file
is a wrapper to tcltk functions tkgetOpenFile
and tkgetSaveFile
with several enhancements.
tk.files
proposes to build and order a file list from multiple calls to tk.file
.
tk.folder
is a wrapper to tcltk function tkchooseDirectory
with small enhancements.
handle
is a wrapper to withRestarts
, which catches errors, warnings and messages raised while executing an R expression and handles them with custom functions.
tk.file(title = "Choose a file", typeNames = "All files", typeExt = "*", multiple = FALSE, mandatory = TRUE, type = c("open", "save"), initialdir = NULL, parent = NULL)
tk.files(preselection = character(0), multiple = TRUE, parent = NULL, ...)
tk.folder(title = "Choose a directory", mustexist = TRUE, mandatory = TRUE)
handle(expr, messageHandler, warningHandler, errorHandler)
title |
Single character value, the displayed name of the summoned window. |
typeNames |
Character vector defining the displayed names of the filtered file extensions. Parallel to |
typeExt |
Character vector defining the filtered file extensions (use "*" as wildcard). Parallel to |
multiple |
Single logical value, whether to allow multiple file selection or not. |
mandatory |
Single logical value, whether to throw an error if no file is selected or not. |
type |
Single character value defining the label of the button. |
initialdir |
Single character value, the initially selected directory when the window is summoned. |
parent |
A Tcl-Tk object to consider as the parent of the new frame. |
preselection |
Character vector, the files already selected when the window is summoned. |
... |
Further arguments to be passed to |
mustexist |
Single logical value, whether to throw an error if a non-existing directory is selected or not. |
expr |
An R expression in which errors, warnings and messages are to be caught. Use |
messageHandler |
A function taking as single argument the condition object caught. If missing, messages will pass through. |
warningHandler |
A function taking as single argument the condition object caught. If missing, warnings will pass through. |
errorHandler |
A function taking as single argument the condition object caught. If missing, errors will pass through. |
tk.file
, tk.files
and tk.folder
return the selection as a character vector, possibly empty.
handle
returns nothing.
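To illustrate the kind of condition handling that handle provides, here is a minimal sketch relying on base R only. It is not the Rgb implementation: it uses `withCallingHandlers` and the standard `muffleMessage` and `muffleWarning` restarts, and the messages and the `caught` vector are made up for the example.

```r
# Illustrative only: base R condition handling, not the code behind handle()
caught <- character(0)

result <- withCallingHandlers(
  {
    message("loading track")           # would normally be printed
    warning("missing strand column")   # would normally be reported at the end
    "done"
  },
  message = function(m) {
    # Custom handler: record the message, then prevent the default display
    caught <<- c(caught, paste("message:", conditionMessage(m)))
    invokeRestart("muffleMessage")
  },
  warning = function(w) {
    # Custom handler: record the warning, then resume execution silently
    caught <<- c(caught, paste("warning:", conditionMessage(w)))
    invokeRestart("muffleWarning")
  }
)

print(result)
print(caught)
```

Each handler receives the condition object as its single argument, as the messageHandler, warningHandler and errorHandler arguments do. Errors cannot be resumed this way, and are usually caught with `tryCatch` in base R.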
Sylvain Mareschal
tk.browse
, tk.convert
, tk.tracks
Produces track.table
-inheriting objects.
track.table(..., .name, .parameters, .organism, .assembly, .chromosomes, .makeNames = FALSE, .orderCols = TRUE, warn = TRUE)
track.bam(bamPath, baiPath, addChr, quiet = FALSE, .name, .organism, .assembly, .parameters, warn = TRUE)
track.genes(...)
track.bands(...)
track.exons(...)
track.CNV(...)
... |
Arguments to be passed to the inherited constructor ( |
.name |
Single character value, to fill the |
warn |
Single logical value, to be passed to the appropriate |
.parameters |
A |
.organism |
Single character value, to fill the |
.assembly |
Single character value, to fill the |
.chromosomes |
Single character value, levels to use for the 'chrom' column if conversion to factor is needed. |
.makeNames |
Single logical value, whether to compute the 'name' column with unique values or not. If |
.orderCols |
Single logical value, whether to reorder the columns for more consistency between tracks or not. |
bamPath |
Single character value, the file name and path to a BAM file (.bam) to build a track around. |
baiPath |
Single character value, the file name and path to the corresponding BAI file (.bai) to build a track around. If missing, a guess will be tried (adding '.bai' to |
addChr |
Single logical value, whether to automatically prepend 'chr' to chromosome names when querying or not. If missing, a guess is made by looking for chromosome names beginning with 'chr' in the BAM header declaration. |
quiet |
Single logical value, whether to throw diagnostic messages during BAI parsing or not. |
track.table
and track.bam
inheriting objects.
Sylvain Mareschal
track.table-class
, track.bam-class
# track.table from a data.frame
df <- data.frame(
  chrom=1, strand="+", start=1:5, end=2:6, name=letters[1:5],
  stringsAsFactors=FALSE
)
track.table(df)

# track.table from vectors
track.table(chrom=1, strand="+", start=1:5, end=2:6, name=letters[1:5])

# track.bam
track.bam(system.file("extdata/ATM.bam", package="Rgb"))
"track.bam"
"track.bam"
is a drawing wrapper for Binary Alignment Map files (SAMtools).
Notice the data are not stored directly in the object, but stay in the original BAM file, thus exported track.bam
objects may be broken (the check
method can confirm this).
Objects are produced by the track.bam
constructor.
Class sliceable
, directly.
Class drawable
, by class sliceable
, distance 2.
All reference classes extend and inherit methods from envRefClass
.
addChr
:Single logical
value, whether to automatically prepend 'chr' to chromosome names when querying or not.
assembly
:Single character
value, the assembly version for the coordinates stored in the object. Must have length 1, should not be NA
.
baiPath
:Single character
value, the full path to the BAI index file in use.
bamPath
:Single character
value, the full path to the BAM file in use.
compression
:Single numeric
value, an estimation of the BAM file compression ratio.
header
:A data.frame
describing the @SQ elements of the BAM header (one per row).
index
:The parsed content of the BAI index, as an unnamed list
with one element per reference sequence, itself a list
with 'bins' and 'intervals' elements. 'bins' is a named list
of two-column matrices ('start' and 'end'), giving virtual BGZF coordinates of the described bin (as double
). 'intervals' is a double
vector of virtual BGZF coordinates, used for linear filtering (see SAM specification for further details).
organism
:Single character
value, the name of the organism whose data is stored in the object. Must have length 1, should not be NA
.
The following fields are inherited (from the corresponding class):
coverage(chrom, start = , end = , tracks = , binLevel = , rawSize = )
:Fast estimation of depth coverage in a genomic window, from indexing data. Values are normalized into [0:1] over the genomic window.
- chrom : single integer, numeric or character value, the chromosomal location.
- start : single integer or numeric value, inferior boundary of the window. If NA, the whole chromosome is considered.
- end : single integer or numeric value, superior boundary of the window. If NA, the whole chromosome is considered.
- tracks : single logical value, whether to return a data.frame or a track.table.
- binLevel : single integer value, the highest bin order to allow
0 = 537Mb, 1 = 67Mb, 2 = 8Mb, 3 = 1Mb, 4 = 130kb, 5 = 16kb
incrementing this value enhances boundary precision but discards reads located at bin junctions.
- rawSize : single logical value, whether to output raw size or normalize by the maximum encountered.
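The bin sizes listed for 'binLevel' match the powers of two of the BAI binning scheme described in the SAM specification (2^29 bases at level 0, divided by 8 at each level). The sketch below recomputes them in base R; the 100 Mb chromosome length is an arbitrary value chosen for illustration.

```r
# Bin widths of the BAI binning scheme, per level (0 = widest bins)
binLevel <- 0:5
binWidth <- 2^(29 - 3 * binLevel)
names(binWidth) <- binLevel
print(binWidth)

# Number of bins of each level needed to span a hypothetical 100 Mb chromosome:
# higher levels give a finer resolution, at the cost of more bins to scan
chromLength <- 100e6
binCount <- ceiling(chromLength / binWidth)
print(binCount)
```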
crawl(chrom, start, end, addChr = , maxRange = , maxRangeWarn = , verbosity = , ..., init, loop, final)
:Apply a custom processing to reads in a genomic window (used by 'depth', 'extract' and 'pileup' methods).
- chrom : single integer, numeric or character value, the chromosomal location. NA is not handled.
- start : single integer or numeric value, inferior boundary of the window. NA is not handled.
- end : single integer or numeric value, superior boundary of the window. NA is not handled.
- addChr : single logical value, whether to systematically add 'chr' in front of the 'chrom' value or not.
- maxRange : single integer value, no extraction will be attempted if end and start are more than this value away (returns NULL).
- maxRangeWarn : single logical value, whether to throw a warning when 'maxRange' is exceeded and NULL is returned or not.
- verbosity : single integer value, the level of verbosity during processing (0, 1 or 2).
- ... : arguments to be passed to 'init', 'loop' or 'final'.
- init : a function taking a single storage environment as argument, to be evaluated before looping on reads for initialization.
This environment has R 'base' environment as parent and contains :
* all arguments passed to crawl()
* a 'self' reference to the current object.
* 'earlyBreak', a single logical value forcing crawl() to return immediately if set to TRUE.
* 'output', a place-holder for the variable to be returned by crawl().
* 'totalReads', the number of matching reads seen since the beginning of the whole looping process.
* 'blockReads', the number of matching reads seen since the beginning of the current BGZF block.
The 'init', 'loop' and 'final' functions defined by the user can freely store additional variables in this environment to share them.
- loop : a function taking a list-shaped read and the storage environment, to be evaluated for each read with matching coordinates.
- final : a function taking the storage environment as argument, to be evaluated once all reads were processed for finalization.
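The init / loop / final pattern can be sketched in base R with a mock driver. Everything below is hypothetical: `mockCrawl` only mimics the calling convention described above on an in-memory list, the real crawl() reads a BAM file, and the 'QNAME' and 'MAPQ' element names are assumptions about the shape of a read.

```r
# Hypothetical stand-in for crawl(): same callbacks, but reads come from a list
mockCrawl <- function(reads, init, loop, final) {
  env <- new.env(parent = baseenv())   # shared storage environment
  env$output <- NULL
  env$earlyBreak <- FALSE
  env$totalReads <- 0L
  init(env)
  for (read in reads) {
    # Assumption: an early break returns the current output without finalization
    if (env$earlyBreak) return(env$output)
    env$totalReads <- env$totalReads + 1L
    loop(read, env)
  }
  final(env)
  env$output
}

# Made-up reads, as lists (element names are assumptions)
reads <- list(
  list(QNAME = "r1", MAPQ = 60L),
  list(QNAME = "r2", MAPQ = 10L),
  list(QNAME = "r3", MAPQ = 42L)
)

# Collect the names of the reads with a mapping quality of at least 30
kept <- mockCrawl(
  reads,
  init  = function(env) env$kept <- character(0),
  loop  = function(read, env) {
    if (read$MAPQ >= 30L) env$kept <- c(env$kept, read$QNAME)
  },
  final = function(env) env$output <- env$kept
)
print(kept)
```

The three functions communicate only through the storage environment, which is why user-defined variables such as 'kept' can be created in 'init' and reused in 'loop' and 'final'.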
depth(..., qBase = , qMap = )
:Counts covering bases for each genomic position, similarly to SAMtools' depth.
- ... : arguments to be passed to the crawl() method.
- qBase : single integer value, minimal base quality for a base to be counted.
- qMap : single integer value, minimal mapping quality for a base to be counted.
extract(...)
:Extract reads as a list, similarly to SAMtools' view.
- ... : arguments to be passed to the crawl() method.
getBlocks(limit = , quiet = )
:Jumps from one BGZF block to the next, recording compressed (bsize) and uncompressed (isize) block sizes.
- limit : single integer value, the amount of blocks to evaluate (NA for the whole BAM file, may be very time consuming).
- quiet : single logical value, whether to throw diagnostic messages or not.
getCompression(sample = )
:Estimate BGZF block compression level from a sample of blocks
- sample : single integer value, the amount of blocks to use for estimation (the first block is ignored).
pileup(..., qBase = , qMap = )
:Counts each nucleotide type for each genomic position, similarly to SAMtools' mpileup.
- ... : arguments to be passed to the crawl() method.
- qBase : single integer value, minimal base quality for a base to be counted.
- qMap : single integer value, minimal mapping quality for a base to be counted.
summary(chrom = , tracks = , binLevel = , rawSize = )
:Fast estimation of depth coverage for the whole genome, from indexing data. Values are normalized into [0:1] over the whole genome.
- chrom : character vector, the names of the chromosome to query. If NA, all chromosomes will be queried.
- tracks : single logical value, whether to return a data.frame or a track.table.
- binLevel : single integer value, the highest bin order to allow
0 = 537Mb, 1 = 67Mb, 2 = 8Mb, 3 = 1Mb, 4 = 130kb, 5 = 16kb
incrementing this value enhances boundary precision but discards reads located at bin junctions
- rawSize : single logical value, whether to output raw size or normalize by the maximum encountered.
The following methods are inherited (from the corresponding class):
callParams (drawable)
callSuper (envRefClass)
check (drawable, overloaded)
chromosomes (drawable, overloaded)
copy (envRefClass)
defaultParams (sliceable, overloaded)
draw (sliceable)
export (envRefClass)
field (envRefClass)
fix.param (drawable)
getChromEnd (sliceable, overloaded)
getClass (envRefClass)
getName (drawable)
getParam (drawable)
getRefClass (envRefClass)
import (envRefClass)
initFields (envRefClass)
initialize (drawable, overloaded)
setName (drawable)
setParam (drawable)
show (sliceable, overloaded)
slice (sliceable, overloaded)
trace (envRefClass)
untrace (envRefClass)
usingMethods (envRefClass)
Sylvain Mareschal
http://samtools.github.io/hts-specs/SAMv1.pdf
track.table
, sliceable-class
, drawable-class
"track.bands"
This class is a variation of the track.table
class dedicated to cytogenetic banding, enforcing new drawing parameter defaults and a few specialized methods.
Objects can be created by two distinct means:
Using the corresponding constructors, which work like the track.table
constructor.
Importing a track.table
object in an empty object created by a call to new
.
Class track.table
, directly.
Class refTable
, by class track.table
, distance 2.
Class crossable
, by class track.table
, distance 2.
Class sliceable
, by class track.table
, distance 3.
Class drawable
, by class track.table
, distance 4.
All reference classes extend and inherit methods from envRefClass
.
The following fields are inherited (from the corresponding class):
assembly (track.table)
checktrack (track.table)
colCount (refTable)
colIterator (refTable)
colNames (refTable)
colReferences (refTable)
index (track.table)
name (drawable)
organism (track.table)
parameters (drawable)
rowCount (refTable)
rowNamed (refTable)
rowNames (refTable)
sizetrack (track.table)
subtrack (track.table)
values (refTable)
The following methods are inherited (from the corresponding class):
addArms (track.table)
addColumn (track.table)
addDataFrame (refTable)
addEmptyRows (refTable)
addList (track.table)
addVectors (refTable)
buildCalls (track.table)
buildGroupPosition (track.table)
buildGroupSize (track.table)
buildIndex (track.table)
callParams (drawable)
callSuper (envRefClass)
check (track.table)
chromosomes (track.table)
coerce (track.table)
colOrder (refTable)
copy (refTable)
cross (crossable)
defaultParams (track.table, overloaded)
delColumns (track.table)
draw (sliceable)
erase (refTable)
eraseArms (track.table)
export (envRefClass)
extract (refTable)
field (envRefClass)
fill (track.table)
fix.param (drawable)
getChromEnd (track.table)
getClass (envRefClass)
getColCount (refTable)
getColNames (refTable)
getLevels (refTable)
getName (drawable)
getParam (drawable)
getRefClass (envRefClass)
getRowCount (refTable)
getRowNames (refTable)
import (envRefClass)
indexes (refTable)
initFields (envRefClass)
initialize (track.table)
isArmed (track.table)
metaFields (track.table)
rowOrder (track.table)
segMerge (track.table)
segOverlap (track.table)
setColNames (track.table)
setLevels (track.table)
setName (drawable)
setParam (drawable)
setRowNames (refTable)
show (track.table, overloaded)
size (track.table)
slice (track.table)
trace (envRefClass)
types (refTable)
untrace (envRefClass)
usingMethods (envRefClass)
Sylvain Mareschal
track.table-class
, track.table
"track.CNV"
This class is a variation of the track.table
class dedicated to constitutive Copy Number Variations, enforcing new drawing parameter defaults and a few specialized methods.
Objects can be created by two distinct means:
Using the corresponding constructors, which work like the track.table
constructor.
Importing a track.table
object in an empty object created by a call to new
.
Class track.table
, directly.
Class refTable
, by class track.table
, distance 2.
Class crossable
, by class track.table
, distance 2.
Class sliceable
, by class track.table
, distance 3.
Class drawable
, by class track.table
, distance 4.
All reference classes extend and inherit methods from envRefClass
.
The following fields are inherited (from the corresponding class):
assembly (track.table)
checktrack (track.table)
colCount (refTable)
colIterator (refTable)
colNames (refTable)
colReferences (refTable)
index (track.table)
name (drawable)
organism (track.table)
parameters (drawable)
rowCount (refTable)
rowNamed (refTable)
rowNames (refTable)
sizetrack (track.table)
subtrack (track.table)
values (refTable)
The following methods are inherited (from the corresponding class):
addArms (track.table)
addColumn (track.table)
addDataFrame (refTable)
addEmptyRows (refTable)
addList (track.table)
addVectors (refTable)
buildCalls (track.table)
buildGroupPosition (track.table)
buildGroupSize (track.table)
buildIndex (track.table)
callParams (drawable)
callSuper (envRefClass)
check (track.table)
chromosomes (track.table)
coerce (track.table)
colOrder (refTable)
copy (refTable)
cross (crossable)
defaultParams (track.table, overloaded)
delColumns (track.table)
draw (sliceable)
erase (refTable)
eraseArms (track.table)
export (envRefClass)
extract (refTable)
field (envRefClass)
fill (track.table)
fix.param (drawable)
getChromEnd (track.table)
getClass (envRefClass)
getColCount (refTable)
getColNames (refTable)
getLevels (refTable)
getName (drawable)
getParam (drawable)
getRefClass (envRefClass)
getRowCount (refTable)
getRowNames (refTable)
import (envRefClass)
indexes (refTable)
initFields (envRefClass)
initialize (track.table)
isArmed (track.table)
metaFields (track.table)
rowOrder (track.table)
segMerge (track.table)
segOverlap (track.table)
setColNames (track.table)
setLevels (track.table)
setName (drawable)
setParam (drawable)
setRowNames (refTable)
show (track.table, overloaded)
size (track.table)
slice (track.table)
trace (envRefClass)
types (refTable)
untrace (envRefClass)
usingMethods (envRefClass)
Sylvain Mareschal
track.table-class
, track.table
, track.table.GTF
"track.exons"
This class is a variation of the track.table
class dedicated to exons, enforcing new drawing parameter defaults and a few specialized methods.
Objects can be created by two distinct means:
Using the corresponding constructors, which work like the track.table
constructor.
Importing a track.table
object in an empty object created by a call to new (see Examples).
Class track.table
, directly.
Class refTable
, by class track.table
, distance 2.
Class crossable
, by class track.table
, distance 2.
Class sliceable
, by class track.table
, distance 3.
Class drawable
, by class track.table
, distance 4.
All reference classes extend and inherit methods from envRefClass
.
The following fields are inherited (from the corresponding class):
assembly (track.table)
checktrack (track.table)
colCount (refTable)
colIterator (refTable)
colNames (refTable)
colReferences (refTable)
index (track.table)
name (drawable)
organism (track.table)
parameters (drawable)
rowCount (refTable)
rowNamed (refTable)
rowNames (refTable)
sizetrack (track.table)
subtrack (track.table)
values (refTable)
The following methods are inherited (from the corresponding class):
addArms (track.table)
addColumn (track.table)
addDataFrame (refTable)
addEmptyRows (refTable)
addList (track.table)
addVectors (refTable)
buildCalls (track.table)
buildGroupPosition (track.table)
buildGroupSize (track.table)
buildIndex (track.table)
callParams (drawable)
callSuper (envRefClass)
check (track.table)
chromosomes (track.table)
coerce (track.table)
colOrder (refTable)
copy (refTable)
cross (crossable)
defaultParams (track.table, overloaded)
delColumns (track.table)
draw (sliceable)
erase (refTable)
eraseArms (track.table)
export (envRefClass)
extract (refTable)
field (envRefClass)
fill (track.table)
fix.param (drawable)
getChromEnd (track.table)
getClass (envRefClass)
getColCount (refTable)
getColNames (refTable)
getLevels (refTable)
getName (drawable)
getParam (drawable)
getRefClass (envRefClass)
getRowCount (refTable)
getRowNames (refTable)
import (envRefClass)
indexes (refTable)
initFields (envRefClass)
initialize (track.table)
isArmed (track.table)
metaFields (track.table)
rowOrder (track.table)
segMerge (track.table)
segOverlap (track.table)
setColNames (track.table)
setLevels (track.table)
setName (drawable)
setParam (drawable)
setRowNames (refTable)
show (track.table, overloaded)
size (track.table)
slice (track.table)
trace (envRefClass)
types (refTable)
untrace (envRefClass)
usingMethods (envRefClass)
Sylvain Mareschal
track.table-class
, track.table
, track.table.GTF
"track.fasta"
"track.fasta"
is a drawing wrapper for FASTA files.
Notice the data are not stored directly in the object, but stay in the original FASTA file(s), thus exported track.fasta
objects may be broken (the check
method can confirm this).
Objects are produced by the track.fasta.multi
and track.fasta.collection
constructors.
Class sliceable
, directly.
Class drawable
, by class sliceable
, distance 2.
All reference classes extend and inherit methods from envRefClass
.
assembly
:Single character
value, the assembly version for the coordinates stored in the object. Must have length 1, should not be NA
.
files
:A data.frame
, with 6 columns : file (character), header (character), startOffset (numeric), lineLength (integer), breakSize (integer) and contentSize (integer). Each row refers to a distinct chromosome, whose name is stored as row name.
organism
:Single character
value, the name of the organism whose data is stored in the object. Must have length 1, should not be NA
.
The following fields are inherited (from the corresponding class):
The following methods are inherited (from the corresponding class):
callParams (drawable)
callSuper (envRefClass)
check (drawable, overloaded)
chromosomes (drawable, overloaded)
copy (envRefClass)
defaultParams (sliceable, overloaded)
draw (sliceable)
export (envRefClass)
field (envRefClass)
fix.param (drawable)
getChromEnd (sliceable, overloaded)
getClass (envRefClass)
getName (drawable)
getParam (drawable)
getRefClass (envRefClass)
import (envRefClass)
initFields (envRefClass)
initialize (drawable, overloaded)
setName (drawable)
setParam (drawable)
show (sliceable, overloaded)
slice (sliceable, overloaded)
trace (envRefClass)
untrace (envRefClass)
usingMethods (envRefClass)
Sylvain Mareschal
track.table
, sliceable-class
, drawable-class
Produces sliceable
-inheriting objects to query "in situ" FASTA files.
track.fasta.multi
is designed to handle a single multi-FASTA file aggregating all the chromosomes of an organism. An index as generated by the HTSlib (formerly "SAMtools") faidx
command is required.
track.fasta.collection
is designed to handle a collection of standard FASTA files, one per chromosome.
track.fasta.multi(fastaFile, indexFile, .name, .organism, .assembly, .parameters, warn = TRUE)
track.fasta.collection(files, chromosomes, .name, .organism, .assembly, .parameters, warn = TRUE)
fastaFile |
Single character value, the file name and path to a multi-FASTA file (.fa) to build a track upon. |
indexFile |
Single character value, the file name and path to a multi-FASTA index file (.fai) corresponding to |
.name |
Single character value, to fill the |
.organism |
Single character value, to fill the |
.assembly |
Single character value, to fill the |
.parameters |
A |
warn |
Single logical value, to be passed to the appropriate |
files |
Character vector, file names and paths to multiple single-FASTA files (.fa) to build a track upon (one per chromosome). |
chromosomes |
Character vector, chromosome names to attribute to each file in |
Both functions assume that the FASTA files meet the following conditions:
They begin with a single line of comment, after a '>' sign.
All sequence lines have the same length (whatever it is).
The line separator (\n, \r\n) is always the same in a file.
No empty line.
Returned sequences may be wrong (without any error or notice!) if any of these conditions is not fulfilled. Standard sources (see below) usually enforce them.
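These conditions matter because they allow a sequence to be read "in situ" by plain byte arithmetic, without parsing the whole file. The sketch below illustrates the idea in base R; it is not the Rgb code. The 'startOffset', 'lineLength' and 'breakSize' names are borrowed from the 'files' field of the track.fasta class, and their interpretation here (bytes before the first base, bases per line, bytes per line separator) is an assumption, as are the `fetchBases` helper and the toy file.

```r
# Toy FASTA file meeting the conditions above (written in binary mode so that
# the line separator is a single "\n" on every platform)
fasta <- tempfile(fileext = ".fa")
con <- file(fasta, "wb")
writeLines(c(">chrTest", "ACGTACGTAC", "GTTTGGGCCC", "AAAT"), con, sep = "\n")
close(con)

# Hypothetical helper: extract bases 'start' to 'end' (1-based, inclusive)
fetchBases <- function(path, start, end, startOffset, lineLength, breakSize) {
  # Byte offset of a base: header, preceding bases, and preceding line breaks
  offset <- function(pos) {
    startOffset + (pos - 1) + ((pos - 1) %/% lineLength) * breakSize
  }
  con <- file(path, "rb")
  on.exit(close(con))
  seek(con, where = offset(start))
  bytes <- readBin(con, what = "raw", n = offset(end) - offset(start) + 1)
  # Drop the line breaks embedded in the extracted chunk
  gsub("[\r\n]", "", rawToChar(bytes))
}

# Header ">chrTest" plus its line break is 9 bytes, lines hold 10 bases each
fetchBases(fasta, start = 8, end = 13, startOffset = 9, lineLength = 10, breakSize = 1)
# "TACGTT"
```

A line of a different length, an empty line or a mixed line separator would shift every following offset, which is how wrong sequences can be returned silently.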
An object of class track.fasta
.
Sylvain Mareschal
Example of FASTA file collection (human assembly 'hg19'), from UCSC : http://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/chromFa.tar.gz
Example of single multi-FASTA file (human assembly 'hg19'), from 1000 genomes : ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/human_g1k_v37.fasta.gz
The faidx documentation, from the HTSlib project.
track-constructors
, Annotation
"track.genes"
This class is a variation of the track.table
class dedicated to genes, enforcing new drawing parameter defaults and a few specialized methods.
Objects can be created by two distinct means:
Using the corresponding constructors, which work like the track.table
constructor.
Importing a track.table
object in an empty object created by a call to new (see Examples).
Class track.table
, directly.
Class refTable
, by class track.table
, distance 2.
Class crossable
, by class track.table
, distance 2.
Class sliceable
, by class track.table
, distance 3.
Class drawable
, by class track.table
, distance 4.
All reference classes extend and inherit methods from envRefClass
.
The following fields are inherited (from the corresponding class):
assembly (track.table)
checktrack (track.table)
colCount (refTable)
colIterator (refTable)
colNames (refTable)
colReferences (refTable)
index (track.table)
name (drawable)
organism (track.table)
parameters (drawable)
rowCount (refTable)
rowNamed (refTable)
rowNames (refTable)
sizetrack (track.table)
subtrack (track.table)
values (refTable)
The following methods are inherited (from the corresponding class):
addArms (track.table)
addColumn (track.table)
addDataFrame (refTable)
addEmptyRows (refTable)
addList (track.table)
addVectors (refTable)
buildCalls (track.table)
buildGroupPosition (track.table)
buildGroupSize (track.table)
buildIndex (track.table)
callParams (drawable)
callSuper (envRefClass)
check (track.table)
chromosomes (track.table)
coerce (track.table)
colOrder (refTable)
copy (refTable)
cross (crossable)
defaultParams (track.table)
delColumns (track.table)
draw (sliceable)
erase (refTable)
eraseArms (track.table)
export (envRefClass)
extract (refTable)
field (envRefClass)
fill (track.table)
fix.param (drawable)
getChromEnd (track.table)
getClass (envRefClass)
getColCount (refTable)
getColNames (refTable)
getLevels (refTable)
getName (drawable)
getParam (drawable)
getRefClass (envRefClass)
getRowCount (refTable)
getRowNames (refTable)
import (envRefClass)
indexes (refTable)
initFields (envRefClass)
initialize (track.table)
isArmed (track.table)
metaFields (track.table)
rowOrder (track.table)
segMerge (track.table)
segOverlap (track.table)
setColNames (track.table)
setLevels (track.table)
setName (drawable)
setParam (drawable)
setRowNames (refTable)
show (track.table, overloaded)
size (track.table)
slice (track.table)
trace (envRefClass)
types (refTable)
untrace (envRefClass)
usingMethods (envRefClass)
Sylvain Mareschal
track.table-class
, track.table
, track.table.GTF
"track.table"
"track.table"
describes a collection of features localized on a genome, defined by a chromosomal location ("chrom"), two boundaries ("start" and "end") and various data (other columns).
Objects can be created by two distinct means:
The track.table
constructor, similar to the data.frame
constructor. It imports a single data.frame
or a collection of vectors into the object, reorders it to meet the track.table
class restrictions (see below) and immediately checks it for validity.
The new
function (standard behavior for S4 and reference classes), which produces an empty object and does NOT check for validity. It takes as named arguments the values to store in the fields of the various classes the new object will inherit from.
The following columns are mandatory in "track.table"
, with restricted names and types :
"name" : The unique name of the feature, as character.
"chrom" : The chromosomal location of the feature, as integer or factor.
"start" : The starting position of the feature on the chromosome, as integer.
"end" : The ending position of the feature on the chromosome, as integer.
The track is supposed to be ordered by chromosome, then by starting position. When chromosomes are stored as factors
, they need to be numerically ordered by their internal codes (as the order
function does), not alphabetically by their labels.
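The difference between the two orderings can be checked in base R. In the sketch below the chromosome labels and starting positions are made up; `order()` applied to a factor sorts by internal codes, that is by level order.

```r
# Made-up features: chromosome as a factor whose levels are in karyotype order
chrom <- factor(c("10", "2", "X", "1", "2"), levels = c("1", "2", "10", "X"))
start <- c(500L, 300L, 100L, 200L, 100L)

# Expected ordering: internal codes first, then starting position
byCode <- order(chrom, start)
as.character(chrom[byCode])    # "1" "2" "2" "10" "X"

# Alphabetical ordering of the labels, which is NOT what the class expects
byLabel <- order(as.character(chrom), method = "radix")
as.character(chrom[byLabel])   # "1" "10" "2" "2" "X"
```

Declaring the levels in the intended order when building the factor is therefore enough for `order()` to produce a valid row order.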
As the slice
method relies on a row index, each update in the feature coordinates must be followed by a call to the buildIndex
method, a behavior that is enforced by overloading most of refTable
-inherited methods.
As the slice
method relies on an R call object using refTable
column references, each update in the column names must be followed by a call to the buildCalls
method, a behavior that is enforced by overloading most of refTable
-inherited methods.
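Why a stale row index gives wrong results can be sketched in base R. The helpers `buildChromIndex` and `sliceWithIndex` below are hypothetical and are not the package's C implementation; they only assume, as the `index` field description states, that the index stores the first row of each chromosome in a table sorted by chromosome then start.

```r
# Hypothetical helper: first row of each chromosome in a sorted table
buildChromIndex <- function(chrom) {
  # match() returns the position of the first occurrence of each level (NA if absent)
  idx <- match(levels(chrom), as.character(chrom))
  names(idx) <- levels(chrom)
  idx
}

# Hypothetical helper: rows overlapping [start, end] on a single chromosome
sliceWithIndex <- function(tab, index, chrom, start, end) {
  first <- index[[chrom]]
  if (is.na(first)) return(tab[0, ])
  # The chromosome block ends just before the next indexed chromosome
  later <- index[!is.na(index) & index > first]
  last <- if (length(later) > 0) min(later) - 1L else nrow(tab)
  rows <- first:last
  keep <- tab$start[rows] <= end & tab$end[rows] >= start
  tab[rows[keep], ]
}

tab <- data.frame(
  name  = c("a", "b", "c", "d"),
  chrom = factor(c("1", "1", "2", "2"), levels = c("1", "2", "3")),
  start = c(100L, 400L, 50L, 700L),
  end   = c(200L, 600L, 90L, 900L),
  stringsAsFactors = FALSE
)
index <- buildChromIndex(tab$chrom)
hits <- sliceWithIndex(tab, index, "1", 150L, 450L)
print(hits$name)

# Move feature "a" to chromosome 2 and re-sort, WITHOUT rebuilding the index
tab$chrom[1] <- "2"
tab <- tab[order(tab$chrom, tab$start), ]
staleHits <- sliceWithIndex(tab, index, "1", 0L, 1000L)

# Rebuilding the index restores a correct answer
index <- buildChromIndex(tab$chrom)
freshHits <- sliceWithIndex(tab, index, "1", 0L, 1000L)
print(staleHits$name)
print(freshHits$name)
```

In this sketch the stale index returns a chromosome 2 feature for a chromosome 1 query, which is the kind of silent error the overloaded methods are meant to prevent.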
Class refTable
, directly.
Class crossable
, directly.
Class sliceable
, by class crossable
, distance 2.
Class drawable
, by class crossable
, distance 3.
All reference classes extend and inherit methods from envRefClass
.
assembly
:Single character
value, the assembly version for the coordinates stored in the object. Must have length 1, should not be NA
.
checktrack
:A call
to the C external "checktrack", for faster object check (see the check
method).
index
:integer
vector, giving the index of the first row in each chromosome. See the subtrack
function for further details.
organism
:Single character
value, the name of the organism whose data is stored in the object. Must have length 1, should not be NA
.
sizetrack
:A call
to the C external "track", for faster row counting (see the size
method).
subtrack
:A call
to the C external "track", for faster slicing (see the slice
method).
The following fields are inherited (from the corresponding class):
colCount (refTable)
colIterator (refTable)
colNames (refTable)
colReferences (refTable)
name (drawable)
parameters (drawable)
rowCount (refTable)
rowNamed (refTable)
rowNames (refTable)
values (refTable)
addArms(centromeres, temp = )
:Adds an arm localization ('p' or 'q') to the 'chrom' column.
- centromeres : named numeric vector, providing the centromere position of each chromosome. Can also be a band track, as returned by track.bands.UCSC().
- temp : single logical value, whether to alter the object or return an altered copy.
buildCalls()
:Updates 'checktrack' and 'subtrack' fields. To be performed after each modification of colNames and colReferences (concerned methods are overloaded to enforce this).
buildGroupPosition(groupBy, colName = , reverse = )
:Adds a column to be used as 'groupPosition' by draw.boxes()
- groupBy : single character value, the name of a column to group rows on.
- colName : single character value, the name of the column to build.
- reverse : single logical value, whether to reverse numbering on reverse strand or not.
buildGroupSize(groupBy, colName = )
:Adds a column to be used as 'groupSize' by draw.boxes()
- groupBy : single character value, the name of a column to group rows on.
- colName : single character value, the name of the column to build.
buildIndex()
:Updates the 'index' field; should be done after any change made to the 'chrom' column (the methods concerned are overloaded to enforce this).
eraseArms(temp = )
:Removes 'p' and 'q' added by the addArms() method from the 'chrom' column.
- temp : single logical value, whether to alter the object or return an altered copy.
isArmed()
:Detects whether the 'chrom' column refers to whole chromosomes or chromosome arms.
segMerge(...)
:Applies the segMerge() function to the track content.
- ... : arguments to be passed to segMerge().
segOverlap(...)
:Applies the segOverlap() function to the track content.
- ... : arguments to be passed to segOverlap().
size(chrom, start, end)
:Counts elements in the specified window.
- chrom : single integer, numeric or character value, the chromosomal location.
- start : single integer or numeric value, inferior boundary of the window.
- end : single integer or numeric value, superior boundary of the window.
The following methods are inherited (from the corresponding class):
addColumn (refTable, overloaded)
addDataFrame (refTable)
addEmptyRows (refTable)
addList (refTable, overloaded)
addVectors (refTable)
callParams (drawable)
callSuper (envRefClass)
check (refTable, overloaded)
chromosomes (drawable, overloaded)
coerce (refTable, overloaded)
colOrder (refTable)
copy (refTable)
cross (crossable)
defaultParams (sliceable, overloaded)
delColumns (refTable, overloaded)
draw (sliceable)
erase (refTable)
export (envRefClass)
extract (refTable)
field (envRefClass)
fill (refTable, overloaded)
fix.param (drawable)
getChromEnd (sliceable, overloaded)
getClass (envRefClass)
getColCount (refTable)
getColNames (refTable)
getLevels (refTable)
getName (drawable)
getParam (drawable)
getRefClass (envRefClass)
getRowCount (refTable)
getRowNames (refTable)
import (envRefClass)
indexes (refTable)
initFields (envRefClass)
initialize (refTable, overloaded)
metaFields (refTable, overloaded)
rowOrder (refTable, overloaded)
setColNames (refTable, overloaded)
setLevels (refTable, overloaded)
setName (drawable)
setParam (drawable)
setRowNames (refTable)
show (refTable, overloaded)
slice (sliceable, overloaded)
trace (envRefClass)
types (refTable)
untrace (envRefClass)
usingMethods (envRefClass)
Sylvain Mareschal
track.table
, subtrack
, istrack
# Exemplar data : subset of human genes
data(hsGenes)

# Construction
trackTable <- track.table(
  hsGenes,
  .name = "NCBI Genes",
  .organism = "Homo sapiens",
  .assembly = "GRCh37"
)

# Slicing
print(trackTable$slice(chrom="1", as.integer(15e6), as.integer(20e6)))
This function handles the collision between objects to be drawn, also accounting for labels. It is provided as a convenient component of custom drawing functions, and is currently in use in draw.boxes
and draw.steps
.
yline(boxes, start, end, label, labelStrand, labelCex, labelSrt, labelAdj, labelOverflow, maxDepth)
boxes |
A |
start |
Single integer value, the left boundary of the window, in base pairs. |
end |
Single integer value, the right boundary of the window, in base pairs. |
label |
Single logical value, whether to print labels on boxes or not. |
labelStrand |
Single logical value, whether to add the strand at the end of labels or not. |
labelCex |
Single numeric value, character expansion factor for labels. |
labelSrt |
Single numeric value, string rotation angle for labels. |
labelAdj |
'left', 'right' or 'center', the horizontal adjustment of the labels on the boxes. |
labelOverflow |
Single logical value, whether to write labels on boxes too narrow to host them or not. |
maxDepth |
Single integer value, the maximum number of box heights allowed on the plot to avoid overlaps (if exhausted an error message will be plotted, turning |
Returns boxes
with an additional "yline" integer column defining the y coordinate at which the box should be drawn to avoid collision. If an error occurs, a simpleError object will be returned instead and the drawing should be aborted (see draw.boxes
code for a functional example).
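The return contract described above can be sketched with a simplified greedy layout in base R. The `assignLines` helper is hypothetical and is not the package's implementation: it ignores labels entirely and assumes `boxes` only needs integer "start" and "end" columns, sorted by start. It places each box on the lowest line that is free, and returns a simpleError object (rather than throwing it) when `maxDepth` is exhausted.

```r
# Hypothetical helper: give each box the lowest line on which it overlaps nothing
# 'boxes' is assumed to hold integer "start" and "end" columns, sorted by start
assignLines <- function(boxes, maxDepth = 100L) {
  lineEnds <- integer(0)          # right-most end currently occupying each line
  yline <- integer(nrow(boxes))
  for (i in seq_len(nrow(boxes))) {
    # Lines whose last box ends before this one starts are free
    free <- which(lineEnds < boxes$start[i])
    if (length(free) > 0) {
      line <- free[1]
    } else {
      line <- length(lineEnds) + 1L
      if (line > maxDepth) {
        # Same contract as documented: return the error object, do not raise it
        return(simpleError("'maxDepth' reached"))
      }
    }
    lineEnds[line] <- boxes$end[i]
    yline[i] <- line
  }
  boxes$yline <- yline
  boxes
}

boxes <- data.frame(
  start = c(10L, 20L, 35L, 60L),
  end   = c(30L, 50L, 55L, 80L)
)
placed <- assignLines(boxes)
print(placed$yline)

# Callers test the class of the result before drawing anything
tooDeep <- assignLines(boxes, maxDepth = 1L)
if (inherits(tooDeep, "simpleError")) {
  message("Drawing aborted: ", conditionMessage(tooDeep))
}
```

Returning the condition instead of signalling it lets the calling drawing function decide how to report the failure on the plot.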
Sylvain Mareschal