mapVariants
Method to map the variants in a gdb to a set of ranges or features.
The input can be a set of ranges (CHROM, start, end), a bed-file or a gff/gtf-file.
Variants in the gdb will be mapped onto those ranges and annotated with the features/columns
included in the input file.
For example, variants can be mapped onto genomic features downloaded in gff format from Ensembl.
The output can be written to disk (output parameter) or directly uploaded to the gdb (uploadName parameter).
mapVariants(
  object,
  ranges = NULL,
  gff = NULL,
  bed = NULL,
  bedCols = character(),
  fields = NULL,
  uploadName = NULL,
  output = NULL,
  sep = "\t",
  skipIndexes = FALSE,
  overWrite = FALSE,
  verbose = TRUE
)
object
A gdb object.
ranges
The ranges to map onto. Can be 1) a data.frame including at least 'CHROM', 'start', and 'end' columns; 2) a GenomicRanges::GRanges object; or 3) a filepath to a ranges file containing at least 'CHROM', 'start', and 'end' columns. For a ranges file, the separator can be specified using the sep parameter (defaults to \t).
gff
Path to a gff- or gtf-file.
bed
Path to a bed-file. Specify extra columns using the bedCols parameter.
bedCols
A character vector of names of the extra columns to read from the bed-file. Optionally, the vector can be named to indicate the classes of the columns (e.g. c("gene_id" = "character", "gene_name" = "character")). If not named, all extra columns will be read as character columns (see examples).
fields
Feature fields to keep. Defaults to NULL, in which case all fields are kept.
uploadName
Name of the table to upload to the gdb. If not specified, either specify output to write the results to disk, or the results will be returned in the R session.
output
Optional output file path. Can be used instead of uploadName to write the results to disk.
sep
Field separator, relevant if ranges is a filepath. Defaults to \t.
skipIndexes
Flag indicating whether to skip indexing of the imported table. Relevant if uploadName is specified, and thus the output table is imported into the gdb. Defaults to FALSE.
overWrite
If uploadName is specified, should an existing table in the gdb with the same name be overwritten? Defaults to FALSE.
verbose
Should the method be verbose? Defaults to TRUE.
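The ranges, sep, fields and output arguments can be combined as in the sketch below, which maps variants onto a comma-separated ranges file and writes the result to disk. It is a sketch under assumptions rather than a reference implementation. It assumes the ranges file carries a header row with the column names, and rangesfile and outfile are hypothetical temporary paths. The mapping step is guarded so that the file-preparation part runs even where rvat and rvatData are not installed.

```r
# Ranges with one annotation column (same coordinates as in the examples).
ranges <- data.frame(
  CHROM = c("chr21", "chr4"),
  start = c(31659666, 169369704),
  end = c(31668931, 169612632),
  gene_name = c("SOD1", "NEK1")
)

# Write a comma-separated ranges file.
# Assumption: mapVariants reads the column names from a header row.
rangesfile <- tempfile(fileext = ".csv")
write.table(ranges, file = rangesfile, sep = ",",
            quote = FALSE, row.names = FALSE)

# Hypothetical path for the mapping results.
outfile <- tempfile(fileext = ".txt")

# Run the mapping only if the required packages are available.
if (requireNamespace("rvat", quietly = TRUE) &&
    requireNamespace("rvatData", quietly = TRUE)) {
  library(rvat)
  library(rvatData)
  gdb <- create_example_gdb()
  mapVariants(gdb,
              ranges = rangesfile,
              sep = ",",               # separator of the ranges file
              fields = "gene_name",    # keep only this feature field
              output = outfile,        # write to disk instead of uploading
              verbose = FALSE)
}
```

Leaving out both output and uploadName would, per the uploadName description, return the results in the R session instead.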
library(rvat)
library(rvatData)
library(rtracklayer)
library(GenomicRanges)
gdb <- create_example_gdb()
# map variants to gene models
ranges <- GRanges(
  seqnames = c("chr21", "chr4"),
  ranges = IRanges(
    start = c(31659666, 169369704),
    end = c(31668931, 169612632)
  ),
  gene_name = c("SOD1", "NEK1")
)
mapVariants(gdb,
            ranges = ranges,
            uploadName = "gene",
            verbose = FALSE)
# similarly, ranges can be a data.frame
ranges <- data.frame(
  CHROM = c("chr21", "chr4"),
  start = c(31659666, 169369704),
  end = c(31668931, 169612632),
  gene_name = c("SOD1", "NEK1")
)
mapVariants(gdb,
            ranges = ranges,
            uploadName = "gene",
            verbose = FALSE,
            overWrite = TRUE)
#> Table 'gene' already exists, it will be overwritten (as `overWrite=TRUE`)
#> Table 'gene' removed from gdb
# often you'd want to map variants to a large set of ranges, such as Ensembl gene models
# mapVariants supports several file formats, including gff/gtf, bed and ranges files
# map variants using a gtf file
gtffile <- tempfile(fileext = ".gtf")
rtracklayer::export(makeGRangesFromDataFrame(ranges),
                    con = gtffile,
                    format = "gtf")
mapVariants(gdb,
            gff = gtffile,
            uploadName = "gene",
            verbose = FALSE,
            overWrite = TRUE)
#> Table 'gene' already exists, it will be overwritten (as `overWrite=TRUE`)
#> Table 'gene' removed from gdb
# map variants using a bed file
bedfile <- tempfile(fileext = ".bed")
rtracklayer::export(makeGRangesFromDataFrame(ranges),
                    con = bedfile,
                    format = "bed")
mapVariants(gdb,
            bed = bedfile,
            uploadName = "gene",
            verbose = FALSE,
            overWrite = TRUE)
#> Table 'gene' already exists, it will be overwritten (as `overWrite=TRUE`)
#> Table 'gene' removed from gdb
# see the variant annotation tutorial on the rvat website for more details
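The bedCols argument is not exercised above, so the following sketch shows one way it might be used, with a named vector giving the class of each extra column. It rests on assumptions not confirmed here: that the extra columns are taken in file order after the three standard BED columns, and that the class names follow base R conventions ("character", "integer"). The idx column is an arbitrary illustrative value, not real annotation. The mapping step is guarded so the file-writing part runs without rvat installed.

```r
# Build a small BED file by hand: CHROM, start, end, then two extra columns.
# BED start coordinates are 0-based by convention, hence start - 1.
# "idx" is an arbitrary illustrative column, not real annotation.
bed_lines <- c(
  paste("chr21", 31659665, 31668931, "SOD1", 1, sep = "\t"),
  paste("chr4", 169369703, 169612632, "NEK1", 2, sep = "\t")
)
bedfile <- tempfile(fileext = ".bed")
writeLines(bed_lines, bedfile)

# Named vector: names are the extra column names, values their classes.
bedCols <- c(gene_name = "character", idx = "integer")

# Run the mapping only if the required packages are available.
if (requireNamespace("rvat", quietly = TRUE) &&
    requireNamespace("rvatData", quietly = TRUE)) {
  library(rvat)
  library(rvatData)
  gdb <- create_example_gdb()
  mapVariants(gdb,
              bed = bedfile,
              bedCols = bedCols,
              uploadName = "gene",
              verbose = FALSE,
              overWrite = TRUE)
}
```

Passing an unnamed vector such as c("gene_name", "idx") would, per the bedCols description, read both extra columns as character.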