Generate optionally weighted variant sets using annotation table(s) uploaded to the gdb. See the tutorials for examples.

# S4 method for gdb
buildVarSet(
  object,
  varSetName,
  unitTable,
  unitName,
  output = NULL,
  intersection = NULL,
  where = NULL,
  weightName = "1",
  verbose = TRUE
)

Arguments

object

a gdb object.

varSetName

Name to assign to the varSet grouping. This identifier column allows subsequent merging of multiple varSet files for coordinated analysis of multiple variant filtering/weighting strategies.

unitTable

Table containing aggregation unit mappings.

unitName

Field to use for aggregation unit names.

output

Output file name (output will be gz-compressed text).

intersection

Additional tables to filter through intersection (i.e. variants absent from the intersection tables will not appear in the output). Multiple tables should be ','-delimited.

where

An SQL-compliant WHERE clause used to filter the output, e.g. "CHROM=2 AND POS between 5000 AND 50000 AND AF<0.01 AND (cadd.caddPhred>15 OR snpEff.SIFT='D')".

weightName

Field name for the desired variant weighting; must be a column within unitTable or one of the intersection tables. The default value of "1" is equivalent to no weighting.

verbose

Should the function be verbose? Defaults to TRUE.

Examples


library(rvatData)

# Build a varset including variants with a moderate predicted impact
gdb <- create_example_gdb()
varsetfile_moderate <- tempfile()
buildVarSet(object = gdb, 
            output = varsetfile_moderate,
            varSetName = "Moderate", 
            unitTable = "varInfo", 
            unitName = "gene_name",
            where = "ModerateImpact = 1")
#> Generated varSetFile: /var/folders/cl/wvc0rvjx4vd5rzt2_fhpmfth0000gp/T//RtmpRC6myU/file1825c6b5961f5
#> varSetFile object
#> Path:/var/folders/cl/wvc0rvjx4vd5rzt2_fhpmfth0000gp/T//RtmpRC6myU/file1825c6b5961f5
#> Units:12

# Build a varset that contains CADD scores
varsetfile_cadd <- tempfile()
buildVarSet(object = gdb, 
            output = varsetfile_cadd,
            varSetName = "CADD", 
            unitTable = "varInfo", 
            unitName = "gene_name",
            weightName = "CADDphred")
#> Generated varSetFile: /var/folders/cl/wvc0rvjx4vd5rzt2_fhpmfth0000gp/T//RtmpRC6myU/file1825c392e52e2
#> varSetFile object
#> Path:/var/folders/cl/wvc0rvjx4vd5rzt2_fhpmfth0000gp/T//RtmpRC6myU/file1825c392e52e2
#> Units:12
            
# connect to varsetfile and retrieve variant sets
varsetfile <- varSetFile(varsetfile_moderate)
varsets <- getVarSet(varsetfile, unit = c("SOD1", "FUS"))

# see ?getVarSet, ?varSetFile and ?varSetList for more details on connecting and handling varsetfiles.
# see e.g., ?assocTest and ?aggregate for downstream methods that can loop through varsetfiles and varsetlists.
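
# A hedged sketch (not part of the original examples): filtering and weighting
# can in principle be combined in a single call, since `where` and `weightName`
# are independent arguments.
# Assumptions: the `gdb` object created above is still available, and the
# "varInfo" table holds both the ModerateImpact and CADDphred columns used in
# the earlier examples. The CADD cutoff of 15 and the varSetName "ModerateCADD"
# are arbitrary illustrative choices.
conditions <- c("ModerateImpact = 1", "CADDphred > 15")
# Join the individual conditions into one SQL WHERE clause
where_clause <- paste(conditions, collapse = " AND ")
varsetfile_combined <- tempfile()
buildVarSet(object = gdb,
            output = varsetfile_combined,
            varSetName = "ModerateCADD",
            unitTable = "varInfo",
            unitName = "gene_name",
            where = where_clause,
            weightName = "CADDphred")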