buildCorMatrix

Build a block-wise burden correlation matrix to correct for gene-gene correlations in geneSetAssoc. Burden scores should be stored in an aggregateFile object (see aggregate). The size of the blocks is controlled by the maxDist parameter; all gene-gene correlations beyond a block are set to zero. This function is based on previous work (https://github.com/opain/TWAS-GSEA).
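The block-wise idea described above can be sketched in base R. This is an illustration on toy data, not rvat's implementation: the scores, gene positions, and the name `block_cor` are hypothetical, and the rule used to start a new block (a gap between consecutive genes larger than maxDist) is an assumption made for the sketch.

```r
# Illustration only: toy data, not rvat internals.
set.seed(1)
n <- 200

# Hypothetical burden scores for 5 genes (columns) in 200 samples (rows)
scores <- matrix(rnorm(n * 5), nrow = n,
                 dimnames = list(NULL, paste0("gene", 1:5)))
# Make gene2 correlate with gene1 so one within-block entry is clearly non-zero
scores[, "gene2"] <- scores[, "gene1"] + rnorm(n, sd = 0.5)

# Hypothetical gene positions (bp) on a single chromosome, sorted
pos <- c(1e6, 1.5e6, 2e6, 9e6, 9.4e6)
maxDist <- 2.5e6

# Assumption: a new block starts when the gap to the previous gene exceeds maxDist
block <- cumsum(c(TRUE, diff(pos) > maxDist))

# Fill in correlations within each block; cross-block entries stay zero
block_cor <- matrix(0, ncol(scores), ncol(scores),
                    dimnames = list(colnames(scores), colnames(scores)))
for (b in unique(block)) {
  idx <- which(block == b)
  block_cor[idx, idx] <- cor(scores[, idx, drop = FALSE])
}

print(round(block_cor, 2))
```

In this toy setup the first three genes form one block and the last two form another, so every entry linking the two groups is exactly zero.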

buildCorMatrix(
  object,
  aggregateFile,
  memlimit = 1000,
  minR2 = 1e-04,
  makePD = TRUE,
  absolute = TRUE,
  maxDist = 2500000,
  verbose = TRUE
)

Arguments

object: rvatResult object
aggregateFile: aggregateFile object
memlimit: Maximum number of units to load from the aggregateFile at a time. Defaults to 1000.
minR2: R2 values below minR2 are set to zero, which increases sparsity. Defaults to 1e-04.
makePD: Make the correlation matrix positive definite? (TRUE/FALSE)
absolute: Should the correlation matrix contain absolute values? Defaults to TRUE.
maxDist: A distance (in base pairs) larger than maxDist defines a new block. Defaults to 2500000 (2.5 Mb, i.e. a 5 Mb window).
verbose: Should the function be verbose? Defaults to TRUE.
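
To make the effect of absolute, minR2 and makePD concrete, here is a minimal base R sketch on a hand-made 3 x 3 matrix. It is not rvat's code: the order of the steps and the eigenvalue-clipping repair are assumptions chosen for illustration, and `R`, `R_pd` and the floor of 1e-6 are hypothetical.

```r
# Illustration only: a small symmetric matrix with unit diagonal that is
# deliberately not positive definite.
R <- matrix(c(1.000,  0.900,  0.005,
              0.900,  1.000, -0.900,
              0.005, -0.900,  1.000), nrow = 3, byrow = TRUE)

minR2 <- 1e-4

# absolute = TRUE: keep only the magnitude of each correlation
R <- abs(R)

# minR2: entries whose squared value falls below the threshold become zero
R[R^2 < minR2] <- 0

# The thresholded matrix still has a negative eigenvalue
e <- eigen(R, symmetric = TRUE)
print(min(e$values))

# makePD = TRUE (assumed approach): clip eigenvalues at a small positive
# floor, rebuild the matrix, and rescale so the diagonal is 1 again
vals <- pmax(e$values, 1e-6)
R_pd <- e$vectors %*% diag(vals) %*% t(e$vectors)
R_pd <- cov2cor(R_pd)

print(min(eigen(R_pd, symmetric = TRUE)$values) > 0)
```

Eigenvalue clipping is only one simple way to obtain a positive definite matrix; other approaches (for example a nearest-correlation-matrix algorithm) would change the off-diagonal values differently.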

References

https://github.com/opain/TWAS-GSEA

Examples


library(rvatData)
data(rvbresults)
gdb <- gdb(rvat_example("rvatData.gdb"))

# generate the aggregates based on a varSetFile
varsetfile <- varSetFile(rvat_example("rvatData_varsetfile.txt.gz"))
varset <- getVarSet(varsetfile, 
                    unit = c("NEK1", "SOD1", "ABCA4"), 
                    varSetName = "High")
aggfile <- tempfile()
aggregate(x = gdb,
          varSet = varset,
          maxMAF = 0.001,
          output = aggfile,
          verbose = FALSE)

# build a block-wise correlation matrix
cormatrix <- buildCorMatrix(
  rvbresults,
  aggregateFile = aggregateFile(aggfile)
)
#> 139716/139740 units in the rvbResults object are not present in the aggregateFile, these will be excluded.
#> Analyzing block: 1/3
#> i = 1, loading new chunk
#> Analyzing block: 2/3
#> Analyzing block: 3/3