Build a block-wise burden correlation matrix to correct for gene-gene correlations in geneSetAssoc. Burden scores should be stored in an aggdb object (see aggregate). The size of the blocks is controlled by the maxDist parameter; all gene-gene correlations beyond a block are set to zero. This function is based on previous work (https://github.com/opain/TWAS-GSEA).

buildCorMatrix(
  object,
  aggdb,
  memlimit = 1000,
  minR2 = 1e-04,
  makePD = TRUE,
  absolute = TRUE,
  maxDist = 2500000,
  verbose = TRUE
)

Arguments

object

An rvatResult object.

aggdb

An aggdb object.

memlimit

Maximum number of units to load from the aggdb at a time. Defaults to 1000.

minR2

Correlations with R2 < minR2 are set to zero (increasing sparsity). Defaults to 1e-04.

makePD

Make the correlation matrix positive definite? (TRUE/FALSE) Defaults to TRUE.

absolute

Should the correlation matrix contain absolute values? Defaults to TRUE.

maxDist

A distance larger than maxDist defines a new block. Defaults to 2.5 Mb (5 Mb window).

verbose

Should the function be verbose? Defaults to TRUE.
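
To illustrate how maxDist, minR2 and absolute might interact, here is a minimal base R sketch. It is not the rvat implementation: the gene names, positions and simulated burden scores are hypothetical. It also assumes that a new block starts on a new chromosome, or when the gap between consecutive genes exceeds maxDist. The documentation above does not specify how the distance is measured.

```r
# Hypothetical gene positions (not taken from rvatData)
genes <- data.frame(
  unit  = c("geneA", "geneB", "geneC", "geneD"),
  chrom = c("chr1", "chr1", "chr1", "chr2"),
  pos   = c(1000000, 2000000, 9000000, 500000),
  stringsAsFactors = FALSE
)
maxDist <- 2500000
minR2 <- 1e-04

# Sort genes by chromosome and position
genes <- genes[order(genes$chrom, genes$pos), ]

# Assumption: a new block starts on a new chromosome, or when the gap
# to the previous gene exceeds maxDist
newChrom <- c(TRUE, genes$chrom[-1] != genes$chrom[-nrow(genes)])
gap <- c(0, diff(genes$pos))
genes$block <- cumsum(newChrom | gap > maxDist)
genes$block
#> [1] 1 1 2 3

# Simulated burden scores: 50 samples (rows) x 4 genes (columns)
set.seed(1)
scores <- matrix(
  rnorm(50 * nrow(genes)),
  nrow = 50,
  dimnames = list(NULL, genes$unit)
)

# Gene-gene correlations; pairs in different blocks are set to zero
cormat <- cor(scores)
sameBlock <- outer(genes$block, genes$block, "==")
cormat[!sameBlock] <- 0

# Mimic minR2 (zero out weak correlations) and absolute = TRUE
cormat[cormat^2 < minR2] <- 0
cormat <- abs(cormat)
```

In this sketch only geneA and geneB share a block, so theirs is the only off-diagonal correlation that can remain non-zero.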

Examples

library(rvatData)
data(rvbresults)
gdb <- create_example_gdb()

# generate the aggregates based on a varSetFile
varsetfile <- varSetFile(rvat_example("rvatData_varsetfile.txt.gz"))
varset <- getVarSet(
  varsetfile,
  unit = c("NEK1", "SOD1", "ABCA4"),
  varSetName = "High"
)
aggfile <- tempfile()
aggregate(
  x = gdb,
  varSet = varset,
  maxMAF = 0.001,
  output = aggfile,
  verbose = FALSE
)

# build a block-wise correlation matrix
cormatrix <- buildCorMatrix(
  rvbresults,
  aggdb = aggdb(aggfile)
)
#> 139716/139740 units in the rvbResults object are not present in the aggdb, these will be excluded.
#> Analyzing block: 1/3
#> i = 1, loading new chunk
#> Analyzing block: 2/3
#> Analyzing block: 3/3