buildCorMatrix.Rd
Build a block-wise burden correlation matrix to correct for gene-gene correlations in geneSetAssoc.
Burden scores should be stored in an aggregateFile object (see aggregate).
The size of the blocks is controlled by the maxDist parameter; all gene-gene correlations beyond the block are set to zero.
This function is based on previous work (https://github.com/opain/TWAS-GSEA).
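The block structure described above can be illustrated with a small base-R sketch. It is not the package's implementation: the gene names, positions, and the flat 0.3 correlation are made up, and it assumes that a gap larger than maxDist between consecutive genes on a chromosome starts a new block.

```r
# Hypothetical gene positions (bp) on one chromosome, sorted by position
pos <- c(geneA = 1000000, geneB = 2000000, geneC = 9000000, geneD = 9500000)
maxDist <- 2500000

# Assumption: a gap larger than maxDist between neighbours starts a new block
block <- cumsum(c(TRUE, diff(pos) > maxDist))
names(block) <- names(pos)

# Toy dense matrix standing in for burden-score correlations
r <- matrix(0.3, nrow = 4, ncol = 4,
            dimnames = list(names(pos), names(pos)))
diag(r) <- 1

# Keep within-block correlations, set everything across blocks to zero
r_block <- r * outer(block, block, "==")
```

Here geneA and geneB fall in one block and geneC and geneD in another, so only the two within-block correlations remain non-zero off the diagonal.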
buildCorMatrix(
  object,
  aggregateFile,
  memlimit = 1000,
  minR2 = 1e-04,
  makePD = TRUE,
  absolute = TRUE,
  maxDist = 2500000,
  verbose = TRUE
)
object
  rvatResult object.
aggregateFile
  aggregateFile object.
memlimit
  Maximum number of units to load from the aggregateFile at a time.
minR2
  R2 values < minR2 will be set to zero (leading to increased sparsity).
makePD
  Make the correlation matrix positive definite? (TRUE/FALSE)
absolute
  Should the correlations in the matrix be absolute values? Defaults to TRUE.
maxDist
  A distance larger than maxDist defines a new block. Defaults to 2.5Mb (5Mb window).
verbose
  Should the function be verbose? Defaults to TRUE.
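To make the effect of absolute, minR2 and makePD concrete, the following is a minimal sketch on a toy matrix. It is not taken from the package source: the values are hypothetical, the order of the steps is an assumption, and Matrix::nearPD is used here only as one common way to obtain a positive definite correlation matrix.

```r
library(Matrix)  # recommended package shipped with R; provides nearPD()

# Toy correlation matrix with hypothetical values (not rvat output)
r <- matrix(c(1.000,  0.900,  0.005,
              0.900,  1.000, -0.900,
              0.005, -0.900,  1.000), nrow = 3, byrow = TRUE)
minR2 <- 1e-04

# absolute = TRUE: keep only the magnitude of each correlation
r <- abs(r)

# minR2: entries whose squared correlation is below the threshold become zero
r[r^2 < minR2] <- 0

# The thresholded matrix has a negative eigenvalue, so it is not positive definite
min_eig_before <- min(eigen(r, symmetric = TRUE)$values)

# makePD = TRUE (assumed approach): nearest positive definite correlation matrix
r_pd <- as.matrix(nearPD(r, corr = TRUE)$mat)
min_eig_after <- min(eigen(r_pd, symmetric = TRUE)$values)
```

Thresholding increases sparsity but can break positive definiteness, which is why a repair step such as the one sketched here is useful before the matrix is used in downstream models.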
library(rvatData)
data(rvbresults)
gdb <- gdb(rvat_example("rvatData.gdb"))
# generate the aggregates based on a varSetFile
varsetfile <- varSetFile(rvat_example("rvatData_varsetfile.txt.gz"))
varset <- getVarSet(varsetfile,
                    unit = c("NEK1", "SOD1", "ABCA4"),
                    varSetName = "High")
aggfile <- tempfile()
aggregate(x = gdb,
          varSet = varset,
          maxMAF = 0.001,
          output = aggfile,
          verbose = FALSE)
# build a block-wise correlation matrix
cormatrix <- buildCorMatrix(
  rvbresults,
  aggregateFile = aggregateFile(aggfile)
)
#> 139716/139740 units in the rvbResults object are not present in the aggregateFile, these will be excluded.
#> Analyzing block: 1/3
#> i = 1, loading new chunk
#> Analyzing block: 2/3
#> Analyzing block: 3/3