An S4 class to manage gene sets. A geneSetList can be generated from GMT files or from a list of gene sets using the buildGeneSet() method. Specific gene sets can be retrieved using the getGeneSet() method. A geneSetList can be used directly as input to geneSetAssoc() or assocTest() to perform gene set analyses. The on-disk equivalent of a geneSetList is a geneSetFile(), from which gene sets can be retrieved with getGeneSet() in the same way as from a geneSetList.

Build a geneSetList

  • buildGeneSet(x, ...): Generate a geneSetList or geneSetFile() that stores (weighted) gene sets for use in gene set analyses. The input can be either 1) GMT files (https://www.gsea-msigdb.org/gsea/msigdb/) or 2) a list with one vector of units per gene set. See buildGeneSet() for details.
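
To make the two input types concrete, the following base R sketch writes a small GMT file and parses it into a named list of character vectors, which is the list shape passed to buildGeneSet() in the Examples section. It assumes the usual GMT layout (one gene set per line, tab-separated: set name, description, then the members). The helper read_gmt_as_list() is hypothetical and not part of the package; buildGeneSet() has its own GMT support, so this is only an illustration of how the two inputs relate.

```r
# Write a tiny GMT file: one gene set per line, tab-separated fields
# (set name, description, then the member genes).
gmt_path <- tempfile(fileext = ".gmt")
writeLines(
  c("geneset1\tdescription1\tSOD1\tNEK1",
    "geneset2\tdescription2\tABCA4\tSOD1\tNEK1"),
  gmt_path
)

# Hypothetical helper (not part of the package): parse a GMT file into a
# named list with one character vector of units per gene set.
read_gmt_as_list <- function(path) {
  fields <- strsplit(readLines(path), "\t", fixed = TRUE)
  sets <- lapply(fields, function(f) f[-(1:2)])  # drop name and description
  names(sets) <- vapply(fields, function(f) f[1], character(1))
  sets
}

gene_sets <- read_gmt_as_list(gmt_path)
str(gene_sets)

# With the package loaded, a list of this shape can be passed on directly:
# genesetlist <- buildGeneSet(gene_sets)
```

The buildGeneSet() call is left commented out so that the sketch runs without the package installed.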

Getters

In the following code snippets, x is a geneSetList object.

  • getGeneSet(x, set): Retrieve the gene sets with the specified names.

  • listGeneSets(x): Return a vector with the names of all gene sets included in the geneSetList.

  • listUnits(x): Return a vector of all unique units included across the gene sets in the geneSetList.

Subsetting

A geneSetList can be subset in the same way as a normal R list. However, getGeneSet() is the most convenient way to select gene sets by name (see above). Some examples, where x is a geneSetList object:

  • x[[i]]: Return the i-th geneSet.

  • x[i:j]: Return the i-th through j-th geneSets as a new geneSetList.

Gene set analyses

A geneSetList can be supplied directly to the geneSetAssoc() method through the geneSet parameter, in combination with an rvbResult object. To perform gene set burden analyses, use assocTest().

Miscellaneous

In the following code snippets, x is a geneSetList object:

  • write(x, file = "data", append = FALSE): Write the geneSetList to disk, in the geneSetFile format.

  • remapIDs(x, ...): Remap the IDs used in the gene sets; see remapIDs for details.

  • dropUnits(x, unit = NULL): Remove specified units from geneSets included in the geneSetList.

Examples

library(rvatData)

# build a geneSetList
# can also be built from a GMT-file (see ?buildGeneSet)
genesetlist <- buildGeneSet(
  list("geneset1" = c("SOD1", "NEK1"),
       "geneset2" = c("ABCA4", "SOD1", "NEK1"),
       "geneset3" = c("FUS", "NEK1")
       )
  )

# extract a couple of gene sets from the geneSetList, which will return a new geneSetList
getGeneSet(genesetlist, c("geneset1", "geneset2"))
#> geneSetList
#> Contains 2 sets

# list included gene sets and units
genesets <- listGeneSets(genesetlist)
head(genesets)
#> [1] "geneset1" "geneset2" "geneset3"
units <- listUnits(genesetlist)
head(units)
#> [1] "SOD1"  "NEK1"  "ABCA4" "FUS"  

# several basic list operations work on a geneSetList
length(genesetlist)
#> [1] 3
genesetlist[1:2]
#> geneSetList
#> Contains 2 sets
genesetlist[[1]]
#> An object of class "geneSet"
#> Slot "geneSetName":
#> [1] "geneset1"
#> 
#> Slot "units":
#> [1] "SOD1,NEK1"
#> 
#> Slot "w":
#> [1] "1,1"
#> 
#> Slot "metadata":
#> [1] ""
#> 

# write a geneSetList to a geneSetFile on disk (see ?geneSetFile)
file <- tempfile()
write(genesetlist, file)
genesetfile <- geneSetFile(file)

# exclude units from all genesets included in a geneSetList
dropUnits(genesetlist, unit = "NEK1")
#> geneSetList
#> Contains 3 sets

# remap IDs
linker <- data.frame(
  gene_name = c("SOD1", "NEK1", "FUS", "ABCA4"),
  gene_id = c("ENSG00000142168","ENSG00000137601", "ENSG00000089280", "ENSG00000198691")
)
genesetlist_remapped <- remapIDs(genesetlist, linker)
#> 4/4 IDs in the geneSetList are present in the linker file.
listUnits(genesetlist_remapped)
#> [1] "ENSG00000142168" "ENSG00000137601" "ENSG00000198691" "ENSG00000089280"

# see ?geneSetAssoc and ?`assocTest-aggregateFile` to run gene set analyses