summarize_clusters.Rd
Summarize the terms of each cluster using the PageRank algorithm
# S3 method for BOWER
summarize_clusters(
  bower,
  cluster = NULL,
  pattern = NULL,
  sep = NULL,
  ncpus = NULL,
  disconnect_graph = FALSE,
  ...
)

# S3 method for igraph
summarize_clusters(
  graph,
  cluster = NULL,
  pattern = NULL,
  sep = NULL,
  ncpus = NULL,
  disconnect_graph = FALSE,
  ...
)
Argument | Description |
---|---|
cluster | vector of cluster labels for each geneset. |
pattern | search pattern to remove from the terms. Unless specified, will default to built-in pattern. |
sep | separator used/found in gene set names to be changed to blank spaces. Default value is underscore ('_'). |
ncpus | number of cores used for parallelizing reconstruction. |
disconnect_graph | return a graph connecting only nodes in a cluster. |
... | passed to textrank::textrank_sentences. |
graph | geneset overlap graph. |
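
The effect of `pattern` and `sep` on gene set names can be sketched in base R as below. This is an illustration only: `clean_terms` is a hypothetical helper, not a bowerbird function, and the `"^HALLMARK_"` pattern is an assumption chosen to match the example data, not necessarily the package's built-in default.

```r
# Hypothetical helper, not part of bowerbird: illustrates the assumed
# effect of `pattern` and `sep` on gene set names.
clean_terms <- function(x, pattern = "^HALLMARK_", sep = "_") {
  x <- gsub(pattern, "", x)          # remove the unwanted search pattern
  gsub(sep, " ", x, fixed = TRUE)    # change separators to blank spaces
}

cleaned <- clean_terms(c("HALLMARK_TNFA_SIGNALING_VIA_NFKB", "HALLMARK_HYPOXIA"))
print(cleaned)
```

Cleaning the names this way leaves plain space-separated terms, which is the form needed for tokenising them later.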
Returns a matrix of tf-idf scores for the tokens.
Given a list of texts, it creates a sparse matrix of tf-idf scores for the tokens in those texts. See https://github.com/saraswatmks/superml/blob/master/R/TfidfVectorizer.R for details. A k shortest-nearest neighbor graph is then computed using the overlap of the terms.
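
As a rough illustration of the tf-idf step, the sketch below builds a small dense tf-idf matrix in base R. It is not the package's code: the smoothed idf formula `log((1 + n) / (1 + df)) + 1` is an assumed, commonly used variant, the three input strings are just cleaned gene set names, and no row normalisation or sparse storage is applied.

```r
# Illustrative only: a base-R tf-idf sketch, not the package's own code.
docs <- c("TNFA SIGNALING VIA NFKB",
          "TGF BETA SIGNALING",
          "IL6 JAK STAT3 SIGNALING")

# Tokenise on blank spaces and collect the vocabulary.
tokens <- strsplit(tolower(docs), " ", fixed = TRUE)
vocab  <- sort(unique(unlist(tokens)))

# Term-frequency matrix: one row per text, one column per token.
tf <- t(vapply(
  tokens,
  function(tk) as.numeric(table(factor(tk, levels = vocab))),
  numeric(length(vocab))
))
colnames(tf) <- vocab

# Smoothed inverse document frequency (assumed variant, see lead-in).
doc_freq <- colSums(tf > 0)
idf      <- log((1 + nrow(tf)) / (1 + doc_freq)) + 1

# Scale each token column by its idf weight.
tfidf <- sweep(tf, 2, idf, `*`)
print(round(tfidf, 3))
```

Tokens shared by every text (here `signaling`) receive the lowest weight, while tokens unique to one text are weighted more heavily.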
gmt_file <- system.file("extdata", "h.all.v7.4.symbols.gmt", package = "bowerbird")
bwr <- bower(gmt_file)
bwr <- snn_graph(bwr)
bwr <- find_clusters(bwr)
bwr <- summarize_clusters(bwr, ncpus = 1)
bwr
#> BOWER class
#> number of genesets: 50
#> genesets kNN Graph:
#> IGRAPH 64116cd UNW- 50 124 --
#> + attr: name (v/c), cluster (v/n), geneset_size (v/n), terms (v/c),
#> | labels (v/c), weight (e/n)
#> + edges from 64116cd (vertex names):
#> [1] HALLMARK_TNFA_SIGNALING_VIA_NFKB--HALLMARK_HYPOXIA
#> [2] HALLMARK_TNFA_SIGNALING_VIA_NFKB--HALLMARK_TGF_BETA_SIGNALING
#> [3] HALLMARK_TNFA_SIGNALING_VIA_NFKB--HALLMARK_IL6_JAK_STAT3_SIGNALING
#> [4] HALLMARK_TNFA_SIGNALING_VIA_NFKB--HALLMARK_APOPTOSIS
#> [5] HALLMARK_TNFA_SIGNALING_VIA_NFKB--HALLMARK_MYOGENESIS
#> [6] HALLMARK_TNFA_SIGNALING_VIA_NFKB--HALLMARK_COMPLEMENT
#> + ... omitted several edges
#> number of geneset clusters: 9
#> Core genes:
#> First six genes shown
#> XENOBIOTIC METABOLISM : LIFR DNAJB9 CD36 ACOX1 IDH1 ECH1 ...
#> E2F TARGETS : SAC3D1 KIF11 KIF23 RACGAP1 NUMA1 KIF2C ...
#> ESTROGEN RESPONSE EARLY : JAG1 CTNNB1 GNAI1 FDFT1 DHCR7 FASN ...
#> APOPTOSIS : ATF3 IER3 BIRC3 JUN EGR3 IL1B ...
#> INTERFERON ALPHA RESPONSE : MX1 ISG15 IFIT3 IFI44 IFI35 IRF7 ...
#> HEDGEHOG SIGNALING : VEGFA VLDLR MYH9 ERO1A DDIT4 STC2 ...
#> APICAL JUNCTION : EGFR ADAM10 CLTC AP2M1 ARF1 MAPK1 ...
#> TGF BETA SIGNALING : TGFB1 PMEPA1 SERPINE1 ID2 THBS1 PPP1R15A ...
#> IL6 JAK STAT3 SIGNALING : IL4R IFNGR1 IL1R2 IL3RA TNFRSF1B CSF1 ...