`pymodulon.enrichment`

Contains functions for gene set enrichment analysis

Module Contents

Functions

`contingency`(set1, set2, all_genes)	Creates contingency table for gene enrichment
`compute_enrichment`(gene_set, target_genes, all_genes, label=None)	Computes enrichment statistic for gene_set in target_genes.
`FDR`(p_values, fdr, total=None)	Runs false discovery rate correction for a table of statistics
`parse_regulon_str`(regulon_str, trn)	Converts a complex regulon (regulon_str) into a list of genes
`compute_regulon_enrichment`(gene_set, regulon_str, all_genes, trn)	Computes enrichment statistics for a gene_set in a regulon
`compute_trn_enrichment`(gene_set, all_genes, trn, max_regs=1, fdr=0.01, method='both', force=False)	Compare a gene set against an entire TRN
`compute_annotation_enrichment`(gene_set, all_genes, annotation, column, fdr=0.01)	Compare a gene set against a dataframe of gene annotations

pymodulon.enrichment.contingency(set1, set2, all_genes)

Creates contingency table for gene enrichment

Parameters

set1 (set) – Set of genes (e.g. iModulon)
set2 (set) – Set of genes (e.g. regulon)
all_genes (set) – Set of all genes

Returns

Contingency table

Return type

np.ndarray
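
As a minimal sketch of what such a table can look like (not necessarily the layout pymodulon uses), the four cells of a 2x2 contingency table can be counted with set operations. The function name `contingency_sketch`, the cell order `[[TP, FP], [FN, TN]]`, and the gene names are assumptions for illustration, and a nested list stands in for the `np.ndarray` that the library returns.

```python
def contingency_sketch(set1, set2, all_genes):
    """Count a 2x2 table as [[TP, FP], [FN, TN]] (assumed layout)."""
    set1, set2, all_genes = set(set1), set(set2), set(all_genes)
    tp = len(set1 & set2)              # genes in both sets
    fp = len(set1 - set2)              # genes only in set1
    fn = len(set2 - set1)              # genes only in set2
    tn = len(all_genes - set1 - set2)  # genes in neither set
    return [[tp, fp], [fn, tn]]


# Hypothetical gene names, for illustration only
all_genes = {"g1", "g2", "g3", "g4", "g5", "g6"}
imodulon = {"g1", "g2", "g3"}
regulon = {"g2", "g3", "g4"}
table = contingency_sketch(imodulon, regulon, all_genes)
```

When both sets are subsets of `all_genes`, the four cells sum to the total number of genes, which is a quick sanity check on the inputs.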

pymodulon.enrichment.compute_enrichment(gene_set, target_genes, all_genes, label=None)

Computes enrichment statistic for gene_set in target_genes.

Parameters

gene_set (list) – Gene set for enrichment (e.g. genes in iModulon)
target_genes (list) – Genes to be enriched against (e.g. genes in regulon or GO term)
all_genes (list) – Set of all genes
label (list) – Label for target_genes (e.g. regulator name or GO term)

Returns

Table containing statistically significant enrichments

Return type

pd.Series
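
One common way to score the overlap between two gene sets is a one-sided Fisher's exact test, reported alongside precision, recall and F1. The sketch below uses only the standard library and assumes both sets are subsets of `all_genes`. Whether pymodulon computes exactly these statistics is an assumption here, as are the function name `enrichment_sketch` and every result key other than `pvalue`.

```python
from math import comb


def enrichment_sketch(gene_set, target_genes, all_genes):
    """One-sided Fisher's exact test plus precision, recall and F1."""
    gene_set, target_genes = set(gene_set), set(target_genes)
    n_all = len(set(all_genes))
    n_set, n_target = len(gene_set), len(target_genes)
    tp = len(gene_set & target_genes)

    # P(overlap >= tp) under the hypergeometric null distribution
    pvalue = sum(
        comb(n_target, k) * comb(n_all - n_target, n_set - k)
        for k in range(tp, min(n_set, n_target) + 1)
    ) / comb(n_all, n_set)

    precision = tp / n_set if n_set else 0.0
    recall = tp / n_target if n_target else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    # Key names other than "pvalue" are assumptions for this sketch
    return {"pvalue": pvalue, "precision": precision,
            "recall": recall, "f1score": f1, "TP": tp}


# Hypothetical data: 10 genes, a 4-gene target set, a 3-gene query set
all_genes = [f"g{i}" for i in range(10)]
target = ["g0", "g1", "g2", "g3"]
query = ["g0", "g1", "g2"]
result = enrichment_sketch(query, target, all_genes)
```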

pymodulon.enrichment.FDR(p_values, fdr, total=None)

Runs false discovery rate correction for a table of statistics

Parameters

p_values (DataFrame) – DataFrame with a ‘pvalue’ column
fdr (float) – False discovery rate
total (int) – Total number of tests (for multi-enrichment)

Returns

Table containing entries that passed multiple hypothesis correction

Return type

DataFrame
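
To illustrate how a false discovery rate cutoff and a `total` test count can interact, here is a minimal sketch of the Benjamini-Hochberg procedure on a plain list of p-values. It is not taken from pymodulon: the function name `fdr_sketch` and the list-based input are assumptions, and the library works on a DataFrame with a ‘pvalue’ column instead.

```python
def fdr_sketch(p_values, fdr, total=None):
    """Return indices of p-values that pass Benjamini-Hochberg at level fdr."""
    # Number of tests: the caller's total if given, else the list length
    m = total if total is not None else len(p_values)
    order = sorted(range(len(p_values)), key=lambda i: p_values[i])

    # Find the largest rank k whose p-value is at or below (k / m) * fdr
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff = rank

    # Everything ranked at or before the cutoff is kept
    return sorted(order[:cutoff])


pvals = [0.041, 0.001, 0.6, 0.008, 0.039]
passed = fdr_sketch(pvals, fdr=0.05)                   # m = 5 tests
passed_strict = fdr_sketch(pvals, fdr=0.05, total=20)  # m = 20 tests
```

Passing a larger `total` tightens every threshold, so fewer entries survive when the same p-values are treated as part of a bigger family of tests.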

pymodulon.enrichment.parse_regulon_str(regulon_str, trn)

Converts a complex regulon (regulon_str) into a list of genes

Parameters

regulon_str (str) – Complex regulon, where “/” uses genes in any regulon and “+” uses genes in all regulons
trn (DataFrame) – Table containing transcriptional regulatory network

Returns

reg_genes – Set of genes regulated by regulon_str

Return type

set
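
The “/” and “+” semantics described above can be sketched with plain set operations. In this illustration a dictionary mapping each regulator to its set of genes stands in for the `trn` DataFrame. That dictionary, the function name `parse_regulon_sketch`, and the regulator and gene names are all assumptions, not part of pymodulon.

```python
def parse_regulon_sketch(regulon_str, regulons):
    """Resolve 'A/B' (union) or 'A+B' (intersection) to a set of genes."""
    if "/" in regulon_str and "+" in regulon_str:
        raise ValueError("this sketch does not mix '/' and '+'")
    if "+" in regulon_str:
        # "+" keeps only genes present in all listed regulons
        gene_sets = [regulons[r] for r in regulon_str.split("+")]
        return set.intersection(*gene_sets)
    # "/" (or a single regulator) keeps genes present in any listed regulon
    gene_sets = [regulons[r] for r in regulon_str.split("/")]
    return set.union(*gene_sets)


# Hypothetical regulators and genes, for illustration only
regulons = {
    "regA": {"g1", "g2", "g3"},
    "regB": {"g2", "g3", "g4"},
}
either = parse_regulon_sketch("regA/regB", regulons)
both = parse_regulon_sketch("regA+regB", regulons)
single = parse_regulon_sketch("regA", regulons)
```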

pymodulon.enrichment.compute_regulon_enrichment(gene_set, regulon_str, all_genes, trn)

Computes enrichment statistics for a gene_set in a regulon

Parameters

gene_set (set) – Gene set for enrichment (e.g. genes in iModulon)
regulon_str (str) – Complex regulon, where “/” uses genes in any regulon and “+” uses genes in all regulons
all_genes (set) – Set of all genes
trn (DataFrame) – Table containing transcriptional regulatory network

Returns

result – Table containing statistically significant enrichments

Return type

DataFrame

pymodulon.enrichment.compute_trn_enrichment(gene_set, all_genes, trn, max_regs=1, fdr=0.01, method='both', force=False)

Compare a gene set against an entire transcriptional regulatory network (TRN)

Parameters

gene_set (set) – Gene set for enrichment (e.g. genes in iModulon)
all_genes (set) – Set of all genes
trn (DataFrame) – Table containing transcriptional regulatory network
max_regs (int) – Maximum number of regulators to include in complex regulon (default: 1)
fdr (float) – False discovery rate (default: 0.01)
method (str) – How to combine complex regulons (default: ‘both’): “or” computes enrichment against the union of regulons, “and” computes enrichment against the intersection of regulons, and “both” performs both tests
force (bool) – Allows computation with more than 2 regulators (default: False)

Returns

Table containing statistically significant enrichments

Return type

DataFrame
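
To make the roles of `max_regs` and `method` concrete, the sketch below enumerates the complex regulon strings that a search over regulator combinations could consider. It is an assumption about how such a search might be organized, not pymodulon's implementation. The function name `complex_regulons_sketch` and the regulator names are hypothetical.

```python
from itertools import combinations


def complex_regulons_sketch(regulators, max_regs=1, method="both"):
    """List single regulators plus '/'- and '+'-joined combinations."""
    if method not in ("or", "and", "both"):
        raise ValueError("method must be 'or', 'and' or 'both'")
    separators = {"or": ["/"], "and": ["+"], "both": ["/", "+"]}[method]

    ordered = sorted(regulators)
    names = list(ordered)  # every single regulator is always a candidate
    for size in range(2, max_regs + 1):
        for combo in combinations(ordered, size):
            # One candidate per separator: union ("/") and/or intersection ("+")
            names.extend(sep.join(combo) for sep in separators)
    return names


regs = ["regA", "regB", "regC"]  # hypothetical regulator names
singles = complex_regulons_sketch(regs, max_regs=1)
pairs_both = complex_regulons_sketch(regs, max_regs=2, method="both")
pairs_or = complex_regulons_sketch(regs, max_regs=2, method="or")
```

The number of candidates grows quickly with `max_regs`, since every additional regulator multiplies the number of combinations to test.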

pymodulon.enrichment.compute_annotation_enrichment(gene_set, all_genes, annotation, column, fdr=0.01)

Compare a gene set against a dataframe of gene annotations

Parameters

gene_set (set) – Gene set for enrichment (e.g. genes in iModulon)
all_genes (set) – Set of all genes
annotation (DataFrame) – Table containing gene annotations
column (str) – Name of column in the annotation DataFrame
fdr (float) – False discovery rate (default: 0.01)

Returns

Table containing statistically significant enrichments

Return type

pandas.DataFrame
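
Conceptually, an annotation-based comparison groups genes by the value in the chosen annotation column and then tests each group against the gene set. The sketch below shows only the grouping and overlap-counting step, using a list of `(gene, annotation)` pairs as a stand-in for the annotation DataFrame. The function name `annotation_overlaps_sketch`, the input shape, and the example data are assumptions for illustration.

```python
from collections import defaultdict


def annotation_overlaps_sketch(gene_set, annotation_rows):
    """Group genes by annotation, then count each group's overlap with gene_set."""
    groups = defaultdict(set)
    for gene, term in annotation_rows:
        groups[term].add(gene)

    gene_set = set(gene_set)
    # Each group would play the role of target_genes in an enrichment test
    return {term: len(genes & gene_set) for term, genes in groups.items()}


# Hypothetical annotation table as (gene, annotation) pairs
rows = [
    ("g1", "transport"),
    ("g2", "transport"),
    ("g3", "motility"),
    ("g4", "motility"),
    ("g5", "motility"),
]
overlaps = annotation_overlaps_sketch({"g1", "g2", "g3"}, rows)
```

In a full analysis each group would then be scored with an enrichment statistic, and the resulting p-values filtered at the chosen `fdr`.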
