pymodulon.enrichment

Contains functions for gene set enrichment analysis

Module Contents

Functions

contingency(set1, set2, all_genes)

Creates contingency table for gene enrichment

compute_enrichment(gene_set, target_genes, all_genes, label=None)

Computes enrichment statistic for gene_set in target_genes.

FDR(p_values, fdr, total=None)

Runs false discovery rate (FDR) correction for a table of statistics

parse_regulon_str(regulon_str, trn)

Converts a complex regulon (regulon_str) into a list of genes

compute_regulon_enrichment(gene_set, regulon_str, all_genes, trn)

Computes enrichment statistics for a gene_set in a regulon

compute_trn_enrichment(gene_set, all_genes, trn, max_regs=1, fdr=0.01, method='both', force=False)

Compares a gene set against an entire transcriptional regulatory network (TRN)

compute_annotation_enrichment(gene_set, all_genes, annotation, column, fdr=0.01)

Compares a gene set against a DataFrame of gene annotations

pymodulon.enrichment.contingency(set1, set2, all_genes)

Creates contingency table for gene enrichment

Parameters
  • set1 (set) – Set of genes (e.g. iModulon)

  • set2 (set) – Set of genes (e.g. regulon)

  • all_genes (set) – Set of all genes

Returns

Contingency table

Return type

np.ndarray
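To make the table concrete, here is a minimal pure-Python sketch of a 2x2 contingency table for two gene sets. The function name `contingency_sketch`, the nested-list return value, and the cell order `[[both, set1 only], [set2 only, neither]]` are assumptions made for illustration; the library function returns an np.ndarray and its exact cell layout is not stated above.

```python
def contingency_sketch(set1, set2, all_genes):
    """Return a 2x2 table as nested lists (assumed layout, see lead-in)."""
    set1, set2, all_genes = set(set1), set(set2), set(all_genes)
    in_both = len(set1 & set2)                  # genes in both sets
    only_set1 = len(set1 - set2)                # genes in set1 but not set2
    only_set2 = len(set2 - set1)                # genes in set2 but not set1
    in_neither = len(all_genes - set1 - set2)   # genes in neither set
    return [[in_both, only_set1], [only_set2, in_neither]]


all_genes = {"g1", "g2", "g3", "g4", "g5", "g6"}
imodulon = {"g1", "g2", "g3"}
regulon = {"g2", "g3", "g4"}

table = contingency_sketch(imodulon, regulon, all_genes)
print(table)  # [[2, 1], [1, 2]]
```

The four cells always sum to the number of genes in all_genes, provided both sets are subsets of it.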

pymodulon.enrichment.compute_enrichment(gene_set, target_genes, all_genes, label=None)

Computes enrichment statistic for gene_set in target_genes.

Parameters
  • gene_set (list) – Gene set for enrichment (e.g. genes in iModulon)

  • target_genes (list) – Genes to be enriched against (e.g. genes in regulon or GO term)

  • all_genes (list) – Set of all genes

  • label (str) – Label for target_genes (e.g. regulator name or GO term)

Returns

Table containing statistically significant enrichments

Return type

pd.Series
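The documentation above does not say which statistic is computed. As an illustration only, the sketch below assumes a one-sided Fisher's exact test (the hypergeometric tail probability), a common choice for gene set enrichment. The function name `enrichment_sketch`, the plain-dict return value, and the `precision`, `recall`, and `overlap` keys are hypothetical; only the `pvalue` name is taken from the FDR entry in this reference.

```python
from math import comb


def enrichment_sketch(gene_set, target_genes, all_genes):
    """One-sided hypergeometric test for over-representation (assumed test)."""
    gene_set, target_genes = set(gene_set), set(target_genes)
    n_total = len(set(all_genes))
    n_set = len(gene_set)
    n_target = len(target_genes)
    overlap = len(gene_set & target_genes)

    # P(X >= overlap) when drawing n_set genes without replacement from
    # n_total genes, n_target of which belong to the target set.
    # Assumes gene_set and target_genes are subsets of all_genes.
    tail = sum(
        comb(n_target, k) * comb(n_total - n_target, n_set - k)
        for k in range(overlap, min(n_set, n_target) + 1)
    )
    pvalue = tail / comb(n_total, n_set)

    return {
        "pvalue": pvalue,
        "precision": overlap / n_set if n_set else 0.0,      # overlap / |gene_set|
        "recall": overlap / n_target if n_target else 0.0,   # overlap / |target_genes|
        "overlap": overlap,
    }


all_genes = {f"g{i}" for i in range(10)}
stats = enrichment_sketch({"g0", "g1", "g2"}, {"g0", "g1", "g2", "g3"}, all_genes)
print(stats)
```

In this example all three genes of the gene set fall inside the four-gene target set, so the p-value is the probability of drawing three target genes in three draws from ten.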

pymodulon.enrichment.FDR(p_values, fdr, total=None)

Runs false discovery rate (FDR) correction for a table of statistics

Parameters
  • p_values (DataFrame) – DataFrame with a ‘pvalue’ column

  • fdr (float) – False discovery rate

  • total (int) – Total number of tests (for multi-enrichment)

Returns

Table containing entries that passed multiple hypothesis correction

Return type

DataFrame
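The correction procedure is not named above. The sketch below assumes the Benjamini-Hochberg step-up procedure, a standard way to control the false discovery rate, and shows how a `total` argument larger than the number of supplied p-values makes the threshold stricter. The function name `fdr_sketch` and the dict input are hypothetical stand-ins for the DataFrame with a ‘pvalue’ column.

```python
def fdr_sketch(p_values, fdr, total=None):
    """Benjamini-Hochberg sketch: return labels that pass at the given rate.

    p_values: dict mapping a label to its p-value (illustrative input type).
    total: number of tests performed; defaults to len(p_values).
    """
    m = total if total is not None else len(p_values)
    ranked = sorted(p_values.items(), key=lambda item: item[1])

    # Find the largest rank k such that p_(k) <= (k / m) * fdr.
    cutoff = 0
    for rank, (_, p) in enumerate(ranked, start=1):
        if p <= rank / m * fdr:
            cutoff = rank

    # Everything ranked at or below the cutoff is kept.
    return [label for label, _ in ranked[:cutoff]]


pvals = {"regA": 0.001, "regB": 0.02, "regC": 0.03, "regD": 0.5}
passed = fdr_sketch(pvals, fdr=0.05)
passed_strict = fdr_sketch(pvals, fdr=0.05, total=40)
print(passed)         # ['regA', 'regB', 'regC']
print(passed_strict)  # ['regA']
```

Passing `total=40` models the case where these four p-values are part of a larger batch of 40 tests, so each one is compared against a smaller threshold.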

pymodulon.enrichment.parse_regulon_str(regulon_str, trn)

Converts a complex regulon (regulon_str) into a list of genes

Parameters
  • regulon_str (str) – Complex regulon, where “/” takes the union (genes in any of the regulons) and “+” takes the intersection (genes in all of the regulons)

  • trn (DataFrame) – Table containing transcriptional regulatory network

Returns

reg_genes – Set of genes regulated by regulon_str

Return type

set
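The “/” and “+” operators can be illustrated with plain set operations. This is a sketch under stated assumptions, not the library's parser: the function name `parse_regulon_sketch` is hypothetical, the TRN is represented as (regulator, gene) tuples rather than a DataFrame, and rejecting strings that mix both operators is a choice made here for simplicity.

```python
def parse_regulon_sketch(regulon_str, trn_pairs):
    """Resolve a complex regulon string to a set of genes.

    trn_pairs: iterable of (regulator, gene) tuples (hypothetical TRN layout).
    """
    # Build regulator -> set of regulated genes.
    regulons = {}
    for regulator, gene in trn_pairs:
        regulons.setdefault(regulator, set()).add(gene)

    if "/" in regulon_str and "+" in regulon_str:
        raise ValueError("this sketch does not mix '/' and '+' in one string")

    if "+" in regulon_str:
        # "+" keeps genes regulated by all listed regulators.
        gene_sets = [regulons.get(name, set()) for name in regulon_str.split("+")]
        return set.intersection(*gene_sets)

    # "/" keeps genes regulated by any listed regulator.
    # A single name with no operator falls through to this branch.
    gene_sets = [regulons.get(name, set()) for name in regulon_str.split("/")]
    return set.union(*gene_sets)


trn_pairs = [("crp", "g1"), ("crp", "g2"), ("fnr", "g2"), ("fnr", "g3")]
print(sorted(parse_regulon_sketch("crp/fnr", trn_pairs)))  # ['g1', 'g2', 'g3']
print(sorted(parse_regulon_sketch("crp+fnr", trn_pairs)))  # ['g2']
```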

pymodulon.enrichment.compute_regulon_enrichment(gene_set, regulon_str, all_genes, trn)

Computes enrichment statistics for a gene_set in a regulon

Parameters
  • gene_set (set) – Gene set for enrichment (e.g. genes in iModulon)

  • regulon_str (str) – Complex regulon, where “/” takes the union (genes in any of the regulons) and “+” takes the intersection (genes in all of the regulons)

  • all_genes (set) – Set of all genes

  • trn (DataFrame) – Table containing transcriptional regulatory network

Returns

result – Table containing statistically significant enrichments

Return type

DataFrame

pymodulon.enrichment.compute_trn_enrichment(gene_set, all_genes, trn, max_regs=1, fdr=0.01, method='both', force=False)

Compares a gene set against an entire transcriptional regulatory network (TRN)

Parameters
  • gene_set (set) – Gene set for enrichment (e.g. genes in iModulon)

  • all_genes (set) – Set of all genes

  • trn (DataFrame) – Table containing transcriptional regulatory network

  • max_regs (int) – Maximum number of regulators to include in complex regulon (default: 1)

  • fdr (float) – False discovery rate (default: 0.01)

  • method (str) – How to combine complex regulons (default: ‘both’). “or” computes enrichment against the union of regulons, “and” computes enrichment against the intersection of regulons, and “both” performs both tests.

  • force (bool) – Allows computation with more than 2 regulators (default: False)

Returns

Table containing statistically significant enrichments

Return type

DataFrame
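To show how max_regs and method interact, the sketch below enumerates the regulon strings that a TRN-wide comparison might test. It is an assumption about the general approach, not the library's implementation; the function name `candidate_regulons` and the regulator names are hypothetical.

```python
from itertools import combinations


def candidate_regulons(regulators, max_regs=1, method="both"):
    """List regulon strings to test, e.g. 'crp', 'crp/fnr', 'crp+fnr'."""
    # "or" joins with "/", "and" joins with "+", "both" tries each joiner.
    joiners = {"or": ["/"], "and": ["+"], "both": ["/", "+"]}[method]
    names = sorted(regulators)

    candidates = list(names)  # single regulators are always included
    for size in range(2, max_regs + 1):
        for combo in combinations(names, size):
            candidates.extend(joiner.join(combo) for joiner in joiners)
    return candidates


names = candidate_regulons({"crp", "fnr", "arcA"}, max_regs=2, method="both")
print(len(names))  # 9
print(names)
```

With n regulators the number of size-k combinations grows as n choose k, so the candidate list becomes large quickly as max_regs increases.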

pymodulon.enrichment.compute_annotation_enrichment(gene_set, all_genes, annotation, column, fdr=0.01)

Compares a gene set against a DataFrame of gene annotations

Parameters
  • gene_set (set) – Gene set for enrichment (e.g. genes in iModulon)

  • all_genes (set) – Set of all genes

  • annotation (DataFrame) – Table containing gene annotations

  • column (str) – Name of column in the annotation DataFrame

  • fdr (float) – False discovery rate (default: 0.01)

Returns

Table containing statistically significant enrichments

Return type

pandas.DataFrame
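One plausible first step for an annotation comparison is to group genes by the value in the chosen column, giving one target gene set per annotation. The sketch below shows only that grouping step and rests on assumptions: the function name `annotation_gene_sets`, the list-of-dicts input in place of a DataFrame, and the `gene` and `pathway` keys are all hypothetical.

```python
def annotation_gene_sets(annotation_rows, column, gene_key="gene"):
    """Group genes by annotation value.

    annotation_rows: list of dicts, one per (gene, annotation) pair
    (an illustrative stand-in for the annotation DataFrame).
    """
    gene_sets = {}
    for row in annotation_rows:
        gene_sets.setdefault(row[column], set()).add(row[gene_key])
    return gene_sets


rows = [
    {"gene": "g1", "pathway": "glycolysis"},
    {"gene": "g2", "pathway": "glycolysis"},
    {"gene": "g2", "pathway": "TCA cycle"},
    {"gene": "g3", "pathway": "TCA cycle"},
]
targets = annotation_gene_sets(rows, column="pathway")
for term, genes in sorted(targets.items()):
    print(term, sorted(genes))
```

Each resulting gene set could then be tested against gene_set, with the p-values corrected together at the requested fdr.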