pymodulon.plotting

Plotting functions for iModulons

Module Contents

Functions

barplot(values, sample_table, ylabel='', projects=None, highlight=None, ax=None, legend_kwargs=None)

Creates an overlaid scatter and barplot for a set of values (either gene expression levels or iModulon activities)

plot_expression(ica_data, gene, projects=None, highlight=None, ax=None, legend_kwargs=None)

Creates a barplot showing a gene’s expression across the compendium

plot_activities(ica_data, imodulon, projects=None, highlight=None, ax=None, legend_kwargs=None)

Creates a barplot showing an iModulon’s activity across the compendium

plot_metadata(ica_data, column, projects=None, highlight=None, ax=None, legend_kwargs=None)

Creates a barplot for values in the sample table

plot_regulon_histogram(ica_data, imodulon, regulator=None, bins=None, kind='overlap', ax=None, hist_label=('Not regulated', 'Regulon Genes'), color=('#aaaaaa', 'salmon'), alpha=0.7, ax_font_kwargs=None, legend_kwargs=None)

Plots a histogram of regulon vs non-regulon genes by iModulon weighting.

scatterplot(x, y, groups=None, colors=None, show_labels='auto', adjust_labels=True, line45=False, line45_margin=0, fit_line=False, fit_metric='pearson', xlabel='', ylabel='', ax=None, legend=True, ax_font_kwargs=None, scatter_kwargs=None, label_font_kwargs=None, legend_kwargs=None)

Generates a scatter-plot of the data given, with options for coloring by group, adding labels, adding lines, and generating correlation or determination coefficients.

plot_gene_weights(ica_data, imodulon, by='start', xaxis=None, xname='', **kwargs)

Plot gene weights on a scatter plot.

compare_gene_weights(ica_data, imodulon1, imodulon2, ica_data2=None, ortho_file=None, use_org1_names=True, **kwargs)

Create a scatterplot comparing the gene weights between two iModulons

compare_expression(ica_data, gene1, gene2, **kwargs)

Create a scatterplot comparing the compendium-wide expression profiles of two genes

compare_activities(ica_data, imodulon1, imodulon2, **kwargs)

Create a scatterplot comparing the compendium-wide activities of two iModulons

plot_dima(ica_data, sample1, sample2, threshold=5, fdr=0.1, label=True, adjust=True, table=False, **kwargs)

Plots a DiMA plot between two projects or two sets of samples

plot_explained_variance(ica_data, pc=True, ax=None)

Plots the cumulative explained variance for independent components and, optionally, principal components

compare_imodulons_vs_regulons(ica_data, imodulons=None, cat_column=None, size_column=None, scale=1, reg_only=True, xlabel=None, ylabel=None, vline=0.6, hline=0.6, ax=None, scatter_kwargs=None, ax_font_kwargs=None, legend_kwargs=None)

Compare the overlaps between iModulons and their linked regulons

cluster_activities(ica_data, correlation_method='spearman', distance_threshold=None, show_thresholding=False, show_clustermap=True, show_best_clusters=False, n_best_clusters='auto', cluster_names=None, return_clustermap=False, dimca_sample1=None, dimca_sample2=None, dimca_threshold=5, dimca_fdr=0.1, dimca_label=True, dimca_adjust=True, dimca_table=False, **dima_kwargs)

Cluster all iModulon activity profiles using hierarchical clustering and display the results using seaborn.clustermap()

metadata_boxplot(ica_data, imodulon, show_points=True, n_boxes=3, samples=None, strip_conc=True, ignore_cols=None, use_cols=None, return_results=False, ax=None, box_kwargs=None, strip_kwargs=None, swarm_kwargs=None)

Uses a decision tree regressor to automatically cluster iModulon activities using metadata. Displays results as a box plot.

_encode_metadata(ica_data, samples, use_cols=None, ignore_cols=None, strip_conc=True)

_train_classifier(component, features, max_leaf_nodes=3)

_get_labels_from_tree(clf, encoding)

_get_sample_leaves(clf, features, labels, component)

_fit_line(x, y, ax, metric)

_get_fit(x, y)

_broken_line(x, A, B, C)

_solid_line(x, A, B)

_adj_r2(f, x, y, params)

_mod_freedman_diaconis(ica_data, imodulon)

Generates bins using the optimal bin width estimated by the Freedman-Diaconis rule

_set_xaxis(xaxis, y)

Implements the experimental xaxis parameter of plot_gene_weights.

pymodulon.plotting.barplot(values, sample_table, ylabel='', projects=None, highlight=None, ax=None, legend_kwargs=None)

Creates an overlaid scatter and barplot for a set of values (either gene expression levels or iModulon activities)

Parameters
  • values (Series) – List of values to plot

  • sample_table (DataFrame) – Sample table from IcaData object

  • ylabel (str, optional) – Y-axis label

  • projects (list or str, optional) – Name(s) of projects to show (default: show all)

  • highlight (list or str, optional) – Project(s) to highlight (default: None)

  • ax (Axes, optional) – Axes object to plot on, otherwise use current Axes

  • legend_kwargs (dict, optional) – Additional keyword arguments passed to matplotlib.pyplot.legend()

Returns

ax – Axes containing the barplot

Return type

Axes

pymodulon.plotting.plot_expression(ica_data, gene, projects=None, highlight=None, ax=None, legend_kwargs=None)

Creates a barplot showing a gene’s expression across the compendium

Parameters
  • ica_data (IcaData) – IcaData object

  • gene (str) – Gene locus tag or name

  • projects (str or list, optional) – Name(s) of projects to show (default: show all)

  • highlight (str or list, optional) – Name(s) of projects to highlight (default: None)

  • ax (Axes, optional) – Axes object to plot on, otherwise use current Axes

  • legend_kwargs (dict, optional) – Additional keyword arguments passed to matplotlib.pyplot.legend()

Returns

ax – Axes containing the barplot

Return type

Axes

pymodulon.plotting.plot_activities(ica_data, imodulon, projects=None, highlight=None, ax=None, legend_kwargs=None)

Creates a barplot showing an iModulon’s activity across the compendium

Parameters
  • ica_data (IcaData) – IcaData object

  • imodulon (int or str) – iModulon name

  • projects (list or str, optional) – Name(s) of projects to show (default: show all)

  • highlight (str or list, optional) – Name(s) of projects to highlight (default: None)

  • ax (Axes, optional) – Axes object to plot on, otherwise use current Axes

  • legend_kwargs (dict, optional) – Additional keyword arguments passed to matplotlib.pyplot.legend()

Returns

ax – Axes containing the barplot

Return type

Axes
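
Because plot_expression and plot_activities both accept an existing Axes through ax and return it, they can share one figure. The following is a minimal usage sketch that assumes the caller supplies an already loaded IcaData object. The helper name plot_gene_and_imodulon and its arguments are hypothetical; only the two plotting functions and their documented parameters come from this reference.

```python
def plot_gene_and_imodulon(ica_data, gene, imodulon, highlight=None):
    """Draw a gene's expression and an iModulon's activity in stacked panels.

    Hypothetical helper for illustration; not part of pymodulon.
    """
    # Imported inside the function so the sketch can be defined even when
    # the plotting stack is not installed.
    import matplotlib.pyplot as plt
    from pymodulon.plotting import plot_activities, plot_expression

    fig, (ax_top, ax_bottom) = plt.subplots(2, 1, figsize=(10, 6))

    # Both functions draw on the Axes passed through `ax`.
    plot_expression(ica_data, gene, highlight=highlight, ax=ax_top)
    plot_activities(ica_data, imodulon, highlight=highlight, ax=ax_bottom)

    fig.tight_layout()
    return fig
```

Passing the same highlight value to both calls keeps the highlighted projects consistent between the two panels.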

pymodulon.plotting.plot_metadata(ica_data, column, projects=None, highlight=None, ax=None, legend_kwargs=None)

Creates a barplot for values in the sample table

Parameters
  • ica_data (IcaData) – IcaData object

  • column (str) – Column name to plot

  • projects (list or str, optional) – Name(s) of projects to show (default: show all)

  • highlight (str or list, optional) – Name(s) of projects to highlight (default: None)

  • ax (Axes, optional) – Axes object to plot on, otherwise use current Axes

  • legend_kwargs (dict, optional) – Additional keyword arguments passed to matplotlib.pyplot.legend()

Returns

ax – Axes containing the barplot

Return type

Axes

pymodulon.plotting.plot_regulon_histogram(ica_data, imodulon, regulator=None, bins=None, kind='overlap', ax=None, hist_label=('Not regulated', 'Regulon Genes'), color=('#aaaaaa', 'salmon'), alpha=0.7, ax_font_kwargs=None, legend_kwargs=None)

Plots a histogram of regulon vs non-regulon genes by iModulon weighting.

Parameters
  • ica_data (IcaData) – IcaData object

  • imodulon (int or str) – iModulon name

  • regulator (str, optional) – Name of regulator to compare enrichment against. If no regulator is given, the top enrichment from imodulon_table is used

  • bins (int, str or list, optional) – The bins to use when generating the histogram. Passed on to matplotlib.pyplot.hist()

  • kind ('overlap' or 'side', optional) – Whether to plot an overlapping or side-by-side comparison histogram (default: ‘overlap’)

  • ax (Axes, optional) – Axes object to plot on, otherwise use current Axes

  • hist_label (tuple, optional) – Labels for the non-regulon and regulon genes. Takes a tuple of 2 values (first for non-regulon genes, second for regulon genes). Passed on to matplotlib.pyplot.hist()

  • color (list or str, optional) – Colors for the non-regulon and regulon genes. Takes a sequence of 2 values (first for non-regulon genes, second for regulon genes). Passed on to matplotlib.pyplot.hist()

  • alpha (float, optional) – Sets the opacity of the histogram (0 = transparent, 1 = opaque). Passed on to matplotlib.pyplot.hist()

  • ax_font_kwargs (dict, optional) – Additional keyword arguments for axes labels

  • legend_kwargs (dict, optional) – Additional keyword arguments passed to matplotlib.pyplot.legend()

Returns

ax – Axes containing the histogram

Return type

Axes

pymodulon.plotting.scatterplot(x, y, groups=None, colors=None, show_labels='auto', adjust_labels=True, line45=False, line45_margin=0, fit_line=False, fit_metric='pearson', xlabel='', ylabel='', ax=None, legend=True, ax_font_kwargs=None, scatter_kwargs=None, label_font_kwargs=None, legend_kwargs=None)

Generates a scatter-plot of the data given, with options for coloring by group, adding labels, adding lines, and generating correlation or determination coefficients.

Parameters
  • x (Series) – X-axis data

  • y (Series) – Y-axis data

  • groups (dict, optional) – A dictionary mapping samples to group names

  • colors (str, list or dict, optional) – Color of points, list of colors for different groups, or dictionary mapping groups to colors

  • show_labels (bool or 'auto') – Show labels for data points

  • adjust_labels (bool) – Auto-adjust labels for data points

  • line45 (bool) – Show 45-degree line of equal values

  • line45_margin (float) – Show 45-degree lines offset by a margin

  • fit_line (bool) – Draw a line of best fit on the scatterplot

  • fit_metric ('pearson', 'spearman', 'r2', or None) – Metric to report in legend for line of best fit

  • xlabel (str) – X-axis label

  • ylabel (str) – Y-axis label

  • ax (Axes, optional) – Axes object to plot on, otherwise use current Axes

  • legend (bool) – Show legend

  • ax_font_kwargs (dict, optional) – Additional keyword arguments for axis labels

  • scatter_kwargs (dict, optional) – Additional keyword arguments passed to matplotlib.pyplot.scatter()

  • label_font_kwargs (dict, optional) – Additional keyword arguments for labels passed to matplotlib.pyplot.text()

  • legend_kwargs (dict, optional) – Additional keyword arguments passed to matplotlib.pyplot.legend()

Returns

ax – Axes containing the scatterplot

Return type

Axes
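
The three fit_metric options measure different things, so a short sketch may help when choosing one. The code below computes textbook Pearson, Spearman, and R² values in plain Python. It is an illustration under stated assumptions and not the library's implementation: the function names are hypothetical, and the R² shown is that of an ordinary least-squares line, whereas the private helpers listed in this module (_adj_r2, _broken_line) suggest the library may report an adjusted variant.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            result[order[k]] = average_rank
        start = end + 1
    return result


def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def r_squared(x, y):
    """R^2 of an ordinary least-squares line, equal to Pearson r squared."""
    return pearson(x, y) ** 2


# Monotonic but non-linear data: Spearman is perfect, Pearson is not.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 4.0, 9.0, 16.0, 25.0]
print(pearson(x, y), spearman(x, y), r_squared(x, y))
```

Spearman depends only on rank order, which makes it the more forgiving choice when the relationship is monotonic but curved.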

pymodulon.plotting.plot_gene_weights(ica_data, imodulon, by='start', xaxis=None, xname='', **kwargs)

Plot gene weights on a scatter plot.

Parameters
  • ica_data (IcaData) – IcaData object

  • imodulon (int or str) – iModulon name

  • by ('log-tpm-norm', 'length', or 'start') – Property to plot on x-axis. Superseded by xaxis

  • xaxis (list, dict or Series, optional) – Values on custom x-axis

  • xname (str, optional) – Name of x-axis if using custom x-axis

  • **kwargs – Additional keyword arguments passed to pymodulon.plotting.scatterplot()

Returns

ax – Axes containing the scatterplot

Return type

Axes

pymodulon.plotting.compare_gene_weights(ica_data, imodulon1, imodulon2, ica_data2=None, ortho_file=None, use_org1_names=True, **kwargs)

Create a scatterplot comparing the gene weights between two iModulons

Parameters
  • ica_data (IcaData) – IcaData object

  • imodulon1 (int or str) – Name of iModulon on X-axis

  • imodulon2 (int or str) – Name of iModulon on Y-axis

  • ica_data2 (IcaData, optional) – IcaData object for second iModulon (if comparing iModulons across objects)

  • ortho_file (str, optional) – Path to orthology file between organisms

  • use_org1_names (bool) – If true, use gene names from first organism. If false, use gene names from second organism (default: True)

  • **kwargs – Additional keyword arguments passed to pymodulon.plotting.scatterplot()

Returns

ax – Axes containing the scatterplot

Return type

Axes

pymodulon.plotting.compare_expression(ica_data, gene1, gene2, **kwargs)

Create a scatterplot comparing the compendium-wide expression profiles of two genes

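Parameters
  • ica_data (IcaData) – IcaData object

  • gene1 (str) – Gene to plot on the x-axis

  • gene2 (str) – Gene to plot on the y-axis

  • **kwargs – Additional keyword arguments passed to pymodulon.plotting.scatterplot()

Returns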

ax – Axes containing the scatterplot

Return type

Axes

pymodulon.plotting.compare_activities(ica_data, imodulon1, imodulon2, **kwargs)

Create a scatterplot comparing the compendium-wide activities of two iModulons

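Parameters
  • ica_data (IcaData) – IcaData object

  • imodulon1 (int or str) – Name of iModulon on the x-axis

  • imodulon2 (int or str) – Name of iModulon on the y-axis

  • **kwargs – Additional keyword arguments passed to pymodulon.plotting.scatterplot()

Returns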

ax – Axes containing the scatterplot

Return type

Axes

pymodulon.plotting.plot_dima(ica_data, sample1, sample2, threshold=5, fdr=0.1, label=True, adjust=True, table=False, **kwargs)

Plots a DiMA plot between two projects or two sets of samples

Parameters
  • ica_data (IcaData) – IcaData object

  • sample1 (list or str) – List of sample IDs or name of “project:condition” for x-axis

  • sample2 (list or str) – List of sample IDs or name of “project:condition” for y-axis

  • threshold (float) – Minimum activity difference to determine DiMAs (default: 5)

  • fdr (float) – False discovery rate (default: 0.1)

  • label (bool) – Label differentially activated iModulons (default: True)

  • adjust (bool) – Automatically adjust labels (default: True)

  • table (bool) – Return differential iModulon activity table (default: False)

  • **kwargs – Additional keyword arguments passed to pymodulon.plotting.scatterplot()

Returns

  • ax (matplotlib.axes.Axes) – Axes containing the scatterplot

  • df_diff (pandas.DataFrame, optional) – Table reporting differentially activated iModulons (returned only when table=True)
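
The threshold and fdr parameters act as two separate cutoffs. The sketch below shows one way such a selection can work, assuming that raw p-values are already available and that they are adjusted with the Benjamini-Hochberg procedure. Both assumptions, the function names, and the example iModulon names and numbers are hypothetical; this reference does not state how the library computes or corrects its p-values.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so monotonicity can be enforced.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_dimas(differences, pvalues, threshold=5, fdr=0.1):
    """Names of iModulons passing both the difference and FDR cutoffs."""
    names = list(differences)
    qvalues = benjamini_hochberg([pvalues[name] for name in names])
    return [
        name
        for name, q in zip(names, qvalues)
        if abs(differences[name]) >= threshold and q < fdr
    ]


# Hypothetical activity differences (sample2 minus sample1) and raw p-values.
differences = {"Fur-1": 12.3, "CysB": -7.9, "PurR": 2.1, "ArgR": 6.4}
pvalues = {"Fur-1": 0.001, "CysB": 0.02, "PurR": 0.004, "ArgR": 0.6}
print(select_dimas(differences, pvalues))
```

In this example PurR is significant but changes too little, and ArgR changes enough but is not significant, so neither is selected.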

pymodulon.plotting.plot_explained_variance(ica_data, pc=True, ax=None)

Plots the cumulative explained variance for independent components and, optionally, principal components

Parameters
  • ica_data (IcaData) – IcaData object

  • pc (bool) – If True, also plot the cumulative explained variance of principal components (default: True)

  • ax (Axes, optional) – Axes object to plot on, otherwise use current Axes

Returns

ax – Axes containing the line plot

Return type

Axes
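
A cumulative explained variance curve is a running sum of per-component variance fractions. The sketch below illustrates that arithmetic only. It assumes the fractions are already computed and that components are ordered from largest to smallest before summing; the function name and the numbers are hypothetical, and the library's own ordering is not documented here.

```python
from itertools import accumulate


def cumulative_explained_variance(fractions):
    """Cumulative sum of per-component variance fractions, largest first."""
    ordered = sorted(fractions, reverse=True)
    return list(accumulate(ordered))


# Hypothetical fractions of total variance explained by four components.
fractions = [0.10, 0.40, 0.05, 0.25]
curve = cumulative_explained_variance(fractions)
print([round(value, 2) for value in curve])
```

The last point of the curve is the total variance captured by all components together.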

pymodulon.plotting.compare_imodulons_vs_regulons(ica_data, imodulons=None, cat_column=None, size_column=None, scale=1, reg_only=True, xlabel=None, ylabel=None, vline=0.6, hline=0.6, ax=None, scatter_kwargs=None, ax_font_kwargs=None, legend_kwargs=None)

Compare the overlaps between iModulons and their linked regulons

Parameters
  • ica_data (IcaData) – IcaData object

  • imodulons (list, optional) – List of iModulons to plot

  • cat_column (str, optional) – Column in the imodulon_table that stores the category of each iModulon

  • size_column (str, optional) – Column in the imodulon_table that stores the size of each iModulon

  • scale (float, optional (default: 1)) – Value used to scale the size of each point

  • reg_only (bool (default: True)) – Only plot iModulons with an entry in the regulator column of the imodulon_table

  • xlabel (str, optional) – Custom x-axis label (default: “# shared genes/Regulon size”)

  • ylabel (str, optional) – Custom y-axis label (default: “# shared genes/iModulon size”)

  • vline (float, optional (default: 0.6)) – Draw a dashed vertical line

  • hline (float, optional (default: 0.6)) – Draw a dashed horizontal line

  • ax (Axes, optional) – Axes object to plot on, otherwise use current Axes

  • scatter_kwargs (dict, optional) – Additional keyword arguments passed to matplotlib.pyplot.scatter()

  • ax_font_kwargs (dict, optional) – Additional keyword arguments for axes labels

  • legend_kwargs (dict, optional) – Additional keyword arguments passed to matplotlib.pyplot.legend()

Returns

ax – Axes containing the scatterplot

Return type

Axes
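
The default axis labels describe two overlap fractions: shared genes divided by regulon size on the x-axis, and shared genes divided by iModulon size on the y-axis. A minimal sketch of that arithmetic follows. The function name and the gene sets are hypothetical, and the library reads the underlying numbers from its own tables rather than from raw sets.

```python
def overlap_fractions(imodulon_genes, regulon_genes):
    """Return (shared / regulon size, shared / iModulon size)."""
    imodulon_genes = set(imodulon_genes)
    regulon_genes = set(regulon_genes)
    if not imodulon_genes or not regulon_genes:
        raise ValueError("both gene sets must be non-empty")
    shared = imodulon_genes & regulon_genes
    x = len(shared) / len(regulon_genes)   # fraction of the regulon captured
    y = len(shared) / len(imodulon_genes)  # fraction of the iModulon explained
    return x, y


# Hypothetical gene sets sharing two genes.
x, y = overlap_fractions({"a", "b", "c", "d"}, {"c", "d", "e", "f", "g"})
print(x, y)
```

With the default vline and hline of 0.6, this example point would fall in the lower-left quadrant, since neither fraction reaches 0.6.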

pymodulon.plotting.cluster_activities(ica_data, correlation_method='spearman', distance_threshold=None, show_thresholding=False, show_clustermap=True, show_best_clusters=False, n_best_clusters='auto', cluster_names=None, return_clustermap=False, dimca_sample1=None, dimca_sample2=None, dimca_threshold=5, dimca_fdr=0.1, dimca_label=True, dimca_adjust=True, dimca_table=False, **dima_kwargs)

Cluster all iModulon activity profiles using hierarchical clustering and display the results using seaborn.clustermap()

Parameters
  • ica_data (IcaData) – IcaData object

  • correlation_method ('pearson', 'spearman', 'kendall', 'mutual_info' or callable) – Method for computing correlations between iModulon activities. See pandas.DataFrame.corr() (default: ‘spearman’)

  • distance_threshold (float, optional) – A distance from 0 to 1 to define flat clusters from the hierarchical clustering. Larger values yield fewer clusters. If None, automatic selection of optimal threshold will occur by maximizing the silhouette score across iModulons. (default: None)

  • show_thresholding (bool) – Show the plot of distance thresholds vs. silhouette scores (default: False)

  • show_clustermap (bool) – Show the clustermap (default: True)

  • show_best_clusters (bool) – Show individual clusters below complete clustermap

  • n_best_clusters (int or str) – Number of best clusters to show. If ‘auto’, only clusters with above-average silhouette scores are shown

  • cluster_names (dict, optional) – A dictionary mapping cluster indices to names, usually used after clustering has been performed previously.

  • return_clustermap (bool) – Return the axis containing the clustermap

  • dimca_sample1 (list or str) – List of sample IDs or name of “project:condition” of reference samples for Differential iModulon Cluster Analysis (DiMCA)

  • dimca_sample2 (list or str) – List of sample IDs or name of “project:condition” of target samples for DiMCA

  • dimca_threshold (float) – Minimum activity difference to determine DiMCAs

  • dimca_fdr (float) – False discovery rate for DiMCA

  • dimca_label (bool) – Label differentially activated imodulon clusters (default: True)

  • dimca_adjust (bool) – Auto-adjust DiMCA cluster labels (default: True)

  • dimca_table (bool) – Return DiMCA table (default: False)

  • **dima_kwargs (dict, optional) – Additional keyword arguments passed to pymodulon.plotting.plot_dima()

Returns

  • clusters (sklearn.cluster.AgglomerativeClustering) – sklearn.cluster.AgglomerativeClustering of the activity matrix

  • cg (seaborn.matrix.ClusterGrid, optional) – ClusterGrid containing the clustermap

  • dimca_ax (matplotlib.axes.Axes, optional) – Axes containing the DiMCA scatterplot

  • dimca_table (pandas.DataFrame, optional) – Table of differentially activated iModulon clusters

pymodulon.plotting.metadata_boxplot(ica_data, imodulon, show_points=True, n_boxes=3, samples=None, strip_conc=True, ignore_cols=None, use_cols=None, return_results=False, ax=None, box_kwargs=None, strip_kwargs=None, swarm_kwargs=None)

Uses a decision tree regressor to automatically cluster iModulon activities using metadata. Displays results as a box plot.

Parameters
  • ica_data (IcaData) – IcaData object

  • imodulon (int or str) – iModulon name

  • show_points (bool or str (default: True)) – Overlay individual points on top of the boxplot. By default, this uses seaborn.stripplot(). show_points=‘swarm’ will use seaborn.swarmplot().

  • n_boxes (int) – Number of boxes to create

  • samples (list, optional) – Subset of samples to analyze

  • strip_conc (bool) – Remove concentrations from metadata (e.g. “glucose(2g/L)” would be interpreted as just “glucose”)

  • ignore_cols (list, optional) – List of columns to ignore. If None, only “project” and “condition” are ignored

  • use_cols (list, optional) – List of columns to use. This supersedes ignore_cols.

  • return_results (bool) – Return a dataframe describing the classifications

  • ax (Axes, optional) – Axes object to plot on, otherwise use current Axes

  • box_kwargs (dict, optional) – Additional keyword arguments passed to seaborn.boxplot()

  • strip_kwargs (dict, optional) – Additional keyword arguments passed to seaborn.stripplot()

  • swarm_kwargs (dict, optional) – Additional keyword arguments passed to seaborn.swarmplot()

Returns

  • ax (matplotlib.axes.Axes) – Axes containing the boxplot

  • df_classes (pandas.DataFrame) – Metadata classifications of the samples (returned only when return_results=True)
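
The strip_conc option treats a value such as “glucose(2g/L)” as just “glucose”. One way to express that idea is a regular expression that drops a trailing parenthesised group, sketched below. The pattern and the function name are assumptions made for illustration; the library's actual rule is not shown in this reference and may differ.

```python
import re

# Assumed pattern: a trailing parenthesised group, e.g. "(2g/L)".
_CONCENTRATION = re.compile(r"\s*\([^()]*\)\s*$")


def strip_concentration(label):
    """Drop a trailing parenthesised concentration from a metadata value."""
    return _CONCENTRATION.sub("", label)


print(strip_concentration("glucose(2g/L)"))
```

Stripping concentrations lets samples grown on the same compound at different doses fall into the same metadata category.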

pymodulon.plotting._encode_metadata(ica_data, samples, use_cols=None, ignore_cols=None, strip_conc=True)

pymodulon.plotting._train_classifier(component, features, max_leaf_nodes=3)

pymodulon.plotting._get_labels_from_tree(clf, encoding)

pymodulon.plotting._get_sample_leaves(clf, features, labels, component)

pymodulon.plotting._fit_line(x, y, ax, metric)

pymodulon.plotting._get_fit(x, y)

pymodulon.plotting._broken_line(x, A, B, C)

pymodulon.plotting._solid_line(x, A, B)

pymodulon.plotting._adj_r2(f, x, y, params)

pymodulon.plotting._mod_freedman_diaconis(ica_data, imodulon)

Generates bins using the optimal bin width estimated by the Freedman-Diaconis rule

Parameters
  • ica_data (IcaData) – IcaData object

  • imodulon (int or str) – iModulon name

Returns

bins – NumPy array of bins

Return type

ndarray
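
The Freedman-Diaconis rule sets the bin width to 2 · IQR / n^(1/3), where IQR is the interquartile range and n is the number of values. The sketch below implements the textbook rule with the standard library only. It is not the library's modified version, whose adjustments are not described in this reference, and the function name is hypothetical.

```python
import statistics


def freedman_diaconis_bins(values):
    """Bin edges spanning `values`, with width near 2 * IQR / n ** (1/3)."""
    n = len(values)
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    width = 2 * (q3 - q1) / n ** (1 / 3)
    if width <= 0:
        raise ValueError("IQR is zero; the rule gives no usable bin width")
    low, high = min(values), max(values)
    n_bins = max(1, round((high - low) / width))
    # Evenly spaced edges, so the actual width is close to the estimate.
    step = (high - low) / n_bins
    return [low + i * step for i in range(n_bins + 1)]


edges = freedman_diaconis_bins(list(range(1, 28)))
print(edges)
```

A list of edges like this can be supplied wherever a list of bins is accepted, such as the bins parameter of plot_regulon_histogram.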

pymodulon.plotting._set_xaxis(xaxis, y)

Implements the experimental xaxis parameter of plot_gene_weights. This allows users to generate a scatterplot comparing gene weights on the y-axis with any collection of numbers on the x-axis (as long as the lengths match).

Parameters
  • xaxis (list, set, tuple, dict, np.array, pd.Series) – Any collection or mapping of numbers (plots on x-axis)

  • y (pd.Series) – pandas Series of Gene Weights to be plotted on the y-axis of plot_gene_weights

Returns

x – Returns a pd.Series to be used as the x-axis data-points for generating the plot_gene_weights scatter-plot.

Return type

pd.Series
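
The length requirement and the difference between a collection and a mapping can be sketched without pandas. In the code below a plain dict of gene to weight stands in for the pandas Series, mappings are matched by gene name, and sequences are matched by position. The helper name and these matching rules are assumptions made for illustration, not the library's code.

```python
from collections.abc import Mapping


def align_xaxis(xaxis, gene_weights):
    """Pair custom x-values with gene weights, keyed by gene.

    `gene_weights` is a dict of gene -> weight standing in for the pandas
    Series the library uses. Hypothetical helper, not the library's code.
    """
    genes = list(gene_weights)
    if len(xaxis) != len(genes):
        raise ValueError(
            f"xaxis has {len(xaxis)} values but there are {len(genes)} genes"
        )
    if isinstance(xaxis, Mapping):
        # Mappings are matched by gene name, so their order does not matter.
        return {gene: xaxis[gene] for gene in genes}
    # Sequences are matched by position.
    return dict(zip(genes, xaxis))


weights = {"b0001": 0.1, "b0002": -0.3}
print(align_xaxis([10, 20], weights))
print(align_xaxis({"b0002": 20, "b0001": 10}, weights))
```

Checking the lengths up front gives a clear error instead of a silently truncated plot.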