pymodulon.imodulondb
Functions for writing a directory for iModulonDB webpages
Module Contents
Functions
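imodulondb_compatibility – Checks for all issues and missing information prior to exporting to iModulonDB.
imodulondb_export – Generates the iModulonDB page for the data and exports to the path.
imdb_dataset_table – Converts the model's imodulondb_table into dataset metadata for the gray box on the left side of the dataset page.
imdb_iM_table – Reformats the iModulon table according
imdb_gene_presence – Generates the two versions of the gene presence file, one as a binary matrix and one as a DataFrame.
imodulondb_main_site_files – Generates all parts of the site that do not require large iteration loops.
imdb_generate_im_files – Generates all files for all iModulons in data.
imdb_generate_gene_files – Generates all files for all genes in the IcaData object.
parse_tf_string – Returns a list of relevant TFs from a string. Will ignore TFs not in the TRN file.
imdb_gene_table_df – Creates the gene table dataframe for iModulonDB.
_component_DF – Helper function for imdb_gene_hist_df.
_tf_combo_string – Creates a formatted string for the histogram legends. Helper function for imdb_gene_hist_df.
_sort_tf_strings – Sorts TF strings for the legend of the histogram. Helper function for imdb_gene_hist_df.
imdb_gene_hist_df – Creates the gene histogram for an iModulon.
_gene_color_dict – Helper function to match genes to colors based on COG. Used by imdb_gene_scatter_df.
imdb_gene_scatter_df – Generates a dataframe for the gene scatter plot in iModulonDB.
generate_n_replicates_column – Generates the "n_replicates" column of the sample_table for iModulonDB.
imdb_activity_bar_df – Generates a dataframe for the activity bar graph of iModulon k.
_parse_regulon_string – The Bacillus microarray dataset uses [] to create unusually complicated TF strings. This function parses those, as a helper to _get_reg_genes for imdb_regulon_venn_df.
_get_reg_genes – Finds the set of genes regulated by the boolean combination of regulators in a TF string.
imdb_regulon_venn_df – Generates a dataframe for the regulon venn diagram of iModulon k. Returns None if there is no diagram to draw.
get_tfs_to_scatter
imdb_regulon_scatter_df
Adds links to the regulator string
tf_with_links_brackets – Adds links to the regulator string. Used with the complicated bracket system in Bacillus Microarray.
imdb_imodulon_basics_df – Generates a dataframe for the metadata of iModulon k.
make_im_directory
imdb_gene_im_table_df – Generates a dataframe for the iModulon table of gene g.
Generates all data for gene g, stores it in a subfolder of path_prefix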
- pymodulon.imodulondb.imodulondb_compatibility(model, inplace=False, tfcomplex_to_gene=None)
Checks for all issues and missing information prior to exporting to iModulonDB. If inplace=True, modifies the model (not recommended for main model variables).
- Parameters
model (IcaData) – IcaData object to check
inplace (bool, optional) – If True, modifies the model to prepare for export. Not recommended for use with your main model variable.
tfcomplex_to_gene (dict, optional) – Dictionary mapping complex TRN entries to matching gene names in the gene table (e.g. {"FlhDC": "flhD"})
- Returns
table_issues (pd.DataFrame) – Each row corresponds to an issue with one of the main class elements. Columns:
* Table: which table or other variable the issue is in
* Missing Column: the column of the Table with the issue (not case sensitive; capitalization is ignored)
* Solution: unless "CRITICAL" is in this cell, it describes how the site will behave if the issue remains
tf_issues (pd.DataFrame) – Each row corresponds to a regulator that is used in the imodulon_table. Columns:
* in_trn: whether the regulator is in model.trn. Regulators not in the TRN will be ignored in the site's histograms and gene tables.
* has_link: whether the regulator has a link in tf_links. If not, no link to external regulator databases will be shown.
* has_gene: whether the regulator can be matched to a gene in the model. If False, there will be no regulator scatter plot on the site. You can link TF complexes to one of their genes using the tfcomplex_to_gene input.
missing_g_links (pd.Series) – The genes in this list have no links in gene_links. Their gene pages will not display links.
missing_DOIs (pd.Series) – The samples listed here have no DOIs in the sample_table. Clicking on their bars in the activity plots will not link to relevant papers.
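The three tf_issues flags described above can be pictured with plain Python containers. This is only an illustrative sketch: `regulator_flags` and its arguments are hypothetical names, not part of pymodulon, and the library's real matching rules (for example, how it handles capitalization) may differ.

```python
def regulator_flags(regulator, trn_regulators, tf_links, gene_names,
                    tfcomplex_to_gene=None):
    """Return in_trn / has_link / has_gene for one regulator.

    Hypothetical helper for illustration only; not a pymodulon function.
    """
    tfcomplex_to_gene = tfcomplex_to_gene or {}
    # A complex such as "FlhDC" is looked up through its mapped gene, if any;
    # otherwise the regulator name itself is compared to the gene names.
    gene = tfcomplex_to_gene.get(regulator, regulator)
    return {
        "in_trn": regulator in trn_regulators,
        "has_link": regulator in tf_links,
        "has_gene": gene in gene_names,
    }


genes = {"flhD", "flhC"}
flags_unmapped = regulator_flags("FlhDC", {"FlhDC"}, {}, genes)
flags_mapped = regulator_flags("FlhDC", {"FlhDC"}, {}, genes,
                               tfcomplex_to_gene={"FlhDC": "flhD"})
print(flags_unmapped["has_gene"], flags_mapped["has_gene"])
```

In this toy case the complex only gains has_gene once the mapping to flhD is supplied, which is the situation the tfcomplex_to_gene argument is meant to resolve.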
- pymodulon.imodulondb.imodulondb_export(model, path='.', cat_order=None, tfcomplex_to_gene=None, skip_iMs=False, skip_genes=False)
Generates the iModulonDB page for the data and exports it to the path. Columns that are unavailable but can be filled in automatically will be.
- Parameters
model (IcaData) – IcaData object to export
path (str, optional) – Path to iModulonDB main hosting folder (default = ".")
cat_order (list, optional) – List of categories in the imodulon_table, ordered as you would like them to appear in the dataset table (default = None)
tfcomplex_to_gene (dict, optional) – Dictionary mapping complex TRN entries to matching gene names in the gene table (e.g. {"FlhDC": "flhD"})
skip_iMs (bool, optional) – If True, do not output iModulon files (to save time)
skip_genes (bool, optional) – If True, do not output gene files (to save time)
- Returns
None
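A possible way to combine the compatibility check with the export is sketched below. The two pymodulon calls use the signatures listed on this page; the wrapper `check_then_export` is a hypothetical name, and refusing to export when a "CRITICAL" entry is present is a policy chosen for this sketch rather than documented library behavior.

```python
def check_then_export(model, path=".", tfcomplex_to_gene=None):
    """Export only when the compatibility check reports nothing CRITICAL.

    Hypothetical wrapper for illustration; returns True if the export ran.
    """
    # Imported lazily so the sketch can be loaded without pymodulon installed.
    from pymodulon.imodulondb import imodulondb_compatibility, imodulondb_export

    table_issues, tf_issues, missing_g_links, missing_dois = (
        imodulondb_compatibility(
            model, inplace=False, tfcomplex_to_gene=tfcomplex_to_gene
        )
    )
    # "Solution" is the column documented for table_issues.
    critical = [s for s in table_issues["Solution"] if "CRITICAL" in str(s)]
    if critical:
        return False
    imodulondb_export(model, path=path, tfcomplex_to_gene=tfcomplex_to_gene)
    return True
```

A call would look like `check_then_export(ica_data, path="site")`, where `ica_data` stands for an IcaData object you have already built.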
- pymodulon.imodulondb.imdb_dataset_table(model)
Converts the model's imodulondb_table into dataset metadata for the gray box on the left side of the dataset page.
- pymodulon.imodulondb.imdb_iM_table(imodulon_table, cat_order=None)
Reformats the iModulon table according
- pymodulon.imodulondb.imdb_gene_presence(model)
Generates the two versions of the gene presence file, one as a binary matrix and one as a DataFrame.
- Parameters
model (IcaData) – An IcaData object
- Returns
mbin (pandas.DataFrame) – Binarized M matrix
mbin_list (pandas.DataFrame) – Table mapping genes to iModulons
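The relationship between the two outputs can be illustrated without pandas. The sketch below assumes that a gene belongs to an iModulon when the absolute value of its weight exceeds that iModulon's threshold; that rule, the nested-dict layout, and the name `gene_presence` are assumptions made for illustration, not the library's implementation.

```python
def gene_presence(m, thresholds):
    """Build a 0/1 matrix and a membership list from gene weights.

    m: {gene: {imodulon: weight}}, thresholds: {imodulon: cutoff}.
    Hypothetical stand-in for the two DataFrames returned by the real function.
    """
    # Binarize: 1 where |weight| is above the iModulon's threshold, else 0.
    mbin = {
        gene: {im: int(abs(w) > thresholds[im]) for im, w in row.items()}
        for gene, row in m.items()
    }
    # Long-format list of (iModulon, gene) pairs for the entries equal to 1.
    mbin_list = [
        (im, gene)
        for gene, row in mbin.items()
        for im, present in row.items()
        if present
    ]
    return mbin, mbin_list


weights = {
    "geneA": {"iM1": 0.30, "iM2": -0.02},
    "geneB": {"iM1": 0.01, "iM2": -0.25},
}
mbin, mbin_list = gene_presence(weights, {"iM1": 0.1, "iM2": 0.1})
print(mbin_list)
```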
- pymodulon.imodulondb.imodulondb_main_site_files(model, path_prefix='.', rewrite_annotations=True, cat_order=None)
Generates all parts of the site that do not require large iteration loops.
- Parameters
model (IcaData) – IcaData object
path_prefix (str, optional) – Main folder for iModulonDB files (default = ".")
rewrite_annotations (bool, optional) – Set to False if the gene_table and trn are unchanged (default = True)
cat_order (list, optional) – List of categories in data.imodulon_table.category, ordered as you want them to appear on the dataset page (default = None)
- Returns
main_folder – Dataset folder, for use as the path_prefix in imdb_generate_im_files()
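The hand-off of main_folder to the per-iModulon step might be wired together as in the sketch below. The function signatures are the ones listed on this page; the wrapper `build_site_stepwise` is a hypothetical name, and passing main_folder to imdb_generate_gene_files as well is an assumption, since this page only documents it as the path_prefix for imdb_generate_im_files().

```python
def build_site_stepwise(model, path=".", cat_order=None, tfcomplex_to_gene=None):
    """Run the site-generation steps one at a time and return the dataset folder.

    Hypothetical wrapper for illustration only.
    """
    # Imported lazily so the sketch can be loaded without pymodulon installed.
    from pymodulon.imodulondb import (
        imdb_generate_gene_files,
        imdb_generate_im_files,
        imodulondb_main_site_files,
    )

    # Step 1: the files that do not need a loop over iModulons or genes.
    main_folder = imodulondb_main_site_files(
        model, path_prefix=path, cat_order=cat_order
    )
    # Step 2: per-iModulon files, written inside the dataset folder.
    imdb_generate_im_files(
        model, path_prefix=main_folder, tfcomplex_to_gene=tfcomplex_to_gene
    )
    # Step 3 (assumption): per-gene files use the same dataset folder.
    imdb_generate_gene_files(model, path_prefix=main_folder)
    return main_folder
```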
- pymodulon.imodulondb.imdb_generate_im_files(model, path_prefix='.', gene_scatter_x='start', tfcomplex_to_gene=None)
Generates all files for all iModulons in data.
- Parameters
model (IcaData) – IcaData object
path_prefix (str, optional) – Dataset folder in which to store the files (default = ".")
gene_scatter_x (str) – Column from the gene table that specifies what to use on the x-axis of the gene scatter plot (default = "start")
tfcomplex_to_gene (dict, optional) – Dictionary mapping complex TRN entries to matching gene names in the gene table (e.g. {"FlhDC": "flhD"})
- pymodulon.imodulondb.imdb_generate_gene_files(model, path_prefix='.')
Generates all files for all genes in the IcaData object.
- pymodulon.imodulondb.parse_tf_string(model, tf_str, verbose=False)
Returns a list of relevant TFs from a string. Will ignore TFs not in the TRN file. iModulonDB helper function.
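The filtering described here can be sketched in a few lines. The example assumes regulator strings join TF names with "+" and "/", and it takes the set of TRN regulators as a plain Python set; `relevant_tfs` is a hypothetical name and this is not the library's own parser.

```python
import re


def relevant_tfs(tf_str, trn_regulators):
    """Split a regulator string and keep only names present in the TRN.

    Hypothetical helper for illustration; assumes "+" and "/" are separators.
    """
    if not tf_str:
        return []
    names = [name.strip() for name in re.split(r"[+/]", tf_str)]
    # Drop empty fragments and anything the TRN does not know about.
    return [name for name in names if name and name in trn_regulators]


kept = relevant_tfs("Fur+Fnr/NotInTrn", {"Fur", "Fnr"})
print(kept)
```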
- pymodulon.imodulondb.imdb_gene_table_df(model, k)
Creates the gene table dataframe for iModulonDB.
- Parameters
model (IcaData) – IcaData object
k (int or str) – iModulon name
- Returns
res – DataFrame of the gene table that is compatible with iModulonDB
- pymodulon.imodulondb._component_DF(model, k, tfs=None)
Helper function for imdb_gene_hist_df.
- pymodulon.imodulondb._tf_combo_string(row)
Creates a formatted string for the histogram legends. Helper function for imdb_gene_hist_df.
- pymodulon.imodulondb._sort_tf_strings(tfs, unique_elts)
Sorts TF strings for the legend of the histogram. Helper function for imdb_gene_hist_df.
- pymodulon.imodulondb.imdb_gene_hist_df(model, k, bins=20, tol=0.001)
Creates the gene histogram for an iModulon.
- Returns
gene_hist_table – A dataframe for producing the histogram that is compatible with iModulonDB
- pymodulon.imodulondb._gene_color_dict(model)
Helper function to match genes to colors based on COG. Used by imdb_gene_scatter_df.
- pymodulon.imodulondb.imdb_gene_scatter_df(model, k, gene_scatter_x='start')
Generates a dataframe for the gene scatter plot in iModulonDB.
- pymodulon.imodulondb.generate_n_replicates_column(model)
Generates the "n_replicates" column of the sample_table for iModulonDB.
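One way to think about an n_replicates column is as a group size attached to every sample. The sketch below assumes replicates are samples that share the same project and condition labels; that grouping rule, the dict layout, and the name `n_replicates` are assumptions for illustration, not the library's code.

```python
from collections import Counter


def n_replicates(samples):
    """Map each sample to the number of samples in its replicate group.

    samples: {sample_id: (project, condition)}. Hypothetical helper.
    """
    # Count how many samples carry each (project, condition) pair.
    group_sizes = Counter(samples.values())
    return {sample_id: group_sizes[key] for sample_id, key in samples.items()}


counts = n_replicates({
    "s1": ("proj", "control"),
    "s2": ("proj", "control"),
    "s3": ("proj", "heat"),
})
print(counts)
```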
- pymodulon.imodulondb.imdb_activity_bar_df(model, k)
Generates a dataframe for the activity bar graph of iModulon k.
- pymodulon.imodulondb._parse_regulon_string(model, s)
The Bacillus microarray dataset uses [] to create unusually complicated TF strings. This function parses those, as a helper to _get_reg_genes for imdb_regulon_venn_df.
- pymodulon.imodulondb._get_reg_genes(model, tf)
Finds the set of genes regulated by the boolean combination of regulators in a TF string.
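A boolean combination of regulators maps naturally onto set operations. The sketch below assumes "+" means AND (intersection) and "/" means OR (union), and takes regulons as a plain dict of gene sets; `regulated_genes` is a hypothetical name. Evaluating AND before OR in a mixed string is this sketch's own choice, and the bracketed strings mentioned for the Bacillus dataset are not handled.

```python
def regulated_genes(tf_str, regulons):
    """Evaluate a TF string against {tf: set_of_genes}.

    Hypothetical helper: "/" is treated as OR, "+" as AND.
    """
    result = set()
    for or_term in tf_str.split("/"):
        # Genes regulated by every TF in this "+"-joined term.
        gene_sets = [regulons.get(tf.strip(), set()) for tf in or_term.split("+")]
        result |= set.intersection(*gene_sets)
    return result


regulons = {"A": {"g1", "g2"}, "B": {"g2", "g3"}}
both = regulated_genes("A+B", regulons)
either = regulated_genes("A/B", regulons)
print(sorted(both), sorted(either))
```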
- pymodulon.imodulondb.imdb_regulon_venn_df(model, k)
Generates a dataframe for the regulon venn diagram of iModulon k. Returns None if there is no diagram to draw.
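The quantities behind a two-circle Venn diagram are three set sizes. The sketch below is a simplified stand-in: it assumes "no diagram to draw" means the iModulon has no regulon genes to compare against, and the name `venn_counts` and the returned keys are hypothetical rather than the columns the real dataframe uses.

```python
def venn_counts(imodulon_genes, regulon_genes):
    """Return the three region sizes, or None when there is no regulon.

    Hypothetical helper for illustration only.
    """
    if not regulon_genes:
        return None
    im, reg = set(imodulon_genes), set(regulon_genes)
    return {
        "imodulon_only": len(im - reg),
        "overlap": len(im & reg),
        "regulon_only": len(reg - im),
    }


regions = venn_counts({"g1", "g2", "g3"}, {"g2", "g3", "g4", "g5"})
print(regions)
```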
- pymodulon.imodulondb.get_tfs_to_scatter(model, tf_string, tfcomplex_to_genename=None, verbose=False)
- Returns
res – List of gene loci
- pymodulon.imodulondb.imdb_regulon_scatter_df(model, k, tfcomplex_to_genename=None)
- Returns
res – A dataframe for producing the regulon scatter plots in iModulonDB
- pymodulon.imodulondb.tf_with_links_brackets(model, tf_str)
Adds links to the regulator string. Used with the complicated bracket system in Bacillus Microarray.
- pymodulon.imodulondb.imdb_imodulon_basics_df(model, k, reg_venn, reg_scatter)
Generates a dataframe for the metadata of iModulon k.
- Returns
res – A dataframe of metadata for iModulon k in iModulonDB
- pymodulon.imodulondb.make_im_directory(model, k, path_prefix='.', gene_scatter_x='start', tfcomplex_to_genename=None)
- Parameters
model (IcaData) – IcaData object
path_prefix (str, optional) – Path to the dataset folder. This function creates an 'iModulon_files/k/' subdirectory there to store everything. (default = ".")
gene_scatter_x (str) – Passed to imdb_gene_scatter_df() to indicate the x-axis type of that plot (default = "start")
tfcomplex_to_genename (dict, optional) – Dictionary mapping complex TRN entries to matching gene names in the gene table (e.g. {"FlhDC": "flhD"})
- Returns
None
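The 'iModulon_files/k/' layout described for path_prefix can be reproduced with the standard library. This sketch only creates the folder; `im_directory` is a hypothetical name, "Fur-1" is a made-up iModulon name, and the files the real function writes into the folder are not modeled.

```python
import os
import tempfile


def im_directory(path_prefix, k):
    """Create and return the per-iModulon folder under the dataset folder.

    Hypothetical helper for illustration only.
    """
    folder = os.path.join(path_prefix, "iModulon_files", str(k))
    # exist_ok=True makes repeated calls safe if the folder already exists.
    os.makedirs(folder, exist_ok=True)
    return folder


with tempfile.TemporaryDirectory() as dataset_folder:
    folder = im_directory(dataset_folder, "Fur-1")
    created = os.path.isdir(folder)
print(created)
```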
- pymodulon.imodulondb.imdb_gene_im_table_df(model, g, im_table, m_bin)
Generates a dataframe for the iModulon table of gene g.
- Returns
perGene_table – A dataframe for the iModulon table of gene g in iModulonDB