1. Introduction to the IcaData object

The pymodulon.core.IcaData object is at the core of the PyModulon package. This object holds all of the data related to the expression dataset, the iModulons, and their annotations.

[1]:
from pymodulon.core import IcaData
from pymodulon import example_data
from pymodulon.io import save_to_json, load_json_model

1.1. Minimum requirements

The IcaData object requires only two matrices, which are the results of performing Independent Component Analysis (ICA) on an expression dataset. For more information about ICA, see the iModulonDB about page.

  • M: The iModulon matrix contains the Independent Components (ICs) themselves. Each column represents an IC, and each row contains the weights of one gene across all ICs.

[2]:
M = example_data.M
M.head()
[2]:
AllR/AraC/FucR ArcA-1 ArcA-2 ArgR AtoC BW25113 Cbl+CysB CdaR CecR Copper ... thrA-KO translation uncharacterized-1 uncharacterized-2 uncharacterized-3 uncharacterized-4 uncharacterized-5 uncharacterized-6 ydcI-KO yheO-KO
b0002 -0.010888 -0.007717 -0.008502 -0.012186 -0.061489 -0.005599 -0.007377 -0.000795 0.004331 0.001845 ... 0.479209 0.035685 0.024778 -0.010660 -0.002123 -0.004416 -0.005428 -0.009219 -0.004345 -0.007838
b0003 -0.011467 0.003042 0.011448 -0.003685 -0.006106 0.006680 -0.043512 0.005107 0.000474 0.007650 ... 0.011420 0.040811 0.003324 -0.008424 -0.004415 -0.016126 -0.016476 -0.003497 -0.003583 0.003381
b0004 -0.008693 0.003944 0.012347 -0.008104 0.000585 0.003245 -0.041283 0.006390 0.004260 0.007109 ... 0.011339 0.036244 0.003710 -0.005212 0.000700 -0.011096 -0.006140 -0.003155 -0.008418 0.000129
b0005 0.006565 -0.001099 0.009415 -0.008507 0.005399 0.014748 -0.009249 -0.003058 -0.012649 -0.002370 ... -0.015324 0.028972 0.023969 0.000150 0.018497 0.009428 0.001255 -0.006890 -0.028069 0.021534
b0006 -0.006011 0.009889 -0.005555 -0.000152 -0.002454 0.009678 -0.003456 0.002160 -0.001924 -0.000628 ... -0.005661 0.000700 -0.002538 -0.006103 -0.002506 -0.005077 -0.004616 -0.003585 0.001607 0.001285

5 rows × 92 columns

  • A: The Activity matrix contains the condition-specific activities. Each column represents a sample, and each row contains the activity of one iModulon across all samples.

[3]:
A = example_data.A
A.head()
[3]:
control__wt_glc__1 control__wt_glc__2 fur__wt_dpd__1 fur__wt_dpd__2 fur__wt_fe__1 fur__wt_fe__2 fur__delfur_dpd__1 fur__delfur_dpd__2 fur__delfur_fe2__1 fur__delfur_fe2__2 ... efeU__menFentC_ale29__1 efeU__menFentC_ale29__2 efeU__menFentC_ale30__1 efeU__menFentC_ale30__2 efeU__menFentCubiC_ale36__1 efeU__menFentCubiC_ale36__2 efeU__menFentCubiC_ale37__1 efeU__menFentCubiC_ale37__2 efeU__menFentCubiC_ale38__1 efeU__menFentCubiC_ale38__2
AllR/AraC/FucR 0.378690 -0.378690 2.457678 2.248678 -0.327344 -0.259164 1.777251 2.690655 0.656937 0.319583 ... 1.041336 2.203940 3.698292 0.856998 1.557323 0.337806 0.943742 1.736640 0.499461 1.581476
ArcA-1 -0.440210 0.440210 -5.367360 -5.684301 0.131174 0.348843 -4.436389 -4.770469 -1.799113 -1.474222 ... -6.471714 -6.549861 -3.109145 -2.716183 -2.531192 -1.461022 -0.408849 -0.210397 -5.700321 -6.237836
ArcA-2 0.762258 -0.762258 2.619623 2.900696 3.120724 2.743634 1.989803 1.555835 1.782500 1.530811 ... 2.789653 3.959650 1.585147 0.811182 0.300414 2.537535 1.061408 2.634524 0.125513 1.178747
ArgR -0.289630 0.289630 -10.085719 -13.187916 2.371129 1.861918 -8.708701 -7.881588 -1.237027 -1.235604 ... -11.263744 -10.366813 -0.289217 0.389228 -5.142768 -5.014526 -3.648777 -4.125952 -4.286326 -5.475940
AtoC 0.250770 -0.250770 1.844767 2.055052 0.299345 0.425502 1.801217 1.790987 0.921254 1.410026 ... 3.821909 3.306573 2.652394 1.910173 0.927772 1.327549 1.846321 0.909667 2.064662 2.371405

5 rows × 278 columns
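The two matrices share the iModulon axis: the column labels of M are the row labels of A. The sketch below illustrates that consistency check on a few labels copied from the tables above. It is written in plain Python so that it runs without PyModulon, and the helper name `shared_imodulons` is hypothetical, not part of the package.

```python
# Labels standing in for M.columns and A.index (copied from the tables above).
m_columns = ["AllR/AraC/FucR", "ArcA-1", "ArcA-2"]
a_rows = ["AllR/AraC/FucR", "ArcA-1", "ArcA-2"]


def shared_imodulons(m_cols, a_idx):
    """Hypothetical helper: return the iModulon labels, or raise if M and A disagree."""
    if list(m_cols) != list(a_idx):
        differing = sorted(set(m_cols) ^ set(a_idx))
        raise ValueError(
            f"M columns and A rows differ in content or order: {differing}"
        )
    return list(m_cols)


imodulons = shared_imodulons(m_columns, a_rows)
print(imodulons)
```

With real DataFrames, the same helper could be called as `shared_imodulons(M.columns, A.index)`.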

To create the IcaData object, the M and A datasets can be entered either as filenames or as Pandas DataFrames.

[4]:
ica_data = IcaData(M,A)
ica_data
[4]:
<pymodulon.core.IcaData at 0x7fc18620f9d0>

Once loaded, the M and A matrices can be accessed directly from the object.

[5]:
ica_data.M.head()
[5]:
AllR/AraC/FucR ArcA-1 ArcA-2 ArgR AtoC BW25113 Cbl+CysB CdaR CecR Copper ... thrA-KO translation uncharacterized-1 uncharacterized-2 uncharacterized-3 uncharacterized-4 uncharacterized-5 uncharacterized-6 ydcI-KO yheO-KO
b0002 -0.010888 -0.007717 -0.008502 -0.012186 -0.061489 -0.005599 -0.007377 -0.000795 0.004331 0.001845 ... 0.479209 0.035685 0.024778 -0.010660 -0.002123 -0.004416 -0.005428 -0.009219 -0.004345 -0.007838
b0003 -0.011467 0.003042 0.011448 -0.003685 -0.006106 0.006680 -0.043512 0.005107 0.000474 0.007650 ... 0.011420 0.040811 0.003324 -0.008424 -0.004415 -0.016126 -0.016476 -0.003497 -0.003583 0.003381
b0004 -0.008693 0.003944 0.012347 -0.008104 0.000585 0.003245 -0.041283 0.006390 0.004260 0.007109 ... 0.011339 0.036244 0.003710 -0.005212 0.000700 -0.011096 -0.006140 -0.003155 -0.008418 0.000129
b0005 0.006565 -0.001099 0.009415 -0.008507 0.005399 0.014748 -0.009249 -0.003058 -0.012649 -0.002370 ... -0.015324 0.028972 0.023969 0.000150 0.018497 0.009428 0.001255 -0.006890 -0.028069 0.021534
b0006 -0.006011 0.009889 -0.005555 -0.000152 -0.002454 0.009678 -0.003456 0.002160 -0.001924 -0.000628 ... -0.005661 0.000700 -0.002538 -0.006103 -0.002506 -0.005077 -0.004616 -0.003585 0.001607 0.001285

5 rows × 92 columns

[6]:
ica_data.A.head()
[6]:
control__wt_glc__1 control__wt_glc__2 fur__wt_dpd__1 fur__wt_dpd__2 fur__wt_fe__1 fur__wt_fe__2 fur__delfur_dpd__1 fur__delfur_dpd__2 fur__delfur_fe2__1 fur__delfur_fe2__2 ... efeU__menFentC_ale29__1 efeU__menFentC_ale29__2 efeU__menFentC_ale30__1 efeU__menFentC_ale30__2 efeU__menFentCubiC_ale36__1 efeU__menFentCubiC_ale36__2 efeU__menFentCubiC_ale37__1 efeU__menFentCubiC_ale37__2 efeU__menFentCubiC_ale38__1 efeU__menFentCubiC_ale38__2
AllR/AraC/FucR 0.378690 -0.378690 2.457678 2.248678 -0.327344 -0.259164 1.777251 2.690655 0.656937 0.319583 ... 1.041336 2.203940 3.698292 0.856998 1.557323 0.337806 0.943742 1.736640 0.499461 1.581476
ArcA-1 -0.440210 0.440210 -5.367360 -5.684301 0.131174 0.348843 -4.436389 -4.770469 -1.799113 -1.474222 ... -6.471714 -6.549861 -3.109145 -2.716183 -2.531192 -1.461022 -0.408849 -0.210397 -5.700321 -6.237836
ArcA-2 0.762258 -0.762258 2.619623 2.900696 3.120724 2.743634 1.989803 1.555835 1.782500 1.530811 ... 2.789653 3.959650 1.585147 0.811182 0.300414 2.537535 1.061408 2.634524 0.125513 1.178747
ArgR -0.289630 0.289630 -10.085719 -13.187916 2.371129 1.861918 -8.708701 -7.881588 -1.237027 -1.235604 ... -11.263744 -10.366813 -0.289217 0.389228 -5.142768 -5.014526 -3.648777 -4.125952 -4.286326 -5.475940
AtoC 0.250770 -0.250770 1.844767 2.055052 0.299345 0.425502 1.801217 1.790987 0.921254 1.410026 ... 3.821909 3.306573 2.652394 1.910173 0.927772 1.327549 1.846321 0.909667 2.064662 2.371405

5 rows × 278 columns

If the M and A datasets have row or column names, these will be saved as the sample/gene/iModulon names. Since genes are often re-named when characterized, the locus tag is the preferred identifier.

[7]:
print('Gene names:',ica_data.gene_names[:5])
print('Sample names:',ica_data.sample_names[:5])
print('iModulon names:',ica_data.imodulon_names[:5])
Gene names: ['b0002', 'b0003', 'b0004', 'b0005', 'b0006']
Sample names: ['control__wt_glc__1', 'control__wt_glc__2', 'fur__wt_dpd__1', 'fur__wt_dpd__2', 'fur__wt_fe__1']
iModulon names: ['AllR/AraC/FucR', 'ArcA-1', 'ArcA-2', 'ArgR', 'AtoC']

1.2. Adding the Expression Matrix

The X matrix contains eXpression data and is primarily used for plotting functions. The column names of the X matrix are the sample names, and the row names are the gene identifiers.

[8]:
X = example_data.X
X.head()
[8]:
control__wt_glc__1 control__wt_glc__2 fur__wt_dpd__1 fur__wt_dpd__2 fur__wt_fe__1 fur__wt_fe__2 fur__delfur_dpd__1 fur__delfur_dpd__2 fur__delfur_fe2__1 fur__delfur_fe2__2 ... efeU__menFentC_ale29__1 efeU__menFentC_ale29__2 efeU__menFentC_ale30__1 efeU__menFentC_ale30__2 efeU__menFentCubiC_ale36__1 efeU__menFentCubiC_ale36__2 efeU__menFentCubiC_ale37__1 efeU__menFentCubiC_ale37__2 efeU__menFentCubiC_ale38__1 efeU__menFentCubiC_ale38__2
b0002 -0.061772 0.061772 0.636527 0.819793 -0.003615 -0.289353 -1.092023 -0.777289 0.161343 0.145641 ... -0.797097 -0.791859 0.080114 0.102154 0.608180 0.657673 0.813105 0.854813 0.427986 0.484338
b0003 -0.053742 0.053742 0.954439 1.334385 0.307588 0.128414 -0.872563 -0.277893 0.428542 0.391761 ... -0.309105 -0.352535 -0.155074 -0.077145 0.447030 0.439881 0.554528 0.569030 0.154905 0.294799
b0004 -0.065095 0.065095 -0.202697 0.119195 -0.264995 -0.546017 -1.918349 -1.577736 -0.474815 -0.495312 ... -0.184898 -0.225615 0.019575 0.063986 0.483343 0.452754 0.524828 0.581878 0.293239 0.341040
b0005 0.028802 -0.028802 -0.865171 -0.951179 0.428769 0.123564 -1.660351 -1.531147 0.240353 -0.151132 ... -0.308221 -0.581714 0.018820 0.004040 -1.228763 -1.451750 -0.839203 -0.529349 -0.413336 -0.478682
b0006 0.009087 -0.009087 -0.131039 -0.124079 -0.144870 -0.090152 -0.219917 -0.046648 -0.044537 -0.089204 ... 1.464603 1.415706 1.230831 1.165153 0.447447 0.458852 0.421417 0.408077 1.151066 1.198529

5 rows × 278 columns

[9]:
ica_data.X = X
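The shapes of the three matrices fit together: M is genes × iModulons, A is iModulons × samples, and X is genes × samples. Assuming the usual ICA decomposition, in which X is approximated by the product of M and A, the relationship can be sketched with toy numbers. The values below are made up for illustration and `matmul` is a hypothetical helper.

```python
# Toy matrices as lists of rows: 3 genes x 2 iModulons, and 2 iModulons x 2 samples.
M_toy = [[0.5, 0.0],
         [0.0, 0.2],
         [0.1, 0.1]]
A_toy = [[2.0, -2.0],
         [10.0, 5.0]]


def matmul(left, right):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(l * r for l, r in zip(row, col)) for col in zip(*right)]
            for row in left]


# Reconstructed expression: one row per gene, one column per sample.
X_approx = matmul(M_toy, A_toy)
for gene_row in X_approx:
    print([round(value, 3) for value in gene_row])
```

With pandas DataFrames the same product is written `M @ A`, which also aligns the iModulon labels of the two matrices.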

1.3. Adding annotation tables

You may load in additional data tables with information about your samples, genes, or iModulons.

These tables are initially empty, but can be modified like any Pandas DataFrame.

[10]:
ica_data.gene_table.head()
[10]:
b0002
b0003
b0004
b0005
b0006

Annotation tables contain one sample/gene/iModulon per row, with information about that item in the columns. For example, a gene_table may include the gene function, genomic position, or Cluster of Orthologous Groups (COG) category. See the Creating the Gene Table tutorial for a step-by-step example of how to construct this table. The gene identifiers in the table's index must match the gene names in the M matrix.

[11]:
gene_table = example_data.gene_table
gene_table.head()
[11]:
start end strand gene_name length operon COG accession
b0001 189 255 + thrL 66 thrLABC No COG Annotation NC_000913.3
b0002 336 2799 + thrA 2463 thrLABC Amino acid transport and metabolism NC_000913.3
b0003 2800 3733 + thrB 933 thrLABC Amino acid transport and metabolism NC_000913.3
b0004 3733 5020 + thrC 1287 thrLABC Amino acid transport and metabolism NC_000913.3
b0005 5233 5530 + yaaX 297 yaaX Function unknown NC_000913.3
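Because the gene_table has to line up with the M matrix, it can help to compare the two sets of identifiers before loading the table. The sketch below does this with plain Python sets, using locus tags from the tables above: the M matrix starts at b0002, while the example gene_table also lists b0001.

```python
# Locus tags from the M matrix and from the gene_table shown above.
m_genes = ["b0002", "b0003", "b0004", "b0005", "b0006"]
table_genes = ["b0001", "b0002", "b0003", "b0004", "b0005", "b0006"]

# Genes present in M that have no annotation row.
missing_annotation = sorted(set(m_genes) - set(table_genes))
# Annotated genes that do not appear in M.
extra_annotation = sorted(set(table_genes) - set(m_genes))

print("In M but not in gene_table:", missing_annotation)
print("In gene_table but not in M:", extra_annotation)
```

With real DataFrames, `m_genes` and `table_genes` would come from `M.index` and `gene_table.index`.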

The sample_table contains detailed experimental metadata about each sample. This must be manually created, and can contain information related to the strains or experimental conditions used in the study.

[12]:
sample_table = example_data.sample_table
sample_table.head()
[12]:
Study project condition Replicate # Strain Description Strain Base Media Carbon Source (g/L) Nitrogen Source (g/L) Electron Acceptor ... Growth Rate (1/hr) Evolved Sample Isolate Type Sequencing Machine ALEdb sample Additional Details Biological Replicates Alignment DOI GEO
Sample ID
control__wt_glc__1 Control control wt_glc 1 Escherichia coli K-12 MG1655 MG1655 M9 glucose(2) NH4Cl(1) O2 ... NaN No NaN MiSeq NaN NaN 2 94.33 doi.org/10.1101/080929 GSE65643
control__wt_glc__2 Control control wt_glc 2 Escherichia coli K-12 MG1655 MG1655 M9 glucose(2) NH4Cl(1) O2 ... NaN No NaN MiSeq NaN NaN 2 94.24 doi.org/10.1101/080929 GSE65643
fur__wt_dpd__1 Fur fur wt_dpd 1 Escherichia coli K-12 MG1655 MG1655 M9 glucose(2) NH4Cl(1) O2 ... 0.00 No NaN MiSeq NaN NaN 2 98.04 doi.org/10.1038/ncomms5910 GSE54900
fur__wt_dpd__2 Fur fur wt_dpd 2 Escherichia coli K-12 MG1655 MG1655 M9 glucose(2) NH4Cl(1) O2 ... 0.00 No NaN MiSeq NaN NaN 2 98.30 doi.org/10.1038/ncomms5910 GSE54900
fur__wt_fe__1 Fur fur wt_fe 1 Escherichia coli K-12 MG1655 MG1655 M9 glucose(2) NH4Cl(1) O2 ... 1.06 No NaN MiSeq NaN NaN 2 93.35 doi.org/10.1038/ncomms5910 GSE54900

5 rows × 26 columns

The project and condition columns in the sample_table will be useful for the plotting functions described in the Plotting Functions tutorial.
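The project and condition columns together identify groups of replicate samples. As a minimal sketch of that grouping, the rows below reproduce those two columns from the sample_table output above as plain dictionaries and collect the sample names under each (project, condition) pair.

```python
from collections import defaultdict

# project and condition values copied from the sample_table shown above.
sample_rows = {
    "control__wt_glc__1": {"project": "control", "condition": "wt_glc"},
    "control__wt_glc__2": {"project": "control", "condition": "wt_glc"},
    "fur__wt_dpd__1": {"project": "fur", "condition": "wt_dpd"},
    "fur__wt_dpd__2": {"project": "fur", "condition": "wt_dpd"},
    "fur__wt_fe__1": {"project": "fur", "condition": "wt_fe"},
}

# Collect sample names under each (project, condition) pair.
replicates = defaultdict(list)
for sample_name, row in sample_rows.items():
    replicates[(row["project"], row["condition"])].append(sample_name)

for (project, condition), members in replicates.items():
    print(f"{project}:{condition} -> {len(members)} sample(s)")
```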

The imodulon_table contains information about each iModulon, such as regulator enrichments or iModulon size.

[13]:
imodulon_table = example_data.imodulon_table
imodulon_table.head()
[13]:
regulator f1score pvalue precision recall TP n_genes n_tf Category threshold
name
AllR/AraC/FucR allR/araC/fucR 0.750000 1.190000e-41 1.000000 0.600000 18.0 18 3 Carbon Source Utilization 0.086996
ArcA-1 arcA 0.130952 6.420000e-20 0.660000 0.072687 33.0 50 1 Energy Metabolism 0.058051
ArcA-2 arcA 0.087683 1.150000e-16 0.840000 0.046256 21.0 25 1 Energy Metabolism 0.081113
ArgR argR 0.177778 6.030000e-18 0.923077 0.098361 12.0 13 1 Amino Acid and Nucleotide Biosynthesis 0.080441
AtoC atoC 0.800000 1.520000e-12 0.666667 1.000000 4.0 6 1 Miscellaneous Metabolism 0.105756

The tables can be loaded into the IcaData object either as filenames or as Pandas DataFrames.

[14]:
ica_data.gene_table = gene_table
ica_data.sample_table = sample_table
ica_data.imodulon_table = imodulon_table
[15]:
ica_data.sample_table.head()
[15]:
Study project condition Replicate # Strain Description Strain Base Media Carbon Source (g/L) Nitrogen Source (g/L) Electron Acceptor ... Growth Rate (1/hr) Evolved Sample Isolate Type Sequencing Machine ALEdb sample Additional Details Biological Replicates Alignment DOI GEO
control__wt_glc__1 Control control wt_glc 1 Escherichia coli K-12 MG1655 MG1655 M9 glucose(2) NH4Cl(1) O2 ... NaN No NaN MiSeq NaN NaN 2 94.33 doi.org/10.1101/080929 GSE65643
control__wt_glc__2 Control control wt_glc 2 Escherichia coli K-12 MG1655 MG1655 M9 glucose(2) NH4Cl(1) O2 ... NaN No NaN MiSeq NaN NaN 2 94.24 doi.org/10.1101/080929 GSE65643
fur__wt_dpd__1 Fur fur wt_dpd 1 Escherichia coli K-12 MG1655 MG1655 M9 glucose(2) NH4Cl(1) O2 ... 0.00 No NaN MiSeq NaN NaN 2 98.04 doi.org/10.1038/ncomms5910 GSE54900
fur__wt_dpd__2 Fur fur wt_dpd 2 Escherichia coli K-12 MG1655 MG1655 M9 glucose(2) NH4Cl(1) O2 ... 0.00 No NaN MiSeq NaN NaN 2 98.30 doi.org/10.1038/ncomms5910 GSE54900
fur__wt_fe__1 Fur fur wt_fe 1 Escherichia coli K-12 MG1655 MG1655 M9 glucose(2) NH4Cl(1) O2 ... 1.06 No NaN MiSeq NaN NaN 2 93.35 doi.org/10.1038/ncomms5910 GSE54900

5 rows × 26 columns

[16]:
ica_data.gene_table.head()
[16]:
start end strand gene_name length operon COG accession
b0002 336 2799 + thrA 2463 thrLABC Amino acid transport and metabolism NC_000913.3
b0003 2800 3733 + thrB 933 thrLABC Amino acid transport and metabolism NC_000913.3
b0004 3733 5020 + thrC 1287 thrLABC Amino acid transport and metabolism NC_000913.3
b0005 5233 5530 + yaaX 297 yaaX Function unknown NC_000913.3
b0006 5682 6459 - yaaA 777 yaaA Function unknown NC_000913.3
[17]:
ica_data.imodulon_table.head()
[17]:
regulator f1score pvalue precision recall TP n_genes n_tf Category threshold
AllR/AraC/FucR allR/araC/fucR 0.750000 1.190000e-41 1.000000 0.600000 18.0 18 3 Carbon Source Utilization 0.086996
ArcA-1 arcA 0.130952 6.420000e-20 0.660000 0.072687 33.0 50 1 Energy Metabolism 0.058051
ArcA-2 arcA 0.087683 1.150000e-16 0.840000 0.046256 21.0 25 1 Energy Metabolism 0.081113
ArgR argR 0.177778 6.030000e-18 0.923077 0.098361 12.0 13 1 Amino Acid and Nucleotide Biosynthesis 0.080441
AtoC atoC 0.800000 1.520000e-12 0.666667 1.000000 4.0 6 1 Miscellaneous Metabolism 0.105756

1.4. Converting between gene names and locus tags

If the gene_table contains a gene_name column, the num2name and name2num methods convert between locus tags and gene names.

[18]:
ica_data.num2name('b0002')
[18]:
'thrA'
[19]:
ica_data.name2num('thrA')
[19]:
'b0002'
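Conceptually, the two conversions are a lookup table and its inverse. The sketch below builds both from a few locus tag and gene name pairs taken from the gene_table above. It is an illustration of the idea in plain Python, not PyModulon's implementation.

```python
# Locus tag -> gene name pairs, as listed in the gene_table above.
num_to_name = {"b0002": "thrA", "b0003": "thrB", "b0004": "thrC", "b0005": "yaaX"}

# Invert the mapping to look up locus tags by gene name.
name_to_num = {name: tag for tag, name in num_to_name.items()}

# The inverse is only well defined if no two locus tags share a gene name.
names_are_unique = len(name_to_num) == len(num_to_name)

print(num_to_name["b0002"])
print(name_to_num["thrA"])
print("Gene names unique:", names_are_unique)
```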

1.5. Adding the TRN

Adding the transcriptional regulatory network (TRN) to the IcaData object enables automated calculation of regulon enrichments. Each row of the TRN file represents a regulatory interaction. The TRN must contain the following columns:

  • regulator: Name of the regulator (/ or + characters will be converted to ;)

  • gene_id: Locus tag of the target gene (must be in ica_data.gene_names)

The following columns are optional, but are helpful to have:

  • regulator_id: Locus tag of the regulator

  • gene_name: Name of the gene (can be filled in automatically from gene_id using num2name)

  • direction: Direction of regulation (+ for activation, - for repression, ? or NaN for unknown)

  • evidence: Evidence of regulation (e.g. ChIP-exo, qRT-PCR, SELEX, Motif search)

  • PMID: Reference for regulatory interaction
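The two required columns and the regulator naming rule can be sketched in plain Python. The functions `normalize_regulator` and `check_trn_rows` are hypothetical helpers written for this illustration. In particular, raising an error on an unknown gene_id is an assumption and may not match what PyModulon does.

```python
import re

REQUIRED_COLUMNS = {"regulator", "gene_id"}


def normalize_regulator(name):
    """Convert '/' and '+' in a regulator name to ';', as described above."""
    return re.sub(r"[/+]", ";", name)


def check_trn_rows(rows, gene_names):
    """Hypothetical validator: return cleaned rows, or raise on a bad row."""
    cleaned = []
    for row in rows:
        missing = REQUIRED_COLUMNS - set(row)
        if missing:
            raise ValueError(f"TRN row lacks required columns: {sorted(missing)}")
        if row["gene_id"] not in gene_names:
            raise ValueError(f"Unknown gene_id: {row['gene_id']}")
        cleaned.append({**row, "regulator": normalize_regulator(row["regulator"])})
    return cleaned


# Two illustrative interactions; optional columns may be omitted.
rows = [
    {"regulator": "FMN", "gene_id": "b3041", "effect": "-"},
    {"regulator": "flhD/flhC", "gene_id": "b2241"},
]
cleaned = check_trn_rows(rows, gene_names={"b3041", "b2241"})
print([row["regulator"] for row in cleaned])
```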

[20]:
trn = example_data.trn
trn.head()
[20]:
regulator gene_id effect
0 FMN b3041 -
1 L-tryptophan b3708 +
2 L-tryptophan b3709 +
3 TPP b0066 -
4 TPP b0067 -

Again, this table can be passed in as either a filename or a Pandas DataFrame.

[21]:
ica_data.trn = trn
ica_data.trn.head()
[21]:
regulator gene_id effect
0 FMN b3041 -
1 L-tryptophan b3708 +
2 L-tryptophan b3709 +
3 TPP b0066 -
4 TPP b0067 -

1.6. Inspecting iModulons

The view_imodulon method lists every gene in an iModulon together with its gene weight. Most of the remaining information is retrieved from the gene_table, but the regulator column comes from the TRN.

[22]:
ica_data.view_imodulon('GlpR')
[22]:
gene_weight start end strand gene_name length operon COG accession regulator
b2239 0.211384 2349934 2351011 - glpQ 1077 glpTQ Energy production and conversion NC_000913.3 crp,fis,fnr,glpR,ihf,nac,rpoD
b2240 0.306134 2351015 2352374 - glpT 1359 glpTQ Carbohydrate transport and metabolism NC_000913.3 crp,fis,fnr,glpR,ihf,nac,rpoD
b2241 0.375662 2352646 2354275 + glpA 1629 glpABC Energy production and conversion NC_000913.3 arcA,crp,fis,flhD;flhC,fnr,glpR,rpoD
b2242 0.328961 2354264 2355524 + glpB 1260 glpABC Amino acid transport and metabolism NC_000913.3 arcA,crp,fis,flhD;flhC,fnr,glpR,rpoD
b2243 0.315752 2355520 2356711 + glpC 1191 glpABC Energy production and conversion NC_000913.3 arcA,crp,fis,flhD;flhC,fnr,glpR,rpoD
b3426 0.350034 3562012 3563518 + glpD 1506 glpD Energy production and conversion NC_000913.3 arcA,crp,glpR,rpoD,yieP
b3926 0.290235 4115713 4117222 - glpK 1509 glpFKX Energy production and conversion NC_000913.3 crp,glpR,rpoD
b3927 0.312307 4117244 4118090 - glpF 846 glpFKX Carbohydrate transport and metabolism NC_000913.3 crp,glpR,rpoD

1.7. Searching for genes in iModulons

To find which iModulons contain a specific gene, use the imodulons_with method.

[23]:
ica_data.imodulons_with('b2239')
[23]:
['GlpR']

If the gene_table contains a gene_name column, this function will work with either the locus tag or the gene name.

[24]:
ica_data.imodulons_with('carA')
[24]:
['PurR-2']
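One way to picture this lookup is as a threshold test on the M matrix. The sketch below assumes that a gene belongs to an iModulon when the absolute value of its weight exceeds that iModulon's threshold (the imodulon_table above has a threshold column). The gene names, iModulon names, weights, and thresholds are all made up for illustration, and `imodulons_with_gene` is a hypothetical helper.

```python
# Toy gene weights: gene -> {iModulon: weight}. Values are invented.
weights = {
    "geneA": {"iM-1": 0.30, "iM-2": -0.01},
    "geneB": {"iM-1": 0.02, "iM-2": -0.25},
}
# Toy per-iModulon thresholds, also invented.
thresholds = {"iM-1": 0.10, "iM-2": 0.08}


def imodulons_with_gene(gene):
    """Return the iModulons whose threshold is exceeded by |weight| of the gene."""
    return [
        imodulon
        for imodulon, weight in weights[gene].items()
        if abs(weight) > thresholds[imodulon]
    ]


print(imodulons_with_gene("geneA"))
print(imodulons_with_gene("geneB"))
```

Using the absolute value means that strongly negative weights count as membership too.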

1.8. Renaming iModulons

Individual iModulons can be renamed using the rename_imodulons method.

[25]:
print('Original iModulon Names:', ica_data.imodulon_names[:5])
ica_data.rename_imodulons({'AllR/AraC/FucR':'AllR'})
print('Renamed iModulon Names:', ica_data.imodulon_names[:5])
Original iModulon Names: ['AllR/AraC/FucR', 'ArcA-1', 'ArcA-2', 'ArgR', 'AtoC']
Renamed iModulon Names: ['AllR', 'ArcA-1', 'ArcA-2', 'ArgR', 'AtoC']

These changes are reflected throughout the IcaData object.

[26]:
print('M matrix columns:', ica_data.M.columns[:5])
M matrix columns: Index(['AllR', 'ArcA-1', 'ArcA-2', 'ArgR', 'AtoC'], dtype='object')
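Keeping names consistent means applying the same mapping to every place an iModulon label appears, such as the columns of M and the rows of A. The sketch below shows that idea on plain lists. The helper `apply_rename` is hypothetical, and rejecting duplicate names is an assumption made here for safety, not behavior documented above.

```python
rename_map = {"AllR/AraC/FucR": "AllR"}

# Labels standing in for M.columns and A.index.
m_columns = ["AllR/AraC/FucR", "ArcA-1", "ArcA-2"]
a_rows = ["AllR/AraC/FucR", "ArcA-1", "ArcA-2"]


def apply_rename(labels, mapping):
    """Hypothetical helper: rename labels, refusing to create duplicates."""
    renamed = [mapping.get(label, label) for label in labels]
    if len(set(renamed)) != len(renamed):
        raise ValueError("Renaming would create duplicate iModulon names")
    return renamed


# Apply the same mapping to both axes so they stay in sync.
m_columns = apply_rename(m_columns, rename_map)
a_rows = apply_rename(a_rows, rename_map)

print(m_columns)
print(m_columns == a_rows)
```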

iModulon names can be updated all at once as well.

[27]:
print('Original iModulon Names:', ica_data.imodulon_names[:5])

new_names = ['AllR/AraC/FucR']+ica_data.imodulon_names[1:]

print('New iModulon names:', new_names[:5])

ica_data.imodulon_names = new_names

print('Renamed iModulon Names:', ica_data.imodulon_names[:5])
Original iModulon Names: ['AllR', 'ArcA-1', 'ArcA-2', 'ArgR', 'AtoC']
New iModulon names: ['AllR/AraC/FucR', 'ArcA-1', 'ArcA-2', 'ArgR', 'AtoC']
Renamed iModulon Names: ['AllR/AraC/FucR', 'ArcA-1', 'ArcA-2', 'ArgR', 'AtoC']

1.9. Copying IcaData objects

The copy method creates a new IcaData object identical to the old one.
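A copy is most useful when it can be modified without touching the original. The sketch below illustrates that property with `copy.deepcopy` on a plain dictionary. Whether the IcaData copy method performs a deep copy is an assumption here and is not stated above.

```python
import copy

# A stand-in for an object that holds a list of iModulon names.
original = {"imodulon_names": ["AllR/AraC/FucR", "ArcA-1"]}

# A deep copy duplicates nested containers as well as the outer object.
duplicate = copy.deepcopy(original)
duplicate["imodulon_names"][0] = "AllR"

print(original["imodulon_names"][0])
print(duplicate["imodulon_names"][0])
```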

[28]:
ica_data.copy()
[28]:
<pymodulon.core.IcaData at 0x7fc186195650>

1.10. Saving and Loading IcaData Objects

To facilitate data sharing, you can save IcaData objects as JSON files that can be easily reloaded.

[29]:
from pymodulon.io import save_to_json, load_json_model
from os import path
[30]:
filepath = path.join('tmp','ecoli_data.json')
save_to_json(ica_data,filepath)
[31]:
ica_data = load_json_model(filepath)
[32]:
ica_data.imodulon_table.head()
[32]:
regulator f1score pvalue precision recall TP n_genes n_tf Category threshold
AllR/AraC/FucR allR/araC/fucR 0.750000 1.190000e-41 1.000000 0.600000 18.0 18 3 Carbon Source Utilization 0.086996
ArcA-1 arcA 0.130952 6.420000e-20 0.660000 0.072687 33.0 50 1 Energy Metabolism 0.058051
ArcA-2 arcA 0.087683 1.150000e-16 0.840000 0.046256 21.0 25 1 Energy Metabolism 0.081113
ArgR argR 0.177778 6.030000e-18 0.923077 0.098361 12.0 13 1 Amino Acid and Nucleotide Biosynthesis 0.080441
AtoC atoC 0.800000 1.520000e-12 0.666667 1.000000 4.0 6 1 Miscellaneous Metabolism 0.105756
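The save and load steps amount to a round trip: what is written to disk should come back unchanged. The sketch below shows such a round trip with Python's standard json module on a small table whose values are taken from the imodulon_table above. It does not reproduce PyModulon's actual file layout, which is not described here.

```python
import json
import os
import tempfile

# A small table in the spirit of imodulon_table (values from the output above).
table = {
    "ArcA-1": {"regulator": "arcA", "n_genes": 50},
    "AtoC": {"regulator": "atoC", "n_genes": 6},
}

# Write to a temporary directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    filepath = os.path.join(tmp_dir, "toy_table.json")
    with open(filepath, "w") as handle:
        json.dump(table, handle)
    with open(filepath) as handle:
        reloaded = json.load(handle)

print("Round trip preserved the table:", reloaded == table)
```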