pymodulon.motif

Module Contents

Functions

_get_upstream_seqs(ica_data, imodulon, seq_dict, upstream, downstream)

Get upstream sequences for a table of operons

find_motifs(ica_data, imodulon, fasta_file, outdir = None, palindrome = False, nmotifs = 5, upstream = 500, downstream = 100, verbose = True, force = False, evt = 0.001, cores = 8, minw = 6, maxw = 40, minsites = None)

Finds motifs upstream of genes in an iModulon

_parse_meme(directory, DF_seqs, verbose, evt)

_parse_tomtom(tomtom_dir)

compare_motifs(motif_info = None, motif_file = None, motif_db = None, outdir = None, force = False, verbose = True, evt = 0.001)

Compare a MEME motif against external motif databases

pymodulon.motif._get_upstream_seqs(ica_data, imodulon, seq_dict, upstream, downstream)

Get upstream sequences for a table of operons

Parameters
  • ica_data (IcaData) – IcaData object

  • imodulon (Union[str,int]) – Name of iModulon

  • seq_dict (Dict) – Dictionary mapping accession numbers to Biopython SeqRecords

  • upstream (int) – Number of basepairs upstream from first gene in operon to include in motif search

  • downstream (int) – Number of basepairs downstream from first gene in operon to include in motif search

Returns

  • pd.DataFrame – DataFrame containing operon information

  • List[SeqRecord] – List of SeqRecords containing upstream sequences
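
The upstream/downstream window described above can be illustrated with a minimal sketch. This is not pymodulon's implementation: the helper names, the plain-string genome, the 0-based half-open coordinates, and the clipping at sequence ends (no wrap-around for circular genomes) are all assumptions made for illustration.

```python
# Illustrative sketch only; not the code used by _get_upstream_seqs.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_window(genome, start, stop, strand, upstream=500, downstream=100):
    """Extract the region around the start of a gene.

    Assumes 0-based, half-open coordinates [start, stop) on the forward
    strand and a strand of "+" or "-". The window is clipped at the ends
    of the sequence rather than wrapped around.
    """
    if strand == "+":
        # The gene begins at `start`; upstream lies at lower coordinates.
        left = max(0, start - upstream)
        right = min(len(genome), start + downstream)
        return genome[left:right]
    if strand == "-":
        # The gene begins at `stop`; upstream lies at higher coordinates.
        left = max(0, stop - downstream)
        right = min(len(genome), stop + upstream)
        return reverse_complement(genome[left:right])
    raise ValueError(f"strand must be '+' or '-', got {strand!r}")


genome = "AAAACCCCGGGGTTTT"
plus_window = upstream_window(genome, 8, 12, "+", upstream=4, downstream=2)
minus_window = upstream_window(genome, 4, 8, "-", upstream=4, downstream=2)
```

In both calls the returned string reads 5' to 3' on the gene's own strand, so the upstream bases come first and the first bases of the gene come last.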

pymodulon.motif.find_motifs(ica_data, imodulon, fasta_file, outdir=None, palindrome=False, nmotifs=5, upstream=500, downstream=100, verbose=True, force=False, evt=0.001, cores=8, minw=6, maxw=40, minsites=None)

Finds motifs upstream of genes in an iModulon

Parameters
  • ica_data (IcaData) – IcaData object

  • imodulon (Union[int, str]) – iModulon name

  • fasta_file (Union[os.PathLike, List[os.PathLike]]) – Path or list of paths to fasta file(s) for organism

  • outdir (os.PathLike) – Path to output directory

  • palindrome (bool) – If True, limit search to palindromic motifs (default: False)

  • nmotifs (int) – Number of motifs to search for (default: 5)

  • upstream (int) – Number of basepairs upstream from first gene in operon to include in motif search (default: 500)

  • downstream (int) – Number of basepairs downstream from first gene in operon to include in motif search (default: 100)

  • verbose (bool) – Show steps in verbose output (default: True)

  • force (bool) – Force execution of MEME even if output already exists (default: False)

  • evt (float) – E-value threshold (default: 0.001)

  • cores (int) – Number of cores to use (default: 8)

  • minw (int) – Minimum motif width in basepairs (default: 6)

  • maxw (int) – Maximum motif width in basepairs (default: 40)

  • minsites (Optional[int]) – Minimum number of sites required for a motif. Default is the number of operons divided by 3.

Returns

Not yet documented (marked TODO in the source docstring).
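
To show how the search parameters above might map onto a MEME invocation, here is a minimal sketch that only builds an argument list. It is an assumption about the mapping, not the command pymodulon actually runs. The function name `build_meme_command` and the `n_operons` argument are hypothetical. The flags themselves (`-oc`, `-dna`, `-nmotifs`, `-evt`, `-p`, `-minw`, `-maxw`, `-minsites`, `-pal`) are standard MEME options. Rounding the `minsites` default down and enforcing a floor of 2 are also assumptions.

```python
# Illustrative sketch only; the real find_motifs may build its command differently.
def build_meme_command(fasta_path, outdir, n_operons, palindrome=False,
                       nmotifs=5, evt=0.001, cores=8, minw=6, maxw=40,
                       minsites=None):
    """Return a MEME command line as a list of strings (nothing is executed)."""
    if minsites is None:
        # Documented default: number of operons divided by 3.
        # Integer division and the minimum of 2 are assumptions.
        minsites = max(2, n_operons // 3)

    cmd = [
        "meme", str(fasta_path),
        "-oc", str(outdir),         # write output here, overwriting if present
        "-dna",                     # input sequences are DNA
        "-nmotifs", str(nmotifs),
        "-evt", str(evt),           # stop when a motif's E-value exceeds this
        "-p", str(cores),           # parallel processes (needs an MPI build)
        "-minw", str(minw),
        "-maxw", str(maxw),
        "-minsites", str(minsites),
    ]
    if palindrome:
        cmd.append("-pal")          # restrict the search to palindromic motifs
    return cmd


meme_cmd = build_meme_command("upstream.fasta", "motifs/my_imodulon", n_operons=12)
```

The resulting list could be passed to `subprocess.run` on a system where MEME is installed.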

pymodulon.motif._parse_meme(directory, DF_seqs, verbose, evt)

pymodulon.motif._parse_tomtom(tomtom_dir)

pymodulon.motif.compare_motifs(motif_info=None, motif_file=None, motif_db=None, outdir=None, force=False, verbose=True, evt=0.001)

Compare a MEME motif against external motif databases

Parameters
  • motif_info (Optional[MotifInfo]) – MotifInfo object. Either ‘motif_info’ or ‘motif_file’ must be provided. ‘motif_info’ takes precedence over ‘motif_file’.

  • motif_file (Optional[os.PathLike]) – Text (.txt) file generated by MEME from a motif search.

  • motif_db (Optional[Union[List,str]]) – Name(s) of or path(s) to MEME-formatted motif databases

  • outdir (Optional[os.PathLike]) – Output directory for TOMTOM comparisons

  • force (bool) – Force execution of TOMTOM even if output already exists (default: False)

  • verbose (bool) – Show steps in verbose output (default: True)

  • evt (float) – E-value threshold (default: 0.001)
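
The precedence rule for 'motif_info' and 'motif_file', and the way the remaining parameters might feed a TOMTOM call, can be sketched as follows. This is a hedged illustration, not pymodulon's code. `StandInMotifInfo` and its `file` attribute are hypothetical placeholders for the real MotifInfo object, and both helper functions are invented for this example. The TOMTOM options used (`-oc`, `-thresh`, `-evalue`) are standard.

```python
import os
from dataclasses import dataclass
from pathlib import Path


@dataclass
class StandInMotifInfo:
    """Hypothetical stand-in for MotifInfo; the `file` attribute is assumed."""
    file: str


def resolve_motif_file(motif_info=None, motif_file=None):
    """Pick the MEME output file, giving motif_info precedence over motif_file."""
    if motif_info is not None:
        return Path(motif_info.file)
    if motif_file is not None:
        return Path(motif_file)
    raise ValueError("Either 'motif_info' or 'motif_file' must be provided")


def build_tomtom_command(query, motif_db, outdir, evt=0.001):
    """Return a TOMTOM command line as a list of strings (nothing is executed)."""
    # Accept either a single database or a list of databases.
    if isinstance(motif_db, (str, os.PathLike)):
        motif_db = [motif_db]
    return [
        "tomtom",
        "-oc", str(outdir),      # write output here, overwriting if present
        "-thresh", str(evt),     # significance threshold
        "-evalue",               # interpret the threshold as an E-value
        str(query),
        *[str(db) for db in motif_db],
    ]


query = resolve_motif_file(StandInMotifInfo(file="motifs/meme.txt"), "other/meme.txt")
tomtom_cmd = build_tomtom_command(query, "databases/example.meme", "motifs/tomtom")
```

Here `databases/example.meme` is a placeholder path, not a database shipped with any package.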