Transmission CCDs

This module contains all the functions relevant for creating transmission CCDs.

class pyccd.transmission_ccd.BaseClade(clade: frozenset)

Bases: object

Base class representing a clade — a set of taxa/leaves.

Attributes:

clade (frozenset): A frozen set of taxa or node labels in this clade.

clade: frozenset

class pyccd.transmission_ccd.TransmissionAncestryClade(clade: frozenset, transm_ancest: str)

Bases: BaseClade

Clade with transmission ancestor, i.e. who infected this clade.

Attributes:

transm_ancest (str): Transmission ancestor

transm_ancest: str

class pyccd.transmission_ccd.TransmissionBlockClade(clade: frozenset, has_block: bool)

Bases: BaseClade

Clade with a flag indicating whether a transmission block (event) occurred on the edge.

Attributes:

has_block (bool): Whether this clade was preceded by a block of transmissions.

has_block: bool
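The sketch below illustrates why the extra attribute matters when clades are used as dictionary keys: two clades over the same taxa count as distinct entries if their transmission ancestor or block flag differs. It re-declares stand-in classes as frozen dataclasses, which is an assumption; the actual pyccd classes may be implemented differently, and the taxon and ancestor labels are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BaseClade:
    """Stand-in for the documented base class: a frozen set of leaf labels."""
    clade: frozenset


@dataclass(frozen=True)
class TransmissionAncestryClade(BaseClade):
    """Stand-in: clade plus the label of its transmission ancestor."""
    transm_ancest: str


@dataclass(frozen=True)
class TransmissionBlockClade(BaseClade):
    """Stand-in: clade plus a flag for a transmission block on its edge."""
    has_block: bool


taxa = frozenset({"A", "B"})
a = TransmissionAncestryClade(taxa, "host1")  # hypothetical ancestor labels
b = TransmissionAncestryClade(taxa, "host2")
c = TransmissionBlockClade(taxa, True)

# Frozen dataclasses are hashable, so they can serve as dictionary keys.
counts = {}
for clade in (a, b, a):
    counts[clade] = counts.get(clade, 0) + 1
```

Here `a` and `b` share the same taxa but are counted separately because their ancestors differ.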
class pyccd.transmission_ccd.TypeCCD(*values)

Bases: Enum

Enum representing different types of CCD.

Attributes:

BLOCKS: Represents the “Blocks” mode, which records whether a block event is present.

ANCESTRY: Represents the “Ancestry” mode, which records transmission ancestry.

ANCESTRY = 'Ancestry'
BLOCKS = 'Blocks'
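As a minimal sketch of how a string such as 'Blocks' or 'Ancestry' can be turned into an enum member, the snippet below re-declares the enum with the two documented values and looks members up by value. The helper `parse_ccd_type` is hypothetical and not part of pyccd.

```python
from enum import Enum


class TypeCCD(Enum):
    """Stand-in enum with the two documented values."""
    ANCESTRY = "Ancestry"
    BLOCKS = "Blocks"


def parse_ccd_type(type_str: str) -> TypeCCD:
    """Hypothetical helper: map a string to a TypeCCD member by value."""
    try:
        return TypeCCD(type_str)
    except ValueError:
        valid = ", ".join(repr(t.value) for t in TypeCCD)
        raise ValueError(
            f"Unknown CCD type {type_str!r}; expected one of {valid}"
        ) from None
```

Lookup by value is case-sensitive, so 'blocks' would be rejected while 'Blocks' is accepted.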
pyccd.transmission_ccd._add_internal_clade(node, ccd_type, blockcount_map: dict, branch_lengths_map: dict) → tuple[BaseClade, BaseClade, BaseClade, dict, dict]

Processes an internal node by constructing parent and child clades based on the CCD type, and updates blockcount and branch length maps accordingly.

Parameters:
  • node – The internal tree node to process. Assumed to have two children, a blockcount, branch length (dist), and optionally transmission ancestry.

  • ccd_type – Type of CCD to use and construct clades for.

  • blockcount_map – Dictionary mapping clades to a list of blockcounts.

  • branch_lengths_map – Dictionary mapping clades to a list of branch lengths.

Returns:

A tuple containing the parent clade, the two child clades (ordered by minimum leaf label), and the updated blockcount and branch lengths maps.
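The following sketch shows one way the child ordering and the map updates described above could work. It uses plain frozensets in place of clade objects, and the helper `order_children` and the sample values are hypothetical, not taken from pyccd.

```python
from collections import defaultdict


def order_children(left: frozenset, right: frozenset) -> tuple:
    """Order two child leaf sets by their smallest leaf label."""
    return tuple(sorted((left, right), key=min))


# Maps from clade to a list of observed values, one entry per tree.
blockcount_map = defaultdict(list)
branch_lengths_map = defaultdict(list)

# Hypothetical observations of the same internal node in two trees:
# (child, child, blockcount, branch length).
observations = [
    (frozenset({"C", "D"}), frozenset({"A", "B"}), 2, 0.8),
    (frozenset({"A", "B"}), frozenset({"C", "D"}), 0, 1.2),
]
for child_1, child_2, blockcount, dist in observations:
    first, second = order_children(child_1, child_2)
    parent = first | second
    blockcount_map[parent].append(blockcount)
    branch_lengths_map[parent].append(dist)
```

Ordering the children makes the same split compare equal regardless of which child a tree happens to list first.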

pyccd.transmission_ccd._add_leaf_clade(node, ccd_type: TypeCCD, blockcount_map: dict, branch_lengths_map: dict) → tuple[dict, dict]

Processes a leaf node by creating an appropriate clade based on the CCD type, and updates blockcount and branch length maps accordingly.

Parameters:
  • node – The tree node corresponding to a leaf.

  • ccd_type – Type of CCD to use and construct clades for.

  • blockcount_map – Dictionary mapping clades to a list of blockcounts.

  • branch_lengths_map – Dictionary mapping clades to a list of branch lengths.

Returns:

Updated blockcount_map and branch_lengths_map.

pyccd.transmission_ccd._build_tree_dict_from_clade_splits(root_clade: BaseClade, seen_resolved_clades: dict) → dict

Constructs a tree dictionary from a set of resolved clade splits.

Given a root clade and a dictionary of previously resolved clade splits, this function recursively builds a dictionary representing the binary tree structure. Each entry maps a parent clade (of any subclass of BaseClade) to a tuple of its left and right child clades.

This function is designed to work with any class that inherits from BaseClade, making it flexible for future extensions to other clade types.

Parameters:
  • root_clade – The root clade to start building the tree from. This clade and its children are instances of a subclass of BaseClade.

  • seen_resolved_clades – A dictionary mapping clades (of any subclass of BaseClade) to a tuple: ‘(probability, (parent_clade, left_clade, right_clade))’. Only the split information is used for constructing the tree.

Returns:

A dictionary mapping each clade (of any subclass of BaseClade) to its child clades (left, right), which are also instances of a subclass of BaseClade.
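A minimal sketch of this recursion is given below. It uses plain frozensets as clades and assumes that leaves do not appear as keys in the resolved-splits dictionary; the function name `build_tree_dict` is illustrative and the real implementation may differ.

```python
def build_tree_dict(root, resolved, tree=None):
    """Collect parent -> (left, right) for every split reachable from root.

    `resolved` maps a clade to (probability, (parent, left, right)).
    Clades without an entry are treated as leaves (an assumption).
    """
    if tree is None:
        tree = {}
    if root not in resolved:
        return tree
    _probability, (_parent, left, right) = resolved[root]
    tree[root] = (left, right)
    build_tree_dict(left, resolved, tree)
    build_tree_dict(right, resolved, tree)
    return tree


A, B, C = frozenset("A"), frozenset("B"), frozenset("C")
AB, ABC = A | B, A | B | C
resolved = {
    ABC: (0.6, (ABC, AB, C)),
    AB: (1.0, (AB, A, B)),
}
tree = build_tree_dict(ABC, resolved)
```

Only the split part of each entry is read; the stored probability is ignored, as the parameter description states.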

pyccd.transmission_ccd.get_transmission_ccd_tree_bottom_up(m1: dict, m2: dict, blockcount_map: dict, branch_lengths_map: dict, seed: int = 42) → str

Constructs the transmission CCD MAP tree using a bottom-up approach.

This function processes the given clades and splits in order of increasing clade size (smallest to largest). It calculates a probability for each split and keeps only the most probable split per clade for the MAP tree. Ties are broken randomly with a 50% chance, controlled by the seed. The final resolved splits are used to construct the MAP tree, which is returned in Newick format.

Parameters:
  • m1 – A dictionary with clades as keys and their occurrences as values.

  • m2 – A dictionary with clade splits as keys and their probabilities as values.

  • blockcount_map – A mapping of clades to their respective block counts

  • branch_lengths_map – A mapping of clades to their respective branch lengths

  • seed – A seed for the random number generator to control tie-breaking. Default is 42.

Returns:

A string representing the tree in Newick format, annotated with median blockcounts and mean branch lengths (might change in future versions).
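The bottom-up selection can be sketched as a small dynamic program. The version below is an illustration under several assumptions, not the library's code: clades are plain frozensets, m2 holds split counts keyed by (parent, left, right), and the conditional probability of a split is its count divided by the count of its parent clade, in the style of CCD1. The name `map_splits_bottom_up` is hypothetical.

```python
import random


def map_splits_bottom_up(m1, m2, seed=42):
    """Pick the most probable split per clade, smallest clades first."""
    rng = random.Random(seed)  # seeded generator, used only for tie-breaking
    splits_by_parent = {}
    for split in m2:
        splits_by_parent.setdefault(split[0], []).append(split)

    best = {}  # clade -> (probability of best subtree, chosen split or None)
    for clade in sorted(m1, key=len):
        if len(clade) == 1:
            best[clade] = (1.0, None)
            continue
        top_prob, top_split = 0.0, None
        for split in splits_by_parent.get(clade, []):
            _, left, right = split
            # Children not recorded in m1 are treated as leaves (assumption).
            p_left = best.get(left, (1.0, None))[0]
            p_right = best.get(right, (1.0, None))[0]
            prob = m2[split] / m1[clade] * p_left * p_right
            tie = prob == top_prob and rng.random() < 0.5
            if top_split is None or prob > top_prob or tie:
                top_prob, top_split = prob, split
        best[clade] = (top_prob, top_split)
    return best


A, B, C = frozenset("A"), frozenset("B"), frozenset("C")
AB, AC, ABC = A | B, A | C, A | B | C
m1 = {ABC: 3, AB: 2, AC: 1, A: 3, B: 3, C: 3}
m2 = {(ABC, AB, C): 2, (ABC, AC, B): 1, (AB, A, B): 2, (AC, A, C): 1}
best = map_splits_bottom_up(m1, m2)
```

Because smaller clades are resolved first, the best subtree probability of each child is already available when its parent is processed.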

pyccd.transmission_ccd.get_transmission_maps(trees: list[TreeNode] | tuple[TreeNode], type_str: str = 'Ancestry') → tuple

Extracts all relevant information from a list of tree objects. The maps m1 and m2 are used as in Larget's approach for CCD1. From these a MAP tree can be constructed and then annotated with branch-length and blockcount summaries using the other two return values.

Parameters:
  • type_str – Currently, either ‘Blocks’ or ‘Ancestry’ to determine the types of CCD to construct.

  • trees – List of trees from which to extract the clade splits.

Returns:

Tuple of m1 (clade counts), m2 (clade split counts), blockcount_map (blockcounts per clade), branch_lengths_map (branch lengths per clade).
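To illustrate what the m1 and m2 counts look like, the sketch below tallies clades and clade splits over a small set of trees. It represents trees as nested tuples of leaf names rather than TreeNode objects and ignores the transmission annotations, so it is a simplified stand-in; `leaves` and `count_clades_and_splits` are hypothetical names.

```python
from collections import Counter


def leaves(tree) -> frozenset:
    """Leaf set of a tree given as a leaf name or a (left, right) tuple."""
    if isinstance(tree, str):
        return frozenset({tree})
    left, right = tree
    return leaves(left) | leaves(right)


def count_clades_and_splits(trees):
    """Return (m1, m2): clade counts and clade split counts."""
    m1, m2 = Counter(), Counter()

    def visit(tree):
        clade = leaves(tree)
        m1[clade] += 1
        if isinstance(tree, str):
            return
        # Order children by minimum leaf label so equal splits match.
        left, right = sorted((leaves(tree[0]), leaves(tree[1])), key=min)
        m2[(clade, left, right)] += 1
        visit(tree[0])
        visit(tree[1])

    for tree in trees:
        visit(tree)
    return m1, m2


trees = [(("A", "B"), "C"), (("B", "A"), "C"), (("A", "C"), "B")]
m1, m2 = count_clades_and_splits(trees)
```

The first two trees differ only in child order, so they contribute to the same split entry.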

pyccd.transmission_ccd.recursive_nwk_split_dict(clade, output, blockcount_map, branch_lengths_map) → str

Recursively generates a Newick string for the given clade. Currently, it annotates the median blockcount if a block is present. If the given clade is a TransmissionAncestryClade, it also annotates the transmission ancestor.

Parameters:
  • clade – The clade to generate the Newick string for.

  • output – A dictionary containing the child clades for each parent, as computed by the _build_tree_dict_from_clade_splits function.

  • blockcount_map – A dictionary mapping clades to their associated blockcount values.

  • branch_lengths_map – A dictionary mapping clades to their branch lengths.

Returns:

A string representing the tree in Newick format, annotated with median blockcounts and mean branch lengths.
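A rough sketch of such a recursion follows. It uses frozensets as clades, and the `[&blockcount=...]` comment syntax is an assumed annotation format chosen for illustration; the actual annotation keys and layout produced by pyccd are not specified here. The function name `to_newick` is hypothetical.

```python
from statistics import mean, median


def to_newick(clade, children, blockcount_map, branch_lengths_map) -> str:
    """Build a Newick fragment for `clade` from a parent -> children dict."""
    if clade in children:
        left, right = children[clade]
        parts = [
            to_newick(child, children, blockcount_map, branch_lengths_map)
            for child in (left, right)
        ]
        label = "(" + ",".join(parts) + ")"
    else:
        # Leaf: assumed to hold exactly one label.
        label = next(iter(clade))

    counts = blockcount_map.get(clade)
    annotation = f"[&blockcount={median(counts)}]" if counts else ""
    lengths = branch_lengths_map.get(clade)
    length = f":{mean(lengths)}" if lengths else ""
    return label + annotation + length


A, B = frozenset("A"), frozenset("B")
AB = A | B
children = {AB: (A, B)}
blockcount_map = {A: [1, 3, 2]}
branch_lengths_map = {A: [1.0, 2.0], B: [0.5]}
newick = to_newick(AB, children, blockcount_map, branch_lengths_map) + ";"
```

For this input the result is `(A[&blockcount=2]:1.5,B:0.5);`, with the median taken over the three blockcounts of A and the mean over its two branch lengths.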