DAGTaxonomy
DAGTaxonomy(n_nodes, diversity, children, parents, roots, leaves)A navigable DAG taxonomy extracted from embedding space.
Provides lattice operations for exploring hierarchical relationships: - Find ancestors/descendants of a concept - Find common ancestors between concepts - Find paths through the taxonomy - Identify convergence points (abstract attractors)
The taxonomy is a true lattice (not a tree) - nodes can have multiple parents, enabling multiple inheritance and diamond patterns.
Attributes
| Name | Type | Description |
|---|---|---|
| n_nodes | int | Total number of nodes in the taxonomy |
| diversity | np.ndarray | Neighbor diversity score for each node |
| children | Dict[int, List[Tuple[int, float, float]]] | Dict mapping node → list of (child, similarity, gap) |
| parents | Dict[int, List[Tuple[int, float, float]]] | Dict mapping node → list of (parent, similarity, gap) |
| roots | List[int] | Nodes with no parents (most general concepts) |
| leaves | List[int] | Nodes with no children (most specific concepts) |
Example
from dyf import build_dag_taxonomy
taxonomy = build_dag_taxonomy(embeddings, verbose=True) print(taxonomy.summary())
Find ancestors of a concept
ancestors = taxonomy.get_ancestors(node_idx, max_depth=5)
Find common ancestors between two concepts
common = taxonomy.get_common_ancestors(idx_a, idx_b)
Get convergence points (highly connected abstract concepts)
hubs = taxonomy.get_convergence_points(min_parents=10)
Methods
| Name | Description |
|---|---|
| get_all_paths | Find all paths from start to end. |
| get_ancestors | Get all ancestors of a node (transitive closure upward). |
| get_children | Get direct children of a node. |
| get_common_ancestors | Find common ancestors of two nodes. |
| get_convergence_points | Find convergence points (nodes with many incoming edges). |
| get_descendants | Get all descendants of a node (transitive closure downward). |
| get_diamond_patterns | Find diamond patterns (A → B, A → C, B → D, C → D). |
| get_divergence_points | Find divergence points (nodes with many outgoing edges). |
| get_lowest_common_ancestors | Find lowest common ancestors (LCAs) of two nodes. |
| get_node_depth | Get the depth of a node (distance from nearest root). |
| get_parents | Get direct parents of a node. |
| get_path | Find a path from start to end (following parent→child direction). |
| summary | Return summary statistics. |
get_all_paths
DAGTaxonomy.get_all_paths(start, end, max_depth=10)Find all paths from start to end.
In a lattice, there can be multiple paths between nodes.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| start | int | Starting node (more general) | required |
| end | int | Ending node (more specific) | required |
| max_depth | int | Maximum path length | 10 |
Returns
| Name | Type | Description |
|---|---|---|
| List[List[int]] | List of paths, where each path is a list of node indices |
get_ancestors
DAGTaxonomy.get_ancestors(node, max_depth=10)Get all ancestors of a node (transitive closure upward).
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| node | int | Starting node index | required |
| max_depth | int | Maximum depth to traverse | 10 |
Returns
| Name | Type | Description |
|---|---|---|
| set | Set of all ancestor node indices |
get_children
DAGTaxonomy.get_children(node)Get direct children of a node.
get_common_ancestors
DAGTaxonomy.get_common_ancestors(node_a, node_b, max_depth=10)Find common ancestors of two nodes.
In a lattice (vs tree), there can be multiple common ancestors, not just a single LCA.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| node_a | int | First node index | required |
| node_b | int | Second node index | required |
| max_depth | int | Maximum depth to search | 10 |
Returns
| Name | Type | Description |
|---|---|---|
| set | Set of common ancestor indices |
get_convergence_points
DAGTaxonomy.get_convergence_points(min_parents=5)Find convergence points (nodes with many incoming edges).
These are abstract concepts where many paths converge.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| min_parents | int | Minimum number of parents to qualify | 5 |
Returns
| Name | Type | Description |
|---|---|---|
| List[Tuple[int, int]] | List of (node_idx, parent_count) tuples, sorted by count descending |
get_descendants
DAGTaxonomy.get_descendants(node, max_depth=10)Get all descendants of a node (transitive closure downward).
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| node | int | Starting node index | required |
| max_depth | int | Maximum depth to traverse | 10 |
Returns
| Name | Type | Description |
|---|---|---|
| set | Set of all descendant node indices |
get_diamond_patterns
DAGTaxonomy.get_diamond_patterns(limit=100)Find diamond patterns (A → B, A → C, B → D, C → D).
Diamonds indicate multiple inheritance / non-tree structure.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| limit | int | Maximum number of diamonds to return | 100 |
Returns
| Name | Type | Description |
|---|---|---|
| List[Tuple[int, int, int, int]] | List of (top, left, right, bottom) tuples |
get_divergence_points
DAGTaxonomy.get_divergence_points(min_children=5)Find divergence points (nodes with many outgoing edges).
These are branching hubs where paths diverge.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| min_children | int | Minimum number of children to qualify | 5 |
Returns
| Name | Type | Description |
|---|---|---|
| List[Tuple[int, int]] | List of (node_idx, child_count) tuples, sorted by count descending |
get_lowest_common_ancestors
DAGTaxonomy.get_lowest_common_ancestors(node_a, node_b, max_depth=10)Find lowest common ancestors (LCAs) of two nodes.
Returns ancestors that are common to both nodes but have no descendants that are also common ancestors.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| node_a | int | First node index | required |
| node_b | int | Second node index | required |
| max_depth | int | Maximum depth to search | 10 |
Returns
| Name | Type | Description |
|---|---|---|
| List[int] | List of LCA indices (may be multiple in a lattice) |
get_node_depth
DAGTaxonomy.get_node_depth(node)Get the depth of a node (distance from nearest root).
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| node | int | Node index | required |
Returns
| Name | Type | Description |
|---|---|---|
| int | Minimum distance to any root, or -1 if not connected |
get_parents
DAGTaxonomy.get_parents(node)Get direct parents of a node.
get_path
DAGTaxonomy.get_path(start, end, max_depth=10)Find a path from start to end (following parent→child direction).
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| start | int | Starting node (more general) | required |
| end | int | Ending node (more specific) | required |
| max_depth | int | Maximum path length | 10 |
Returns
| Name | Type | Description |
|---|---|---|
| Optional[List[int]] | List of node indices from start to end, or None if no path |
summary
DAGTaxonomy.summary()Return summary statistics.