mine_dag_chains
mine_dag_chains(
    embeddings,
    k_neighbors=30,
    similarity_threshold=0.55,
    diversity_gap_threshold=0.02,
    min_chain_length=3,
    verbose=False,
)

Mine DAG (directed acyclic graph) structures from embedding space.
Finds hierarchical chains by using neighbor diversity as a generality signal. Points with diverse neighbors connect many topics (general), while points with coherent neighbors form tight clusters (specific).
Chains flow from general → specific, following the diversity gradient.
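The diversity signal described above can be illustrated with a small NumPy sketch. This page does not give the exact formula, so the definition used here (one minus the mean pairwise cosine similarity among a point's k nearest neighbors) is an assumption, and `neighbor_diversity` is a hypothetical helper, not part of `dyf`.

```python
import numpy as np


def neighbor_diversity(embeddings: np.ndarray, k: int = 30) -> np.ndarray:
    """Hypothetical diversity score per point (assumed definition, not dyf's).

    Score = 1 - mean pairwise cosine similarity among the point's k nearest
    neighbors. High score: neighbors disagree with each other (general point).
    Low score: neighbors form a tight cluster (specific point).
    """
    # L2-normalize rows so dot products are cosine similarities.
    # Assumes no all-zero rows.
    x = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = x @ x.T
    np.fill_diagonal(sims, -np.inf)  # a point is never its own neighbor

    k = min(k, len(x) - 1)
    if k < 2:
        raise ValueError("need at least 2 neighbors to measure diversity")

    # Brute-force k-NN: indices of the k most similar points for each row.
    neighbors = np.argsort(-sims, axis=1)[:, :k]

    off_diagonal = ~np.eye(k, dtype=bool)
    scores = np.empty(len(x))
    for i, idx in enumerate(neighbors):
        block = x[idx] @ x[idx].T  # similarities among the neighbors themselves
        scores[i] = 1.0 - block[off_diagonal].mean()
    return scores
```

Under this assumed definition, a point lying between two clusters draws neighbors from both and scores high, while a point inside one cluster scores near zero. The brute-force similarity matrix is O(n²) in memory and is only meant for small illustrations.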
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| embeddings | np.ndarray | Embeddings to analyze, shape (n, d). Will be L2-normalized. | required |
| k_neighbors | int | Number of neighbors for k-NN graph and diversity computation. | 30 |
| similarity_threshold | float | Minimum similarity for parent-child edges. | 0.55 |
| diversity_gap_threshold | float | Minimum diversity difference for directed edge. | 0.02 |
| min_chain_length | int | Minimum chain length to return. | 3 |
| verbose | bool | Print progress information. | False |
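To make the roles of `similarity_threshold` and `diversity_gap_threshold` concrete, here is a minimal sketch of how the two thresholds might gate directed edges. It is an illustration under assumptions, not the library's implementation: `candidate_edges` is a hypothetical helper, the per-point diversity scores are passed in rather than computed, and edges are assumed to point from the higher-diversity (general) point to the lower-diversity (specific) one.

```python
import numpy as np


def candidate_edges(
    embeddings: np.ndarray,
    diversity: np.ndarray,
    k_neighbors: int = 30,
    similarity_threshold: float = 0.55,
    diversity_gap_threshold: float = 0.02,
) -> list[tuple[int, int]]:
    """Hypothetical edge filter: returns (parent, child) index pairs."""
    # L2-normalize rows so dot products are cosine similarities.
    # Assumes no all-zero rows.
    x = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = x @ x.T
    np.fill_diagonal(sims, -np.inf)  # exclude self-matches

    k = min(k_neighbors, len(x) - 1)
    neighbors = np.argsort(-sims, axis=1)[:, :k]

    edges = []
    for parent in range(len(x)):
        for child in neighbors[parent]:
            similar_enough = sims[parent, child] >= similarity_threshold
            # Positive gap means the parent is more general than the child.
            gap = diversity[parent] - diversity[child]
            if similar_enough and gap >= diversity_gap_threshold:
                edges.append((parent, int(child)))
    return edges
```

Because every edge in this sketch strictly decreases diversity (for a positive gap threshold), the resulting graph cannot contain a cycle, which is consistent with the DAG framing above.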
Returns
| Name | Type | Description |
|---|---|---|
| | DAGMiningResult | Result with chains, diversity scores, and edge information. |
Example
    from dyf import mine_dag_chains

    result = mine_dag_chains(embeddings, verbose=True)
    print(result.summary())

    # Get clean hierarchies
    clean = result.get_chains_by_coherence(min_coherence=0.65)
    for chain in clean[:10]:
        print(f"[len={len(chain)}] {chain.indices}")
Notes
- Low-diversity points (e.g., decades, months) have highly coherent neighbors
- High-diversity points sit at intersections of multiple topics
- Chains often converge to common “sinks” (abstract concepts)
- ~100% of extracted chains follow a monotonic diversity gradient
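The monotonicity claim in the last note can be checked on any set of chains with a few lines of NumPy. The sketch below assumes each chain is available as a sequence of integer indices into the embeddings and that a per-point diversity array is at hand. Both helper names are hypothetical, and how to obtain the diversity array from `DAGMiningResult` is not specified on this page.

```python
import numpy as np


def is_monotonic_chain(chain_indices, diversity, tol: float = 0.0) -> bool:
    """True if diversity never increases along the chain (general to specific)."""
    values = np.asarray(diversity, dtype=float)[list(chain_indices)]
    # Each step may rise by at most `tol`; with tol=0 it must not rise at all.
    return bool(np.all(np.diff(values) <= tol))


def monotonic_fraction(chains, diversity, tol: float = 0.0) -> float:
    """Fraction of chains whose diversity is non-increasing."""
    chains = list(chains)
    if not chains:
        return 0.0
    hits = sum(is_monotonic_chain(c, diversity, tol) for c in chains)
    return hits / len(chains)
```

A value of `monotonic_fraction` close to 1.0 would match the note above. The `tol` argument is an illustrative addition for tolerating small numerical noise.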