mine_dag_chains

mine_dag_chains(
    embeddings,
    k_neighbors=30,
    similarity_threshold=0.55,
    diversity_gap_threshold=0.02,
    min_chain_length=3,
    verbose=False,
)

Mine DAG (directed acyclic graph) structures from embedding space.

Finds hierarchical chains by using neighbor diversity as a generality signal. Points with diverse neighbors connect many topics (general), while points with coherent neighbors form tight clusters (specific).

Chains flow from general → specific, following the diversity gradient.
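The page does not say how neighbor diversity is computed, so the following is only a minimal sketch under an assumed definition: diversity is taken as one minus the mean pairwise cosine similarity among a point's k nearest neighbors. The helper name `neighbor_diversity` is hypothetical and is not part of `dyf`.

```python
import numpy as np

def neighbor_diversity(embeddings: np.ndarray, k_neighbors: int = 30) -> np.ndarray:
    """Hypothetical diversity score (an assumption, not dyf's definition):
    1 - mean pairwise cosine similarity among each point's k nearest neighbors."""
    X = np.asarray(embeddings, dtype=np.float64)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)  # L2-normalize rows (assumes no zero rows)
    k = min(k_neighbors, len(X) - 1)
    if k < 2:
        raise ValueError("need at least 3 points and k_neighbors >= 2")

    sims = X @ X.T                     # cosine similarities
    np.fill_diagonal(sims, -np.inf)    # a point is not its own neighbor
    nbrs = np.argsort(-sims, axis=1)[:, :k]

    off_diag = ~np.eye(k, dtype=bool)  # mask selecting distinct neighbor pairs
    diversity = np.empty(len(X))
    for i, idx in enumerate(nbrs):
        block = X[idx] @ X[idx].T      # (k, k) similarities among the neighbors of i
        diversity[i] = 1.0 - block[off_diag].mean()
    return diversity
```

Under this definition, a point sitting between two tight clusters draws neighbors from both and scores high, while a point inside one cluster scores close to zero.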

Parameters

Name	Type	Description	Default
embeddings	np.ndarray	Embeddings to analyze, shape (n, d). They are L2-normalized before use.	required
k_neighbors	int	Number of neighbors for k-NN graph and diversity computation.	`30`
similarity_threshold	float	Minimum similarity for parent-child edges.	`0.55`
diversity_gap_threshold	float	Minimum diversity difference required to create a directed edge.	`0.02`
min_chain_length	int	Minimum chain length to return.	`3`
verbose	bool	Print progress information.	`False`
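To illustrate how `similarity_threshold` and `diversity_gap_threshold` might interact, here is a rough sketch of an edge rule, assuming per-point diversity scores are already available from some source. The function `build_directed_edges` and the exact rule are assumptions made for illustration; the library's internal logic may differ.

```python
import numpy as np

def build_directed_edges(embeddings, diversity, k_neighbors=30,
                         similarity_threshold=0.55, diversity_gap_threshold=0.02):
    """Hypothetical edge rule: link parent -> child when the child is one of the
    parent's k nearest neighbors, is similar enough, and is less diverse by at
    least the gap threshold."""
    X = np.asarray(embeddings, dtype=np.float64)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)  # L2-normalize rows
    diversity = np.asarray(diversity, dtype=np.float64)

    sims = X @ X.T
    np.fill_diagonal(sims, -np.inf)                   # exclude self-matches
    k = min(k_neighbors, len(X) - 1)
    nbrs = np.argsort(-sims, axis=1)[:, :k]

    edges = []
    for parent in range(len(X)):
        for child in nbrs[parent]:
            close_enough = sims[parent, child] >= similarity_threshold
            more_general = diversity[parent] - diversity[child] >= diversity_gap_threshold
            if close_enough and more_general:
                edges.append((parent, int(child)))
    return edges
```

With a positive gap threshold, every edge strictly lowers diversity, so a graph built this way cannot contain a cycle.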

Returns

Name	Type	Description
	DAGMiningResult	Result object holding the chains, diversity scores, and edge information.

Example

from dyf import mine_dag_chains

result = mine_dag_chains(embeddings, verbose=True)
print(result.summary())

# Get clean hierarchies
clean = result.get_chains_by_coherence(min_coherence=0.65)
for chain in clean[:10]:
    print(f"[len={len(chain)}] {chain.indices}")

Notes

Low-diversity points (e.g., decades, months) have highly coherent neighbors.
High-diversity points sit at the intersections of multiple topics.
Chains often converge to common “sinks” (abstract concepts).
~100% of extracted chains follow a monotonic diversity gradient.
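The monotonicity note can be checked on your own data with a small helper. This is a sketch only: it assumes you have a per-point diversity array and each chain's point indices (the example above exposes them as `chain.indices`). The helper names are hypothetical.

```python
import numpy as np

def is_monotonic_chain(diversity, chain_indices):
    """True if diversity never increases along the chain (general -> specific)."""
    values = np.asarray(diversity, dtype=np.float64)[list(chain_indices)]
    return bool(np.all(np.diff(values) <= 0))

def monotonic_fraction(diversity, chains):
    """Share of chains (each a sequence of point indices) whose diversity
    profile is non-increasing from first to last element."""
    if not chains:
        return 0.0
    return sum(is_monotonic_chain(diversity, c) for c in chains) / len(chains)
```

Passing `[chain.indices for chain in result.chains]` would be one way to call it, assuming the result exposes its chains under that attribute name.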