extract_boundary_persistence

extract_boundary_persistence(tree, margin_pct=0.1)

Identify points that persist as boundary across multiple tree depths.

Walks the built tree once, reading the existing point_margin_map at each internal node. At each depth, points whose margin falls below the margin_pct percentile of the margins at that depth are tagged as boundary at that depth.

Points that are boundary at multiple depths have high boundary persistence: they straddle concepts at several levels of the hierarchy and act as polysemous bridge points.
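The per-depth tagging described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the function name boundary_persistence_sketch and its margins_by_depth input (depth -> {point_idx: margin}) are hypothetical stand-ins for whatever the tree walk collects from point_margin_map, and the strict "<" comparison against np.quantile is an assumption about how "below the threshold" is evaluated.

```python
import numpy as np


def boundary_persistence_sketch(margins_by_depth, n_points, margin_pct=0.1):
    """Tag low-margin points per depth and count how often each is tagged.

    margins_by_depth is a hypothetical input: depth -> {point_idx: margin}.
    """
    boundary_depths = {}
    thresholds = {}
    for depth, margin_map in sorted(margins_by_depth.items()):
        if not margin_map:
            continue  # nothing to threshold at this depth
        idx = np.array(list(margin_map.keys()), dtype=int)
        margins = np.array(list(margin_map.values()), dtype=float)
        # margin_pct is a fraction in [0, 1], so it maps directly to a quantile.
        thr = float(np.quantile(margins, margin_pct))
        thresholds[depth] = thr
        # Assumption: "below" means strictly less than the threshold.
        for i in idx[margins < thr]:
            boundary_depths.setdefault(int(i), []).append(depth)

    boundary_count = np.zeros(n_points, dtype=int)
    for i, depths in boundary_depths.items():
        boundary_count[i] = len(depths)

    return {
        "boundary_depths": boundary_depths,
        "boundary_count": boundary_count,
        "thresholds": thresholds,
    }


# Toy input: 10 points, two depths, margins 0.0 to 0.9 in different orders.
toy_margins = {
    0: {i: i / 10 for i in range(10)},
    1: {i: ((i * 3) % 10) / 10 for i in range(10)},
}
demo = boundary_persistence_sketch(toy_margins, n_points=10, margin_pct=0.2)
```

In this toy input, point 0 has the smallest margin at both depths, so it is the only point tagged more than once.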

Parameters

tree (required)
    PCA tree dict from build_pca_tree().

margin_pct (default: 0.1)
    Percentile threshold (0-1). Points with margin below this percentile at a given depth are boundary.

Returns

dict with the following keys:

boundary_depths: dict[int, list[int]]
    Maps point_idx to the list of depths where the point is boundary.

boundary_count: np.ndarray, shape (n,)
    Number of boundary depths per point.

thresholds: dict[int, float]
    Margin threshold per depth.
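One way to consume the returned dict is to rank points by how many depths they are boundary at. The sketch below uses a hand-written stand-in with the documented keys rather than real output; the values and the min_depths cutoff are illustrative assumptions.

```python
import numpy as np

# Hand-written stand-in shaped like the documented return value (not real output).
result = {
    "boundary_depths": {1: [0, 2], 3: [0, 1, 2], 7: [1]},
    "boundary_count": np.array([0, 2, 0, 3, 0, 0, 0, 1]),
    "thresholds": {0: 0.05, 1: 0.08, 2: 0.11},
}

min_depths = 2  # hypothetical cutoff for calling a point "persistent"
counts = result["boundary_count"]

# Indices of points that are boundary at min_depths or more depths.
persistent = np.flatnonzero(counts >= min_depths)

# Most persistent first; a stable sort keeps index order among ties.
ranked = persistent[np.argsort(-counts[persistent], kind="stable")]

for idx in ranked:
    print(int(idx), result["boundary_depths"][int(idx)])
```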