build_dyf_tree

build_dyf_tree(
    embeddings,
    max_depth,
    num_bits=3,
    min_leaf_size=4,
    seed=42,
    fit_method='raw_pca',
)

Build a DYF recursive tree over embeddings.

At each level, fits a DensityClassifier with num_bits bits, producing up to 2^num_bits children per node. Centroid similarities are stored as per-point margins for boundary persistence analysis.
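
To make the 2^num_bits fan-out concrete, the following minimal sketch assigns each point a bucket code built from num_bits sign bits. It is an illustration only: it uses random hyperplanes as a stand-in, whereas the DensityClassifier's actual fitting (PCA or ITQ, depending on fit_method) is not reproduced here. The helper names `sign_code` and `bucket_points` are hypothetical.

```python
import random

def sign_code(point, hyperplanes):
    """Pack one sign bit per hyperplane into an integer bucket code."""
    code = 0
    for plane in hyperplanes:
        dot = sum(p * w for p, w in zip(point, plane))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code

def bucket_points(points, num_bits=3, seed=42):
    """Group point indices by their num_bits-bit code (illustrative only)."""
    rng = random.Random(seed)
    dim = len(points[0])
    # Random hyperplanes are an assumption; the library fits its own projections.
    hyperplanes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_bits)]
    buckets = {}
    for i, point in enumerate(points):
        buckets.setdefault(sign_code(point, hyperplanes), []).append(i)
    return buckets

rng = random.Random(0)
points = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(200)]
buckets = bucket_points(points, num_bits=3)
# At most 2**3 = 8 buckets are non-empty; some codes may go unused.
print(len(buckets), sorted(buckets))
```

Because empty buckets produce no child, a node can have fewer than 2^num_bits children, which is why the description says "up to".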

Parameters

Name | Type | Description | Default
embeddings | | (n, d) array of embedding vectors. | required
max_depth | | Maximum tree depth (number of recursive splits). | required
num_bits | | LSH bits per level; the default of 3 gives up to 8-way splits. | 3
min_leaf_size | | Stop splitting when a node has fewer than 2 * min_leaf_size points. | 4
seed | | Random seed for DensityClassifier. | 42
fit_method | | Fitting method: 'raw_pca' (default, PCA on full data), 'pca' (PCA on centroid subset), or 'itq' (iterative quantization for tighter partitions). | 'raw_pca'
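
How max_depth and min_leaf_size bound the recursion can be sketched as follows. This is a skeleton under stated assumptions, not the library's code: `split_fn` is a hypothetical stand-in for fitting a DensityClassifier, the root is assumed to sit at depth 0, and children are assumed to be stored in a dict keyed by bucket code.

```python
def build_sketch(indices, depth, max_depth, min_leaf_size, split_fn):
    """Recursive skeleton mirroring the documented stopping rules."""
    node = {"indices": indices, "depth": depth, "children": {}}
    # Stop at the depth limit, or when the node holds fewer than
    # 2 * min_leaf_size points (the documented threshold).
    if depth >= max_depth or len(indices) < 2 * min_leaf_size:
        return node
    for code, child_indices in split_fn(indices).items():
        node["children"][code] = build_sketch(
            child_indices, depth + 1, max_depth, min_leaf_size, split_fn
        )
    return node

def parity_split(indices):
    """Hypothetical 1-bit split: group points by position parity."""
    groups = {}
    for pos, idx in enumerate(indices):
        groups.setdefault(pos % 2, []).append(idx)
    return groups

def leaves(node):
    """Collect nodes that were not split further."""
    if not node["children"]:
        return [node]
    return [leaf for child in node["children"].values() for leaf in leaves(child)]

tree = build_sketch(list(range(20)), 0, max_depth=3,
                    min_leaf_size=4, split_fn=parity_split)
leaf_nodes = leaves(tree)
```

With 20 points and min_leaf_size=4, the threshold is 8 points: the root (20) and its two children (10 each) are split, while the four grandchildren (5 each) fall below the threshold and stay leaves, so the recursion ends at depth 2 even though max_depth is 3.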

Returns

Name | Type | Description
 | | Tree dict with keys: children, indices, depth, point_margin_map.
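
A returned tree can be walked recursively through its children key. The sketch below runs on a hand-built stand-in rather than real output, and it assumes that children is a dict mapping a bucket code to a subtree with the same keys and that a leaf has empty children. The layout of point_margin_map is not documented here, so it is left empty and not accessed. `iter_leaves` is a hypothetical helper.

```python
def iter_leaves(node):
    """Yield every node without children, depth-first."""
    children = node.get("children") or {}
    if not children:
        yield node
        return
    for child in children.values():
        yield from iter_leaves(child)

# Hand-built stand-in using the documented keys; not real library output.
mock_tree = {
    "indices": [0, 1, 2, 3, 4, 5],
    "depth": 0,
    "point_margin_map": {},  # layout not documented here, left empty
    "children": {
        0: {"indices": [0, 2, 4], "depth": 1, "children": {}},
        5: {"indices": [1, 3, 5], "depth": 1, "children": {}},
    },
}

leaf_sizes = [len(leaf["indices"]) for leaf in iter_leaves(mock_tree)]
max_leaf_depth = max(leaf["depth"] for leaf in iter_leaves(mock_tree))
```

If the real structure stores children as a list instead of a dict, iterating over the list directly in place of `children.values()` would be the only change needed.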