compute_neighbor_diversity

compute_neighbor_diversity(embeddings, k=15, neighbors=None)

Compute neighbor diversity for each point.

Diversity measures how dissimilar a point’s neighbors are to each other. High diversity indicates a “general” concept that connects multiple topics. Low diversity indicates a “specific” concept within a tight cluster.
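One plausible reading of this definition is the mean pairwise cosine distance among a point's k nearest neighbors. The sketch below illustrates that reading only. The name `neighbor_diversity_sketch`, the brute-force neighbor search, and the choice of mean cosine distance are assumptions made for illustration, not the library's actual implementation.

```python
import numpy as np


def neighbor_diversity_sketch(embeddings: np.ndarray, k: int = 15) -> np.ndarray:
    """Hypothetical helper: mean pairwise cosine distance among each point's k neighbors."""
    n = embeddings.shape[0]
    if not 2 <= k < n:
        raise ValueError("k must satisfy 2 <= k < n")

    # Rows are assumed unit-length, so a dot product is a cosine similarity.
    sims = embeddings @ embeddings.T
    np.fill_diagonal(sims, -np.inf)  # never select a point as its own neighbor
    neighbors = np.argsort(-sims, axis=1)[:, :k]  # (n, k) neighbor indices

    scores = np.empty(n)
    off_diagonal = ~np.eye(k, dtype=bool)  # ignore each neighbor's similarity to itself
    for i in range(n):
        nbr = embeddings[neighbors[i]]  # (k, d) neighbor vectors
        pairwise = nbr @ nbr.T  # (k, k) cosine similarities
        scores[i] = np.mean(1.0 - pairwise[off_diagonal])
    return scores


# Toy data: two tight clusters on the unit circle plus one point halfway between them.
angles = np.array([0.00, 0.02, 0.04, np.pi / 4, np.pi / 2 - 0.04, np.pi / 2 - 0.02, np.pi / 2])
points = np.column_stack([np.cos(angles), np.sin(angles)])
scores = neighbor_diversity_sketch(points, k=2)
```

In this toy setup the middle point (index 3) draws one neighbor from each cluster, so it receives the highest score, while points inside a cluster score close to zero.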

Parameters

| Name | Type | Description | Default |
|------|------|-------------|---------|
| embeddings | np.ndarray | Normalized embeddings (n, d) | required |
| k | int | Number of neighbors to consider | 15 |
| neighbors | Optional[np.ndarray] | Pre-computed k-NN indices (n, k+1). If None, computed internally. | None |
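When the same neighbor indices are reused across several calls, it may help to compute them once and pass them through `neighbors`. The sketch below builds an (n, k+1) index array by brute force. The helper `knn_indices` is hypothetical, and placing each point itself in column 0 is an assumption inferred from the k+1 width, not something this page confirms.

```python
import numpy as np


def knn_indices(embeddings: np.ndarray, k: int = 15) -> np.ndarray:
    """Hypothetical helper: brute-force k-NN indices of shape (n, k+1)."""
    sims = embeddings @ embeddings.T  # cosine similarity for unit-length rows
    np.fill_diagonal(sims, np.inf)  # assumption: force each point into column 0
    order = np.argsort(-sims, axis=1, kind="stable")
    return order[:, : k + 1]


rng = np.random.default_rng(0)
raw = rng.normal(size=(50, 8))
unit = raw / np.linalg.norm(raw, axis=1, keepdims=True)  # normalize rows to unit length

neighbors = knn_indices(unit, k=15)

# The array could then be passed along, assuming it matches the expected layout:
# diversity = compute_neighbor_diversity(unit, k=15, neighbors=neighbors)
```

Brute force costs O(n²) memory, so for large collections an approximate nearest-neighbor index is likely a better source for the same array.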

Returns

| Name | Type | Description |
|------|------|-------------|
| | np.ndarray | Diversity scores for each point (n,). Higher = more general. |

Example

diversity = compute_neighbor_diversity(embeddings, k=15)
general_idx = np.argsort(diversity)[-10:]  # Most general concepts
specific_idx = np.argsort(diversity)[:10]  # Most specific concepts