compute_neighbor_diversity
`compute_neighbor_diversity(embeddings, k=15, neighbors=None)`

Compute neighbor diversity for each point.
Diversity measures how dissimilar a point’s neighbors are to each other. High diversity indicates a “general” concept that connects multiple topics. Low diversity indicates a “specific” concept within a tight cluster.
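To make the idea concrete, here is a minimal sketch of one way such a score could be computed. It is not the library's implementation: the exact formula is not stated here, so the sketch assumes diversity is the mean pairwise cosine distance among a point's k nearest neighbors. The name `neighbor_diversity_sketch` is hypothetical, and neighbors are found by brute force.

```python
import numpy as np


def neighbor_diversity_sketch(embeddings: np.ndarray, k: int = 15) -> np.ndarray:
    """Hypothetical stand-in: mean pairwise cosine distance among each point's k neighbors.

    Assumes rows are L2-normalized, k >= 2, and n > k.
    """
    n = embeddings.shape[0]
    # With normalized rows, dot products are cosine similarities.
    sims = embeddings @ embeddings.T
    # Exclude each point from its own neighbor list.
    np.fill_diagonal(sims, -np.inf)
    # Indices of the k most similar points per row (brute force, O(n^2)).
    nbrs = np.argsort(-sims, axis=1)[:, :k]

    diversity = np.empty(n)
    off_diag = ~np.eye(k, dtype=bool)  # mask that skips neighbor-to-itself pairs
    for i in range(n):
        block = embeddings[nbrs[i]]    # (k, d) neighbor vectors
        pair_sims = block @ block.T    # (k, k) neighbor-to-neighbor similarities
        diversity[i] = 1.0 - pair_sims[off_diag].mean()
    return diversity
```

Under this assumed definition, a point sitting between two clusters draws neighbors from both and scores high, while a point inside a tight cluster scores close to zero.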
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| embeddings | np.ndarray | Normalized embeddings (n, d) | required |
| k | int | Number of neighbors to consider | 15 |
| neighbors | Optional[np.ndarray] | Pre-computed k-NN indices (n, k+1). If None, computed internally. | None |
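
When the same neighbor graph is reused across several calls, it can be computed once and passed in through `neighbors`. The sketch below shows one way to build an index array of shape (n, k+1). It assumes the extra column holds each point's own index in column 0, which the shape suggests but the table does not state. The helper name `knn_indices_sketch` is hypothetical, and the brute-force search is only practical for small n.

```python
import numpy as np


def knn_indices_sketch(embeddings: np.ndarray, k: int = 15) -> np.ndarray:
    """Hypothetical helper: brute-force cosine k-NN indices, shape (n, k+1).

    Assumes rows are L2-normalized and n > k.
    """
    # With normalized rows, dot products are cosine similarities.
    sims = embeddings @ embeddings.T
    # Force each point to rank first in its own row (assumed convention:
    # column 0 is the point itself, columns 1..k are its neighbors).
    np.fill_diagonal(sims, np.inf)
    order = np.argsort(-sims, axis=1, kind="stable")
    return order[:, : k + 1]
```

The result could then be passed as `compute_neighbor_diversity(embeddings, k=15, neighbors=knn_indices_sketch(embeddings, k=15))`, provided the self-in-column-0 assumption matches what the function expects.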
Returns
| Name | Type | Description |
|---|---|---|
| | np.ndarray | Diversity scores for each point, shape (n,). Higher = more general. |
Example
    diversity = compute_neighbor_diversity(embeddings, k=15)
    general_idx = np.argsort(diversity)[-10:]  # Most general concepts
    specific_idx = np.argsort(diversity)[:10]  # Most specific concepts