chunk_redundancy
chunk_redundancy(bucket_ids, doc_ids)
Count same-doc siblings sharing each point’s bucket.
For each point, returns how many other chunks from the same document landed in the same bucket. A value of 0 means the point is the only chunk from its document in that bucket.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| bucket_ids | array-like | Bucket assignment for each point. | required |
| doc_ids | array-like | Document identifier for each point (same length as bucket_ids). | required |
Returns
| Name | Type | Description |
|---|---|---|
| | np.ndarray | Integer array of length len(bucket_ids). Each value is the number of same-doc siblings in the same bucket (excluding self). |
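The behaviour described above can be illustrated with a minimal sketch. This is not the library's implementation: `chunk_redundancy_sketch` is a hypothetical stand-in written only from the description on this page, and it assumes 1-D inputs and a 64-bit integer result. It counts how often each (bucket, document) pair occurs and subtracts one for the point itself.

```python
from collections import Counter

import numpy as np


def chunk_redundancy_sketch(bucket_ids, doc_ids):
    """Hypothetical re-implementation based only on the documented behaviour."""
    bucket_ids = np.asarray(bucket_ids)
    doc_ids = np.asarray(doc_ids)
    if bucket_ids.shape != doc_ids.shape:
        raise ValueError("bucket_ids and doc_ids must have the same length")

    # Pair each point's bucket with its document, then count identical pairs.
    pairs = list(zip(bucket_ids.tolist(), doc_ids.tolist()))
    pair_counts = Counter(pairs)

    # Subtract 1 so that a point does not count itself as a sibling.
    return np.array([pair_counts[p] - 1 for p in pairs], dtype=np.int64)


# Six chunks: document "a" has two chunks in bucket 0 and two in bucket 1.
bucket_ids = [0, 0, 0, 1, 1, 2]
doc_ids = ["a", "a", "b", "a", "a", "c"]
result = chunk_redundancy_sketch(bucket_ids, doc_ids)
print(result.tolist())  # [1, 1, 0, 1, 1, 0]
```

In this example the chunk from document "b" and the chunk from document "c" each get 0, because no other chunk from the same document shares their bucket.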