chunk_redundancy

chunk_redundancy(bucket_ids, doc_ids)

Count same-doc siblings sharing each point’s bucket.

For each point, returns how many other chunks from the same document landed in the same bucket. A value of 0 means the point is the only chunk from its document in that bucket.

Parameters

Name	Type	Description	Default
bucket_ids		Array-like of bucket assignments per point.	required
doc_ids		Array-like of document identifiers per point (same length as bucket_ids).	required

Returns

Name	Type	Description
	np.ndarray	Integer array of length len(bucket_ids). Each value is the number of same-doc siblings in the same bucket (excluding self).
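
Example

The following is a minimal sketch of the documented behavior, not the library's actual implementation. The name chunk_redundancy_sketch and the sample inputs are hypothetical. It assumes the values in bucket_ids and doc_ids are hashable, and it counts how often each (bucket, document) pair occurs, then subtracts one to exclude the point itself.

```python
from collections import Counter

import numpy as np


def chunk_redundancy_sketch(bucket_ids, doc_ids):
    """Hypothetical re-implementation for illustration only."""
    bucket_ids = list(bucket_ids)
    doc_ids = list(doc_ids)
    if len(bucket_ids) != len(doc_ids):
        raise ValueError("bucket_ids and doc_ids must have the same length")

    # One (bucket, document) pair per point.
    pairs = list(zip(bucket_ids, doc_ids))

    # Number of points sharing each (bucket, document) pair.
    counts = Counter(pairs)

    # Subtract 1 so a point does not count itself as a sibling.
    return np.array([counts[p] - 1 for p in pairs], dtype=np.int64)


# Sample data (made up): six chunks from documents "a" and "b".
bucket_ids = [0, 0, 0, 1, 1, 2]
doc_ids = ["a", "a", "b", "a", "a", "a"]

redundancy = chunk_redundancy_sketch(bucket_ids, doc_ids)
print(redundancy.tolist())  # [1, 1, 0, 1, 1, 0]
```

In this sample, the two chunks of document "a" in bucket 0 each have one sibling, the single chunk of document "b" in bucket 0 has none, and the lone chunk in bucket 2 also has none.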