DocSpread

DocSpread(n_chunks, n_buckets, concentration, bucket_distribution)

Per-document chunk distribution across LSH buckets.

Attributes

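Name                 Type   Description
n_chunks             int    Total number of chunks in this document.
n_buckets            int    Number of distinct buckets that the document's chunks fall into.
concentration        float  Ratio n_chunks / n_buckets. High values mean many chunks share few buckets (topically focused); low values mean chunks are spread over many buckets (multi-topic bridge).
bucket_distribution  dict   Mapping of bucket_id to the number of chunks in that bucket.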
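
The relationship between the four attributes can be illustrated with a minimal sketch. It assumes each chunk is assigned to exactly one bucket; the dataclass below is a stand-in that mirrors the documented fields, and doc_spread_from_buckets is a hypothetical helper, not part of the documented API. The 0.0 returned for an empty document is also an assumption.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class DocSpread:
    # Stand-in mirroring the documented fields; not the library's own class.
    n_chunks: int
    n_buckets: int
    concentration: float
    bucket_distribution: dict


def doc_spread_from_buckets(bucket_ids):
    """Hypothetical helper: build a DocSpread from one bucket id per chunk."""
    distribution = dict(Counter(bucket_ids))  # bucket_id -> chunk count
    n_chunks = len(bucket_ids)
    n_buckets = len(distribution)
    # Assumption: report 0.0 for a document with no chunks.
    concentration = n_chunks / n_buckets if n_buckets else 0.0
    return DocSpread(n_chunks, n_buckets, concentration, distribution)


# Six chunks in two buckets: topically focused.
focused = doc_spread_from_buckets([7, 7, 7, 7, 9, 9])
# Six chunks in six buckets: spread across topics.
bridge = doc_spread_from_buckets([1, 2, 3, 4, 5, 6])

print(focused.concentration)  # 3.0
print(bridge.concentration)   # 1.0
```

Under the one-bucket-per-chunk assumption, concentration cannot fall below 1.0 for a non-empty document, so values close to 1.0 mark the most widely spread documents.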