Improve `Sampler` Parameter Documentation #178

Disclaimer: prettified with AI.

The sampler documentation should be cleaned up so it is easier to read and better aligned with the public API. In `ChunkSampler`, the parameter section should be reordered to follow the constructor flow more naturally, and several descriptions should be rewritten to better explain chunking, batching, masking, and RNG behavior. In `DistributedRandomSampler`, long parameter descriptions should be wrapped and tightened so the generated docs are easier to scan.

## Proposed Changes

- Reorder the `ChunkSampler` parameter docs so the sizing parameters (`chunk_size`, `preload_nchunks`, `batch_size`) come first, followed by `shuffle`, `drop_last`, `mask`, and `rng`.
- Clarify `chunk_size`, `preload_nchunks`, and `batch_size`, especially the relationship between them.
- Rewrite `shuffle`, `drop_last`, `mask`, and `rng` descriptions to be more explicit and user-facing.
- Reformat long `DistributedRandomSampler` parameter descriptions for readability.
- Keep this as a documentation-only change with no functional behavior change.
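
To make the relationship between `chunk_size`, `preload_nchunks`, and `batch_size` concrete, here is a minimal sketch in plain Python. It is not annbatch code: `check_sizes` is a hypothetical helper that only encodes the two constraints stated in the proposed docstrings, and the `mask` lines simply show how a `slice(100, 500)` maps to observation indices.

```python
def check_sizes(chunk_size: int, preload_nchunks: int, batch_size: int) -> int:
    """Hypothetical helper: return the number of batches per preloaded block.

    Encodes the two constraints from the proposed docstrings; the real
    sampler may validate its arguments differently.
    """
    # Observations loaded per I/O request.
    block = chunk_size * preload_nchunks
    if batch_size > block:
        raise ValueError("batch_size must not exceed chunk_size * preload_nchunks")
    if block % batch_size != 0:
        raise ValueError("chunk_size * preload_nchunks must be divisible by batch_size")
    return block // batch_size


# 256 observations per chunk, 4 chunks per request, batches of 128.
batches_per_block = check_sizes(chunk_size=256, preload_nchunks=4, batch_size=128)

# A mask of slice(100, 500) on a 1000-observation dataset (size is illustrative).
mask = slice(100, 500)
start, stop, _ = mask.indices(1000)
masked_obs = range(start, stop)  # the stop bound is exclusive
```

With these illustrative values, each preloaded block of 1024 observations splits evenly into batches, and the mask covers 400 observations.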

## Focused Diff

```diff
diff --git a/src/annbatch/samplers/_chunk_sampler.py b/src/annbatch/samplers/_chunk_sampler.py
@@
-    batch_size
-        Number of observations per batch.
     chunk_size
-        Size of each chunk i.e. the range of each chunk yielded.
-    mask
-        A slice defining the observation range to sample from (start:stop).
-    shuffle
-        Whether to shuffle chunk and index order.
+        Number of contiguous observations per on-disk chunk.
     preload_nchunks
-        Number of chunks to load per iteration.
-    drop_last
-        Whether to drop the last incomplete batch.
-    rng
-        Random number generator for shuffling. Note that :func:`torch.manual_seed`
-        has no effect on reproducibility here; pass a seeded
-        :class:`numpy.random.Generator` to control randomness.
+        Number of chunks to group into each I/O request.
+        ``chunk_size * preload_nchunks`` must be divisible by
+        ``batch_size``.
+    batch_size
+        Number of observations per batch. Must not exceed
+        ``chunk_size * preload_nchunks``.
@@
+    shuffle
+        If ``True``, shuffle chunk order within each epoch.
+    drop_last
+        If ``True``, drop the final batch when it contains fewer than
+        ``batch_size`` observations.
+    mask
+        A ``slice`` restricting sampling to a sub-range of observations.
+        For example, ``slice(100, 500)`` limits sampling to observations
+        100 through 499.
+    rng
+        A :class:`numpy.random.Generator` used for shuffling and
+        replacement draws. When ``None``, a new default generator is
+        created.

diff --git a/src/annbatch/samplers/_distributed_random_sampler.py b/src/annbatch/samplers/_distributed_random_sampler.py
@@
-        Either a string naming a distributed backend (``"torch"`` or ``"jax"``),
-        or a callable that returns ``(rank, world_size)``.
+        Either a string naming a distributed backend (``"torch"`` or
+        ``"jax"``), or a callable that returns ``(rank, world_size)``.
@@
-        If *True*, round each rank's observation count down to a multiple of ``batch_size`` so that all workers (ranks) yield the same number of batches.
-        Set to *False* to use the raw ``n_obs // world_size`` split, which may result in an uneven number of batches per worker.
+        If *True*, round each rank's observation count down to a
+        multiple of ``batch_size`` so that all workers (ranks) yield
+        the same number of batches.
+        Set to *False* to use the raw ``n_obs // world_size`` split,
+        which may result in an uneven number of batches per worker.
```
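
The rounding described in the `DistributedRandomSampler` docstring can be checked with a few lines of arithmetic. The sketch below is an illustration only, not the library's implementation: `per_rank_obs` and its `round_down` flag are hypothetical names, since the diff does not show the actual parameter name.

```python
def per_rank_obs(n_obs: int, world_size: int, batch_size: int, round_down: bool) -> int:
    """Hypothetical helper: observations assigned to each rank."""
    # Raw split described in the docstring.
    per_rank = n_obs // world_size
    if round_down:
        # Round down to the nearest multiple of batch_size.
        per_rank -= per_rank % batch_size
    return per_rank


# Illustrative values: 1000 observations, 3 ranks, batches of 64.
raw = per_rank_obs(1000, 3, 64, round_down=False)
rounded = per_rank_obs(1000, 3, 64, round_down=True)
full_batches = rounded // 64
```

With rounding enabled, every rank's count is an exact multiple of `batch_size`, so no rank ends on a partial batch.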

