TODO: Re-implement `batch_generate`
TODO: Update all `generate_step` callsites
NOTE: `generate_step` now takes a prompt of shape `(bs, seq_len)` instead of
`(seq_len,)`, which is a breaking change. In particular, `sampler` and
`logits_processors` must handle logits of shape `(bs, vocab_size)` instead
of `(vocab_size,)`.
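To illustrate what handling `(bs, vocab_size)` logits involves, here is a minimal sketch of a batched top-p sampler. It uses NumPy as a stand-in for MLX arrays, and the name `top_p_sampling_batched` and its signature are hypothetical; this is not the implementation in this PR.

```python
import numpy as np


def top_p_sampling_batched(logits, top_p, temperature=1.0, rng=None):
    """Sample one token id per row from logits of shape (bs, vocab_size).

    Hypothetical helper for illustration only.
    """
    if rng is None:
        rng = np.random.default_rng()

    # Softmax over the vocab axis, computed independently for each row.
    scaled = np.asarray(logits, dtype=np.float64) / temperature
    scaled = scaled - scaled.max(axis=-1, keepdims=True)
    probs = np.exp(scaled)
    probs /= probs.sum(axis=-1, keepdims=True)

    # Sort each row in descending probability order.
    order = np.argsort(-probs, axis=-1)
    sorted_probs = np.take_along_axis(probs, order, axis=-1)
    cumulative = np.cumsum(sorted_probs, axis=-1)

    # Keep a token if the mass before it is below top_p.
    # The most likely token in each row is therefore always kept.
    keep = (cumulative - sorted_probs) < top_p
    filtered = np.where(keep, sorted_probs, 0.0)
    filtered /= filtered.sum(axis=-1, keepdims=True)

    # Draw one position per row, then map back to vocabulary ids.
    picks = np.array([rng.choice(filtered.shape[-1], p=row) for row in filtered])
    return np.take_along_axis(order, picks[:, None], axis=-1)[:, 0]
```

Every reduction runs over `axis=-1`, so each row of the batch is filtered and sampled independently and the result has shape `(bs,)`.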
The `prompt` argument of `generate()` can now be either a `str` or a
`list[str]`; this change is backwards-compatible.
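One way to accept both a `str` and a `list[str]` without breaking existing callers is to normalize the input to a list and unwrap the result for single prompts. The sketch below assumes that a single string should keep returning a single string; `normalize_prompts` and `generate_sketch` are hypothetical names, and the decoding step is replaced by a placeholder.

```python
from typing import Union


def normalize_prompts(prompt: Union[str, list[str]]) -> tuple[list[str], bool]:
    """Return (prompts, is_single) so downstream code always sees a list."""
    is_single = isinstance(prompt, str)
    prompts = [prompt] if is_single else list(prompt)
    return prompts, is_single


def generate_sketch(prompt: Union[str, list[str]]) -> Union[str, list[str]]:
    """Hypothetical wrapper; the real decoding loop is not shown."""
    prompts, is_single = normalize_prompts(prompt)
    # Placeholder for batched decoding: echo each prompt back.
    outputs = [f"<completion of {p!r}>" for p in prompts]
    # Unwrap so callers that pass a str still get a str back.
    return outputs[0] if is_single else outputs
```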
The changes to `generate_step()`, `top_p_sampling()`, and
`min_p_sampling()` are backwards-incompatible in order to unify shapes;
compatibility could be restored by adding a few if-statements, if preferred.
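If backwards compatibility is preferred, the if-statements could look roughly like the sketch below: promote unbatched input to a batch of one, run the batched code path, and squeeze the result back. NumPy stands in for MLX arrays here, and `as_batched` and `greedy_sampler` are hypothetical names used only for illustration.

```python
import numpy as np


def as_batched(logits):
    """Return (batched_logits, was_unbatched) for 1-D or 2-D input."""
    logits = np.asarray(logits)
    if logits.ndim == 1:
        # Legacy shape (vocab_size,): add a leading batch axis.
        return logits[None, :], True
    if logits.ndim == 2:
        # New shape (bs, vocab_size): use as-is.
        return logits, False
    raise ValueError(f"expected 1-D or 2-D logits, got shape {logits.shape}")


def greedy_sampler(logits):
    """Accept either shape; return a scalar for 1-D input, (bs,) otherwise."""
    batched, was_unbatched = as_batched(logits)
    tokens = batched.argmax(axis=-1)
    return tokens[0] if was_unbatched else tokens
```

The cost of this approach is that every sampler and logits processor carries two shape contracts, which is the duplication the unified shapes avoid.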
* chore(mlx-lm): clean up the top-p implementation
* chore: clean up
* chore: add test
* chore: address comments
* chore: clean up docstring
* chore: clean up test