* use nn.RMSNorm, use sdpa, cleanup
* bump mlx versions
* minor update
* use fast layer norm
* version bump
* update requirement for whisper
* update requirement for gguf
* feat: add example for deepseek coder
* chore: remove hardcoded rope_scaling_factor
* feat: add quantization support
* chore: update readme
* chore: clean up the rope scalling factor param in create cos sin theta
* feat: add repetition_penalty
* style /consistency changes to ease future integration
* nits in README
* one more typo
---------
Co-authored-by: Awni Hannun <awni@apple.com>