* LoRA: Improve validation error for LoRA layer count exceeding model layer count
This commit improves the error raised when the specified LoRA layer count exceeds the total number of layers in the model, clarifying the message so users know how to adjust their input parameters.
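A minimal sketch of the kind of check this describes; the function and argument names (`validate_lora_layers`, `lora_layers`, `model.layers`) are illustrative assumptions, not necessarily the identifiers used in mlx-lm:

```python
def validate_lora_layers(lora_layers: int, model) -> None:
    # Hypothetical check: compare the requested LoRA layer count against
    # the model's actual depth and fail with an actionable message.
    num_model_layers = len(model.layers)
    if lora_layers > num_model_layers:
        raise ValueError(
            f"Requested {lora_layers} LoRA layers, but the model only has "
            f"{num_model_layers} layers. Reduce the LoRA layer count to at "
            f"most {num_model_layers}."
        )
```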
* format + nits
---------
Co-authored-by: Awni Hannun <awni@apple.com>
* lazy model import in mlx_lm
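Lazy importing here means resolving the architecture module from the model type at load time instead of importing every model when the package loads. A rough sketch of the pattern, assuming an illustrative module layout of `mlx_lm.models.<model_type>` exposing `Model` and `ModelArgs`:

```python
import importlib

def load_model_classes(model_type: str):
    # Import only the architecture module that is actually needed;
    # the module path and attribute names are assumptions for illustration.
    try:
        module = importlib.import_module(f"mlx_lm.models.{model_type}")
    except ImportError:
        raise ValueError(f"Model type '{model_type}' is not supported.")
    return module.Model, module.ModelArgs
```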
* change lora loading
* fix olmo lora
* remove unused code from plamo
* move phixtral to mlx-lm and out of llms/
* feat(mlx-lm): add de-quant for fuse
* chore: disable quantization in `to_linear` when de-quant is enabled
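De-quant for fuse recovers full-precision weights from a quantized layer so LoRA deltas can be merged into them directly, and `to_linear` then returns a plain `nn.Linear` rather than re-quantizing. A sketch using `mlx.core.dequantize`; the attribute access mirrors MLX's `QuantizedLinear` but should be checked against the actual implementation:

```python
import mlx.core as mx
import mlx.nn as nn

def dequantize_linear(q: nn.QuantizedLinear) -> nn.Linear:
    # Unpack the quantized weights back to full precision so fused
    # LoRA updates can be applied to a regular Linear layer.
    weight = mx.dequantize(
        q.weight, q.scales, q.biases, group_size=q.group_size, bits=q.bits
    )
    out_dims, in_dims = weight.shape
    has_bias = hasattr(q, "bias")
    linear = nn.Linear(in_dims, out_dims, bias=has_bias)
    linear.weight = weight
    if has_bias:
        linear.bias = q.bias
    return linear
```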
* chore: add better error handling for adapter file not found
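The improved handling presumably amounts to checking that the adapter file exists before attempting to load it. A sketch under that assumption; `load_adapter` and the flag name in the message are illustrative:

```python
from pathlib import Path

import mlx.nn as nn

def load_adapter(model: nn.Module, adapter_file: str) -> nn.Module:
    # Fail fast with an actionable message instead of an opaque load error.
    path = Path(adapter_file)
    if not path.is_file():
        raise FileNotFoundError(
            f"The adapter file does not exist: {adapter_file}. "
            "Train a LoRA adapter first or pass the correct adapter path."
        )
    # strict=False since the adapter file contains only the LoRA parameters.
    model.load_weights(str(path), strict=False)
    return model
```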