Commit Graph

511 Commits

Author SHA1 Message Date
Shubbair
f70cef9567 Updating GAN Code... 2024-07-31 11:25:39 +03:00
Shubbair
6f7a6609b9 Updating MLX Notebook 2024-07-30 20:01:14 +03:00
Shubbair
0644cc101b Updating MLX Notebook 2024-07-30 19:50:02 +03:00
Shubbair
ad2b6643c0 Updating GAN Code... 2024-07-30 16:59:35 +03:00
Shubbair
3bea855bd2 Updating GAN Code... 2024-07-30 13:45:09 +03:00
Shubbair
c2d731d8a3 Updating GAN Code... 2024-07-30 13:24:53 +03:00
Shubbair
ba52447385 Updating GAN Code... 2024-07-30 13:21:38 +03:00
Shubbair
1e386b5c20 Updating GAN Code... 2024-07-30 02:56:13 +03:00
Shubbair
7438b54ecd Updating GAN Code... 2024-07-30 02:44:41 +03:00
Shubbair
7fea34d65e Updating GAN Code... 2024-07-30 02:37:09 +03:00
Shubbair
f505fe6e55 Updating GAN Code... 2024-07-30 02:17:12 +03:00
Shubbair
4e80759b39 Updating GAN Code... 2024-07-30 02:06:52 +03:00
Shubbair
306e53c402 Updating GAN Code... 2024-07-29 19:44:16 +03:00
Shubbair
bacaa9ec0e Updating GAN Code... 2024-07-29 01:30:08 +03:00
Shubbair
8d27be1442 Updating GAN Code... 2024-07-29 01:24:50 +03:00
Shubbair
4de0583b49 Updating GAN Code... 2024-07-28 19:18:35 +03:00
Shubbair
a07ef6d03b Updating GAN Code... 2024-07-28 18:11:39 +03:00
Shubbair
c0c8293842 Updating GAN Code... 2024-07-28 17:56:26 +03:00
Shubbair
d17d293df9 Updating GAN Code... 2024-07-28 17:35:36 +03:00
Shubbair
3e63cd93fe Updating GAN Code... 2024-07-28 17:26:24 +03:00
Shubbair
3716501e8d Updating GAN Code... 2024-07-28 17:22:40 +03:00
Shubbair
88a20b7276 Updating GAN Code... 2024-07-28 01:10:19 +03:00
Shubbair
8b1713737a Updating GAN Code... 2024-07-27 01:20:00 +03:00
Shubbair
f8b7094fb8 Updating GAN Code... 2024-07-27 01:19:50 +03:00
Shubbair
147cb3d2bc Updating GAN Code... 2024-07-27 01:09:51 +03:00
Shubbair
a05608c34d Updating GAN Code... 2024-07-27 00:22:29 +03:00
Shubbair
f176cce74d Updating GAN Code... 2024-07-27 00:19:08 +03:00
Shubbair
959c623908 Updating GAN Code... 2024-07-26 16:38:55 +03:00
Shubbair
591074bea8 Updating GAN Code... 2024-07-26 16:36:29 +03:00
Shubbair
d426586b03 Updating GAN Code... 2024-07-26 16:07:40 +03:00
Shubbair
5e7ce1048c Add GAN model 25/7 2024-07-25 21:00:41 +03:00
Alex Cheema
cd8efc7fbc Add support for Llama-3.1 (#907)
* add dynamicNTK scaling rope

* remove unused var

* fix rope base

* llama3.1 fixes

* TODO for rope eval

* vectorise llama3 base freq calculation

* removed the arbitrary 2.0 rope_scale default case

* fix slow llama3.1 generation by evaluating stateless part of DynamicNTKScalingRoPE in init

* nits + format

* use mx.pi

* fix tests and add test for 3.1

---------

Co-authored-by: Prince Canuma <prince.gdt@gmail.com>
Co-authored-by: Awni Hannun <awni@apple.com>
2024-07-23 13:21:32 -07:00
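The "dynamicNTK scaling rope" added in the commit above grows the RoPE frequency base once the context exceeds the training length, so positions beyond it are still distinguishable. A minimal sketch of the standard dynamic-NTK rule follows; the function name and parameters are illustrative, not the repository's exact implementation:

```python
def dynamic_ntk_base(base, dims, seq_len, max_position_embeddings, scale=1.0):
    """Return the RoPE frequency base, enlarged when the current
    sequence length exceeds the model's trained context window
    (dynamic NTK scaling). Within the window the base is unchanged."""
    if seq_len <= max_position_embeddings:
        return base
    # Standard dynamic-NTK rule: stretch the base so the lowest
    # rotary frequency still spans the longer context.
    return base * (
        (scale * seq_len / max_position_embeddings) - (scale - 1)
    ) ** (dims / (dims - 2))
```

With `seq_len` at twice the trained window, the base roughly doubles; evaluating this stateless part once at init (as the commit notes) avoids recomputing it on every generation step.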
M. Ali Bayram
47060a8130 refactor: add force_download parameter to get_model_path function (#800) 2024-07-23 13:10:20 -07:00
Prince Canuma
3f337e0f0a Add Mistral NeMo (fix) (#895)
* fix head_dim

* Update llms/mlx_lm/models/llama.py

* fix kv error

* formatting

* Delete test.py

---------

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
2024-07-22 06:09:24 -07:00
Prince Canuma
3d365b612a Add support for InternLM-2.5 (#871)
* fix internlm-2

* formatting

* add dynamic ntk rope

* formatting

* move dynamic scaling rope to internlm2.py

* add default max_position_embeddings
2024-07-17 16:38:22 -07:00
Anchen
561dcf5643 Add support for deepseek coder v2 lite (#882)
* feat: add support for deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct

* fix softmax + some cleanup

* more nits

* fix rope

* fix original_max_position_embeddings in rope

* fix original_max_position_embeddings in rope config

* add group greedy

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-07-17 07:23:28 -07:00
Awni Hannun
f0c6c6e226 keep the server in a valid state (#889) 2024-07-15 18:35:36 -07:00
JosefAlbers
bfc1f2763b longrope (#886) 2024-07-12 07:19:11 -07:00
Chime Ogbuji
8bf397e450 Pass use_dora parameter to linear_to_lora_layers (#885) 2024-07-11 14:34:34 -07:00
nicolov
fbe3247772 Add GPT-neox model (#863) 2024-07-11 06:13:17 -07:00
James A Capozzoli
9717307ff0 Validation with full data set results in NaN validation score (#879)
* CLI arguments may set num_batches to -1

The CLI arguments allow you to validate with the entire dataset by passing -1, but this results in a division by zero, so `NaN` appears as the validation score!

* Properly assemble the mini-batches when validating with the entire dataset.

Tested locally, a validation of a novel took about an hour, with a loss of 0.928. Thanks @awni for the correction!

* Set up the pre-commit hooks and run them so that black may format lora.py.
2024-07-10 08:36:11 -07:00
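The fix described above boils down to resolving the `-1` sentinel to the actual batch count before dividing, so an empty loop can never produce a zero denominator. A minimal sketch under assumed names (this is not the repository's `lora.py` code, and `loss_fn` here is a stand-in for the per-batch loss):

```python
def validation_loss(dataset, batch_size, num_batches, loss_fn):
    """Average per-batch loss over `num_batches` mini-batches, or over
    the whole dataset when num_batches is negative (-1 CLI convention)."""
    total_batches = len(dataset) // batch_size
    if num_batches < 0:
        num_batches = total_batches
    num_batches = min(num_batches, total_batches)
    total, count = 0.0, 0
    for i in range(num_batches):
        batch = dataset[i * batch_size : (i + 1) * batch_size]
        total += loss_fn(batch)
        count += 1
    # Guard: dividing by zero completed batches is what produced the NaN.
    return total / max(count, 1)
```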
Alex Wozniakowski
63800c8feb Example of response generation with optional arguments (#853)
* Generate response with optional arguments

* Reference response generation example

* Include transformers and sentencepiece

* Update example to run Mistral-7B-Instruct-v0.3

* Link to generation example

* Style changes from pre-commit
2024-07-09 06:49:59 -07:00
Awni Hannun
68e88d42fb Fix server for openai package (#877)
* fix

* fixes for 9b
2024-07-08 12:34:31 -07:00
Awni Hannun
20e221f7f7 Add recurrent gemma (#856)
* add recurrent gemma

* fix window cache
2024-07-07 12:10:04 -07:00
n8programs
1e05aef344 Add logit soft capping to gemma, and fix precision issues (#857)
* Add logit soft capping to gemma, and fix precision issues

Gemma was babbling nonsense, so I tracked it down to missing logit softcapping plus precision issues causing NaNs (I implemented the softcapping and added more float32 inference). gemma-27b-it-4bit now works flawlessly (or near-flawlessly; sliding-window attention is still not implemented).

* get rid of comments

* get rid of last comments (sry lol)

* nits

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-07-02 07:52:39 -07:00
Angelos Katharopoulos
f212b770d8 Server loads the model on demand from the request (#851) 2024-06-27 11:37:57 -07:00
Awni Hannun
538339b599 gemma2 (#855) 2024-06-27 10:06:28 -07:00
Awni Hannun
9f10728145 fix yi (#852) 2024-06-27 06:38:19 -07:00
Volodymyr Kyrylov
7979b84a9e transformer_lm: add --dataset enwik8 (#838)
* transformer_lm: add --dataset enwik8

* nits

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-06-26 11:59:01 -07:00
Chime Ogbuji
df6bc09d74 Configuration-based use of HF hub-hosted datasets for training (#701)
* Add hf_dataset configuration for using HF hub-hosted datasets for (Q)LoRA training

* Pre-commit formatting

* Fix YAML config example

* Print DS info

* Include name

* Add hf_dataset parameter default

* Remove TextHFDataset and CompletionsHFDataset and use Dataset and CompletionsDataset instead: add a text_key constructor argument to the former (and change it to work with a provided data structure instead of just a JSON file), and add prompt_key and completion_key arguments to the latter, with defaults for backwards compatibility.

* nits

* update docs

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-06-26 10:20:50 -07:00