bojanbabic · 61297f547b · 2024-01-18 19:04:24 -08:00
Missing requirements needed for convert script (#320)
* fix requirements and add eos parameter
* fix black
* address comment
* address comments - remove new arg

Awni Hannun · 37b41cec60 · 2024-01-04 21:05:59 -08:00
Qlora (#219)
* qlora

Awni Hannun · a5d6d0436c · 2024-01-03 15:13:26 -08:00
Support Hugging Face models (#215)
* support hf direct models

Daniel Strobusch · 1d09c4fecd · 2024-01-02 11:20:29 -08:00
keep dtype on model conversion (#186)

Anchen · 31ddbd7806 · 2023-12-28 21:42:22 -08:00
add deepseek coder example (#172)
* feat: add example for deepseek coder
* chore: remove hardcoded rope_scaling_factor
* feat: add quantization support
* chore: update readme
* chore: clean up the rope scalling factor param in create cos sin theta
* feat: add repetition_penalty
* style /consistency changes to ease future integration
* nits in README
* one more typo
Co-authored-by: Awni Hannun <awni@apple.com>

Sushant · a516f4635d · 2023-12-26 09:32:43 -08:00
Fixed the return type for the __call__ method in Attention (#190)

Daniel Strobusch · 2bd20ef0e0 · 2023-12-25 11:19:43 -08:00
shard llama model after conversion and unshard on loading (#174)

Daniel Strobusch · 848f118ac5 · 2023-12-23 07:10:13 -08:00
use non-zero exit code on error (#177)

Alvaro Bartolome · f4709cb807 · 2023-12-22 14:34:32 -08:00
Align CLI args and some smaller fixes (#167)
* Add `.DS_Store` files to `.gitignore`
* Fix variable naming of `config` in `mixtral/convert.py`
* Align CLI args and minor fixes
* standardize
* one more
Co-authored-by: Awni Hannun <awni@apple.com>

Vaibhav Srivastav · 0eaa323c10 · 2023-12-22 14:10:25 -08:00
Fix conversion + inference errors. - Mistral (#176)
* Fix conversion + inference errors.
* wire rope_theta throuugh to nn.RoPE
Co-authored-by: Awni Hannun <awni@apple.com>

Awni Hannun · 3cf436b529 · 2023-12-21 12:59:37 -08:00
Quantize example (#162)
* testing quantization
* conversion + quantization working
* one config processor
* quantization in mistral / nits in llama
* args for quantization
* llama / mistral conversion in good shape
* phi2 quantized
* mixtral
* qwen conversion

Pedro Cuenca · ce30cc3d8f · 2023-12-20 10:34:44 -08:00
Use config.json in llama (#159)
* Use config.json in llama
* Fix pop
* Fix convert
* Typo

Awni Hannun · 27c0a8c002 · 2023-12-20 10:22:25 -08:00
Add llms subdir + update README (#145)
* add llms subdir + update README
* nits
* use same pre-commit as mlx
* update readmes a bit
* format