Add support for quantized Phi-3-mini-4k-instruct GGUF weights (#717)

* Support for 4-bit quantized phi-3 GGUF weights

* Added link to the 4-bit quantized model

* Removed some prints

* Added correct comment

* Removed print, since the last condition already prints a warning when quantization is None
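A hypothetical sketch of the pattern the last bullet describes; none of these names come from the repository's actual code. The point is that when the final branch of the condition already warns on a missing quantization, an extra print before it is redundant:

```rust
// Hypothetical illustration, not this repository's code: the final branch
// already emits the warning for quantization == None, so no separate
// print is needed beforehand.
fn report_quantization(quantization: Option<&str>) {
    match quantization {
        // A quantization scheme was selected; report it.
        Some(q) => println!("loading {q}-quantized GGUF weights"),
        // This branch already covers the None case with a warning.
        None => eprintln!("warning: quantization is None; loading unquantized weights"),
    }
}

fn main() {
    report_quantization(Some("Q4_0"));
    report_quantization(None);
}
```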
Jaward Sesay authored on 2024-04-30 11:11:32 +08:00; committed by GitHub
parent 5513c4e57d
commit 7c0962f4e2
2 changed files with 20 additions and 1 deletion

@@ -47,6 +47,10 @@ Models that have been tested and work include:
- [TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF](https://huggingface.co/TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF),
for quantized models use:
- `tinyllama-1.1b-chat-v1.0.Q8_0.gguf`
- `tinyllama-1.1b-chat-v1.0.Q4_0.gguf`
- [Jaward/phi-3-mini-4k-instruct.Q4_0.gguf](https://huggingface.co/Jaward/phi-3-mini-4k-instruct.Q4_0.gguf),
for the 4-bit quantized phi-3-mini-4k-instruct use:
- `phi-3-mini-4k-instruct.Q4_0.gguf`
[^1]: For more information on GGUF, see [the documentation](https://github.com/ggerganov/ggml/blob/master/docs/gguf.md).
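As a quick sanity check of the new entry, the sketch below fetches the listed 4-bit GGUF file from the Hub and parses its header. This is a minimal example under stated assumptions, not this project's CLI: it presumes the `hf-hub`, `candle-core`, and `anyhow` crates are available; only the repo id `Jaward/phi-3-mini-4k-instruct.Q4_0.gguf` and the filename come from the list above.

```rust
use candle_core::quantized::gguf_file;

fn main() -> anyhow::Result<()> {
    // Repo id and filename are taken verbatim from the README entry above.
    let api = hf_hub::api::sync::Api::new()?;
    let repo = api.model("Jaward/phi-3-mini-4k-instruct.Q4_0.gguf".to_string());
    let path = repo.get("phi-3-mini-4k-instruct.Q4_0.gguf")?;

    // Read the GGUF header to confirm the file is well-formed.
    let mut file = std::fs::File::open(&path)?;
    let content = gguf_file::Content::read(&mut file)?;
    println!("{} tensors", content.tensor_infos.len());
    for key in content.metadata.keys().take(10) {
        println!("metadata key: {key}");
    }
    Ok(())
}
```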