Commit Graph

  • a3257ff3cb emb2redis utility added. [main] antirez 2025-08-28 16:35:01 +02:00
  • 8fa6eb6523 Merge pull request #16 from jart/ftz Salvatore Sanfilippo 2025-01-09 16:46:11 +01:00
  • bd64d6e812 Merge pull request #15 from jart/features Salvatore Sanfilippo 2025-01-09 16:44:59 +01:00
  • 918234ce80 Remove flush to zero from bf16 Justine Tunney 2024-07-03 05:38:46 -07:00
  • 6deab767f9 Introduce --diffable flag Justine Tunney 2024-05-26 00:23:41 -07:00
  • 4e6455ecaf Merge pull request #14 from jart/update Salvatore Sanfilippo 2024-05-26 09:22:00 +02:00
  • ede59bb742 Add BF16 support and fix warnings Justine Tunney 2024-05-25 22:48:18 -07:00
  • 3e5c0a464d Merge pull request #12 from jmousseau/match-ints Salvatore Sanfilippo 2024-02-18 16:36:26 +01:00
  • 9c87cb78b0 Merge pull request #11 from jmousseau/leak-on-error Salvatore Sanfilippo 2024-02-18 16:34:10 +01:00
  • 7d25893516 Match key-value pair and tensor counts with header integer width Jack Mousseau 2024-02-18 07:26:35 -08:00
  • c2cef3d1d8 Prevent memory leak when tensor type is invalid Jack Mousseau 2024-02-18 07:23:31 -08:00
  • af7d88d808 Merge pull request #9 from jbochi/q4_1_fix Salvatore Sanfilippo 2024-01-10 17:13:17 +01:00
  • 55d6267c31 Fix q4_1 dequantization Juarez Bochi 2024-01-10 10:17:13 -05:00
  • fe34f6ec5c Merge pull request #8 from jbochi/q4 Salvatore Sanfilippo 2024-01-10 00:10:54 +01:00
  • dc69c608df Add support for q4_0 and q4_1 quantizations Juarez Bochi 2024-01-09 17:55:23 -05:00
  • eec3dc9f54 F16 output for dequantization. antirez 2024-01-09 18:46:26 +01:00
  • 26e3a59233 Rename gguf_init/end to more obvious names. antirez 2024-01-09 16:35:40 +01:00
  • 6eb4aeb2fb gguf_create(): take flags to be able to overwrite files. Fixes #7. antirez 2024-01-09 16:32:10 +01:00
  • 81dbf8f8d2 Merge pull request #6 from jbochi/reverse_stride Salvatore Sanfilippo 2024-01-09 15:48:46 +01:00
  • 419d4706f6 Q2_K dequantization. antirez 2024-01-05 23:38:47 +01:00
  • 50e79b9ec0 Print tensor with correct strides Juarez Bochi 2024-01-05 09:59:59 -05:00
  • e48ca317ea Merge pull request #5 from jbochi/inspect_shape Salvatore Sanfilippo 2024-01-04 20:32:19 +01:00
  • a42344e197 Merge pull request #4 from jbochi/show_shape Salvatore Sanfilippo 2024-01-04 20:31:23 +01:00
  • 92e1c67b8b Merge pull request #3 from jbochi/int_type_features Salvatore Sanfilippo 2024-01-04 20:30:32 +01:00
  • 58a0479bb4 Inspect tensor taking dims into consideration Juarez Bochi 2024-01-04 11:44:13 -05:00
  • a7e99574e2 Print tensor dimensions Juarez Bochi 2024-01-03 17:38:37 -05:00
  • 5d10eaac8d Add tensor type features for int types Juarez Bochi 2024-01-03 16:33:47 -05:00
  • b1f32c4088 Quantization functions refactoring. antirez 2024-01-03 21:02:17 +01:00
  • ff16bc3dcf Speed: use the right compilation flags to dequantize faster. antirez 2024-01-03 20:20:52 +01:00
  • b4e7da4ceb Merge pull request #1 from jbochi/typos Salvatore Sanfilippo 2024-01-03 14:54:30 +01:00
  • 04ec28ed35 Merge pull request #2 from jbochi/check_remap Salvatore Sanfilippo 2024-01-03 14:53:41 +01:00
  • 463fd63cf2 Check remap when appending kv/info/data Juarez Bochi 2024-01-03 08:01:00 -05:00
  • e5cdcec626 Fix some typos Juarez Bochi 2024-01-03 07:34:12 -05:00
  • c8469c4a27 Q6_K quantization implemented. antirez 2023-12-31 14:06:49 +01:00
  • 54b93edecb README: grammar. antirez 2023-12-30 18:08:27 +01:00
  • 4a5dfdcdad README: show subcommand example output. antirez 2023-12-30 18:02:21 +01:00
  • 53e7b2b156 README: grammar. antirez 2023-12-30 18:00:23 +01:00
  • e8b405aac8 README updated. antirez 2023-12-30 17:29:44 +01:00
  • a4858afb4d Implement f16/f32 in gguf_tensor_to_float(). antirez 2023-12-30 17:23:27 +01:00
  • 136e04977c README: add compare example. antirez 2023-12-30 15:47:52 +01:00
  • 951ce0e3c4 Compare subcommand: report difference as %. antirez 2023-12-30 15:43:44 +01:00
  • 3663d73c22 Compare subcommand: just skip tensors we can't yet dequantize. antirez 2023-12-30 10:13:38 +01:00
  • 400f60b75b --verbose and README updated. antirez 2023-12-29 22:50:41 +01:00
  • 54946cbf14 Compare subcommand. antirez 2023-12-28 17:24:05 +01:00
  • 2a599dc5d0 Show subcommand: print total parameters. antirez 2023-12-28 16:06:52 +01:00
  • e2062eea2c Q4_K dequantization. antirez 2023-12-28 11:25:08 +01:00
  • c25ccfa02a Q8_0 dequantization. antirez 2023-12-27 21:22:33 +01:00
  • 558c7c3c6d Clarify the need for FP16 implementation. antirez 2023-12-27 18:54:36 +01:00
  • bd4ecbda94 FP16 added. Split-mixtral improved. antirez 2023-12-27 15:13:42 +01:00
  • a77a4d061c Mixtral experts extraction test. antirez 2023-12-26 17:23:47 +01:00
  • 7e9c2bd6a7 Better explain the tensor total size math. antirez 2023-12-26 09:20:54 +01:00
  • 3081d69b8e split-mixtral: copying of keys + APIs needed. antirez 2023-12-26 09:14:50 +01:00
  • 96e7eb2d4c gguf-tools: accept subcommands. antirez 2023-12-26 00:07:56 +01:00
  • 53fb176b3b Initial API to create new GGUF files. antirez 2023-12-25 22:09:29 +01:00
  • 3eb30c1872 API to remap/rewind + mapping in write mode. antirez 2023-12-25 10:45:38 +01:00
  • f400e8a36f README added. antirez 2023-12-24 23:46:46 +01:00
  • b3092d3860 Compute tensor size in bytes. antirez 2023-12-24 23:44:24 +01:00
  • d54409bc9c Some library layout. antirez 2023-12-24 18:30:38 +01:00
  • 55a15a4230 Tensors parsing. antirez 2023-12-24 17:20:04 +01:00
  • 4ff25fb178 Limit array items printed. antirez 2023-12-24 12:21:41 +01:00
  • b47eaca8d1 GGUF parsing, initial design and functionalities. antirez 2023-12-24 10:36:26 +01:00