Commit Graph

55 Commits

Author SHA1 Message Date
Justine Tunney
ede59bb742 Add BF16 support and fix warnings
This change updates the data type definitions to be the same as the
latest source code. Support for the bfloat16 data type is available
however it can't interpret the IQ quantization formats yet. Cleanup
of compiler warnings and other nits have been fixed, but behavioral
changes have been avoided, and no new features are as of yet added.
2024-05-25 22:58:50 -07:00
Salvatore Sanfilippo
3e5c0a464d Merge pull request #12 from jmousseau/match-ints
Match key-value pair and tensor counts with header integer width
2024-02-18 16:36:26 +01:00
Salvatore Sanfilippo
9c87cb78b0 Merge pull request #11 from jmousseau/leak-on-error
Prevent memory leak when tensor type is invalid
2024-02-18 16:34:10 +01:00
Jack Mousseau
7d25893516 Match key-value pair and tensor counts with header integer width 2024-02-18 07:26:35 -08:00
Jack Mousseau
c2cef3d1d8 Prevent memory leak when tensor type is invalid 2024-02-18 07:24:44 -08:00
Salvatore Sanfilippo
af7d88d808 Merge pull request #9 from jbochi/q4_1_fix
Fix q4_1 dequantization
2024-01-10 17:13:17 +01:00
Juarez Bochi
55d6267c31 Fix q4_1 dequantization 2024-01-10 10:17:13 -05:00
Salvatore Sanfilippo
fe34f6ec5c Merge pull request #8 from jbochi/q4
Add support for q4_0 and q4_1 quantizations
2024-01-10 00:10:54 +01:00
Juarez Bochi
dc69c608df Add support for q4_0 and q4_1 quantizations 2024-01-09 18:04:18 -05:00
antirez
eec3dc9f54 F16 output for dequantization. 2024-01-09 18:46:26 +01:00
antirez
26e3a59233 Rename gguf_init/end to more obvious names. 2024-01-09 16:35:40 +01:00
antirez
6eb4aeb2fb gguf_create(): take flags to be able to overwrite files. Fixes #7. 2024-01-09 16:32:10 +01:00
Salvatore Sanfilippo
81dbf8f8d2 Merge pull request #6 from jbochi/reverse_stride
Print tensor with correct strides
2024-01-09 15:48:46 +01:00
antirez
419d4706f6 Q2_K dequantization. 2024-01-05 23:38:47 +01:00
Juarez Bochi
50e79b9ec0 Print tensor with correct strides 2024-01-05 09:59:59 -05:00
Salvatore Sanfilippo
e48ca317ea Merge pull request #5 from jbochi/inspect_shape
Inspect tensor taking dims into consideration
2024-01-04 20:32:19 +01:00
Salvatore Sanfilippo
a42344e197 Merge pull request #4 from jbochi/show_shape
Print tensor dimensions
2024-01-04 20:31:23 +01:00
Salvatore Sanfilippo
92e1c67b8b Merge pull request #3 from jbochi/int_type_features
Add tensor type features for int types
2024-01-04 20:30:32 +01:00
Juarez Bochi
58a0479bb4 Inspect tensor taking dims into consideration 2024-01-04 11:44:13 -05:00
Juarez Bochi
a7e99574e2 Print tensor dimensions 2024-01-03 17:41:33 -05:00
Juarez Bochi
5d10eaac8d Add tensor type features for int types 2024-01-03 16:33:47 -05:00
antirez
b1f32c4088 Quantization functions refactoring. 2024-01-03 21:02:47 +01:00
antirez
ff16bc3dcf Speed: use the right compilation flags to dequantize faster. 2024-01-03 21:02:47 +01:00
Salvatore Sanfilippo
b4e7da4ceb Merge pull request #1 from jbochi/typos
Fix some typos
2024-01-03 14:54:30 +01:00
Salvatore Sanfilippo
04ec28ed35 Merge pull request #2 from jbochi/check_remap
Check remap when appending kv/info/data
2024-01-03 14:53:41 +01:00
Juarez Bochi
463fd63cf2 Check remap when appending kv/info/data 2024-01-03 08:01:00 -05:00
Juarez Bochi
e5cdcec626 Fix some typos 2024-01-03 07:34:12 -05:00
antirez
c8469c4a27 Q6_K quantization implemented. 2023-12-31 14:06:49 +01:00
antirez
54b93edecb README: grammar. 2023-12-30 18:08:27 +01:00
antirez
4a5dfdcdad README: show subcommand example output. 2023-12-30 18:02:21 +01:00
antirez
53e7b2b156 README: grammar. 2023-12-30 18:00:23 +01:00
antirez
e8b405aac8 README updated. 2023-12-30 17:29:44 +01:00
antirez
a4858afb4d Implement f16/f32 in gguf_tensor_to_float(). 2023-12-30 17:23:27 +01:00
antirez
136e04977c README: add compare example. 2023-12-30 15:47:52 +01:00
antirez
951ce0e3c4 Compare subcommand: report difference as %. 2023-12-30 15:43:44 +01:00
antirez
3663d73c22 Compare subcommand: just skip tensors we can't yet dequantize. 2023-12-30 10:13:38 +01:00
antirez
400f60b75b --verbose and README updated. 2023-12-29 22:50:41 +01:00
antirez
54946cbf14 Compare subcommand. 2023-12-28 17:24:05 +01:00
antirez
2a599dc5d0 Show subcommand: print total parameters. 2023-12-28 16:07:16 +01:00
antirez
e2062eea2c Q4_K dequantization. 2023-12-28 12:31:35 +01:00
antirez
c25ccfa02a Q8_0 dequantization. 2023-12-27 21:22:33 +01:00
antirez
558c7c3c6d Clarify the need for FP16 implementation. 2023-12-27 18:54:36 +01:00
antirez
bd4ecbda94 FP16 added. Split-mixtral improved. 2023-12-27 15:25:18 +01:00
antirez
a77a4d061c Mixtral experts extraction test. 2023-12-26 17:23:47 +01:00
antirez
7e9c2bd6a7 Better explain the tensor total size math. 2023-12-26 09:20:54 +01:00
antirez
3081d69b8e split-mixtral: copying of keys + APIs needed. 2023-12-26 09:14:50 +01:00
antirez
96e7eb2d4c gguf-tools: accept subcommands. 2023-12-26 00:07:56 +01:00
antirez
53fb176b3b Initial API to create new GGUF files.
Also added a few libraries that will be needed soon.
The CLI was renamed with the final name of gguf-tools.
2023-12-25 22:10:07 +01:00
antirez
3eb30c1872 API to remap/rewind + mapping in write mode. 2023-12-25 10:45:38 +01:00
antirez
f400e8a36f README added. 2023-12-24 23:46:46 +01:00