hdeng-apple
86984cad68
Remove static initializers (#2059)
...
* Remove static initializers in device.cpp, load.cpp, pocketfft.h
* Remove static initializer InTracing::trace_stack
* Remove static initializer of CompilerCache cache
* Revert changes in pocketfft.h
* Remove duplicate private section of thread_pool()
2025-04-24 06:14:49 -07:00
Angelos Katharopoulos
ccb61d7aae
Ring distributed backend (#1784)
2025-01-27 22:15:01 -08:00
Awni Hannun
4ba0c24a8f
Export / import functions to / from a file (#1642)
...
* export and import functions
* refactor + works for few primitives
* nit
* allow primitives with state
* nit
* nit
* simplify serialize / deserialize
* fix for constants
* python bindings
* maybe fix serialize failure case
* add example
* more primitives, training kind of works
* same result for python and c++
* some fixes
* fix export
* template it up
* some simplification
* rebase
* allow kwargs and multiple functions
* exporter
* more primitives for exporting
* deal with endianness
* handle invalid stream
* add docstring
2024-12-24 11:19:13 -08:00
Cheng
635117c5d4
Read/write files in binary mode (#1698)
2024-12-13 17:37:05 -08:00
Cheng
9635cffdc8
Include io.h in MSVC for IO functions (#1661)
2024-12-07 18:26:06 -08:00
Awni Hannun
dba2bd1105
Even Even Faster IO (#1374)
...
* even more faster io
* make reader pool static
* make python reader thread safe
* one more optimization
2024-08-29 16:05:40 -07:00
Awni Hannun
fcb65a3897
Even Faster I/O (#1369)
...
* try multithreading for faster IO
* smaller batch size
* Account for pread returning less than size
* nit
---------
Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>
2024-08-28 11:49:07 -07:00
Awni Hannun
8ae751d3da
fix io (#1343)
...
* fix io
* fix io
* comment
2024-08-21 13:14:46 -07:00
Awni Hannun
d0630ffe8c
Read arrays from files faster (#1330)
...
* read faster
* faster write as well
* set default permission for linux
* comment
2024-08-14 20:09:56 -07:00
Cheng
9663c22fe9
Do not store iostream in shared_ptr (#872)
...
There is no need to store iostream in shared_ptr; doing so adds the cost
of a heap allocation.
2024-03-22 06:54:45 -07:00
Angelos Katharopoulos
a611b0bc82
Removes the retain_graph flag (#385)
...
* Adds global tracing flag
* Removes retain_graph in favor of is_tracer
2024-01-07 15:16:51 -08:00
Diogo
1f6ab6a556
Safetensor support (#215)
...
Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-27 02:06:55 -08:00