CPU compile (#691)

* build and load shared object for cpu compile

* nits

* cpu compile tests pass

* cpu compile tests pass

* fix preamble for g++

* donation

* fix gpu buffer donation

* reuse prebuilt libraries

* faster contiguity conditions

* fix test

* rid compiler warning

* fast erf

* Fix float16 for compile and add more types to cpu compile

* Remove a forgotten comment

* use cached libs

* nits

---------

Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>
Awni Hannun
2024-02-17 06:54:32 -08:00
committed by GitHub
parent c3965fc5ee
commit dc937b8ed3
13 changed files with 1716 additions and 192 deletions

@@ -319,6 +319,9 @@ void compile_simplify(
 case 1:
   v = *a.data<uint8_t>();
   break;
+case 2:
+  v = *a.data<uint16_t>();
+  break;
 case 4:
   v = *a.data<uint32_t>();
   break;
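
Note on the hunk above: the switch reads a scalar's raw bits according to its item size, and the added 2-byte case covers 16-bit types such as float16 (the "Fix float16 for compile" item in the message). The following is a minimal standalone sketch of that idea, not the MLX implementation. The helper name `scalar_bits`, its signature, and the 8-byte case are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>

// Hypothetical helper (not part of MLX): widen the raw bits of a scalar
// that occupies `itemsize` bytes into a uint64_t, so scalars of different
// widths can be compared or hashed uniformly.
inline uint64_t scalar_bits(const void* data, std::size_t itemsize) {
  switch (itemsize) {
    case 1: {
      uint8_t v;
      std::memcpy(&v, data, sizeof(v));
      return v;
    }
    case 2: {
      // 16-bit types (e.g. float16) land here; without this case their
      // bits would never be read.
      uint16_t v;
      std::memcpy(&v, data, sizeof(v));
      return v;
    }
    case 4: {
      uint32_t v;
      std::memcpy(&v, data, sizeof(v));
      return v;
    }
    case 8: {
      // Assumption: included for completeness; the hunk does not show it.
      uint64_t v;
      std::memcpy(&v, data, sizeof(v));
      return v;
    }
    default:
      throw std::invalid_argument("unsupported item size");
  }
}
```

For example, the float16 encoding of 1.0 is the bit pattern `0x3C00`, so passing a pointer to those two bytes with `itemsize` 2 returns `0x3C00`. The sketch uses `std::memcpy` rather than a pointer cast to avoid alignment and aliasing concerns in standalone code.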