Awni Hannun
|
40c62c1321
|
Use int64 stride everywhere (#1671)
* use int64 stride everywhere
* fix ext
* fix ext
* more shape + cleanup
* one more
* few more
|
2024-12-09 11:09:02 -08:00 |
|
Awni Hannun
|
98b6ce3460
|
Refactor reductions and fix scatter atomics for large sizes (#1300)
Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>
|
2024-08-22 16:03:31 -07:00 |
|
Awni Hannun
|
fe3167d7ea
|
smaller CPU binary (#1203)
* smaller CPU binary
* fix no cpu build
|
2024-06-14 09:46:55 -07:00 |
|
Angelos Katharopoulos
|
29221fa238
|
Implement vjps for some primitives in the fast namespace (#883)
* Implement rope vjp in terms of rope
* RMSNormVJP primitive and kernel
* Add LayerNormVJP primitive and kernel
|
2024-03-26 16:35:34 -07:00 |
|
Angelos Katharopoulos
|
9e6b8c9f48
|
Refactor the reduction kernels (#277)
|
2023-12-24 14:47:57 -08:00 |
|
Awni Hannun
|
46a39e5b1f
|
copyright + ack
|
2023-11-30 11:12:53 -08:00 |
|
Awni Hannun
|
8ca7f9e8e9
|
awni's commit files
|
2023-11-29 10:30:41 -08:00 |
|