Chunyang Wen
16856a0160
Remove useless pass ( #364 )
...
Co-authored-by: Chunyang Wen <chunyang_wen@apple.com>
2024-01-04 06:34:01 -08:00
toji
d2467c320d
Added support for python copy ( #335 )
...
* Added support for python copy
* precommit changes
* removed `_compiled_call_impl` line
* added tests and suggested changes
* ACK changes
2024-01-03 20:59:40 -08:00
Angelos Katharopoulos
e7f5059fe4
Support for quantized matmul with w and w^T ( #349 )
...
* Add the metal qvm implementation
* Add qmm_n
* Add gradient wrt input for quantized_matmul
2024-01-03 14:22:36 -08:00
Gabrijel Boduljak
c7edafb729
implemented InstanceNorm ( #244 )
...
* implemented instancenorm
* implemented vector_norm in cpp
added linalg to mlx
* implemented vector_norm python binding
* renamed vector_norm to norm, implemented norm without provided ord
* completed the implementation of the norm
* added tests
* removed unused import in linalg.cpp
* updated python bindings
* added some tests for python bindings
* handling inf, -inf as numpy does, more extensive tests of compatibility with numpy
* added better docs and examples
* refactored mlx.linalg.norm bindings
* reused existing util for implementation of linalg.norm
* more tests
* fixed a bug with no ord and axis provided
* removed unused imports
* some style and API consistency updates to linalg norm
* remove unused includes
* fix python tests
* fixed a bug with frobenius norm of a complex-valued matrix
* complex for vector too
* addressed PR review comments
* fixed import order in __init__
* expected values in instancenorm tests are simple lists
* minor return expression style change
* added InstanceNorm to docs
* doc string nits
* added myself to individual contributors
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2024-01-03 12:21:15 -08:00
Awni Hannun
dff4a3833f
Module checks the weight on load_weights ( #337 )
...
* update module to check weights on load, also fix docs and reorganize tests
* nits + rebase
* a few more docs updates for Module
* use manual module file
* comment
2024-01-02 18:55:42 -08:00
Angelos Katharopoulos
436bec9fd9
Fix the implementation of the Bilinear layer ( #347 )
2024-01-02 16:46:18 -08:00
Asaf Zorea
295ce9db09
Feature expand nn linear ( #315 )
...
* Added an identity and bilinear layers
Added a reset_parameters option
Added normal init for bias
* pre-commit run
* add type hints for parameters and the return type
change Bilinear math to x_1 and x_2
change __call__ arguments to x and y instead of input and output
add explanation to the Initialization
* Remove unnecessary reshape
* Added 'i' to bilinear formula
* Changed bilinear computation to two matrix multiplications
* avoid saving intermediate results, kept y in bilinear for better clarity (can be replaced with x1)
* Changed math formula in Linear
Added more explanation to math formulas
Changed x1, x2 reshape to support all inputs sizes
2024-01-02 06:08:53 -08:00
Josh Soref
44c1ce5e6a
Spelling ( #342 )
...
* spelling: accumulates
* spelling: across
* spelling: additional
* spelling: against
* spelling: among
* spelling: array
* spelling: at least
* spelling: available
* spelling: axes
* spelling: basically
* spelling: bfloat
* spelling: bounds
* spelling: broadcast
* spelling: buffer
* spelling: class
* spelling: coefficients
* spelling: collision
* spelling: combinations
* spelling: committing
* spelling: computation
* spelling: consider
* spelling: constructing
* spelling: conversions
* spelling: correctly
* spelling: corresponding
* spelling: declaration
* spelling: default
* spelling: dependency
* spelling: destination
* spelling: destructor
* spelling: dimensions
* spelling: divided
* spelling: element-wise
* spelling: elements
* spelling: endianness
* spelling: equivalent
* spelling: explicitly
* spelling: github
* spelling: indices
* spelling: irregularly
* spelling: memory
* spelling: metallib
* spelling: negative
* spelling: notable
* spelling: optional
* spelling: otherwise
* spelling: overridden
* spelling: partially
* spelling: partition
* spelling: perform
* spelling: perturbations
* spelling: positively
* spelling: primitive
* spelling: repeat
* spelling: repeats
* spelling: respect
* spelling: respectively
* spelling: result
* spelling: rounding
* spelling: separate
* spelling: skipping
* spelling: structure
* spelling: the
* spelling: transpose
* spelling: unnecessary
* spelling: unneeded
* spelling: unsupported
---------
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2024-01-01 21:08:17 -08:00
Nripesh Niketan
e09bf35b28
feat: Add Dropout3d layer to nn.layers ( #313 )
...
* feat: Add Dropout3d layer to nn.layers
* acknowledgement
* Add dropout tests to test_nn.py
* run pre-commit
* Add activation functions and dropout3d ops
* Add dropout tests for bfloat16 and float16
2023-12-31 14:01:21 -08:00
Hazem Essam
e3b8da2a49
Added implementation for Scaled RoPE. ( #261 )
...
* Added scale for RoPE
* Ran pre-commit
* Added RoPE scaling test
* Added docstring for scale parameter
* Modified docstrings
2023-12-31 06:06:01 -08:00
Nripesh Niketan
5ad8fb7268
feat: add softsign, softmax, hardswish, logsoftmax activation function ( #309 )
...
* feat: add softsign activation function
* run pre-commit
* Add Softsign activation function
* Add Softsign activation function
* Add documentation for ReLU6, Softplus, and Softsign activations
* Update activation functions in neural network layers
* Add LogSoftmax and Hardswish activations
* run pre-commit
* Update activations.py
* Added acknowledgements
* Fix activation function comments
* Fix activation functions in neural network layers
2023-12-29 11:49:36 -08:00
Angelos Katharopoulos
d29770eeaa
Update batchnorm to have the running stats in parameters ( #305 )
2023-12-28 14:31:10 -08:00
Chunyang Wen
040c3bafab
Add missing f str ( #306 )
...
Co-authored-by: Chunyang Wen <chunyang_wen@apple.com>
2023-12-28 06:09:34 -08:00
Chunyang Wen
05767b026f
Add information for dropout probability ( #304 )
...
Co-authored-by: Chunyang Wen <chunyang_wen@apple.com>
2023-12-27 21:51:30 -08:00
YUN, Junwoo
4417e37ede
Transformer fix ( #167 )
...
* add transformer with dropout, fix transformer ffm, layernorm order
* precommit changes
* precommit changes
* add docstring, activation, norm_first
* run precommit
* run precommit
* add docstring
* precommit
* style nits in docs
---------
Co-authored-by: junwoo-yun <junwoo.yun@bagelcode.com>
Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-27 08:48:36 -08:00
__mo_san__
a123c3c7d2
implement-batch-norm-layer ( #217 )
...
- Add batch normalization layer
---------
Co-authored-by: Robert McCraith <mccraithrobert@gmail.com>
Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-25 07:32:53 -08:00
Vidit Agarwal
acf1721b98
Corrected the example of value_and_grad ( #274 )
...
* Corrected the example for mx.value_and_grad
* Reformat through pre-commit/black
2023-12-23 11:06:38 -08:00
Justin Deschenaux
e8deca84e0
Add dropout2d ( #250 )
2023-12-22 08:02:29 -08:00
Hazem Essam
0aa65c7a6b
Added ALiBi implementation ( #232 )
2023-12-21 14:36:38 -08:00
Angelos Katharopoulos
b3916cbf2b
Improve names of quantization arguments ( #235 )
...
* Change the default quantization group_size to 64
* Rename groups to group_size and width to bits
2023-12-20 16:53:53 -08:00
Angelos Katharopoulos
57fe918cf8
Adds C++ and nn quantization utilities ( #230 )
...
* Add C++ de-/quantize ops
* Add quantize functions to the docs and tests
* Add a QuantizedLinear module
2023-12-20 14:17:38 -08:00
Juarez Bochi
f4f6e17d45
Fix cross-attention ( #210 )
...
* Fix cross-attention
With the current code, ln2 is a no-op. Its output should be passed to the cross-attention layer
* Add name to contributors
2023-12-18 12:27:27 -08:00
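To illustrate the cross-attention fix described in #210 above, here is a minimal sketch in plain Python. Only the names `ln2` and the cross-attention layer come from the commit message. The stand-in functions, the list-of-floats "tensors", and the residual connection are hypothetical simplifications for illustration, not the actual MLX code.

```python
# Hypothetical stand-ins: each "layer" is a plain function on a list of floats.

def ln2(x):
    """Stand-in for a layer norm: subtract the mean (no scaling, for brevity)."""
    mean = sum(x) / len(x)
    return [v - mean for v in x]


def cross_attention(queries, memory):
    """Stand-in for cross-attention: add the mean of `memory` to each query."""
    m = sum(memory) / len(memory)
    return [q + m for q in queries]


def decoder_step_buggy(x, memory):
    y = ln2(x)                      # the normalized result is computed ...
    y = cross_attention(x, memory)  # ... but x is passed, so ln2 is a no-op
    return [a + b for a, b in zip(x, y)]  # assumed residual connection


def decoder_step_fixed(x, memory):
    y = ln2(x)
    y = cross_attention(y, memory)  # the output of ln2 feeds cross-attention
    return [a + b for a, b in zip(x, y)]  # assumed residual connection


x = [1.0, 2.0, 3.0]
memory = [0.5, 1.5]
print(decoder_step_buggy(x, memory))
print(decoder_step_fixed(x, memory))
```

In the buggy variant the value returned by `ln2` is overwritten before it is used, so removing that line would not change the output. That is what "no-op" means in the commit message.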
Awni Hannun
ee0c2835c5
Docs updates ( #198 )
...
Reorganize NN docs + a few other tidbits.
2023-12-17 13:20:55 -08:00
Awni Hannun
2e02acdc83
add base kwarg to rope ( #186 )
2023-12-15 16:47:59 -08:00
Víctor Aguilar
f24200db2c
accross -> across ( #183 )
2023-12-15 13:46:50 -08:00
Awni Hannun
25f70d4ca4
Fix divide types + floor divide (//) ( #138 )
...
* divide types
* fix black + test
2023-12-11 20:20:58 -08:00
Diogo
02de234ef0
Activations LeakyReLU / PReLU / Softplus / Mish ( #109 )
...
* Leaky_relu / prelu / softplus / mish
* added tests
* updated bench
* remove torch refs, add init to PReLU
* added arXiv reference to mish
* added missing docs
2023-12-11 19:40:57 -08:00
Nicholas Santavas
f5df47ec6e
Add Step, ELU, SELU, Swish activation functions ( #117 )
...
* Add Step, ELU, SELU, Swish activation functions
This commit adds the Step, ELU, SELU and Swish activation functions
* add to the docs
* review
2023-12-11 17:04:07 -08:00
Jason
b0cd092b7f
Added activation functions: leaky_relu relu6 softplus elu celu logsigmoid ( #108 )
...
* added leaky_relu relu6 softplus elu celu logsigmoid
* minor fixes for docstring and benchmark imports
* fixed elu implementation and added tests
* added tests for optional param, changed leaky_relu param to fit PyTorch documentation
2023-12-10 16:31:38 -08:00
Awni Hannun
71d1fff90a
Bug fix in metal binary kernel dispatch for large arrays ( #125 )
...
* bug fix
* format
2023-12-10 16:12:31 -08:00
Henry Ansah
68bf1d7867
add nn module for sigmoid activation ( #111 )
...
* add nn module for sigmoid activation
* update .gitignore with .cache folder generated by jetbrains fleet ide
* remove .cache folder
2023-12-10 07:00:39 -08:00
__mo_san__
ef7b8756c0
Add tanh activation function ( #115 )
...
* added Adagrad optimizer ...
* added Tanh activation function ...
* reformatted file ...
* remove unrelated stuff ...
* Update activations.py
2023-12-09 19:25:38 -08:00
Joe Barrow
ac6dc5d3eb
Adding optional bias param to MultiHeadAttention ( #104 )
...
* Adding optional param to
* Run style-checker
2023-12-09 11:04:28 -08:00
Zach Schillaci
5b9be57ac3
Add isort pre-commit and run ( #68 )
2023-12-08 11:31:47 -08:00
Zach Schillaci
d11d77e581
Spelling fixes in transformer.py ( #59 )
2023-12-07 13:32:09 -08:00
rushyam
2e126aeb7e
Feature Addition: Encoder-Decoder Transformer Architecture ( #50 )
...
* Implemented decoder-transformer-layer, decoder-transformer and introduce encoder-decoder transformer
* added relu layer
* add src, tgt, memory mask
---------
Co-authored-by: rushyam <rushyam@rushyams-MacBook-Air.local>
2023-12-07 07:37:36 -08:00
Jagrit Digani
2440fe0124
NPY loading segfault bug ( #34 )
...
* Fixed GIL semantics in loading and saving from python file streams
2023-12-06 12:03:47 -08:00
Markus Enzweiler
2ffaee0c0d
Updated default argument for stride to 1 in Conv2d() in the docstring ( #22 )
2023-12-06 07:17:58 -08:00
Awni Hannun
46a39e5b1f
copyright + ack
2023-11-30 11:12:53 -08:00
Jagrit Digani
e6306cfee9
jagrit's commit files
2023-11-29 10:52:08 -08:00
Angelos Katharopoulos
d1f86272a2
angelos's commit files
2023-11-29 10:42:59 -08:00
Awni Hannun
8ca7f9e8e9
awni's commit files
2023-11-29 10:30:41 -08:00