Mirror of https://github.com/ml-explore/mlx.git
Fix cross-attention (#210)
* Fix cross-attention

  With the current code, ln2 is a no-op. Its output should be passed to the cross-attention layer.

* Add name to contributors
parent 4d4af12c6f
commit f4f6e17d45
@@ -7,6 +7,7 @@ with a short description of your contribution(s) below. For example:
 
 MLX was developed with contributions from the following individuals:
 
+- Juarez Bochi: Fixed bug in cross attention.
 
 # Third-Party Software
 
@@ -157,7 +157,7 @@ class TransformerDecoderLayer(Module):
         x = x + y
 
         y = self.ln2(x)
-        y = self.cross_attention(x, memory, memory, memory_mask)
+        y = self.cross_attention(y, memory, memory, memory_mask)
         x = x + y
 
         y = self.ln3(x)