Mirror of https://github.com/ml-explore/mlx.git
	Fix cross-attention (#210)
* Fix cross-attention

  With the current code, ln2 is a no-op; its output should be passed to the cross-attention layer.

* Add name to contributors
@@ -7,6 +7,7 @@ with a short description of your contribution(s) below. For example:
 
 MLX was developed with contributions from the following individuals:
 
+- Juarez Bochi: Fixed bug in cross attention.
 
 # Third-Party Software
 
@@ -157,7 +157,7 @@ class TransformerDecoderLayer(Module):
         x = x + y
 
         y = self.ln2(x)
-        y = self.cross_attention(x, memory, memory, memory_mask)
+        y = self.cross_attention(y, memory, memory, memory_mask)
         x = x + y
 
         y = self.ln3(x)
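To make the patched control flow concrete, below is a minimal, self-contained Python sketch of a pre-norm decoder layer that mirrors the hunk above. It assumes mlx.nn's MultiHeadAttention, LayerNorm, and Linear APIs; the class name DecoderLayerSketch, the MLP sizing, and the mask defaults are illustrative and are not MLX's actual TransformerDecoderLayer.

import mlx.core as mx
import mlx.nn as nn


class DecoderLayerSketch(nn.Module):
    """Minimal pre-norm decoder layer mirroring the patched control flow (sketch)."""

    def __init__(self, dims: int, num_heads: int, mlp_dims: int = 256):
        super().__init__()
        # Building blocks assumed from mlx.nn; this class is illustrative,
        # not the library's TransformerDecoderLayer.
        self.self_attention = nn.MultiHeadAttention(dims, num_heads)
        self.cross_attention = nn.MultiHeadAttention(dims, num_heads)
        self.ln1 = nn.LayerNorm(dims)
        self.ln2 = nn.LayerNorm(dims)
        self.ln3 = nn.LayerNorm(dims)
        self.linear1 = nn.Linear(dims, mlp_dims)
        self.linear2 = nn.Linear(mlp_dims, dims)

    def __call__(self, x, memory, x_mask=None, memory_mask=None):
        # Self-attention block: normalize, attend over the target sequence,
        # add the residual.
        y = self.ln1(x)
        y = self.self_attention(y, y, y, x_mask)
        x = x + y

        # Cross-attention block: the fix is to query with the normalized
        # activations y. Passing x here (the pre-fix code) left ln2's output
        # unused, making that layer norm a no-op.
        y = self.ln2(x)
        y = self.cross_attention(y, memory, memory, memory_mask)
        x = x + y

        # Feed-forward block.
        y = self.ln3(x)
        y = self.linear2(mx.maximum(self.linear1(y), 0.0))
        return x + y


# Quick shape check with random inputs.
layer = DecoderLayerSketch(dims=64, num_heads=4)
x = mx.random.normal((2, 10, 64))        # (batch, target length, dims)
memory = mx.random.normal((2, 12, 64))   # (batch, source length, dims)
print(layer(x, memory).shape)            # (2, 10, 64)

The essence of the fix is visible in the cross-attention block: ln2(x) now feeds cross_attention, so the normalization actually participates in the computation instead of being discarded.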
Author: Juarez Bochi