Mirror of https://github.com/ml-explore/mlx.git (synced 2025-10-31 07:58:14 +08:00)
	Fix cross-attention (#210)
* Fix cross-attention

  With the current code, ln2 is a no-op. Its output should be passed to the cross-attention layer.

* Add name to contributors
@@ -7,6 +7,7 @@ with a short description of your contribution(s) below. For example:
 
 MLX was developed with contributions from the following individuals:
 
+- Juarez Bochi: Fixed bug in cross attention.
 
 # Third-Party Software
 
@@ -157,7 +157,7 @@ class TransformerDecoderLayer(Module):
         x = x + y
 
         y = self.ln2(x)
-        y = self.cross_attention(x, memory, memory, memory_mask)
+        y = self.cross_attention(y, memory, memory, memory_mask)
         x = x + y
 
         y = self.ln3(x)
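For context, the hunk above is from a pre-norm decoder layer: each sub-block normalizes the residual stream, runs attention or the feed-forward network on the normalized tensor, and adds the result back. Passing `x` instead of `y` to the cross-attention call meant the output of `ln2` was computed and then discarded, which is why the commit message calls it a no-op. The sketch below illustrates the corrected data flow only; it is not the upstream MLX `TransformerDecoderLayer`, and the class name, feed-forward sizing, and mask arguments are illustrative, assuming `mlx.nn`'s `MultiHeadAttention(dims, num_heads)`, `LayerNorm`, and `Linear` modules.

```python
# Minimal sketch of a pre-norm decoder layer (illustrative, not the MLX source).
import mlx.core as mx
import mlx.nn as nn


class DecoderLayerSketch(nn.Module):
    def __init__(self, dims: int, num_heads: int):
        super().__init__()
        self.self_attention = nn.MultiHeadAttention(dims, num_heads)
        self.cross_attention = nn.MultiHeadAttention(dims, num_heads)
        self.ln1 = nn.LayerNorm(dims)
        self.ln2 = nn.LayerNorm(dims)
        self.ln3 = nn.LayerNorm(dims)
        self.linear1 = nn.Linear(dims, 4 * dims)  # feed-forward sizing is arbitrary here
        self.linear2 = nn.Linear(4 * dims, dims)

    def __call__(self, x, memory, x_mask=None, memory_mask=None):
        # Pre-norm self-attention block.
        y = self.ln1(x)
        y = self.self_attention(y, y, y, x_mask)
        x = x + y

        # Pre-norm cross-attention block: the *normalized* tensor `y` is the
        # query; using `x` here would silently skip ln2 (the fixed bug).
        y = self.ln2(x)
        y = self.cross_attention(y, memory, memory, memory_mask)
        x = x + y

        # Pre-norm feed-forward block.
        y = self.ln3(x)
        y = self.linear2(nn.relu(self.linear1(y)))
        return x + y


# Usage example with random tensors (shapes are illustrative).
x = mx.random.normal((1, 10, 64))       # decoder input
memory = mx.random.normal((1, 12, 64))  # encoder output
layer = DecoderLayerSketch(dims=64, num_heads=4)
out = layer(x, memory)                  # shape: (1, 10, 64)
```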
Author: Juarez Bochi