Cross Attention in Decoder Block of Transformer

 


[Figure: the Transformer architecture, with the cross-attention layer in the decoder block marked]

Notice where the cross attention is marked: two arrows come in from the encoder block, and one comes in from the decoder block.

Why do we need to consider the Encoder block?


Now, let's say we have already predicted 2 words and need to predict the 3rd word. What will it depend on?

Of course, on the first 2 words already generated by the decoder block, and on the context of the original sentence from the Encoder block.

So, we need to figure out the relationship between these two.


How will we get the relationship?
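The relationship is computed with the same scaled dot-product attention used everywhere else in the Transformer; only the source of Q, K, and V changes:

Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V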





For an English → Hindi translation, the inputs are wired up as follows (a minimal code sketch is shown below):

q : Hindi (from the Decoder block)

k : English (from the Encoder block)

v : English (from the Encoder block)
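
Here is a minimal PyTorch sketch of this wiring, assuming a standard nn.MultiheadAttention layer; the CrossAttention class name, tensor sizes, and variable names are made up for illustration.

import torch
import torch.nn as nn

class CrossAttention(nn.Module):
    """Cross attention: queries from the decoder, keys/values from the encoder."""
    def __init__(self, d_model: int, num_heads: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)

    def forward(self, decoder_states, encoder_outputs):
        # q comes from the decoder (Hindi side); k and v come from the encoder (English side)
        out, weights = self.attn(
            query=decoder_states,     # (batch, tgt_len, d_model)
            key=encoder_outputs,      # (batch, src_len, d_model)
            value=encoder_outputs,    # (batch, src_len, d_model)
        )
        return out, weights

# Usage sketch with made-up sizes: 2 Hindi words predicted so far, 6 English source tokens.
d_model, num_heads = 512, 8
decoder_states = torch.randn(1, 2, d_model)
encoder_outputs = torch.randn(1, 6, d_model)

cross_attn = CrossAttention(d_model, num_heads)
out, weights = cross_attn(decoder_states, encoder_outputs)
print(out.shape)      # torch.Size([1, 2, 512])
print(weights.shape)  # torch.Size([1, 2, 6])

The attention-weight shape (1, 2, 6) captures the intuition above: each of the 2 already-predicted Hindi positions distributes its attention over the 6 English tokens, which is exactly the relationship between the decoder's output so far and the encoder's sentence context.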

