Explain Q, K, and V matrices in self-attention

June 18, 2026 · Source: jalammar.github.io · intermediate

This tests the information-retrieval intuition behind self-attention. Cover that Q, K, V are linear projections of one input; Q requests, K indexes, V supplies content; scores weight a sum of V.

This tests whether you understand the information-retrieval pattern behind self-attention, not just the formula. A strong answer explains that Q, K, and V are learned linear projections of the same input embeddings: Q represents the current token's request, K indexes every token in the sequence, and V holds the content to retrieve. The dot product of a query with each key gives a compatibility score; the scores are scaled and passed through a softmax, and the resulting weights form a weighted sum of the V vectors. A red flag is describing Q, K, and V as separate model inputs or fixed embeddings rather than three learned views of the same source sequence.
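The retrieval pattern described above can be sketched in a few lines of dependency-free Python. This is a minimal single-head illustration, not a production implementation: the helper names (`matmul`, `softmax`, `self_attention`) and the toy weight matrices are assumptions chosen for the example, and the division by the square root of the key dimension follows the standard scaled dot-product formulation rather than anything stated in the summary itself. In a real model the three weight matrices are learned, not hand-written.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(s - m) for s in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x, w_q, w_k, w_v):
    """Single-head self-attention; x holds one embedding row per token."""
    # Q, K, and V are three linear projections of the SAME input x.
    q = matmul(x, w_q)
    k = matmul(x, w_k)
    v = matmul(x, w_v)
    d_k = len(k[0])
    # Compatibility scores: each query dotted with every key, then scaled.
    k_t = [list(col) for col in zip(*k)]
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, k_t)]
    # Softmax turns each row of scores into weights that sum to 1.
    weights = [softmax(row) for row in scores]
    # Each output row is a weighted sum of the value vectors.
    return matmul(weights, v), weights

# Toy input: three tokens with 2-dimensional embeddings (illustrative values).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w_q = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical projection weights
w_k = [[1.0, 0.0], [0.0, 1.0]]
w_v = [[2.0, 0.0], [0.0, 3.0]]

output, weights = self_attention(x, w_q, w_k, w_v)
```

Here `weights[i][j]` is how strongly token `i` attends to token `j`, and `output[i]` is the blend of value vectors that token `i` retrieves. Note that all three projections consume the same `x`, which is the point the red flag above is checking for.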

Read the original → jalammar.github.io

#llms
#transformers
#self-attention
#machine-learning
#interview
