tezvyn:

Cosine Similarity: Measuring Direction, Not Distance

Source: Wikipedia: Cosine similaritybeginner

Cosine similarity measures the angle between two vectors, not their distance, to gauge similarity. It asks, "Do these point in the same direction?" This is fundamental in AI for comparing text embeddings, where a vector's direction represents its meaning. The main footgun is confusing it with Euclidean distance; cosine similarity ignores vector magnitude, so two vectors can be far apart in space but still be considered nearly identical if their orientation is the same.

Think of cosine similarity as measuring the directional alignment of two vectors, not their physical distance. It calculates the cosine of the angle between them, effectively asking "how much do these point in the same direction?" rather than "how far apart are their endpoints?". This is a cornerstone of modern AI for comparing text embeddings, where vectors for "cat" and "feline" are semantically related because they point in a similar direction. The biggest footgun is confusing it with Euclidean distance. Cosine similarity completely ignores vector magnitude (length), so two documents with vastly different word counts can have a high similarity score if their topic proportions are the same, which can be misleading if scale is important.

Read the original → Wikipedia: Cosine similarity

Get five bites like this every day.

Tezvyn delivers a daily feed of 60-second tech bites with quizzes to lock in what you learn.

Cosine Similarity: Measuring Direction, Not Distance · Tezvyn