
Cosine Similarity

Module 1.5: Learn to calculate similarity mathematically. Move from TF-IDF text matrices to angular geometry to find similar items.


SYS> How do Netflix and Amazon know two items are similar? They don't look at the titles; they look at the vectors. Let's explore Cosine Similarity.


Algorithm Matrix

UNLOCK NODES BY MASTERING VECTORS.

Concept: Dot Product

The sum of the products of the corresponding entries of two equal-length sequences of numbers.
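
A minimal sketch of this definition in Python (the vectors are illustrative; NumPy's np.dot gives the same result):

import numpy as np

a = np.array([1, 2, 5])
b = np.array([4, 0, 3])

# Multiply corresponding entries, then sum the products.
manual = sum(x * y for x, y in zip(a, b))  # 1*4 + 2*0 + 5*3 = 19
assert manual == np.dot(a, b)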

Validation Node

If Vector A is [2, 0] and Vector B is [3, 1], what is the Dot Product?


Cosine Similarity: The Math Behind Recommendations

"In higher-dimensional spaces, the distance between data points can be misleading. By measuring the angle instead of the distance, we focus on user behavior patterns rather than sheer volume."

The Geometric Approach

In Recommender Systems, items (and users) are represented as vectors in a multi-dimensional space. For example, a movie might be an array of TF-IDF scores for different genres. How do we know if Movie A is similar to Movie B?

We could measure the straight-line Euclidean distance between them. But if Movie A is extremely popular (lots of high ratings) and Movie B is a niche classic (fewer ratings, but identical proportions), Euclidean distance would say they are far apart.

Cosine Similarity ignores the magnitude and looks only at the angle ($\theta$) between the vectors. If they point in the same direction, the angle is 0, and $\cos(0) = 1$. They are perfectly similar in pattern!
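
A small sketch of that geometric claim, with made-up genre-score vectors in which the niche classic b has the same proportions as the blockbuster a at a tenth of the scale:

import numpy as np

a = np.array([10.0, 20.0, 5.0])  # popular blockbuster (large magnitude)
b = a / 10                       # niche classic (same direction, small magnitude)

cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
angle_deg = np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

print(np.linalg.norm(a - b))  # Euclidean distance: ~20.6, "far apart"
print(cos_theta, angle_deg)   # 1.0 and 0.0 (up to float rounding): identical pattern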

The Formula

The cosine of the angle between two non-zero vectors $A$ and $B$ is derived from the Euclidean dot product formula:

$$\text{similarity}(A, B) = \cos(\theta) = \frac{A \cdot B}{||A|| \times ||B||}$$
  • Numerator ($A \cdot B$): The Dot Product. You multiply the corresponding elements of the vectors and sum them. This captures what the two items have in common.
  • Denominator ($||A|| \times ||B||$): The product of their Magnitudes (L2 Norms). This normalizes the result, dividing out the influence of the vectors' lengths (see the worked example below).
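
As a worked example with illustrative vectors $A = [1, 2]$ and $B = [2, 4]$ (note that $B = 2A$, so they point in the same direction):

$$A \cdot B = (1)(2) + (2)(4) = 10$$

$$||A|| = \sqrt{1^2 + 2^2} = \sqrt{5}, \quad ||B|| = \sqrt{2^2 + 4^2} = 2\sqrt{5}$$

$$\cos(\theta) = \frac{10}{\sqrt{5} \times 2\sqrt{5}} = \frac{10}{10} = 1$$

Despite their different lengths, the similarity is a perfect 1.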

AI Search & RecSys FAQ

Why use Cosine Similarity instead of Euclidean Distance in Recommender Systems?

Scale Invariance: Cosine similarity is independent of the magnitude of the vectors. If User A gave 10 ratings and User B gave 100 ratings, but their rating patterns (ratios across genres) are identical, Cosine Similarity evaluates them as 1.0 (perfectly similar). Euclidean distance would consider them far apart strictly due to the volume difference.
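
A quick sketch of that claim, using invented per-genre rating counts:

import numpy as np

user_a = np.array([2, 5, 3])     # 10 ratings across three genres
user_b = np.array([20, 50, 30])  # 100 ratings, identical proportions

cos = np.dot(user_a, user_b) / (np.linalg.norm(user_a) * np.linalg.norm(user_b))
print(round(cos, 6))                              # 1.0: perfectly similar
print(round(np.linalg.norm(user_a - user_b), 1))  # ~55.5: "far apart" by distance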

What are the bounds (range) of Cosine Similarity?

Standard Cosine Similarity ranges from -1 to 1.

  • 1: Vectors point in exactly the same direction (identical patterns).
  • 0: Vectors are orthogonal (completely uncorrelated/independent).
  • -1: Vectors point in opposite directions (perfectly anti-correlated).

Note: In term frequency datasets (like TF-IDF matrices where values are $\ge 0$), the range is strictly 0 to 1.
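
Each case, sketched with toy 2-D vectors (a minimal helper that assumes non-zero inputs):

import numpy as np

def cos_sim(a, b):
    # Dot product divided by the product of the L2 norms.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cos_sim(np.array([1, 1]), np.array([2, 2])))    #  1.0: same direction
print(cos_sim(np.array([1, 0]), np.array([0, 1])))    #  0.0: orthogonal
print(cos_sim(np.array([1, 1]), np.array([-1, -1])))  # -1.0: opposite directions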

How do you compute Cosine Similarity in Python?

While you can write it from scratch using numpy.dot and numpy.linalg.norm, in production settings such as content-based filtering it is usually more efficient to use scikit-learn, which computes the full pairwise similarity matrix in a single vectorized call.

from sklearn.metrics.pairwise import cosine_similarity

# X: a 2-D array or sparse matrix, one row per item (e.g., TF-IDF vectors).
# Entry [i, j] of the result is the cosine similarity of rows i and j.
# cosine_similarity(X) alone is equivalent, since Y defaults to X.
similarity_matrix = cosine_similarity(X, X)
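
For comparison, a from-scratch sketch of the same pairwise computation in NumPy (assuming a dense X): normalize each row to unit length, then take all pairwise dot products.

import numpy as np

def cosine_similarity_matrix(X):
    # Divide each row by its L2 norm so every row has length 1;
    # the pairwise dot products are then exactly cos(theta).
    X_unit = X / np.linalg.norm(X, axis=1, keepdims=True)
    return X_unit @ X_unit.T

X = np.array([[1.0, 2.0, 0.0],
              [2.0, 4.0, 0.0],
              [0.0, 0.0, 3.0]])
print(cosine_similarity_matrix(X))  # rows 0 and 1 -> 1.0; row 2 -> 0.0 vs. both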

Vector Terminology

Vector
A one-dimensional array of numbers representing an item's features or a user's ratings in multi-dimensional space.
Dot Product
An algebraic operation that takes two equal-length sequences of numbers and returns a single number, representing their overlap.
Magnitude (L2 Norm)
The length of a vector, calculated as the square root of the sum of its squared components.
Sparse Matrix
A matrix in which most of the elements are zero. Common in RecSys since a single user only rates a tiny fraction of all available items.
TF-IDF
Term Frequency-Inverse Document Frequency. A weighting mechanism used before applying Cosine Similarity in content-based NLP models.
Orthogonal
When two vectors are at a 90-degree angle, meaning they have zero overlap. Their cosine similarity is 0.