Computer Vision is often about getting the image into the right shape. Geometric transformations allow us to normalize orientations and resize data for neural networks.
1. Geometric Warping
Images are not just static pictures; they are grids of pixels that can be treated as geometric data. In this module, we will learn how to scale, rotate, and translate images using affine transformations.
Scaling is more than just stretching or shrinking dimensions. The resized image has a different pixel grid, so new pixel values have to be estimated from the original ones. The interpolation method determines how OpenCV computes those values when resizing up or down.
# Image Scaling and Resizing
# Requires Interpolation Algorithms
# To prevent data loss or artifacts.
2. Downsampling (Shrinking)
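When shrinking an image, we use cv2.INTER_AREA. It resamples using pixel area relation: each output pixel is an average of the source pixels it covers, rather than a copy of a single sampled pixel.
This matters because point-sampling a finely detailed image causes aliasing, which shows up as wavy artifacts known as Moiré patterns. The OpenCV documentation notes that INTER_AREA may be the preferred method for image decimation because it gives Moiré-free results.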
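To see why averaging matters, here is a minimal NumPy sketch that contrasts naive strided sampling with block averaging on a synthetic striped image. It is a conceptual model for integer shrink factors only, not OpenCV's implementation, and the helper names downsample_naive and downsample_area are hypothetical.

```python
import numpy as np

def downsample_naive(img, factor):
    """Hypothetical helper: keep every `factor`-th pixel and drop the rest."""
    return img[::factor, ::factor]

def downsample_area(img, factor):
    """Hypothetical helper: average each factor x factor block of pixels."""
    h, w = img.shape
    h2, w2 = h // factor, w // factor
    blocks = img[:h2 * factor, :w2 * factor].reshape(h2, factor, w2, factor)
    return blocks.mean(axis=(1, 3))

# 8x8 image of one-pixel-wide vertical stripes: 0, 255, 0, 255, ...
stripes = np.tile(np.array([0, 255], dtype=np.float32), (8, 4))

naive = downsample_naive(stripes, 2)     # samples only the dark columns
averaged = downsample_area(stripes, 2)   # mixes dark and bright columns
```

The naive version happens to land only on dark columns and returns an all-black image, while block averaging returns a uniform mid-gray (127.5), which is much closer to how the stripes look from a distance.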
import cv2
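img = cv2.imread('input.jpg')
# cv2.imread returns None instead of raising if the file cannot be read
if img is None:
    raise FileNotFoundError('input.jpg')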
# Scaling down: Use INTER_AREA for best quality
# Reduces size without Moiré patterns
resized = cv2.resize(img, (300, 300), interpolation=cv2.INTER_AREA)
3. Upsampling (Zooming In)
For enlarging images (zooming in), INTER_CUBIC or INTER_LINEAR are the usual choices.
INTER_LINEAR (bilinear) blends the 2x2 block of nearest source pixels, while INTER_CUBIC (bicubic) fits a cubic curve over a 4x4 neighborhood. Both estimate new pixel values smoothly, avoiding the blocky look of nearest-neighbor enlargement. INTER_CUBIC is slower but usually gives sharper results, while INTER_LINEAR is the default for cv2.resize and offers a good balance of speed and quality for general-purpose resizing.
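The difference between copying pixels and interpolating between them can be sketched on a single row of pixels. The example below uses np.interp as a stand-in for linear interpolation. It only illustrates the idea: OpenCV aligns pixel centers when it resamples, so its values near the image edges will differ from this simplified grid.

```python
import numpy as np

row = np.array([0.0, 100.0, 200.0])  # three source pixels

# Nearest-neighbor style: repeat each pixel, which produces hard steps
nearest = np.repeat(row, 2)

# Linear: sample at fractional positions between the source pixels
positions = np.linspace(0, len(row) - 1, 5)  # 0, 0.5, 1, 1.5, 2
linear = np.interp(positions, np.arange(len(row)), row)

print(nearest)
print(linear)
```

Here nearest contains each value twice (0, 0, 100, 100, 200, 200), whereas linear fills the gaps with in-between values (0, 50, 100, 150, 200). Cubic interpolation follows the same idea but fits a smoother curve through more neighbors.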
# Scaling up: Use INTER_CUBIC (high quality but slow)
# Alternatively: INTER_LINEAR (good balance)
zoomed = cv2.resize(img, None, fx=2, fy=2, interpolation=cv2.INTER_CUBIC)
4. Rotation Matrices
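Rotation is one kind of affine transformation. To rotate, we first calculate a 2x3 transformation matrix (M). Its left 2x2 block holds the cosine and sine terms of the rotation (multiplied by the scale), and its last column holds the translation that keeps the chosen center in place.
We use cv2.getRotationMatrix2D to generate the matrix. We provide the center, the angle in degrees (positive values rotate counter-clockwise), and a scaling factor. Then, we apply that matrix using cv2.warpAffine to actually perform the rotation. The third argument of cv2.warpAffine is the output size as (width, height); because we keep the original size here, corners that rotate outside that frame are cropped.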
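As a rough illustration of what the matrix contains, the sketch below rebuilds it with NumPy from the formula given in the OpenCV documentation for getRotationMatrix2D. The function name rotation_matrix_2d is hypothetical and exists only for this example.

```python
import numpy as np

def rotation_matrix_2d(center, angle_deg, scale):
    """Hypothetical helper: build the 2x3 matrix from the documented formula."""
    cx, cy = center
    theta = np.deg2rad(angle_deg)
    alpha = scale * np.cos(theta)
    beta = scale * np.sin(theta)
    return np.array([
        [alpha, beta, (1 - alpha) * cx - beta * cy],
        [-beta, alpha, beta * cx + (1 - alpha) * cy],
    ])

M = rotation_matrix_2d((100, 50), 90, 1.0)

# The rotation center maps to itself
center_out = M @ np.array([100, 50, 1])

# A point 10 px to the right of the center ends up 10 px above it
# (y grows downward in image coordinates), i.e. counter-clockwise on screen
moved = M @ np.array([110, 50, 1])
```

Multiplying the matrix by a coordinate written as (x, y, 1) gives the new (x, y). Here the center stays at (100, 50) and the point (110, 50) lands at (100, 40), up to floating-point rounding.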
(h, w) = img.shape[:2]
center = (w // 2, h // 2)
# center, angle (deg), scale
M = cv2.getRotationMatrix2D(center, 45, 1.0)
# Apply the transformation matrix to the image
rotated = cv2.warpAffine(img, M, (w, h))
5. Affine Properties & Translation
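Affine transformations have a crucial property: they always preserve parallel lines. If two roads are parallel in the input image, they remain parallel after translation, rotation, or scaling. Lengths and angles, by contrast, are not necessarily preserved: scaling changes lengths, and shearing changes angles.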
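This property can be checked numerically. The sketch below applies an arbitrary affine matrix (its values, including the shear-like terms, are made up for illustration) to two parallel segments and confirms that their direction vectors stay parallel. The helper names apply_affine and cross_2d are hypothetical.

```python
import numpy as np

# Arbitrary 2x3 affine matrix; the numbers are illustrative only
A = np.array([[1.2, 0.5, 30.0],
              [-0.3, 0.9, -10.0]])

def apply_affine(A, pts):
    """Apply a 2x3 affine matrix to an (N, 2) array of (x, y) points."""
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])
    return homogeneous @ A.T

def cross_2d(u, v):
    """z-component of the cross product; zero when u and v are parallel."""
    return u[0] * v[1] - u[1] * v[0]

# Two parallel segments: same direction (4, 2), different positions
seg_a = np.array([[0.0, 0.0], [4.0, 2.0]])
seg_b = np.array([[1.0, 5.0], [5.0, 7.0]])

out_a = apply_affine(A, seg_a)
out_b = apply_affine(A, seg_b)

dir_a = out_a[1] - out_a[0]
dir_b = out_b[1] - out_b[0]
still_parallel = np.isclose(cross_2d(dir_a, dir_b), 0.0)
```

The translation column of the matrix cancels out when the endpoints are subtracted, so both directions are transformed by the same 2x2 block and remain parallel.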
By manually creating a matrix, we can translate (move) an image. We create a float32 matrix with [1, 0, tx] and [0, 1, ty], where tx and ty are the number of pixels to shift on the X and Y axes. A negative tx would shift the image left.
import numpy as np
# Translation Matrix
# tx=100 (shift right), ty=50 (shift down)
M_trans = np.float32([[1, 0, 100], [0, 1, 50]])
shifted = cv2.warpAffine(img, M_trans, (w, h))
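To make the effect of the matrix concrete, this sketch applies translation matrices to a single pixel coordinate with NumPy rather than to a whole image. The helper name translate_point is hypothetical.

```python
import numpy as np

def translate_point(M, x, y):
    """Hypothetical helper: map one (x, y) coordinate through a 2x3 matrix."""
    return M @ np.array([x, y, 1.0], dtype=np.float32)

# tx=100, ty=50: shift right and down
M_right_down = np.float32([[1, 0, 100], [0, 1, 50]])
# Negative values shift left and up
M_left_up = np.float32([[1, 0, -100], [0, 1, -50]])

moved = translate_point(M_right_down, 10, 20)
back = translate_point(M_left_up, moved[0], moved[1])
```

The point (10, 20) moves to (110, 70), and the opposite matrix brings it back. When cv2.warpAffine applies such a matrix to an image, pixels shifted outside the (w, h) output are discarded and the vacated area is filled with black by default.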