1. The Dot Product
When you use the standard * operator on two arrays of the same shape, NumPy multiplies each element of Array A by the element at the same position in Array B. This is 'Element-wise' multiplication.
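The np.dot() function performs a mathematical Dot Product, which for 2D arrays is matrix multiplication. Each entry of the result comes from one row of the first matrix and one column of the second: the paired values are multiplied and the products are summed. Since Python 3.5, you can use the @ operator instead. For 1D and 2D arrays, mat1 @ mat2 gives the same result as np.dot(mat1, mat2). Strictly speaking, @ calls np.matmul, which behaves differently from np.dot only for arrays with three or more dimensions and for scalars.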
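The difference between the two operations is easiest to see on small matrices. The following is a minimal sketch; the arrays `a` and `b` and their values are made up for illustration.

```python
import numpy as np

# Two small 2x2 matrices (illustrative values, not from the article)
a = np.array([[1, 2],
              [3, 4]])
b = np.array([[5, 6],
              [7, 8]])

# Element-wise: each position in a is multiplied by the same position in b
elementwise = a * b

# Dot product: each row of a is combined with each column of b and summed
dot_product = np.dot(a, b)

# The @ operator gives the same result for 2D arrays
at_product = a @ b

print(elementwise)   # [[ 5 12] [21 32]]
print(dot_product)   # [[19 22] [43 50]]
```

For example, the top-left entry of the dot product is 1*5 + 2*7 = 19, while the top-left entry of the element-wise product is simply 1*5 = 5.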
2. The Rule of Inner Dimensions
For a Dot Product to succeed, the 'Inner Dimensions' of the two matrices must be identical. If you are multiplying Matrix A (Rows_A, Cols_A) by Matrix B (Rows_B, Cols_B), the value of Cols_A must exactly equal Rows_B.
The resulting matrix will take the 'Outer Dimensions'. It will have the shape (Rows_A, Cols_B).
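The shape rule can be checked directly. This sketch uses arbitrary example shapes (3, 4) and (4, 2), then deliberately tries a mismatched pair to show the error NumPy raises.

```python
import numpy as np

A = np.ones((3, 4))   # Rows_A = 3, Cols_A = 4
B = np.ones((4, 2))   # Rows_B = 4, Cols_B = 2

# Inner dimensions match (4 == 4), so the product is defined
C = A @ B
result_shape = C.shape   # outer dimensions: (Rows_A, Cols_B)
print(result_shape)      # (3, 2)

# Inner dimensions do not match (4 != 3), so NumPy raises ValueError
mismatch_raised = False
try:
    A @ np.ones((3, 2))
except ValueError as err:
    mismatch_raised = True
    print("Shape mismatch:", err)
```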
3. Transposing Matrices
If you have two matrices that represent weights and features, both with shape (100, 50), you cannot calculate the dot product because 50 != 100.
To fix this, you must Transpose one of them. For a 2D array, np.transpose(arr) (or the extremely common shortcut arr.T) swaps the rows and columns. By transposing the second matrix, its shape becomes (50, 100). Now, (100, 50) @ (50, 100) is valid and results in a (100, 100) matrix.
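Here is a short sketch of the transpose fix. The names `weights` and `features` follow the example in the text; the random values are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)       # seeded only so the sketch is repeatable
weights = rng.random((100, 50))
features = rng.random((100, 50))

# weights @ features would fail: inner dimensions are 50 and 100

# Transposing the second matrix makes the inner dimensions agree
features_t = features.T              # same as np.transpose(features)
product = weights @ features_t

print(features_t.shape)   # (50, 100)
print(product.shape)      # (100, 100)
```

Which matrix you transpose changes the result: `weights.T @ features` is also valid but produces a (50, 50) matrix, so the choice depends on which dimension you want to sum over.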
Frequently Asked Questions
Is there a difference between np.matmul and np.dot?
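For 2D matrices, they produce the same result. However, for 3D or higher N-dimensional arrays, they behave differently. `np.matmul` (which the `@` operator calls) treats multidimensional arrays as stacks of 2D matrices stored in the last two axes and multiplies them pair by pair, which is usually what you want in Deep Learning. `np.dot` instead sums over the last axis of the first array and the second-to-last axis of the second array for every combination of the remaining axes, which produces a larger result. `np.matmul` also rejects scalar operands, while `np.dot` accepts them.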
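The difference shows up in the result shapes. In this sketch, the arrays stand for a hypothetical batch of two matrices; the shapes are chosen only for illustration.

```python
import numpy as np

a = np.ones((2, 3, 4))   # a stack of two (3, 4) matrices
b = np.ones((2, 4, 5))   # a stack of two (4, 5) matrices

# matmul pairs up the stacks: a[0] @ b[0] and a[1] @ b[1]
stacked = np.matmul(a, b)
print(stacked.shape)     # (2, 3, 5)

# dot combines every matrix in a with every matrix in b
dotted = np.dot(a, b)
print(dotted.shape)      # (2, 3, 2, 5)
```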
What is np.linalg?
`numpy.linalg` is the Linear Algebra module. It contains advanced mathematical functions for calculating matrix Determinants, Inverses, Eigenvalues, and solving linear equations.
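A few of these functions in use, as a minimal sketch. The matrix `M` and the right-hand side `rhs` are made-up values.

```python
import numpy as np

M = np.array([[2.0, 1.0],
              [1.0, 3.0]])
rhs = np.array([3.0, 5.0])

det = np.linalg.det(M)              # determinant: 2*3 - 1*1 = 5
inv = np.linalg.inv(M)              # inverse, so that M @ inv is the identity
eigenvalues = np.linalg.eigvals(M)  # eigenvalues of M
x = np.linalg.solve(M, rhs)         # solves M @ x = rhs

print(det)
print(x)
```

When the goal is to solve a linear system, `np.linalg.solve` is generally preferred over computing the inverse and multiplying by it.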
Does transposing an array consume memory?
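No data is copied. Just like basic slicing, `arr.T` creates a View. It simply swaps the shape and stride metadata of the array so NumPy reads the same buffer column-by-column instead of row-by-row. The only new allocation is a small array object holding that metadata, so the cost does not grow with the size of the array. Because the view shares its data with the original, writing to one also changes the other.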
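You can confirm the view behaviour yourself. This sketch uses a small example array and `np.shares_memory` to check that no data was copied.

```python
import numpy as np

arr = np.arange(6, dtype=np.int64).reshape(2, 3)
t = arr.T

# Both arrays point at the same underlying buffer
shared = np.shares_memory(arr, t)
print(shared)                    # True

# Only the metadata differs: shape and strides are reversed
print(arr.shape, arr.strides)
print(t.shape, t.strides)

# Writing through the view changes the original array
t[0, 1] = 99
print(arr[1, 0])                 # 99
```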
