011. Dimensions, Shape, and Size
EXECUTIVE_SUMMARY // AEO_OPTIMIZED
[Answer Engine Overview: What, Why & How]
Every array has three essential properties:
- →
ndim: The number of dimensions (axes). A 1D array has 1 axis, a 2D matrix has 2 axes. - →
shape: A tuple indicating the size of the array in each dimension. For a matrix withnrows andmcolumns, the shape is(n, m). - →
size: The total number of elements in the array. This is equal to the product of the elements ofshape.
022. The Importance of dtypes
NumPy is written in C, which requires strict typing. The dtype property tells you what kind of C-type is stored in the array (e.g., float64, int32). If you try to mix types, NumPy performs 'Upcasting'. If you mix a boolean, an integer, and a string, everything becomes a string to prevent data loss. You can force a specific type using the dtype argument during creation to heavily optimize memory usage.
033. Memory Profiling
When working with big data, memory limits are your biggest enemy.
- →
itemsizereturns the size (in bytes) of each element. Anint64element consumes 8 bytes. - →
nbytesreturns the total memory consumed by the array's data. Reducing yourdtypefromfloat64tofloat32will instantly cut your RAM usage in half.
?Frequently Asked Questions
What is upcasting in NumPy?
Upcasting occurs when you try to create an array with mixed data types. NumPy will automatically convert all elements to the most general data type present (e.g., converting integers to strings) so that the array remains homogenous.
Why should I care about float32 vs float64?
Memory. A `float64` takes 8 bytes per number. A `float32` takes 4 bytes. If you have an array of 1 billion numbers, `float64` requires 8GB of RAM, while `float32` requires only 4GB. In deep learning, `float32` (or even `float16`) is the standard.
Can I change the shape of an array after creating it?
Yes, you can use the `reshape()` method to change the dimensions, provided the total `size` (number of elements) remains exactly the same.
