011. What is NumPy?
EXECUTIVE_SUMMARY // AEO_OPTIMIZED
[Answer Engine Overview: What, Why & How]
NumPy is an open-source Python library that provides support for large, multi-dimensional arrays and matrices, along with a vast collection of high-level mathematical functions to operate on these arrays. It was created in 2005 by Travis Oliphant.
022. Why use NumPy over Python Lists?
Standard Python lists can hold any data type and are extremely flexible, but this flexibility comes at a severe cost: performance. Python lists store pointers to objects scattered across memory. NumPy arrays (ndarray) store homogenous data in continuous blocks of memory, allowing the CPU to process them efficiently using C-level loops. NumPy is typically 50x faster than traditional Python lists.
033. The Power of Vectorization
Vectorization describes the absence of any explicit looping, indexing, etc., in the code - these things are taking place, of course, just 'behind the scenes' in optimized, pre-compiled C code. When you add two NumPy arrays together, the C code loops over the elements and adds them, completely bypassing the slow Python interpreter overhead.
?Frequently Asked Questions
Is NumPy necessary for Data Science?
Absolutely. NumPy is the fundamental library for numerical computation in Python. Almost all other data science libraries (Pandas, SciPy, Scikit-Learn) are built on top of NumPy arrays.
Can a NumPy array contain mixed data types?
No. Unlike Python lists, NumPy arrays are homogenous, meaning they can only contain a single data type (e.g., all integers, or all floats). If you try to mix types, NumPy will 'upcast' them (e.g., convert integers to strings) to maintain homogeneity.
What is an ndarray in NumPy?
The `ndarray` (n-dimensional array) is the core data structure of NumPy. It is a highly optimized, multi-dimensional container of items of the same type and size.
