NumPy Tutorial

Welcome! This tutorial covers the essential concepts of NumPy, a foundational Python library for numerical computing. By the end, you'll know how to set up your environment, work with multi-dimensional arrays and their data types, and create arrays in NumPy.

Environment Setup

To start using NumPy, install it with pip:

pip install numpy

Once installed, you can verify the installation by running:

import numpy as np
print(np.__version__)

If you see the version number, you're ready to use NumPy!

NumPy Ndarray

The ndarray (n-dimensional array) is the core data structure in NumPy. It allows efficient storage and manipulation of large datasets. Ndarrays can have multiple dimensions, making them ideal for mathematical operations on matrices and higher-dimensional arrays.

Key Properties of an Ndarray

Shape: The dimensions of the array, accessed with array.shape.
Size: The total number of elements, accessed with array.size.
Data type (dtype): The type of the stored elements, such as int32 or float64, accessed with array.dtype.

For example, a 2x3 ndarray can be created and analyzed like this:

import numpy as np
array = np.array([[1, 2, 3], [4, 5, 6]])
print(array.shape)  # (2, 3)
print(array.size)   # 6

NumPy Data Types

NumPy supports many data types to accommodate different numerical values. Some common data types include:

int32 or int64: Integer numbers
float32 or float64: Floating-point numbers
complex64: Complex numbers
bool_: Boolean values

To specify a data type when creating an array, use the dtype parameter:

array = np.array([1.0, 2.5, 3.8], dtype=np.float32)

This allows for precise control over memory usage and computational efficiency.
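
As a rough illustration of the memory point above, the sketch below stores the same values as float32 and float64 and compares their sizes. The variable names are illustrative only.

```python
import numpy as np

values = [1.0, 2.5, 3.8]
as_float32 = np.array(values, dtype=np.float32)
as_float64 = np.array(values, dtype=np.float64)

print(as_float32.nbytes)  # 12: three elements at 4 bytes each
print(as_float64.nbytes)  # 24: three elements at 8 bytes each

# astype returns a converted copy; float-to-int conversion drops the fraction
as_int = as_float64.astype(np.int32)
print(as_int)  # [1 2 3]
```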

NumPy Array Creation

NumPy offers multiple methods for creating arrays. Here are a few essential ones:

np.array([1, 2, 3]): Converts a Python list to a NumPy array.
np.zeros((2, 3)): Creates a 2x3 array of zeros.
np.ones((2, 3)): Creates a 2x3 array of ones.
np.arange(0, 10, 2): Creates an array from 0 up to, but not including, 10 with a step of 2.
np.linspace(0, 1, 5): Generates 5 equally spaced numbers from 0 to 1.

For example, creating an array of zeros and an array of sequential numbers:

zeros_array = np.zeros((2, 3))
sequence_array = np.arange(10)

These methods offer flexibility for initializing data in various scenarios.

Array from Existing Data

NumPy allows you to create arrays from existing data structures like lists, tuples, or other arrays. Use np.array() to convert a list or tuple to a NumPy array:

import numpy as np
original_list = [1, 2, 3, 4]
new_array = np.array(original_list)

This flexibility is essential when you need to switch between Python’s built-in structures and NumPy for more complex operations.

Arrays within Numerical Range

NumPy offers functions to create arrays within a numerical range efficiently. Here are some commonly used ones:

np.arange(start, stop, step): Generates values from start up to, but not including, stop, spaced by step.
np.linspace(start, stop, num): Generates num evenly spaced values from start to stop.

np.arange(0, 10, 2)      # Output: [0, 2, 4, 6, 8]
np.linspace(0, 1, 5)      # Output: [0., 0.25, 0.5, 0.75, 1.]
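
The two functions treat the stop value differently, which is easy to overlook. A minimal sketch of that difference, using illustrative variable names:

```python
import numpy as np

evens = np.arange(0, 10, 2)                          # stop value 10 is excluded
points = np.linspace(0, 1, 5)                        # stop value 1 is included by default
open_points = np.linspace(0, 1, 5, endpoint=False)   # endpoint=False excludes the stop

print(evens)        # [0 2 4 6 8]
print(points[-1])   # 1.0
print(open_points)  # five values starting at 0 with spacing 0.2
```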

NumPy Broadcasting

Broadcasting allows NumPy to perform arithmetic operations on arrays of different shapes. When dimensions are compatible, smaller arrays are virtually expanded to match the shape of larger arrays:

array = np.array([1, 2, 3])
result = array + 10    # Output: [11, 12, 13]

Broadcasting is highly efficient and avoids the need for explicit loops, allowing for faster computations with less code.
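
Broadcasting also applies between arrays, not only between an array and a scalar. The sketch below, with illustrative names, adds a row vector and a column vector to a 2x3 array.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])      # shape (2, 3)
row = np.array([10, 20, 30])        # shape (3,)
column = np.array([[100], [200]])   # shape (2, 1)

by_row = matrix + row        # row is added to every row of matrix
by_column = matrix + column  # column is added to every column of matrix

print(by_row.tolist())     # [[11, 22, 33], [14, 25, 36]]
print(by_column.tolist())  # [[101, 102, 103], [204, 205, 206]]
```

Shapes are compared from the trailing dimension backwards; two dimensions are compatible when they are equal or one of them is 1.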

NumPy Array Iteration

You can iterate over a NumPy array using Python loops or use NumPy’s own functions for efficient element access:

for x in np.nditer(array): Iterates element-by-element, ideal for n-dimensional arrays.
np.ndenumerate(array): Provides both index and value in iteration.

array = np.array([[1, 2], [3, 4]])
for index, x in np.ndenumerate(array):
    print(index, x)
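
For comparison, here is a short sketch of np.nditer next to np.ndenumerate on the same array. The list names are illustrative.

```python
import numpy as np

array = np.array([[1, 2], [3, 4]])

# nditer visits every element regardless of the number of dimensions
flat_values = [int(x) for x in np.nditer(array)]
print(flat_values)  # [1, 2, 3, 4]

# ndenumerate yields (index_tuple, value) pairs
pairs = [(index, int(x)) for index, x in np.ndenumerate(array)]
print(pairs)  # [((0, 0), 1), ((0, 1), 2), ((1, 0), 3), ((1, 1), 4)]
```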

NumPy Bitwise Operators

NumPy supports bitwise operations, which operate at the bit level. Commonly used operators include:

np.bitwise_and(x, y): Bitwise AND
np.bitwise_or(x, y): Bitwise OR
np.invert(x): Bitwise NOT

a = np.array([0b1100])
b = np.array([0b1010])
result = np.bitwise_and(a, b)  # Output: [8], which is 0b1000 in binary
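
The other two operators can be sketched the same way. The uint8 dtype is an assumption made here so that the result of np.invert is easy to read; with signed integers the inverted value would be negative.

```python
import numpy as np

a = np.array([0b1100], dtype=np.uint8)  # 12
b = np.array([0b1010], dtype=np.uint8)  # 10

and_result = np.bitwise_and(a, b)  # [8]   -> 0b1000
or_result = np.bitwise_or(a, b)    # [14]  -> 0b1110
not_result = np.invert(a)          # [243] -> 0b11110011 for unsigned 8-bit values

# binary_repr formats an integer as a binary string
print(np.binary_repr(int(and_result[0]), width=4))  # 1000
```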

NumPy String Functions

NumPy provides string functions that operate element-wise on arrays. Examples include:

np.char.add(["Hello"], [" World"]): Concatenates strings.
np.char.lower(["Hello"]): Converts strings to lowercase.
np.char.split(["Hello World"]): Splits strings.

np.char.add(["Num"], ["Py"])        # Output: ['NumPy']
np.char.lower(["NUMPY"])       # Output: ['numpy']

String functions are convenient for text processing within numerical computations.
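
A small sketch of these functions applied to an array with more than one element. The sample strings and variable names are illustrative.

```python
import numpy as np

words = np.array(["Hello World", "NumPy Strings"])
marks = np.array(["!", "?"])

lowered = np.char.lower(words)    # lowercase each element
pieces = np.char.split(words)     # object array holding one list per element
joined = np.char.add(words, marks)  # element-wise concatenation

print(lowered.tolist())  # ['hello world', 'numpy strings']
print(pieces[0])         # ['Hello', 'World']
print(joined.tolist())   # ['Hello World!', 'NumPy Strings?']
```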

NumPy Mathematical Functions

NumPy provides numerous mathematical functions that operate element-wise on arrays, making it easy to perform calculations across entire datasets:

np.add(x, y): Adds two arrays element-wise.
np.subtract(x, y): Subtracts one array from another.
np.multiply(x, y): Multiplies elements in arrays.
np.divide(x, y): Divides elements in arrays.
np.exp(x): Calculates the exponential of each element in the array.

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
result = np.add(x, y)  # Output: [5, 7, 9]

These functions allow efficient computations over large arrays without the need for looping.
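
The arithmetic operators on arrays give the same element-wise results as these functions. A brief sketch, with float inputs chosen for illustration:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])

difference = np.subtract(x, y)  # same as x - y
product = np.multiply(x, y)     # same as x * y
quotient = np.divide(x, y)      # same as x / y: 0.25, 0.4, 0.5
growth = np.exp(np.array([0.0, 1.0]))  # e**0 and e**1

print(product)  # [ 4. 10. 18.]
```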

Statistical Functions

NumPy includes statistical functions to analyze and summarize data easily:

np.mean(array): Computes the mean (average) of an array.
np.median(array): Computes the median.
np.std(array): Calculates the standard deviation.
np.var(array): Calculates the variance.

array = np.array([1, 2, 3, 4, 5])
mean_val = np.mean(array)   # Output: 3.0

These functions are crucial for data analysis, allowing quick calculations of essential statistical values.
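
On multi-dimensional data these functions accept an axis argument, and the variance and standard deviation accept ddof to switch between population and sample estimates. A minimal sketch with an illustrative 2x3 array:

```python
import numpy as np

data = np.array([[1, 2, 3],
                 [4, 5, 6]])

overall_mean = np.mean(data)          # 3.5, over all elements
column_means = np.mean(data, axis=0)  # one mean per column: 2.5, 3.5, 4.5
row_means = np.mean(data, axis=1)     # one mean per row: 2.0, 5.0
median = np.median(data)              # 3.5

population_var = np.var(data)         # divides by N (ddof=0, the default)
sample_var = np.var(data, ddof=1)     # divides by N - 1, giving 3.5 here
```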

Sorting & Searching

NumPy offers tools for sorting and searching within arrays. np.sort() returns elements in ascending order (reverse the result, for example with [::-1], to get descending order), and elements can be located by condition or by insertion position:

np.sort(array): Sorts elements in an array.
np.argsort(array): Returns indices of sorted elements.
np.where(condition): Returns indices of elements meeting the condition.
np.searchsorted(array, value): Finds the index where the value should be inserted to maintain order.

array = np.array([3, 1, 2])
sorted_array = np.sort(array)  # Output: [1, 2, 3]

These functions streamline sorting and locating data in arrays, enabling more efficient data management.
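
The remaining functions from the list can be sketched on the same small array. The variable names are illustrative.

```python
import numpy as np

array = np.array([3, 1, 2])

sorted_array = np.sort(array)       # [1 2 3]; the input is left unchanged
descending = sorted_array[::-1]     # [3 2 1]
order = np.argsort(array)           # [1 2 0]; array[order] is sorted
matches = np.where(array > 1)[0]    # [0 2]; indices where the condition holds

# searchsorted expects an already sorted array
position = np.searchsorted(sorted_array, 2.5)  # 2
```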

Copies and Views

In NumPy, a copy of an array is a new array with copied data, whereas a view is a new array object that refers to the same data as the original array:

array.copy(): Creates a new array that owns its own copy of the data.
array.view(): Creates a new array object that shares the original array's data buffer.

original = np.array([1, 2, 3])
copy_array = original.copy()
view_array = original.view()

Modifications to a view will affect the original array, while modifications to a copy will not.
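
A short sketch that makes the difference observable by writing through each object:

```python
import numpy as np

original = np.array([1, 2, 3])
copy_array = original.copy()
view_array = original.view()

view_array[0] = 99   # writes through to the shared data
copy_array[1] = -1   # only changes the copy

print(original)    # [99  2  3]
print(copy_array)  # [ 1 -1  3]
print(np.shares_memory(original, view_array))  # True
print(np.shares_memory(original, copy_array))  # False
```

Basic slicing such as original[1:] also returns a view, so the same caution applies there.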

Matrix Library

NumPy’s matrix class is a strictly two-dimensional array type in which the * operator performs matrix multiplication. It provides:

np.matrix(): Creates a matrix.
matrix.T: Returns the transpose of the matrix.
matrix.I: Computes the inverse of the matrix.
matrix * other_matrix: Matrix multiplication.

mat1 = np.matrix([[1, 2], [3, 4]])
mat2 = np.matrix([[5, 6], [7, 8]])
product = mat1 * mat2

Matrix objects give a compact syntax for linear algebra, but NumPy's documentation no longer recommends this class for new code; regular arrays combined with the @ operator are preferred.
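
The same operations can be written with plain ndarrays. This is a sketch of one possible translation, with illustrative names, not the only way to do it.

```python
import numpy as np

arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

product = arr1 @ arr2          # matrix product, the role of mat1 * mat2
transpose = arr1.T             # the role of matrix.T
inverse = np.linalg.inv(arr1)  # the role of matrix.I

print(product.tolist())  # [[19, 22], [43, 50]]
```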

NumPy Linear Algebra

NumPy includes functions to perform linear algebra operations, such as solving linear equations, finding eigenvalues, and calculating determinants. Here are some of the most commonly used linear algebra functions:

np.linalg.inv(matrix): Computes the inverse of a matrix.
np.linalg.det(matrix): Computes the determinant of a matrix.
np.linalg.eig(matrix): Returns eigenvalues and eigenvectors of a square matrix.
np.linalg.solve(a, b): Solves the linear system a @ x = b for x.

matrix = np.array([[4, 7], [2, 6]])
determinant = np.linalg.det(matrix)  # Output: 10.0 (up to floating-point rounding)

These functions are essential for applications in physics, engineering, and machine learning where linear algebra plays a key role.
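
As an illustration of np.linalg.solve, the sketch below solves the system 3x + y = 9, x + 2y = 8. The system itself is made up for this example.

```python
import numpy as np

a = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])

x = np.linalg.solve(a, b)  # approximately [2. 3.]

# Check the solution by substituting it back into the system
print(np.allclose(a @ x, b))  # True
```

Calling solve is generally preferable to computing np.linalg.inv(a) @ b, since it avoids forming the inverse explicitly.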

Matrix Multiplication

NumPy supports multiple methods for matrix multiplication:

np.dot(a, b): Computes the dot product of two arrays, including matrix multiplication.
np.matmul(a, b): Performs matrix multiplication.
a @ b: An alternative operator for matrix multiplication introduced in Python 3.5.

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])
product = np.dot(a, b)

Matrix multiplication is a key operation in linear algebra, enabling complex mathematical modeling and data transformations in many scientific fields.
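
For two-dimensional arrays the three spellings give the same result, and all of them differ from the element-wise * operator. A quick sketch:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

via_dot = np.dot(a, b)
via_matmul = np.matmul(a, b)
via_operator = a @ b
elementwise = a * b  # not a matrix product

print(via_operator.tolist())  # [[19, 22], [43, 50]]
print(elementwise.tolist())   # [[5, 12], [21, 32]]
```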

NumPy Functions

NumPy includes a wide range of functions for array manipulation, mathematical operations, and data transformations. Here are some essential NumPy functions:

np.sum(array): Sums all elements in an array.
np.mean(array): Computes the mean of elements in an array.
np.max(array) / np.min(array): Finds the maximum or minimum value in an array.
np.reshape(array, shape): Changes the shape of an array without modifying its data.

array = np.array([1, 2, 3, 4])
mean_value = np.mean(array)  # Output: 2.5

These functions simplify the process of performing calculations and reshaping data within arrays, making data analysis faster and more efficient.

Creating Arrays with numpy.array()

The numpy.array() function is used to create arrays from Python lists or tuples. Here’s how you can use it:

np.array([1, 2, 3]): Creates a 1-dimensional array from a list.
np.array([[1, 2], [3, 4]]): Creates a 2-dimensional array from a nested list.
np.array([1, 2, 3], dtype=np.float64): Specifies the data type with dtype parameter.

one_d_array = np.array([1, 2, 3])
two_d_array = np.array([[1, 2], [3, 4]])

Using numpy.array() allows for the easy creation of arrays of any dimension and data type, making it the core array creation method in NumPy.

numpy.concatenate()

The numpy.concatenate() function joins two or more arrays along a specified axis.

import numpy as np
array1 = np.array([1, 2, 3])
array2 = np.array([4, 5, 6])
combined = np.concatenate((array1, array2))  # Output: [1, 2, 3, 4, 5, 6]
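
For two-dimensional arrays the axis argument decides whether rows or columns are joined. A minimal sketch with illustrative arrays:

```python
import numpy as np

top = np.array([[1, 2], [3, 4]])
bottom = np.array([[5, 6]])
right = np.array([[7], [8]])

stacked_rows = np.concatenate((top, bottom), axis=0)  # shape (3, 2)
stacked_cols = np.concatenate((top, right), axis=1)   # shape (2, 3)

print(stacked_rows.tolist())  # [[1, 2], [3, 4], [5, 6]]
print(stacked_cols.tolist())  # [[1, 2, 7], [3, 4, 8]]
```

The arrays must have matching sizes along every axis except the one being joined.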

numpy.append()

The numpy.append() function returns a new array with values appended to the end; the original array is not modified.

array = np.array([1, 2, 3])
new_array = np.append(array, [4, 5])  # Output: [1, 2, 3, 4, 5]

numpy.reshape()

Use numpy.reshape() to change the shape of an array without altering its data.

array = np.array([1, 2, 3, 4, 5, 6])
reshaped = np.reshape(array, (2, 3))  # Output: [[1, 2, 3], [4, 5, 6]]

numpy.sum()

The numpy.sum() function calculates the sum of all elements in an array.

array = np.array([[1, 2], [3, 4]])
total_sum = np.sum(array)  # Output: 10
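
With the axis argument, the sum can be taken per column or per row instead of over the whole array. A short sketch:

```python
import numpy as np

array = np.array([[1, 2], [3, 4]])

column_sums = np.sum(array, axis=0)  # [4 6]: collapses the rows
row_sums = np.sum(array, axis=1)     # [3 7]: collapses the columns
```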

numpy.random

The numpy.random module provides functions to generate random numbers.

random_array = np.random.rand(3)  # Output: [0.5, 0.2, 0.9] (random values)
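
NumPy's documentation recommends the newer Generator interface, created with np.random.default_rng(), for new code. The sketch below uses it; the seed value 42 is an arbitrary choice made here so that runs are reproducible.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

uniform = rng.random(3)                 # three floats in [0, 1)
integers = rng.integers(0, 10, size=5)  # five integers in [0, 10)

# Two generators created with the same seed produce the same sequence
first = np.random.default_rng(seed=42).random(3)
second = np.random.default_rng(seed=42).random(3)
print(np.array_equal(first, second))  # True
```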

numpy.zeros()

Creates an array filled with zeros, specifying the desired shape.

zero_array = np.zeros((2, 3))  # Output: [[0., 0., 0.], [0., 0., 0.]]

numpy.log()

The numpy.log() function calculates the natural logarithm of each element in an array.

array = np.array([1, np.e, np.e**2])
log_values = np.log(array)  # Output: [0., 1., 2.]

numpy.where()

The numpy.where() function returns indices where a condition is met or applies changes based on conditions.

array = np.array([1, 2, 3, 4]) 
result = np.where(array > 2, array, -1)  # Output: [-1, -1, 3, 4]
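
Called with only a condition, np.where returns indices rather than values. A brief sketch of that form:

```python
import numpy as np

array = np.array([1, 2, 3, 4])

indices = np.where(array > 2)  # a tuple with one index array per dimension
print(indices[0])              # [2 3]
print(array[indices])          # [3 4]
```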

numpy.argsort()

Returns the indices that would sort an array, useful for ordering data without changing it.

array = np.array([3, 1, 2]) 
sorted_indices = np.argsort(array)  # Output: [1, 2, 0]

numpy.transpose()

The numpy.transpose() function switches the axes of an array, useful in matrix operations.

array = np.array([[1, 2], [3, 4]]) 
transposed = np.transpose(array)  # Output: [[1, 3], [2, 4]]

numpy.mean()

Calculates the mean (average) of array elements, useful for data analysis.

array = np.array([1, 2, 3, 4]) 
mean_value = np.mean(array)  # Output: 2.5

numpy.unique()

The numpy.unique() function finds unique elements in an array and optionally returns indices or counts of unique elements.

array = np.array([1, 2, 2, 3, 4, 4]) 
unique_values = np.unique(array)  # Output: [1, 2, 3, 4]
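
The optional counts mentioned above can be requested with return_counts=True. A minimal sketch:

```python
import numpy as np

array = np.array([1, 2, 2, 3, 4, 4])

values, counts = np.unique(array, return_counts=True)
print(values)  # [1 2 3 4]
print(counts)  # [1 2 1 2]
```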

numpy.ndarray.tolist()

The tolist() method converts a NumPy array to a regular Python list.

array = np.array([1, 2, 3]) 
list_version = array.tolist()  # Output: [1, 2, 3]

numpy.dot()

The numpy.dot() function performs dot product calculations for vectors and matrices, commonly used in linear algebra.

array1 = np.array([1, 2]) 
array2 = np.array([3, 4]) 
dot_product = np.dot(array1, array2)  # Output: 11

numpy.loadtxt()

The numpy.loadtxt() function loads data from a text file, useful for handling large datasets in text format.

data = np.loadtxt('data.txt', delimiter=',')  # Load CSV data
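
Since the contents of data.txt are not shown, the sketch below substitutes a small in-memory text buffer (the sample values are made up) to show what loadtxt returns for comma-separated input.

```python
import io

import numpy as np

# Stand-in for a file on disk; loadtxt also accepts file-like objects
csv_text = io.StringIO("1,2,3\n4,5,6\n")

data = np.loadtxt(csv_text, delimiter=",")
print(data.shape)  # (2, 3)
print(data.dtype)  # float64, the default
```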

numpy.clip()

The numpy.clip() function limits array values to a specified minimum and maximum range.

array = np.array([1, 2, 3, 4, 5]) 
clipped_array = np.clip(array, 2, 4)  # Output: [2, 2, 3, 4, 4]

numpy.ndarray.flatten()

The flatten() method creates a 1D copy of a multi-dimensional array.

array = np.array([[1, 2], [3, 4]]) 
flattened_array = array.flatten()  # Output: [1, 2, 3, 4]

numpy.meshgrid()

The numpy.meshgrid() function generates coordinate matrices from two or more 1D arrays, often used for plotting functions over grids.

x = np.array([1, 2, 3]) 
y = np.array([4, 5]) 
X, Y = np.meshgrid(x, y) 
# X: [[1, 2, 3], [1, 2, 3]], Y: [[4, 4, 4], [5, 5, 5]]

numpy.std()

The numpy.std() function calculates the standard deviation of array elements, indicating data spread around the mean.

array = np.array([1, 2, 3, 4]) 
std_dev = np.std(array)  # Output: 1.118

numpy.argmax()

The numpy.argmax() function returns the index of the maximum element in an array.

array = np.array([1, 3, 2]) 
max_index = np.argmax(array)  # Output: 1

numpy.diff()

The numpy.diff() function calculates the difference between adjacent elements of an array.

array = np.array([1, 2, 4, 7]) 
diff_array = np.diff(array)  # Output: [1, 2, 3]

numpy.empty()

Creates an uninitialized array of a specified shape. Values in the array are arbitrary until overwritten.

empty_array = np.empty((2, 2))  # Output may be arbitrary, as it’s uninitialized

numpy.histogram()

The numpy.histogram() function calculates the histogram of an array, useful in data analysis and visualization.

array = np.array([1, 2, 1, 4, 5, 1, 2, 3, 5])
hist, bins = np.histogram(array, bins=[0, 1, 2, 3, 4, 5])

Here, hist contains the count of elements within each bin range, and bins defines the bin edges.
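
Running the example and inspecting the counts gives a concrete picture of the binning:

```python
import numpy as np

array = np.array([1, 2, 1, 4, 5, 1, 2, 3, 5])
hist, bins = np.histogram(array, bins=[0, 1, 2, 3, 4, 5])

print(hist)  # [0 3 2 1 3]
```

Each bin includes its left edge and excludes its right edge, except the last bin, which includes both. That is why the value 4 and both 5s are counted together in the final bin.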

numpy.sort()

The numpy.sort() function sorts the elements of an array in ascending order.

array = np.array([3, 1, 2])
sorted_array = np.sort(array)  # Output: [1, 2, 3]

numpy.average()

The numpy.average() function computes the average of an array, optionally weighted through its weights parameter. Without weights, the result matches numpy.mean().

array = np.array([1, 2, 3, 4])
avg = np.average(array)  # Output: 2.5
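
A sketch of the weighted form; the weights are arbitrary values picked for illustration.

```python
import numpy as np

array = np.array([1, 2, 3, 4])
weights = np.array([4, 3, 2, 1])

# (1*4 + 2*3 + 3*2 + 4*1) / (4 + 3 + 2 + 1) = 20 / 10
weighted_avg = np.average(array, weights=weights)
print(weighted_avg)  # 2.0
```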

numpy.pad()

The numpy.pad() function pads an array along any axis with specified values.

array = np.array([1, 2, 3])
padded_array = np.pad(array, (1, 1), constant_values=0)  # Output: [0, 1, 2, 3, 0]

numpy.ravel()

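The numpy.ravel() function flattens an array into a 1D array. Unlike flatten(), it returns a view of the original data whenever possible instead of always copying.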

array = np.array([[1, 2], [3, 4]])
flattened_array = np.ravel(array)  # Output: [1, 2, 3, 4]
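
The practical difference between ravel() and flatten() shows up when the result is modified. A sketch, assuming a contiguous input array so that ravel can return a view:

```python
import numpy as np

array = np.array([[1, 2], [3, 4]])

raveled = np.ravel(array)    # a view here, because the array is contiguous
flattened = array.flatten()  # always a copy

raveled[0] = 99     # changes the original array
flattened[1] = -1   # does not change the original array

print(array.tolist())  # [[99, 2], [3, 4]]
```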

numpy.save()

The numpy.save() function saves an array to a binary file, allowing for quick storage and retrieval.

array = np.array([1, 2, 3])
np.save('array.npy', array)
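
Retrieval is done with np.load(). The sketch below writes to a temporary directory (an assumption made here to avoid leaving files behind) and reads the array back.

```python
import os
import tempfile

import numpy as np

array = np.array([1, 2, 3])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "array.npy")
    np.save(path, array)     # writes the array in NumPy's binary .npy format
    restored = np.load(path)  # reads it back, including dtype and shape

print(np.array_equal(array, restored))  # True
```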

numpy.arccos()

The numpy.arccos() function computes the inverse cosine of each element in the array (returns angles in radians).

array = np.array([1, 0])
result = np.arccos(array)  # Output: [0. 1.57079633]

numpy.arcsin()

The numpy.arcsin() function computes the inverse sine (arcsine) of each element in the array (returns angles in radians).

array = np.array([0, 1])
result = np.arcsin(array)  # Output: [0. 1.57079633]

numpy.arctan()

The numpy.arctan() function computes the inverse tangent (arctan) of each element in the array (returns angles in radians).

array = np.array([0, 1])
result = np.arctan(array)  # Output: [0.         0.78539816]

numpy.degrees()

The numpy.degrees() function converts angles from radians to degrees.

array = np.array([0, np.pi / 2])
degrees_array = np.degrees(array)  # Output: [  0.  90.]

numpy.tan()

The numpy.tan() function computes the tangent of each element in the array (in radians).

array = np.array([0, np.pi / 4])
tan_array = np.tan(array)  # Output: [0. 1.]

numpy.deg2rad()

The numpy.deg2rad() function converts angles from degrees to radians.

array = np.array([0, 90])
radians_array = np.deg2rad(array)  # Output: [0.         1.57079633]

numpy.hypot()

The numpy.hypot() function computes the hypotenuse length, sqrt(x**2 + y**2), element-wise for two arrays.

x = np.array([3, 4])
y = np.array([4, 3])
hypotenuse = np.hypot(x, y)  # Output: [5. 5.]

numpy.rad2deg()

The numpy.rad2deg() function converts angles from radians to degrees.

array = np.array([0, np.pi])
degrees_array = np.rad2deg(array)  # Output: [  0. 180.]

numpy.radians()

The numpy.radians() function converts angles from degrees to radians.

array = np.array([0, 180])
radians_array = np.radians(array)  # Output: [0.         3.14159265]

numpy.arcsinh()

The numpy.arcsinh() function computes the inverse hyperbolic sine of each element.

array = np.array([1, 2])
result = np.arcsinh(array)  # Output: [0.88137359 1.44363548]

numpy.arctanh()

The numpy.arctanh() function computes the inverse hyperbolic tangent of each element.

array = np.array([0.5, 0.7])
result = np.arctanh(array)  # Output: [0.54930614 0.86730053]

numpy.ceil()

The numpy.ceil() function computes the ceiling of each element, rounding up to the nearest integer.

array = np.array([1.2, 2.3, 3.7])
ceiled_array = np.ceil(array)  # Output: [2. 3. 4.]

numpy.fix()

The numpy.fix() function rounds each element to the nearest integer toward zero.

array = np.array([1.7, -2.5])
fixed_array = np.fix(array)  # Output: [ 1. -2.]

numpy.floor()

The numpy.floor() function rounds each element down to the nearest integer.

array = np.array([1.2, 2.7, 3.8])
floored_array = np.floor(array)  # Output: [1. 2. 3.]

numpy.rint()

The numpy.rint() function rounds each element to the nearest integer. Values exactly halfway between two integers are rounded to the nearest even integer.

array = np.array([1.5, 2.5, 3.5])
rinted_array = np.rint(array)  # Output: [2. 2. 4.]

numpy.tanh()

The numpy.tanh() function computes the hyperbolic tangent of each element.

array = np.array([0, 1])
tanh_array = np.tanh(array)  # Output: [0.         0.76159416]

numpy.trunc()

The numpy.trunc() function truncates each element to the nearest integer toward zero.

array = np.array([1.9, -2.3])
truncated_array = np.trunc(array)  # Output: [ 1. -2.]
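
The rounding functions differ most visibly on negative numbers and on halfway values. A side-by-side sketch on one illustrative input:

```python
import numpy as np

values = np.array([-2.5, -1.5, 1.5, 2.5])

ceiled = np.ceil(values)      # [-2. -1.  2.  3.]  toward positive infinity
floored = np.floor(values)    # [-3. -2.  1.  2.]  toward negative infinity
truncated = np.trunc(values)  # [-2. -1.  1.  2.]  toward zero
fixed = np.fix(values)        # same results as trunc for these inputs
nearest = np.rint(values)     # [-2. -2.  2.  2.]  halfway cases go to the even integer
```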

numpy.matlib.empty()

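The numpy.matlib.empty() function creates an uninitialized matrix. It behaves like np.empty() but returns a matrix object instead of an ndarray.

import numpy.matlib  # the matlib submodule must be imported explicitly
matrix = np.matlib.empty((2, 2))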

numpy.matlib.eye()

The numpy.matlib.eye() function creates a matrix with ones on the diagonal and zeros elsewhere (identity matrix).

identity_matrix = np.matlib.eye(3)

numpy.matlib.identity()

The numpy.matlib.identity() function creates an identity matrix, similar to np.matlib.eye().

identity_matrix = np.matlib.identity(4)

numpy.matlib.ones()

The numpy.matlib.ones() function creates a matrix filled with ones.

ones_matrix = np.matlib.ones((3, 3))

numpy.matlib.zeros()

The numpy.matlib.zeros() function creates a matrix filled with zeros.

zeros_matrix = np.matlib.zeros((2, 2))

numpy.arange()

The numpy.arange() function generates an array with regularly spaced values within a specified range.

array = np.arange(0, 10, 2)  # Output: [0, 2, 4, 6, 8]

numpy.asarray()

The numpy.asarray() function converts input data (like lists) into a NumPy array.

list_data = [1, 2, 3]
array_data = np.asarray(list_data)

numpy.frombuffer()

The numpy.frombuffer() function interprets a buffer object, such as a bytes object, as a one-dimensional array.

buffer = b'12345'
array_from_buffer = np.frombuffer(buffer, dtype=np.uint8)
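
Printing the result shows that the bytes are interpreted as raw numeric values, not parsed as digits. A short sketch:

```python
import numpy as np

buffer = b'12345'
array_from_buffer = np.frombuffer(buffer, dtype=np.uint8)

print(array_from_buffer)  # [49 50 51 52 53], the ASCII codes of '1' to '5'

# The array shares memory with the immutable bytes object, so it is read-only
print(array_from_buffer.flags.writeable)  # False

# Copy it to get an array that can be modified
writable = array_from_buffer.copy()
writable[0] = 0
```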

numpy.fromiter()

The numpy.fromiter() function creates an array from an iterable object.

iterable = range(5)
array_from_iter = np.fromiter(iterable, dtype=int)

numpy.linspace()

The numpy.linspace() function generates an array of evenly spaced numbers over a specified range.

array = np.linspace(0, 1, 5)  # Output: [0.   0.25 0.5  0.75 1.  ]

numpy.logspace()

The numpy.logspace() function generates an array of numbers spaced evenly on a logarithmic scale.

array = np.logspace(0, 1, 5)  # Output: [ 1.          1.77827941  3.16227766  5.62341325 10.        ]
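
The start and stop arguments of logspace are exponents, not the endpoints themselves. The sketch below checks this against np.linspace and shows the base parameter; variable names are illustrative.

```python
import numpy as np

array = np.logspace(0, 1, 5)           # from 10**0 to 10**1
same = 10 ** np.linspace(0, 1, 5)      # equivalent construction
powers_of_two = np.logspace(0, 4, 5, base=2)  # 2**0 through 2**4

print(np.allclose(array, same))  # True
print(powers_of_two)             # [ 1.  2.  4.  8. 16.]
```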