OpenCV Tutorial

In this tutorial, we'll go over the essential concepts of OpenCV: how to install it, read and save images, and perform basic operations such as resizing and rotating. We then move on to drawing, filtering, thresholding, contours, video capture, face detection, and object tracking.

OpenCV Installation

To use OpenCV in Python, you'll need to install the opencv-python package. You can install it via pip:

pip install opencv-python

You can also install the opencv-python-headless package for environments without GUI support, such as servers or Docker containers:

pip install opencv-python-headless

Read & Save Images

OpenCV provides functions for reading and saving images. Here's an example of how to read an image and save it:

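import cv2

# Read image (cv2.imread returns None if the file cannot be read)
img = cv2.imread('path_to_image.jpg')
if img is None:
    raise FileNotFoundError('Could not read path_to_image.jpg')

# Save image
cv2.imwrite('output_image.jpg', img)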

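The cv2.imread() function loads an image as a NumPy array in BGR channel order and returns None if the file cannot be read. The cv2.imwrite() function saves an array to a file and chooses the format from the file extension.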

Basic Operations on Images

OpenCV supports various basic image operations like color conversion, blurring, and edge detection. Here's how to convert an image to grayscale:

# Convert image to grayscale
gray_image = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)


# Display the image
cv2.imshow('Grayscale Image', gray_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

The function cv2.cvtColor() converts the image from one color space to another (in this case, from BGR to grayscale).

OpenCV Resize Image

Resizing an image can be done using the cv2.resize() function. This function allows you to set new dimensions or scale the image by a given factor:

# Resize image to an explicit size; note the (width, height) order
width, height = 640, 480  # example target size
resized_image = cv2.resize(img, (width, height))


# Display resized image
cv2.imshow('Resized Image', resized_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

To scale by a factor instead of specifying the exact size, pass None as the size and set the fx and fy parameters, for example cv2.resize(img, None, fx=0.5, fy=0.5).

OpenCV Image Rotation

OpenCV makes it easy to rotate images by an arbitrary angle. Here's how to rotate an image using a rotation matrix:

# Get image dimensions
height, width = img.shape[:2]

# Define the center of rotation
center = (width // 2, height // 2)

# Define the rotation matrix
rotation_matrix = cv2.getRotationMatrix2D(center, angle=45, scale=1)


# Perform the rotation
rotated_image = cv2.warpAffine(img, rotation_matrix, (width, height))


# Display rotated image
cv2.imshow('Rotated Image', rotated_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

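In the above code, cv2.getRotationMatrix2D() generates a 2x3 rotation matrix for the given center, angle in degrees (positive values rotate counter-clockwise), and scale, and cv2.warpAffine() applies the transformation to the image. Because the output size is kept at (width, height), the corners of the rotated image are cropped.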

OpenCV Drawing Functions

OpenCV allows you to draw basic shapes (like circles, rectangles, lines) on images. Here's how to draw some shapes on an image:

import cv2
import numpy as np

# Create a blank image (black)
img = np.zeros((512, 512, 3), dtype=np.uint8)

# Draw a rectangle
cv2.rectangle(img, (100, 100), (400, 400), (255, 0, 0), 3)

# Draw a circle
cv2.circle(img, (250, 250), 100, (0, 255, 0), -1)

# Draw a line
cv2.line(img, (100, 100), (400, 400), (0, 0, 255), 5)


# Show image
cv2.imshow('Drawing Shapes', img)
cv2.waitKey(0)
cv2.destroyAllWindows()

The functions cv2.rectangle(), cv2.circle(), and cv2.line() are used for drawing shapes. The parameters allow you to define the shape's position, size, color, and thickness. Colors are given in BGR order, and a thickness of -1 fills the shape.

OpenCV Blob Detection

Blob detection identifies regions in the image that have similar intensity or texture. The SimpleBlobDetector class in OpenCV can detect these blobs. Here's an example:

import cv2
import numpy as np

# Load image
img = cv2.imread('path_to_image.jpg', cv2.IMREAD_GRAYSCALE)

# Apply threshold
_, thresh = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)

# Set up blob detector
detector = cv2.SimpleBlobDetector_create()

# Detect blobs
keypoints = detector.detect(thresh)

# Draw blobs
img_with_blobs = cv2.drawKeypoints(img, keypoints, np.array([]), (0, 255, 0), cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)


# Show image with blobs
cv2.imshow('Blobs Detected', img_with_blobs)
cv2.waitKey(0)
cv2.destroyAllWindows()

This example uses cv2.SimpleBlobDetector_create() to set up a detector and detect() to identify the blobs. The blobs are drawn using cv2.drawKeypoints(). With its default settings, the detector looks for dark blobs on a lighter background.

Canny Edge Detection

Canny edge detection is a popular edge detection technique that detects a wide range of edges in images. Here's how to use it:

import cv2

# Load image
img = cv2.imread('path_to_image.jpg', cv2.IMREAD_GRAYSCALE)

# Apply Canny edge detection
edges = cv2.Canny(img, 100, 200)


# Show edges
cv2.imshow('Canny Edges', edges)
cv2.waitKey(0)
cv2.destroyAllWindows()

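The cv2.Canny() function is used to detect edges. The two parameters (100 and 200) are the lower and upper hysteresis thresholds: pixels whose gradient magnitude exceeds the upper threshold are accepted as edges, pixels below the lower threshold are discarded, and pixels in between are kept only if they connect to an accepted edge.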

OpenCV Gaussian Blur

Gaussian blur is used to reduce noise and detail in an image. It's commonly used as a preprocessing step for edge detection or other operations.

import cv2

# Load image
img = cv2.imread('path_to_image.jpg')

# Apply Gaussian Blur
blurred_image = cv2.GaussianBlur(img, (5, 5), 0)


# Show blurred image
cv2.imshow('Gaussian Blur', blurred_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

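The cv2.GaussianBlur() function applies a Gaussian filter to the image. The first parameter is the image, the second is the size of the kernel (5x5 in this case; both dimensions must be odd), and the third is the standard deviation in the X direction. The standard deviation in Y defaults to the same value, and passing 0 tells OpenCV to derive it from the kernel size.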

OpenCV Image Filters

Filters like median and bilateral filtering can be used to smooth the image or remove noise. Here's how to apply a median filter:

import cv2

# Load image
img = cv2.imread('path_to_image.jpg')

# Apply median filter
filtered_image = cv2.medianBlur(img, 5)


# Show filtered image
cv2.imshow('Median Filter', filtered_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

The cv2.medianBlur() function applies a median filter to the image. The second parameter defines the kernel size, which must be an odd integer greater than 1.

OpenCV Image Threshold

Thresholding is a technique used to segment an image into binary regions based on pixel intensity. Here's how to apply simple thresholding:

import cv2

# Load image
img = cv2.imread('path_to_image.jpg', cv2.IMREAD_GRAYSCALE)

# Apply binary thresholding
_, thresh = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)


# Show thresholded image
cv2.imshow('Thresholded Image', thresh)
cv2.waitKey(0)
cv2.destroyAllWindows()

The cv2.threshold() function is used to apply a threshold to the image. Any pixel intensity greater than the threshold is set to the maximum value (255), and the rest is set to 0.

OpenCV Contours

Contours are curves that join continuous points along a boundary. They are very useful for object detection and image segmentation. To find contours, use cv2.findContours() and draw them with cv2.drawContours().

import cv2

# Load image
img = cv2.imread('path_to_image.jpg', cv2.IMREAD_GRAYSCALE)

# Threshold the image
_, thresh = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)

# Find contours
contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

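# Draw contours on a color copy so the green lines are visible
output = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
cv2.drawContours(output, contours, -1, (0, 255, 0), 3)

# Show the result
cv2.imshow('Contours', output)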
cv2.waitKey(0)
cv2.destroyAllWindows()

The cv2.findContours() function retrieves contours from a binary image, and cv2.drawContours() allows you to draw the contours on the image. This technique is widely used in shape detection.

OpenCV Mouse Event

OpenCV provides mouse event handling for interacting with images, such as clicking or dragging the mouse. Below is an example of how to use mouse events to interact with an image.

import cv2

# Global state shared with the mouse callback
drawing = False           # True while the left mouse button is held down
ix, iy = -1, -1           # Corner where the drag started


# Mouse callback function
def draw_rectangle(event, x, y, flags, param):
    global ix, iy, drawing

    if event == cv2.EVENT_LBUTTONDOWN:
        drawing = True
        ix, iy = x, y

    elif event == cv2.EVENT_MOUSEMOVE:
        if drawing:
            # Restore the original pixels, then draw the preview rectangle
            img[:] = img_copy
            cv2.rectangle(img, (ix, iy), (x, y), (0, 255, 0), 2)

    elif event == cv2.EVENT_LBUTTONUP:
        drawing = False
        cv2.rectangle(img, (ix, iy), (x, y), (0, 255, 0), 2)


# Load image
img = cv2.imread('path_to_image.jpg')
img_copy = img.copy()

# Create the window first, then attach the mouse callback to it
cv2.namedWindow('Image')
cv2.setMouseCallback('Image', draw_rectangle)

# Redraw in a loop so that changes become visible; press 'q' to quit
while True:
    cv2.imshow('Image', img)
    if cv2.waitKey(20) & 0xFF == ord('q'):
        break
cv2.destroyAllWindows()

This example demonstrates how to draw a rectangle on an image using mouse events. The callback receives cv2.EVENT_LBUTTONDOWN when the left mouse button is pressed, cv2.EVENT_MOUSEMOVE as the mouse moves (used to update the rectangle while dragging), and cv2.EVENT_LBUTTONUP when the button is released.

OpenCV Template Matching

Template matching is used to search for a subimage (template) within a larger image. Here's how to perform template matching in OpenCV.

import cv2

# Load image and template
img = cv2.imread('path_to_image.jpg')
template = cv2.imread('path_to_template.jpg')

# Perform template matching
result = cv2.matchTemplate(img, template, cv2.TM_CCOEFF_NORMED)

# Get the coordinates of the matched region
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(result)

# Draw rectangle around the matched region
h, w = template.shape[:2]
cv2.rectangle(img, max_loc, (max_loc[0] + w, max_loc[1] + h), (0, 255, 0), 2)


# Show result
cv2.imshow('Matched Image', img)
cv2.waitKey(0)
cv2.destroyAllWindows()

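cv2.matchTemplate() performs the template matching. The result is an image where each pixel indicates how well the region starting at that position matches the template. The best match can be found using cv2.minMaxLoc(); with cv2.TM_CCOEFF_NORMED the best match is the maximum, so max_loc is the top-left corner of the matched region.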

OpenCV Erosion & Dilation

Erosion and dilation are morphological operations that are used to process images. Erosion shrinks the foreground (white) regions of the image, while dilation enlarges them. These operations are typically applied to binary images.

import cv2
# Load image in grayscale
img = cv2.imread('path_to_image.jpg', cv2.IMREAD_GRAYSCALE)

# Threshold the image
_, thresh = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)

# Apply erosion (passing None as the kernel uses the default 3x3 rectangle)
erosion = cv2.erode(thresh, None, iterations=1)

# Apply dilation
dilation = cv2.dilate(thresh, None, iterations=1)


# Show results
cv2.imshow('Erosion', erosion)
cv2.imshow('Dilation', dilation)
cv2.waitKey(0)
cv2.destroyAllWindows()

cv2.erode() and cv2.dilate() perform the erosion and dilation operations, respectively. These operations are useful for reducing noise or emphasizing features in an image.

OpenCV Video Capture

OpenCV also supports real-time video capture using the webcam or a video file. You can process frames in real-time as you capture them.

import cv2

# Capture video from webcam
cap = cv2.VideoCapture(0)

while True:
    ret, frame = cap.read()
    if not ret:
        break

    # Display the frame
    cv2.imshow('Video Frame', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release video capture object
cap.release()
cv2.destroyAllWindows()

The cv2.VideoCapture() function opens a video stream. You can capture individual frames using cap.read() and display them with cv2.imshow().

Face Detection & Recognition

Face detection is the process of detecting human faces in digital images. OpenCV provides pre-trained classifiers for detecting faces. You can also perform face recognition by matching detected faces with known identities.

import cv2

# Load the pre-trained Haar Cascade Classifier
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

# Read the image
img = cv2.imread('path_to_image.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Detect faces
faces = face_cascade.detectMultiScale(gray, 1.1, 4)

# Draw rectangle around faces
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 2)


# Display the result
cv2.imshow('Face Detection', img)
cv2.waitKey(0)
cv2.destroyAllWindows()

This example uses the Haar Cascade Classifier to detect faces in an image. The second and third arguments of detectMultiScale() are scaleFactor (1.1) and minNeighbors (4); you can adjust them to fine-tune the face detection results.

Limitations in Face Detection

Face detection using OpenCV's Haar Cascades has some limitations. It might fail in low-light conditions, be less accurate for small faces or faces at an angle, and struggle in complex scenes with multiple people.

Human Activity Recognition with OpenCV

Human activity recognition (HAR) is a process of identifying human activities in video sequences. It can be done using feature extraction and machine learning techniques. In OpenCV, you can use techniques like background subtraction, optical flow, and keypoint detection.

import cv2

# Initialize video capture
cap = cv2.VideoCapture('path_to_video.mp4')

# Initialize background subtractor
fgbg = cv2.createBackgroundSubtractorMOG2()

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

    # Apply background subtraction
    fgmask = fgbg.apply(frame)

    # Show result
    cv2.imshow('Human Activity Recognition', fgmask)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break


cap.release()
cv2.destroyAllWindows()

In this example, background subtraction is used to highlight the human activity by removing the static background. This is only a first step: it isolates the moving regions, which then need to be described by features and passed to a machine learning model to recognize the activity itself.

Count Number of Faces in an Image

Counting faces in an image can be done using face detection methods. By detecting all the faces and counting the number of detected bounding boxes, you can estimate the number of faces in an image.

import cv2
# Load the image
img = cv2.imread('path_to_image.jpg')

# Convert to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Load face cascade
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

# Detect faces
faces = face_cascade.detectMultiScale(gray, 1.1, 4)

# Count faces
num_faces = len(faces)
print(f'Number of faces detected: {num_faces}')

This simple code detects faces using Haar Cascade and counts the number of faces in the image. It's useful for counting people in an image or video.

Impact of Learning Rate on Model

The learning rate is a critical hyperparameter in training machine learning models. A high learning rate may cause the optimizer to overshoot the minimum, so that training settles on a poor solution or diverges, while a low learning rate may lead to slow convergence.

In OpenCV, the neural network class cv2.ml.ANN_MLP exposes this setting for backpropagation training through setBackpropWeightScale(). Support Vector Machines (cv2.ml.SVM) have no learning rate and are tuned through parameters such as C and gamma instead. Experimenting with different learning rates can help improve model performance.

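# Example: configuring a neural network (MLP) in OpenCV
import cv2
import numpy as np

model = cv2.ml.ANN_MLP_create()
model.setLayerSizes(np.array([2, 4, 1], dtype=np.int32))  # example layer sizes
model.setActivationFunction(cv2.ml.ANN_MLP_SIGMOID_SYM)
model.setTrainMethod(cv2.ml.ANN_MLP_BACKPROP)
# The backprop weight scale acts as the learning rate; try 0.001, 0.01, 0.1
model.setBackpropWeightScale(0.01)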

Experiment with different learning rates to observe how the model's performance changes during training. Typically, you can tune the learning rate using grid search or randomized search techniques.
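
The effect of the learning rate can be illustrated without any library. The following sketch is a toy example, not an OpenCV model: it runs plain gradient descent on f(x) = x**2 and compares a rate that is too low, one that works well, and one that is too high. The function name gradient_descent and the chosen rates are assumptions for illustration.

```python
def gradient_descent(lr, steps=50, x0=1.0):
    """Minimize f(x) = x**2, whose gradient is 2*x, with a fixed learning rate."""
    x = x0
    for _ in range(steps):
        x -= lr * 2 * x
    return x


slow = gradient_descent(0.01)      # too low: still far from the minimum at 0
good = gradient_descent(0.1)       # reasonable: very close to 0
diverged = gradient_descent(1.1)   # too high: each step overshoots and grows

print(slow, good, diverged)
```

With a rate of 1.1, every update multiplies x by -1.2, so the iterate jumps across the minimum and moves further away each time.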

Track Objects with Camshift using OpenCV

The Camshift (Continuously Adaptive Mean Shift) algorithm is used to track moving objects in video sequences. It's an improvement of the Mean Shift algorithm and adapts dynamically to the size and orientation of the tracked object.

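import cv2

# Initialize video capture
cap = cv2.VideoCapture('path_to_video.mp4')

# Read the first frame
ret, frame = cap.read()

# Define initial region of interest (ROI); example values, adjust to your video
x, y, width, height = 200, 150, 100, 100
track_window = (x, y, width, height)

# Build a hue histogram of the ROI
hsv_roi = cv2.cvtColor(frame[y:y + height, x:x + width], cv2.COLOR_BGR2HSV)
roi_hist = cv2.calcHist([hsv_roi], [0], None, [180], [0, 180])
cv2.normalize(roi_hist, roi_hist, 0, 255, cv2.NORM_MINMAX)

# Stop after 10 iterations or when the window moves by less than 1 pixel
term_crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)

while ret:
    # Back-project the histogram onto the current frame
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    back_proj = cv2.calcBackProject([hsv], [0], roi_hist, [0, 180], 1)

    # Apply Camshift algorithm
    rot_rect, track_window = cv2.CamShift(back_proj, track_window, term_crit)

    # Draw the tracking window
    tx, ty, tw, th = track_window
    cv2.rectangle(frame, (tx, ty), (tx + tw, ty + th), (0, 255, 0), 2)

    # Display the result; press 'q' to quit
    cv2.imshow('Tracking Object', frame)
    if cv2.waitKey(30) & 0xFF == ord('q'):
        break
    ret, frame = cap.read()

cap.release()
cv2.destroyAllWindows()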

The Camshift algorithm continuously adjusts the window size and orientation, making it more suitable than plain Mean Shift for tracking objects whose apparent size or orientation changes, for example as they move toward or away from the camera.