Image Segmentation

What is Image Segmentation?

Segmentation: Partitioning an image into meaningful regions

Goal: Group pixels that belong to the same object or region

Types:

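  • Semantic segmentation: Label each pixel by class (all cars share one label)
  • Instance segmentation: Separate individual objects (each car gets its own id)
  • Panoptic segmentation: Combine semantic + instance (every pixel gets a class, countable objects also get an id)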
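
To make the difference between the three output types concrete, here is a minimal sketch on a toy label grid. The class ids and the helper `label_instances` are hypothetical, and 4-connected components stand in for a real instance segmenter (an assumption: touching objects of the same class would merge into one instance).

```python
from collections import deque

# Toy semantic map: 0 = background, 1 = "object" class (hypothetical labels).
semantic = [
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 1],
    [0, 0, 0, 0, 1],
]


def label_instances(class_map, target):
    """Give each 4-connected blob of `target` pixels its own id (1, 2, ...)."""
    rows, cols = len(class_map), len(class_map[0])
    ids = [[0] * cols for _ in range(rows)]
    next_id = 0
    for r in range(rows):
        for c in range(cols):
            if class_map[r][c] != target or ids[r][c]:
                continue
            next_id += 1
            ids[r][c] = next_id
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and class_map[ny][nx] == target
                            and not ids[ny][nx]):
                        ids[ny][nx] = next_id
                        queue.append((ny, nx))
    return ids


# Instance view: the two blobs of class 1 get ids 1 and 2.
instances = label_instances(semantic, target=1)

# Panoptic view: every pixel carries (class, instance id); background has id 0.
panoptic = [[(semantic[r][c], instances[r][c]) for c in range(5)]
            for r in range(3)]
```

The semantic map alone cannot tell the two blobs apart; the instance map can, and the panoptic map keeps both pieces of information per pixel.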

Methods: Thresholding, clustering, watershed, deep learning

Thresholding Methods

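Simple thresholding: One fixed threshold for the whole image

Adaptive thresholding: Local threshold computed per neighborhood

Otsu's method: Picks the global threshold automatically by maximizing the between-class variance of the histogram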

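  import cv2
  # load an image and convert to grayscale ("image.jpg" is a placeholder path)
  img = cv2.imread("image.jpg")
  gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
  # simple threshold: pixels above 127 become 255, the rest 0
  _, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
  # Otsu's automatic threshold (the passed threshold 0 is ignored)
  ret, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
  # ret is the computed threshold value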
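
To show what "automatic optimal threshold" means, the following is a minimal pure-Python sketch of Otsu's criterion: try every threshold and keep the one that maximizes the between-class variance. The function name `otsu_threshold` and the sample pixel values are made up for illustration; this is not OpenCV's implementation and may break ties differently.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes between-class variance.

    Pixels <= t form class 0, pixels > t form class 1.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in class 0
    sum0 = 0    # sum of intensities in class 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue        # class 0 still empty
        if w1 == 0:
            break           # class 1 empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor 1 / total**2.
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


# Hypothetical 8-pixel "image": a dark group and a bright group.
pixels = [10, 12, 11, 13, 200, 210, 205, 198]
t = otsu_threshold(pixels)
binary = [255 if p > t else 0 for p in pixels]
```

Any threshold between the two groups separates them equally well; this sketch keeps the first (lowest) one it finds.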

Adaptive Thresholding

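Why adaptive? A single global threshold fails when illumination varies across the image; a per-pixel threshold computed from the local neighborhood handles it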

Methods:

  • Mean: Threshold = mean of neighborhood - C
  • Gaussian: Threshold = Gaussian-weighted mean of neighborhood - C

  # adaptive mean thresholding
  adaptive_mean = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, blockSize=11, C=2)
  # adaptive Gaussian thresholding
  adaptive_gauss = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2)
  # blockSize (11 here): neighborhood size (must be odd)
  # C (2 here): constant subtracted from the mean or weighted mean
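
The mean rule above can be sketched in pure Python on a tiny image with an illumination gradient. The function `adaptive_mean_threshold`, the sample image, and the choice to clip windows at the border are assumptions of this sketch (OpenCV pads the border instead), so values near the edges can differ from `cv2.adaptiveThreshold`.

```python
def adaptive_mean_threshold(img, block_size=3, c=2, max_value=255):
    """Set a pixel to max_value if it exceeds (local mean - c), else 0.

    Windows are clipped at the image border (an assumption of this sketch).
    """
    rows, cols = len(img), len(img[0])
    half = block_size // 2
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for col in range(cols):
            window = [
                img[y][x]
                for y in range(max(0, r - half), min(rows, r + half + 1))
                for x in range(max(0, col - half), min(cols, col + half + 1))
            ]
            local_thresh = sum(window) / len(window) - c
            out[r][col] = max_value if img[r][col] > local_thresh else 0
    return out


# Hypothetical image: background brightens left to right (40 ... 160),
# with one dark "ink" pixel on the dim side and one on the bright side.
img = [
    [40, 60, 80, 100, 120, 140, 160],
    [40, 20, 80, 100, 120, 100, 160],
    [40, 60, 80, 100, 120, 140, 160],
]

adaptive = adaptive_mean_threshold(img, block_size=3, c=12)
global_binary = [[255 if p > 80 else 0 for p in row] for row in img]
```

With the fixed threshold of 80, the dim background on the left is marked dark and the ink pixel on the bright side is missed. The adaptive version marks only the two ink pixels as 0.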

Watershed Segmentation

Watershed algorithm: Treats image like topographic map

Concept:

  • Image is topographic surface (intensity = height)
  • Water fills basins from markers
  • Watershed lines separate regions

Steps:

  1. Find markers (sure foreground)
  2. Find sure background (what lies between the two is the unknown region)
  3. Apply watershed
  4. Extract boundaries
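
The flooding idea can be sketched with a priority queue: pixels are visited from lowest to highest, each takes the label of the basin it touches, and a pixel that touches two different basins becomes a watershed line. This is a simplified toy (4-connectivity, hypothetical function `toy_watershed`, made-up height profile), not the algorithm as implemented in OpenCV.

```python
import heapq

WATERSHED = -1  # same convention as cv2.watershed: boundary pixels get -1


def toy_watershed(height, markers):
    """Flood `height` from the non-zero labels in `markers` (0 = unknown)."""
    rows, cols = len(height), len(height[0])
    labels = [row[:] for row in markers]
    heap, queued, order = [], set(), 0

    def neighbors(r, c):
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                yield nr, nc

    def push_unlabeled_neighbors(r, c):
        nonlocal order
        for nr, nc in neighbors(r, c):
            if labels[nr][nc] == 0 and (nr, nc) not in queued:
                queued.add((nr, nc))
                # `order` breaks ties so equal heights pop first-in, first-out.
                heapq.heappush(heap, (height[nr][nc], order, nr, nc))
                order += 1

    # Seed the queue with the pixels bordering each marker.
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] > 0:
                push_unlabeled_neighbors(r, c)

    # Flood: always process the lowest pixel next.
    while heap:
        _, _, r, c = heapq.heappop(heap)
        touching = {labels[nr][nc] for nr, nc in neighbors(r, c)
                    if labels[nr][nc] > 0}
        if len(touching) == 1:
            labels[r][c] = touching.pop()
            push_unlabeled_neighbors(r, c)
        else:
            labels[r][c] = WATERSHED  # two basins meet here
    return labels


# One-row "landscape": two valleys separated by a ridge of height 5.
height = [[1, 0, 2, 5, 2, 0, 1]]
markers = [[0, 1, 0, 0, 0, 2, 0]]
result = toy_watershed(height, markers)
```

Here the two basins grow outward from their markers and meet at the ridge, which is the only pixel labeled -1.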

Watershed Implementation

  import numpy as np
  # binary threshold (inverted Otsu: assumes objects are darker than the background)
  _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
  # noise removal (opening)
  kernel = np.ones((3, 3), np.uint8)
  opening = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel, iterations=2)
  # sure background (dilation): everything outside the dilated objects is background
  sure_bg = cv2.dilate(opening, kernel, iterations=3)
  # sure foreground (distance transform + threshold)
  dist_transform = cv2.distanceTransform(opening, cv2.DIST_L2, 5)
  _, sure_fg = cv2.threshold(dist_transform, 0.7 * dist_transform.max(), 255, cv2.THRESH_BINARY)
  # unknown region: inside the dilated objects but not in the sure foreground
  sure_fg = np.uint8(sure_fg)
  unknown = cv2.subtract(sure_bg, sure_fg)
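
Why threshold the distance transform at 0.7 of its maximum? The sketch below uses a brute-force Euclidean distance transform on a hypothetical mask of two 3x3 blobs joined by a one-pixel bridge. It is exact but slow, whereas `cv2.distanceTransform` with a 5x5 mask is a fast approximation, so values may differ slightly. The helper name `distance_to_background` is made up for illustration.

```python
import math


def distance_to_background(mask):
    """Euclidean distance from each foreground pixel to the nearest 0 pixel."""
    rows, cols = len(mask), len(mask[0])
    zeros = [(r, c) for r in range(rows) for c in range(cols)
             if mask[r][c] == 0]
    return [
        [min(math.hypot(r - zr, c - zc) for zr, zc in zeros)
         if mask[r][c] else 0.0
         for c in range(cols)]
        for r in range(rows)
    ]


# Two 3x3 blobs connected by a single bridge pixel at row 2, column 4.
mask = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
]

dist = distance_to_background(mask)
peak = max(max(row) for row in dist)

# Keep only pixels deep inside an object: the thin bridge drops out.
sure_fg = [[1 if d > 0.7 * peak else 0 for d in row] for row in dist]
```

The mask is one connected region, but the thresholded distance map contains two separate groups of pixels, one per blob. Those groups are what become the watershed markers.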

Watershed Markers

  # label markers: each sure-foreground blob gets an id 1..N, everything else is 0
  _, markers = cv2.connectedComponents(sure_fg)
  # add 1 to all labels (background becomes 1, not 0)
  markers = markers + 1
  # mark unknown region as 0
  markers[unknown == 255] = 0
  # apply watershed (img is the 8-bit 3-channel image, markers is int32)
  markers = cv2.watershed(img, markers)
  # mark boundaries (-1) in red (BGR order)
  img[markers == -1] = [0, 0, 255]
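
The label bookkeeping before the watershed call is easy to get wrong, so here is the same arithmetic on a hypothetical one-row example using plain lists (real inputs would be NumPy arrays produced by OpenCV).

```python
# Hypothetical one-row inputs, standing in for the OpenCV arrays.
components = [0, 1, 1, 0, 0, 2, 0]     # connectedComponents ids of sure_fg
unknown = [0, 0, 0, 255, 255, 0, 0]    # sure_bg minus sure_fg

# Shift every label up by one so that sure background is 1 instead of 0 ...
markers = [label + 1 for label in components]

# ... which frees 0 to mean "unknown, let the watershed decide".
markers = [0 if u == 255 else m for m, u in zip(markers, unknown)]
```

After these two steps, 0 marks the pixels still to be assigned, 1 is sure background, and 2 and above are the individual objects.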

GrabCut

GrabCut: Iterative segmentation using graph cuts

Input: Rectangle around object

Algorithm:

  • Models foreground and background colors with Gaussian Mixture Models (GMMs)
  • Iteratively re-estimates the models and relabels pixels with a graph cut
  • Refines the boundary with each iteration

Advantage: Semi-automatic, minimal user input

GrabCut in OpenCV

  # define rectangle around object
  rect = (50, 50, 450, 290)  # (x, y, width, height)
  # initialize mask and models
  mask = np.zeros(img.shape[:2], np.uint8)
  bgdModel = np.zeros((1, 65), np.float64)
  fgdModel = np.zeros((1, 65), np.float64)
  # apply GrabCut for 5 iterations (mask and models are updated in place)
  cv2.grabCut(img, mask, rect, bgdModel, fgdModel, 5, cv2.GC_INIT_WITH_RECT)
  # extract foreground: mask values 0 (background) and 2 (probable background) are dropped
  mask2 = np.where((mask == 2) | (mask == 0), 0, 1).astype('uint8')
  result = img * mask2[:, :, np.newaxis]
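
The magic numbers 0 and 2 in the last step are GrabCut's mask labels. The sketch below spells them out on a hypothetical 3x3 mask using plain lists. The constants are defined locally so the example runs without OpenCV; their values match `cv2.GC_BGD`, `cv2.GC_FGD`, `cv2.GC_PR_BGD` and `cv2.GC_PR_FGD`.

```python
# GrabCut mask labels (same integer values as OpenCV's cv2.GC_* constants).
GC_BGD, GC_FGD, GC_PR_BGD, GC_PR_FGD = 0, 1, 2, 3

# Hypothetical mask as it might look after cv2.grabCut has run.
mask = [
    [GC_BGD, GC_PR_BGD, GC_PR_BGD],
    [GC_BGD, GC_PR_FGD, GC_FGD],
    [GC_BGD, GC_PR_FGD, GC_PR_BGD],
]

# Hypothetical 3x3 BGR image with a single color.
image = [[(200, 100, 50)] * 3 for _ in range(3)]

# Definite and probable foreground are kept, both background labels dropped.
keep = [[1 if label in (GC_FGD, GC_PR_FGD) else 0 for label in row]
        for row in mask]

# Black out every pixel that is not kept.
result = [[px if k else (0, 0, 0) for px, k in zip(img_row, keep_row)]
          for img_row, keep_row in zip(image, keep)]
```

Using the named constants instead of 0 and 2 makes the intent of the final `np.where` easier to read.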

K-Means Clustering

K-Means: Cluster pixels by color similarity

Steps:

  1. Reshape image to list of pixels
  2. Apply k-means (k clusters)
  3. Replace pixels with cluster centers
  4. Reshape back

Result: Segmented regions with k colors
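
The four steps can be sketched in pure Python with a plain Lloyd's-algorithm loop. The function `kmeans_pixels`, the sample colors, and the fixed initial centers are assumptions made so the example is deterministic. `cv2.kmeans` picks initial centers itself and repeats the run several times.

```python
def kmeans_pixels(pixels, centers, iterations=10):
    """Plain Lloyd's algorithm on RGB tuples, starting from given centers."""
    centers = [tuple(float(v) for v in c) for c in centers]
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(len(centers)),
                key=lambda k: sum((p - q) ** 2
                                  for p, q in zip(px, centers[k])))
            for px in pixels
        ]
        # Update step: move each center to the mean of its members.
        for k in range(len(centers)):
            members = [px for px, lab in zip(pixels, labels) if lab == k]
            if members:  # keep the old center if a cluster is empty
                centers[k] = tuple(sum(ch) / len(members)
                                   for ch in zip(*members))
    return labels, centers


# Hypothetical 2x3 image with reddish and bluish pixels.
img = [
    [(250, 10, 10), (240, 20, 20), (10, 10, 250)],
    [(245, 15, 15), (20, 20, 240), (15, 15, 245)],
]
rows, cols = len(img), len(img[0])

# 1. Reshape to a flat list of pixels.
flat = [px for row in img for px in row]
# 2. Apply k-means with k = 2 (initial centers: one red, one blue pixel).
labels, centers = kmeans_pixels(flat, centers=[flat[0], flat[2]])
# 3. Replace each pixel with its cluster center.
recolored = [tuple(round(v) for v in centers[lab]) for lab in labels]
# 4. Reshape back to rows and columns.
segmented = [recolored[r * cols:(r + 1) * cols] for r in range(rows)]
```

The result contains only two colors, the mean red and the mean blue, which is the "k colors" outcome described above.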

K-Means in OpenCV

  # reshape to 2D array of pixels (one row per pixel, one column per channel)
  pixels = img.reshape((-1, 3))
  pixels = np.float32(pixels)
  # define criteria and apply k-means
  # criteria = (type, max iterations, epsilon)
  criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 100, 0.2)
  k = 3
  # 10 = number of attempts with different initial centers
  _, labels, centers = cv2.kmeans(pixels, k, None, criteria, 10, cv2.KMEANS_RANDOM_CENTERS)
  # convert centers to uint8
  centers = np.uint8(centers)
  # map labels to centers
  segmented = centers[labels.flatten()]
  segmented = segmented.reshape(img.shape)

Contour-Based Segmentation

Contours: Boundaries of shapes

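findContours: Finds contours of the white regions in a binary image; the retrieval mode selects which ones are returned (RETR_EXTERNAL keeps only the outermost)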

  # threshold to binary
  _, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
  # find contours (OpenCV 4.x returns two values; 3.x returned an extra leading image)
  contours, hierarchy = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
  # draw contours in green, 2 px thick; -1 draws all contours
  cv2.drawContours(img, contours, -1, (0, 255, 0), 2)
  # filter by area
  large_contours = [c for c in contours if cv2.contourArea(c) > 1000]
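
The area filter treats each contour as a closed polygon. A minimal sketch of that idea uses the shoelace formula on hypothetical point lists. `polygon_area` is an illustrative helper, not OpenCV's `contourArea`, although both should agree on simple polygons like these.

```python
def polygon_area(points):
    """Shoelace formula: area enclosed by a closed polygon of (x, y) points."""
    twice_area = 0
    # Pair every vertex with the next one, wrapping around at the end.
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2


# Hypothetical contours as lists of corner points.
contours = [
    [(0, 0), (50, 0), (50, 40), (0, 40)],        # 50 x 40 rectangle
    [(10, 10), (15, 10), (15, 15), (10, 15)],    # 5 x 5 square
    [(0, 0), (60, 0), (0, 40)],                  # right triangle
]

areas = [polygon_area(c) for c in contours]
large_contours = [c for c in contours if polygon_area(c) > 1000]
```

The small square is dropped by the filter, while the rectangle and the triangle are kept.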

Exercises - Part 1 (Concepts)

  # what is image segmentation?
  #ans: partitioning image into meaningful regions
  # what is Otsu's method?
  #ans: automatic optimal threshold computation
  # why adaptive thresholding?
  #ans: handles varying illumination
  # what is watershed algorithm?
  #ans: treats image as topographic map, fills basins
  # what is GrabCut?
  #ans: iterative segmentation using graph cuts and GMMs

Exercises - Part 2 (Concepts)

  # k-means for segmentation?
  #ans: clusters pixels by color similarity
  # what are contours?
  #ans: boundaries of shapes/objects
  # semantic vs instance segmentation?
  #ans: semantic labels each pixel by class, instance also separates individual objects
  # advantage of GrabCut?
  #ans: semi-automatic, needs only rectangle input

Exercises - Part 3 (Coding)

  # Otsu's thresholding
  #ans: ret, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
  # adaptive thresholding
  #ans: adaptive = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2)
  # find contours
  _, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
  #ans: contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
  #ans: cv2.drawContours(img, contours, -1, (0, 255, 0), 2)

Exercises - Part 4 (Coding)

  # k-means with 4 clusters
  pixels = img.reshape((-1, 3))
  pixels = np.float32(pixels)
  criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 100, 0.2)
  #ans: _, labels, centers = cv2.kmeans(pixels, 4, None, criteria, 10, cv2.KMEANS_RANDOM_CENTERS)
  #ans: centers = np.uint8(centers)
  #ans: segmented = centers[labels.flatten()].reshape(img.shape)

Exercises - Part 5 (Mixed)

  # GrabCut segmentation
  rect = (50, 50, 450, 290)
  mask = np.zeros(img.shape[:2], np.uint8)
  bgdModel = np.zeros((1, 65), np.float64)
  fgdModel = np.zeros((1, 65), np.float64)
  #ans: cv2.grabCut(img, mask, rect, bgdModel, fgdModel, 5, cv2.GC_INIT_WITH_RECT)
  #ans: mask2 = np.where((mask == 2) | (mask == 0), 0, 1).astype('uint8')
  #ans: result = img * mask2[:, :, np.newaxis]
