SIFT & SURF

SIFT Overview

SIFT: Scale-Invariant Feature Transform (Lowe, 1999)

Key properties:

  • Scale invariant: Detects features at multiple scales
  • Rotation invariant: Assigns orientation to keypoints
  • Illumination invariant: Robust to lighting changes
  • Partial occlusion resistant

Patent expired: March 2020; SIFT is now free to use commercially

SIFT Algorithm Steps

1. Scale-space extrema detection

  • Build Gaussian pyramid (multiple scales)
  • Compute Difference of Gaussians (DoG)
  • Find local maxima/minima across scales

2. Keypoint localization

  • Reject low-contrast points
  • Eliminate edge responses

3. Orientation assignment

  • Compute gradient histogram
  • Assign dominant orientation(s)

4. Descriptor generation

  • 16×16 neighborhood → 4×4 grid
  • 8-bin histogram per cell → 128 values

Scale-Space Detection

Why multiple scales?

  • Features appear at different sizes
  • Same feature in different images may have different scales

Gaussian pyramid: the image is repeatedly blurred and downsampled, producing octaves at progressively coarser scales

DoG: Difference of Gaussians (subtracting adjacent pyramid levels) approximates the scale-normalized Laplacian
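A minimal sketch of one DoG octave, assuming a grayscale image at a placeholder path "image.png"; the sigma schedule follows the standard SIFT defaults (sigma = 1.6, 3 layers per octave):

  import cv2
  import numpy as np

  gray = cv2.imread("image.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)

  sigma, layers = 1.6, 3
  k = 2 ** (1.0 / layers)

  # blur at successively larger sigmas within one octave
  blurred = [cv2.GaussianBlur(gray, (0, 0), sigma * k**i) for i in range(layers + 1)]

  # subtract adjacent levels: each DoG approximates the Laplacian at that scale
  dog = [blurred[i + 1] - blurred[i] for i in range(layers)]

  # keypoint candidates are pixels that are extrema among their 26 neighbours
  # across position and adjacent DoG levels (comparison omitted in this sketch)

SIFT then halves the image and repeats the process for the next octave.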

SIFT Parameters

  import cv2
  # create SIFT with default parameters
  #ans: sift = cv2.SIFT_create()
  # custom parameters
  #ans: sift = cv2.SIFT_create(nfeatures=500, nOctaveLayers=3, contrastThreshold=0.04, edgeThreshold=10, sigma=1.6)
  # nfeatures: max features to retain (0 = keep all)
  #ans: keeps the 500 strongest features
  # nOctaveLayers: layers per octave
  #ans: more layers = more scales sampled, but slower
  # contrastThreshold: filter low-contrast keypoints
  #ans: higher = fewer but stronger features
  # edgeThreshold: filter edge responses
  #ans: higher = more edge-like features accepted
  # sigma: initial Gaussian blur
  #ans: 1.6 is the standard value
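Since contrastThreshold directly trades feature count for feature strength, a quick sketch (assuming `gray` is an already-loaded grayscale image) makes the effect visible:

  for thresh in (0.02, 0.04, 0.08):
      sift = cv2.SIFT_create(contrastThreshold=thresh)
      kp = sift.detect(gray, None)
      print(f"contrastThreshold={thresh}: {len(kp)} keypoints")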

SIFT Detection & Computation

  gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
  sift = cv2.SIFT_create()
  # detect keypoints only
  #ans: keypoints = sift.detect(gray, None)
  # detect and compute descriptors
  #ans: keypoints, descriptors = sift.detectAndCompute(gray, None)
  # keypoint properties
  #ans: kp = keypoints[0]
  #ans: x, y = kp.pt            # position
  #ans: size = kp.size          # scale (diameter of the keypoint neighborhood)
  #ans: angle = kp.angle        # orientation (degrees)
  #ans: response = kp.response  # keypoint strength
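A quick sanity check on what detectAndCompute returns, continuing from the block above: keypoints is a list of cv2.KeyPoint objects, and descriptors is a float32 array with one 128-value row per keypoint.

  print(len(keypoints))     # number of keypoints detected
  print(descriptors.shape)  # (num_keypoints, 128)
  print(descriptors.dtype)  # float32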

SIFT Descriptor

128-dimensional vector:

  • 4×4 grid of cells
  • 8-bin gradient histogram per cell
  • 4×4×8 = 128 values

Normalization: Makes descriptor illumination invariant

Matching: Use L2 (Euclidean) distance
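Concretely, the normalization scheme from Lowe's paper normalizes the raw 128-bin vector to unit length (cancelling uniform contrast changes), clips each bin at 0.2 (damping large gradients caused by non-linear lighting), and renormalizes. A minimal NumPy sketch, illustrative rather than OpenCV's internal code:

  import numpy as np

  def normalize_descriptor(v, clip=0.2):
      v = v / np.linalg.norm(v)      # unit length: cancels contrast scaling
      v = np.minimum(v, clip)        # clip large bins: tames non-linear lighting
      return v / np.linalg.norm(v)   # renormalize

  # matching then compares descriptors with plain L2 distance:
  # d = np.linalg.norm(desc1 - desc2)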

SURF Overview

SURF: Speeded-Up Robust Features (Bay et al., 2006)

Key improvements over SIFT:

  • 3-5× faster: Uses integral images
  • Comparable performance
  • 64 or 128 dimensions

Caveat: Still patented; lives in opencv-contrib and is only available in builds with the nonfree module enabled

SURF Algorithm

Speed optimizations:

  1. Integral images: Fast box filter computation (see the sketch after this list)
  2. Hessian matrix approximation: Box filters instead of Gaussian
  3. Wavelet responses: Simplified gradient computation
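Why integral images are fast: after a single cumulative-sum pass, the sum of any axis-aligned box takes just four array reads, regardless of box size. A minimal sketch using cv2.integral (which pads the result with a leading zero row and column); the image and box coordinates below are arbitrary stand-ins:

  import cv2
  import numpy as np

  gray = np.random.randint(0, 256, (480, 640), dtype=np.uint8)
  ii = cv2.integral(gray)  # shape (481, 641)

  def box_sum(ii, x, y, w, h):
      # sum of the rectangle [x, x+w) x [y, y+h) in four lookups
      return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

  print(box_sum(ii, 100, 50, 20, 20) == gray[50:70, 100:120].sum())  # True

This is what lets SURF evaluate its box-filter Hessian approximation at any scale in constant time.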

Steps similar to SIFT:

  • Scale-space detection (using Hessian)
  • Orientation assignment
  • Descriptor generation

SURF in OpenCV

  # requires opencv-contrib-python
  # create SURF detector
  #ans: surf = cv2.xfeatures2d.SURF_create()
  # set Hessian threshold
  #ans: surf.setHessianThreshold(400)
  #ans: lower = more features, but weaker ones
  # detect and compute
  gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
  #ans: keypoints, descriptors = surf.detectAndCompute(gray, None)
  # use extended descriptor (128-dim instead of 64)
  #ans: surf.setExtended(True)
  #ans: 128 dimensions for more distinctiveness
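Because prebuilt opencv-contrib-python wheels generally ship without the nonfree module, SURF_create may raise at runtime. A defensive sketch that falls back to the patent-free SIFT (assuming `gray` is already loaded):

  try:
      detector = cv2.xfeatures2d.SURF_create(hessianThreshold=400)
  except (AttributeError, cv2.error):
      detector = cv2.SIFT_create()  # patent-free fallback
  keypoints, descriptors = detector.detectAndCompute(gray, None)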

SIFT vs SURF

SIFT:

  • More accurate, distinctive
  • Better for difficult matching
  • Slower (Gaussian filters)
  • 128 dimensions
  • Patent expired (free)

SURF:

  • Faster (integral images)
  • Good for real-time
  • 64/128 dimensions
  • Still patented
  • Good SIFT alternative

Both: Scale + rotation invariant, robust descriptors
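A rough way to check the speed claim on your own images (assuming `gray` is loaded and SURF is available in your build; absolute times vary with image size and hardware):

  import time

  for name, make in [("SIFT", cv2.SIFT_create),
                     ("SURF", lambda: cv2.xfeatures2d.SURF_create(400))]:
      detector = make()
      t0 = time.perf_counter()
      kp, des = detector.detectAndCompute(gray, None)
      print(f"{name}: {len(kp)} keypoints in {time.perf_counter() - t0:.3f}s")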

Matching SIFT Features

  # detect SIFT in two images
  sift = cv2.SIFT_create()
  kp1, des1 = sift.detectAndCompute(gray1, None)
  kp2, des2 = sift.detectAndCompute(gray2, None)
  # brute-force matcher with L2 norm
  #ans: bf = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
  #ans: matches = bf.match(des1, des2)
  #ans: matches = sorted(matches, key=lambda x: x.distance)
  # draw the 30 best matches
  #ans: img_matches = cv2.drawMatches(img1, kp1, img2, kp2, matches[:30], None, flags=2)

Ratio Test (Lowe's Ratio)

Problem: Ambiguous matches (multiple similar descriptors)

Solution: Compare best match to second-best match

Ratio test:

  if distance(1st) < 0.75 × distance(2nd):
      accept match
  else:
      reject (ambiguous)

0.75: Standard threshold; lower it for stricter matching, raise it for looser matching

Ratio Test Implementation

  # BFMatcher with kNN (k=2)
  bf = cv2.BFMatcher(cv2.NORM_L2)
  #ans: matches = bf.knnMatch(des1, des2, k=2)
  #ans: returns the 2 best matches for each descriptor
  # apply ratio test
  good_matches = []
  #ans: for m, n in matches:
  #ans:     if m.distance < 0.75 * n.distance:
  #ans:         good_matches.append(m)
  # draw good matches
  img_matches = cv2.drawMatches(img1, kp1, img2, kp2, good_matches, None, flags=2)

Exercises - Part 1 (Concepts)

  # what does SIFT stand for?
  #ans: Scale-Invariant Feature Transform
  # how many dimensions in SIFT descriptor?
  #ans: 128 dimensions
  # what is DoG?
  #ans: Difference of Gaussians, approximates the Laplacian
  # why multiple scales in SIFT?
  #ans: features appear at different sizes across images
  # main advantage of SURF over SIFT?
  #ans: 3-5× faster using integral images

Exercises - Part 2 (Concepts)

  # what is Lowe's ratio test?
  #ans: compare 1st best match to 2nd best, reject if ratio > 0.75
  # why ratio test?
  #ans: filters ambiguous matches (multiple similar descriptors)
  # SIFT descriptor grid structure?
  #ans: 4×4 cells, 8-bin histogram each = 128 values
  # what distance metric for SIFT?
  #ans: L2 (Euclidean) distance
  # is SIFT rotation invariant?
  #ans: yes, assigns orientation to each keypoint

Exercises - Part 3 (Coding)

  # create SIFT with 1000 features
  #ans: sift = cv2.SIFT_create(nfeatures=1000)
  # detect and compute SIFT
  gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
  #ans: keypoints, descriptors = sift.detectAndCompute(gray, None)
  # get first keypoint properties
  #ans: kp = keypoints[0]
  #ans: x, y = kp.pt
  #ans: size = kp.size
  #ans: angle = kp.angle

Exercises - Part 4 (Coding)

  # match SIFT features
  sift = cv2.SIFT_create()
  kp1, des1 = sift.detectAndCompute(gray1, None)
  kp2, des2 = sift.detectAndCompute(gray2, None)
  #ans: bf = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
  #ans: matches = bf.match(des1, des2)
  #ans: matches = sorted(matches, key=lambda x: x.distance)
  # apply ratio test
  bf = cv2.BFMatcher(cv2.NORM_L2)
  #ans: matches = bf.knnMatch(des1, des2, k=2)
  #ans: good = [m for m, n in matches if m.distance < 0.75 * n.distance]

Exercises - Part 5 (Mixed)

  # SIFT with higher contrast threshold
  #ans: sift = cv2.SIFT_create(contrastThreshold=0.08)
  #ans: fewer features, higher quality
  # draw SIFT keypoints with rich info
  sift = cv2.SIFT_create()
  kp, des = sift.detectAndCompute(gray, None)
  #ans: img_kp = cv2.drawKeypoints(img, kp, None, flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
  #ans: shows size and orientation
  # stricter ratio test
  bf = cv2.BFMatcher(cv2.NORM_L2)
  matches = bf.knnMatch(des1, des2, k=2)
  #ans: good = [m for m, n in matches if m.distance < 0.6 * n.distance]
  #ans: 0.6 is stricter than 0.75, fewer but better matches
