A digital image is a 2D array (matrix) of pixels:
Pixel values: typically uint8 (integers 0-255)
Image coordinates: image[row, col], equivalently image[y, x] (row/y first, then column/x)
Grayscale images: 2D array of shape (height, width), one intensity value per pixel
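Color images (BGR in OpenCV): 3D array of shape (height, width, 3), one [B, G, R] triple per pixel
[0, 0, 0] = black
[255, 255, 255] = white
[255, 0, 0] = blue
[0, 255, 0] = green
[0, 0, 255] = red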
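The indexing order and the BGR channel order above are easy to mix up, so here is a minimal sketch using only NumPy (no image file needed). The array size and the chosen pixel are arbitrary values picked for illustration.

```python
import numpy as np

# A tiny 2-row by 3-column color image, all black, in BGR channel order.
img = np.zeros((2, 3, 3), dtype=np.uint8)

# Index as img[row, col], i.e. img[y, x]: here row 1, column 2.
img[1, 2] = [0, 0, 255]  # pure red in BGR

blue, green, red = img[1, 2]
print(int(blue), int(green), int(red))  # 0 0 255

# Reversing the last axis reorders the channels from BGR to RGB.
rgb = img[:, :, ::-1]
print(rgb[1, 2].tolist())  # [255, 0, 0]
```

The same pixel reads as [0, 0, 255] in BGR and [255, 0, 0] in RGB, which is why a BGR image can look blue-tinted when handed to a library that expects RGB.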
import cv2
import numpy as np

# images are numpy arrays
img = np.array([[0, 50, 100], [150, 200, 255]], dtype=np.uint8)
#ans: 2x3 grayscale image
print(img.shape)
#ans: (2, 3) - 2 rows, 3 columns
print(img.dtype)
#ans: uint8 - values 0-255

# load a color image
img = cv2.imread('image.jpg')
#ans: loads as BGR format (not RGB!)
print(img.shape)
#ans: (height, width, channels)
height, width, channels = img.shape
#ans: height=rows, width=cols, channels=3

# load as grayscale
gray = cv2.imread('image.jpg', cv2.IMREAD_GRAYSCALE)
#ans: loads as grayscale
print(gray.shape)
#ans: (height, width) - no channel dimension

# accessing individual pixels
img = cv2.imread('image.jpg')
pixel = img[100, 50]
#ans: pixel at row 100, col 50
print(pixel)
#ans: [B, G, R] values (3 numbers)

# modify a pixel
img[100, 50] = [255, 0, 0]
#ans: sets pixel to blue (BGR format)

# create blank black image
black = np.zeros((480, 640, 3), dtype=np.uint8)
#ans: 480x640 black image (all zeros)

# create white image
white = np.ones((480, 640, 3), dtype=np.uint8) * 255
#ans: 480x640 white image (all 255s)

# create colored image
red = np.zeros((480, 640, 3), dtype=np.uint8)
red[:, :, 2] = 255
#ans: red image (3rd channel = red in BGR)
Memory size formula: height × width × channels × bytes_per_channel_value (1 byte for uint8)
Example: a 1920×1080 RGB image in uint8 takes 1920 × 1080 × 3 × 1 = 6,220,800 bytes ≈ 6.2 MB
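The memory formula can be checked against what NumPy itself reports. The sketch below uses a hypothetical helper, `image_bytes`, that is not part of OpenCV or NumPy; its `bytes_per_value` parameter is the size of one channel value and is assumed to be 1 (uint8).

```python
import numpy as np

def image_bytes(height, width, channels, bytes_per_value=1):
    """Uncompressed size in bytes; bytes_per_value is 1 for uint8."""
    return height * width * channels * bytes_per_value

# 1920x1080 image: 1080 rows (height), 1920 columns (width), 3 channels.
size = image_bytes(1080, 1920, 3)
print(size)  # 6220800

# NumPy reports the same number for an actual array of that shape.
frame = np.zeros((1080, 1920, 3), dtype=np.uint8)
print(frame.nbytes == size)  # True
```

This is the size of the raw array in memory; a JPEG or PNG file of the same image is usually much smaller on disk because it is compressed.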
Why uint8? One byte per value saves memory, and the 0-255 range is sufficient for images
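To make the memory argument for uint8 concrete, this small sketch compares the same image shape stored as uint8 and as NumPy's default float64. The 480x640 shape is just an illustrative choice.

```python
import numpy as np

shape = (480, 640, 3)

as_uint8 = np.zeros(shape, dtype=np.uint8)
as_float = np.zeros(shape)  # no dtype given, so NumPy uses float64

# itemsize is bytes per value, nbytes is the total for the whole array.
print(as_uint8.itemsize, as_uint8.nbytes)  # 1 921600
print(as_float.itemsize, as_float.nbytes)  # 8 7372800

# uint8 covers exactly the 0-255 intensity range.
info = np.iinfo(np.uint8)
print(info.min, info.max)  # 0 255
```

Forgetting `dtype=np.uint8` in `np.zeros` therefore costs eight times the memory for the same image shape.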
# what is the shape?
img = np.zeros((300, 400, 3))
#ans: (300, 400, 3) - 300 rows, 400 cols, 3 channels

# how many pixels total?
#ans: 300 × 400 = 120,000 pixels

# how many values in total?
#ans: 300 × 400 × 3 = 360,000 values

# what color is pixel [255, 255, 0] in BGR?
#ans: cyan (blue=255, green=255, red=0)

# why use uint8 instead of int?
#ans: saves memory, 0-255 range sufficient for images

# what's the difference?
gray = np.zeros((100, 100))
color = np.zeros((100, 100, 3))
#ans: gray is 2D, color is 3D with 3 channels

# which is the height? img.shape returns (480, 640, 3)
#ans: 480 is height (first dimension = rows)

# what happens with value 300 in uint8?
img = np.array([[300]]).astype(np.uint8)
#ans: wraps around, 300 % 256 = 44
# note: np.array([[300]], dtype=np.uint8) raises OverflowError on NumPy 2.0 and later

# create 100x100 grayscale image
gray = np.zeros((100, 100), dtype=np.uint8)
#ans: black grayscale image

# create 100x100 color image
color = np.zeros((100, 100, 3), dtype=np.uint8)
#ans: black color image with 3 channels

# access pixel at row 10, col 20
img = np.random.randint(0, 256, (50, 50, 3), dtype=np.uint8)
pixel = img[10, 20]
#ans: array with 3 BGR values

# create green image (BGR format)
green = np.zeros((100, 100, 3), dtype=np.uint8)
green[:, :, 1] = 255
#ans: green channel (index 1) set to 255

# create half-white half-black image
img = np.zeros((100, 100), dtype=np.uint8)
img[:50, :] = 255
#ans: top 50 rows white, bottom 50 rows black

# create solid cyan image (blue + green)
cyan = np.zeros((100, 100, 3), dtype=np.uint8)
cyan[:, :, 0] = 255  # blue
cyan[:, :, 1] = 255  # green
#ans: cyan image [255, 255, 0]

# how to get image dimensions?
height = img.shape[0]
width = img.shape[1]
channels = img.shape[2]
#ans: shape gives (height, width, channels)

# what's the memory size of 1024x768 RGB image?
#ans: 1024 × 768 × 3 = 2,359,296 bytes ≈ 2.4 MB

# create image using np.full
img = np.full((100, 100, 3), [255, 0, 0], dtype=np.uint8)
#ans: blue image (all pixels [255, 0, 0])

# what is img.ndim for grayscale?
gray = np.zeros((100, 100))
#ans: 2 (2-dimensional array)