DEV Community

haratena9

Image Processing #2 RGB, Histogram

About this memorandum

Recently, I had a chance to work with deep learning and image/video processing on the job, but there was a lot I didn't understand about how to tune parameters and augment data, so I decided to study image and video processing from scratch.

Color images and grayscale images

  • Grayscale image: each pixel value represents brightness
  • RGB color image: each pixel holds a brightness value for each of R, G, and B (for clarity, see the image below)
    • 1 pixel: 8 bits × 3 = 24 bits (3 bytes)
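The storage sizes above can be checked with NumPy arrays. This is a minimal sketch: the image size is hypothetical and the arrays are empty placeholders rather than a real photo.

```python
import numpy as np

# Hypothetical image size, chosen only for illustration
height, width = 4, 6

# Grayscale: one 8-bit brightness value per pixel
gray = np.zeros((height, width), dtype=np.uint8)

# RGB color: three 8-bit values (R, G, B) per pixel
rgb = np.zeros((height, width, 3), dtype=np.uint8)

bytes_per_gray_pixel = gray.nbytes // (height * width)
bytes_per_rgb_pixel = rgb.nbytes // (height * width)

print(gray.shape, bytes_per_gray_pixel)  # (4, 6) 1
print(rgb.shape, bytes_per_rgb_pixel)    # (4, 6, 3) 3
```

The extra trailing axis of length 3 is what the later snippets index with `im[:, :, 0]`, `im[:, :, 1]`, and `im[:, :, 2]`.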

[Figure: summary]
In practice, however, it is more accurate to think of an RGB image as three grayscale images, one each for R, G, and B.
Because the car is red, its pixel values in the R channel (left) are close to 255, so the car appears white when that channel is displayed as a grayscale image. In the G and B channels the pixel values are smaller, so the car appears dark.
[Figure: original car image]
[Figure: reality (the R, G, and B channels shown as grayscale images)]
Code

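# Imports assumed by the snippets in this post (the original does not show them)
import matplotlib.pyplot as plt
from matplotlib.pyplot import imshow
from skimage.io import imread

plt.rcParams['image.cmap'] = 'gray'  # show single-channel images in grayscale

im = imread(file_path)  # file_path: path to the input image

imshow(im)
plt.title("original RGB image")
plt.show()

# Each channel is a 2-D array with the same height and width as the image
r_channel = im[:, :, 0]
g_channel = im[:, :, 1]
b_channel = im[:, :, 2]

fig = plt.figure(figsize=(15, 3))

for i, (c, channel) in enumerate(zip('RGB', (r_channel, g_channel, b_channel))):
    ax = fig.add_subplot(1, 3, i + 1)
    imshow(channel, vmin=0, vmax=255)
    plt.colorbar()
    plt.title(f'{c} channel')

plt.show()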
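The same idea can be checked without an image file. The sketch below uses a tiny synthetic image filled with a strong red as a stand-in for the car; the specific pixel values are made up for illustration.

```python
import numpy as np

# Synthetic 2x2 image filled with a strong red, standing in for the red car
red = np.zeros((2, 2, 3), dtype=np.uint8)
red[:, :, 0] = 250  # R channel: close to 255
red[:, :, 1] = 20   # G channel: small
red[:, :, 2] = 10   # B channel: small

# Each channel is itself a 2-D grayscale image
r_channel, g_channel, b_channel = (red[:, :, i] for i in range(3))

print(r_channel.shape)                             # (2, 2)
print(int(r_channel.max()))                        # 250: near white in a gray colormap
print(int(g_channel.max()), int(b_channel.max()))  # 20 10: near black
```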

RGB and BGR

The two formats hold the same data and differ only in the order in which the channels are interpreted, but mixing them up is an easy mistake to make.

RGB

  • Used in many textbook descriptions.
  • Many image processing libraries also use this format.
    • skimage and matplotlib for Python

BGR

  • Also often used
    • OpenCV (Python, C/C++)
    • COLORREF on Windows (0x00bbggrr in hexadecimal)
    • Hardware
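The 0x00bbggrr layout mentioned for COLORREF can be illustrated with plain integer arithmetic. The helper names `pack_colorref` and `unpack_colorref` are hypothetical and exist only for this sketch; they are not part of any Windows or Python API.

```python
def pack_colorref(r, g, b):
    """Pack 8-bit R, G, B values into a 0x00bbggrr integer (hypothetical helper)."""
    return (b << 16) | (g << 8) | r


def unpack_colorref(value):
    """Return (r, g, b) from a 0x00bbggrr integer (hypothetical helper)."""
    return value & 0xFF, (value >> 8) & 0xFF, (value >> 16) & 0xFF


orange = pack_colorref(r=0xFF, g=0x80, b=0x00)
print(f"0x{orange:08x}")        # 0x000080ff
print(unpack_colorref(orange))  # (255, 128, 0)
```

Read left to right, the hexadecimal digits list blue first and red last, which is why this layout is grouped with BGR here.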

A common mistake is to load an image with OpenCV (BGR) and display it with a library that assumes RGB, such as scikit-image or matplotlib. The red and blue channels then appear swapped.
[Figure: mistake (BGR image displayed as RGB)]
Code (example of mistake)

import cv2

im_BGR = cv2.imread(INPUT_DIR + 'IMG-4034.JPG')  # OpenCV loads images in BGR order
imshow(im_BGR)  # matplotlib's imshow assumes RGB
plt.title('show BGR image as RGB image')
plt.axis('off')
plt.show()

There are several ways to fix this, including the following.
Code: BGR to RGB conversion

### Use the built-in function
im_BGR_to_RGB = cv2.cvtColor(im_BGR, cv2.COLOR_BGR2RGB)
imshow(im_BGR_to_RGB)
plt.title('show RGB-converted BGR image as RGB image')
plt.axis('off')
plt.show()

### Without the built-in function: reverse the channel axis by slicing
im_BGR_to_RGB = im_BGR[:, :, ::-1]
imshow(im_BGR_to_RGB)
plt.title('show RGB-converted BGR image as RGB image')
plt.axis('off')
plt.show()

### Without the built-in function (the slicing above, written out step by step)
import numpy as np

im_BGR_to_RGB = np.zeros_like(im_BGR)

im_BGR_to_RGB[:, :, 0] = im_BGR[:, :, 2]
im_BGR_to_RGB[:, :, 1] = im_BGR[:, :, 1]
im_BGR_to_RGB[:, :, 2] = im_BGR[:, :, 0]

imshow(im_BGR_to_RGB)
plt.title('show RGB-converted BGR image as RGB image')
plt.axis('off')
plt.show()
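The two manual conversions give the same result, which can be confirmed on a tiny synthetic array without OpenCV. This is a sketch with made-up pixel values. The final `np.ascontiguousarray` step is an assumption about what downstream code might need, not something the original post requires.

```python
import numpy as np

# Synthetic 1x2 BGR image: one pure-blue pixel and one pure-red pixel
im_bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Method 1: reverse the last (channel) axis; this returns a view, not a copy
rgb_view = im_bgr[:, :, ::-1]

# Method 2: copy each channel explicitly
rgb_copy = np.zeros_like(im_bgr)
rgb_copy[:, :, 0] = im_bgr[:, :, 2]
rgb_copy[:, :, 1] = im_bgr[:, :, 1]
rgb_copy[:, :, 2] = im_bgr[:, :, 0]

print(np.array_equal(rgb_view, rgb_copy))  # True
print(rgb_view[0, 0].tolist())             # [0, 0, 255]: blue in RGB order
print(np.shares_memory(rgb_view, im_bgr))  # True: the view reuses the buffer

# A contiguous copy, in case a downstream function requires one
rgb_contiguous = np.ascontiguousarray(rgb_view)
print(rgb_contiguous.flags["C_CONTIGUOUS"])  # True
```

Because the sliced result is a view, writing into it also changes the original BGR array. Make a copy if the two need to be independent.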

How to create a grayscale image

There is no single right or wrong way to do this; different standards simply define different conversions.
The results look almost identical, but the pixel values differ, so when working with others, make sure everyone knows which conversion is being used.
[Figure: original image]
[Figure: using the built-in function]
[Figure: sum of the RGB values divided by three]
[Figure: PAL/NTSC standard]
[Figure: HDTV standard (nearly the same weights as the built-in function)]
Code

from skimage.color import rgb2gray

im = imread(INPUT_DIR + 'IMG-4034.JPG')

imshow(im)
plt.title("original RGB image")
plt.show()

# Using the built-in rgb2gray function: gray = 0.2125 R + 0.7154 G + 0.0721 B
im_gray1 = rgb2gray(im)
imshow(im_gray1, vmin=0, vmax=1)  # dtype is float, range is [0, 1]
plt.colorbar()
plt.title("rgb2gray min {0} max {1}".format(im_gray1.min(), im_gray1.max() ))
plt.show()

# Use the plain average of R, G, and B as the grayscale value.
# Convert to float first so the sum does not overflow uint8; the result is a float image with range [0, 255].
im_gray2 = (im[:,:,0].astype(float) +
            im[:,:,1].astype(float) + 
            im[:,:,2].astype(float)) / 3
imshow(im_gray2, vmin=0, vmax=255)
plt.colorbar()
plt.title("(R+G+B)/3 min {0:.2f} max {1:.2f}".format(im_gray2.min(), im_gray2.max()))
plt.show()


# The weighted average of RGB is used as the grayscale image.
# https://en.wikipedia.org/wiki/Grayscale#Luma_coding_in_video_systems
im_gray3 = (0.299 * im[:,:,0].astype(float) +
            0.587 * im[:,:,1].astype(float) + 
            0.114 * im[:,:,2].astype(float))
imshow(im_gray3, vmin=0, vmax=255)
plt.colorbar()
plt.title(r"$Y'$ of PAL and NTSC min {0:.2f} max {1:.2f}".format(im_gray3.min(), im_gray3.max()))
plt.show()

# The weighted average of RGB is used as a grayscale image. The weight coefficients vary depending on the standard.
# https://en.wikipedia.org/wiki/Grayscale#Luma_coding_in_video_systems
# These weights are very close to the ones rgb2gray() uses (0.2125, 0.7154, 0.0721). http://scikit-image.org/docs/dev/api/skimage.color.html#skimage.color.rgb2gray
im_gray4 = (0.2126 * im[:,:,0].astype(float) +
            0.7152 * im[:,:,1].astype(float) + 
            0.0722 * im[:,:,2].astype(float))
imshow(im_gray4, vmin=0, vmax=255)
plt.colorbar()
plt.title(r"$Y'$ of HDTV min {0:.2f} max {1:.2f}".format(im_gray4.min(), im_gray4.max()))
plt.show()
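To see how much the conversions differ, it helps to apply them to a single pixel. The sketch below uses a pure-red pixel. The variable names (`bt601_gray` for the PAL/NTSC weights, `bt709_gray` for the HDTV weights) are labels chosen for this example. The last lines show why the code above converts to float before summing.

```python
import numpy as np

# A single pure-red pixel, as floats
r, g, b = np.array([255.0, 0.0, 0.0])

mean_gray = (r + g + b) / 3                        # plain average
bt601_gray = 0.299 * r + 0.587 * g + 0.114 * b     # PAL/NTSC weights
bt709_gray = 0.2126 * r + 0.7152 * g + 0.0722 * b  # HDTV weights

print(f"{mean_gray:.1f} {bt601_gray:.1f} {bt709_gray:.1f}")  # 85.0 76.2 54.2

# uint8 arithmetic wraps around modulo 256, so sums must be done in float
x = np.array([200], dtype=np.uint8)
y = np.array([100], dtype=np.uint8)
print((x + y).tolist())                # [44]
print((x.astype(float) + y).tolist())  # [300.0]
```

The same red pixel maps to gray levels that differ by more than 30 depending on the convention, even though full images converted each way look similar at a glance.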

Histogram

A histogram is a graph showing the frequency distribution of pixel values.
The third picture shows a drawing of Metamon on a whiteboard.
The peak for the Metamon drawing appears around 0.5 to 0.8. It is somewhat spread out because the amount of ink from the marker varies (depending on the pressure applied while drawing, etc.).
In addition, the area where the PC display on the other side of the whiteboard is reflected (to the left of Metamon's mouth) is quite white. This is thought to be the peak around 0.
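The counting behind a histogram can be sketched with NumPy alone. Note the assumptions: this uses `np.histogram` rather than the `histogram` function called in the post's code, and the "image" is synthetic data with a mid-gray region and a bright region, not one of the photos.

```python
import numpy as np

# Synthetic grayscale "image" with values in [0, 1]
rng = np.random.default_rng(0)
mid_gray = rng.uniform(0.5, 0.8, size=900)  # mid-gray region
bright = np.full(100, 1.0)                  # saturated bright region
im = np.concatenate([mid_gray, bright]).reshape(20, 50)

# 256 equal-width bins over [0, 1]; counts[i] is the number of pixels in bin i
counts, bin_edges = np.histogram(im, bins=256, range=(0.0, 1.0))

print(counts.sum() == im.size)  # True: every pixel falls in exactly one bin
print(int(counts[-1]))          # 100: the bright region lands in the last bin
print(int(counts[:128].sum()))  # 0: no pixel is darker than 0.5
```

A region with a narrow range of values shows up as a sharp peak, and one with varying values as a broader hump. Small peaks are easier to see on a log scale than on a linear one.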

[Figures: Hist1, Hist2, Hist3 (each sample image with its linear and log histograms)]
Code

from skimage.exposure import histogram

im_files = ['file_path1', 'file_path2', 'file_path3']
nbins = 256

for file in im_files:
    im = imread(file)[:, :, :3]  # In the case of RGBA, extract only RGB
    im = rgb2gray(im)  # float image, range [0, 1]

    # Compute the histogram once and reuse it for both plots
    freq, bin_centers = histogram(im, nbins=nbins)

    fig = plt.figure(figsize=(20, 3))

    ax = fig.add_subplot(1, 3, 1)
    imshow(im)
    plt.axis('off')

    ax = fig.add_subplot(1, 3, 2)
    plt.plot(bin_centers, freq)
    plt.xlabel("intensity")
    plt.ylabel("frequency")
    plt.title('histogram (linear)')
    plt.xlim(0, 1)

    ax = fig.add_subplot(1, 3, 3)
    plt.plot(bin_centers, freq)
    plt.xlabel("intensity")
    plt.ylabel("log frequency")
    plt.yscale('log')
    plt.title('histogram (log)')
    plt.xlim(0, 1)

    plt.show()
