Sayyid.M

Binarization of images by Otsu and Niblack methods

In this task we compare three image binarization methods: the Otsu method, a modified Otsu method, and the Niblack method. We implement each of them and give examples of images on which one method outperforms the others.

Descriptions of the methods and their implementations

The Otsu method
The method builds a histogram of the brightness of the image pixels and then finds the threshold that splits the histogram into two classes. In Otsu's method this threshold corresponds to the minimum of the intra-class variance or, equivalently, to the maximum of the inter-class variance:

t* = argmax σ_b²(t), where σ_b²(t) = ω₀(t)·ω₁(t)·(μ₀(t) - μ₁(t))², ω_i are the class probabilities and μ_i the class means at threshold t.
Software implementation of the method in Python:
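For example, a minimal NumPy sketch of the threshold search, maximizing the inter-class variance over all candidate thresholds (the helper name `otsu_threshold` is illustrative):

```python
import numpy as np

def otsu_threshold(img):
    # histogram of brightness values, normalized to probabilities
    hist = np.bincount(img.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256, dtype=np.float64)
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0, w1 = prob[:t].sum(), prob[t:].sum()   # class probabilities
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (levels[:t] * prob[:t]).sum() / w0  # class means
        mu1 = (levels[t:] * prob[t:]).sum() / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2  # inter-class variance
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t
```

The image is then binarized as `(img >= t).astype(np.uint8) * 255`. OpenCV also exposes this method directly via `cv.threshold(img, 0, 255, cv.THRESH_BINARY + cv.THRESH_OTSU)`.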

Modified Otsu method
The classical Otsu criterion can be derived from the maximum-likelihood principle for normally distributed classes of image pixels. In that derivation the likelihood is taken over conditional distributions, i.e. distributions of pixel brightness given membership in a particular class. If, following [1], we introduce the probabilities of class membership and apply the maximum-likelihood principle to the joint distribution of pixel brightness, we obtain a different criterion.
For binarization (the two-class case), an image pixel belongs to class 1 if its brightness exceeds a certain threshold, and to class 0 otherwise. The threshold is defined by the following expression:

J(t) = ω₀·ln ω₀ + ω₁·ln ω₁ - ω₀·ln σ₀ - ω₁·ln σ₁ → max over t, where ω_i and σ_i are the probability and standard deviation of class i [1].

Without the first term, the criterion would reduce to the classical Otsu criterion, since the threshold would be determined by the minimum of the intra-class variance. The first term depends only on the class populations, and thus extends the applicability of the classical Otsu criterion to unbalanced classes.
Software implementation:
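A sketch of the corresponding threshold search; the maximized criterion is assumed here, following [1], to be J(t) = w0*ln(w0) + w1*ln(w1) - w0*ln(s0) - w1*ln(s1), where w_i and s_i are the probability and standard deviation of class i (the helper name `ml_threshold` is illustrative):

```python
import numpy as np

def ml_threshold(img):
    # maximum-likelihood threshold for two normal classes (criterion from [1])
    hist = np.bincount(img.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256, dtype=np.float64)
    best_t, best_j = 0, -np.inf
    for t in range(1, 256):
        w0, w1 = prob[:t].sum(), prob[t:].sum()
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (levels[:t] * prob[:t]).sum() / w0
        mu1 = (levels[t:] * prob[t:]).sum() / w1
        s0 = np.sqrt(((levels[:t] - mu0) ** 2 * prob[:t]).sum() / w0)
        s1 = np.sqrt(((levels[t:] - mu1) ** 2 * prob[t:]).sum() / w1)
        if s0 == 0 or s1 == 0:   # degenerate split, log undefined
            continue
        j = (w0 * np.log(w0) + w1 * np.log(w1)
             - w0 * np.log(s0) - w1 * np.log(s1))
        if j > best_j:
            best_j, best_t = j, t
    return best_t
```

Note that the first two terms are largest for unbalanced splits, which is what extends the method to images where one class dominates.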

The Niblack method
Both the classical Otsu method and its modification select a single global threshold for the entire image. In some cases this is unacceptable, for example when illumination varies across the image.
The Niblack method selects a local threshold, one per pixel, computed by the formula:

T(x, y) = m(x, y) + k·s(x, y)
The first term is the mean over a window of some size; the second is the standard deviation over the same window, multiplied by an experimentally chosen coefficient k.
Software implementation (for k = 0.1 and a 40x40 window):

import cv2 as cv
import numpy as np

# convert to float to avoid uint8 overflow when squaring
img = cv.imread('/content/baboon.png', cv.IMREAD_GRAYSCALE).astype(np.float32)
kernel = np.ones((40, 40), dtype=np.float32) / 1600  # 40x40 box filter
img_mean = cv.filter2D(img, -1, kernel)              # local mean
img_sqmean = cv.filter2D(img**2, -1, kernel)         # local mean of squares
# clamp to zero: rounding can make the difference slightly negative
img_sd = np.sqrt(np.maximum(img_sqmean - img_mean**2, 0))
k = 0.1
threshold_mat = img_mean + k * img_sd
out_img = (img > threshold_mat).astype(np.uint8) * 255
cv.imwrite('out2.png', out_img)

Comparison of algorithms
Let's compare the results produced by the different methods on several images.
Here is the original image:

[Image]

According to the modified Otsu method:

[Image]
According to the Niblack method (window 40x40, k=0.01):

[Image]

As you can see, both the Otsu method and its modification give similar results and correctly separate the letters from the background. The Niblack method in principle also gives a correct result (except at the boundaries), but it requires careful parameter tuning, so its use on this image is not justified.

Consider another image:

[Image]
Its Otsu-binarized version:

[Image]

By the modified Otsu method:

[Image]

According to the Niblack method (window 40x40, k=0.05):

[Image]

For this image the class populations differ significantly, so the classical Otsu method fails: the text is not separated from the background. The modification, however, copes much better. The Niblack method does not work well for images with a large proportion of background (or perhaps its parameters were poorly chosen, e.g. too small a window). But since the modified Otsu method handles such images well and requires no parameter tuning, there is no reason to use the Niblack method here.

Finally, consider the last image.

[Image]
Binarization by Otsu:

[Image]
By modification of the Otsu method:

[Image]
According to the Niblack method (window 40x40, k=0.1):

[Image]

As you can see, for this image only the Niblack method gives the correct result. The reason is the uneven illumination of the image, which makes it fundamentally impossible to choose a correct global threshold.

Thus, the choice of the binarization method is largely determined by the input image.
A multiscale variant of the Niblack method can reduce the dependence on window size: local statistics are computed over windows of increasing size, and each pixel uses the smallest window whose standard deviation exceeds a threshold:

import cv2 as cv
import numpy as np

def makekernel(size):
    # normalized box filter (mean over a size x size window)
    return np.ones((size, size), dtype=np.float32) / size**2

img = cv.imread('/content/baboon.png', cv.IMREAD_GRAYSCALE).astype(np.float32)
size = 40
img_mean, img_sqmean, img_sd = [], [], []
threshold_mat = np.zeros(img.shape, dtype=float)
# double the window size at each scale until it approaches the image size
num = int(np.log2(min(img.shape) / 20))
for i in range(num):
    kernel = makekernel(size)
    img_mean.append(cv.filter2D(img, -1, kernel))
    img_sqmean.append(cv.filter2D(img**2, -1, kernel))
    # clamp to zero before the square root to avoid NaNs from rounding
    img_sd.append(np.sqrt(np.maximum(img_sqmean[-1] - img_mean[-1]**2, 0)))
    size *= 2
k = 0.01
sigma_thresh = 35
for i in range(img.shape[0]):
    for j in range(img.shape[1]):
        # use the smallest window whose standard deviation is informative
        for t in range(num):
            if img_sd[t][i, j] < sigma_thresh:
                continue
            threshold_mat[i, j] = img_mean[t][i, j] + k * img_sd[t][i, j]
            break
        else:
            # no scale passed the test: fall back to the largest window
            threshold_mat[i, j] = img_mean[-1][i, j] + k * img_sd[-1][i, j]
out_img = (img > threshold_mat).astype(np.uint8) * 255
cv.imwrite('out12baboon.png', out_img)
cv.imwrite('mean.png', img_mean[0].astype(np.uint8))
cv.imwrite('sd.png', img_sd[-1].astype(np.uint8))

[Image]

Result:

[Image]

[Image]

[Image]

The full implementation of the code can be found on GitHub.

[1] Kurita T., Otsu N., Abdelmalek N. Maximum likelihood thresholding based on population mixture models // Pattern Recognition. 1992. Vol. 25, No. 10. pp. 1231-1240.
