Three types of organisms can be seen in the video above: round ones, long ones, and a couple of amoebae.
Last year, I was introduced to a wonderful scientific instrument called the ‘Foldscope’. I have spent hours observing things with it. One of my favorite pastimes is to observe ciliates using the Foldscope. Ciliates are very simple single-celled organisms which are easy to find and come in numerous shapes and sizes. Most ciliates move very fast, and you need some skill with a microscope to follow them on the slide. This inspired me to write some code that could detect moving objects in a video and draw rectangles around them. Amazingly, I believe I was able to do a decent job with under 60 lines of Python code.
In this post I will discuss the concepts I used for detecting moving objects and how they work together to produce the end result.
The first thing we need to do is load frames one by one from a video. OpenCV makes this task very easy. It has a very convenient function called ‘cv2.VideoCapture’, which returns an object that can be used to find out information about the video (like width, height and frame rate). The same object allows us to read a single frame from the video by calling the ‘read()’ method on it. The ‘read()’ method returns two values: a boolean indicating the success of the operation, and the frame as an image.
import cv2

cap = cv2.VideoCapture("video_input.mp4")
if cap.isOpened():
    width = cap.get(3)   # float; property 3 is cv2.CAP_PROP_FRAME_WIDTH
    height = cap.get(4)  # float; property 4 is cv2.CAP_PROP_FRAME_HEIGHT
    fps = cap.get(cv2.CAP_PROP_FPS)
The full video can be read frame by frame using the following code:
success = True
while success:
    success, im = cap.read()
    if not success:
        break
Writing videos with OpenCV is also very easy. Similar to the ‘VideoCapture’ function, the ‘VideoWriter’ function can be used to write a video frame by frame. It expects the path of the output video, the codec (as a FourCC code), the frames per second, and the width and height of the output frames as parameters.
fourcc = cv2.VideoWriter_fourcc('m', 'p', '4', 'v')
out = cv2.VideoWriter('video_output.mp4', fourcc, int(fps), (int(width), int(height)))
Writing a frame to the video is as easy as calling the ‘write()’ method of the writer object with the frame, e.g. ‘out.write(im)’.
The above video was generated out of frame diffs from the original video
Images are represented as matrices in memory. OpenCV has a function called ‘cv2.absdiff()’ which calculates the absolute difference of two images. This is the basis of our motion detection. We are relying on the fact that when something in the video moves, its absdiff will be non-zero for those pixels. However, if something is stationary and has not moved between two consecutive frames, the absdiff will be zero. So, as we read the video frame by frame, we compare the current frame with the previous frame and calculate the absdiff matrix. The dimensions of this matrix are the same as those of the images being compared.
Sounds easy, right? But there are some problems with this approach. Firstly, cameras and software produce artifacts when they capture and encode videos. Such artifacts give us a non-zero diff even when the object is stationary. Uneven lighting and focusing can also cause non-zero diffs for stationary portions of images.
After experimenting with some approaches, I found that thresholding the diff image using the mean of its non-zero values works very well.
frame_diff[frame_diff < frame_diff[np.nonzero(frame_diff)].mean()] = 0
frame_diff[frame_diff > frame_diff[np.nonzero(frame_diff)].mean()]
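A small worked example may help to see what the mean threshold does. This is a sketch only: 'suppress_below_mean' is a hypothetical helper, it works on a copy rather than in place, and the guard for an all-zero diff is an addition for the case where nothing moved at all.

```python
import numpy as np

def suppress_below_mean(frame_diff):
    """Zero out pixels that are weaker than the mean of the non-zero pixels."""
    out = frame_diff.copy()
    nonzero = out[out > 0]
    if nonzero.size == 0:
        return out  # nothing moved; avoid taking the mean of an empty array
    out[out < nonzero.mean()] = 0
    return out

# Weak noise (1..4) next to two strong motion pixels (90 and 110).
diff = np.array([[0, 2, 3],
                 [1, 90, 0],
                 [2, 110, 4]], dtype=np.uint8)
cleaned = suppress_below_mean(diff)
# Mean of the non-zero values is 212 / 7, about 30.3, so only 90 and 110 survive.
```

Because the threshold is computed from each diff image, it adapts to how noisy a particular pair of frames is instead of relying on one fixed cut-off.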
Edge detection using 'Sobel' filter performed on the frame diff video
As microorganisms move, they push matter around them, which gives positive pixels after diffing. But we want to differentiate microorganisms from other things. Focusing also plays an important part here. Generally, a lot of out-of-focus moving objects will also give positive frame differences. Mostly these are blurred objects which we simply want to ignore. This is where edge detection comes into play: we want to concentrate on the edges and borders in the image, which are sharp for in-focus objects and weak for blurred ones. This can be easily achieved by using the 'sobel' filter from the scikit-image package.
output = filters.sobel(frame_diff)
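Here is a minimal sketch of the Sobel step on a synthetic image. 'edge_strength' is a hypothetical helper. Rescaling the result to 8-bit is an assumption on my part, made because 'filters.sobel' returns floating point values while OpenCV's contour functions expect an 8-bit image.

```python
import numpy as np
from skimage import filters

def edge_strength(frame_diff):
    """Sobel edge magnitude of a 2-D image, rescaled to uint8 (0..255)."""
    edges = filters.sobel(frame_diff.astype(np.float64))
    peak = edges.max()
    if peak == 0:
        return np.zeros(edges.shape, dtype=np.uint8)  # flat image: no edges
    return (edges / peak * 255).astype(np.uint8)

# A bright 8x8 square on a dark background.
image = np.zeros((20, 20), dtype=np.uint8)
image[6:14, 6:14] = 255
edges = edge_strength(image)
# The response is zero in flat regions (background and the square's interior)
# and positive along the border of the square.
```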
Video generated after performing contour detection on the Sobel filter output
Most protozoans like ciliates will not always show a clear border (because they are mostly transparent). So when we use edge detection to detect the shapes/outlines of moving objects, we get broken edges. In my experience, contour detection works very well for grouping such broken edges and generating a more continuous border.
contours, hierarchy = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
OpenCV has built-in functions to find contours. The best part is that the function is able to find nested contour structures and return a hierarchy. The 'cv2.findContours' function returns this hierarchy along with the contours themselves. We only consider top-level contours (those that don't have a parent). For a contour 'idx', if 'hierarchy[idx]' is -1, it is a top-level contour and does not have a parent. Everything else we ignore.
Video showing bounding boxes drawn over contours
Creating boxes around contours can require a bit of math. Luckily, OpenCV has a convenient function, 'cv2.boundingRect', which returns the top-left corner coordinates, width and height of the bounding rectangle around a given contour. Once we have that, drawing a rectangle on our frame can simply be done using the 'cv2.rectangle' function. We can pass the color and border width to this function when drawing the rectangle.
for idx, contour in enumerate(contours):
    if hierarchy[idx] == -1:
        x, y, w, h = cv2.boundingRect(contour)
        if w * h <= q ** 2:
            continue
        image = cv2.rectangle(image, (x, y), (x + w, y + h), color, 1)
Like I explained earlier, videos taken using a microscope can be messy. There can be a lot going on. We may only be interested in detecting objects of a certain size. This is where I have introduced a parameter called 'q'. This parameter was used for altering settings for the various filters I experimented with. Currently it is only used to filter out bounding rects which are smaller than q^2 in area. You should experiment with different values of 'q', depending on the resolution of your video and the size of the objects you are interested in.
I want to make this approach fast enough to run in real time. It would also be nice to get this ported to Android or another mobile platform. I also plan to experiment with ML-based segmentation techniques for better detection.
The full code, along with a sample video, is available on GitHub: