Aisha

TIL How to Take Hundreds of Images Through Google Colab

I recently embarked on a fascinating journey inspired by Nicholas Renotte's tutorial on building a Facial Recognition Model from a research paper to code. Despite its complexity, the tutorial offered clear guidance, yet I encountered a few hurdles, particularly with getting Google Colab to work with my webcam.

Within the 8-part series, we had to create sample images of our face using the webcam. Initially, I struggled with accessing my webcam in Google Colab. Nicholas was using OpenCV, and that doesn't work in Colab because the notebook runs on a remote machine that can't see your local camera. It wasn't until I discovered the hidden gem of code snippets (the <> icon in Colab's left sidebar) that I realized there was a solution at my fingertips.

Lesson #1: Google Colab offers invaluable resources through its code snippets feature.

Forgive me if you already knew this, but I didn't, so that counts as a lesson.

However, my challenges didn't end there. I needed to capture hundreds of facial images, which proved to be a tedious task with the default camera capture snippet. Enter

Lesson #2: Google Colab's ability to execute JavaScript within the notebook.

This newfound knowledge revolutionized my approach. I customized the camera capture process, replacing the default behavior of closing the camera after a single picture with a button-driven capture that only closes on command. Harnessing JavaScript, I kept the camera connection alive until my task was complete.

All my focus has been on Data Science lately, and not once until now did I realize that you can run JavaScript within the notebook. Honestly, I never had to, but that's not the point.
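If that's new to you too, here's a minimal sketch of the two ways to do it that I ended up leaning on, using the IPython display utilities and google.colab.output that ship with Colab (the toy snippets themselves aren't from the tutorial):

from IPython.display import Javascript, display
from google.colab.output import eval_js

# display(Javascript(...)) runs the script in the cell's output frame,
# i.e. in your browser, which is where your webcam actually lives
display(Javascript('console.log("Hello from the browser side of Colab")'))

# eval_js runs a JavaScript expression in the browser and returns the result to Python
print(eval_js('navigator.userAgent'))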

Yet, even armed with JavaScript, I encountered my final obstacle: integrating Python functions with JavaScript callbacks. It took some troubleshooting and a return to the Google Colab documentation to pinpoint the missing link. With

Lesson #3: Orchestrating a seamless interaction between JavaScript and Python.

Turns out, not only does the JavaScript have to call the Python function:

// Send the base64-encoded image data to Python
const is_saved = await google.colab.kernel.invokeFunction('notebook.save_image', [imageData, full_file_path], {});

But you also have to register the callback on the Python side.

output.register_callback('notebook.save_image', save_image)

I mean, like, duh. Be patient with me, I'm being vulnerable here.
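Putting those two halves together, here's a stripped-down sketch of the round trip; the function name and arguments are placeholders I made up, not anything from the tutorial:

from google.colab import output
from IPython.display import Javascript, display

# Python side: define a function and register it under a name
# that the JavaScript side can look up
def say_hello(name):
  return f'Hello, {name}!'

output.register_callback('notebook.say_hello', say_hello)

# JavaScript side: invoke the registered function by its name;
# the awaited result carries the Python return value in result.data
display(Javascript('''
  (async () => {
    const result = await google.colab.kernel.invokeFunction(
        'notebook.say_hello', ['Aisha'], {});
    console.log(result.data);
  })();
'''))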

The culmination of these lessons resulted in a robust workflow, enabling the capture and processing of over 400 facial images for model training. Witnessing the model in action was immensely gratifying, and I'm eager to share my journey with others, hoping to spare them similar trials. For those interested in exploring my progress or delving into Nicholas Renotte's invaluable tutorials, I've shared my work on GitHub and encourage following Nicholas for further inspiration.

Webcam in Notebook

Check out my progress on GitHub: GitHub
And don't forget to follow Nicholas Renotte: Nicholas Renotte's YouTube Channel

Oh, and here are my code snippets.

crops the images to a centered square

import cv2

def crop_square(img, size, interpolation=cv2.INTER_AREA):
    h, w = img.shape[:2]
    min_size = min(h, w)

    # Centralize and crop
    center_x, center_y = w // 2, h // 2
    half_size = min_size // 2
    crop_img = img[center_y - half_size:center_y + half_size, center_x - half_size:center_x + half_size]
    resized = cv2.resize(crop_img, (size, size), interpolation=interpolation)

    return resized
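If you want to sanity-check the crop on its own first, something like this works (the file names are just examples):

import cv2

# any existing image will do; the names here are just examples
img = cv2.imread('sample.jpg')
thumb = crop_square(img, 250)   # 250x250, the same size save_image uses below
cv2.imwrite('sample_250.jpg', thumb)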

saves the images

import base64

import cv2
import numpy as np
from google.colab import output

def save_image(image_data, full_file_path):
  try:
    # Decode the base64-encoded image data
    binary = base64.b64decode(image_data)

    # Load the image directly as a NumPy array
    image_array = cv2.imdecode(np.frombuffer(binary, dtype=np.uint8), cv2.IMREAD_COLOR)

    # Process the image
    processed_image = crop_square(image_array, 250)

    # Save the processed image to a file
    cv2.imwrite(full_file_path, processed_image)
    return True
  except Exception as e:
    print(f'Error saving image: {e}')
    return False

output.register_callback('notebook.save_image', save_image)

opens the webcam for some clickety click action

import os
import uuid

from IPython.display import Javascript, display

def take_photo(dir, generate_file_name=True, quality=0.8, verification_callback=None, capture_btn_name='Capture'):

  if generate_file_name:
    # name the new file with unique identifier
    full_file_path = os.path.join(dir, f"{uuid.uuid1()}.jpg")
  else:
    full_file_path = dir

  # create javascript function
  js = Javascript('''
     async function takePhoto(full_file_path, quality, verification_callback, capture_btn_name) {

      // create shell with div and capture button
      const div = document.createElement('div');
      div.style.padding = '10px';

      // create capture button
      const capture = document.createElement('button');
      capture.textContent = capture_btn_name;
      capture.style.marginRight = '10px';
      div.appendChild(capture);

      // create close button
      const close = document.createElement('button')
      close.textContent = 'Close';
      div.appendChild(close);

      // create video space
      const video = document.createElement('video');
      video.style.display = 'block';

      // add video block to div
      div.appendChild(video);

      // request access to the webcam stream
      const stream = await navigator.mediaDevices.getUserMedia({video: true});

      // add the div to the body
      document.body.appendChild(div);

      // set the source and play the media
      video.srcObject = stream;
      await video.play();

      // resize the output to fit the video element.
      google.colab.output.setIframeHeight(document.documentElement.scrollHeight, true);

      capture.onclick = async () => {
          // create a canvas element to draw the video frame
          const canvas = document.createElement('canvas');

          // set the canvas width and height to match the video dimensions
          canvas.width = video.videoWidth;
          canvas.height = video.videoHeight;

          // draw the current video frame onto the canvas
          canvas.getContext('2d').drawImage(video, 0, 0);

          // Convert the canvas content to a base64-encoded JPEG image data URL
          // with the specified quality
          const dataUrl = canvas.toDataURL('image/jpeg', quality);

          // Split the data URL to extract the base64-encoded image data
          const imageData = dataUrl.split(',')[1];

          // Send the base64-encoded image data to Python
          const is_saved = await google.colab.kernel.invokeFunction('notebook.save_image', [imageData, full_file_path], {});

          // If the verification callback is provided, invoke it
          if (is_saved.data && verification_callback) {
            const result = await google.colab.kernel.invokeFunction(verification_callback, [], {});
            console.log(result);
          }

          return is_saved.data;
      };

      close.onclick = () => {
          // stop the video stream to release resources
          stream.getVideoTracks()[0].stop();

          // remove the video element and its containing div from the DOM
          div.remove();
      };
    }
    ''')

  # create and display the javascript function
  display(js)

  # call the javascript function: quality goes through as a number and the
  # callback name as a string (or null when no callback was provided)
  callback_arg = f'"{verification_callback}"' if verification_callback else 'null'
  display(Javascript(f'takePhoto("{full_file_path}", {quality}, {callback_arg}, "{capture_btn_name}")'))
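And here's roughly how I call it; the directory is just an example path:

import os

# example target directory for the captured frames
anchor_dir = 'data/anchor'
os.makedirs(anchor_dir, exist_ok=True)

# opens the webcam widget with Capture and Close buttons; clicking Capture
# sends the current frame to save_image, which crops it to 250x250 and
# writes it to a uuid-named .jpg inside anchor_dir
take_photo(anchor_dir)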
