Alex Retana

Posted on Jul 10 • Edited on Jul 17

Comparing 3 Ways to Train a Face Mask Classifier: TensorFlow, SageMaker Canvas, and Rekognition

#computervision #aws #tensorflow #ai

🛠️ Introduction

A few years ago, I built a simple face mask image classifier using Keras and TensorFlow, trained locally on my own hardware. Recently, I decided to revisit this project for a few reasons:

To see how easy (or hard) it would be to rerun my old Jupyter notebook from 4–5 years ago.
To try running custom training jobs inside Amazon SageMaker Studio, instead of relying on my own machine.
And while I was at it, I wanted to compare my custom-trained model against other ways of building and deploying models on AWS, including low-code/no-code tools and out-of-the-box computer vision APIs.

Here are the three approaches I tested:

✅ Classic deep learning: Running my original Jupyter notebook inside a SageMaker Studio JupyterLab instance, retraining the model with TensorFlow, then hosting it for a front-end demo using TensorFlow.js + S3.

⚙️ Low-code/no-code: Using AWS SageMaker Canvas, which lets you upload images and build models through a point-and-click UI, without writing code.

🧠 Fully managed service: Using Amazon Rekognition Custom Labels to train and host a classifier entirely behind Rekognition’s API, with no model code to write.

For each method, I wanted to evaluate:

Ease of training/setup
Options for deployment (can it run in the frontend? backend only? real-time or batch?)
AWS cost
Computational cost & latency (how fast can it return predictions?)

In the rest of this article, I’ll walk through each method, compare their results, and share what I learned along the way.

📦 Method 1: Classic Deep Learning (TensorFlow + Jupyter)

📜 Revisiting the Old Project

The starting point for this method is my older project:

👉 alexretana/facemaskclassifier on GitHub

This was a small computer vision project I created a few years ago to explore transfer learning and pretrained models. The goal was to build a pipeline that could detect faces in an image and classify whether each face was wearing a mask. To do this, I combined a YOLOv3 model (pretrained to detect faces) with a custom classifier trained to recognize masks.

The workflow was straightforward: given an input image, the YOLOv3 model would identify and draw bounding boxes around the faces. Each detected face would then be cropped and passed to the mask classifier, which predicted “mask” or “no mask” along with a confidence score. Finally, the pipeline overlaid labels on the image to show the results.
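
To make the flow concrete, here is a minimal, dependency-free sketch of the detect → crop → classify loop. The names Box, detect_faces, and classify_face are hypothetical stand-ins, not the functions from the repo. In the actual notebooks the detector is the YOLOv3 model, and the image would be an array loaded with OpenCV rather than a plain list of rows.

```python
from typing import Callable, List, NamedTuple, Sequence, Tuple


class Box(NamedTuple):
    """Axis-aligned face box in pixel coordinates (hypothetical helper type)."""
    x: int
    y: int
    w: int
    h: int


Image = Sequence[Sequence[int]]  # rows of pixels; stands in for an OpenCV array


def crop(image: Image, box: Box) -> List[List[int]]:
    """Cut the region covered by `box` out of the image."""
    return [list(row[box.x:box.x + box.w]) for row in image[box.y:box.y + box.h]]


def run_pipeline(
    image: Image,
    detect_faces: Callable[[Image], List[Box]],
    classify_face: Callable[[Image], Tuple[str, float]],
) -> List[Tuple[Box, str, float]]:
    """Detect faces, classify each crop, and return (box, label, confidence)."""
    results = []
    for box in detect_faces(image):
        label, confidence = classify_face(crop(image, box))
        results.append((box, label, confidence))
    return results
```

Passing the detector and classifier in as callables keeps the two models swappable, which is handy when comparing a retrained classifier against an old one.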

I learned a lot during this process, especially about loading and fine-tuning pretrained models, feature extraction, and how to stitch multiple models together into a single pipeline.

Special thanks to PyImageSearch by Adrian Rosebrock. Many tutorials there helped me build this!

If you’re curious, the repo contains several notebooks:

PlayAroundWithPretrainModels.ipynb – experimenting with pretrained models
TransferLearning-FeatureExtraction.ipynb – logistic regression on extracted features
TransferLearning-FineTurning.ipynb – fine-tuning pretrained model layers
predict.ipynb – final pipeline: detection → cropping → classification → annotated output

Next, I'll describe how I retrained and ran this project inside SageMaker Studio instead of on my local machine.

⚙️ Running in SageMaker Studio

With my old notebooks ready, I wanted to see how easy it would be to train the same model using AWS SageMaker Studio, instead of my local machine.

🛠 If you haven’t set up SageMaker Studio yet, here’s AWS’s quick start guide — it walks through creating the Studio environment in a few clicks.

Once my SageMaker Studio was provisioned, the workflow was surprisingly smooth. From the Studio home dashboard, it’s straightforward to launch new compute instances to run Jupyter notebooks or other tools. I started by spinning up an ml.t3.medium instance, the cheapest option at the time of writing, just to get started.

The UI makes it easy to open a terminal or create a new notebook. I opened the terminal to clone my old project repo from GitHub. One thing I quickly realized: my original project didn’t include a requirements.txt file (lesson learned for the future!). Thankfully, SageMaker’s default environments already come with many common libraries pre-installed, including:

pandas
numpy
tensorflow / keras
scikit-learn

The only extra dependencies I had to install were:

imutils
opencv-python

For OpenCV to work properly, it also needed an additional system package:

sudo apt install -y libgl1
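
Since the repo had no requirements.txt, one way to spot gaps before running a notebook is to check which modules are importable. This is a small sketch, not something from the original project. Note that the import names differ from the package names in two cases: opencv-python is imported as cv2 and scikit-learn as sklearn.

```python
import importlib.util
from typing import Iterable, List


def find_missing(modules: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


# Example call with the import names used by the notebooks:
# find_missing(["pandas", "numpy", "tensorflow", "sklearn", "imutils", "cv2"])
```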

The biggest hiccup I ran into was around dataset preparation: my old notebooks didn’t include clear instructions or scripts to recreate the train/validation/test splits. I had to figure that part out again before training could actually run. The dataset itself has over 10,000 images (but is thankfully only around 20 MB). At first, I tried simply dragging and dropping the dataset into the JupyterLab web interface, but this turned out to be unreliable: not every file transferred, and it took a long time.

From reading the docs and best practices, a better solution (and a common pattern for larger file transfers) was to:

Upload the dataset to an S3 bucket
Download it from S3 to the notebook instance using the terminal

Uploading to S3 took about 20 minutes, but copying it down to the notebook instance was much faster, probably under a minute. This workflow felt much cleaner and avoided partial transfers.
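
As for the missing split script, a reproducible version is worth keeping alongside the data. The sketch below shuffles file names with a fixed seed. The 80/10/10 ratios and the seed are assumptions for illustration, not the values from the original project.

```python
import random
from typing import Dict, List, Sequence


def split_dataset(
    files: Sequence[str],
    val_fraction: float = 0.1,
    test_fraction: float = 0.1,
    seed: int = 42,
) -> Dict[str, List[str]]:
    """Shuffle files deterministically and cut them into train/val/test lists."""
    shuffled = sorted(files)  # sort first so the result does not depend on input order
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    n_test = int(len(shuffled) * test_fraction)
    return {
        "val": shuffled[:n_val],
        "test": shuffled[n_val:n_val + n_test],
        "train": shuffled[n_val + n_test:],
    }
```

This version does not stratify by class. If the mask and no-mask counts are uneven, running it once per class and merging the results keeps the class ratio the same in every split.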

Aside from that, the first notebook TransferLearning-FeatureExtraction.ipynb ran without any code changes. But I did run into another practical issue: the ml.t3.medium instance didn’t have enough RAM, and the process kept running out of memory, which would crash the kernel and restart the instance.

The fix was simple: I shut down the notebook instance and upgraded it to an ml.m5d.2xlarge (32 GiB of RAM, roughly what my local dev machine used to have). After restarting, everything picked up right where it left off. There was no need to re-clone the repo or re-download the images, though the Python packages did have to be reinstalled.

After training my model in the new SageMaker environment, I wanted to compare the training curves to those from my earlier runs a few years ago.

In this chart, you can see there are two graphs for each year. That’s because the transfer learning process includes two rounds of training: first training only the network head, and then fine-tuning the entire model after unfreezing more layers.
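
The two rounds boil down to toggling which layers are trainable between two training runs. The helper below is a framework-agnostic sketch that works on any objects exposing a trainable attribute, as Keras layers do. The function name and the number of unfrozen layers are hypothetical, not taken from the notebook.

```python
from typing import Iterable


def set_trainable(layers: Iterable, unfreeze_last: int = 0) -> int:
    """Freeze all layers except the last `unfreeze_last`; return how many are trainable."""
    layers = list(layers)
    first_trainable = len(layers) - unfreeze_last
    for index, layer in enumerate(layers):
        layer.trainable = index >= first_trainable
    return sum(1 for layer in layers if layer.trainable)


# Round 1: set_trainable(base_model.layers)                   -> base frozen, train only the new head
# Round 2: set_trainable(base_model.layers, unfreeze_last=4)  -> fine-tune the top of the base too
# In Keras, re-compile the model after changing `trainable` so the change takes effect.
```

The second round is usually run with a lower learning rate than the first, so the pretrained weights are nudged rather than overwritten.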

While the overall accuracy results are similar, I noticed that the training loss and training accuracy curves are much noisier and more sporadic in the recent run.

From what I’ve read, improvements in data augmentation, optimizer updates, and weight initialization defaults in frameworks like Keras and TensorFlow over the last few years can produce this kind of noisier but potentially more robust training process. If anyone has experience or thoughts on why this might happen, I’d love to hear your perspective in the comments!

⚙️ Method 2: Low-Code / No-Code with SageMaker Canvas

For the second approach, I wanted to try AWS SageMaker Canvas — a no-code tool that lets you build machine learning models through a web UI, without writing a single line of code.

The first step was to prepare my dataset in a format Canvas could use. To do this, I reorganized the images into labeled folders (e.g., mask/ and no_mask/). When you import data in Canvas, it can automatically use the folder names as class labels. I then uploaded this new dataset structure into an S3 bucket.
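
Reorganizing images into labeled folders is easy to script. This sketch assumes you already have (file path, label) pairs from wherever your labels live. The function name and folder names are illustrative.

```python
import shutil
from pathlib import Path
from typing import Iterable, Tuple


def build_label_folders(items: Iterable[Tuple[str, str]], out_dir: str) -> None:
    """Copy each image into out_dir/<label>/ so folder names can act as class labels."""
    for src, label in items:
        target_dir = Path(out_dir) / label
        target_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, target_dir / Path(src).name)
```

Files are copied rather than moved so the original layout stays intact. Two source files with the same name and label would overwrite each other, so file names need to be unique within a class.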

In Canvas, creating the dataset is straightforward: you create a new dataset and point it at your S3 bucket location. Once imported, you can see the list of images and labels Canvas detected.

🏗 Training the model

I kicked off a standard training job (since the quick mode couldn't handle the size of my dataset). Canvas estimated it might take 3–5 hours, but in reality it completed in under 2 hours — maybe even less than one.

The best part? It was truly one-click training: Canvas doesn’t ask you to choose architectures or tune hyperparameters. Instead, it quietly evaluates multiple candidate models behind the scenes, though it doesn’t disclose exactly which models it tried or what metrics guided the selection.

📊 Model evaluation & explainability

For evaluation, Canvas automatically showed me per-label accuracy so I could see which class performed better, along with actual examples of images it got right or wrong. It also generated heatmaps (using Class Activation Maps) that highlighted where the model focused when making decisions, and included a confusion matrix to visualize where it confused “masked” vs “unmasked.” All of this appeared right after training finished, without needing to write any visualization code.
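
For reference, the same kind of numbers can be reproduced from any list of true and predicted labels. This is a plain-Python sketch, not how Canvas computes them internally, and it assumes "per-label accuracy" means the share of each true label's images that were predicted correctly.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[Tuple[str, str], int]:
    """Count how often each (true label, predicted label) pair occurs."""
    return dict(Counter(zip(y_true, y_pred)))


def per_label_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, float]:
    """Fraction of each true label's samples that were predicted correctly."""
    totals = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / totals[label] for label in totals}
```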

⚡ Making predictions

When it came time to test the model, Canvas offered two options: upload a single image to get an instant prediction, or run a batch prediction over multiple images at once. I tried both, but unfortunately the outputs either came back empty or had “FAILED” values in the CSV results, so I decided to skip ahead and deploy the model as an inference endpoint instead.

With just a few clicks, Canvas can deploy your trained model to an endpoint you can call via API, and I did that so I could finish my evaluation outside of the Canvas UI.

Starting from the evaluation code in my fine-tuning notebook, I adapted a similar function to measure the accuracy of this model's predictions through the endpoint.
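
Here is a sketch of what such an evaluation loop can look like. invoke_endpoint is the standard boto3 SageMaker runtime call, but the content type and especially the response shape (a JSON object with a predicted_label field) are assumptions for illustration. Check what your deployed Canvas model actually returns before reusing this.

```python
import json
from typing import Iterable, Tuple


def predict_label(runtime_client, endpoint_name: str, image_bytes: bytes) -> str:
    """Send one image to the endpoint and return the predicted label."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/x-image",  # assumed content type
        Body=image_bytes,
    )
    payload = json.loads(response["Body"].read())
    return payload["predicted_label"]  # assumed response field


def evaluate(runtime_client, endpoint_name: str, samples: Iterable[Tuple[bytes, str]]) -> float:
    """Return accuracy over (image_bytes, true_label) pairs."""
    samples = list(samples)
    hits = sum(
        predict_label(runtime_client, endpoint_name, image) == label
        for image, label in samples
    )
    return hits / len(samples)
```

With real credentials the client would be boto3.client("sagemaker-runtime"). Each call is a network round trip, which is one reason an endpoint-based evaluation can run slower than in-process inference.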

The results were surprisingly good: Canvas’s model ended up with slightly better accuracy than my manually trained TensorFlow model. Batch processing did take a bit longer overall, even though both models ran inference on the same instance type (ml.m5d.2xlarge), so the comparison is fair in terms of hardware. Here’s the classification report showing the final accuracy and per-class metrics:

In the end, SageMaker Canvas impressed me: it handled training, visualization, and deployment with almost no code. While I did run into some quirks with the batch prediction UI, the overall experience was very beginner-friendly — and the final model quality was competitive with a hand-crafted TensorFlow pipeline (granted my model is 5 years old).

🧠 Method 3: Fully Managed Pre-Trained Service (Rekognition Custom Labels)

For the last approach, I wanted to explore Amazon Rekognition’s Custom Labels feature, which lets you train your own image classifier on a custom dataset. It still requires no code, but it is built directly into Rekognition’s console rather than SageMaker. The interface makes the steps for developing your model straightforward and streamlined.

The setup was familiar: I uploaded my dataset to an S3 bucket, using labeled folders (masked/ and unmasked/) so Rekognition could automatically detect the classes. After confirming the dataset, training was supposed to be as simple as clicking a button and waiting for it to finish.

However, the training failed on my first attempt. After digging into the documentation, I realized Rekognition requires all images in the training and test datasets to meet a minimum resolution. My original dataset included images smaller than that threshold. To fix this, I wrote a quick script to resize all images to an acceptable resolution, re-uploaded the updated dataset to S3, and restarted the training job.
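
The resize logic itself is small. The sketch below computes a new size that lifts the shorter side to a minimum while keeping the aspect ratio. MIN_SIDE = 64 is an assumption, so check the current Rekognition Custom Labels limits for the real threshold. The file helper relies on Pillow, which is also an assumption about tooling rather than the script actually used.

```python
from typing import Tuple

MIN_SIDE = 64  # assumed minimum; confirm against the Rekognition Custom Labels limits


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoiding floating-point rounding surprises."""
    return -(-a // b)


def target_size(width: int, height: int, min_side: int = MIN_SIDE) -> Tuple[int, int]:
    """Scale (width, height) up so the shorter side is at least min_side."""
    shorter = min(width, height)
    if shorter >= min_side:
        return width, height  # already large enough, leave untouched
    return ceil_div(width * min_side, shorter), ceil_div(height * min_side, shorter)


def resize_file(src: str, dst: str, min_side: int = MIN_SIDE) -> None:
    """Resize one image on disk with Pillow (imported lazily; must be installed)."""
    from PIL import Image

    with Image.open(src) as img:
        new_size = target_size(img.width, img.height, min_side)
        result = img.resize(new_size) if new_size != img.size else img.copy()
    result.save(dst)
```

Upscaling tiny crops adds no real detail, so this only satisfies the size check. It should not be expected to improve accuracy on its own.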

In hindsight, this might explain why the prediction feature in Canvas also struggled with the same dataset, although it’s interesting that the inference endpoint created by Canvas worked fine with those smaller images.

After the training completed (which took about an hour), the results ended up being pretty comparable to what I got with SageMaker Canvas, and noticeably better than my old YOLOv3-based code.

One important limitation, though: unlike Canvas, Rekognition Custom Labels doesn’t let you register and download the raw model artifact. Instead, you’re fully dependent on calling Rekognition’s API for inference. That makes the solution less portable if you ever want to run the model outside AWS. On the plus side, this also means it’s incredibly quick to get started: after training finishes, you can deploy and start making predictions right away. Overall, this makes Rekognition Custom Labels a strong option for proof-of-concept projects or when you need to get something running with minimal setup.

💰 Cost Analysis

While testing each method, I kept track of the costs I saw in my AWS billing dashboard. Running everything manually through the Jupyter notebook (inside SageMaker Studio) ended up costing me less than $12 total — even after upgrading to a more expensive instance for training.

In contrast, SageMaker Canvas cost quite a bit more: about $49. To be fair, a lot of that cost probably came from my repeated attempts to run batch predictions, which ultimately didn’t work but still counted as billed time. If I had to estimate what it would have cost had things run smoothly, I'd guess $10-$20.

Rekognition Custom Labels was by far the cheapest in my experiment: I was only charged $7.90. It’s worth noting, though, that this only covers training costs — not the cost of hosting the model or running real-time inference in production. I’m also curious how well Rekognition pricing scales over time as usage increases.

✅ Final Review & Comparison

Here’s how the three approaches stack up:

Method	Control & Flexibility	Ease of Use	Cost in Test	Portability	Notes
Classic Jupyter + TensorFlow	⭐⭐⭐⭐⭐	⭐⭐	~$12	Can export / host anywhere	Most setup & coding required; fully customizable
SageMaker Canvas	⭐⭐⭐	⭐⭐⭐⭐	~$49 (probably ~$10-$20 without the failed batch runs)	Can export model artifact	Great built-in visualizations; had issues with batch predictions; higher cost
Rekognition Custom Labels	⭐	⭐⭐⭐⭐⭐	~$8	Must use Rekognition API	Fastest setup; lowest upfront cost; can't download model; great for proof of concept

In the end, each option had its place:

If you want full control and portability, running your own TensorFlow notebooks (even inside SageMaker Studio) still feels best.

If you prefer no-code training and easy visualization tools, Canvas makes it remarkably simple to build, analyze, and deploy models, though at a higher cost and with occasional quirks.

And if you just need to get something working fast, Rekognition Custom Labels is incredibly quick to set up and cheap to run — as long as you’re okay relying on AWS’s API for hosting.

Overall, revisiting this project showed me that today’s cloud tools can save a huge amount of time — but there are still trade-offs in cost, control, and portability. In the next article, I’ll look at deploying these models and providing a usable live demo so you can see them in action.
I’d love to hear if you’ve tried similar experiments, or what your experience has been — drop a comment below!
