Sofia Tretiak

Real-Time Emotion and Fatigue Detection System with ResNet18 and MediaPipe

This project was developed as part of the Solvro Summer Challenge at Wrocław University of Science and Technology. The goal was to design a real-time system capable of detecting human emotions and early signs of fatigue or stress using only a standard webcam and local computation.

The project combines two independent components:
1. emotion recognition using a convolutional neural network,
2. fatigue detection based on facial landmarks and behavioral cues.

System Overview

The application runs from a single entry point (main.py) and offers two modes of operation through a simple command-line interface:
• Emotion detection mode
• Stress and fatigue detection mode

Depending on the user’s choice, the system initializes the appropriate pipeline and logging configuration.
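
For illustration, here is a minimal sketch of what such an entry point can look like. The flag name and the two run_* functions are placeholders, not the actual project code:

```python
# main.py – sketch of the mode-selection entry point.
# The two run_* functions are placeholders; in the real project they
# would start the corresponding detection loop.
import argparse

def run_emotion_detection():
    print("Starting emotion recognition pipeline...")

def run_fatigue_detection():
    print("Starting fatigue/stress detection pipeline...")

def main():
    parser = argparse.ArgumentParser(
        description="Real-time emotion and fatigue detection")
    parser.add_argument("--mode", choices=["emotion", "fatigue"], required=True,
                        help="which detection pipeline to run")
    args = parser.parse_args()
    if args.mode == "emotion":
        run_emotion_detection()
    else:
        run_fatigue_detection()

if __name__ == "__main__":
    main()
```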

Emotion Recognition Module

For emotion recognition, I trained a ResNet18 convolutional neural network using the FER-2013 (Facial Expression Recognition 2013) dataset.

Training setup (a code sketch follows the list):
• Input: RGB images resized to 224×224
• Dataset: FER-2013 (7 emotion classes)
• Optimizer: Adam
• Loss function: CrossEntropyLoss
• Batch size: 64
• Epochs: 20
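
The sketch below mirrors this setup in PyTorch. The dataset path, the grayscale-to-RGB conversion, and the use of ImageNet-pretrained weights are my assumptions rather than the exact training script:

```python
# Training sketch matching the setup above (PyTorch + torchvision).
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

# FER-2013 images are 48x48 grayscale; replicate channels to get the
# 224x224 RGB input listed above (assumption about the preprocessing).
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.Grayscale(num_output_channels=3),
    transforms.ToTensor(),
])

train_set = datasets.ImageFolder("fer2013/train", transform=transform)  # assumed path
train_loader = DataLoader(train_set, batch_size=64, shuffle=True)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)  # pretraining assumed
model.fc = nn.Linear(model.fc.in_features, 7)  # 7 FER-2013 emotion classes
model.to(device)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters())

for epoch in range(20):
    model.train()
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()

torch.save(model.state_dict(), "emotion_resnet18.pth")
```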

The final test accuracy reached approximately 65%, which is consistent with typical results on FER-2013 without extensive fine-tuning.

One notable limitation is the poor recognition of the “Disgust” class. This is a known issue with FER-2013, as this emotion is underrepresented and visually ambiguous compared to others such as Happy or Surprise.

After training, the model weights were saved as a .pth file (PyTorch's native serialization format) and integrated into the real-time detection pipeline.
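
Loading the weights back for inference is then a matter of rebuilding the same architecture and restoring the state dict. A single-frame sketch (the file name and class order are assumptions; the classes are listed alphabetically, as torchvision's ImageFolder would order them):

```python
# Single-frame inference sketch; the real pipeline loops over webcam frames.
import cv2
import torch
import torch.nn as nn
from torchvision import models, transforms

EMOTIONS = ["Angry", "Disgust", "Fear", "Happy", "Neutral", "Sad", "Surprise"]

model = models.resnet18()
model.fc = nn.Linear(model.fc.in_features, 7)
model.load_state_dict(torch.load("emotion_resnet18.pth", map_location="cpu"))
model.eval()

preprocess = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize((224, 224)),
    transforms.Grayscale(num_output_channels=3),  # match the training preprocessing
    transforms.ToTensor(),
])

cap = cv2.VideoCapture(0)
ret, frame = cap.read()
if ret:
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # OpenCV captures BGR
    tensor = preprocess(rgb).unsqueeze(0)         # add batch dimension
    with torch.no_grad():
        probs = torch.softmax(model(tensor), dim=1)
    print(EMOTIONS[probs.argmax().item()])
cap.release()
```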

Fatigue and Stress Detection Module

The fatigue detection component is based on facial landmark analysis using MediaPipe Face Mesh, which provides 468 facial landmarks in real time.

The system focuses on two key physiological indicators (see the computation sketch after this list):
• Eye Aspect Ratio (EAR)
Used to detect eye closure and blinking frequency, which correlate with fatigue.
• Mouth Aspect Ratio (MAR)
Used to detect yawning and mouth tension, common indicators of tiredness or discomfort.
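
For reference, EAR relates the eye's vertical opening to its horizontal width and drops sharply when the eye closes; MAR does the same for the mouth. A minimal sketch is below. The landmark indices are commonly used sets for the left eye and inner lips; the exact indices and thresholds in the project may differ:

```python
# EAR/MAR computation sketch on MediaPipe Face Mesh landmarks.
# Indices are commonly used choices, not necessarily the project's own.
import math
import cv2
import mediapipe as mp

LEFT_EYE = [33, 160, 158, 133, 153, 144]  # p1..p6; horizontal corners are 33 and 133

def dist(a, b):
    return math.hypot(a.x - b.x, a.y - b.y)

def eye_aspect_ratio(lm):
    p1, p2, p3, p4, p5, p6 = (lm[i] for i in LEFT_EYE)
    # vertical openings over twice the horizontal width
    return (dist(p2, p6) + dist(p3, p5)) / (2.0 * dist(p1, p4))

def mouth_aspect_ratio(lm):
    # inner-lip opening (13-14) over mouth width (78-308)
    return dist(lm[13], lm[14]) / dist(lm[78], lm[308])

face_mesh = mp.solutions.face_mesh.FaceMesh(max_num_faces=1)
cap = cv2.VideoCapture(0)
ret, frame = cap.read()
if ret:
    results = face_mesh.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_face_landmarks:
        lm = results.multi_face_landmarks[0].landmark
        print(f"EAR: {eye_aspect_ratio(lm):.3f}  MAR: {mouth_aspect_ratio(lm):.3f}")
cap.release()
```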

Unlike the emotion mode, no visual indicators are shown on the camera feed. All measurements are processed in the background and continuously logged for later analysis.

Logging and Analysis

Both modules share a unified logging system.
For emotions, each predicted class is timestamped and stored.
For fatigue detection, logs include EAR and MAR values, blink and yawn events, and a cumulative fatigue score.
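
Such a setup can be as simple as Python's standard logging module with a timestamped format. The configuration and field names below are illustrative, not the project's actual log schema:

```python
# Illustrative logging setup; field names are examples, not the real schema.
import logging

logging.basicConfig(
    filename="session.log",
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)
log = logging.getLogger("detector")

log.info("emotion=Happy confidence=0.87")
log.info("EAR=0.18 MAR=0.42 blink=True yawn=False fatigue_score=0.35")
```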

This design allows offline analysis of user behavior over time without requiring a graphical dashboard.

Design Decisions
• Local processing only
• Modular architecture – emotion and fatigue pipelines are fully separable
• CLI-based interface

Collaboration
The project was developed collaboratively:
• My contribution: emotion recognition pipeline, model training, dataset handling, and model integration
• Second contributor: fatigue detection logic based on facial landmarks and real-time analysis

https://github.com/TSofi/face-emotion-recognition
https://www.linkedin.com/in/sofia-tretiak

This project is a simple, working proof of concept that shows how emotion recognition and fatigue detection can be combined using computer vision and deep learning. The focus was not on squeezing out the highest possible accuracy, but on building something that actually works end to end.
