Syahrul Al-Rasyid

Posted on Sep 2

TensorRT Engine Model Fixes Implementation

This document describes the implementation of fixes for two critical issues with TensorRT (.engine) YOLO models:

Class names showing as indices instead of actual names
Segmentation masks not returning at original resolution

Problems Addressed

Issue 1: Class Names as Indices

When using TensorRT engine models (.engine files), the model.names attribute may not be properly populated, causing class predictions to show as indices (e.g., "0", "1", "2") instead of meaningful names like "food_label", "invoice", "qr_code".

Issue 2: Mask Resolution Mismatch

Segmentation masks from YOLO models are typically returned at the model's input resolution (e.g., 640x640), not at the original image resolution. This causes mask coordinates to be misaligned with the actual objects in the original frame.

Solution Overview

1. YAML-Based Class Name Loading (`src/utils/config_loader.py`)

Features:

Loads class names from fastrtc-test/config/data.yml
Creates mapping from class indices (0-112) to class names
Provides fallback mechanisms for missing or invalid configurations
Validates class mapping completeness

Key Functions:

load_class_names_from_yaml(config_path: str) -> Dict[int, str]
get_class_name_safe(class_mapping: Dict[int, str], class_id: int, fallback_prefix: str) -> str
load_class_names_with_fallback(config_path: str, fallback_names: List[str]) -> Dict[int, str]

2. Mask Resolution Utilities (`src/utils/mask_utils.py`)

Features:

Resizes masks from model input resolution to original frame resolution
Handles various mask formats (binary, float, different dimensions)
Optimized for performance with proper interpolation methods
Validates mask dimensions and properties

Key Functions:

resize_mask_to_original(mask: np.ndarray, original_shape: Tuple[int, int]) -> np.ndarray
calculate_mask_area(mask: np.ndarray) -> int
process_yolo_masks(masks_data: Union[np.ndarray, List[np.ndarray]], original_shape: Tuple[int, int]) -> List[np.ndarray]

3. Detector Integration (`src/core/detector.py`)

Enhanced Constructor:

def __init__(
    self,
    # ... existing parameters ...
    class_config_path: str = "fastrtc-test/config/data.yml",
):

Key Changes:

Class Name Resolution:
- Loads class mapping from YAML during initialization
- Prioritizes config-based names over model.names
- Provides safe fallback for unknown class indices
Mask Processing:
- Captures original frame dimensions
- Resizes masks to original resolution during post-processing
- Maintains mask accuracy and alignment

Configuration File Format

The fastrtc-test/config/data.yml file should contain:

nc: 113  # Number of classes
names: [
  'basil_chicken_alfredo_linguine_l',
  'basil_chicken_alfredo_linguine_m',
  # ... more food items ...
  'food_label',     # Index 44
  # ... more food items ...
  'invoice',        # Index 55
  # ... more food items ...
  'qr_code',        # Index 81
  # ... remaining classes ...
]

Usage Examples

Basic Usage with TensorRT Model

from src.core.detector import YOLOSegmentationDetector

# Initialize detector with TensorRT engine
detector = YOLOSegmentationDetector(
    model_path="models/best.engine",
    class_config_path="fastrtc-test/config/data.yml"
)

# Process frame - will now show proper class names and original resolution masks
annotated_frame, detection_info = detector.detect_and_segment(frame)

# Check detection results
for detection in detection_info["detections"]:
    print(f"Class: {detection['class']}")  # Now shows "food_label" instead of "44"
    print(f"Mask shape: {detection['mask_array'].shape}")  # Now matches frame resolution

Custom Configuration Path

detector = YOLOSegmentationDetector(
    model_path="models/custom_model.engine",
    class_config_path="/path/to/custom/data.yml"
)

Performance Considerations

Class Name Loading

Impact: Minimal - loaded once during initialization
Memory: ~5KB for 113 class names
Fallback: Automatic fallback to model.names if config fails

Mask Resizing

Method: OpenCV bilinear interpolation (optimized)
Performance: ~1-2ms per mask (depends on resolution difference)
Memory: Temporary allocation during resize operation
Optimization: Only resizes if resolution differs

Testing and Validation

Run the comprehensive test suite:

cd fastrtc-test
python test_tensorrt_fixes.py

Test Coverage:

✅ YAML file structure validation
✅ Class name loading and mapping
✅ Mask resizing for multiple resolutions
✅ Detector integration with config
✅ Fallback behavior testing

Error Handling

Class Name Loading Errors

# Handles missing config file
# Handles malformed YAML
# Handles missing required fields
# Provides meaningful error messages

Mask Processing Errors

# Handles empty or invalid masks
# Handles dimension mismatches
# Provides fallback to model resolution
# Logs warnings for debugging

Debug Logging

Enable detailed logging to see the fixes in action:

import logging
logging.basicConfig(level=logging.DEBUG)

# Will show messages like:
# "🏷️ Using config class name: 44 -> food_label"
# "📐 Resized mask from (640, 640) to (1080, 1920), area: 150000"

Compatibility

Model Formats

✅ TensorRT Engine (.engine) - Primary target
✅ PyTorch (.pt) - Backwards compatible
✅ ONNX (.onnx) - Should work

Resolution Support

✅ HD (1280x720)
✅ Full HD (1920x1080)
✅ 4K (3840x2160)
✅ Custom resolutions

YOLO Versions

✅ YOLOv8 Segmentation
✅ YOLOv11 Segmentation
✅ Custom YOLO models (with proper config)

Troubleshooting

Common Issues

"Config file not found"

   # Ensure the config file exists:
   ls -la fastrtc-test/config/data.yml

"Class mapping empty"

   # Validate YAML structure:
   python -c "import yaml; print(yaml.safe_load(open('fastrtc-test/config/data.yml')))"

"Mask resize failed"

   # Check OpenCV installation:
   python -c "import cv2; print(cv2.__version__)"

Debug Mode

Set environment variable for verbose logging:

export FASTRTC_DEBUG=true
python your_detection_script.py

Future Enhancements

Planned Improvements

GPU-accelerated mask resizing using CUDA
Caching of resized masks for repeated detections
Support for polygon-based masks
Automatic config file generation from model metadata

Configuration Extensions

Multiple config file support
Dynamic class name updates
Class hierarchy and grouping
Custom mask processing pipelines

Performance Benchmarks

Based on testing with various configurations:

Resolution	Mask Count	Resize Time	Memory Usage
HD → FullHD	5 masks	~2ms	+5MB
FullHD → 4K	10 masks	~8ms	+20MB
Model → HD	3 masks	~1ms	+2MB

Note: Times measured on Intel i7-10700K with 32GB RAM

DEV Community

TensorRT Engine Model Fixes Implementation

Problems Addressed

Issue 1: Class Names as Indices

Issue 2: Mask Resolution Mismatch

Solution Overview

1. YAML-Based Class Name Loading (`src/utils/config_loader.py`)

2. Mask Resolution Utilities (`src/utils/mask_utils.py`)

3. Detector Integration (`src/core/detector.py`)

Configuration File Format

Usage Examples

Basic Usage with TensorRT Model

Custom Configuration Path

Performance Considerations

Class Name Loading

Mask Resizing

Testing and Validation

Error Handling

Class Name Loading Errors

Mask Processing Errors

Debug Logging

Compatibility

Model Formats

Resolution Support

YOLO Versions

Troubleshooting

Common Issues

Debug Mode

Future Enhancements

Planned Improvements

Configuration Extensions

Performance Benchmarks

Top comments (0)

Problems Addressed

Issue 1: Class Names as Indices

Issue 2: Mask Resolution Mismatch

Solution Overview

1. YAML-Based Class Name Loading (src/utils/config_loader.py)

2. Mask Resolution Utilities (src/utils/mask_utils.py)

3. Detector Integration (src/core/detector.py)

Configuration File Format

Usage Examples

Basic Usage with TensorRT Model

Custom Configuration Path

Performance Considerations

Class Name Loading

Mask Resizing

Testing and Validation

Error Handling

Class Name Loading Errors

Mask Processing Errors

Debug Logging

Compatibility

Model Formats

Resolution Support

YOLO Versions

Troubleshooting

Common Issues

Debug Mode

Future Enhancements

Planned Improvements

Configuration Extensions

Performance Benchmarks

1. YAML-Based Class Name Loading (`src/utils/config_loader.py`)

2. Mask Resolution Utilities (`src/utils/mask_utils.py`)

3. Detector Integration (`src/core/detector.py`)