The Problem Shows Up Around Frame 2000
YOLO inference on video looks fine for the first minute. Then your process starts eating 8GB of RAM, frame rate drops from 30 FPS to 5, and eventually the whole thing crashes with an OOM error.
I've seen this pattern across YOLOv5, YOLOv8, and YOLOv11. The symptoms are identical: gradual memory bloat, progressively slower inference, and frame drops that compound over time. The root causes are usually a combination of frame buffer accumulation, result tensor retention, and inefficient CPU-GPU memory transfers.
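Before reaching for fixes, it's worth confirming that resident memory really is climbing as frames go by. A minimal diagnostic sketch using the standard-library `resource` module (Unix only) — the 500-frame sampling interval is illustrative, not from the article, and the inference call is left as a placeholder:

```python
import resource
import sys

def rss_mb() -> float:
    """Peak resident set size in MB. ru_maxrss is KB on Linux, bytes on macOS."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return peak / (1024 * 1024) if sys.platform == 'darwin' else peak / 1024

baseline = rss_mb()
for frame_idx in range(1, 2001):
    # ... run inference on one frame here ...
    if frame_idx % 500 == 0:
        growth = rss_mb() - baseline
        print(f"frame {frame_idx}: +{growth:.1f} MB over baseline")
        # steady growth of tens of MB per interval suggests retained buffers
```

If the reported growth keeps climbing linearly with frame count rather than plateauing, you are retaining frames or results somewhere in the loop.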
Here's what actually fixes it.
Fix #1: Release Frame Buffers Explicitly
Each call to `cap.read()` hands you a fresh NumPy array, and that array isn't freed until every reference to it is gone. If your inference loop holds on to old frames — appending them to a list, or keeping result objects that store the source image — they pile up in memory.
```python
import cv2
import gc
from ultralytics import YOLO

model = YOLO('yolov8n.pt')  # 6.2MB checkpoint, ~300MB VRAM
cap = cv2.VideoCapture('traffic.mp4')

frame_count = 0
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    results = model(frame, verbose=False)
    # ... use results for this frame, then drop the references ...
    del frame, results
    frame_count += 1
    if frame_count % 500 == 0:
        gc.collect()  # sweep any lingering reference cycles
cap.release()
```