SciForce

Posted on Aug 19 • Edited on Sep 2

Smart Stable Monitoring System for Premium Remote Horse Care

#ai #devops #computervision #bigdata

Client Profile

The client is a private horse club that provides full-time care and boarding for pedigree horses owned by individuals. The club handles all daily responsibilities — feeding, cleaning, grooming, exercise, and health checks — as owners rarely visit in person. Some horses are fully owned by clients, while others are co-owned with the club as part of long-term investment agreements.

Owners expect high standards of care and regular updates without having to contact staff directly. To meet this need, the club introduced a monitoring system with 24/7 video access, stable condition tracking (temperature, humidity, etc.), and automatic alerts — helping owners stay informed about their horses at any time.

Challenge

1) Camera Limitations
The system relied on budget-friendly Wi-Fi IP cameras that did not support onboard video storage or direct disk recording. Instead, video data had to be pulled from the camera’s RTSP stream in real time and saved to local storage using a custom-built service. This required handling unstable connections, maintaining persistent stream capture, and managing compatibility quirks of the camera firmware.

2) Unstable Connectivity
Cameras frequently disconnected due to weak Wi-Fi signals. Disconnections occurred unpredictably, interrupting video streams and requiring manual or scripted reboots. To maintain uninterrupted recording, a monitoring mechanism was implemented to regularly ping each camera, detect downtime, and trigger automated recovery actions, such as reconnect attempts or power resets.
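A reachability check of this kind can be sketched in a few lines. The snippet below is a minimal illustration, not the club's actual implementation: it swaps the ping described above for a TCP connection attempt to the camera's RTSP port (554 is the protocol default), because ICMP ping usually needs elevated privileges. The function names and the camera dictionary layout are assumptions made for this example.

```python
import socket


def camera_reachable(host: str, port: int = 554, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the camera's RTSP port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def find_offline(cameras: dict[str, tuple[str, int]]) -> list[str]:
    """Return the names of cameras that failed the probe (hypothetical helper)."""
    return [
        name
        for name, (host, port) in cameras.items()
        if not camera_reachable(host, port)
    ]
```

A scheduler could call `find_offline` at a fixed interval and hand the result to whatever recovery action is available, such as a reconnect attempt or a power reset.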

3) Large Video Storage Requirements
Continuous 24/7 recording generated several gigabytes per camera per day. To avoid high cloud storage and bandwidth costs, all data had to be stored locally, so each stable was equipped with a dedicated on-premise server with up to 500 TB of storage.
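The relationship between bitrate, camera count, and retention can be estimated with simple arithmetic. The sketch below is illustrative only: the bitrate, camera count, and reserve fraction are hypothetical inputs, not figures from the project. As a reference point, a constant 1 Mbit/s stream writes about 10.8 GB per day.

```python
def daily_bytes_per_camera(bitrate_mbps: float) -> float:
    """Bytes written per day by one camera at a constant bitrate in megabits/s."""
    bytes_per_second = bitrate_mbps * 1_000_000 / 8
    return bytes_per_second * 86_400  # seconds in a day


def retention_days(
    capacity_tb: float,
    cameras: int,
    bitrate_mbps: float,
    reserve_fraction: float = 0.10,  # assumed share of disk kept free
) -> float:
    """Estimate how many days of footage fit on a server (decimal terabytes)."""
    usable_bytes = capacity_tb * 1e12 * (1 - reserve_fraction)
    return usable_bytes / (cameras * daily_bytes_per_camera(bitrate_mbps))
```

For example, `retention_days(100, 50, 1.0)` evaluates to roughly 167 days for 50 hypothetical cameras at 1 Mbit/s on 100 TB.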

4) Multi-Service Deployment Complexity
Each core component, such as video streaming, user management, and sensor data, was built as a separate API service. While this modular approach offered flexibility, it also increased system complexity. Each service required individual deployment, configuration, and maintenance, often without centralized orchestration or service discovery. Ensuring consistency across deployments demanded additional tooling and manual oversight.

5) Per-Location Infrastructure Overhead
Due to high video data volumes and cost-sensitive design, a separate physical server was installed at each stable to handle video capture and storage locally. This on-premise model minimized cloud costs but introduced significant operational overhead. Each new location required provisioning hardware, configuring network access, deploying services, and setting up monitoring, all of which scaled poorly without automation.

6) Monitoring and Reliability
Frequent camera dropouts and recording failures made system reliability a major concern. Since the cameras lacked built-in health checks, a custom monitoring stack (Prometheus + Grafana) had to be developed to track availability, stream status, and storage activity. This added engineering overhead and required ongoing effort to manage false alerts, maintain data accuracy, and ensure issues were detected before data was lost.

Solution

Hybrid Infrastructure
The system was designed with local video servers deployed at each stable to handle 24/7 recording and high-volume storage onsite (~100 TB per location). This approach eliminated reliance on cloud streaming and significantly reduced operating costs. Local processing ensured uninterrupted capture even during connectivity loss, while modular deployment allowed each location to operate independently.

Modular API Design
The system used separate APIs for video streaming, user management, and sensor data, each deployed independently. This allowed for targeted updates and flexible scaling — with video APIs running locally and user services hosted centrally. While this improved maintainability, it required strict coordination across service boundaries and consistent handling of auth and data flow.

Scalable Deployment
Each stable was equipped with its own dedicated video module running on local hardware, responsible for ingesting, processing, and storing camera streams independently. In contrast, user- and admin-facing APIs, such as account management and access control, were designed for centralized or cloud-based deployment.

This setup allowed the system to scale horizontally: adding a new stable required only provisioning local video infrastructure, and the shared backend services were reused as they were.

Camera Integration
Low-cost Wi-Fi IP cameras used in the system lacked native recording or monitoring features, requiring custom integration. A stream handler was developed to pull RTSP feeds in real time, write video data to local storage, and manage stream reconnections on failure. Additional logic was added to track camera status, perform periodic health checks, and trigger alerts if streams were interrupted.
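The reconnection logic can be illustrated with a capped exponential backoff. This is a sketch under stated assumptions rather than the actual stream handler: `capture_once` is a hypothetical callable that blocks while the stream is healthy and raises `ConnectionError` when it drops, and the delay values are placeholders.

```python
import time
from typing import Callable, Iterator


def backoff_delays(
    base: float = 1.0, factor: float = 2.0, cap: float = 60.0
) -> Iterator[float]:
    """Yield an endless series of retry delays that grow up to a cap."""
    delay = base
    while True:
        yield min(delay, cap)
        delay *= factor


def run_with_reconnect(
    capture_once: Callable[[], None],
    max_attempts: int,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Retry capture_once after each dropped connection.

    Returns the number of failures seen before capture_once returned
    normally or the attempts ran out.
    """
    delays = backoff_delays()
    failures = 0
    for _ in range(max_attempts):
        try:
            capture_once()
            return failures
        except ConnectionError:
            failures += 1
            sleep(next(delays))  # wait longer after each consecutive failure
    return failures
```

Passing `sleep` as a parameter keeps the loop testable without real waiting. A production handler would likely also reset the backoff after a long healthy period and raise an alert once failures pass a threshold.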

Operational Monitoring
A custom monitoring stack using Prometheus and Grafana was implemented to provide real-time visibility into system health. Metrics included camera uptime, last successful stream timestamp, frame drop rates, disk I/O, and network latency. The dashboard enabled support teams to quickly identify stalled streams, offline devices, or storage issues.

Probes and exporters were deployed per site to collect metrics from distributed local servers, ensuring each stable could be monitored independently while still feeding into a centralized overview.
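To show what a per-site exporter hands to Prometheus, the sketch below renders a gauge in the Prometheus text exposition format using only the standard library. The metric name `camera_up` and the `camera` label are assumptions for this example. In practice an exporter would more likely rely on the official `prometheus_client` package than on hand-built strings.

```python
def render_gauge(
    name: str,
    help_text: str,
    samples: dict[str, float],
    label: str = "camera",
) -> str:
    """Render one gauge metric, with one labelled sample per camera."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for label_value, value in sorted(samples.items()):
        lines.append(f'{name}{{{label}="{label_value}"}} {value}')
    # The exposition format expects a trailing newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(
        render_gauge(
            "camera_up",
            "1 if the camera answered the last probe",
            {"stall3_cam1": 1.0, "stall3_cam2": 0.0},
        ),
        end="",
    )
```

Serving this text from an HTTP endpoint on each local server would let a central Prometheus instance scrape every stable independently.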

Features

For Horse Owners:

Live Video Access
Stream low-latency real-time video from 2–3 IP cameras installed in each horse’s stall, covering different angles (e.g., feeding area, resting zone, entrance). Video is delivered via RTSP or HLS and made available through a secure, user-authenticated web interface. Owners can access the stream from desktop or mobile devices without needing to contact staff.

Video History
Owners can access archived recordings via a timeline interface, selecting specific dates and hours for playback. Footage is streamed directly from the stable’s local server, with support for in-browser viewing and optional clip downloads. Recordings are retained for a set period based on storage capacity (e.g., 7–30 days).

Environment Data
Temperature, humidity, and barometric pressure are collected from digital sensors near each stall and updated every 30–60 seconds. Readings are shown alongside live video and can be viewed historically as timestamped values or trend graphs, helping owners monitor stable conditions and spot potential issues.

Multi-Horse Access
Owners with multiple horses are granted access to all associated stalls through a single account. The interface includes a dropdown or dashboard view to switch between horses, displaying each horse’s live video feed, environmental data, and video archive individually. Access rights are managed at the user level to ensure owners only see data linked to their registered animals.

Behavioral Observation & Health Monitoring
The system can be used to monitor changes in a horse’s behavior over time, such as restlessness, lying patterns, or feeding interruptions, which may indicate health issues. Owners can review footage with trainers or veterinarians for early detection and intervention.

Training & Performance Review
Video recordings offer educational value for owners, riders, and trainers. Footage can be reviewed to assess training sessions, stall behavior, or post-recovery progress. This supports more informed decisions and targeted training strategies.

Activity Summary
For stables with motion detection or video analytics enabled, owners can receive daily summaries of their horse’s activity (e.g., time spent lying down, eating, or pacing). This data helps build a long-term behavioral profile per horse.

For Admins and Club Staff:

User Management
Admins manage user accounts via an internal panel, linking each user to specific horses and their associated stalls and cameras. Role-based access control (RBAC) is enforced at the API level to ensure users can only access data for their assigned horses. All activity is logged for audit purposes.
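The access rule can be sketched as a small check that an API handler calls before returning any stall data. The role names, the `User` fields, and the function name are hypothetical. The real system may model roles and ownership differently.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class User:
    username: str
    role: str  # "owner" or "admin" (hypothetical role names)
    horse_ids: frozenset[str] = field(default_factory=frozenset)


def can_view_stall(user: User, stall_horse_id: str) -> bool:
    """Admins see every stall; owners see only stalls of their own horses."""
    if user.role == "admin":
        return True
    return user.role == "owner" and stall_horse_id in user.horse_ids
```

Unknown roles are denied by default, so a new role would have to be granted access explicitly.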

Camera Management
Admins can register new RTSP-enabled IP cameras by providing stream URLs, assigning each to a specific stall within the system. Cameras can be activated or deactivated remotely, and their configuration (location, angle, stall ID, status) is stored in a centralized database. The system tracks stream availability, connection history, and last-seen status to support diagnostics and maintenance.

Automated Storage Cleanup
Each local video server runs a service that monitors disk usage and deletes the oldest video files when free space drops below a set threshold (e.g., 10–15%). Retention policies are configurable per site based on available storage (e.g., ~30 days on 500 TB of storage). Cleanup runs in the background to avoid disrupting active recordings.

System Health Overview
Club staff use a centralized Grafana dashboard to monitor the real-time status of each stall. The interface highlights offline cameras, stalled streams, and storage usage trends, enabling quick identification of issues without technical intervention. Visual indicators and alert flags help prioritize maintenance tasks, while historical graphs support troubleshooting recurring connectivity or hardware problems.

Staff Training Library
Clips from real stall footage can be saved and reused to train new staff on recognizing early warning signs (e.g., signs of colic, restlessness, reduced eating), proper care routines, or emergency procedures. This improves onboarding and helps maintain consistent quality of care across locations.

Remote Veterinary Review
Video history can be shared with veterinarians to assess incidents or monitor recovery without an on-site visit. This supports remote consultations, speeds up diagnostics, and allows experts to review behavior or stall conditions before giving recommendations.

Trainer Feedback for Owners
For clubs that offer training, camera footage from riding sessions or turnouts can be reviewed by trainers and shared with owners. This helps evaluate progress, refine technique, and support informed decision-making in horse conditioning or rehabilitation.

Development Process

1) Initial Research & Local Testing
The process began with setting up a production-identical Wi-Fi IP camera to simulate real deployment conditions. Initial steps included validating RTSP stream access, testing authentication methods, and assessing behavior under unstable network conditions. Stream capture was prototyped to evaluate buffering strategies, file writing to local storage, and resilience to dropped connections.

Particular attention was paid to codec compatibility and timestamp accuracy. The findings from this phase defined the technical constraints and informed the design of a stable, external video recording pipeline.

2) Video Capture Service Implementation
A custom video capture service was implemented to ingest live RTSP streams from multiple IP cameras in real time. The service segmented recordings into fixed-duration files (e.g., one-hour MP4 chunks) to simplify storage management and playback indexing.

Files were written directly to the local disk of each stable’s server, bypassing cloud dependencies to reduce bandwidth use and operating costs. The service included built-in watchdog timers, automatic reconnection on stream failure, and graceful handling of camera dropouts to ensure minimal data loss.
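Since FFmpeg handles video processing in this stack, segmented recording can be illustrated with its segment muxer. The sketch below only builds the command line. The function name, the output directory, and the filename pattern are assumptions, and the exact options used in the project are not documented here.

```python
from pathlib import Path


def build_ffmpeg_command(
    rtsp_url: str, output_dir: Path, segment_seconds: int = 3600
) -> list[str]:
    """Build an FFmpeg command that records an RTSP stream into timed MP4 files."""
    pattern = output_dir / "%Y-%m-%d_%H-%M-%S.mp4"  # expanded by -strftime
    return [
        "ffmpeg",
        "-rtsp_transport", "tcp",   # TCP tends to cope better with lossy Wi-Fi
        "-i", rtsp_url,
        "-c", "copy",               # keep the camera's encoding, no re-encode
        "-f", "segment",
        "-segment_time", str(segment_seconds),
        "-reset_timestamps", "1",   # each file starts at timestamp zero
        "-strftime", "1",           # allow date/time codes in the file name
        str(pattern),
    ]
```

The resulting list can be passed to `subprocess.Popen`, with a watchdog restarting the process if it exits or stops producing output.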

Recordings were stored in a timestamped directory structure organized by date, stall ID, and camera, enabling fast lookup for both users and internal tools.
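One possible layout for such a directory structure is shown below. The ordering (date, then stall, then camera) follows the description above, while the root folder, the ID formats, and the file naming are assumptions for illustration.

```python
from datetime import datetime
from pathlib import Path


def segment_path(
    root: Path, stall_id: str, camera_id: str, start: datetime
) -> Path:
    """Return the file path for a segment that begins at `start`."""
    return (
        root
        / start.strftime("%Y-%m-%d")
        / stall_id
        / camera_id
        / start.strftime("%H-%M-%S.mp4")
    )
```

Because the date and time parts are zero-padded, sorting the names alphabetically also sorts them chronologically, which keeps lookups and oldest-first cleanup simple.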

3) Screening & Playback System
A dedicated playback service was developed to make recorded video segments easily accessible to end users. All video files were stored locally and automatically organized by stall, camera ID, date, and time, enabling fast and efficient lookup. The frontend interface included a visual timeline that allowed users to scroll through and select specific time blocks for playback.

The backend module served MP4 files directly from disk, supporting on-demand loading and smooth browser-based streaming. Recordings were divided into fixed-length segments (e.g., one hour) to optimize both performance and storage. Users also had the option to download selected video clips for offline viewing or sharing with veterinarians or caretakers.
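Mapping a time block selected on the timeline to the recorded segments that cover it might look like the sketch below. It assumes one-hour segments aligned to clock hours, which is a simplification made for this example. The function name is hypothetical.

```python
from datetime import datetime, timedelta


def segment_starts(start: datetime, end: datetime) -> list[datetime]:
    """Return start times of the hourly segments overlapping [start, end)."""
    if end <= start:
        return []
    # Round the requested start down to the beginning of its hour.
    current = start.replace(minute=0, second=0, microsecond=0)
    starts = []
    while current < end:
        starts.append(current)
        current += timedelta(hours=1)
    return starts
```

A request for 14:20 to 16:05 therefore resolves to the segments starting at 14:00, 15:00, and 16:00, and the player seeks within the first and last file.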

4) Storage Cleanup Mechanism
To manage continuous video recording without exhausting disk space, an automated cleanup process was built into each local server. It runs in the background and safely removes outdated footage based on configurable rules.

Triggers automatically when available disk space drops below a set threshold (e.g., 10–15%).
Deletes video files starting from the oldest, using a structured folder system (by date, stall, and camera).
Operates without interrupting live recording or user playback sessions.
Applies retention periods (e.g., 7–30 days) that are configurable per location to match hardware capacity.
Includes safety checks to prevent deletion of files that are still in use or being written.
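The rules above can be sketched as a single cleanup pass. This is a simplified illustration, not the deployed service: it orders files by modification time rather than by folder name, and it approximates the "still being written" check by skipping files modified recently. The threshold values and function names are assumptions.

```python
import shutil
import time
from pathlib import Path
from typing import Callable


def free_fraction(root: Path) -> float:
    """Fraction of the disk holding `root` that is currently free."""
    usage = shutil.disk_usage(root)
    return usage.free / usage.total


def cleanup_oldest(
    root: Path,
    min_free: float = 0.10,
    min_age_seconds: float = 7200.0,
    get_free: Callable[[Path], float] = free_fraction,
) -> list[Path]:
    """Delete the oldest .mp4 files until enough space is free again."""
    deleted = []
    now = time.time()
    files = sorted(root.rglob("*.mp4"), key=lambda p: p.stat().st_mtime)
    for path in files:
        if get_free(root) >= min_free:
            break  # enough space, stop deleting
        if now - path.stat().st_mtime < min_age_seconds:
            break  # this file and all later ones may still be in use
        path.unlink()
        deleted.append(path)
    return deleted
```

Re-checking free space before every deletion keeps the pass from removing more footage than necessary.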

5) Monitoring & Support Tools
To ensure stable performance and fast issue resolution, an internal monitoring system was implemented using Prometheus for data collection and Grafana for visualization. Each local server exposed key metrics related to camera health and storage activity, enabling staff to monitor all locations in real time.
The dashboard included:

Camera availability — real-time online/offline status for each connected device
Last stream timestamp — the most recent time a camera successfully transmitted video
Ping latency — network response times between server and camera
Recording status — confirmation that video was actively being written to disk
Disk usage alerts — warnings when local storage neared capacity thresholds

Data was collected at fixed intervals (e.g., every 15–30 seconds) and visualized in grouped panels by stall or server. Alert rules automatically notified staff if a camera went offline, if video streaming stalled, or if storage approached critical limits. This setup allowed the internal team to detect failures early, verify camera behavior remotely, and maintain consistent service quality across all deployed sites.
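The three alert conditions can be expressed as a small evaluation function. In a Prometheus setup these checks would normally live in alerting rules, so the Python below only mirrors the logic. The field names, the 60-second stall window, and the 90% disk limit are assumptions.

```python
from dataclasses import dataclass


@dataclass
class CameraStatus:
    camera_id: str
    online: bool
    last_frame_ts: float  # Unix time of the last video written to disk


def evaluate_alerts(
    statuses: list[CameraStatus],
    disk_used_fraction: float,
    now: float,
    stall_after: float = 60.0,
    disk_limit: float = 0.90,
) -> list[str]:
    """Return human-readable alerts for offline cameras, stalls, and full disks."""
    alerts = []
    for status in statuses:
        if not status.online:
            alerts.append(f"{status.camera_id}: camera offline")
        elif now - status.last_frame_ts > stall_after:
            # The camera answers probes but no video has arrived recently.
            alerts.append(f"{status.camera_id}: stream stalled")
    if disk_used_fraction >= disk_limit:
        alerts.append("storage: disk usage above limit")
    return alerts
```

Separating "offline" from "stalled" matters because a camera can stay reachable on the network while its stream has silently stopped.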

Technical Highlights

Backend: Python, FastAPI
Video Processing: FFmpeg
Monitoring: Prometheus + Grafana
Frontend & Admin Tools: Streamlit
Web Server & Security: Nginx + Let’s Encrypt
Deployment: Docker

Impact

The new system made daily operations more efficient and gave horse owners better visibility and confidence in the care provided.

95% of owners started using the platform in the first month
Most clients quickly adopted the system, using live video and sensor data to stay updated on their horse’s condition without needing to call or message staff.

Around 60% fewer routine questions for staff
With easy access to video and stall data, owners asked fewer day-to-day questions, which gave staff more time to focus on horse care and stable management.

Up to 80% savings on storage costs
By recording and storing video directly at each stable, the club avoided expensive cloud fees and kept long-term operating costs much lower.

This setup helped the club grow its services to more locations while keeping quality high and costs under control.
