With AI entering our lives, home automation and smart home systems have become much simpler and more accessible to everyone. I've long wanted to create a system that would take most of the grocery management duties off my hands. This routine task eats up so much time that could be spent more productively.
The Idea: From Total Chaos to Automation
The ultimate goal is a system that automatically tracks all products in your home, knows their composition and expiration dates, auto-orders what's running low, and ideally even removes expired items (though that's still science fiction, as is automatic fridge organization). It should suggest menus based on family diets, allergies, taste preferences, and eating history.
But we have to start somewhere! For a long time, the problem was identifying products, expiration dates, and composition without manual entry. Barcode scanning was one solution, but it has drawbacks: barcode databases differ and aren't always accessible, barcodes don't reveal composition, and carrying every product to a scanner isn't real automation.
But AI changes everything! Now we can recognize objects in photos and instantly find manufacturer info and composition, and there are even systems that find optimal prices! Life is getting better! That's what I was thinking just yesterday.
My immediate goal is a test project monitoring products in one cabinet first, then expanding to others and the fridge once debugged.
What We're Building On
Kitchen food cabinet 80×40×60 cm. Glass shelves (this is important).
Technical Foundation: ESP32-S3 + OV5640 Camera (No Tags Needed)
For full hands-free automation, I'm using an ESP32-S3 with an OV5640 camera module (5 MP, with a 160-170° fisheye lens for complete shelf coverage).
Camera placement: one camera per glass shelf, mounted horizontally 20-30 cm above it and angled 45° downward so the whole row is visible through to the back wall. The cameras are attached with double-sided tape or clips. Mounted this way, they look through the glass and over the products (up to 20 cm tall) instead of being blocked by them.
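To sanity-check the placement numbers, here is a minimal geometry sketch. It assumes the camera sits above the middle of the shelf's front edge, treats the lens field of view as a simple cone around the optical axis, and ignores fisheye distortion. The mounting position and the helper names `max_off_axis_angle` and `covers_shelf` are my own assumptions for illustration, not part of any library.

```python
import math

def max_off_axis_angle(width_cm, depth_cm, height_cm, tilt_deg):
    """Largest angle (degrees) between the optical axis and any shelf corner.

    Assumption: the camera is above the middle of the front edge and is
    tilted `tilt_deg` below horizontal, looking toward the back wall.
    """
    tilt = math.radians(tilt_deg)
    axis = (0.0, math.cos(tilt), -math.sin(tilt))  # unit vector of the optical axis
    half = width_cm / 2
    corners = [(-half, 0), (half, 0), (-half, depth_cm), (half, depth_cm)]
    worst = 0.0
    for x, y in corners:
        ray = (x, y, -height_cm)  # vector from the camera to the corner
        norm = math.sqrt(sum(c * c for c in ray))
        cos_angle = sum(a * r for a, r in zip(axis, ray)) / norm
        cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
        worst = max(worst, math.degrees(math.acos(cos_angle)))
    return worst

def covers_shelf(fov_deg, **geometry):
    """True if every corner lies within half the field of view of the axis."""
    return max_off_axis_angle(**geometry) <= fov_deg / 2

shelf = dict(width_cm=60, depth_cm=40, height_cm=25, tilt_deg=45)
print(round(max_off_axis_angle(**shelf), 1))
print(covers_shelf(160, **shelf), covers_shelf(66, **shelf))
```

Under these assumptions the front corners are the hardest to see, at roughly 63° off-axis. That fits inside a 160° lens but not inside a narrow 66° one, which is the argument for the fisheye.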
When the door opens (a sensor on a GPIO pin), the ESP32 captures a photo within 1 second, and an on-device model (TensorFlow Lite Micro or a lightweight YOLOv8 variant) recognizes products such as grains, cans, and spices by shape, color, and packaging. The accuracy I'm aiming for after calibration (10 photos per product) is 92-98% for 50+ kitchen items, even with stacked rows; these are targets, not measurements yet.
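The door trigger needs debouncing, because a mechanical contact bounces for a few milliseconds when the door moves. Below is a rough sketch of that logic in plain Python. The `DoorTrigger` class and the 50 ms window are my assumptions. On MicroPython the readings would come from something like `machine.Pin(...).value()` and `time.ticks_ms()`, with `time.ticks_diff` used for the subtraction because the tick counter wraps around.

```python
class DoorTrigger:
    """Fires once per door opening, after the contact has been stable."""

    def __init__(self, debounce_ms=50):
        self.debounce_ms = debounce_ms
        self._stable = False      # debounced state: True means door open
        self._candidate = False   # last raw reading
        self._since_ms = 0        # when the raw reading last changed

    def update(self, is_open, now_ms):
        """Feed one raw reading; return True when a capture should start."""
        if is_open != self._candidate:
            # Raw level changed: restart the stability timer
            self._candidate = is_open
            self._since_ms = now_ms
            return False
        if is_open != self._stable and now_ms - self._since_ms >= self.debounce_ms:
            self._stable = is_open
            return is_open  # fire only on the closed -> open transition
        return False

trigger = DoorTrigger(debounce_ms=50)
readings = [(False, 0), (True, 10), (False, 20), (True, 30),
            (True, 60), (True, 80), (True, 100)]
fired = [t for is_open, t in readings if trigger.update(is_open, t)]
print(fired)  # prints [80]
```

The bounce at 10-20 ms is ignored, and the capture fires once, 50 ms after the contact settles at 30 ms.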
One 60×40 cm shelf needs one ESP32-S3 camera module (the fisheye lens covers the whole shelf, so no camera multiplexing is needed); three shelves need three modules, bridged over WiFi to a central hub.
The AI is meant to extrapolate occluded areas (back rows), count quantities, and read expiration dates from labels via OCR (in firmware built with ESP-IDF). The data is sent via MQTT to the .NET backend.
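As an illustration of what one shelf report could look like on the wire, here is a small sketch that turns recognition results into a JSON payload. The field names, the `DD.MM.YYYY` date format, and the helpers `parse_expiry` and `build_payload` are all assumptions of mine. The real message schema will depend on what the OCR step returns.

```python
import json
import re
from datetime import date

def parse_expiry(ocr_text):
    """Return the first DD.MM.YYYY date found in OCR text, or None."""
    match = re.search(r"\b(\d{2})\.(\d{2})\.(\d{4})\b", ocr_text)
    if not match:
        return None
    day, month, year = map(int, match.groups())
    try:
        return date(year, month, day)
    except ValueError:  # impossible date, e.g. OCR misread "31.02.2025"
        return None

def build_payload(shelf_id, detections):
    """detections: list of (label, count, ocr_text) tuples."""
    items = []
    for label, count, ocr_text in detections:
        expiry = parse_expiry(ocr_text)
        items.append({
            "label": label,
            "count": count,
            "expires": expiry.isoformat() if expiry else None,
        })
    return json.dumps({"shelf": shelf_id, "items": items})

payload = build_payload(1, [("rice", 2, "Best before 12.05.2026"), ("salt", 1, "")])
print(payload)
```

The resulting string can be published as the MQTT message body. Items without a readable date carry `null`, so the backend can tell "no date found" apart from a real date.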
I have an Arduino. Should I use it? No need: the ESP32-S3 is self-sufficient (WiFi, GPIO, camera interface, and a powerful CPU with PSRAM for AI), and it's simpler and cheaper for IoT because it needs no extra boards. An Arduino is fine for testing sensors, but it can't match the ESP32-S3 on performance and connectivity for camera + AI work.
.NET Backend + AI Agent Integration
An ASP.NET Core backend receives photos from the ESP32 (HTTP POST over WiFi, or MQTT) and runs recognition with either a local vision model in ML.NET or a cloud YOLO API.
C# agent (LangChain.NET) analyzes inventory:
- Generates menus: "Chicken, rice, carrots → pilaf for 4, 450 kcal/serving"
- Considers family: allergies, tastes (spicy for husband, gluten-free for child)
- Auto-order: Ozon/Wildberries API integration for best prices
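Before any LLM agent gets involved, the hard constraints in the list above can be handled with a simple rule-based filter. This is only a sketch of that pre-filter in Python, not the agent itself. The `Recipe` type and the `suggest_meals` function are hypothetical names I made up for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Recipe:
    name: str
    ingredients: set
    allergens: set = field(default_factory=set)

def suggest_meals(recipes, inventory, family_allergens):
    """Recipes fully covered by the inventory and free of listed allergens."""
    available = {item for item, count in inventory.items() if count > 0}
    return [
        r.name
        for r in recipes
        if r.ingredients <= available and not (r.allergens & family_allergens)
    ]

recipes = [
    Recipe("pilaf", {"chicken", "rice", "carrots"}),
    Recipe("pasta", {"pasta", "tomatoes"}, allergens={"gluten"}),
]
inventory = {"chicken": 1, "rice": 2, "carrots": 3, "pasta": 1, "tomatoes": 4}
print(suggest_meals(recipes, inventory, family_allergens={"gluten"}))  # prints ['pilaf']
```

Filtering on allergies in plain code, and leaving only taste and variety to the agent, means a model mistake can't put an allergen on the menu.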
API endpoint example:
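```csharp
// Needs: using Microsoft.AspNetCore.Mvc; using Microsoft.ML; using System.IO;
[HttpPost("analyze-shelf")]
public async Task<IActionResult> AnalyzeShelf(IFormFile image)
{
    // Read the uploaded photo into memory (ImageData.Bytes is an assumed property)
    using var stream = new MemoryStream();
    await image.CopyToAsync(stream);
    var input = new ImageData { Bytes = stream.ToArray() };
    // Model.Load is synchronous; in a real app, load the model once at startup
    var model = _mlContext.Model.Load("yolov8_kitchen.zip", out _);
    using var engine = _mlContext.Model.CreatePredictionEngine<ImageData, Prediction>(model);
    var prediction = engine.Predict(input);
    // Process: product list, quantities, dates (InventoryFrom is a hypothetical helper)
    var inventory = InventoryFrom(prediction);
    var agent = new KitchenAgent(inventory, _familyPrefs); // _familyPrefs: injected field
    var menu = agent.SuggestMeals();
    return Ok(new { inventory, menu });
}
```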
💰 Cost Breakdown
| Component | Price (RUB) | Notes |
|---|---|---|
| ESP32-S3 + OV5640 fisheye (x3) | 2344 | 618 RUB/module + 200 RUB fisheye lens + 240 RUB shipping (or 900 RUB/unit on Ozon) [3] |
| 3 glass shelves | 2200 | Tempered 6-8mm |
| Sensor + relay | 500 | GPIO auto-trigger |
| Total | 5044 | Compact, 160° coverage without blind spots |
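As a quick check of the table, this snippet sums the three row prices and prints each row's share of the budget. It uses only the row totals from the table and does not touch the per-module figures in the notes.

```python
# Row totals from the cost table, in RUB
rows = {
    "ESP32-S3 + OV5640 fisheye (x3)": 2344,
    "3 glass shelves": 2200,
    "Sensor + relay": 500,
}
total = sum(rows.values())
print(total)  # prints 5044
for name, price in rows.items():
    # Share of the total budget, rounded to a whole percent
    print(f"{name}: {price / total:.0%}")
```

The electronics come to a little under half of the budget. The glass shelves account for most of the rest.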
A prototype on ESP-IDF or MicroPython should be doable in a weekend: GPIO for the door sensor, MQTT to the .NET API, and a Telegram bot for alerts ("Milk gone, buy 2L").
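The alert text itself is easy to generate from the inventory. Here is a hedged sketch of that step. The `low_stock_alerts` function and the rule format (minimum count plus what to buy) are my own assumptions. Each resulting string would then be sent through the Telegram Bot API's `sendMessage` method, which is not shown here.

```python
def low_stock_alerts(inventory, rules):
    """rules: product -> (minimum count, what to buy). Returns alert strings."""
    alerts = []
    for product, (minimum, buy) in sorted(rules.items()):
        count = inventory.get(product, 0)  # a missing product counts as zero
        if count == 0:
            alerts.append(f"{product.capitalize()} gone, buy {buy}")
        elif count < minimum:
            alerts.append(f"{product.capitalize()} low ({count} left), buy {buy}")
    return alerts

rules = {"milk": (1, "2L"), "rice": (2, "1 kg")}
print(low_stock_alerts({"milk": 0, "rice": 1}, rules))
# prints ['Milk gone, buy 2L', 'Rice low (1 left), buy 1 kg']
```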
The plan is for the ESP32-S3 to run YOLO-lite onboard (I'm expecting about 50 ms/frame). I'm still waiting for parts from AliExpress; once they arrive, I'll assemble everything and write Part 2 with real tests and code.
❓ Have You Tried This?
Curious: has anyone built similar smart shelves with camera+AI fisheye? Did you get it working in practice?
Share in comments — experiences, repos, occlusion or accuracy issues!
This isn't sci-fi; it's a real step toward household automation with .NET + ESP32 + AI. Try it yourself: there is open-source code (ESP32-CAM pantry projects on GitHub), and the hardware is available on Ali.
Worth scaling to full kitchen? 🚀[3]
Part 1/2. Part 2, with real tests, code, and recognition accuracy, is coming once the cameras arrive from Ali!
