
Most AI image captioners just tell you what is in the picture. I wanted to build something deeper. I built Visioryx to bridge the gap between Neural Scene Intelligence and Digital Hardware Forensics.
Key Engineering Wins:
Multimodal Analysis: Integrated the Salesforce/blip-image-captioning-base model to transform pixels into descriptive natural language.
Metadata Recovery: Engineered a Python-based forensic layer using PIL (Pillow) to extract embedded EXIF metadata (camera model, ISO, aperture).
The "Scrubbed Data" Problem: One of my biggest challenges was handling images from platforms like Unsplash or Facebook that strip metadata. I implemented fallback logic that reports "Data Stripping" as a forensic state rather than a system failure.
Full-Stack Utility: Designed a custom Glassmorphic UI and an automated PDF report generator for professional hand-offs.
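To make the metadata recovery and the "Scrubbed Data" fallback concrete, here is a minimal sketch of how such a layer could look with Pillow. It is not the actual Visioryx code: the function name `inspect_metadata`, the state labels `metadata_present` / `metadata_stripped`, and the list of fields are all assumptions made for illustration.

```python
from PIL import Image
from PIL.ExifTags import TAGS

# Standard EXIF tag that points at the Exif sub-IFD, where ISO and FNumber live.
EXIF_IFD_POINTER = 0x8769

# Hypothetical shortlist of fields to report (names as Pillow's TAGS table spells them).
FIELDS_OF_INTEREST = ("Make", "Model", "ISOSpeedRatings", "FNumber")


def inspect_metadata(source):
    """Return a forensic state plus any camera fields found in the image.

    `source` can be a file path or a binary file-like object.
    """
    with Image.open(source) as img:
        exif = img.getexif()
        raw = dict(exif)                            # top-level IFD: Make, Model, ...
        raw.update(exif.get_ifd(EXIF_IFD_POINTER))  # sub-IFD: ISO, FNumber, ...

    # Translate numeric tag ids into readable names.
    named = {TAGS.get(tag_id, str(tag_id)): value for tag_id, value in raw.items()}
    found = {name: named[name] for name in FIELDS_OF_INTEREST if name in named}

    if not found:
        # No camera fields at all: report this as a state, not as an error.
        return {"state": "metadata_stripped", "fields": {}}
    return {"state": "metadata_present", "fields": found}
```

Returning a state string instead of raising means the UI and the PDF report can print "metadata stripped" as a finding in its own right, which is the behavior described above.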
The Stack:
Backend: FastAPI / Uvicorn
AI Logic: PyTorch / Transformers
Frontend: Tailwind CSS
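For readers curious how the PyTorch / Transformers part of this stack turns pixels into a caption, here is a rough sketch using the BLIP classes from the Transformers library. The helper names `load_captioner` and `caption_image` and the `max_new_tokens` default are my own assumptions for illustration, not necessarily how the app wires it up.

```python
MODEL_ID = "Salesforce/blip-image-captioning-base"


def load_captioner(model_id=MODEL_ID):
    """Load the BLIP processor and model once, e.g. at server startup."""
    # Imported here so the module can be loaded without the heavy dependency
    # until a captioner is actually needed.
    from transformers import BlipForConditionalGeneration, BlipProcessor

    processor = BlipProcessor.from_pretrained(model_id)
    model = BlipForConditionalGeneration.from_pretrained(model_id)
    model.eval()
    return processor, model


def caption_image(image, processor, model, max_new_tokens=30):
    """Generate a caption for a PIL image."""
    # BLIP expects RGB input; uploads may be RGBA, palette, or grayscale.
    inputs = processor(images=image.convert("RGB"), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return processor.decode(output_ids[0], skip_special_tokens=True).strip()
```

Typical usage would be `processor, model = load_captioner()` once, then `caption_image(Image.open(path), processor, model)` per request. Note that the first call downloads the model weights.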
Live Demo: https://huggingface.co/spaces/yousafk279/Visioryx
Open Source: https://github.com/yousafk279/visioryx