WonderLab

Posted on Apr 5

Open Source Project of the Day (Part 30): banana-slides - Native AI PPT Generation App Based on nano banana pro

#ai #opensource #gemini #ppt

Introduction

"Vibe your PPT like vibing code."

This is Part 30 of the "Open Source Project of the Day" series. Today we explore banana-slides (GitHub), open-sourced by Anionex.

Have you ever found yourself the night before a presentation with a blank slide deck, full of brilliant ideas but drained by the drudgery of layout and design? Traditional AI PPT tools may be fast, but they are often locked into preset templates, offer little freedom, and produce homogeneous results.

banana-slides is built on Google's nano banana pro image generation model and aims for a "native Vibe PPT" experience. It offers three creation paths (a one-sentence idea, an outline, or page-by-page descriptions), accepts any uploaded template or materials, and parses PDF/Docx/MD files. Specific areas can be edited in natural language (e.g., "change page three to a case study," "replace this chart with a pie chart"), and decks export in one click to PPTX or PDF, including an Editable PPTX (Beta) mode in which text and images remain freely editable in PowerPoint.

The project uses a React + Flask full-stack architecture with one-click Docker deployment. It targets a wide audience, from beginners to professionals, with the stated goal of "lowering the bar for PPT creation so anyone can quickly produce beautiful, professional presentations."

What You'll Learn

banana-slides' positioning: a native AI PPT generation app based on nano banana pro, moving toward true "Vibe PPT"
Three creation paths: idea, outline, page description — plus Vibe-style natural language editing
Material parsing capabilities: multi-format uploads, smart extraction, style references
Technical architecture: React + Vite + Flask + SQLite + Gemini API
Comparison with NotebookLM slide decks, and the project's advantages

Prerequisites

Comfortable with Docker or Node.js/Python development environments
Basic familiarity with LLM APIs (e.g., Gemini, OpenAI)
For self-hosted deployment, a Google Gemini API Key is required (image generation requires a paid tier) or access via a proxy like AIHubMix

Project Background

Project Introduction

banana-slides is a native AI PPT generation app built on nano banana pro (Google's Gemini image generation model). It supports three creation paths (idea, outline, and page description), automatically extracts charts and text from attachments, accepts uploaded template images for style customization, and allows natural-language edits to specific areas or entire pages. Results can be exported as standard PPTX or PDF. An Editable PPTX (Beta) mode exports pages whose text and images can be freely edited in PowerPoint, with text styles (font size, color, bold, etc.) preserved as closely as possible.

The project's slogan is "Vibe your PPT like vibing code." It aims to make decks both fast to produce and good-looking, addressing the fixed templates, low flexibility, and homogeneous output of traditional AI PPT tools.

Target user groups:

Beginners: Zero-barrier generation of attractive PPTs with no design experience needed
PPT professionals: Reference AI-generated layouts and text-image compositions for design inspiration
Educators: Quickly convert teaching content into illustrated lesson plans
Students: Quickly complete assignment presentations, focusing on content rather than layout
Professionals: Rapidly visualize business proposals and product introductions

Author/Team Introduction

Organization: Anionex (GitHub)
Website: bananaslides.online
Community: WeChat group available in README; sponsors include AIHubMix, AI Huobao, Yuyun, etc.
Commercial license: Free for personal/educational/non-profit use under AGPL-3.0; commercial closed-source or private deployment requires a Commercial License from the author

Project Stats

⭐ GitHub Stars: 12.1k+
🍴 Forks: 1.4k+
📦 Version: v0.4.0 (February 2026)
📄 License: AGPL-3.0
🌐 Website: bananaslides.online
🐳 Docker: Supports amd64 / arm64 with pre-built images

Main Features

Core Purpose

banana-slides' core purpose is to quickly generate high-quality, editable PPTs driven by natural language and materials:

Multi-path creation: Start from a one-sentence idea, a structured outline, or page-by-page descriptions — AI auto-completes the outline and page content
Material parsing: Upload PDF, Docx, MD, Txt files; automatically extract key points, image links, and chart data as generation materials
Style customization: Upload reference images or templates to control the overall visual style
Vibe-style editing: Verbally edit with natural language (e.g., "change page three to a case study," "replace this chart with a pie chart") — AI responds in real time
Export: One-click export to PPTX or PDF; editable PPTX mode allows text and images to be freely modified in PowerPoint

Use Cases

Reports/Proposals: Presentation due tomorrow — input a topic or outline and quickly generate a professional PPT
Lesson plans: Upload teaching content or documents and auto-generate illustrated lesson plans
Assignment presentations: Students input their topic and focus on content rather than layout
Design inspiration: Professionals reference AI-generated layouts and compositions
Iterative refinement: Verbally request changes after generation, without going through menus

Quick Start

Recommended: Docker Compose Deployment

git clone https://github.com/Anionex/banana-slides
cd banana-slides
cp .env.example .env
# Edit .env, configure GOOGLE_API_KEY (or a proxy like AIHubMix)
docker compose -f docker-compose.prod.yml up -d

Access the frontend at http://localhost:3000, backend API at http://localhost:5000.

Sample environment variables (Gemini format):

AI_PROVIDER_FORMAT=gemini
GOOGLE_API_KEY=your-api-key-here
GOOGLE_API_BASE=https://generativelanguage.googleapis.com
# Proxy example: https://aihubmix.com/gemini
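
Before starting the containers, it can help to sanity-check the .env file. The following is a minimal sketch, not part of banana-slides: the helper names `parse_env` and `find_problems` are hypothetical, and treating `AI_PROVIDER_FORMAT` and `GOOGLE_API_KEY` as the required keys is an assumption based only on the sample above.

```python
from typing import Dict, List

# Assumption: these two keys are the minimum needed for the Gemini format.
REQUIRED_KEYS = ("AI_PROVIDER_FORMAT", "GOOGLE_API_KEY")
# Placeholder value shown in the sample environment variables.
PLACEHOLDER = "your-api-key-here"


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def find_problems(values: Dict[str, str]) -> List[str]:
    """Return human-readable problems; an empty list means the check passed."""
    problems: List[str] = []
    for key in REQUIRED_KEYS:
        if not values.get(key):
            problems.append(f"{key} is missing or empty")
    if values.get("GOOGLE_API_KEY") == PLACEHOLDER:
        problems.append("GOOGLE_API_KEY still holds the placeholder value")
    base = values.get("GOOGLE_API_BASE", "")
    if base and not base.startswith(("http://", "https://")):
        problems.append("GOOGLE_API_BASE should be an http(s) URL")
    return problems
```

Calling `find_problems(parse_env(open(".env").read()))` on an unedited copy of the sample would flag the placeholder key, which is the most common reason a first run fails to generate anything.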

From source: requires Python 3.10+, uv, and Node.js 16+ (note that Vite 5, which the frontend uses, itself requires Node.js 18 or newer). Backend: run uv sync, then uv run python app.py. Frontend: run npm install, then npm run dev.

Core Features

Three creation paths: Idea (one sentence generates outline and descriptions), Outline (manual or AI-generated), Page Description (per-page control)
Natural language editing: Verbally modify outlines or descriptions (e.g., "change page three to a case study") — AI adjusts in real time
Multi-format material parsing: PDF, Docx, MD, Txt upload; auto-parses key points, images, charts
Style references: Upload templates or reference images to customize PPT visual style
Local re-rendering: Select an unsatisfactory area and verbally describe the change (e.g., "replace this chart with a pie chart")
Full-page optimization: High-quality, visually consistent pages generated by nano banana pro
Multi-format export: PPTX, PDF, default 16:9, ready to present
Editable PPTX (Beta): Export high-fidelity, clean-background editable pages with text styles preserved; Baidu OCR API recommended for best results (see issue #121)
Multi-model support: Gemini, OpenAI, Vertex AI, Lazyllm (can mix DeepSeek, Doubao, Tongyi, etc.)
Internationalization and dark mode: Chinese/English toggle, light/dark/system theme
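
One way to picture the three creation paths from the list above is as different entry points into the same sequence of stages. The sketch below is illustrative only: the stage names and the `remaining_stages` helper are hypothetical and do not come from the banana-slides codebase.

```python
from enum import Enum
from typing import List


class EntryPath(Enum):
    IDEA = "idea"                # one sentence; AI writes outline and descriptions
    OUTLINE = "outline"          # outline is supplied; AI writes descriptions
    DESCRIPTION = "description"  # per-page descriptions are supplied


# Ordered stages of a generation run (illustrative names).
STAGES = ["generate_outline", "generate_descriptions", "render_pages", "export"]

# The first stage that still has to run for each entry path.
_FIRST_STAGE = {
    EntryPath.IDEA: "generate_outline",
    EntryPath.OUTLINE: "generate_descriptions",
    EntryPath.DESCRIPTION: "render_pages",
}


def remaining_stages(path: EntryPath) -> List[str]:
    """Return the stages that still need to run for the given entry path."""
    start = STAGES.index(_FIRST_STAGE[path])
    return STAGES[start:]
```

Under this view, the more the user supplies up front, the fewer AI text-generation stages run before page rendering.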

Project Advantages

Comparison with NotebookLM slide decks (taken from the official README; may change with updates):

Feature	NotebookLM	banana-slides
Page limit	15 pages	Unlimited
Re-editing	Not supported	Region selection edit + natural-language edit
Adding materials	Cannot add after generation	Add freely after generation
Export format	PDF only	PDF + Editable PPTX
Watermark	Free tier has watermarks	No watermark, freely add/remove elements

Why choose banana-slides?

True Vibe: Built on nano banana pro — good text-image quality and consistency, accurate text rendering and adherence to reference image styles
Flexible creation: Not locked into preset templates; upload any materials and templates, and make multi-round natural-language edits
Editable export: Exports PPTX files whose text and images can be edited in PowerPoint, rather than slides made only of flattened images
Open-source and self-hostable: One-click Docker deployment, including private deployments; supports multiple LLM APIs for cost and compliance control

Detailed Project Analysis

Technical Architecture

Frontend: React 18 + TypeScript + Vite 5, Zustand state management, React Router v6, Tailwind CSS, @dnd-kit drag-and-drop, Lucide React icons, Axios HTTP client.

Backend: Python 3.10+, Flask 3.0, uv package manager, SQLite + Flask-SQLAlchemy, Google Gemini API (or OpenAI/Vertex/Lazyllm), python-pptx for PPT handling, Pillow for image processing, ThreadPoolExecutor for concurrency, Flask-CORS for cross-origin.
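
To illustrate why a thread pool suits this workload, here is a minimal sketch of rendering several pages concurrently with ThreadPoolExecutor. It is not the project's code: `render_all` and `fake_render` are hypothetical names, and `fake_render` merely stands in for a slow, network-bound image generation call.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def render_all(
    descriptions: List[str],
    render_page: Callable[[str], bytes],
    max_workers: int = 4,
) -> List[bytes]:
    """Render every page description concurrently and keep slide order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, even if later
        # pages finish rendering before earlier ones.
        return list(pool.map(render_page, descriptions))


def fake_render(description: str) -> bytes:
    """Hypothetical stand-in for a call to an image generation API."""
    return f"image for: {description}".encode("utf-8")
```

Threads are a reasonable fit here because each call mostly waits on the network, so several requests can be in flight at once without the pages ending up out of order.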

AI Capabilities: Both text generation (outlines, descriptions, etc.) and image generation (page rendering) depend on LLMs; the core image generation is nano banana pro, requiring an API that supports image generation (Gemini free tier only supports text, not images).

Project Structure

frontend/: React app; pages/ contains Home, OutlineEditor, DetailEditor, SlidePreview, History; components/ contains outline, preview, shared, layout, history; store/ is Zustand; api/ is interface wrappers
backend/: Flask app; models/ contains Project, Page, Task, Material, UserTemplate, ReferenceFile, PageImageVersion; services/ contains ai_service, file_service, file_parser_service, export_service, task_manager, prompts; controllers/ is REST API
tests/: Tests; v0_demo/: Early demo; output/: Exported files
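
The model names above suggest that each page keeps a history of rendered images. The sketch below shows one plausible shape for that relationship using plain dataclasses. Only the class names Page and PageImageVersion come from the list above; every field and method is an assumption made for illustration, and the real project uses Flask-SQLAlchemy models instead.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PageImageVersion:
    version: int                       # 1-based, increases with each re-render
    image_path: str                    # hypothetical location of the rendered image
    edit_prompt: Optional[str] = None  # edit request that produced this version


@dataclass
class Page:
    description: str
    versions: List[PageImageVersion] = field(default_factory=list)

    def add_version(self, image_path: str, edit_prompt: Optional[str] = None) -> PageImageVersion:
        """Append a new render; earlier versions stay available for rollback."""
        new = PageImageVersion(len(self.versions) + 1, image_path, edit_prompt)
        self.versions.append(new)
        return new

    @property
    def current(self) -> Optional[PageImageVersion]:
        """The most recent render, or None if the page has not been rendered."""
        return self.versions[-1] if self.versions else None
```

Keeping every version instead of overwriting the image would let a user undo an edit that made a page worse.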

Key Implementation

Creation pipeline: Idea → AI generates outline → generates per-page descriptions → calls nano banana pro to generate page images → assembles PPT
Material parsing: file_parser_service parses PDF/Docx/MD/Txt, extracting text, images, charts for use during generation
Editable PPTX: Text in generated images is recognized via OCR and restored as editable text boxes, preserving font size, color, bold, and other styles as much as possible; works best with the Baidu OCR API configured (see issue #121)
Multi-model support: Via AI_PROVIDER_FORMAT and Lazyllm configuration, mix text and image models from different vendors
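
For the Editable PPTX step, one piece of arithmetic is unavoidable: OCR reports text positions in image pixels, while PPTX files position shapes in English Metric Units (EMU, 914,400 per inch). The sketch below shows that conversion under stated assumptions. The `PixelBox`, `EmuBox`, and `to_emu` names are hypothetical, the slide size is PowerPoint's standard 16:9 widescreen (13.333 in x 7.5 in), and the project's actual export code may work differently.

```python
from dataclasses import dataclass

EMU_PER_INCH = 914_400
# Assumption: standard 16:9 widescreen slide, 7.5 inches tall.
SLIDE_HEIGHT_EMU = int(7.5 * EMU_PER_INCH)   # 6,858,000
SLIDE_WIDTH_EMU = SLIDE_HEIGHT_EMU * 16 // 9  # 12,192,000


@dataclass(frozen=True)
class PixelBox:
    """Bounding box of recognized text, in pixels of the page image."""
    left: int
    top: int
    width: int
    height: int


@dataclass(frozen=True)
class EmuBox:
    """The same box expressed in slide coordinates (EMU)."""
    left: int
    top: int
    width: int
    height: int


def to_emu(box: PixelBox, image_width: int, image_height: int) -> EmuBox:
    """Scale a pixel box so that the full image maps onto the full slide."""
    if image_width <= 0 or image_height <= 0:
        raise ValueError("image dimensions must be positive")
    scale_x = SLIDE_WIDTH_EMU / image_width
    scale_y = SLIDE_HEIGHT_EMU / image_height
    return EmuBox(
        left=round(box.left * scale_x),
        top=round(box.top * scale_y),
        width=round(box.width * scale_x),
        height=round(box.height * scale_y),
    )
```

The resulting EMU values are what a library such as python-pptx expects for a text box's position and size, so a text box placed this way would sit over the same region the text occupied in the generated image.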

Development Roadmap (Selected)

✅ Completed: Three creation paths, Markdown image parsing, single-page material addition, selection Vibe editing, various file parsing, editable PPTX export
🔄 In progress: Multi-layer precise cutout editable export, web search, Agent mode
🧭 Planned: Online playback, animations and page transitions, multi-language support
🏢 Commercial: User system

Project Resources

Official Resources

🌟 GitHub: github.com/Anionex/banana-slides
🌐 Website: bananaslides.online
📄 English README: README_EN.md
🐛 Issues: GitHub Issues
📋 Editable PPTX notes: issue #121
📖 Beginner deployment tutorial: @ShellMonster's tutorial

Related Resources

AIHubMix (multi-model API proxy, reduces migration cost)
Baidu Cloud OCR (editable export optimization)
uv (Python package manager)
nano banana pro (Google Gemini image generation)

Who Should Use This

Users who need to create PPTs quickly: Reports, proposals, lesson plans, assignment presentations
Creators who want "fast and beautiful": Don't want to be constrained by fixed templates; need multi-round natural language edits
Technical teams: Want to self-host AI PPT services and control data and costs
Developers interested in Vibe PPT and the nano banana ecosystem: Learn full-stack AI application architecture and multi-model integration

Visit my personal homepage for more useful knowledge and interesting products.
