DEV Community

Manthan Ankolekar

Building an AI-Powered Image Editor with Google's Gemini API

Introduction

AI is changing image editing by letting users modify images with plain-text prompts. In this post, I'll walk through how I built the Gemini Image Editor, a Node.js application that uses Google's Gemini API to edit images from a text description.

This project allows users to upload an image, describe modifications, and receive an AI-enhanced version.

Project Overview

The Gemini Image Editor is a REST API with the following features:

  • Image upload with prompt-based modification.
  • Google Gemini API integration for AI-powered editing.
  • File upload handling with Multer.
  • An Express.js backend with easy-to-use API endpoints.

Tech Stack

  • Node.js - Backend runtime
  • Express.js - Web framework
  • Google Generative AI SDK - Image modification
  • Multer - File upload handling
  • dotenv - Environment variables

Getting Started

1. Clone the Repository

git clone https://github.com/manthanank/gemini-image-editor.git
cd gemini-image-editor

2. Install Dependencies

npm install

3. Configure Environment Variables

Create a .env file and add your Google Gemini API key:

GEMINI_API_KEY=your_google_gemini_api_key
PORT=5000
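The server code later in this post imports `port` and `geminiApiKey` from `./config/env`, a file that isn't shown here. As a rough sketch of what such a module might do, the snippet below reads and validates the two variables from the `.env` example. The `loadConfig` name and the validation rules are assumptions for illustration, not the repository's actual code.

```javascript
// Hypothetical sketch of config/env.js; the real file is not shown in this post.
// In the actual app, require("dotenv").config() would run first so that
// values from .env are copied into process.env.

function loadConfig(env = process.env) {
  const geminiApiKey = (env.GEMINI_API_KEY || "").trim();
  if (!geminiApiKey) {
    // Fail fast at startup instead of on the first Gemini request
    throw new Error("GEMINI_API_KEY is not set");
  }

  // Fall back to 5000, the port used in the .env example
  const port = Number.parseInt(env.PORT ?? "5000", 10);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }

  return { port, geminiApiKey };
}

// Example with explicit values instead of process.env
const config = loadConfig({ GEMINI_API_KEY: "your_google_gemini_api_key", PORT: "5000" });
console.log(config.port);
```

Validating at load time means a missing key shows up as a clear startup error rather than a failed Gemini call on the first upload.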

4. Start the Server

npm start

Your server will run at http://localhost:5000 🚀


API Endpoints

Modify an Image

📌 Endpoint: POST /api/image/modify

📌 Request Body (multipart/form-data):

  • prompt (string): Modification instructions
  • image (file): The image to modify

📌 Response:

{
  "message": "Image modified successfully",
  "imagePath": "uploads/modified_1710342456789.png"
}
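To make the request format concrete, here is a minimal client sketch that calls the endpoint from Node.js. It assumes Node.js 18+ (for the built-in `fetch`, `FormData`, and `Blob`) and uses the `prompt` and `image` field names listed above. The helper names `buildModifyForm` and `requestModification` are made up for this example.

```javascript
// Hypothetical client for POST /api/image/modify; needs Node.js 18+ for the
// built-in fetch, FormData, and Blob globals.
const fs = require("node:fs/promises");
const path = require("node:path");

// Build the multipart body with the two fields the endpoint expects.
// mimeType defaults to PNG; pass the real type for JPEG or WebP files.
function buildModifyForm(prompt, imageBytes, filename, mimeType = "image/png") {
  const form = new FormData();
  form.append("prompt", prompt);
  form.append("image", new Blob([imageBytes], { type: mimeType }), filename);
  return form;
}

async function requestModification(baseUrl, prompt, imagePath) {
  const imageBytes = await fs.readFile(imagePath);
  const form = buildModifyForm(prompt, imageBytes, path.basename(imagePath));

  // fetch sets the multipart Content-Type header (with boundary) itself
  const response = await fetch(`${baseUrl}/api/image/modify`, {
    method: "POST",
    body: form,
  });

  const body = await response.json();
  if (!response.ok) {
    throw new Error(body.error || `Request failed with status ${response.status}`);
  }
  return body.imagePath;
}

// Usage (assumes the server from this post is running locally):
// requestModification("http://localhost:5000", "Make the sky purple", "./photo.png")
//   .then((imagePath) => console.log("Saved to", imagePath));
```

Note that `imagePath` in the response is a path on the server's disk, so a client on another machine would still need a way to download the file.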

Project Structure

gemini-image-editor/
├── config/            # Environment config (env.js)
├── controllers/       # Business logic
├── middleware/        # Multer file upload
├── routes/            # API endpoints
├── services/          # Google Gemini AI logic
├── uploads/           # Stores images
├── app.js             # Express app setup
├── server.js          # Entry point
├── package.json       # Dependencies
└── .env               # Environment variables

Core Implementation

1. Setting Up the Express Server

The server.js file initializes the app and ensures the uploads/ directory exists:

const app = require("./app");
const { port } = require("./config/env");
const fs = require("fs");

// Ensure uploads directory exists
if (!fs.existsSync("uploads")) {
  fs.mkdirSync("uploads");
}

app.listen(port, () => {
  console.log(`Server running on http://localhost:${port}`);
});

2. Handling Image Upload & Modification

The imageController.js file validates the request, calls the Gemini service, and returns the path of the modified image:

const { modifyImage } = require("../services/geminiService");

async function modifyImageController(req, res) {
  const { prompt } = req.body;
  const imageFile = req.file;

  if (!prompt || !imageFile) {
    return res.status(400).json({ error: "Prompt and image are required" });
  }

  try {
    const modifiedImagePath = await modifyImage(prompt, imageFile.path);
    res.status(200).json({ message: "Image modified successfully", imagePath: modifiedImagePath });
  } catch (error) {
    res.status(500).json({ error: error.message });
  }
}

module.exports = { modifyImageController };
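The controller has three outcomes: 400 for a missing prompt or image, 200 on success, and 500 when the service throws. One way to see them without starting Express or calling Gemini is to pass the service function in as an argument. The sketch below does that; `makeModifyImageController`, `fakeResponse`, and `demo` are hypothetical names introduced for this illustration, and the stubbed service stands in for the real `modifyImage`.

```javascript
// Sketch: same logic as the controller above, but the service function is
// passed in so the handler can be exercised without Express or a Gemini key.
function makeModifyImageController(modifyImage) {
  return async function modifyImageController(req, res) {
    const { prompt } = req.body || {};
    const imageFile = req.file;

    if (!prompt || !imageFile) {
      return res.status(400).json({ error: "Prompt and image are required" });
    }

    try {
      const imagePath = await modifyImage(prompt, imageFile.path);
      return res.status(200).json({ message: "Image modified successfully", imagePath });
    } catch (error) {
      return res.status(500).json({ error: error.message });
    }
  };
}

// Minimal stand-in for Express's res object: records the status and JSON body
function fakeResponse() {
  return {
    statusCode: null,
    body: null,
    status(code) { this.statusCode = code; return this; },
    json(payload) { this.body = payload; return this; },
  };
}

async function demo() {
  const file = { path: "uploads/1.png" };
  const ok = makeModifyImageController(async () => "uploads/modified_stub.png");
  const failing = makeModifyImageController(async () => { throw new Error("Failed to modify image"); });

  const missingImage = fakeResponse();
  await ok({ body: { prompt: "Add a hat" } }, missingImage);

  const success = fakeResponse();
  await ok({ body: { prompt: "Add a hat" }, file }, success);

  const failure = fakeResponse();
  await failing({ body: { prompt: "Add a hat" }, file }, failure);

  return { missingImage, success, failure };
}

demo().then(({ missingImage, success, failure }) => {
  console.log(missingImage.statusCode, success.statusCode, failure.statusCode);
});
```

Injecting the service is a design choice for testability only; the request handling itself is unchanged from the controller shown above.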

3. Multer Middleware for File Uploads

The uploadMiddleware.js file sets up Multer to store images in uploads/:

const multer = require("multer");
const path = require("path");

const storage = multer.diskStorage({
  destination: (req, file, cb) => {
    cb(null, "uploads/");
  },
  filename: (req, file, cb) => {
    cb(null, Date.now() + path.extname(file.originalname));
  },
});

const upload = multer({ storage });

module.exports = upload;
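The middleware above accepts any file type, and two uploads arriving in the same millisecond would get the same `Date.now()` filename. A possible hardening, sketched below, adds a file filter and a collision-resistant filename. `fileFilter` and `limits.fileSize` are standard Multer options; the allow-list, the 5 MB cap, and the helper names `imageFileFilter` and `uniqueFilename` are assumptions chosen for this example.

```javascript
const crypto = require("node:crypto");
const path = require("node:path");

// Assumed allow-list; adjust to the formats you want to accept
const ALLOWED_MIME_TYPES = new Set(["image/png", "image/jpeg", "image/webp"]);

// Matches Multer's fileFilter signature: cb(null, true) accepts the file,
// cb(error) rejects the request with that error
function imageFileFilter(req, file, cb) {
  if (ALLOWED_MIME_TYPES.has(file.mimetype)) {
    cb(null, true);
  } else {
    cb(new Error(`Unsupported file type: ${file.mimetype}`));
  }
}

// A random UUID avoids collisions when two uploads arrive in the same millisecond
function uniqueFilename(originalName) {
  return crypto.randomUUID() + path.extname(originalName).toLowerCase();
}

// How these might plug into the middleware above (not run here):
// const storage = multer.diskStorage({
//   destination: (req, file, cb) => cb(null, "uploads/"),
//   filename: (req, file, cb) => cb(null, uniqueFilename(file.originalname)),
// });
// const upload = multer({
//   storage,
//   fileFilter: imageFileFilter,
//   limits: { fileSize: 5 * 1024 * 1024 }, // 5 MB cap, an arbitrary example value
// });
```

The MIME type Multer reports comes from the client's request, so this filter is a first line of defense rather than proof that the file is really an image.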

4. Google Gemini API Integration

The geminiService.js file sends the prompt and the base64-encoded image to Gemini, then saves the first image the model returns:

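const { GoogleGenerativeAI } = require("@google/generative-ai");
const fs = require("fs");
const path = require("path");
const { geminiApiKey } = require("../config/env");

const genAI = new GoogleGenerativeAI(geminiApiKey);

// Map upload extensions to MIME types so JPEG/WebP uploads are not mislabeled as PNG
const MIME_TYPES = {
  ".png": "image/png",
  ".jpg": "image/jpeg",
  ".jpeg": "image/jpeg",
  ".webp": "image/webp",
};

async function modifyImage(prompt, imagePath) {
  // Gemini expects inline image data as a base64 string
  const base64Image = fs.readFileSync(imagePath).toString("base64");
  const mimeType = MIME_TYPES[path.extname(imagePath).toLowerCase()] || "image/png";

  const contents = [
    { text: prompt },
    { inlineData: { mimeType, data: base64Image } },
  ];

  // Experimental model that can return image parts alongside text
  const model = genAI.getGenerativeModel({
    model: "gemini-2.0-flash-exp-image-generation",
    generationConfig: {
      responseModalities: ["Text", "Image"],
    },
  });

  let result;
  try {
    result = await model.generateContent(contents);
  } catch (error) {
    console.error("Error modifying image:", error);
    throw new Error("Failed to modify image");
  }

  // Save the first image part the model returns
  const parts = result.response.candidates?.[0]?.content?.parts ?? [];
  for (const part of parts) {
    if (part.inlineData) {
      const buffer = Buffer.from(part.inlineData.data, "base64");
      const outputPath = `uploads/modified_${Date.now()}.png`;
      fs.writeFileSync(outputPath, buffer);
      return outputPath;
    }
  }

  // Without this, the function would return undefined and the controller would report success
  throw new Error("Gemini did not return an image");
}

module.exports = { modifyImage };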
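Because the model is configured to return both text and image parts, a response may contain only text, for example when the model declines the edit. The sketch below separates the two so the text can be logged or returned alongside an error. It assumes the response shape the service already reads (`candidates[0].content.parts`); `extractParts` and `sampleResponse` are hypothetical names, and the sample object is hand-written rather than real API output.

```javascript
// Sketch: split a generateContent response into its first image and its text.
function extractParts(response) {
  const parts = response?.candidates?.[0]?.content?.parts ?? [];
  let image = null;
  const text = [];

  for (const part of parts) {
    if (part.inlineData && image === null) {
      // Keep only the first image, decoded from base64
      image = {
        mimeType: part.inlineData.mimeType,
        buffer: Buffer.from(part.inlineData.data, "base64"),
      };
    } else if (typeof part.text === "string") {
      text.push(part.text);
    }
  }

  return { image, text: text.join("\n") };
}

// Hand-written stand-in for a real response, for illustration only
const sampleResponse = {
  candidates: [
    {
      content: {
        parts: [
          { text: "Here is your edited image." },
          { inlineData: { mimeType: "image/png", data: Buffer.from("fake-bytes").toString("base64") } },
        ],
      },
    },
  ],
};

const extracted = extractParts(sampleResponse);
console.log(extracted.text, extracted.image.buffer.length);
```

When `image` comes back as `null`, the collected `text` is often the most useful thing to include in the error returned to the client.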

Conclusion

With Node.js, Express, Multer, and the Google Gemini API, we've built an AI-powered image editor that lets users upload an image, describe the changes in a text prompt, and receive the modified version. 🚀

🔹 Potential Enhancements:

🔹 Add a frontend UI with Angular for an interactive user experience.

🔹 Support cloud storage for better image management.

🔹 Allow multiple modifications in one request.

👉 Ready to explore AI-powered image editing? Try the GitHub repository: https://github.com/manthanank/gemini-image-editor
