Lewis Munyi

Posted on • Originally published at lewismunyi.hashnode.dev

Fun with Avatars: Crafting the core engine | Part. 1

Introduction

In this article series, I will walk you through the development of a service that generates random avatars using AI. The service will take specific features and parameters as input and, based on these attributes, deliver a unique avatar to the user. Notably, this project is open source and available on GitHub. Feel free to fork it, contribute, and enhance its capabilities.

GitHub repository: lewis-munyi/avatars-as-a-service ("A fun approach to generating avatars using AI"), the accompanying code for this series.
We'll build the service in Python to keep it simple and accessible; ports to other languages may follow later for broader reach.

Pre-requisites

We will be running this project in Python 3.10 on Mac/Linux, and we will use Poetry to manage our dependencies. Later, we will bundle our app into a container using Docker for deployment.

Before diving into the project, ensure you have the following set up:

  1. Python 3.10 installed on Mac/Linux/Windows. Follow these instructions to set it up on your OS.

  2. Poetry installed for managing dependencies. Refer to this article for installation instructions.

Project Design

This is a simple overview of what the full project entails:

a picture showing the full project plan

This project will be broken down into 4 parts:

  • Part 1: Creating the project, establishing the API, and developing the avatar generation module.

  • Part 2: Containerizing the application for deployment.

  • Part 3: Optimizing the service for cost-effectiveness.

  • Part 4: Integrating the service into real-life projects.

Create project

Open your terminal and run the following command to create our working folder and navigate to it.

poetry new avatars-as-a-service && cd avatars-as-a-service

That will create a directory avatars-as-a-service with the necessary files.

.
├── LICENSE
├── README.md
└── avatars-as-a-service
    ├── README.rst
    ├── avatars_as_a_service
    │   └── __init__.py
    ├── pyproject.toml
    └── tests
        ├── __init__.py
        └── test_avatars_as_a_service.py

Optionally, initialize a git repository for version tracking. This will create a .git folder in our project if there is none already.

git init

Start a shell session in our virtual environment by running:

poetry shell

Understanding FastAPI

We will create our API using FastAPI, a modern high-performance web framework for building fast APIs with Python. It is designed to be easy to use, efficient, and highly scalable. Some key features of FastAPI include:

  1. Speed: FastAPI is built on Starlette, an asynchronous web framework, allowing it to handle high loads with speed and efficiency.

  2. Type Annotations: FastAPI utilizes Python's type hinting system to provide automatic request/response validation, resulting in enhanced reliability and fewer bugs.

  3. API Documentation: FastAPI generates interactive documentation with Swagger UI and ReDoc, making exploring and understanding API endpoints effortless.

  4. Security: FastAPI supports various authentication methods, including OAuth2, API key validation, and JWT tokens, enabling secure API development.

  5. Asynchronous Support: FastAPI is designed to take advantage of Python's async and await syntax, enabling efficient handling of I/O-bound operations.

FastAPI applications are typically served with Uvicorn, an ASGI (Asynchronous Server Gateway Interface) web server implementation for Python.

Install them and other required dependencies by running:

poetry add fastapi 'uvicorn[standard]'

Next, create a main.py file at the root of our project with the following code.

from fastapi import FastAPI

app = FastAPI()

@app.get("/")
async def root():
    return {"message": "Hello World!"}

Start the Uvicorn server by running:

uvicorn main:app --reload

The --reload flag ensures that the server restarts whenever the code changes.

Once the server is up, navigate to http://127.0.0.1:8000 to see the following JSON response.

{"message": "Hello World!"}

Visit http://127.0.0.1:8000/docs for the Swagger-powered API documentation. We will also use it to interact with our API.

a screenshot showing the default swagger page on the browser
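To see FastAPI's type-hint-driven validation (feature 2 above) in action, you can optionally append a small throwaway route like the one below to main.py. It is just an illustration and not part of the project:

@app.get("/items/{item_id}")
async def read_item(item_id: int, q: str | None = None):
    # FastAPI parses item_id into an int before calling this function and
    # returns a 422 validation error for non-numeric values (try /items/abc).
    # q is an optional query parameter that defaults to None.
    return {"item_id": item_id, "q": q}

The Swagger page picks the new route up automatically, including its parameter types.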

Create Serializers

The avatar properties will be drawn from a fixed set of choices, which keeps them easy to validate and work with. Let's create the file that will hold the enums we need: in ./avatars_as_a_service/serializers/, create a file called AvatarFeatures.py.

mkdir ./avatars_as_a_service/serializers/ && touch ./avatars_as_a_service/serializers/AvatarFeatures.py

Add the following code to it:

from enum import Enum

# Will be used to specify whether to create a cartoon-y avatar or something less fun :/
class Mood(Enum):
    FUN = "fun"
    OFFICIAL = "official"

class HeadShape(Enum):
    OVAL = 'oval'
    ROUND = 'round'
    SQUARE = 'square'
    HEART = 'heart'
    TRIANGULAR = 'triangular'

class SkinTone(Enum):
    FAIR = 'fair'
    MEDIUM = 'medium'
    DARK = 'dark'

class SmileType(Enum):
    WIDE = 'wide'
    FULL = 'full'
    CLOSED_LIP = 'closed-lip'
    OPEN_LIP = 'open-lip'
    TEETH_BARING = 'teeth-baring'
    SMIRK = 'smirk'

class NoseType(Enum):
    STRAIGHT = 'straight'
    ROMAN = 'roman'
    BUTTON = 'button'
    SNUB = 'snub'
    WIDE = 'wide'
    NARROW = 'narrow'

class EyeColor(Enum):
    BROWN = 'brown'
    BLUE = 'blue'
    AMBER = 'amber'
    HAZEL = 'hazel'
    GREEN = 'green'
    GREY = 'grey'
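A quick aside before moving on: pydantic resolves enum fields by value, so the lowercase strings above are exactly what API clients will send later. A minimal, purely illustrative check from a Python shell at the project root:

from avatars_as_a_service.serializers.AvatarFeatures import Mood, EyeColor

# Enum members are looked up by their string values, which is how the
# request body fields will be parsed into these enums later on.
assert Mood('fun') is Mood.FUN
assert EyeColor('hazel') is EyeColor.HAZEL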

Next, create the Avatar serializer. This file will hold the Avatar properties as well as the request, response, and result classes.

touch ./avatars_as_a_service/serializers/Avatar.py

Add the following code to it:

import os

from pydantic import BaseModel
from typing import Union
from openai import OpenAI
from dotenv import load_dotenv

from avatars_as_a_service.serializers.AvatarFeatures import Mood, HeadShape, EyeColor, SkinTone, SmileType, NoseType

load_dotenv()

class Avatar(BaseModel):
    skin_tone: Union[SkinTone, None]
    head_shape: Union[HeadShape, None] = None
    eye_color: Union[EyeColor, None] = None
    smile_type: Union[SmileType, None] = None
    nose_type: Union[NoseType, None] = None
    glasses: Union[bool, None] = False
    mood: Union[Mood, None] = Mood.FUN
    description: Union[str, None] = None

    # Method used to generate a prompt string from the various properties supplied
    def generate_prompt(self) -> str:
        if self.description and self.description != '':  # If a description is provided then it will override the other Avatar properties
            return self.description

        prompt = 'create an avatar for a person with the following description: '

        if self.skin_tone:
            prompt += f'{self.skin_tone.value} skin color/complexion, '

        if self.head_shape:
            prompt += f'a {self.head_shape.value} head, '

        if self.eye_color:
            prompt += f'{self.eye_color.value} eyes, '

        if self.smile_type:
            prompt += f'a {self.smile_type.value} smile, '

        if self.nose_type:
            prompt += f'a {self.nose_type.value} nose, '

        if self.glasses:
            prompt += f'and with a pair of glasses, '

        if self.mood:
            if self.mood.value == 'fun':
                prompt += f'and make it fun and/or cartoon-y.'
            else:
                prompt += f'and make it official and/or office appropriate.'

        return prompt

    def dall_e_2_search(self):
        try:
            client = OpenAI(api_key=os.getenv('OPENAI_API_KEY'))
            response = client.images.generate(
                model="dall-e-2",
                prompt=self.generate_prompt(),
                size="256x256",
                quality="standard",
                n=1,
            )

            res = AvatarResult()
            res.image_url = response.data[0].url
            return res

        except Exception as e:
            print(str(e))

class AvatarResult(BaseModel):
    image_url: Union[str, None] = None

class AvatarRequest(BaseModel):
    properties: Avatar

class AvatarResponse(BaseModel):
    data: Union[AvatarResult, None] = None
    prompt: Union[str, None] = None

We will be using DALL·E 2 to generate our avatars. Don't forget to grab an OpenAI API key and write it into a .env file at the root of the project by running the command below:

echo "OPENAI_API_KEY=<paste-api-key-here>" > .env

Install OpenAI's Python library, along with python-dotenv (which provides the load_dotenv helper used above), by running:

poetry add openai python-dotenv
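With the dependencies in place, you can sanity-check generate_prompt from a Python shell at the project root without calling the OpenAI API at all. This is just a quick sketch; the exact output depends on which fields you set:

from avatars_as_a_service.serializers.Avatar import Avatar
from avatars_as_a_service.serializers.AvatarFeatures import SkinTone, EyeColor, Mood

# Build an Avatar from a few enum values and inspect the prompt it produces.
avatar = Avatar(skin_tone=SkinTone.MEDIUM, eye_color=EyeColor.HAZEL, glasses=True, mood=Mood.FUN)
print(avatar.generate_prompt())
# create an avatar for a person with the following description: medium skin color/complexion,
# hazel eyes, and with a pair of glasses, and make it fun and/or cartoon-y.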

Create web routes

We're now ready to create our first web route. Open the main.py file and replace its contents with the following code:

from fastapi import FastAPI
from avatars_as_a_service.serializers.Avatar import AvatarRequest, AvatarResponse

app = FastAPI()

@app.post("/query")
def search(req: AvatarRequest) -> AvatarResponse:
    avatar = req.properties

    res = AvatarResponse()
    res.data = avatar.dall_e_2_search()  # Ask OpenAI for the generated image URL
    res.prompt = avatar.generate_prompt()

    return res

The route above takes an AvatarRequest object and returns an AvatarResponse object. The AvatarRequest object takes in the following parameters:

// AvatarRequest
{
  "properties": {
    "skin_tone": "",
    "head_shape": "",
    "eye_color": "",
    "smile_type": "",
    "nose_type": "",
    "glasses": bool,
    "mood": "",
    "description": "" // Optional
  }
}
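Every string field takes one of the enum values defined in AvatarFeatures.py (for example "medium" for SkinTone.MEDIUM). As an illustration, a filled-in request might look like this:

// Example AvatarRequest
{
  "properties": {
    "skin_tone": "medium",
    "head_shape": "oval",
    "eye_color": "hazel",
    "smile_type": "wide",
    "nose_type": "button",
    "glasses": true,
    "mood": "fun"
  }
}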

The AvatarResponse object returns the following:

// AvatarResponse
{
  "data": {
    "image_url": "string" // Our avatar
  },
  "prompt": "string" // The prompt that was sent to OpenAI
}


App testing

Once that's done, our app is ready for testing. Head back to the API docs at http://127.0.0.1:8000/docs, expand the /query endpoint, and click 'Try it out' to customize the request object.

a screenshot showing a request object

a screenshot showing the response from the previous request

The description we've entered overrides all the selected (default) avatar properties. This is the image generated for the prompt (description) 'spongebob laughing':

a screenshot showing the actual image of spongebob that's been generated

But that's not quite the type of avatar we want. Let's build one with the correct properties.

a screenshot showing a correct request with avatar features

The above query does run correctly, and it returns an AI-generated avatar. Yay! 🎉

a picture showing the generated avatar
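If you prefer exercising the endpoint from code rather than the Swagger UI, here is a rough, optional sketch using FastAPI's TestClient. It assumes you add httpx (which TestClient relies on) with poetry add httpx, and note that it triggers a real DALL·E request just like the Swagger call:

from fastapi.testclient import TestClient

from main import app

client = TestClient(app)

# The same kind of payload we sent through the Swagger UI, as a plain dict.
payload = {
    "properties": {
        "skin_tone": "medium",
        "eye_color": "hazel",
        "glasses": True,
        "mood": "fun",
    }
}

response = client.post("/query", json=payload)
print(response.status_code)        # 200
print(response.json()["prompt"])   # the prompt that was sent to OpenAI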

Conclusion

We've finally reached the end of Part 1. Our app creates fun avatars based on the features provided. In Part 2, we will package the app into a Docker container for easy deployment and distribution.

Cheers!
