Byron Salty

Adding DALL-E to your Elixir app

I recently updated my little image generation game, Teleprompt, to use DALL-E 3 instead of Stable Diffusion as the image engine.

If you're curious about the reasoning, overall product and architecture, check out my previous, high-level article with those details - as well as some cool pictures of dragons!

In this article, I will give you the few pieces of code you need to create your own DALL-E 3 integration with Elixir.

...

Step 1 - Set up OpenAI

DALL-E is part of the OpenAI product suite, under Image Generation.

Create an account with OpenAI; in the left nav bar you'll see an option for API Keys.

From there, select "Create new secret key" and copy the value, which will look like sk-....

I like to add this to a .env file to source and make available to my project:

export OPENAI_API_KEY=sk-...u8pj
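
Once the variable is exported, the application still has to read it. Here is a minimal sketch of one way to do that at runtime, using only the standard library. The module name OpenAIConfig and its functions are hypothetical and are not part of Teleprompt.

```elixir
defmodule OpenAIConfig do
  # Hypothetical helper module, not part of Teleprompt.

  # Reads the key at runtime. System.fetch_env!/1 raises if the
  # variable is not set, so a missing key fails loudly on first use.
  def api_key do
    System.fetch_env!("OPENAI_API_KEY")
  end

  # Headers for a JSON request authenticated with a bearer token.
  def headers do
    [
      {"Content-Type", "application/json"},
      {"Authorization", "Bearer #{api_key()}"}
    ]
  end
end
```

Keep in mind that a module attribute is evaluated at compile time, so a key stored in one is fixed when the code is built. Reading it inside a function, as above, picks up the value present when the app runs.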

Step 2 - API request

In Teleprompt, I set up a GenServer in order to call the OpenAI API asynchronously. Depending on your use case you might not need this complexity, but I felt it was important for a webapp to be more event driven and then use PubSub to inform listeners when the image generation was complete.

defmodule Teleprompt.GenerationHandler do
  use GenServer

  ...

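  # listener_code identifies who to notify once the image is ready
  def start_generating(prompt, listener_code) do
    # hand the work to the GenServer; cast returns immediately
    GenServer.cast(__MODULE__, {:generate, prompt, listener_code})
  end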

  @impl true
  def handle_cast({:generate, prompt, listener_code}, _state) do
    endpoint = "https://api.openai.com/v1/images/generations"
    openai_api_key = @openai_key
    {model, size} = {"dall-e-3", "1024x1024"}
    # {model, size} = {"dall-e-2", "512x512"}

    data =
      %{
        "model" => model,
        "size" => size,
        "quality" => "standard",
        "n" => 1,
        "prompt" => prompt
      }
      |> Jason.encode!()

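    # image generation can take a while, so allow up to 30 seconds;
    # the post! call below blocks this GenServer until the response arrives
    opts = [recv_timeout: 30_000, timeout: 30_000]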

    response =
      HTTPoison.post!(
        endpoint,
        data,
        [
          {"Content-Type", "application/json"},
          {"Authorization", "Bearer #{openai_api_key}"}
        ],
        opts
      )

    response_body = 
      response
      |> Map.get(:body)
      |> Jason.decode!()

    # url in body: body["data"][0]["url"]
    url = response_body |> get_in(["data", Access.at(0), "url"])

  ...

Here we make the POST request with mostly default parameters. I did try out DALL-E 2 as well, which takes a somewhat different set of parameters and options: it offers smaller sizes such as 512x512, while DALL-E 3 starts at 1024x1024.
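
To make the model differences concrete, here is a rough sketch of a request-parameter builder that rejects a size the chosen model does not accept. The module ImageParams is hypothetical. The size lists and the rule that "quality" applies only to DALL-E 3 are taken from the OpenAI API reference as I understand it, so check them against the current docs before relying on them.

```elixir
defmodule ImageParams do
  # Hypothetical helper, not part of Teleprompt.
  # Size lists are an assumption based on the OpenAI API reference.
  @sizes %{
    "dall-e-2" => ["256x256", "512x512", "1024x1024"],
    "dall-e-3" => ["1024x1024", "1792x1024", "1024x1792"]
  }

  # Returns {:ok, params} or an {:error, reason} tuple.
  def build(model, size, prompt) do
    case Map.fetch(@sizes, model) do
      {:ok, sizes} ->
        if size in sizes do
          {:ok, params(model, size, prompt)}
        else
          {:error, {:unsupported_size, size, sizes}}
        end

      :error ->
        {:error, {:unknown_model, model}}
    end
  end

  # "quality" is only sent for dall-e-3.
  defp params("dall-e-3", size, prompt) do
    %{"model" => "dall-e-3", "size" => size, "quality" => "standard", "n" => 1, "prompt" => prompt}
  end

  defp params(model, size, prompt) do
    %{"model" => model, "size" => size, "n" => 1, "prompt" => prompt}
  end
end
```

The returned map can be passed to Jason.encode!() exactly like the hand-written map in the handler.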

One parameter I was a bit surprised to see missing is a seed, so I don't know if DALL-E allows you to create repeatable results.

You can see the OpenAI API reference here.

I also used the default response type, which is a URL to a public image. I could instead have set response_format to b64_json and received the file contents base64-encoded in the JSON response.
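
As an illustration of handling either response type, here is a small sketch that works on the already-decoded JSON body (for example, the result of Jason.decode!). The module ImageResponse and its return tuples are hypothetical. The error clause assumes OpenAI's usual error shape, a top-level "error" object with a "message" field.

```elixir
defmodule ImageResponse do
  # Hypothetical helper, not part of Teleprompt.
  # Takes the decoded JSON body (a map) and returns a tagged result.

  # Default response type: a URL to the generated image.
  def extract(%{"data" => [%{"url" => url} | _]}) when is_binary(url) do
    {:ok, {:url, url}}
  end

  # response_format "b64_json": the image bytes, base64-encoded.
  def extract(%{"data" => [%{"b64_json" => b64} | _]}) when is_binary(b64) do
    case Base.decode64(b64) do
      {:ok, binary} -> {:ok, {:binary, binary}}
      :error -> {:error, :invalid_base64}
    end
  end

  # Assumed error shape: %{"error" => %{"message" => "..."}}.
  def extract(%{"error" => %{"message" => message}}) do
    {:error, {:api_error, message}}
  end

  def extract(_other), do: {:error, :unexpected_response}
end
```

Matching on the shape also avoids the silent nil that get_in would return when the API answers with an error instead of an image.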

Step 3 - Use the file

In my case I wanted to download the file and manipulate it, so I immediately take the URL and do some processing on it. Perhaps the b64_json response would have made more sense, but I was already set up to handle URLs so I left that code in place.

One question I have is how long the images would last on the OpenAI CDN if you wanted to directly use the url they give you in your app.

I didn't trust that the image would last forever, so I took the URL, downloaded the file, and uploaded it to S3 on AWS to serve it myself.

Here's how the end of my generate handler looks, after I get the generated image url:

    ...
    url = response_body |> get_in(["data", Access.at(0), "url"])
    {file_name, file_path} = download_and_resize_image(url)

    # upload to s3
    {:ok, file_binary} = File.read(file_path)

    write_file_to_s3(file_name, file_binary, "image/png")

    Teleprompt.Messaging.received_image(listener_code)

    {:noreply, nil}
  end

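
The Teleprompt.Messaging module is not shown in this article, and the real one uses PubSub. For readers who want to try the notify-on-completion idea without extra dependencies, here is a sketch built on Elixir's own Registry instead. The module ImageEvents and the {:image_ready, code} message are hypothetical stand-ins, not the actual implementation.

```elixir
defmodule ImageEvents do
  # Hypothetical stand-in for a messaging module, built on Registry.
  @registry __MODULE__

  # Start the registry; :duplicate keys let many processes
  # listen on the same listener code.
  def start_link do
    Registry.start_link(keys: :duplicate, name: @registry)
  end

  # Register the calling process as a listener for this code.
  def subscribe(listener_code) do
    {:ok, _owner} = Registry.register(@registry, listener_code, nil)
    :ok
  end

  # Send {:image_ready, listener_code} to every subscribed process.
  def received_image(listener_code) do
    Registry.dispatch(@registry, listener_code, fn entries ->
      for {pid, _value} <- entries, do: send(pid, {:image_ready, listener_code})
    end)
  end
end
```

A LiveView or any other process would call ImageEvents.subscribe(code) and then handle the {:image_ready, code} message. In a real app the registry would normally be started from the supervision tree, for example with {Registry, keys: :duplicate, name: ImageEvents}.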

I'll skip the small details around resizing, but the download code is simply a GET and a File.write!:

  ...
  file_path = "/tmp/#{file_name}"
  {:ok, response} = HTTPoison.get(image_url)

  File.write!(file_path, response.body)
  ...
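
The snippet above assumes the download succeeds. As a slightly more defensive sketch, the helper below only writes the file when the status is 200 and reports anything else as an error. The module ImageDownload is hypothetical. It accepts any map or struct with status_code and body fields, which is the shape of the HTTPoison.Response returned by HTTPoison.get.

```elixir
defmodule ImageDownload do
  # Hypothetical helper, not part of Teleprompt.
  # `response` is any map or struct with :status_code and :body.

  def save(%{status_code: 200, body: body}, file_name) when is_binary(body) do
    # Path.basename/1 drops any directory parts, so the file
    # always lands directly inside the system temp directory.
    file_path = Path.join(System.tmp_dir!(), Path.basename(file_name))

    case File.write(file_path, body) do
      :ok -> {:ok, file_path}
      {:error, reason} -> {:error, {:write_failed, reason}}
    end
  end

  # Any other status (for example an expired URL) is reported, not written.
  def save(%{status_code: status}, _file_name) do
    {:error, {:unexpected_status, status}}
  end
end
```

With the code above it would be called as ImageDownload.save(response, file_name) right after the HTTPoison.get.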

That's really all there is to it. Now go make some images!
