Dale Nguyen

Audio 2 Text 2 Image Generation with Analog & Cloudflare Workers AI

This is a submission for the Cloudflare AI Challenge.

What I Built

This is a simple app that generates images from text input.

Demo

Demo link: https://cloudflare-challange.pages.dev/

My Code

You can check my code at: https://github.com/dalenguyen/cloudflare-challenge

Journey

This was an interesting challenge because I hadn't used Cloudflare Pages to deploy web applications before. It turns out that the deployment process is really straightforward and can be done from the Cloudflare dashboard.

The app is also built with Analog, a full-stack Angular meta-framework, which means you can create an entire application, backend included, in one project.

Here are the stack details:

  • Analog
  • Nx Workspace
  • GitHub
  • Cloudflare Pages
  • Workers AI
  • @cf/bytedance/stable-diffusion-xl-lightning for text-to-image generation
  • @cf/openai/whisper for audio-to-text
  • uform-gen2-qwen-500m for image-to-text

Multiple Models and/or Triple Task Types

I combined three models that handle different tasks in support of image generation:

  • Audio to text: listens to a voice command and fills in the input field
  • Text to image: generates an image from the text input
  • Image to text: provides a further description of the generated image
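The three steps above can be sketched as one chained pipeline. This is a minimal illustration, not the code from the repository: the `AiRunner` interface, the `audioToDescribedImage` function, and the input and output shapes (`audio`, `text`, `prompt`, `image`, `description`) are assumptions modeled on how Workers AI models are typically called. The real text-to-image model may also return a stream that has to be read into bytes first; the sketch assumes the bytes are already available.

```typescript
// Model IDs as listed in the stack above. The image-to-text ID is written
// exactly as the post gives it.
const MODELS = {
  audioToText: '@cf/openai/whisper',
  textToImage: '@cf/bytedance/stable-diffusion-xl-lightning',
  imageToText: 'uform-gen2-qwen-500m',
} as const;

// Hypothetical minimal shape of an AI runner (assumption for this sketch).
// Injecting it keeps the pipeline testable without a Cloudflare account.
interface AiRunner {
  run(model: string, inputs: Record<string, unknown>): Promise<unknown>;
}

interface PipelineResult {
  prompt: string;      // text transcribed from the voice command
  image: Uint8Array;   // generated image bytes
  description: string; // caption produced from the generated image
}

async function audioToDescribedImage(
  ai: AiRunner,
  audio: Uint8Array,
): Promise<PipelineResult> {
  // Step 1: audio to text. Assumed output shape: { text: string }.
  const speech = (await ai.run(MODELS.audioToText, {
    audio: Array.from(audio),
  })) as { text: string };
  const prompt = speech.text.trim();

  // Step 2: text to image. Assumed here to resolve to raw image bytes.
  const image = (await ai.run(MODELS.textToImage, { prompt })) as Uint8Array;

  // Step 3: image to text. Assumed output shape: { description: string }.
  const caption = (await ai.run(MODELS.imageToText, {
    image: Array.from(image),
    prompt: 'Describe this image',
  })) as { description: string };

  return { prompt, image, description: caption.description };
}
```

Each step feeds the next, so a failure in transcription stops the chain before any image is generated.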

Top comments (7)

Uzondu • Edited

Great job, Dale Nguyen. You were fast ⚡. For me, though, it isn't that straightforward. At first I thought you could create any frontend application locally, then deploy it to Cloudflare Workers, and integrate its AI models before or during deployment. So I thought maybe I would only use HTML, CSS, and JavaScript. But then I realized that applications built with Cloudflare features use other files and languages, such as TypeScript and .json files, which I haven't been exposed to. So how do I do this? Am I supposed to learn new languages? If so, which?

Dale Nguyen

All you need is JavaScript. TypeScript is basically JavaScript with types.

In my case, there's a frontend app and a backend that handles requests from the frontend. That's because the AI token shouldn't be exposed on the frontend, so you have to keep it in the backend. Any meta-framework, such as Next.js, can help you with that.
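A minimal sketch of that split, assuming the backend calls the Workers AI REST endpoint with a bearer token. The `ServerSecrets` type and the `parseClientBody` and `buildTextToImageRequest` helpers are hypothetical names for illustration, not taken from the repository. The browser sends only a prompt; the server adds the token before forwarding the request.

```typescript
// Values that live only on the server (for example, in environment variables).
interface ServerSecrets {
  accountId: string;
  apiToken: string;
}

interface OutgoingRequest {
  url: string;
  method: 'POST';
  headers: Record<string, string>;
  body: string;
}

// Validate what the browser sent. The client payload carries a prompt only,
// never a token.
function parseClientBody(raw: string): string {
  const parsed: unknown = JSON.parse(raw);
  const prompt =
    typeof parsed === 'object' && parsed !== null
      ? (parsed as { prompt?: unknown }).prompt
      : undefined;
  if (typeof prompt !== 'string' || prompt.trim() === '') {
    throw new Error('prompt must be a non-empty string');
  }
  return prompt.trim();
}

// Build the server-to-Cloudflare request. The token is attached here, on the
// backend, so it never appears in code shipped to the browser.
function buildTextToImageRequest(
  secrets: ServerSecrets,
  prompt: string,
): OutgoingRequest {
  const model = '@cf/bytedance/stable-diffusion-xl-lightning';
  return {
    url: `https://api.cloudflare.com/client/v4/accounts/${secrets.accountId}/ai/run/${model}`,
    method: 'POST',
    headers: {
      Authorization: `Bearer ${secrets.apiToken}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ prompt }),
  };
}
```

A backend route would call `parseClientBody` on the incoming request, pass the result to `buildTextToImageRequest`, send that request with `fetch`, and return only the generated image to the browser.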

If you want a quick start, try the example from Cloudflare and build from there: developers.cloudflare.com/workers-...

Uzondu • Edited

Thanks a lot for this information, Dale Nguyen, I am very grateful 🤝💌.
I'll check it out.

Dale Nguyen

Cool. Keep me posted. If you have any questions, just let me know :D

Uzondu • Edited

I have a little problem. While I was trying to create a new Worker on the Cloudflare dashboard, the 10020 error message kept popping up at the bottom of the page. If you have experienced this or know about it, maybe you can help me out. However, if you don't know about it at all, then sorry for the trouble. By the way, this problem occurred on Edge, Chrome, and Firefox.

Dale Nguyen

I haven't seen it. If you want to chat about Cloudflare, you can join the Discord. I'm also there :D

discord.gg/cloudflaredev

Jess Lee

Nice work!