Balaji Jayakumar

AI Video Generation - Where are we today?

Have you ever wondered how AI is changing the world of video generation?

In recent months, there have been major advances in AI video generation. This technology is still in its early stages, but it has the potential to revolutionize the way we create and consume video content.

In this blog post, we will discuss the latest advances in AI video generation and explore the potential applications of this technology. We will also provide some examples of how AI video generation is being used today.

Read on to learn more about this exciting new technology!

Generative AI for Images

A video is nothing but a series of images displayed in rapid succession. Let's take a look at generative AI for images before we move on to videos.

This is a very popular topic, and it has already affected many sectors, including digital art and content creation. When it comes to AI image generation, there are four main players:

  • MidJourney
  • DALL-E
  • Adobe Firefly
  • Stable Diffusion

These models are trained on massive datasets of images, and they can use this data to create new images that are both realistic and creative.

I went ahead and gave the same prompt to all four models.
PROMPT: Award-winning landscape photography
[Image: collage of the four models' outputs for the prompt]
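
Of these four, Stable Diffusion is the only one whose weights you can run yourself. Just to give a flavour, here is a minimal sketch of generating an image for the same prompt with Hugging Face's diffusers library (assuming a CUDA GPU and the runwayml/stable-diffusion-v1-5 checkpoint):

```python
import torch
from diffusers import StableDiffusionPipeline

# Load the Stable Diffusion v1.5 weights in half precision (assumes a CUDA GPU)
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# The same prompt that was given to all four models
image = pipe("Award-winning landscape photography").images[0]
image.save("landscape.png")
```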

AI for Videos

It all started with this viral video of Will Smith eating spaghetti.

[GIF: AI-generated Will Smith eating spaghetti]

ModelScope is a diffusion model from Alibaba that can generate new videos from text prompts. Reddit user chaindrop used ModelScope to create the video of Will Smith eating spaghetti: they first generated it at 24 FPS, then used Flowframes to interpolate up to 48 FPS and slow it down to half speed.
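
If you want to try this yourself, here is a minimal sketch of prompting the ModelScope model through Hugging Face's diffusers library (assuming the damo-vilab/text-to-video-ms-1.7b checkpoint and a CUDA GPU; exact return shapes vary a bit between diffusers versions):

```python
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import export_to_video

# Load ModelScope's text-to-video weights in half precision
pipe = DiffusionPipeline.from_pretrained(
    "damo-vilab/text-to-video-ms-1.7b", torch_dtype=torch.float16, variant="fp16"
)
pipe.enable_model_cpu_offload()  # offload idle submodules to keep VRAM usage manageable

# Generate a short clip from the prompt and write it to disk
frames = pipe("Will Smith eating spaghetti", num_frames=24).frames
export_to_video(frames, output_video_path="spaghetti.mp4")
```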

Of course, ModelScope isn't the only game in town in the emerging field of text2video. Runway debuted "Gen-2", and Stability AI released its own SDK for video: the Stable Animation SDK.

[GIF: video-to-video dance animation]

With the Stable Animation SDK, you can create animations in three different ways (a sketch of the first mode follows the list):

  1. Text to animation
  2. Text + image to animation
  3. Video + text to animation
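
Here is a rough sketch of the first mode, text to animation, modelled on Stability AI's published SDK examples. The host address, argument names, and rendering loop are assumptions based on those examples, so check the official docs for the exact API:

```python
from stability_sdk import api
from stability_sdk.animation import AnimationArgs, Animator

# Connect to Stability's hosted endpoint (host and key are assumptions; see the docs)
context = api.Context("grpc.stability.ai:443", "YOUR_STABILITY_API_KEY")

# Text-to-animation: prompts are keyed by the frame at which they take effect
animation_prompts = {0: "a serene mountain lake at sunrise"}

args = AnimationArgs()
args.max_frames = 48  # length of the animation in frames

animator = Animator(
    api_context=context,
    animation_prompts=animation_prompts,
    args=args,
)

# Render frame by frame and save each one to disk
for idx, frame in enumerate(animator.render()):
    frame.save(f"frame_{idx:05d}.png")
```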

While this is nothing new, and community-made solutions like Deforum (https://lnkd.in/gb4eqbzg) have existed for months, the ease of use of an SDK will enable a lot of creative projects.

ControlNet was also released for Stable Diffusion.
ControlNet is a neural network architecture that adds extra conditions to diffusion models to control image generation. This allows for unprecedented levels of control over the content, style, and composition of generated images.
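
As a minimal sketch of the idea using the diffusers library (assuming the lllyasviel/sd-controlnet-canny checkpoint, which conditions generation on Canny edge maps, and a hypothetical pre-computed edge image edges.png):

```python
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

# A ControlNet trained to condition Stable Diffusion on Canny edge maps
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

# The edge map is the "extra condition": the output follows its composition
edges = load_image("edges.png")  # hypothetical pre-computed Canny edge image
image = pipe("a futuristic glass building", image=edges).images[0]
image.save("building.png")
```

The edge map pins down the composition while the prompt stays in charge of content and style, which is exactly the kind of control the plain text-to-image models above can't offer.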

[GIF: ControlNet-guided animation of a building]

Can't wait to see what else the world has to offer!