Connie Leung

Interpolate a Video from the First and Last Frames with Veo 3.1 and Nano Banana

Google released Veo 3.1 with new features, and one of them is interpolating a video from the first and last frames. By leveraging the image generation of the Gemini 2.5 Flash Image model (a.k.a. Nano Banana), the Veo 3.1 model can use the first and last images of a sequence to generate a short video.

This interpolation feature is specifically enabled by the lastFrame parameter, a capability exclusive to the latest Veo 3.1 model.

This blog post describes how I used the Gemini 2.5 Flash Image model to implement a visual story feature that creates a sequence of images. Then, Veo 3.1 uses the first and last images of that sequence to generate a video.
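Before diving into the Angular application, here is a minimal sketch of what an interpolation request looks like against the raw @google/genai SDK. The veo-3.1-generate-preview model name, parameter shapes, and polling loop follow the public Gemini API documentation and are assumptions for illustration, not code from this project:

import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env['GEMINI_API_KEY'] });

// Sketch: interpolate a video between a first and a last frame with Veo 3.1
async function interpolate(prompt: string, firstFrame: string, lastFrame: string) {
  let operation = await ai.models.generateVideos({
    model: 'veo-3.1-generate-preview',
    prompt,
    image: { imageBytes: firstFrame, mimeType: 'image/png' },      // first frame
    config: {
      lastFrame: { imageBytes: lastFrame, mimeType: 'image/png' }, // last frame enables interpolation
    },
  });

  // Video generation is a long-running operation; poll until it completes
  while (!operation.done) {
    await new Promise((resolve) => setTimeout(resolve, 10_000));
    operation = await ai.operations.getVideosOperation({ operation });
  }

  return operation.response?.generatedVideos?.[0];
}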

Configure Which Veo Model Is Used

This feature exists in Veo 3.1 only, so the application should not call it when Veo 2 or Veo 3 is used. I added a new environment variable, IS_VEO31_USED, to indicate whether or not the latest Veo model is used. Later, the value is injected into the application to control the arguments that the Gemini API receives when generating a video.

IS_VEO31_USED="true"
// firebase-ai.json
{
  // ... other configuration values
  "is_veo31_used": true
}
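The environment variable itself is a string, while firebase-ai.json stores a boolean, so something has to bridge the two. The project's actual wiring is not shown in this post; the snippet below is a hypothetical pre-build script that illustrates one way to do it (set-gemini-env.ts is not part of the original code):

// set-gemini-env.ts - hypothetical pre-build step, not part of the original project
// Reads IS_VEO31_USED from the environment and writes it into firebase-ai.json
import { readFileSync, writeFileSync } from 'node:fs';

const configPath = 'firebase-ai.json';
const config = JSON.parse(readFileSync(configPath, 'utf-8'));

// Coerce the string value ("true"/"false") into a real boolean
config.is_veo31_used = process.env['IS_VEO31_USED'] === 'true';

writeFileSync(configPath, JSON.stringify(config, null, 2));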
// gemini provider

import { InjectionToken, makeEnvironmentProviders } from '@angular/core';

import firebaseConfig from '../../firebase-ai.json';

export const IS_VEO31_USED = new InjectionToken<boolean>('IS_VEO31_USED');

export function provideGemini() {
  return makeEnvironmentProviders([
    {
      provide: IS_VEO31_USED,
      useValue: firebaseConfig.is_veo31_used,
    }
  ]);
}

IS_VEO31_USED is an injection token that provides the static value of firebaseConfig.is_veo31_used. In this case, the value is true.

This pattern cleanly passes the static configuration value (firebaseConfig.is_veo31_used) into any service or component, ensuring the application consistently knows which Veo model is being targeted.

export const appConfig: ApplicationConfig = {
  providers: [
    provideGemini(),
  ]
};

The provideGemini function is registered in the application config, so the value is ready after the application launches.
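Any service or component can then read the flag through Angular's inject function. Here is a minimal sketch (VideoFeatureService is a made-up consumer, and the import path depends on where the token is declared):

// Hypothetical consumer of the IS_VEO31_USED token (illustration only)
import { inject, Injectable } from '@angular/core';

import { IS_VEO31_USED } from './providers/gemini.provider'; // adjust to the real path

@Injectable({ providedIn: 'root' })
export class VideoFeatureService {
  // true when the application targets Veo 3.1, false for older Veo models
  private readonly isVeo31Used = inject(IS_VEO31_USED);
}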

Building a Visual Story Generator with Nano Banana

Define Step Prompts in the Visual Story Service

// Visual Story Service

import { Injectable } from '@angular/core';

@Injectable({
  providedIn: 'root'
})
export class VisualStoryService {
  buildStepPrompts(genArgs: VisualStoryGenerateArgs): string[] {
    const { userPrompt, args } = genArgs;
    const currentPrompt = userPrompt.trim();

    if (!currentPrompt) {
      return [];
    }

    const stepPrompts: string[] = [];

    for (let i = 0; i < args.numberOfImages; i++) {
      const storyPrompt = this.buildStoryPrompt({ userPrompt: currentPrompt, args }, i + 1);
      stepPrompts.push(storyPrompt);
    }

    return stepPrompts;
  }

  private buildStoryPrompt(genArgs: VisualStoryGenerateArgs, stepNumber: number): string {
    const { userPrompt, args } = genArgs;
    const { numberOfImages, style, transition, type } = args;
    let fullPrompt = `${userPrompt}, step ${stepNumber} of ${numberOfImages}`;

    // Add context based on type
    switch (type) {
      case 'story':
        fullPrompt += `, narrative sequence, ${style} art style`;
        break;
      case 'process':
        fullPrompt += `, procedural step, instructional illustration`;
        break;
      // ... other types of visual story
    }

    if (stepNumber > 1) {
      fullPrompt += `, ${transition} transition from previous step`;
    }

    return fullPrompt;
  }
}

The VisualStoryService accepts form values to create step prompts for a sequence of images.

// Example: userPrompt="A wizard making a potion", args = { numberOfImages: 3, type: 'story', style: 'cinematic', transition: 'liquid dissolving' }

Step 1 Prompt: "A wizard making a potion, step 1 of 3, narrative sequence, cinematic art style"
Step 2 Prompt: "A wizard making a potion, step 2 of 3, narrative sequence, cinematic art style, liquid dissolving transition from previous step"
Step 3 Prompt: "A wizard making a potion, step 3 of 3, narrative sequence, cinematic art style, liquid dissolving transition from previous step"
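The actual image calls go through the project's GeminiService, which is not shown here. For context, this is a minimal sketch of generating one image per step prompt with the raw @google/genai SDK; the gemini-2.5-flash-image model name and the generateStoryImages helper are assumptions, not the author's code:

import { GoogleGenAI } from '@google/genai';

// Hypothetical helper: turn each step prompt into a base64-encoded image
async function generateStoryImages(stepPrompts: string[]): Promise<{ data: string; mimeType: string }[]> {
  const ai = new GoogleGenAI({ apiKey: process.env['GEMINI_API_KEY'] });
  const images: { data: string; mimeType: string }[] = [];

  for (const prompt of stepPrompts) {
    const response = await ai.models.generateContent({
      model: 'gemini-2.5-flash-image',   // a.k.a. Nano Banana
      contents: prompt,
    });

    // The generated image is returned as inline data on one of the response parts
    for (const part of response.candidates?.[0]?.content?.parts ?? []) {
      if (part.inlineData?.data) {
        images.push({ data: part.inlineData.data, mimeType: part.inlineData.mimeType ?? 'image/png' });
      }
    }
  }

  return images;
}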

Interpolate a Video

import { inject, Injectable } from '@angular/core';

@Injectable({
  providedIn: 'root'
})
export class GenMediaService {
  private readonly geminiService = inject(GeminiService);

  async generateVideoFromFrames(imageParams: GenerateVideoFromFramesRequest) {
    const isVeo31Used = imageParams.isVeo31Used || false;

    const loadVideoPromise = isVeo31Used ?
      this.geminiService.generateVideo({
        prompt: imageParams.prompt,
        imageBytes: imageParams.imageBytes,
        mimeType: imageParams.mimeType,
        config: {
          aspectRatio: '16:9',
          resolution: '720p',
          lastFrame: {
            imageBytes: imageParams.lastFrameImageBytes,
            mimeType: imageParams.lastFrameMimeType
          }
        }
      }) : this.getFallbackVideoUrl(imageParams);

    return await loadVideoPromise;
  }

  private async getFallbackVideoUrl(imageParams: GenerateVideoRequestImageParams) {
    return this.geminiService.generateVideo({
      prompt: imageParams.prompt,
      imageBytes: imageParams.imageBytes,
      mimeType: imageParams.mimeType,
      config: {
        aspectRatio: '16:9',
      }
    });
  }
}

The GenMediaService provides a generateVideoFromFrames method that calls the Gemini API to interpolate a video using the first image (imageBytes and mimeType) and the last image (config.lastFrame.imageBytes and config.lastFrame.mimeType). The config.lastFrame option is what enables interpolation in Veo 3.1.

When the isVeo31Used flag is false, the getFallbackVideoUrl method generates a video from the first image and the aspect ratio only. The application could be using an older Veo model that does not support the resolution option, so the config object leaves the resolution property out for backward compatibility.
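The GenerateVideoFromFramesRequest type is not shown in the post; based on how generateVideoFromFrames reads its fields, a plausible shape looks like this:

// Inferred from how generateVideoFromFrames uses the request; field names come from the code above
export interface GenerateVideoFromFramesRequest {
  prompt: string;
  // first frame
  imageBytes: string;          // base64-encoded image data
  mimeType: string;
  // last frame, only used when Veo 3.1 is targeted
  lastFrameImageBytes: string;
  lastFrameMimeType: string;
  isVeo31Used?: boolean;
}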

// Visual Story Service

interpolateVideo(request: GenerateVideoFromFramesRequest): Promise<VideoResponse> {
    return this.genMediaService.generateVideoFromFrames(request);
}

The VisualStoryService delegates the task to the GenMediaService to generate the video.

Video Interpolation Component

<app-visual-story-video
    [userPrompt]="this.promptArgs().userPrompt"
    [images]="this.genmedia()?.images()"
/>

I created an Angular component to interpolate a video and play it in a video player.

import { ChangeDetectionStrategy, Component, computed, inject, input, signal } from '@angular/core';

@Component({
  selector: 'app-visual-story-video',
  imports: [...import components...],
  template: `
    @if (canGenerateVideoFromFirstLastFrames()) {
      <button type="button (click)="generateVideoFromFrames()">
          Interpolate video
      </button>

      @let videoUrl = videoResponse()?.videoUrl;
      @if (isLoading()) {
        <app-loader />
      } @else if (videoUrl) {
        <app-video-player class="block" [videoUrl]="videoUrl" />
      }
    }
  `,
  changeDetection: ChangeDetectionStrategy.OnPush,
})
export default class VisualStoryVideoComponent {
  private readonly visualStoryService = inject(VisualStoryService);
  private readonly isVeo31Used = inject(IS_VEO31_USED);

  images = input<ImageResponse[] | undefined>(undefined);
  userPrompt = input.required<string>();

  isLoading = signal(false);
  videoResponse = signal<VideoResponse | undefined>(undefined);

  firstImage = computed(() => this.images()?.[0]);
  lastImage = computed(() => {
    const numImages = this.images()?.length || 0;
    return numImages < 2 ? undefined : this.images()?.[numImages - 1];
  });

  canGenerateVideoFromFirstLastFrames = computed(() => {
    const hasFirstImage = !!this.firstImage()?.data && !!this.firstImage()?.mimeType;
    const hasLastImage = !!this.lastImage()?.data && !!this.lastImage()?.mimeType;
    return this.isVeo31Used && hasFirstImage && hasLastImage;
  });

  async generateVideoFromFrames(): Promise<void> {
    try {
      this.isLoading.set(true);
      this.videoResponse.set(undefined);

      if (!this.canGenerateVideoFromFirstLastFrames()) {
        return;
      }

      const { data: firstImageData, mimeType: firstImageMimeType } = this.firstImage() || { data: '', mimeType: '' };
      const { data: lastImageData, mimeType: lastImageMimeType } = this.lastImage() || { data: '', mimeType: '' };
      const result = await this.visualStoryService.interpolateVideo({
        prompt: this.userPrompt(),
        imageBytes: firstImageData,
        mimeType: firstImageMimeType,
        lastFrameImageBytes: lastImageData,
        lastFrameMimeType: lastImageMimeType,
        isVeo31Used: this.isVeo31Used
      });
      this.videoResponse.set(result);
    } finally {
      this.isLoading.set(false);
    }
  }
}

The canGenerateVideoFromFirstLastFrames computed signal conditionally displays the 'Interpolate video' button. When the button is visible, there are two images available for interpolation. When the video URL is available, the videoResponse signal is overwritten with the new value, and the video player component automatically plays it.
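The VideoPlayerComponent itself is outside the scope of this post. A minimal sketch of what it might look like, inferred from the template usage above (the selector and the videoUrl input match that usage; the rest is an assumption):

import { ChangeDetectionStrategy, Component, input } from '@angular/core';

// Illustrative sketch of the player used in the template above, not the author's code
@Component({
  selector: 'app-video-player',
  template: `
    <video [src]="videoUrl()" controls autoplay muted playsinline></video>
  `,
  changeDetection: ChangeDetectionStrategy.OnPush,
})
export class VideoPlayerComponent {
  // URL of the generated (interpolated) video
  videoUrl = input.required<string>();
}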

Resources
