Veo 3 is a powerful video generation model from Google DeepMind. It was originally available through Vertex AI (Google Cloud) and is now also available, with limited access, through Google Gemini Pro.
Usage is limited to 3 videos per day, and each generated video is at most 8 seconds long. These limits are tight, but the model is still fun to experiment with.
Here are some of the prompts I tried:
For my first attempt, I went with a classic concept, a cat, and the result was good.
A kitten playing guitar in a street
For the next one, I tried to generate a screen recording of a code editor with some Python code being typed. Even though the prompt spelled out the specifics, the generated video did not follow them accurately.
A screenshare video (macOS — Visual Studio Code Editor) of typing a Python code (code for factorial of a number) without any audio
The model produced a title for the video that closely matched my prompt, but the video itself did not :)
From this I gathered that the model handles the objects and scenery in a video well, but not text or other fine details that occupy only a few pixels.
The examples above use Text-to-Video mode. You can also generate a video from an image plus a text prompt (something like Image-Text-to-Video). The 8-second length limit still applies.
The model can also be accessed through the Gemini API on a paid-preview basis (see the official documentation page for details). Try out some prompts of your own and play with the model.
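For anyone curious what the API route might look like, here is a minimal sketch based on the `google-genai` Python SDK. It is not taken from my own runs: the helper name `generate_clip`, the output file name, and the default model ID `veo-3.0-generate-preview` are assumptions for illustration, so check the official documentation for the current model name and pricing before using it. The client is passed in as an argument, which keeps the helper easy to try with a stand-in object.

```python
import time


def generate_clip(client, prompt, model="veo-3.0-generate-preview",
                  out_path="veo_clip.mp4", poll_seconds=10):
    """Request one clip, wait for it to finish, and save it to out_path.

    `client` is expected to behave like google.genai.Client. The default
    model ID is an assumption; the official docs list the current name.
    """
    # Video generation is a long-running operation, not an immediate response.
    operation = client.models.generate_videos(model=model, prompt=prompt)

    # Poll until the operation reports that it is done.
    while not operation.done:
        time.sleep(poll_seconds)
        operation = client.operations.get(operation)

    # Take the first generated video, download its bytes, and write the file.
    video = operation.response.generated_videos[0].video
    client.files.download(file=video)
    video.save(out_path)
    return out_path
```

With the `google-genai` package installed and an API key configured, this could be called as `generate_clip(genai.Client(), "A kitten playing guitar in a street")` after `from google import genai`. Since the API is a paid preview, each call is billed.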
Happy Learning !!