This is a simplified guide to the Stable Diffusion 3 model, maintained by Stability AI.
Model overview
Stable Diffusion 3 is a text-to-image model developed by Stability AI that substantially improves on previous versions. It generates high-quality, photo-realistic images from text prompts, with better handling of complex prompts, improved typography, and more efficient resource usage. Stable Diffusion 3 builds upon the capabilities of earlier Stable Diffusion models, as well as related text-to-image models like SDXL and Stable Diffusion Img2Img.
Model inputs and outputs
Stable Diffusion 3 takes in a text prompt, a cfg (guidance scale) value, a seed, an aspect ratio, an output format, and an output quality setting. It returns an array of image URLs as output. The guidance scale controls how closely the output follows the prompt, a fixed seed makes generation reproducible, and the remaining parameters customize the rendered image.
Inputs
- Prompt: The text prompt describing the desired image
- Cfg: The guidance scale, controlling similarity to the prompt
- Seed: A seed value for reproducible image generation
- Aspect Ratio: The aspect ratio of the output image
- Output Format: The file format of the output image, such as WEBP
- Output Quality: The quality setting for the output image
Outputs
- Array of Image URLs: The generated images as a list of URLs
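To make the parameter list above concrete, here is a minimal Python sketch that assembles an input payload from those fields. The function name, default values, and exact parameter keys are assumptions for illustration; the hosting API (e.g. the Replicate client, shown commented out) may use different names or defaults.

```python
# Hedged sketch: assembling the inputs listed above into a request payload.
# Parameter keys and defaults here are illustrative assumptions, not the
# authoritative API schema.

def build_sd3_input(prompt, cfg=3.5, seed=None, aspect_ratio="1:1",
                    output_format="webp", output_quality=90):
    """Build the input dict described in the Inputs section."""
    if not prompt:
        raise ValueError("prompt is required")
    payload = {
        "prompt": prompt,                  # text description of the image
        "cfg": cfg,                        # guidance scale: prompt adherence
        "aspect_ratio": aspect_ratio,      # shape of the output image
        "output_format": output_format,    # e.g. "webp"
        "output_quality": output_quality,  # compression/quality setting
    }
    if seed is not None:
        payload["seed"] = seed             # fixed seed => reproducible result
    return payload

payload = build_sd3_input("a photo-realistic red fox in snow", cfg=4.5, seed=42)
# With the Replicate Python client, the call would then look roughly like:
# urls = replicate.run("stability-ai/stable-diffusion-3", input=payload)
# `urls` would be the array of image URLs described under Outputs.
```

Keeping payload construction in one helper makes it easy to pin a seed when you want reproducible outputs and to vary only cfg or aspect ratio between runs.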
Capabilities
Stable Diffusion 3 demonstrates sign...