Unlocking High-Quality Realistic Pictures: Tips and Tricks with AWS Bedrock

Why you should be familiar with Bedrock:

70% of enterprises leverage AI services for business growth and communication.
68% of marketing and event organizations use Generative AI to enhance user experience and engagement.
90% of customers believe Generative AI enhances existing sales services both online and in-store.
If your business isn't utilizing Generative AI yet, it's time to explore AWS Bedrock.

What Generative AI enables you to do:

Generate ideas, concepts, and drafts to boost human creativity.
Tailor personalized content based on individual preferences, improving user experience.
Automate content generation at scale, facilitating the production of large volumes.
Create realistic simulations.

What is AWS Bedrock?

A fully managed service.
Offers a range of high-performing foundation models (FMs) for building generative AI applications.

Playgrounds
Provides interactive playgrounds for text, chat, and image exploration.

Customization and knowledge base
Allows customization and fine-tuning of FMs using your own data while ensuring privacy, security, and compliance with standards like GDPR and HIPAA.

Single API
Offers a single, user-friendly API for running inference from your application.
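As a rough illustration of what an inference call through that API can look like, here is a minimal Python sketch. It assumes the Titan Image Generator v1 model ID and request body format; the helper names and the `FakeBedrockRuntime` stand-in are made up for this example so the code runs offline without AWS credentials.

```python
import base64
import io
import json

# Assumption: model ID of Titan Image Generator v1; check which models your account can access.
TITAN_IMAGE_MODEL_ID = "amazon.titan-image-generator-v1"


def build_request(prompt: str) -> str:
    """Serialize a text-to-image request in the Titan Image Generator body format."""
    return json.dumps({
        "taskType": "TEXT_IMAGE",
        "textToImageParams": {"text": prompt},
        "imageGenerationConfig": {"numberOfImages": 1, "width": 1024, "height": 1024},
    })


def generate_images(client, prompt: str) -> list[bytes]:
    """Call InvokeModel through `client` and return the decoded image bytes."""
    response = client.invoke_model(
        modelId=TITAN_IMAGE_MODEL_ID,
        contentType="application/json",
        accept="application/json",
        body=build_request(prompt),
    )
    payload = json.loads(response["body"].read())
    # Titan returns the generated images as base64 strings under "images".
    return [base64.b64decode(image) for image in payload["images"]]


class FakeBedrockRuntime:
    """Offline stand-in for a bedrock-runtime client, used only to exercise the code."""

    def invoke_model(self, modelId, contentType, accept, body):
        self.last_call = {"modelId": modelId, "body": json.loads(body)}
        fake_image = base64.b64encode(b"not-a-real-image").decode("ascii")
        return {"body": io.BytesIO(json.dumps({"images": [fake_image]}).encode("utf-8"))}
```

With real credentials you would pass `boto3.client("bedrock-runtime")` instead of the fake client; the region and model access depend on your own account setup.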

Serverless architecture
Pay as you go, with no long-term commitments, no infrastructure to manage, and automatic scaling. Focus on what you want to accomplish rather than on the plumbing that connects FMs to your code.

Available Foundation Models:

AI21 Labs: Jurassic
Anthropic: Claude
Cohere: Command & Embed
Meta: Llama 2
Mistral AI: Mixtral 8x7B and Mistral 7B
Stability AI: Stable Diffusion XL
Amazon: Amazon Titan
Latest Supported Model: Claude 3

My Role:

Assisting financial service companies in adopting generative AI for streamlined development workflows and production-ready systems.
Focusing on foundation models Titan v1 (lite & express) and Claude 3.

Leveraging the Free Bedrock Environment:

AWS China provides a free Bedrock environment for exploring Generative AI capabilities.
Generate countless pictures at no cost to showcase the performance of Bedrock.

Creating Realistic Conceptual Pictures:

Developed a proof-of-concept project for a two-day on-site conference in Hong Kong.
Pictures are optimized for social media sharing to engage and captivate audiences.
Built with the FMs Titan v1 and Claude 3 on AWS Bedrock.

Tips and Tricks for Enhancing Accuracy:

Here are some methods for getting pictures closer to what you expect. They address three issues:

1. Bedrock's Limited Background Understanding
2. Bedrock's Hand Gesture Representation
3. Bedrock's Eye Contact and Quality

Issue: Bedrock's Limited Background Understanding

For scenes such as a meetup, summit, or conference, Bedrock often falls back on a default, less exciting background like a crowded room.
Because the model does not understand what actually happens at the venue, the visuals end up less engaging.

Creating Interesting Backgrounds:

Use vivid descriptions to help Bedrock capture your desired atmosphere and emotions.
Incorporate more generic environment details like outdoor, indoor, night, or sunny settings.

Desired Image Types:
Cloudy, joyful after-parties, outdoor meetups, nightclubs, and restaurants.
Bedrock excels at producing amazing pictures with a joyful, chill atmosphere and cloudy backgrounds.

Tips for Bedrock Usage:
Utilize exciting descriptions to convey your imagination and emotions effectively.
Avoid using generic or unexciting descriptions that may result in boring default images.

Advanced Tip for Background Fine-Tuning:
To further customize the background, include additional generic environment descriptions like outdoor, indoor, night, or sunshine settings.
Example: After-party
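In text form, an after-party prompt could be assembled along these lines. This is only a sketch: `build_scene_prompt` and the rooftop-restaurant subject are hypothetical names chosen for illustration, and the exact wording that works best still needs trial and error.

```python
def build_scene_prompt(subject, atmosphere=(), environment=()):
    """Join a subject with atmosphere words and generic environment details."""
    parts = [subject.strip()]
    if atmosphere:
        # Vivid mood words, e.g. "joyful", go into one atmosphere phrase.
        parts.append(", ".join(atmosphere) + " atmosphere")
    if environment:
        # Generic setting details such as outdoor, indoor, night, or sunny.
        parts.append(", ".join(environment))
    return ", ".join(parts)


prompt = build_scene_prompt(
    "after-party at a rooftop restaurant",
    atmosphere=["joyful", "relaxed"],
    environment=["outdoor", "night", "cloudy sky"],
)
print(prompt)
# after-party at a rooftop restaurant, joyful, relaxed atmosphere, outdoor, night, cloudy sky
```

Keeping the atmosphere and environment words in separate lists makes it easy to swap one setting detail at a time and compare the resulting backgrounds.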

Issue: Bedrock's Hand Gesture Representation

Bedrock often produces strange hand gestures and unrealistic figures when people hold props like cups and bags.

Understanding Bedrock's Limitations:
Hand gestures pose a challenge for many foundation models, not just Titan.
Changing the model won't resolve this issue.

Tips for Better Hand Gestures:
Avoid complex hand positions and actions like gripping a cup or holding a bag.
Foundation models excel at drawing fists or open palms without specific gestures.
Design tasks that involve props requiring a fist or open palm, such as controlling a joystick or holding a book.
Observe daily life to understand common hand gestures for more accurate representations.

Good News:
Foundation models can accurately depict hands forming a fist or showing the palm without any specific gesture.

Examples That Avoid Complex Hand Gestures:
Controlling a joystick (making a fist)
Holding a book (fingers not interacting with the object; a simple task)
Giving a thumbs-up (similar to making a fist)
DJ playing music on a panel (showing the back of the hand without any gesture)
Portrait shooting (crossing arms)
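One way to apply these tips before spending a generation is a quick check on the draft prompt. The sketch below is an assumption-heavy illustration: the phrase lists are hypothetical, distilled from the tips above, and simple substring matching will miss many phrasings.

```python
# Hypothetical phrase lists based on the tips above; extend them from your own trials.
RISKY_HAND_PHRASES = ("gripping a cup", "holding a cup", "holding a bag")
SAFER_ALTERNATIVES = (
    "controlling a joystick",
    "holding a book",
    "giving a thumbs-up",
    "DJ playing music on a panel",
    "portrait with crossed arms",
)


def find_risky_hand_phrases(prompt):
    """Return the risky phrases that occur in the prompt (case-insensitive)."""
    lowered = prompt.lower()
    return [phrase for phrase in RISKY_HAND_PHRASES if phrase in lowered]


draft = "An engineer holding a cup at an outdoor meetup"
hits = find_risky_hand_phrases(draft)
if hits:
    print("Risky hand poses:", ", ".join(hits))
    print("Consider instead:", "; ".join(SAFER_ALTERNATIVES))
```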

Tips for Realistic Hand Gestures:
Remember that foundation models lack human understanding and familiarity with human body structure.
Daily life observations can provide insights into natural hand gestures during various activities.

Advanced Tip for Hand Gestures:
To fine-tune hand gestures, consider using a backbone system that addresses both body and hand gestures directly.

Issue: Bedrock's Eye Contact and Quality

Bedrock often generates portrait images where models don't look at the camera or have unbalanced eye positions.
This results in odd and unprofessional pictures.

Understanding the Eye Contact Challenge:

Some foundation models struggle with portrait generation, and human sensitivity to eye contact exacerbates the issue.

Improving Image Quality:

Enhancing eye contact quality requires substantial effort and may not be suitable for proof of concept purposes.

Utilizing Titan's Advantage:

Titan produces better overall quality, including facial expressions and eye balance, especially for models wearing glasses.
Take advantage of this by incorporating "sunglasses" and specifying "under sunshine" to avoid strong eye contact.
This approach results in stylish and interesting images perfect for social media and conceptual art.
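If you reuse this trick across many prompts, a tiny helper can add the two phrases consistently. This is a minimal sketch; `soften_eye_contact` and the sample portrait prompt are made up for illustration.

```python
# The two phrases suggested above for sidestepping strong eye contact.
EYE_CONTACT_WORKAROUND = ("wearing sunglasses", "under sunshine")


def soften_eye_contact(prompt):
    """Append the sunglasses and sunshine phrases unless the prompt already has them."""
    lowered = prompt.lower()
    additions = [phrase for phrase in EYE_CONTACT_WORKAROUND if phrase not in lowered]
    return ", ".join([prompt.rstrip(" ,.")] + additions)


result = soften_eye_contact("Portrait of a race engineer in uniform")
print(result)
# Portrait of a race engineer in uniform, wearing sunglasses, under sunshine
```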

Making It More Engaging:

Titan performs exceptionally well with specific topics like Formula 1 and engineering uniforms.
Utilize this expertise by creating images related to realistic engineering, leveraging the AWS community's engineering background.

Tip for Random Results:

Generate images in batches of five and adjust the seed or introduce more random patterns to increase the chance of obtaining a masterpiece.
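The batch-and-reseed idea can be sketched as request bodies for Titan Image Generator, which takes a `numberOfImages` and a `seed` in its `imageGenerationConfig`. The limits below are what I recall from the Titan documentation, so treat them as assumptions to verify; `batch_request_bodies` is a hypothetical helper name.

```python
import json
import random

MAX_IMAGES_PER_REQUEST = 5   # assumption: Titan accepts 1 to 5 images per request
MAX_SEED = 2147483646        # assumption: upper bound of the documented seed range


def batch_request_bodies(prompt, batches, rng=None):
    """Build one request body per batch, each with five images and a distinct seed."""
    rng = rng or random.Random()
    # sample() draws without replacement, so no two batches share a seed.
    seeds = rng.sample(range(MAX_SEED + 1), batches)
    return [
        json.dumps({
            "taskType": "TEXT_IMAGE",
            "textToImageParams": {"text": prompt},
            "imageGenerationConfig": {
                "numberOfImages": MAX_IMAGES_PER_REQUEST,
                "seed": seed,
            },
        })
        for seed in seeds
    ]


bodies = batch_request_bodies("a group of engineers at an outdoor meetup", batches=3)
print(len(bodies), "request bodies,", MAX_IMAGES_PER_REQUEST * len(bodies), "images in total")
```

Each body would be sent as the `body` argument of an InvokeModel call; logging the seed next to each saved image makes a lucky result reproducible.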

Consider Singular vs. Plural:

Titan may overlook plural references, resulting in single individuals instead of groups.
To ensure accuracy, use phrases like "a group of models" to obtain the desired outcome.

Understanding Object Relationships:

Titan struggles to comprehend object relationships in images.

Avoid complex tasks involving multiple objects, as the results may be humorous or unexpected.
Simplify the tasks to ensure more accurate and reliable image generation.

Why you need to learn prompt engineering:

Importance of Prompt Engineering:

Prompt engineering is an accessible skill to learn but requires effort to master.
It demands patience for trial and error, as well as luck to generate captivating images.

Leveraging Model Advantages:

Understand the model's strengths to generate high-quality images.
For example, Titan excels at depicting uniforms, sunglasses, and models in sunny environments.

Avoiding Model Limitations:

Identify the model's weaknesses and design your prompts around them.
For instance, Titan struggles with hands holding cups and with closely grouped individuals.

Bedrock's Benefits: What Bedrock can help you with:

Gaming Industry:
Quickly create conceptual art sets and demo videos for proof of concept projects.

Entertainment Companies or Small Businesses:
Generate venue visuals for floor plan collaboration with co-hosting parties.

Multichannel Market Selling:
Create visually appealing graphics and context for upselling and cross-selling on platforms like Facebook, LinkedIn, and Twitter.
