DEV Community

Cathy Lai

I Thought One Sentence Was Creating the Perfect GPT Image, Until I Realized It Had Been Learning My "Taste" All Along

When I first started experimenting with GPT image generation for my Garden Visualizer project, I honestly thought the workflow would be simple:

  1. Upload image
  2. Write one good prompt
  3. Receive perfect result

A prompt like:

“Add low-maintenance shrubs to this Auckland backyard.”

And it works?

Not exactly!…

After weeks of experimenting in the ChatGPT app with staging photos, gardens, landscaping concepts, and “future visualizations”, I slowly realized something surprising:

The impressive results I was getting were not coming from a single prompt at all. They were coming from hidden context that ChatGPT had been accumulating throughout the conversation.

The Strange Thing I Noticed

At some point, I could type something incredibly vague like:

“Show after 3 years.”

…and GPT somehow knew:

  • Which shrubs I meant
  • Where the lemon tree was
  • That I disliked overcrowded gardens
  • That I preferred realistic suburban styling
  • That I wanted sparse planting
  • That I preferred low-maintenance layouts

That was the moment I realized: The “prompt” was not the whole product. The conversation itself had become part of the generation engine.
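
One way to picture this is a toy model in which every new message is read together with all the earlier ones. This is only a sketch of the idea, not a description of how ChatGPT works internally; `ToySession` and `effective_prompt` are hypothetical names, and the second message is an illustrative example.

```python
from dataclasses import dataclass, field


@dataclass
class ToySession:
    """Toy stand-in for a chat session: it only remembers earlier turns."""
    turns: list[str] = field(default_factory=list)

    def effective_prompt(self, message: str) -> str:
        # Record the new message, then replay the whole conversation so far.
        self.turns.append(message)
        return "\n".join(self.turns)


session = ToySession()
session.effective_prompt("Add low-maintenance shrubs to this Auckland backyard.")
session.effective_prompt("Keep the planting sparse and realistic.")  # illustrative turn
vague = session.effective_prompt("Show after 3 years.")
print(vague)
```

In this toy model the vague last message arrives with the Auckland backyard and the sparse, realistic planting already attached, which is why it can look as if one sentence did all the work.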

My Wrong Assumption About the Image Edit API

Initially, I assumed the Image Edit API worked similarly to ChatGPT conversations.

I thought:

  • the model would somehow “remember”
  • the AI would learn over time
  • previous generations would influence future ones automatically

But API calls are usually stateless, which means every request is effectively isolated unless you manually resend the context.

So this:

{
  "users_image": "messy_backyard.jpg",
  "prompt": "plz generate a nicely layout idea for this garden"
}

…actually contains almost no useful information.

The model has no idea:

  • How dense the planting should be
  • What realism level
  • What maintenance standard
  • What climate
  • What style direction
  • Budget constraints

The API is not “remembering”. My app has to carry the memory.
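
A minimal sketch of what "carrying the memory" could look like: the app stores the missing details from the list above and folds them into every request. `GardenBrief` and `build_prompt` are hypothetical names, the field values are illustrative, and nothing here calls a real image API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GardenBrief:
    """Hypothetical container for the context the app resends on every call."""
    density: str
    realism: str
    maintenance: str
    climate: str
    style: str
    budget: str


def build_prompt(brief: GardenBrief, instruction: str) -> str:
    """Combine the stored brief with the instruction into one self-contained prompt."""
    context = [
        f"Planting density: {brief.density}",
        f"Realism level: {brief.realism}",
        f"Maintenance standard: {brief.maintenance}",
        f"Climate: {brief.climate}",
        f"Style direction: {brief.style}",
        f"Budget: {brief.budget}",
    ]
    return instruction + "\n\nConstraints:\n" + "\n".join(f"- {line}" for line in context)


# Illustrative values only; a real app would collect these from the user.
brief = GardenBrief(
    density="sparse",
    realism="realistic",
    maintenance="low-maintenance",
    climate="Auckland",
    style="suburban",
    budget="not specified",
)
prompt = build_prompt(brief, "Show after 3 years.")
print(prompt)
```

With this in place, even a short instruction reaches the model together with the density, realism, maintenance, climate, style, and budget it would otherwise have to guess.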

How About Using Reference Images?

Then I got curious. Instead of sending only the original image and a text prompt, what if I also included example images I liked?

Suddenly the AI had a visual target. Not exact copying, but guidance. The reference image could influence:

  • density
  • lighting mood
  • realism
  • staging style
  • color palette
  • suburban vs luxury feel
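
As a rough sketch, a request that carries reference images might be assembled like this. The payload shape (`images`, `prompt`), the `ReferenceImage` type, and the file name `liked_garden.jpg` are all assumptions for illustration; check the documentation of the image API you use for the actual field names and whether it accepts multiple input images.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ReferenceImage:
    """Hypothetical pairing of a reference image with the traits it should guide."""
    path: Path
    guides: tuple[str, ...]  # e.g. ("density", "lighting mood")


def build_edit_request(source: Path, prompt: str, references: list[ReferenceImage]) -> dict:
    """Build a plain dict describing the edit; no network call is made here."""
    notes = [
        f"Reference {i} ({ref.path.name}): follow its {', '.join(ref.guides)}; do not copy it exactly."
        for i, ref in enumerate(references, start=1)
    ]
    return {
        # The image to edit comes first, followed by the reference images.
        "images": [str(source)] + [str(ref.path) for ref in references],
        # The prompt states what each reference is for.
        "prompt": "\n".join([prompt, *notes]),
    }


request = build_edit_request(
    Path("messy_backyard.jpg"),
    "Add low-maintenance shrubs to this Auckland backyard.",
    [ReferenceImage(Path("liked_garden.jpg"), ("density", "lighting mood"))],
)
print(request["prompt"])
```

Spelling out which traits each reference should guide keeps the image in the role of guidance rather than something to copy.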

And honestly, this may become one of the most important patterns in AI apps going forward.

Not:

“Here is my magic prompt.”

But:

“Here is the style, taste, realism, and emotional direction I want you to follow.”

Final Thought

I started this project thinking:

“The better the prompt, the better the image.”

Now I think a more accurate statement is:

“The better the accumulated context, the better the direction.”

And that subtle difference completely changed how I think about building AI products.
