Cathy Lai

Posted on May 26

I Thought My One Sentence Was Creating a Perfect GPT Image, Until I Realized It Had Been Learning My "Taste" All Along

#promptengineering #ai #devjournal #buildinpublic

When I first started experimenting with GPT image generation for my Garden Visualizer project, I honestly thought the workflow would be simple:

Upload image
Write one good prompt
Receive perfect result

A prompt like:

“Add low-maintenance shrubs to this Auckland backyard.”

And it works?

Not exactly!…

After weeks of experimenting in the ChatGPT app with staging photos, gardens, landscaping concepts, and “future visualizations”, I slowly realized something surprising:

The impressive results I was getting were not coming from a single prompt at all. They were coming from hidden context that ChatGPT had been accumulating throughout the conversation.

The Strange Thing I Noticed

At some point, I could type something incredibly vague like:

“Show after 3 years.”

…and GPT somehow knew:

Which shrubs I meant
Where the lemon tree was
That I disliked overcrowded gardens
That I preferred realistic suburban styling
That I wanted sparse planting
That I preferred low-maintenance layouts

That was the moment I realized: The “prompt” was not the whole product. The conversation itself had become part of the generation engine.

My Wrong Assumption About the Image Edit API

Initially, I assumed the Image Edit API worked similarly to ChatGPT conversations.

I thought:

the model would somehow “remember”
the AI would learn over time
previous generations would influence future ones automatically

But API calls are usually stateless, which means every request is effectively isolated unless you manually resend the context.

So this:

{
  "users_image": "messy_backyard.jpg",
  "prompt": "plz generate a nicely layout idea for this garden"
}

…actually contains almost no useful information.

The model has no idea:

How dense the plants should be
What realism level
What maintenance standard
What climate
What style direction
Budget constraints

The API is not “remembering”. MY app has to carry the memory.
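Here is a minimal sketch of what "carrying the memory" could look like, assuming the app stores my preferences itself and rebuilds the full prompt on every request. The `GardenContext` class, its fields, and `build_prompt` are hypothetical names made up for illustration, not part of any real API.

```python
from dataclasses import dataclass, field


@dataclass
class GardenContext:
    """Preferences the app stores itself, because each API request is isolated.

    Field names and default values are illustrative assumptions.
    """

    density: str = "sparse planting, not overcrowded"
    realism: str = "realistic suburban garden"
    maintenance: str = "low-maintenance plants"
    climate: str = "Auckland"
    style: str = "simple suburban styling"
    budget: str = "modest"  # placeholder value, not from any real project
    decisions: list[str] = field(default_factory=list)

    def remember(self, note: str) -> None:
        """Store an earlier decision so later prompts can refer back to it."""
        self.decisions.append(note)


def build_prompt(ctx: GardenContext, instruction: str) -> str:
    """Turn a short instruction into a self-contained prompt."""
    lines = [
        f"Task: {instruction}",
        f"Planting density: {ctx.density}",
        f"Realism: {ctx.realism}",
        f"Maintenance: {ctx.maintenance}",
        f"Climate: {ctx.climate}",
        f"Style: {ctx.style}",
        f"Budget: {ctx.budget}",
    ]
    if ctx.decisions:
        lines.append("Already decided: " + "; ".join(ctx.decisions))
    return "\n".join(lines)


ctx = GardenContext()
ctx.remember("keep the existing lemon tree where it is")  # hypothetical note
prompt = build_prompt(ctx, "Show after 3 years.")
print(prompt)
```

The instruction I type stays short, but the request the model sees is rebuilt from stored context every single time.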

How About Using Reference Images?

Then I got curious. Instead of sending only the original image and a pure text prompt, what if I also included example images I liked?

Suddenly the AI had a visual target. Not exact copying, but guidance. The reference image could influence:

density
lighting mood
realism
staging style
color palette
suburban vs luxury feel

And honestly, this may become one of the most important patterns in AI apps going forward.

Not:

“Here is my magic prompt.”

But:

“Here is the style, taste, realism, and emotional direction I want you to follow.”
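As a rough sketch of the reference-image idea, assuming a request shape like the JSON earlier in this post: the `users_image` and `prompt` keys come from that example, while `reference_images`, the file names, and `build_edit_request` are hypothetical additions. A real image API will have its own field names, so treat this as the shape of the idea rather than a working call.

```python
import json


def build_edit_request(users_image: str, prompt: str,
                       reference_images: list[str]) -> dict:
    """Bundle the photo to edit, the text prompt, and liked example images.

    The extra sentence tells the model the references are guidance only.
    """
    guidance = (
        "Edit only the user's image. Treat the reference images as style "
        "guidance for density, lighting mood, realism, staging style and "
        "color palette. Do not copy them exactly."
    )
    return {
        "users_image": users_image,
        "reference_images": list(reference_images),  # hypothetical field
        "prompt": f"{prompt}\n\n{guidance}",
    }


request = build_edit_request(
    "messy_backyard.jpg",
    "Add low-maintenance shrubs to this Auckland backyard.",
    ["liked_garden_1.jpg", "liked_garden_2.jpg"],  # made-up file names
)
print(json.dumps(request, indent=2))
```

The text prompt still matters, but it now mostly explains how to use the images instead of trying to describe a whole look in words.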

Final Thought

I started this project thinking:

“The better the prompt, the better the image.”

Now I think a more accurate statement is:

“The better the accumulated context, the better the direction.”

And that subtle difference completely changed how I think about building AI products.
