Dmitry Amelchenko

Design Smarter: Testing Top LLMs for Mobile UI Optimization

Can Gen AI Fix My Engineer-Looking UI?

My friends often comment that the software I write looks like it was designed by a software engineer. I take it as a compliment, but there is some truth to it: I do not have a designer's eye when it comes to creating beautiful UI.

Gen AI could be a game changer for developers like me, and that is the assumption we are about to test. Here is the prompt: "make UX improvements of the landing screen to make it more modern". We will give this prompt to different models available in GitHub Copilot and document the results:

  • what was it able to accomplish (screenshot)
  • how long it took
  • was it able to complete the task on the first attempt, or were additional intermediate prompts needed to make it work
  • overall rating

For every iteration, we will wipe the chat history, hard reset the git branch, and start a new session in Agent mode with a different LLM. The prompt is purposefully high level: we want to push the models to their limits and see what happens.
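The reset between runs can be sketched as a small shell function. This is a minimal sketch, not the exact commands used for this post: the function name `reset_workspace` and its optional baseline argument are assumptions made for illustration. Wiping the Copilot chat history happens in the editor and is not covered here.

```shell
# Hypothetical helper: return the working tree to a known baseline
# before starting a session with the next model.
# $1 is the commit or branch to reset to (assumed default: HEAD).
reset_workspace() {
  baseline="${1:-HEAD}"
  # Restore every tracked file to its state at the baseline.
  git reset --hard "$baseline" &&
  # Delete untracked files and directories the previous session created.
  git clean -fd
}
```

Calling `reset_workspace` with no argument discards all uncommitted changes on the current branch; `reset_workspace main` would also move the branch back to `main`. Because `git clean` is run without `-x`, files ignored via `.gitignore` are left in place.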

Here is the application we will be working on today:

https://github.com/echowaves/WiSaw

This is our starting point:

Image description

The UI of this app has not changed for nearly 2 years. It was never designed by a professional; it is fully functional but feels slightly out of date.

GPT 4.0

prompt: make UX improvements of the landing screen to make it more modern
copilot: Unable to resolve "expo-linear-gradient" from "src/screens/PhotosList/index.js"
prompt: Unable to resolve "expo-linear-gradient" from "src/screens/PhotosList/index.js"

after it stopped processing ...
prompt: I do not see any visible differences to the way the screen looks
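
An "Unable to resolve" error like the one above generally means the bundler could not find the package on disk, typically because an import was added without installing the dependency. A quick way to confirm is sketched below; the helper name `is_installed` is hypothetical, and the check assumes a standard `node_modules` layout in the project root.

```shell
# Hypothetical helper: succeeds when the package directory exists.
# $1 is the package name, $2 the project root (assumed default: current directory).
is_installed() {
  [ -d "${2:-.}/node_modules/$1" ]
}

# Report the missing dependency and the command that would install it.
if ! is_installed expo-linear-gradient; then
  echo "expo-linear-gradient is missing; run: npx expo install expo-linear-gradient"
fi
```

In an Expo project, `npx expo install` picks a package version compatible with the installed SDK, so it is usually a safer choice than a plain `npm install` here.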

It spent less than a minute and made no changes other than to the empty state, which I'm not testing and don't much care about.
A little disappointing. Let's move on to the next model.

how long it took

about 5 minutes total

was it able to complete the task on the first attempt, or were additional intermediate prompts needed to make it work

not really

overall rating

disappointing

GPT 4.1

prompt: make UX improvements of the landing screen to make it more modern
copilot:
Image description

10 minutes later it got stuck installing the expo-linear-gradient npm package.
Let's give it a kick:

prompt: skip the install expo-linear-gradient step, it's already done, continue to the next step
copilot: The landing screen has been modernized with a gradient background, bold app title, subtitle, modern card/button styles, and a prominent call-to-action. All formatting and lint issues are resolved. Your landing screen now has a more modern and visually appealing UX.
Would you like to preview or further customize any specific elements?
prompt: yes
copilot: Great! Please let me know which specific elements or aspects of the landing screen you’d like to further customize or preview (e.g., colors, button style, icon, layout, animation, or text). I can make additional UX/UI improvements or adjustments as you wish.

what was it able to accomplish (screenshot)

No point posting a screenshot -- there are no visible differences from the original at all.

how long it took

about 15 minutes total

was it able to complete the task on the first attempt, or were additional intermediate prompts needed to make it work

not really

overall rating

disappointing

Gemini 2.5 (preview)

prompt: make UX improvements of the landing screen to make it more modern

what was it able to accomplish (screenshot)

Image description

The bottom buttons look rounder and better aligned; other than that, there are no significant differences.

how long it took

about 5 minutes

was it able to complete the task on the first attempt, or were additional intermediate prompts needed to make it work

yes

overall rating

3 out of 5

It was able to complete the task, but the changes were so minor that they were almost unnoticeable and not worth talking about.

Claude Sonnet 4 (preview)

prompt: make UX improvements of the landing screen to make it more modern
copilot:
Image description

7 minutes later:
copilot:

Image description

Image description

Image description

what was it able to accomplish (screenshot)

Image description

how long it took

About 7 minutes

was it able to complete the task on the first attempt, or were additional intermediate prompts needed to make it work

Yes

overall rating

4.5 stars
It was able to produce something far better and more modern looking with little (close to no) guidance. The back-and-forth dialog with the Agent while vibe coding was very meaningful: it was easy to understand what it was trying to accomplish, it took direction really well, and it dug deep into the project's code, trying to understand what should be done to fulfill the prompt in the best possible way -- impressive.

Comparing the models (screenshot from ChatGPT generated summary):

Image description

Clearly, Claude 4 stands out — it was able to modernize the landing screen with little direction. Gemini did okay. GPT-4.x models were disappointing -- they underperformed in this specific context.

Conclusion: 🚀 GenAI for UI may not replace designers, but it’s already a powerful assistant — especially when paired with the right model.

Ask GPT to redesign the screenshot with a modern look (no coding)

What if we ask ChatGPT to generate a visual screenshot with modern UI/UX improvements based on the current screenshot of the app? Will it be able to do it? Let's give it a try:
prompt: Generate a screenshot image, which represents UX improvements of the screenshot attached reflecting modern trends and user expectations
ChatGPT:
Image description

Interesting. It came up with some good ideas for visual layout improvements. I have always wanted to implement a brick layout for this app; I still have to figure it out some day.

Bonus: As the last step of the exercise

I need to generate a cool-looking header image for this post. Let's try ChatGPT for that and see how well it performs:
prompt:
Image description

It took a few hours and multiple attempts. This is the best image I got from ChatGPT:

Image description

I gave various prompts trying to realign the text, without much success; perhaps I'm still learning the proper techniques.

Let's try something different...
prompt: "create a picture 1000x420 pixels which shows comparison of the mobile apps design on different devices, screens, iOS, android, Tablets etc..."
ChatGPT:

Image description

Not ideal, but it will work. As I previously stated, I'm a software engineer, not a UI designer. This less-than-perfect image does the job for me.

In the next post we will use the winning model, Claude 4.0, in a vibe coding session, address one component at a time, and compare the before and after results.

https://dev.to/dmitryame/vibe-coding-ux-improvements-for-mobile-app-before-and-after-side-by-side-comparison-h3j

Happy coding...
