Over the past year, I've completed five projects for medium and large established companies. Four were greenfield builds; the fifth involved integrating AI features into an existing platform.
I'm fundamentally a software engineer who's deeply curious about AI. I've gravitated here because it's fascinating.
That said, if you need to pin me on the AI spectrum, AI Engineer fits best. The field is still evolving, so titles are a bit of a mess. I mean, we're still debating software engineering titles after decades!
In this post, I thought I'd tell you more about what I actually do day-to-day and what I believe makes me an AI Engineer.
Why "AI Engineer"?
While looking for gigs, I’ve seen a LOT of job titles. They're like a startup's org chart - fluid and confusing. Here are some I came across, along with what their job descriptions and requirements looked like:
- Machine Learning Engineer: Heavy on deploying ML models at scale, often with a focus on production pipelines (e.g., using Kubernetes for model serving). It's more ops-intensive.
- Data Scientist: This is a classic data role. They dive into data analysis, hypothesis testing, and storytelling with stats. They might build prototypes in Jupyter notebooks and prepare a presentation about findings, but they're not usually wiring it all into a full app.
- Data Engineer: The pipeline maestro. They build ETL processes, manage data lakes on AWS S3 or warehouses like GCP BigQuery, and ensure data flows cleanly.
- Deep Learning Engineer: Specialised in neural nets, often researching architectures like transformers or GANs. Think tweaking layers in PyTorch for cutting-edge performance.
- AI Solutions Architect: Basically a Solutions Architect (or DevOps?) that knows how to work with AI-specific tools. They design high-level systems, advise on cloud setups (e.g., Azure AI vs. Google Vertex), and align tech with business goals. Less coding, more diagramming and cloud infrastructure setup.
- Research Engineer: Academia-adjacent, pushing boundaries with papers. They invent new algorithms rather than integrate them into products.
- Algorithm Engineer: I've seen this pop up too: focused on optimising core math, like efficient sorting for AI inference.
The AI space is still nascent, so overlap is huge, and for some roles the title and description were completely different from what I have here.
For me, AI Engineer screams hands-on builder. I'm not theorising in a lab or handling petabyte-scale data streams; rather, I'm crafting end-to-end solutions that work in the real world.
What does an AI Engineer actually do?
At my core, I'm solving software engineering problems in an AI context. I take AI capabilities (like pre-trained models from Azure AI Foundry or OpenRouter) and make them work in production.
Here's a breakdown of my toolkit and typical tasks:
1. Model setup and tweaking
On every project I had to choose a foundation model (e.g., OpenAI's GPT-5, xAI's Grok 4, or Google's Gemini 2.5 via Google AI or Vertex AI).
So far I've never had to fine-tune a model for a production-ready project; most problems were solved with different prompting or extra context. I did fine-tune once at a hackathon we had at Sainsbury’s, and if I had to do it for a project, I would just use the standard platform-provided tools.
Azure, for example, has a guide on techniques, fine-tuning data formats, and the tools it offers for adapting general-purpose LLMs. That would be my starting point.
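For illustration only (again, I haven't needed this in production): the chat-format JSONL that platform fine-tuning tools generally expect, plus kicking off a job via the OpenAI SDK as one concrete example. The file name and model snapshot are placeholders.

```python
# Sketch: upload chat-format training data and start a fine-tuning job.
# "training.jsonl" and the model snapshot are placeholders, not from a real project.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# training.jsonl holds one example per line, e.g.:
# {"messages": [{"role": "system", "content": "You are a support bot."},
#               {"role": "user", "content": "Where is my order?"},
#               {"role": "assistant", "content": "Let me check that for you."}]}

training_file = client.files.create(
    file=open("training.jsonl", "rb"),
    purpose="fine-tune",
)
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-mini-2024-07-18",  # a snapshot that supports fine-tuning
)
print(job.id, job.status)
```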
Another important and frequent task was testing and experimenting with prompts, handling token limits, and making sure the model behaves more or less deterministically.
“Deterministic” as in trying to control or predict what the output will be, so that we can prepare our code/app/UI for it.
Counterintuitive? Yes. But that’s what a lot of companies want; it’s the requirement I’ve seen most often.
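Here's a minimal sketch of what that looks like in practice, assuming the OpenAI Python SDK; the model name and the JSON contract are illustrative:

```python
# Sketch: pin temperature to 0, set a seed (best-effort, not a hard guarantee),
# and constrain the output shape so the UI can rely on it.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative model choice
    temperature=0,   # minimise sampling randomness
    seed=42,         # ask the provider for best-effort reproducibility
    messages=[
        {
            "role": "system",
            "content": 'Reply ONLY with JSON like {"sentiment": "positive"}, '
                       'where sentiment is "positive", "negative" or "neutral".',
        },
        {"role": "user", "content": "The onboarding flow was painless."},
    ],
)
print(response.choices[0].message.content)
```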
2. Integration and infrastructure
LLM calls are just API calls to someone else’s infrastructure. These calls can’t exist on their own; you have to embed them into a framework.
Of course, some LLM- or AI-agent-oriented frameworks ship with a built-in server, like the LangGraph server, but not all do, and quite often those are raw. Personally, I don’t like using them in production.
So what I did was either set up a FastAPI server (where I could add auth middleware, database integrations, and custom routing) or reuse existing Next.js or Astro API routes as a backend to expose AI models or agents, plus a custom SSE or WebSocket setup to stream real-time responses to the frontend.
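As a rough sketch of that setup (endpoint names and the model are placeholders, and real auth would be more than a header check): a FastAPI endpoint that streams LLM tokens to the client over SSE.

```python
# Sketch: FastAPI + SSE streaming of LLM tokens, with a stand-in auth dependency.
from fastapi import Depends, FastAPI, Header, HTTPException
from fastapi.responses import StreamingResponse
from openai import AsyncOpenAI
from pydantic import BaseModel

app = FastAPI()
client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment

async def require_token(authorization: str = Header(...)):
    # Stand-in for real auth middleware (JWT validation, sessions, etc.)
    if not authorization.startswith("Bearer "):
        raise HTTPException(status_code=401, detail="Missing bearer token")

class ChatRequest(BaseModel):
    prompt: str

@app.post("/chat", dependencies=[Depends(require_token)])
async def chat(req: ChatRequest):
    async def event_stream():
        stream = await client.chat.completions.create(
            model="gpt-4o",  # illustrative model choice
            messages=[{"role": "user", "content": req.prompt}],
            stream=True,
        )
        async for chunk in stream:
            delta = chunk.choices[0].delta.content
            if delta:
                yield f"data: {delta}\n\n"  # SSE frames are "data: ...\n\n"
        yield "data: [DONE]\n\n"

    return StreamingResponse(event_stream(), media_type="text/event-stream")
```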
With that setup came lots of Docker containers. Lots and lots of Docker images… 😄
3. Frontend and user experience
Why build all this infrastructure if no one can interact with it? So we put on the front-end engineer hat and handle all those streaming chat responses token by token, for that snappy feel, with React and TypeScript.
Moreover, LLMs don't just generate text; they generate images too, and they accept images and documents as input. So I spent a lot of time debugging multimodal issues: figuring out why an LLM rejects an image, converting it to base64, and sending it correctly via the REST API.
And then doing the same in the other direction.
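A minimal sketch of the image-in direction, assuming the OpenAI SDK; the file name, model, and prompt are made up:

```python
# Sketch: base64-encode an image and send it to a multimodal chat endpoint.
import base64

from openai import OpenAI

client = OpenAI()

with open("invoice.png", "rb") as f:  # placeholder file
    b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative multimodal model
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is the total on this invoice?"},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)
```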
There were also lots of parsing and formatting tasks, because LLMs don’t like HTML. If you’ve scraped a page and want to feed it to an LLM, you’d better convert it to markdown first.
The same goes for tables. LLMs handle raw HTML tables poorly, so transforming them into LLM-friendly markdown or JSON is what I did.
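Sketched below with markdownify and pandas, which are my library picks for illustration, not necessarily what a given project used (html2text and friends work just as well; pd.read_html needs lxml, and to_markdown needs tabulate):

```python
# Sketch: scraped HTML -> markdown for prose, and HTML table -> markdown/JSON.
from io import StringIO

import pandas as pd
from markdownify import markdownify as md

html = (
    "<h2>Pricing</h2>"
    "<table><tr><th>Plan</th><th>Cost</th></tr>"
    "<tr><td>Pro</td><td>$20</td></tr></table>"
)

# Whole page -> markdown the LLM can actually follow.
print(md(html))

# Tables specifically -> DataFrame -> markdown (or .to_json() for JSON).
table = pd.read_html(StringIO(html))[0]
print(table.to_markdown(index=False))
```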
4. End-to-end problem solving
Just like any software project, there’s a lot more to handle end-to-end: observability and monitoring with tools like Azure Application Insights, Sentry, or LangSmith; keeping latency low; adding guardrails and AI configs so the AI doesn’t say anything inappropriate; API rate limiting; error handling (do you send an error as a chat message or show a toast?); handling secrets (because we’re not inexperienced vibe coders who push secrets to public repos, are we?)…
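To make one of those concrete, here's a hedged sketch of the error-handling slice: retries with backoff around an LLM call, plus a typed error the UI layer can map to a toast or a chat message. tenacity is my pick here; any retry helper does the job.

```python
# Sketch: retry transient provider failures, then surface a typed error to the UI.
from openai import APIError, OpenAI, RateLimitError
from tenacity import (retry, retry_if_exception_type, stop_after_attempt,
                      wait_exponential)

client = OpenAI()

class LLMUnavailable(Exception):
    """Raised when the provider keeps failing; the frontend decides toast vs. chat."""

@retry(
    retry=retry_if_exception_type((RateLimitError, APIError)),
    wait=wait_exponential(multiplier=1, max=10),
    stop=stop_after_attempt(3),
    reraise=True,
)
def ask(prompt: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o",  # illustrative model choice
        messages=[{"role": "user", "content": prompt}],
        timeout=20,  # don't let one slow call block the request
    )
    return resp.choices[0].message.content or ""

def safe_ask(prompt: str) -> str:
    try:
        return ask(prompt)
    except (RateLimitError, APIError) as exc:
        raise LLMUnavailable(str(exc)) from exc
```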
Essentially, all of the classical software engineering, I’d say. Just with a bit of an AI twist. 🤷
What I’m not touching
I'm not touching the "data territory" much. If you need me to:
- Scour datasets, clean outliers with Pandas,
- Train from scratch on SageMaker,
- Evaluate with metrics like ROC-AUC or Weights & Biases dashboards,
- Iterate on hyperparameters...
...I can do it (I know the theory from self-study and curiosity). But that's veering into Data Scientist/Data Engineer land, and unless you're just building a PoC, you're better off finding someone who specialises in it.
My sweet spot is software issues. Things like:
- Debugging why a model inference fails in production.
- Optimising API endpoints for 100ms response times.
- Ensuring cross-browser compatibility for AI-generated content.
- Scaling from prototype to handling 10k concurrent users.
- Integrating with legacy systems (e.g., bridging AI outputs to a monolithic Java backend).
I’m a blue-collar worker of the AI world
In essence, as an AI Engineer, I'm the bridge between fancy AI models and real apps. I handle the coding, setup, and fixes to make everything work smoothly in production. From my five projects with different companies, I've learned this role is all about practical building, not data crunching or research.
No PhD needed. Just strong coding skills and a love for getting AI into users' hands.