18 months of building with LLMs. Here's what survived my actual workflow.
The Stack
Not the trendy stuff. The tools I reach for every day.
Claude Pro — $20/mo
Core LLM. I use the API for automation and claude.ai for exploration. The 200K context window is genuinely class-leading.
Why it wins: Context. I can give it an entire codebase and say "what does this do?" and get a coherent answer.
Cursor — $20/mo
VS Code fork with AI baked in at the core. Not a plugin — an IDE designed around AI.
Why it wins: Codebase awareness. It indexes your project and actually understands context across files.
Pydantic — Free
For structured output from LLMs. Define your schema once, get validated output.
```python
from pydantic import BaseModel

class WeatherResponse(BaseModel):
    city: str
    temp_c: float
    condition: str

# raw_reply is the JSON string your LLM call returned
result = WeatherResponse.model_validate_json(raw_reply)
```
No more manual JSON parsing and validation.
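The real payoff is the failure mode: when the model drifts from the schema, validation fails loudly instead of passing bad data downstream. A minimal sketch (the malformed payload is hand-written for illustration):

```python
from pydantic import BaseModel, ValidationError

class WeatherResponse(BaseModel):
    city: str
    temp_c: float
    condition: str

# An LLM reply with the wrong type for temp_c ("cold" is not a float).
bad_reply = '{"city": "Oslo", "temp_c": "cold", "condition": "snow"}'

errors = []
try:
    WeatherResponse.model_validate_json(bad_reply)
except ValidationError as err:
    errors = err.errors()

# The first error pinpoints the failing field, so a retry prompt can quote it.
print(errors[0]["loc"])  # ('temp_c',)
```

That `('temp_c',)` location is exactly what you feed back to the model on retry.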
Helicone — Free tier
LLM observability. See what's being sent to your models, track costs, spot patterns in failures.
Why it wins: Doesn't slow down your code. Drop-in logging that actually tells you useful things.
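"Drop-in" here means a proxy: you point your existing client at Helicone's gateway and add an auth header, with no changes to the calls themselves. A hedged sketch with the Anthropic SDK — the proxy URL and header name are from my memory of Helicone's setup, so check their docs for the current values:

```python
import os
from anthropic import Anthropic

# Route requests through Helicone's proxy instead of the API directly.
# URL and header below are assumptions; verify against Helicone's docs.
client = Anthropic(
    api_key=os.environ["ANTHROPIC_API_KEY"],
    base_url="https://anthropic.helicone.ai",
    default_headers={
        "Helicone-Auth": f"Bearer {os.environ['HELICONE_API_KEY']}",
    },
)
# Every client.messages.create(...) call is now logged with cost and latency.
```

The rest of your code is untouched, which is why it adds no meaningful overhead.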
What I Stopped Using
LangChain: Too much abstraction for what it gives you. Raw API calls + Pydantic = 90% of what LangChain provides without the complexity.
Vector databases for everything: Everyone's reaching for vector DBs. Most of the time, a simple keyword search or relational DB is faster and more reliable.
Complex prompt chaining: If your workflow needs 5 LLM calls chained together, your architecture is probably wrong.
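On the vector-DB point: before reaching for embeddings, it's worth seeing how far a plain keyword overlap score gets you. A toy sketch, no infra required (the docs and query are made up for illustration):

```python
def keyword_score(query: str, doc: str) -> int:
    """Count how many query terms appear in the document (case-insensitive)."""
    terms = set(query.lower().split())
    words = set(doc.lower().split())
    return len(terms & words)

docs = [
    "Reset your password from the account settings page",
    "Invoices are emailed on the first of each month",
    "Two-factor authentication setup guide",
]
query = "how do I reset my password"

# Pick the document sharing the most terms with the query.
best = max(docs, key=lambda d: keyword_score(query, d))
print(best)  # Reset your password from the account settings page
```

If this baseline answers most of your queries, the vector DB was never the bottleneck.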
The Honest Take
LLM tooling has matured. The "best" tools are the boring ones that stay out of your way:
- Claude for intelligence
- Cursor for coding
- Pydantic for structure
- Helicone for observability
Everything else is context-dependent. These are what survived 18 months of real work.
Writing about what actually works, not what's trendy.