When you wrap an external API in Elixir (a module called Acme for whatever the upstream service is), most of what Acme.fetch_user/1 does is response handling: check that the response means what you expect, pull data out of it, translate failures into something the rest of your application can use. The code that does this almost always lives in your function body, somewhere after the Tesla.get call. A case, or a handle_response/1 chained on at the end of a |> pipeline.
This article is about why that's the wrong place for it. Not because the code is ugly, but because of one structural difference between a |> chain and a Tesla middleware: middleware can fail the request itself, in a way that lights up your telemetry. A function call after Tesla.get cannot. And that difference is the entire reason your observability stack (logs, APM, alerts, whatever you've wired up to catch outages) is undercounting them.
⚠️ A note before we start. Code samples use Tesla. For the purposes of this article (middleware-style extension points, telemetry support, request/response pipelines), Tesla and Req have functional parity, so the same shapes translate cleanly to Req. Pick whichever you're using and translate as you go.
Real APIs don't read their own docs
A quick reality check before the diagnosis, because most of the bad code I see assumes a world that doesn't exist.
Hello-world examples in tutorials usually pick GitHub or some other well-behaved REST API. Real APIs are messier. A few examples off the top of my head:
- I've seen well-known, large providers respond with bodies that fail to parse. Actually broken JSON, in production 🙃
- A field documented as a boolean returns the JSON string "true" for true and, more impressively, the JSON string "No" for false.
- A field documented as a string returns "N/A", or just "", to mean "no data".
The list goes on. Some upstreams really are this sloppy. Even when they aren't, your API client still has to translate external bytes into something your application can rely on, and parsing those bytes can fail just as fundamentally as a connection can time out. And, hold this thought: if malformed JSON makes Tesla return {:error, ...} and you accept that as a failed request, a malformed value inside already-decoded JSON should be the same kind of failure. Same problem, one layer up.
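As a minimal sketch of what that translation step can look like for the quirks listed above: the module and function names here (SloppyField, parse_boolean/1, parse_optional_string/1) are made up for illustration, and the accepted spellings are assumptions about one imaginary upstream rather than any standard.

```elixir
defmodule SloppyField do
  # Hypothetical helpers: the accepted spellings below are assumptions
  # about one imaginary upstream, not a general rule.

  # Accepts real booleans plus the string spellings seen in the wild.
  def parse_boolean(value) when is_boolean(value), do: {:ok, value}

  def parse_boolean(value) when is_binary(value) do
    case String.downcase(value) do
      v when v in ["true", "yes"] -> {:ok, true}
      v when v in ["false", "no"] -> {:ok, false}
      _ -> {:error, {:invalid_boolean, value}}
    end
  end

  def parse_boolean(value), do: {:error, {:invalid_boolean, value}}

  # Treats the "no data" placeholders as nil instead of as real strings.
  def parse_optional_string(value) when value in ["", "N/A"], do: {:ok, nil}
  def parse_optional_string(nil), do: {:ok, nil}
  def parse_optional_string(value) when is_binary(value), do: {:ok, value}
  def parse_optional_string(value), do: {:error, {:invalid_string, value}}
end
```

Both helpers return the same {:ok, value} | {:error, reason} shape, so an unexpected spelling surfaces as an error instead of a silently wrong value.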
The pattern that quietly fails
Here is a fetch_user/1 written in a pattern I see a lot. handle_response/1 is a shared helper: every endpoint in the module funnels through it. fetch_user, create_user, delete_invoice all end with |> handle_response(): one place for the wrap, one place for error translation. We're showing one endpoint here to keep the example small.
def fetch_user(id) do
  case client() |> Tesla.get("/users/#{id}") |> handle_response() do
    {:ok, body} -> {:ok, decode_user(body)}
    {:error, :not_found} -> {:error, :not_found}
    _ -> {:error, :service_unavailable}
  end
end

defp handle_response({:ok, %Tesla.Env{status: status, body: body}})
     when status in 200..299 do
  {:ok, body}
end

defp handle_response({:ok, %Tesla.Env{status: 404}}) do
  {:error, :not_found}
end

defp handle_response({:ok, %Tesla.Env{status: status, body: body}}) do
  Logger.error("Acme API returned unexpected status #{status}: #{inspect(body)}")
  {:error, {:unexpected_status, status, body}}
end

defp handle_response(result), do: result

defp decode_user(body) do
  {:ok, created_at, _} = DateTime.from_iso8601(body["created_at"])

  %{
    id: body["id"],
    first_name: body["first_name"],
    last_name: body["last_name"],
    email: body["email"],
    created_at: created_at
  }
end
The case maps every outcome to something honest: 2xx becomes {:ok, decode_user(body)}, 404 becomes {:error, :not_found}, anything else becomes {:error, :service_unavailable}. From the caller's perspective, this is a clean API.
But here's the trap. When the upstream returns a 500, handle_response/1 produces {:error, {:unexpected_status, 500, body}}. The case falls through to _ -> {:error, :service_unavailable}. The caller gets a clean tagged tuple, and nothing about that path is wrong from the function's point of view.
The thing that is wrong is one layer below. From the HTTP client's point of view, the request succeeded: bytes went out, bytes came back, the body decoder ran, the middleware chain returned {:ok, env}. The standard Tesla telemetry event, the one your observability stack is wired up to, also says success, and the dashboard stays green. And it's not only dashboards: Tesla.Middleware.Retry with its default should_retry (a match on {:error, _}), or anything else that decides what to do next based on the result of the request, sees {:ok, env} and stays quiet. The 500 only becomes an "error" inside handle_response/1's return value and inside the case mapping, both of them in your application code, after the request lifecycle is already over.
That's also why Logger.error lines like the one above start showing up inside handle_response/1 in the first place. Someone notices the dashboards aren't catching 5xx and patches the gap by hand. The error reaches the logs, but it sidesteps the existing telemetry pipeline entirely, and every API client module ends up reinventing its own logging conventions.
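To make the blind spot concrete, here is a rough sketch of the classification a handler for [:tesla, :request, :stop] might perform. OutcomeProbe and its functions are hypothetical, plain maps stand in for %Tesla.Env{} so the snippet runs without dependencies, and the metadata shapes are an assumption based on how this article describes the event: env alone on success, an extra error key on failure.

```elixir
defmodule OutcomeProbe do
  # Hypothetical handler in the shape :telemetry expects:
  # (event_name, measurements, metadata, config).
  # In a real app it would be attached with something like
  #   :telemetry.attach("acme-probe", [:tesla, :request, :stop],
  #                     &OutcomeProbe.handle_event/4, self())
  def handle_event([:tesla, :request, :stop], _measurements, metadata, reply_to) do
    send(reply_to, {:outcome, outcome(metadata)})
  end

  # A request only counts as failed when the client itself reported an error.
  def outcome(%{error: _reason}), do: :error
  def outcome(%{env: _env}), do: :ok
end

# Upstream answered 500, but nothing in the middleware chain rejected it,
# so the metadata carries only the env.
OutcomeProbe.handle_event(
  [:tesla, :request, :stop],
  %{duration: 1},
  %{env: %{status: 500}},
  self()
)

# A connection timeout, surfaced by the client as {:error, :timeout}.
OutcomeProbe.handle_event(
  [:tesla, :request, :stop],
  %{duration: 1},
  %{env: %{status: nil}, error: :timeout},
  self()
)
```

The handler has nothing but the metadata to go on, so the 500 is counted as a success and only the timeout is counted as a failure.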
And status codes aren't the only place this pattern misbehaves. Look at decode_user/1 again: it returns the user map directly, with no way to signal a parse error. If the upstream renames email to email_address, body["email"] is nil, and the function quietly returns a user with nil where the email should be. The caller has no idea the contract was violated. If body["created_at"] is malformed, DateTime.from_iso8601/1 returns {:error, _} and the {:ok, created_at, _} = ... match raises a MatchError; if the field is missing altogether, the call raises on the nil argument before the match is even attempted. Phoenix or Oban catches either one as an exception, but Tesla still considers the request successful and the telemetry event still says :ok.
Middleware can fail the request
This is the most underrated thing about middleware in Tesla pipelines. They look like a glorified |> chain: middleware in a list, requests flow through, responses come back. And if that were all they were, you could replace them with a handle_response/1 chained after the call and lose nothing.
But middleware can do something a |> chain in your function body cannot: fail the request itself. When a middleware returns {:error, ...} from Tesla.run, the request short-circuits. The surrounding telemetry middleware emits its [:tesla, :request, :stop] event with error: reason in the event metadata instead of just env: env, and the caller of Tesla.get gets back {:error, ...} (or, for Tesla.get!, a Tesla.Error exception). A handle_response/1 chained on after the call can wrap the result in any tagged tuple it wants, but it has no way to reach back and mark the request itself as failed, so none of the failures it detects show up in telemetry. Only middleware can do that.
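The difference can be simulated in a few lines without Tesla. The sketch below is a toy model, not Tesla's implementation: ToyChain.run/2, the recorder, and the reject_5xx step are hypothetical stand-ins for Tesla.run/2, the telemetry middleware, and a response-checking middleware.

```elixir
defmodule ToyChain do
  # Toy stand-in for Tesla.run/2: each step receives the env and the
  # remaining steps, and decides whether and how to call them.
  def run(env, []), do: {:ok, env}
  def run(env, [step | rest]), do: step.(env, rest)
end

# Plays the role of the telemetry middleware: reports what the chain
# below it returned, then passes the result through unchanged.
recorder = fn env, next ->
  result = ToyChain.run(env, next)
  send(self(), {:recorded, elem(result, 0)})
  result
end

# Plays the role of a response-checking middleware: turns a 5xx into
# a failed request before the recorder sees the result.
reject_5xx = fn env, next ->
  case ToyChain.run(env, next) do
    {:ok, %{status: status}} when status >= 500 -> {:error, {:unexpected_status, status}}
    other -> other
  end
end

# Plays the role of the adapter: the upstream always answers 500.
adapter = fn env, _next -> {:ok, %{env | status: 500}} end

# The check chained on after the call, as in the earlier fetch_user/1.
handle_response = fn
  {:ok, %{status: status}} when status >= 500 -> {:error, :service_unavailable}
  other -> other
end

# Checking after the call: the caller gets an error, the recorder saw :ok.
after_call = %{status: nil} |> ToyChain.run([recorder, adapter]) |> handle_response.()

# Checking inside the chain: caller and recorder agree that it failed.
in_chain = ToyChain.run(%{status: nil}, [recorder, reject_5xx, adapter])
```

Both calls hand the caller an error tuple. The recorder's mailbox tells the two apart: it holds :ok for the first request and :error for the second.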
What counts as a failure, though, depends on the endpoint. A 404 from fetch_user/1 is fine: the user doesn't exist, return {:error, :not_found}. A 404 from a POST /users endpoint is a different story: the URL is wrong, or a route got renamed, and the request is structurally broken. Exception territory. Same status, opposite semantics. The middleware can't decide globally what counts as "ok"; the expectations have to come from the caller.
Doing it in middleware
The fix is a Tesla middleware that reads, per call, which statuses are expected and how to parse their bodies. Here is Acme.fetch_user/1 rewritten to use it:
def fetch_user(id) do
  case Tesla.get(client(), "/users/#{id}",
         opts: [
           parse_body: %{
             200 => &decode_user/1,
             404 => fn _ -> {:ok, nil} end
           }
         ]
       ) do
    {:ok, %Tesla.Env{status: 200, body: user}} -> {:ok, user}
    {:ok, %Tesla.Env{status: 404}} -> {:error, :not_found}
    {:error, _} -> {:error, :service_unavailable}
  end
end

defp decode_user(body) do
  with :ok <- check_required(body, ["id", "first_name", "last_name", "email", "created_at"]),
       {:ok, created_at, _} <- DateTime.from_iso8601(body["created_at"]) do
    {:ok,
     %{
       id: body["id"],
       first_name: body["first_name"],
       last_name: body["last_name"],
       email: body["email"],
       created_at: created_at
     }}
  end
end

defp check_required(body, fields) do
  case Enum.filter(fields, &is_nil(Map.get(body, &1))) do
    [] -> :ok
    missing -> {:error, {:missing_fields, missing}}
  end
end
The middleware behind parse_body:
defmodule ParseBody do
  @behaviour Tesla.Middleware

  @impl true
  def call(env, next, _opts) do
    with {:ok, env} <- Tesla.run(env, next) do
      parsers = Keyword.fetch!(env.opts, :parse_body)

      case Map.fetch(parsers, env.status) do
        :error ->
          {:error,
           {:unexpected_status,
            status: env.status, expected: Map.keys(parsers), body: env.body}}

        {:ok, parser} ->
          case parser.(env.body) do
            {:ok, parsed} ->
              {:ok, %{env | body: parsed}}

            {:error, reason} ->
              {:error,
               {:parse_error, status: env.status, reason: reason, body: env.body}}
          end
      end
    end
  end
end
Each parser is a function from the response body to {:ok, value} | {:error, reason}. decode_user/1 collects every required field that's missing or null, parses the timestamp, and returns {:ok, %{...}} only when every step succeeds. Any of those failures short-circuits the with to its own tagged tuple. The middleware turns that into a failed Tesla request, not a separate validation step the caller has to remember to run.
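To see the possible outcomes without making an HTTP call, the decision ParseBody makes can be mirrored as a pure function. This is only an illustrative sketch: ParseBodySketch.dispatch/3 and the parse_email parser are hypothetical, and the error tuples leave out the body field to keep the output short.

```elixir
defmodule ParseBodySketch do
  # Hypothetical pure-function mirror of the decision ParseBody makes once
  # the response is back; it takes a status and a body instead of an env.
  def dispatch(status, body, parsers) do
    case Map.fetch(parsers, status) do
      :error ->
        {:error, {:unexpected_status, status: status, expected: Map.keys(parsers)}}

      {:ok, parser} ->
        case parser.(body) do
          {:ok, parsed} -> {:ok, parsed}
          {:error, reason} -> {:error, {:parse_error, status: status, reason: reason}}
        end
    end
  end
end

# Made-up parser: succeeds only when the body carries a string "email".
parse_email = fn
  %{"email" => email} when is_binary(email) -> {:ok, %{email: email}}
  _ -> {:error, :missing_email}
end

parsers = %{200 => parse_email, 404 => fn _ -> {:ok, nil} end}

# Expected status, body matches the contract.
ok = ParseBodySketch.dispatch(200, %{"email" => "ada@example.com"}, parsers)

# Expected status with a parser that ignores the body.
not_found = ParseBodySketch.dispatch(404, "", parsers)

# Expected status, but the upstream renamed the field.
renamed = ParseBodySketch.dispatch(200, %{"email_address" => "ada@example.com"}, parsers)

# A status nobody declared for this call.
outage = ParseBodySketch.dispatch(500, "oops", parsers)
```

The renamed field and the undeclared 500 both come back as {:error, ...}, which is the shape the middleware needs in order to fail the request.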
The function body now matches a clean contract: 200 maps to {:ok, user}, 404 to {:error, :not_found}, every other outcome (including {:error, {:unexpected_status, ...}} and {:error, {:parse_error, ...}} from ParseBody) collapses into {:error, :service_unavailable}. The when status in 200..299 guard is gone. The handle_response/1 helper is gone. The Logger.error line is gone with it: the failed request now fires [:tesla, :request, :stop] with the error in metadata, and whatever you've wired up to that event handles the logging.
Notice that {:error, _} swallows the structured errors, so the function doesn't need to unpack them.
Order matters when wiring it in. In the middleware list, ParseBody goes after Tesla.Middleware.Telemetry (so Telemetry wraps it and records its {:error, _} as the request's result) and before Tesla.Middleware.JSON (responses travel back up the list, so JSON has already decoded the body by the time ParseBody sees it):
defp client do
  Tesla.client([
    {Tesla.Middleware.BaseUrl, "https://api.acme.com"},
    {Tesla.Middleware.Telemetry, metadata: %{service: :acme}},
    ParseBody,
    Tesla.Middleware.JSON
  ])
end
That's the response contract enforced at the transport boundary, with zero protocol-level code in any of your API client functions. Schema mismatches and unexpected statuses now show up in the same telemetry event, the same dashboards, the same exception path as connection timeouts. Your operational view of the system tells the truth.
A note on Tesla.get!
If retrying or falling back on a transient outage isn't part of how the caller actually behaves, :service_unavailable is ceremony. Consider how application code typically treats its own database: nobody pattern-matches {:error, :service_unavailable} on every Repo.get/2. We assume the database works; if it doesn't, the request crashes and the supervisor or web framework deals with it.
You can apply the same discipline to an upstream API. Use Tesla.get! instead of Tesla.get, drop :service_unavailable from the contract, and let any failed request become a Tesla.Error exception. Phoenix turns it into a 500. Oban retries the job. The case in fetch_user/1 shrinks to the two clauses that remain:
case Tesla.get!(...) do
  %Tesla.Env{status: 200, body: user} -> {:ok, user}
  %Tesla.Env{status: 404} -> {:error, :not_found}
end
Whether to do this depends on the upstream's stability and the caller's context. For an unstable third party where the caller has a reasonable fallback, {:error, :service_unavailable} is the honest choice. For a service you treat with the same trust as your database, the bang version is cleaner. Both are fine; just pick deliberately.
When it wants to be data
The parser functions in the parse_body map have a recognizable shape: take a body, return {:ok, structured_term} | {:error, reason}. After writing a few (especially ones full of check_required and DateTime.from_iso8601 boilerplate), they start to look like data. A schema you declare instead of a function you write.
That's what tesla_middleware_mold and Mold do. Acme.fetch_user/1 once more:
def fetch_user(id) do
  case Tesla.get(client(), "/users/#{id}", opts: [mold: %{200 => user_schema(), 404 => nil}]) do
    {:ok, %Tesla.Env{status: 200, body: user}} -> {:ok, user}
    {:ok, %Tesla.Env{status: 404}} -> {:error, :not_found}
    {:error, _} -> {:error, :service_unavailable}
  end
end

defp user_schema do
  %{
    id: :string,
    first_name: :string,
    last_name: :string,
    email: :string,
    created_at: :datetime
  }
end
200 => schema runs Mold.parse/2. 404 => nil accepts the response without parsing. Anything else fails the request.
What you save by writing a schema instead of decode_user/1 isn't reuse. A helper function reuses just as well. It's the parser plumbing itself. Every decode_user-style function ends up reimplementing the same boilerplate: required-field checks, ISO timestamps, nullability, defaults. Mold.parse/2 provides them out of the box.
The function shape Mold accepts as a parser is the same one the middleware accepts (body -> {:ok, term} | {:error, reason}), so you can mix schemas with hand-written parsers in the same map, depending on which is cleaner for which response.
One last thing. Nothing in this article requires tesla_middleware_mold. The hand-rolled ParseBody above is around 20 lines once you collapse the formatting; dropping it into your own codebase already buys you everything the design argument hinges on. Reach for the library when you also want to stop hand-writing parsers. Until then, the pattern is the part that matters.