The release of NodeLLM 1.16 marks a significant milestone in our journey to provide production-grade infrastructure for AI applications. While earlier releases focused on basic integration and safety, version 1.16 focuses on surgical control and multimodal parity.
As agentic workflows become more complex, the ability to guide model behavior with precision—and handle failures gracefully—becomes the difference between a toy and a tool.
🎨 Advanced Image Manipulation
NodeLLM 1.16 introduces high-fidelity image editing and manipulation support. This moves beyond simple text-to-image generation into the realm of In-painting, Masking, and Variations.
Surgical Image Edits
You can now pass source images and masks to the paint() method. For OpenAI providers, this automatically routes requests to the /v1/images/edits endpoint using the gpt-image-1 model, which is specialized for manipulation tasks.
```javascript
const llm = createLLM({ provider: "openai" });

// Modify an existing logo using a mask
const response = await llm.paint("Add a futuristic robot head to the logo", {
  model: "gpt-image-1",
  images: ["logo.png"],
  mask: "logo-mask.png",
  size: "1024x1024"
});

await response.save("edited-logo.png");
```
Image Variations & Asset Support
Generate visual variations of a source image without a prompt, or pass base64/URL assets seamlessly. The underlying BinaryUtils handles the conversion to provider-standard multipart formats, so you don't have to worry about binary boundaries or mime-types.
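BinaryUtils itself is internal, but the kind of normalization it performs can be sketched. The hypothetical helper below (not part of the public API) classifies an asset string as a data URL, remote URL, or local file, and picks the mime-type it would attach to the multipart request:

```javascript
// Hypothetical sketch of asset normalization; NodeLLM's real BinaryUtils
// is internal and may differ. Classifies an image asset string and
// extracts the mime-type where possible.
function classifyAsset(asset) {
  // Inline data URL: "data:image/png;base64,...."
  const dataUrl = asset.match(/^data:([\w/+.-]+);base64,(.*)$/);
  if (dataUrl) {
    return { kind: "base64", mimeType: dataUrl[1], data: dataUrl[2] };
  }
  // Remote URL: passed through, or downloaded before upload
  if (/^https?:\/\//.test(asset)) {
    return { kind: "url", mimeType: null, data: asset };
  }
  // Fallback: treat as a local file path; infer mime-type from the extension
  const ext = asset.split(".").pop().toLowerCase();
  const mimes = { png: "image/png", jpg: "image/jpeg", jpeg: "image/jpeg", webp: "image/webp" };
  return { kind: "file", mimeType: mimes[ext] ?? "application/octet-stream", data: asset };
}
```

Whatever form you pass — `"logo.png"`, an `https://` URL, or a `data:` URI — the library resolves it to provider-standard multipart bytes before the request leaves your process.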
🛠️ Precision Tool Orchestration
One of the most useful features for agentic workflows is the ability to force (or prevent) tool usage at specific turns. NodeLLM 1.16 introduces the choice and calls directives.
Tool Choice
You can now mandate tool usage or force a specific tool, similar to OpenAI's tool_choice but normalized across all major providers (Anthropic, Gemini, Bedrock, and Mistral).
- required: The model must call at least one tool.
- "get_weather": The model must call the specific tool named get_weather.
- none: Tools are disabled for this turn, even if defined.
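To picture what "normalized across providers" means, here is a hypothetical sketch (not NodeLLM's actual adapter code) of how a single choice value could map to the native payload shapes that OpenAI and Anthropic document for their APIs:

```javascript
// Hypothetical sketch: mapping a normalized `choice` directive to
// provider-native payloads. NodeLLM's real mapping lives inside each
// provider adapter and may differ in detail.
function mapToolChoice(choice, provider) {
  if (provider === "openai") {
    // OpenAI accepts the strings "required"/"none", or a function object
    if (choice === "required" || choice === "none") return { tool_choice: choice };
    return { tool_choice: { type: "function", function: { name: choice } } };
  }
  if (provider === "anthropic") {
    // Anthropic uses typed objects: "any" forces some tool, "tool" names one
    if (choice === "required") return { tool_choice: { type: "any" } };
    if (choice === "none") return { tool_choice: { type: "none" } };
    return { tool_choice: { type: "tool", name: choice } };
  }
  throw new Error(`Unsupported provider: ${provider}`);
}
```

Your application code only ever sees the normalized form; the translation to each provider's dialect happens at the adapter boundary.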
Sequential Execution (calls: 'one')
Modern models often attempt to perform multiple tool calls in parallel. While efficient, this can lead to "parallel hallucinations" where later calls depend on the output of earlier ones. Use calls: 'one' to force the model to proceed sequentially, turn-by-turn.
```javascript
const chat = llm.chat("gpt-4o");

// Force a specific tool and disable parallel calls for reliability
const response = await chat.ask("What is the temperature in London?", {
  choice: "get_weather",
  calls: "one"
});
```
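Conceptually, sequential execution amounts to a turn loop that runs at most one tool call before handing control back to the model. The sketch below is a hypothetical driver with a stubbed model interface, not NodeLLM's internal loop:

```javascript
// Hypothetical sketch of a `calls: "one"` turn loop.
// `model(history)` resolves to either { toolCalls: [...] } or { text: "..." }.
async function runSequential(model, tools, prompt) {
  const history = [{ role: "user", content: prompt }];
  for (;;) {
    const reply = await model(history);
    if (!reply.toolCalls || reply.toolCalls.length === 0) return reply.text;
    // Execute only the FIRST proposed call; any extras are dropped, so the
    // model must re-propose them on the next turn with fresh context.
    const call = reply.toolCalls[0];
    const result = await tools[call.name](call.args);
    history.push({ role: "tool", name: call.name, content: result });
  }
}
```

Because each tool result lands in the history before the next proposal is made, later calls can no longer "hallucinate" the outputs of earlier ones.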
🛡️ AI Self-Correction for Tool Failures
Building on the Self-Correction middleware introduced in v1.15, version 1.16 hardens the tool execution pipeline.
If a model attempts to call a non-existent tool, NodeLLM now catches the error and returns a descriptive "unavailable tool" response along with the list of valid tools. This allows the model to instantly self-correct its proposal without throwing an application-level exception. Similarly, arguments failing Zod validation are fed back to the model as "Invalid Arguments" results, enabling agents to fix their own mistakes.
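The feedback loop can be sketched as follows. This is a hypothetical simplification: a plain predicate stands in for the Zod schema NodeLLM actually validates against, and the exact wording of the error results may differ:

```javascript
// Hypothetical sketch of the self-correction behaviour. A simple
// `validate` predicate stands in for a real Zod schema.
function executeToolCall(call, tools) {
  const tool = tools[call.name];
  if (!tool) {
    // Unknown tool: return a descriptive result instead of throwing,
    // listing the valid tools so the model can self-correct.
    return {
      error: `Unavailable tool "${call.name}". Valid tools: ${Object.keys(tools).join(", ")}`
    };
  }
  if (!tool.validate(call.args)) {
    // Validation failure: also fed back as a result, not an exception
    return { error: `Invalid Arguments for "${call.name}".` };
  }
  return { result: tool.run(call.args) };
}
```

Either way, the error becomes just another tool result in the conversation, and the agent's next turn can repair the call without your application ever seeing an exception.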
🎙️ Advanced Transcription & Diarization
Our audio support has also received a major upgrade. The Transcription interface now supports Word-level Timestamps and enhanced Diarization (speaker tracking).
- Fine-grained Timestamps: Use timestamp_granularities in OpenAI/Mistral to get precise sub-second timing for every word.
- ORM Parity: The Transcription class now includes .meta and .raw getters, ensuring the persistence layer captures the full provider response.
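To illustrate what word-level timestamps plus diarization enable together, the sketch below folds a timestamped word list into per-speaker segments. The data shape is hypothetical, not NodeLLM's exact response schema:

```javascript
// Hypothetical sketch: group word-level timestamps into speaker segments.
// Each word is assumed to look like { word, start, end, speaker }.
function toSpeakerSegments(words) {
  const segments = [];
  for (const w of words) {
    const last = segments[segments.length - 1];
    if (last && last.speaker === w.speaker) {
      last.text += ` ${w.word}`;
      last.end = w.end; // extend the running segment
    } else {
      segments.push({ speaker: w.speaker, text: w.word, start: w.start, end: w.end });
    }
  }
  return segments;
}
```

With sub-second timing on every word, the segment boundaries fall exactly where the speakers change rather than at coarse sentence edges.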
Getting Started
NodeLLM 1.16.0 is a "Big Release" that brings your AI infrastructure closer to the standard expected of modern production applications.
```shell
npm install @node-llm/core@1.16.0
```
For the complete list of architectural refinements and bug fixes, please see our Commit History and CHANGELOG.