DEV Community

Martin Tuncaydin

Building Travel Copilots with OpenAI's Assistants API: A Practitioner's Guide to Function Calling and Persistent Threads

I've spent the last eighteen months working with travel technology teams who are trying to answer one deceptively simple question: how do we build AI assistants that actually help our agents close bookings faster? Not chatbots that frustrate users with canned responses, but genuine copilots that understand context, retrieve the right information, and execute real actions.

The answer, I've found, lies in OpenAI's Assistants API—a framework that combines function calling, file search, and persistent conversation threads in ways that feel purpose-built for complex B2B workflows like travel agent desks. But this isn't about replacing human expertise. It's about augmenting it with tools that remember context, surface relevant policy documents, and automate the tedious lookups that consume hours of an agent's day.

Why Traditional Chatbots Fail in B2B Travel

I've watched dozens of travel companies deploy chatbots that follow the same pattern: they handle simple FAQs brilliantly, then collapse the moment a customer asks about group bookings with mixed cabin classes, or tries to modify a multi-leg itinerary with different fare rules per segment.

The problem isn't the underlying language model. GPT-4 can absolutely reason through complex travel scenarios. The problem is architecture. Most chatbots are stateless—they forget context between messages, can't access live inventory systems, and have no mechanism to retrieve internal policy documents that govern edge cases.

I remember sitting with a corporate travel team in Frankfurt who showed me their agent desktop. Each booking required checking five different systems: the GDS for availability, a PDF library for corporate travel policies, a CRM for traveller preferences, a separate tool for duty-of-care alerts, and a spreadsheet tracking unused tickets. Their agents were brilliant, but they spent more time switching contexts than actually serving customers.

This is where the Assistants API becomes transformative. It's designed around three capabilities that map perfectly to travel agent workflows: function calling to execute actions in external systems, file search to retrieve policy knowledge, and persistent threads that maintain conversation context across days or even weeks.

Function Calling: Turning Conversations into Actions

The breakthrough with function calling is that it lets language models decide when to stop generating text and start executing code. I describe it to clients as giving the AI a toolkit: instead of just talking about flight options, it can actually query availability, calculate fares, or check seat maps.

In a travel context, this means defining functions like search_flights, check_hotel_availability, retrieve_booking, or calculate_fare_rules. The model reads the conversation, determines which function to call, extracts the right parameters from natural language, and returns structured results.

What makes this powerful is that it's not rule-based. I don't have to anticipate every possible way an agent might phrase a request. If they say "show me morning flights from London to Dubai next Tuesday under £600 in business class", the model maps that to the right function parameters: origin LHR, destination DXB, date, price ceiling, cabin class.
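A function definition for `search_flights` might look like this. The schema fields are illustrative, not tied to any particular GDS; the outer shape is the JSON-schema format the Assistants API expects for function tools:

```python
# Tool definition for a hypothetical search_flights function, in the
# JSON-schema format the Assistants API uses for function tools.
SEARCH_FLIGHTS_TOOL = {
    "type": "function",
    "function": {
        "name": "search_flights",
        "description": (
            "Search for available flights. Use IATA airport codes and "
            "ISO 8601 dates. Ask a clarifying question if a city has "
            "multiple airports and the user has not specified one."
        ),
        "parameters": {
            "type": "object",
            "properties": {
                "origin": {"type": "string", "description": "IATA code, e.g. LHR"},
                "destination": {"type": "string", "description": "IATA code, e.g. DXB"},
                "departure_date": {"type": "string", "description": "YYYY-MM-DD"},
                "cabin_class": {
                    "type": "string",
                    "enum": ["economy", "premium_economy", "business", "first"],
                },
                "max_price": {"type": "number", "description": "Price ceiling in GBP"},
                "time_of_day": {
                    "type": "string",
                    "enum": ["morning", "afternoon", "evening", "any"],
                },
            },
            "required": ["origin", "destination", "departure_date"],
        },
    },
}
```

Given the request above, the model would fill in `origin="LHR"`, `destination="DXB"`, the resolved date, `cabin_class="business"`, `max_price=600`, and `time_of_day="morning"` without any pattern-matching rules on my side.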

I've built systems where a single assistant can orchestrate calls across a dozen different functions—checking availability, applying corporate discounts, validating traveller profiles, even generating PDF itineraries. The model decides the sequence, handles errors, and asks clarifying questions when parameters are ambiguous.

Does this mean you can simply wire the model up to existing APIs and walk away? Not quite. The key is designing functions that return rich context, not just raw data. Instead of returning a JSON blob of flight options, I return formatted text that includes not just price and times, but fare rules, baggage allowances, and refund policies. This lets the model weave that information into natural responses that agents can immediately relay to customers.
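A sketch of that idea, using an invented flight-option shape:

```python
def format_flight_option(opt: dict) -> str:
    """Turn a raw flight option into text the model can relay directly,
    including the fare conditions agents always need to check.
    The dict keys here are illustrative, not a real GDS payload."""
    return (
        f"{opt['carrier']} {opt['flight_no']}: "
        f"{opt['depart']} {opt['origin']} - {opt['arrive']} {opt['destination']}, "
        f"{opt['cabin']}, £{opt['price']:.2f}\n"
        f"  Fare rules: {opt['fare_rules']}\n"
        f"  Baggage: {opt['baggage']}\n"
        f"  Refunds: {opt['refund_policy']}"
    )
```

The model receives this as the tool output and can answer "is that fare refundable?" without another round-trip to the backend.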

File Search: Making Policy Knowledge Instantly Accessible

Every travel organisation has a sprawling library of documents that govern how bookings should be handled: corporate travel policies, supplier agreements, fare rule PDFs, destination guides, duty-of-care protocols. Agents are expected to know all of this, but in reality they spend enormous time searching through folders or asking colleagues.

The Assistants API's file search capability solves this by creating a vector store—essentially a searchable index of all your documentation. You upload files in formats like PDF, Word, or Markdown, and the API automatically chunks them, generates embeddings, and makes them retrievable.
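With the v2 Assistants endpoints in the `openai` Python SDK, the setup can be sketched roughly like this; the store name and file paths are placeholders, and note that newer SDK releases have moved vector stores out of the `beta` namespace:

```python
def build_policy_store(client, name, file_paths):
    """Create a vector store and upload policy documents into it.
    `client` is an openai.OpenAI instance; the API chunks and embeds
    each file automatically, so there is no manual indexing step."""
    store = client.beta.vector_stores.create(name=name)
    streams = [open(p, "rb") for p in file_paths]
    try:
        batch = client.beta.vector_stores.file_batches.upload_and_poll(
            vector_store_id=store.id, files=streams
        )
    finally:
        for s in streams:
            s.close()
    return store.id, batch.status
```

You then attach the store when creating the assistant, via `tools=[{"type": "file_search"}]` and `tool_resources={"file_search": {"vector_store_ids": [store_id]}}`, and retrieval happens automatically on every run.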

What I love about this approach is that it's not keyword search. When an agent asks "what's our policy on last-minute cancellations for executives?", the system retrieves relevant passages based on semantic meaning, not just matching the word "cancellation". It understands synonyms, context, and intent.

I recently worked with a team managing corporate travel for pharmaceutical companies. Their policies were Byzantine: different rules for clinical trial participants versus sales reps, special provisions for emergency medical travel, complex approval hierarchies. We uploaded their entire policy library—about two hundred documents—and suddenly agents could ask questions in plain language and get precise answers with citations.

The citations are crucial. The API doesn't just paraphrase policy; it tells you which document and which page it found the information on. This builds trust. Agents can verify answers and show customers the source material when needed.
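Surfacing those citations in a UI means walking the message annotations. This sketch assumes the v2 message shape, where text content parts carry `file_citation` annotations pointing at the uploaded file:

```python
def extract_citations(message):
    """Pull file citations out of an assistant message's text content.
    Returns (plain_text, [file_ids]) so the UI can link each claim
    back to the source document."""
    citations = []
    texts = []
    for part in message.content:
        if part.type != "text":
            continue
        body = part.text.value
        for ann in part.text.annotations:
            if hasattr(ann, "file_citation"):
                citations.append(ann.file_citation.file_id)
                # Swap the opaque inline marker for a numbered reference.
                body = body.replace(ann.text, f" [{len(citations)}]")
        texts.append(body)
    return "\n".join(texts), citations
```

Resolving each `file_id` back to a document name (via the files endpoint, or a lookup table built at upload time) gives agents the "which document, which page" trail they need.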

I usually combine file search with function calling. An agent might ask about rebooking options for a delayed flight. The assistant searches policy documents to understand rebooking rules, then calls a function to check alternative flight availability, then synthesises both into a coherent recommendation, all in a single exchange.

Persistent Threads: Context That Survives the Shift Change

This might be the most underrated feature for B2B workflows. In most chatbot architectures, each conversation is isolated. If an agent starts helping a customer, then needs to escalate or hand off to a colleague, all that context evaporates.

The Assistants API uses persistent threads—conversation histories that live beyond a single session. Each thread has a unique identifier. You can pause a conversation, come back hours later, and the assistant remembers everything: previous questions, retrieved documents, function calls, even the customer's preferences mentioned in passing.
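In practice I persist the mapping from booking reference to thread ID in my own datastore; this sketch uses a plain dict as a stand-in for that store:

```python
def get_or_create_thread(client, thread_ids: dict, booking_ref: str) -> str:
    """Look up the persistent thread for a booking, creating one on
    first contact. `thread_ids` stands in for a real datastore keyed
    by booking reference; `client` is an openai.OpenAI instance."""
    if booking_ref not in thread_ids:
        thread = client.beta.threads.create(
            metadata={"booking_ref": booking_ref}
        )
        thread_ids[booking_ref] = thread.id
    return thread_ids[booking_ref]
```

Any agent who opens the booking later resolves to the same thread ID, and the API carries the full conversation history forward, so the handoff costs nothing.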

I've seen this transform handoff workflows. An agent in London starts a complex booking for a group travelling to Singapore. They get partway through, then their shift ends. The next agent—maybe in a different time zone—opens the same thread and immediately sees the full context: who the travellers are, what's been discussed, which options were considered and rejected, what policies were consulted.

Threads also enable asynchronous workflows. An agent can ask the assistant to research something complex—"find all available options for getting twelve people from Paris to Tokyo during cherry blossom season, staying within budget, with specific dietary requirements"—then move on to other tasks while the assistant works through multiple function calls and file searches.

I structure threads hierarchically. The main thread tracks the overall booking journey. If an agent needs to deep-dive on a specific question—say, understanding visa requirements for a particular nationality—I create a sub-thread focused on that topic, then merge the insights back into the main conversation.

Designing the Agent Experience

The technology is powerful, but I've learned that success depends on how you design the interface. Agents don't want to type essays to an AI. They want quick answers, suggested actions, and the ability to override when the model gets it wrong.

I typically build a split-screen interface: the conversation thread on one side, and actionable widgets on the other. When the assistant finds flight options, it doesn't just describe them in text—it renders them in a structured table with "Book" buttons. When it retrieves a policy, it shows the relevant excerpt in a panel with a link to the full document.

I also give agents control over function execution. When the model suggests calling a function—say, creating a booking—I show the parameters it extracted and ask for confirmation before executing. This catches errors and builds trust.
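A minimal version of that confirmation gate, with the UI prompt abstracted to a yes/no callable so the same logic works in a console or behind a dialog box:

```python
def confirm_and_execute(tool_name, params, executor, ask):
    """Show the extracted parameters to the agent and only execute on
    explicit confirmation. `ask` is any yes/no prompt callable; in a
    real UI it would render a confirmation dialog with the same summary."""
    summary = "\n".join(f"  {k}: {v}" for k, v in sorted(params.items()))
    if not ask(f"Run {tool_name} with:\n{summary}\nProceed? [y/N] "):
        return {"status": "cancelled", "tool": tool_name}
    return {"status": "done", "tool": tool_name, "result": executor(**params)}
```

For high-stakes calls like `create_booking` the gate is mandatory; for read-only lookups I let the assistant execute freely, which keeps the friction where it earns its keep.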

Another pattern I use is proactive suggestions. The assistant monitors the conversation and surfaces relevant actions: "I noticed this is a last-minute booking—would you like me to check our emergency travel policy?" or "This route often has better fares if we include a connection—should I search those options?"

Real-World Constraints and Trade-Offs

I'd be misleading you if I said this was effortless to implement. There are real challenges I work through on every project.

Latency is the first. Function calling adds round-trips: the model generates a function call, your code executes it, then the model processes the result. For complex queries involving multiple functions, this can take ten or fifteen seconds. I mitigate this with streaming responses—showing partial answers as they're generated—and by designing functions that batch operations where possible.

Cost is another consideration. Each API call consumes tokens for the conversation history, file search results, and function definitions. For high-volume agent desks, this adds up. I've found the sweet spot is using GPT-4 for complex reasoning and decision-making, but offloading simple lookups to direct database queries or cheaper models.

Accuracy requires constant tuning. The model sometimes hallucinates function parameters or retrieves irrelevant documents. I address this with detailed function descriptions, few-shot examples in the system prompt, and validation logic that catches impossible parameters before execution.
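A typical validation pass for the hypothetical `search_flights` parameters might look like this; the errors go back to the model as tool output so it can self-correct rather than fail silently:

```python
import re
from datetime import date

CABINS = {"economy", "premium_economy", "business", "first"}

def validate_flight_search(params: dict, today: date) -> list[str]:
    """Return a list of problems with model-extracted parameters.
    An empty list means the call is safe to execute."""
    errors = []
    for key in ("origin", "destination"):
        code = params.get(key, "")
        if not re.fullmatch(r"[A-Z]{3}", code):
            errors.append(f"{key} must be a 3-letter IATA code, got {code!r}")
    try:
        dep = date.fromisoformat(params.get("departure_date", ""))
        if dep < today:
            errors.append("departure_date is in the past")
    except ValueError:
        errors.append("departure_date must be YYYY-MM-DD")
    cabin = params.get("cabin_class")
    if cabin is not None and cabin not in CABINS:
        errors.append(f"unknown cabin_class {cabin!r}")
    return errors
```

Cheap checks like these catch the majority of hallucinated parameters before they ever reach a GDS, which matters when the downstream call costs money or mutates a booking.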

Integration complexity is the biggest hurdle. Travel systems are notoriously fragmented—GDS platforms, inventory APIs, CRM systems, payment gateways. Each has its own authentication, data format, and quirks. Building robust function adapters that handle errors gracefully is where I spend most of my implementation time.

The Future I'm Building Toward

I believe we're at the beginning of a fundamental shift in how B2B software works. The traditional model—specialised applications with rigid workflows—is giving way to conversational interfaces that adapt to how people actually think and work.

In travel specifically, I'm seeing assistants evolve from tools that answer questions to true copilots that anticipate needs. Imagine an assistant that notices a pattern—this corporate client always books window seats, prefers afternoon flights, has a shellfish allergy—and proactively applies those preferences to new bookings without being asked.

I'm also excited about multi-agent systems: specialised assistants that collaborate. One focuses on flights, another on hotels, a third on ground transportation. They coordinate through a master orchestrator that ensures the full itinerary makes sense as a whole.

My view is that the most successful implementations won't be the ones with the fanciest AI. They'll be the ones that deeply understand agent workflows, integrate cleanly with existing systems, and earn trust through consistent accuracy and transparency. The technology is ready. The question is whether we're ready to rethink how we design software for the people who use it every day.


Tags: openai, travel-tech, ai-agents, assistants-api, conversational-ai
