TL;DR: When you add AI to an app, persist the request, persist the response, and link them with an exchange. That one structure makes every round-trip traceable — and token costs fall out naturally.
My default when adding AI to an app
A pattern I keep coming back to: the three-table pattern — one table for the request, one for the response, and one exchange table that links them and holds the meter.
My default is to persist what I'm sending, persist what comes back, and tie both to a single exchange id so one user↔model round-trip is traceable end to end.
That id is what lets me answer:
- What did we actually send?
- What did we get back?
- What did this turn cost?
Without reconstructing anything from logs or guessing from the UI.
Not a doctrine. Just the shape I keep reaching for.
The three tables
Requests — what went out
Stores the user's input, a reference to the user, any context the request needs (session id, document id, etc.), and a status of pending.
Responses — what came back
Stores the model's structured output. This is where token counts live in some setups, though I sometimes put them on the exchange instead.
Exchanges — the unit that links them
Stores request_id, response_id, input_tokens, output_tokens, model_used, and status. One exchange = one round-trip. This is the row you query when you want cost.
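The three shapes can be sketched as TypeScript row types. Column names follow the post; everything else (the id fields, the exact nullability) is an illustrative assumption, not the actual schema:

```typescript
// Every row carries a status, which is what makes retry and audit possible.
type Status = 'pending' | 'completed' | 'failed';

// Requests — what went out.
type RequestRow = {
  id: string;
  user_id: string;
  content: string; // the user's input
  status: Status;
};

// Responses — what came back.
type ResponseRow = {
  id: string;
  structured: unknown; // the model's structured output
};

// Exchanges — one row per round-trip; the row you query for cost.
type ExchangeRow = {
  id: string;
  request_id: string;
  response_id: string | null; // null until the model answers
  input_tokens: number | null;
  output_tokens: number | null;
  model_used: string | null;
  status: Status;
};

// A freshly opened exchange: linked to its request, waiting on the model.
const freshExchange: ExchangeRow = {
  id: 'e1',
  request_id: 'r1',
  response_id: null,
  input_tokens: null,
  output_tokens: null,
  model_used: null,
  status: 'pending',
};
```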
The sequence
- Insert request (status: pending)
- Insert exchange (request_id set, response_id: null, status: pending)
- Call the model
- Insert response (structured output)
- Update exchange (response_id, token counts, model_used, status: completed)
If the model call fails, the exchange updates to failed. You always have a record of the attempt.
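The whole sequence, including the failure branch, can be sketched as one orchestration function. The dependency names and signatures here are hypothetical stand-ins for the per-domain insert/update helpers in the examples below, not real code from the app:

```typescript
type ModelResult = {
  structured: unknown;
  inputTokens: number;
  outputTokens: number;
  modelUsed: string;
};

// Stand-ins for the per-domain persistence helpers.
type Deps = {
  insertRequest: (content: string) => Promise<{ id: string }>;
  insertExchange: (requestId: string) => Promise<{ id: string }>;
  callModel: (content: string) => Promise<ModelResult>;
  insertResponse: (result: ModelResult) => Promise<{ id: string }>;
  updateExchange: (exchangeId: string, patch: Record<string, unknown>) => Promise<void>;
};

const runExchange = async (deps: Deps, content: string) => {
  const request = await deps.insertRequest(content); // 1. request row, status: pending
  const exchange = await deps.insertExchange(request.id); // 2. exchange row linking it
  try {
    const result = await deps.callModel(content); // 3. the model call
    const response = await deps.insertResponse(result); // 4. response row
    await deps.updateExchange(exchange.id, { // 5. close out the exchange
      response_id: response.id,
      input_tokens: result.inputTokens,
      output_tokens: result.outputTokens,
      model_used: result.modelUsed,
      status: 'completed',
    });
    return { exchangeId: exchange.id, responseId: response.id };
  } catch (err) {
    // The attempt stays on record even when the model call fails.
    await deps.updateExchange(exchange.id, { status: 'failed' });
    throw err;
  }
};
```

The only structural decision here is that the exchange is inserted before the model call, so a crash mid-call still leaves a pending row to find later.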
Example 1 — Inline chat (knowledge-transcript-chat)
This is a chat UI where a user asks questions about a YouTube transcript.
File: insert-knowledge-transcript-chat-request.ts
export const insertKnowledgeTranscriptChatRequest = async (
supabase: SupabaseClient,
input: InsertKnowledgeTranscriptChatRequestInput
): Promise<{ id: string }> => {
const { data, error } = await supabase
.from('knowledge_transcript_chat_requests')
.insert({
user_id: input.userId,
youtube_transcript_id: input.youtubeTranscriptId,
content: input.content,
status: 'pending',
})
.select('id')
.single();
if (error) throw new Error(`Failed to insert chat request: ${error.message}`);
return { id: data.id };
};
File: insert-knowledge-transcript-chat-exchange.ts
export const insertKnowledgeTranscriptChatExchange = async (
supabase: SupabaseClient,
input: InsertKnowledgeTranscriptChatExchangeInput
): Promise<{ id: string }> => {
const { data, error } = await supabase
.from('knowledge_transcript_chat_exchanges')
.insert({
user_id: input.userId,
request_id: input.requestId,
status: 'pending',
})
.select('id')
.single();
if (error) throw new Error(`Failed to insert chat exchange: ${error.message}`);
return { id: data.id };
};
After the model responds, updateKnowledgeTranscriptChatExchange closes it out with input_tokens, output_tokens, model_used, and status: completed.
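The post doesn't show updateKnowledgeTranscriptChatExchange itself, so here is a hedged sketch of what that close-out step might look like. The Db type is a minimal structural stand-in for @supabase/supabase-js's SupabaseClient, covering only the chain this sketch needs:

```typescript
// Minimal stand-in for SupabaseClient: from(...).update(...).eq(...).
type Db = {
  from: (table: string) => {
    update: (values: Record<string, unknown>) => {
      eq: (column: string, value: string) => Promise<{ error: { message: string } | null }>;
    };
  };
};

// Hypothetical shape of the close-out: link the response, record the meter,
// and mark the exchange completed.
export const updateKnowledgeTranscriptChatExchange = async (
  supabase: Db,
  input: {
    exchangeId: string;
    responseId: string;
    inputTokens: number;
    outputTokens: number;
    modelUsed: string;
  }
): Promise<void> => {
  const { error } = await supabase
    .from('knowledge_transcript_chat_exchanges')
    .update({
      response_id: input.responseId,
      input_tokens: input.inputTokens,
      output_tokens: input.outputTokens,
      model_used: input.modelUsed,
      status: 'completed',
    })
    .eq('id', input.exchangeId);
  if (error) throw new Error(`Failed to update chat exchange: ${error.message}`);
};
```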
The client receives responseId, inputTokens, outputTokens, and modelUsed back from the server and appends them to the session slice. The costs table reads directly from those session messages — no separate cost-tracking service needed.
Example 2 — One-off AI job (marketing-blog-ideation)
This is a background job that generates blog ideas for a given topic and ICP context.
File: create-request.ts
export const createMarketingBlogIdeationRequest = async (
supabase: SupabaseClient,
input: CreateMarketingBlogIdeationRequestInput
): Promise<MarketingBlogIdeationRequestRow> => {
const { data, error } = await supabase
.from('marketing_blog_ideation_requests')
.insert({
user_id: input.userId,
topic: input.topic,
context: input.context,
icp_context: input.icpContext,
mode: input.mode,
status: 'pending',
})
.select()
.single();
if (error || !data) throw new Error(`Failed to create request: ${error?.message}`);
return data;
};
File: create-response.ts
export const createMarketingBlogIdeationResponse = async (
supabase: SupabaseClient,
input: CreateMarketingBlogIdeationResponseInput
): Promise<MarketingBlogIdeationResponseRow> => {
const { data, error } = await supabase
.from('marketing_blog_ideation_responses')
.insert({
structured: input.structured,
model_used: input.modelUsed,
input_tokens: input.inputTokens,
output_tokens: input.outputTokens,
total_tokens: input.totalTokens,
})
.select()
.single();
if (error || !data) throw new Error(`Failed to create response: ${error?.message}`);
return data;
};
Same three-table shape, different domain. The request carries topic and ICP context. The response carries structured output and token counts. The exchange links them.
Example 3 — Surfacing it in the client (classification-requests)
On the client side in luckee-web, request rows are normalized into a Redux dump slice by id:
File: classification-requests.ts
export const classificationRequestsSlice = createSlice({
name: 'classificationRequests',
initialState: {} as Record<string, ClassificationRequest>,
reducers: {
setClassificationRequests: (_state, action) => action.payload,
addClassificationRequest: (state, action) => {
state[action.payload.id] = action.payload;
},
updateClassificationRequest: (state, action) => {
state[action.payload.id] = action.payload;
},
deleteClassificationRequest: (state, action) => {
delete state[action.payload];
},
},
});
Requests, responses, and exchanges load into their own slices. The UI reconstructs the round-trip from what it has. The three-table shape stays consistent — it just shows up in Redux instead of raw SQL at this layer.
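A minimal sketch of how a UI layer might join those slices back into round-trip views — the selector and row shapes are illustrative, not luckee-web's actual code:

```typescript
// Only the columns the post mentions; everything else is omitted.
type StoredRequest = { id: string; content: string };
type StoredResponse = { id: string; structured: unknown };
type StoredExchange = {
  id: string;
  request_id: string;
  response_id: string | null;
  status: string;
};

// Walk the exchanges and look up each side by id in its normalized slice.
const selectRoundTrips = (
  requests: Record<string, StoredRequest>,
  responses: Record<string, StoredResponse>,
  exchanges: Record<string, StoredExchange>
) =>
  Object.values(exchanges).map((exchange) => ({
    exchange,
    request: requests[exchange.request_id] ?? null,
    response: exchange.response_id ? (responses[exchange.response_id] ?? null) : null,
  }));
```

Because the exchange row is the spine, a pending exchange simply yields a round-trip with a null response — the UI can render it as "in flight" without any extra state.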
Cost is a side effect of having the rows
Once input_tokens, output_tokens, and model_used are on the exchange or response, the math is one formula:
(inputTokens / 1_000_000) * inputRateUsd
+ (outputTokens / 1_000_000) * outputRateUsd
Sum per exchange for one turn. Sum per thread for a session. Sum per user for a billing period. You don't need a separate cost-tracking service — you just need the rows.
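That formula as a pair of functions — rates are USD per million tokens, and the numbers in the test are made up, not real model pricing:

```typescript
// Cost of one exchange, given per-million-token rates in USD.
const exchangeCostUsd = (
  inputTokens: number,
  outputTokens: number,
  inputRateUsd: number, // USD per 1M input tokens
  outputRateUsd: number // USD per 1M output tokens
): number =>
  (inputTokens / 1_000_000) * inputRateUsd +
  (outputTokens / 1_000_000) * outputRateUsd;

// Summing over any set of exchange rows answers "what did this
// thread/user/billing-period cost?" — it's just a query plus a reduce.
const totalCostUsd = (
  exchanges: { input_tokens: number; output_tokens: number }[],
  inputRateUsd: number,
  outputRateUsd: number
): number =>
  exchanges.reduce(
    (sum, ex) =>
      sum + exchangeCostUsd(ex.input_tokens, ex.output_tokens, inputRateUsd, outputRateUsd),
    0
  );
```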
Key takeaways
- Three tables: requests, responses, exchanges — one per round-trip
- Insert in order: request → exchange → call model → response → update exchange
- Status on every row: pending/completed/failed gives you retry and audit for free
- Token counts on the exchange: makes cost a query, not a guess
- Same shape, different domains: chat UI, background job, classification — the pattern transfers