Small towns deserve big archives. My hometown, Loganton, Pennsylvania, has stories that stretch from Native American trails to Civil War volunteers, Amish settlers, and modern resilience. I wanted to capture that history in a way that was interactive, accurate, and alive.
So I built logantonpa.com. And I did it in a few days.
The Flow: Gemini → VS Code → Copilot → Azure Functions
- Gemini AI Studio: I scaffolded the initial site here, fast prototyping with AI assistance.
- VS Code + Copilot: Once I ejected the project into VS Code, Copilot became my pair programmer, helping me refine code, structure components, and iterate quickly.
- Azure Functions: The backend lives in a set of lightweight serverless functions. They handle chat logging, aggregation, and orchestration.
- Azure Table Storage: Every chat is logged, forming the raw dataset for new content ideas.
This combo gave me speed, flexibility, and scalability without over-engineering.
Content Pipeline
- Aggregate top questions: Every chat log is saved in Table Storage. A function surfaces the most frequently asked questions and flags the ones not yet covered by the site (see the sketch after this list).
- Fact-check with a model: Candidate topics are passed to a fact-checker model with strict instructions: validate against trusted archives and existing source links; reject folklore and bot-site noise.
- Generate new content with Gemini in Antigravity: Once validated, prompts are fed into Gemini running in Antigravity to draft timeline entries or new pages. This keeps the site expanding in response to real user curiosity.
- Human review + deployment: Automation gets us 90% of the way there, but final publishing is still a human task. I review, polish, and deploy, ensuring accuracy, tone, and fit with the LogantonPA.com mission.
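To make that first step concrete, here's a minimal sketch of what the aggregation function can look like, assuming the flat log schema from the snippets later in this post. The table name `chatlogs`, the `coveredTopics` list, and the `normalizeQuestion` helper are illustrative, not the production code.

```js
const { TableClient } = require("@azure/data-tables");

// Minimal sketch: tally logged questions and flag ones the site doesn't cover.
// Assumes each row stores the user's question in `userMessage` (JSON-encoded),
// as in the flat-logging snippet below.
async function aggregateTopQuestions(connectionString, coveredTopics) {
  const table = TableClient.fromConnectionString(connectionString, "chatlogs");
  const counts = new Map();

  // Scan all logged interactions and count normalized questions.
  for await (const entity of table.listEntities()) {
    const question = normalizeQuestion(JSON.parse(entity.userMessage));
    counts.set(question, (counts.get(question) || 0) + 1);
  }

  // Sort by frequency and mark coverage gaps for the content pipeline.
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([question, count]) => ({
      question,
      count,
      covered: coveredTopics.some(topic => question.includes(topic))
    }));
}

// Lowercase and strip punctuation so near-duplicate questions group together.
function normalizeQuestion(text) {
  return String(text).toLowerCase().replace(/[^\w\s]/g, "").trim();
}
```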
Site Sections
The site is structured around three interactive sections:
- Timeline
  - A chronological view of Loganton’s history.
  - Each entry is fact-checked and sourced.
  - Built to be expandable as new verified content comes in.
- Source Links
  - Every fact is grounded with a source.
  - Links to archives, books, and reliable references are prioritized.
  - This is the backbone of trust: no floating claims, everything tied to evidence.
- Chat
  - Visitors can ask questions.
  - Chats are logged, aggregated, and analyzed.
  - Top questions feed the content pipeline, ensuring the site grows in response to curiosity.
Turning Chats Into Content
Here’s the logical workflow:
- Log chats → stored in Table Storage.
- Aggregate questions → function surfaces top queries.
- Check coverage → does the site already answer this?
- Fact-check model → new topics are validated against trusted sources.
- Gemini in Antigravity → generates draft content.
- Human review → I polish and deploy.
This makes the site both community-driven and AI-driven. The more people ask, the more the archive grows.
Code Snippets: Proxy + Flat Logging
The backend isn’t just logging; it’s a proxy to my Gemini model. Every chat request goes through my Azure Function, gets sent to Gemini, and the response comes back through the same function.
Instead of splitting logs into separate request/response rows, I use flat logging: one row per interaction, with both the question and the answer stored together. This makes aggregation and analysis much simpler later.
Proxying to Gemini
// Handler implementations.
// Assumes the usual client setup elsewhere in the file, e.g.:
//   const { GoogleGenerativeAI } = require("@google/generative-ai");
//   const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);
// corsHeaders(), getTableClient(), and safeLogModelError() are helpers defined elsewhere.
async function handleChat(context, req, origin) {
  const { sessionId, messages, extra } = req.body || {};
  if (!sessionId || !Array.isArray(messages) || messages.length === 0) {
    context.res = { status: 400, headers: corsHeaders(origin), body: { error: "Invalid payload. Provide sessionId and messages[]." } };
    return;
  }

  // The API expects history to not include the current user message.
  // The last message in the array is the new one from the user.
  const userMessage = messages[messages.length - 1].parts?.map(p => p.text).join('\n') ?? '';
  const history = messages.slice(0, -1);

  // System instruction to define the AI's persona and rules.
  const systemInstruction = `You are the **Town Historian** of **Sugar Valley (Loganton, PA)**. Your persona is professional, factual, and deeply knowledgeable. Your knowledge base is exclusively the history of Loganton, its books, archives, farms, the railroad, and local events.

**Mandatory Rules:**
1. **Strictly Factual:** Answers must be rooted in fact; **never** invent details or speculate.
2. **Formatting:** Use Markdown for clarity: **bold** key historical terms (e.g., family names, specific dates), and use bulleted lists (\`*\`) or numbered lists for sequential events, timelines, or distinct points.
3. **Source Citation:** If any external information is used via Google Search, cite the source **immediately** following the relevant information using a single bracketed link \`Source\`.
4. **Final Output Structure:** The response **must** conclude with the historical content, followed by a hard line break, and then the strictly formatted follow-up questions.

**Strictly Final Output Format (DO NOT deviate):**
[...your factual, formatted historical answer...]
SUGGESTIONS:|Question 1 related to the answer|Question 2 on a related topic|Question 3 about a broader historical theme

Example of final line: SUGGESTIONS:|What was the impact of the railroad on local farms?|Tell me about the original founding families.|How did the valley get the name Sugar Valley?`;

  // Enable Google Search grounding for this request.
  const tools = [{ googleSearch: {} }];
  const geminiModel = extra?.model || "gemini-2.5-flash";

  try {
    const chat = genAI.getGenerativeModel({
      model: geminiModel,
      systemInstruction: systemInstruction,
      tools: tools.length > 0 ? tools : undefined
    }).startChat({ history }); // startChat expects an options object with a history property

    // Send the new user message and wait for the response.
    const result = await chat.sendMessage(userMessage);
    const response = result.response;
    const text = response.text();

    // Log the conversation after receiving the response.
    await logConversation(sessionId, userMessage, text, response, context);

    context.res = {
      status: 200,
      headers: corsHeaders(origin),
      body: {
        sessionId, text, candidates: response.candidates, model: geminiModel
      }
    };
  } catch (err) {
    context.log.error(err, "Gemini request failed");
    await safeLogModelError(sessionId, geminiModel, { error: err.message }, context);
    context.res = { status: 502, headers: corsHeaders(origin), body: { error: "Gemini request failed", message: err.message } };
  }
}

/**
 * Logging function to store a completed conversation turn as a single flat row.
 */
async function logConversation(sessionId, userMessage, modelResponse, response, context) {
  try {
    const table = await getTableClient();

    // One flat record per interaction: question and answer stored together.
    const requestAt = new Date().toISOString();
    const partitionKey = requestAt.slice(0, 10); // YYYY-MM-DD
    const rowKey = makeRowKey(sessionId);
    const entity = {
      partitionKey: partitionKey,
      rowKey: rowKey,
      userMessage: JSON.stringify(userMessage),
      modelResponse: modelResponse,
      rawResponse: JSON.stringify(response)
    };
    await table.createEntity(entity);
  } catch (e) {
    context.log.warn("Failed to log conversation turn", e);
  }
}

/**
 * Make a row key that includes the sessionId for easy prefix queries.
 * Format: {sessionId}_{ISOtimestamp}_{random4}
 * Using sessionId as the prefix allows querying by rowKey prefix across partitions if needed.
 */
function makeRowKey(sessionId) {
  const safeSession = (sessionId || "UNKNOWN")
    .toString()
    .replace(/[^a-zA-Z0-9-_]/g, '_')
    .slice(0, 64);
  const iso = new Date().toISOString();
  const rand = Math.random().toString(36).slice(2, 6);
  return `${safeSession}_${iso}_${rand}`;
}
This snippet shows the flow:
- Proxying the chat request to Gemini. The system instructions ask for a SUGGESTIONS line in the final output, which gets us a list of follow-up questions for the user in a single call (see the parsing sketch after this list).
- Capturing the response data. We grab the text and candidates from the response; the candidates carry the grounding links used in the answer.
- Logging the question and answer together under one row key.
- Returning the answer text, with its suggested-questions suffix, and the grounding sources to the user.
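Since the SUGGESTIONS line comes back embedded in the answer text, the frontend has to split it out. Here's a minimal sketch of that parsing step; `parseReply` is an illustrative helper, not the site's exact code.

```js
// Split the model's reply into the answer body and the follow-up suggestions.
// Expects the format from the system instruction: "...answer...\nSUGGESTIONS:|Q1|Q2|Q3"
function parseReply(text) {
  const marker = "SUGGESTIONS:";
  const idx = text.lastIndexOf(marker);
  if (idx === -1) return { answer: text.trim(), suggestions: [] };

  const answer = text.slice(0, idx).trim();
  const suggestions = text
    .slice(idx + marker.length)
    .split("|")
    .map(s => s.trim())
    .filter(Boolean); // drops the empty entry before the first "|"
  return { answer, suggestions };
}
```

Anchoring on the last occurrence of the marker guards against the model mentioning the word earlier in its answer.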
This structure makes it easy to later aggregate top questions and detect gaps in site coverage.
Grounding: Why Sources Matter
Grounding is the difference between folklore and fact. Every new content idea is checked against existing source links.
- If a chat question matches a timeline entry → no duplication.
- If it’s new → fact-checker model validates against trusted archives.
- If sources don’t exist → flagged for human review of new sources.
This ensures the site grows responsibly, without drifting into misinformation.
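Here's a sketch of what that validation call can look like, reusing the same genAI client as the proxy above. The verdict schema and prompt wording here are my illustration, not the exact production prompt.

```js
// Sketch of the fact-check step; verdict schema and prompt are illustrative.
async function factCheckTopic(topic, existingSources) {
  const model = genAI.getGenerativeModel({
    model: "gemini-2.5-flash",
    systemInstruction:
      "You are a strict fact checker for Loganton, PA history. " +
      "Validate the topic ONLY against the provided trusted sources. " +
      "Reject folklore, speculation, and uncited claims. " +
      'Reply with JSON only: {"verdict":"accept|reject|needs_sources","reason":"..."}'
  });

  const result = await model.generateContent(
    `Topic: ${topic}\nTrusted sources:\n${existingSources.join("\n")}`
  );

  // Treat anything that isn't clean JSON as a rejection; err on the side of caution.
  try {
    return JSON.parse(result.response.text());
  } catch {
    return { verdict: "reject", reason: "Unparseable fact-check response" };
  }
}
```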
Handling Bad Sites
Not all search results are equal. In fact, many are bot sites or risky links that scrape content without context.
- Bot sites: Auto-generated, keyword-stuffed, no citations.
- Risky links: Malware, clickbait, or content farms.
- Mismatch: Search results that don’t align with current source links.
The pipeline is designed to reject anything that doesn’t ground in trusted sources. Accuracy > volume. But the web is an evolving landfill of information: good sites go bad, and new sites appear all the time.
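A cheap first line of defense, before the fact-check model ever runs, is a plain allowlist on grounding domains. A minimal sketch, with placeholder domains rather than the site's actual list:

```js
// Placeholder allowlist; the real list would name the archives the site trusts.
const TRUSTED_DOMAINS = ["loc.gov", "archive.org", "pa.gov"];

// Keep only grounding links whose host is a trusted domain or a subdomain of one.
function filterGroundingLinks(urls) {
  return urls.filter(url => {
    try {
      const host = new URL(url).hostname;
      return TRUSTED_DOMAINS.some(d => host === d || host.endsWith("." + d));
    } catch {
      return false; // malformed URLs are rejected outright
    }
  });
}
```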
Conclusion and Broader Impact
This project shows how AI-assisted coding and serverless functions can rapidly stand up a living, dynamic archive, preserving local history in an interactive and accurate way. While automation streamlines the process, the stories and heritage of Loganton remain its human core.