Building an Agentic Blog Generator With Pieces OS (Flutter)

This project is a Flutter app that generates a technical blog (in Markdown) from real, recent context. The core idea is simple: use Pieces OS as the source of truth for what you’ve been working on (workstream summaries + your own annotations/persona signals), then use an LLM to turn that into a structured, high-quality blog—optionally in agentic mode with MCP tools.


Features

  • Auto-generate title: suggests blog titles from recent Pieces workstream summaries.

  • Detect persona: pulls recent Pieces user annotations and converts them into “persona signals” for voice/tone.

  • Detect style: analyzes a sample blog to infer a reusable style object for consistent output.

  • Generate a multi-part blog: plans part titles + per-part outlines, then generates Markdown one part at a time.

  • Agentic flow: connects to the MCP endpoint and uses tools for retrieval/verification (RAG) instead of guessing.

What Pieces OS gives us (and why it matters)

When you ask an LLM to “write a blog about my project”, you usually get one of two outcomes:

  • A generic post that sounds plausible but doesn’t match what you actually did.
  • A post that misses the details that made the work interesting (trade-offs, decisions, workflow).

Pieces OS helps fix this by providing grounded context:

  • Workstream Summaries (LTM): a stream of recent summaries of your work so we can generate content from what actually happened recently.
  • Annotations (persona + preferences + voice cues): your own notes/annotations can be pulled and turned into “persona signals” so the blog tone and framing match you.
  • MCP (Model Context Protocol) tools: in agentic mode, the generator can call MCP tools (memory/search/context) instead of guessing—so it can fetch or verify information as it writes. Practically, connecting via the MCP endpoint gives us RAG (retrieval‑augmented generation): the model retrieves relevant context from Pieces and then writes with that grounded input.

In this app, Pieces is not a “nice-to-have integration”—it’s the backbone for relevance and personalization.


How this app uses Pieces OS

At a high level, the wizard does three Pieces-backed things:

  1. Connect to Pieces OS
    • Establishes app connection with Pieces OS (so API calls work).
  2. Pull recent context
    • Pulls recent workstream summary IDs over a short-lived WebSocket.
    • Fetches each summary snapshot.
    • Extracts the human-readable “DESCRIPTION” annotation text from each summary.
  3. Pull persona signals
    • Fetches recent user annotations via UserApi.userGetAnnotations().
    • Normalizes/truncates them into a prompt-friendly “persona signals” block.

Then, when we run in agentic mode, we also:

  • Connect to the Pieces MCP endpoint
  • List the available tools
  • Allow the LLM to call those tools while planning/writing

This combination (summaries + persona + MCP tools) gives the generator a much better shot at producing content that matches reality.
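
As a quick sketch, those steps map directly onto the service methods covered later in this post (the MCP wiring itself is shown in Step 5):

// Sketch: the three Pieces-backed steps, composed from the service below.
Future<void> gatherGroundedContext() async {
  final pieces = PiecesOSService();

  // 1. Connect to Pieces OS (initialize() connects once per session).
  await pieces.initialize();

  // 2. Pull recent context: DESCRIPTION texts from the latest summaries.
  final recentContext = await pieces.getLastSummaryContents(limit: 10);

  // 3. Pull persona signals from recent user annotations.
  final persona = await pieces.getPersonaFromUserAnnotations(limit: 1);

  print('Got ${recentContext.length} summaries; persona: $persona');
}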


Agentic mode: how we got better output (one-shot → plan-first → structured parts)

We saw a clear quality jump as we changed the prompting approach.

Attempt 1: one prompt to generate the whole blog

The naive approach is: “Here’s some context—write the entire blog.”

In practice, that tends to produce:

  • weak structure (rambling, repetitive sections)
  • missing coverage (important modules not mentioned)
  • brittle accuracy (model fills gaps by guessing)

Attempt 2: plan first, then generate

Next improvement: separate planning from writing.

Instead of writing immediately, we made the app generate a plan first (what parts, what each part is about). That makes the writing phase far more constrained and coherent:

  • the model knows the “shape” of the blog before it writes
  • you can review/edit titles before any heavy generation happens

Attempt 3: even more planning, broken into titled parts

The biggest jump came from adding more structure:

  • generate part titles first (so each part has a clear purpose)
  • generate an outline for each part (so headings and subheadings are defined)
  • then write one part at a time, using the outline as a contract

That does two things:

  • Focus: each part stays on-topic (because it has a title + outline).
  • Better coverage: parts are explicitly scoped, so important areas are less likely to be skipped.

This is exactly why titled parts matter: the model is no longer inventing structure as it writes—it’s executing a plan you’ve already approved.
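
To make that three-phase workflow concrete, here's a minimal sketch; callLlm is a hypothetical stand-in for whatever chat-completion client you use, and the prompts are only illustrative:

// Hypothetical LLM call; swap in your actual client.
Future<String> callLlm(String prompt) async => throw UnimplementedError();

Future<List<String>> generateBlogParts(String context) async {
  // Phase 1: plan part titles so each part has a clear purpose.
  final titles = (await callLlm(
    'From this context, propose 4-6 blog part titles, one per line:\n$context',
  )).split('\n').where((t) => t.trim().isNotEmpty).toList();

  final parts = <String>[];
  for (final title in titles) {
    // Phase 2: outline the part before writing it.
    final outline = await callLlm(
      'Outline headings/subheadings for the part "$title".\nContext:\n$context',
    );

    // Phase 3: write one part at a time, with the outline as a contract.
    parts.add(await callLlm(
      'Write Markdown for "$title", following this outline exactly:\n$outline',
    ));
  }
  return parts;
}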


Tutorial: integrating Pieces OS in a Flutter app

This section breaks down how we integrated Pieces OS in a way that works well for a UI-driven Flutter app. The goal is to make the model’s output grounded (Pieces workstream summaries + persona signals) and optionally agentic (MCP endpoint → RAG).

Step 1: configure the Pieces endpoints

In PiecesOSService, we keep the local Pieces defaults in one place:

  • baseUrl: REST API base (http://localhost:39300)
  • websocketUrl: WebSocket base (ws://localhost:39300)
  • defaultMcpEndpoint: MCP streamable HTTP endpoint (used for tool-based RAG)

Here’s the exact code that defines those endpoints (and the imports used by the service):

import 'dart:async';
import 'dart:developer' as dev;
import 'dart:convert';
import 'dart:io';
import 'package:mcp_dart/mcp_dart.dart' as mcp;
import 'package:pieces_os_client/api.dart';

import '../models/persona_signals.dart';

/// Service to interact with Pieces OS for LTM (Long Term Memory)
class PiecesOSService {
  // Pieces OS configuration
  static const String baseUrl = 'http://localhost:39300';
  static const String websocketUrl = 'ws://localhost:39300';
  static const String defaultMcpEndpoint =
      'http://localhost:39300/model_context_protocol/2025-03-26/mcp';

Step 2: connect the app to Pieces OS (REST)

The connectApplication() method registers/connects this app with Pieces via ConnectorApi.connect(...) and stores the returned Context.

The initialize() method is a small guard that:

  • ensures we only connect once per session (_isInitialized)
  • makes all later calls safe to run “just-in-time” from the UI

Here are the service fields + constructor + REST connection code:

  // API clients
  late final ApiClient client;
  late final ConnectorApi _connectorApi;
  late final WorkstreamSummaryApi _workstreamSummaryApi;
  late final AnnotationApi _annotationApi;
  late final UserApi _userApi;

  // Application context
  Context? _context;

  // Application info
  final ApplicationNameEnum appName = ApplicationNameEnum.OPEN_SOURCE;
  final String appVersion = "0.0.1";
  final PlatformEnum platform = Platform.operatingSystem == "windows"
      ? PlatformEnum.WINDOWS
      : Platform.operatingSystem == "macos"
      ? PlatformEnum.MACOS
      : PlatformEnum.LINUX;

  // Track if service is initialized
  bool _isInitialized = false;

  /// Placeholder persona string used in
  /// generation prompts later.
  String persona = '';

  PiecesOSService() {
    client = ApiClient(basePath: baseUrl);
    _connectorApi = ConnectorApi(client);
    _workstreamSummaryApi = WorkstreamSummaryApi(client);
    _annotationApi = AnnotationApi(client);
    _userApi = UserApi(client);
  }

  /// Register/Connect the application to Pieces OS
  Future<Application> connectApplication() async {
    if (_context?.application != null) {
      return _context!.application;
    }

    try {
      final seededApp = SeededTrackedApplication(
        name: appName,
        platform: platform,
        version: appVersion,
      );

      final connection = SeededConnectorConnection(application: seededApp);

      _context = await _connectorApi.connect(
        seededConnectorConnection: connection,
      );

      if (_context?.application == null) {
        throw Exception(
          'Failed to connect to Pieces OS: No application returned',
        );
      }

      dev.log(
        'Successfully connected to Pieces OS: ${_context!.application.name}',
        name: 'PiecesOSService',
      );
      return _context!.application;
    } catch (e, st) {
      dev.log(
        'Error connecting to Pieces OS',
        name: 'PiecesOSService',
        error: e,
        stackTrace: st,
      );
      rethrow;
    }
  }

  /// Initialize the service - connects to Pieces OS.
  ///
  /// NOTE: We intentionally do **not** start any WebSocket listeners. This app
  /// fetches workstream summary identifiers on-demand.
  Future<void> initialize() async {
    if (_isInitialized) {
      dev.log('Service already initialized', name: 'PiecesOSService');
      return;
    }

    try {
      // Connect to Pieces OS
      await connectApplication();

      _isInitialized = true;
      dev.log('Initialized successfully', name: 'PiecesOSService');
    } catch (e, st) {
      dev.log(
        'Error initializing',
        name: 'PiecesOSService',
        error: e,
        stackTrace: st,
      );
      rethrow;
    }
  }

Step 3: retrieve recent work context (workstream summaries)

To ground generation in “what I actually worked on”, the service does this on-demand:

  • fetchLatestWorkstreamSummaryIds() opens a WebSocket to the identifiers stream, reads one payload, then closes the socket.
  • getLastSummaryContents() uses those IDs to fetch summary snapshots and extracts the DESCRIPTION annotation text via getSummaryContent().

Those DESCRIPTION strings are what we feed into the LLM as “recent context”.

Here’s the exact code for streaming IDs, fetching summaries, and extracting DESCRIPTION text:

  /// Fetch the most recent workstream summary identifiers on-demand.
  ///
  /// This opens a WebSocket connection **once**, reads the first identifiers
  /// payload, then closes the socket. No continuous listener / no caching.
  Future<List<String>> fetchLatestWorkstreamSummaryIds({
    int limit = 10,
    Duration timeout = const Duration(seconds: 8),
  }) async {
    if (limit <= 0) return const [];
    await initialize();

    final wsUrl = '$websocketUrl/workstream_summaries/stream/identifiers';
    WebSocket? socket;
    try {
      socket = await WebSocket.connect(wsUrl).timeout(timeout);
      final first = await socket.first.timeout(timeout);

      String raw;
      if (first is String) {
        raw = first;
      } else if (first is List<int>) {
        raw = utf8.decode(first);
      } else {
        raw = first.toString();
      }

      final decoded = jsonDecode(raw);
      final streamed = StreamedIdentifiers.fromJson(decoded);
      final ids = (streamed?.iterable ?? const [])
          .map((e) => e.workstreamSummary?.id)
          .whereType<String>()
          .where((s) => s.trim().isNotEmpty)
          .take(limit)
          .toList(growable: false);
      return ids;
    } finally {
      try {
        await socket?.close();
      } catch (_) {
        // ignore
      }
    }
  }

  /// Get the summary content from a workstream summary's annotations
  Future<String?> getSummaryContent(WorkstreamSummary summary) async {
    try {
      // Loop through annotations to find the DESCRIPTION type
      for (final annotationRef
          in summary.annotations?.indices.keys.toList() ?? []) {
        // Fetch the full annotation using AnnotationApi (singular)
        final annotation = await _annotationApi
            .annotationSpecificAnnotationSnapshot(annotationRef);
        if (annotation == null) {
          continue;
        }

        // Check if this is a DESCRIPTION type annotation
        if (annotation.type == AnnotationTypeEnum.DESCRIPTION) {
          // Return the text content
          return annotation.text;
        }
      }

      return null;
    } catch (e) {
      dev.log(
        'Error fetching annotation content for ${summary.id}: $e',
        name: 'PiecesOSService',
      );
      return null;
    }
  }

  /// Get the last [limit] workstream summary DESCRIPTION texts (most recent first).
  ///
  /// This performs an on-demand identifiers fetch, then retrieves each summary
  /// snapshot and its DESCRIPTION annotation text.
  Future<List<String>> getLastSummaryContents({int limit = 10}) async {
    if (limit <= 0) return const [];
    await initialize();

    final ids = await fetchLatestWorkstreamSummaryIds(limit: limit);
    if (ids.isEmpty) return const [];

    final summaries = await Future.wait(
      ids.map(
        (id) => _workstreamSummaryApi
            .workstreamSummariesSpecificWorkstreamSummarySnapshot(id),
      ),
    );

    final contents = await Future.wait(
      summaries.whereType<WorkstreamSummary>().map(
        (s) async => (await getSummaryContent(s))?.trim(),
      ),
    );

    return contents.whereType<String>().where((t) => t.isNotEmpty).toList();
  }
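
Those texts then need to be packaged for the prompt. A minimal, hypothetical helper (not part of the service) might look like this:

/// Hypothetical helper: joins summary DESCRIPTION texts into one
/// "recent context" block for the generation prompt.
String buildRecentContextBlock(List<String> summaryTexts) {
  final buffer = StringBuffer('RECENT WORK CONTEXT:\n');
  for (var i = 0; i < summaryTexts.length; i++) {
    buffer.writeln('--- Summary ${i + 1} ---');
    buffer.writeln(summaryTexts[i]);
  }
  return buffer.toString();
}

// Usage:
// final texts = await pieces.getLastSummaryContents(limit: 10);
// final contextBlock = buildRecentContextBlock(texts);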

Step 4: retrieve persona signals (user annotations)

To personalize voice and framing, getPersonaFromUserAnnotations():

  • resolves the active user (_resolveUserId() → UserApi.userSnapshot())
  • fetches recent annotations (UserApi.userGetAnnotations(...))
  • normalizes them into a prompt-friendly PersonaAnnotations object

If Pieces returns no annotations, we keep persona optional (so the model doesn’t invent one).

Here’s the code that resolves the active user and turns annotations into prompt-friendly “persona signals”:

  Future<String> _resolveUserId() async {
    await initialize();

    final snap = await _userApi.userSnapshot();
    final userId = snap?.user?.id;
    if (userId != null && userId.trim().isNotEmpty) return userId.trim();

    throw StateError(
      'No active user found. Ensure Pieces has an active user session.',
    );
  }

  /// Get a "persona" text derived from the user's annotations.
  ///
  /// Uses `UserApi.userGetAnnotations()` which returns the resolved active Person
  /// and filtered Annotations.
  Future<PersonaAnnotations> getPersonaFromUserAnnotations({
    int limit = 1,
  }) async {
    if (limit <= 0) return const PersonaAnnotations();
    final userId = await _resolveUserId();
    final out = await _userApi.userGetAnnotations(
      userId,
      UserAnnotationsInput(limit: limit),
    );

    final texts = out.annotations.iterable
        .map((a) => a.text)
        .whereType<String>()
        .map((t) => t.trim())
        .where((t) => t.isNotEmpty)
        .toList();
    dev.log(
      'Fetched ${texts.length} annotation texts for persona.',
      name: 'PiecesOSService',
    );

    // Pieces might return an empty annotations list; in that case we return empty
    // and let the caller omit persona entirely.
    if (texts.isEmpty) return const PersonaAnnotations();

    final normalized = <String>[];
    for (final t in texts) {
      final oneLine = t.replaceAll(RegExp(r'\s+'), ' ').trim();
      if (oneLine.isEmpty) continue;
      normalized.add(
        oneLine.length > 180 ? '${oneLine.substring(0, 180)}…' : oneLine,
      );
    }

    return PersonaAnnotations(annotations: normalized);
  }
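
The normalized strings can then be folded into the prompt the same way. Here's a hypothetical formatter (taking the list of strings the PersonaAnnotations object carries):

/// Hypothetical helper: formats persona signals for the prompt.
/// Returns an empty string when there are no signals, so callers can
/// omit persona entirely instead of letting the model invent one.
String buildPersonaBlock(List<String> signals) {
  if (signals.isEmpty) return '';
  return [
    'PERSONA SIGNALS (voice/tone cues):',
    ...signals.map((s) => '- $s'),
  ].join('\n');
}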

Step 5: enable RAG with MCP (agentic mode)

When we connect via the MCP endpoint, the generator can do RAG (retrieval‑augmented generation):

  • connectMcp() connects a streamable HTTP transport (POST + SSE GET)
  • listTools() fetches and caches the available MCP tools
  • callTool() lets the agent retrieve/verify relevant context from Pieces during planning/writing

This is the “agentic” part: instead of guessing, the model can retrieve what it needs.

Here’s the MCP client code that enables tool calls (MCP endpoint → RAG):

  // MCP (Streamable HTTP/SSE) client
  mcp.McpClient? _mcpClient;
  mcp.StreamableHttpClientTransport? _mcpTransport;
  Uri? _mcpEndpoint;
  List<mcp.Tool>? _cachedMcpTools;

  bool get isMcpConnected =>
      _mcpClient != null && _mcpTransport != null;

  /// Connect to Pieces MCP endpoint using Streamable HTTP (POST + SSE GET).
  ///
  /// Default endpoint is [defaultMcpEndpoint]
  /// (`http://localhost:39300/model_context_protocol/2025-03-26/mcp`).
  Future<void> connectMcp({String? endpoint}) async {
    final ep = Uri.parse((endpoint ?? defaultMcpEndpoint).trim());

    // If already connected to same endpoint, do nothing.
    if (isMcpConnected && _mcpEndpoint == ep) return;

    // Close any existing transport/client.
    await disconnectMcp();
    _mcpEndpoint = ep;

    final transport = mcp.StreamableHttpClientTransport(ep);
    final client = mcp.McpClient(
      const mcp.Implementation(name: 'blog_generator', version: '0.0.1'),
      options: const mcp.McpClientOptions(
        // Client-side capabilities are for server-initiated requests
        // (sampling, elicitation, tasks, roots). We don't need any for now.
        capabilities: mcp.ClientCapabilities(),
      ),
    );

    transport.onerror = (err) {
      dev.log(
        'MCP transport error: $err',
        name: 'PiecesOSService',
        error: err,
      );
    };
    transport.onclose = () {
      dev.log('MCP transport closed', name: 'PiecesOSService');
      _cachedMcpTools = null;
      _mcpClient = null;
      _mcpTransport = null;
    };

    try {
      await client.connect(transport);
      _mcpTransport = transport;
      _mcpClient = client;
      final server = client.getServerVersion();
      dev.log(
        'MCP connected: ${server?.name ?? 'unknown'} ${server?.version ?? ''}',
        name: 'PiecesOSService',
      );
    } catch (e) {
      await disconnectMcp();
      rethrow;
    }
  }

  Future<void> disconnectMcp() async {
    _cachedMcpTools = null;
    try {
      await _mcpTransport?.terminateSession();
    } catch (_) {
      // Ignore - server may not support.
    }

    try {
      await _mcpClient?.close();
    } catch (_) {
      // Ignore.
    }
    try {
      await _mcpTransport?.close();
    } catch (_) {
      // Ignore.
    }

    _mcpClient = null;
    _mcpTransport = null;
  }

  Future<List<mcp.Tool>> listTools({bool forceRefresh = false}) async {
    final client = _mcpClient;
    if (client == null) {
      throw StateError('MCP client not connected');
    }

    if (!forceRefresh && _cachedMcpTools != null) return _cachedMcpTools!;

    final res = await client.listTools();
    _cachedMcpTools = res.tools;
    return res.tools;
  }

  Future<mcp.CallToolResult> callTool({
    required String name,
    Map<String, dynamic> arguments = const {},
  }) async {
    final client = _mcpClient;
    if (client == null) {
      throw StateError('MCP client not connected');
    }

    return client.callTool(
      mcp.CallToolRequest(name: name, arguments: arguments),
    );
  }
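
As a rough usage sketch (tool names and schemas come from whatever listTools() returns; nothing below hard-codes a specific Pieces tool):

Future<void> exploreMcpTools() async {
  final pieces = PiecesOSService();
  await pieces.connectMcp(); // uses defaultMcpEndpoint

  // Discover what the server offers.
  final tools = await pieces.listTools();
  for (final tool in tools) {
    print('MCP tool: ${tool.name}');
  }

  // Illustrative only: invoke the first advertised tool with no arguments.
  if (tools.isNotEmpty) {
    final result = await pieces.callTool(name: tools.first.name);
    print('Tool result: $result');
  }
}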

Step 6: clean up

dispose() closes the MCP session (best-effort) and clears state so the service doesn’t leak resources across UI lifecycles.

Here’s the cleanup method from the service (it also closes the class):

  /// Disconnect from Pieces OS and cleanup resources
  void dispose() {
    _isInitialized = false;

    // Close MCP client/transport
    // Fire and forget; dispose is sync.
    unawaited(disconnectMcp());

    // Clear context
    _context = null;

    dev.log('Disposed', name: 'PiecesOSService');
  }
}

How the UI uses this service (quick walkthrough)

In BlogWizardScreen, the flow looks like this (a rough code sketch follows the lists):

  • Get grounded context
    • _pieces.getLastSummaryContents(limit: 10) for recent workstream summaries
    • _pieces.getPersonaFromUserAnnotations(limit: 1) for persona signals (optional)
  • Turn on agentic RAG
    • _pieces.connectMcp(...) then _pieces.listTools()
    • pass callMcpTool: (...) => _pieces.callTool(...) into the agent loop

From there, the generator improves quality by moving from one-shot output to a structured workflow:

  • plan part titles first
  • generate outlines per part (reviewable)
  • write each part with the outline as a contract
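
Here's that sketch of the wiring; runAgenticGeneration is an illustrative stand-in for the app's actual agent loop, not its real API:

// Illustrative stand-in for the app's agent loop (signature is assumed).
Future<String> runAgenticGeneration({
  required List<String> summaries,
  required PersonaAnnotations persona,
  required List<mcp.Tool> tools,
  required Future<mcp.CallToolResult> Function(
    String name,
    Map<String, dynamic> args,
  ) callMcpTool,
}) async => throw UnimplementedError();

Future<void> runWizardFlow() async {
  final pieces = PiecesOSService();

  // Grounded inputs for the generator.
  final summaries = await pieces.getLastSummaryContents(limit: 10);
  final persona = await pieces.getPersonaFromUserAnnotations(limit: 1);

  // Agentic mode: let the model call Pieces MCP tools while it writes.
  await pieces.connectMcp();
  final tools = await pieces.listTools();

  final blog = await runAgenticGeneration(
    summaries: summaries,
    persona: persona,
    tools: tools,
    callMcpTool: (name, args) => pieces.callTool(name: name, arguments: args),
  );
  print(blog);
}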

Takeaway

Pieces OS is what makes this blog generator feel real:

  • it anchors generation in your actual recent work (summaries)
  • it shapes tone/voice via your own signals (annotations → persona)
  • it enables agentic correctness when available (MCP tools)

And the prompt strategy matters just as much: moving from one-shot generation to a plan-first, titled multi-part workflow is what consistently turns “okay output” into “publishable output”.

The rest of the project code (UI, models, generation logic, widgets, etc.) is available in the GitHub repository.
