DEV Community: Luke Manning

The Entrepreneurial Conundrum

Luke Manning — Thu, 16 Apr 2026 12:39:06 +0000

I have business ideas all the time. Every few days something pops into my head, usually triggered by some frustration or gap I notice in daily life. Sometimes I jot them down. I'll even flesh a few out, turn a vague thought into something that feels almost real.

Then I lose interest. Or I realise I'm not actually passionate about it. Or I just... don't do anything.

And then, inevitably, I see someone else ship the exact same thing. Usually some small tool or app I'd scribbled in my notes months earlier and never touched again. My first reaction is always "That could have been me." It doesn't feel great.

It's a cycle I've been in for years.

The 4 AM Ice Bath Thing

There was a stretch where I kept ending up in these YouTube spirals late at night. Start with something legit, then an hour later you're watching a guy in an ice bath telling you that success comes down to waking up at 4 AM and filling out your gratitude journal. You know the type.

Some of them say it's about being uniquely positioned. Having the right resources, knowledge, and contacts at the right time. But if you're just starting out, how are you uniquely positioned? You're not. That's the point.

Then there's the Alex Hormozi approach. "Fake it till you make it." Grind long enough, build the expertise, make the contacts, and eventually something clicks.

I've watched enough of these to know they all have a point. I've also watched enough to know none of them have the full picture. They all sound so sure of themselves. And I kept watching, so I guess it works.

None of it really helped. The problem wasn't knowing how to start. It was starting.

The Reputation Fear

One of the YouTubers I kept coming back to was Dan Koe. He's the guy that sent me down a self-actualisation rabbit hole. His stuff on personal branding made sense to me: build in public, share your journey, create an audience around what you're learning.

But here's where I get stuck.. if I ship something half-baked, does that damage the brand I'm trying to build? Too many missteps and people stop taking you seriously. Or at least, that's what I worry about. Maybe that's just an excuse to not ship anything.

Then I look at Marc Louvion @marclou (x.com). I came across his profile on X a while back and started following along. He's had more failed startups than most people have ideas. He's transparent about all of them. And he's now followed by over 321k people specifically because of that transparency. He kept failing publicly and kept going.

I read that and think maybe the reputation fear is overblown. Just ship stuff, be honest about what doesn't work, people respect that. Then I go to actually put something out there and the voice comes back. It'll look half-baked. It's not ready.

The Quote That Made Me Cringe

Reading The Million Dollar Weekend by Noah Kagan, I came across this:

Okay, so you have an idea for an app. How would you go about doing it? Here's the way most people, most wantrepreneurs, would do it:

Spend hours at home thinking about the app (and coming up with clever names for it).

Spend $100 hiring your cousin to design a cool logo.

Set up an LLC.

Watch YouTube videos about apps, programming, and business.

Consider signing up for a developer bootcamp and quickly realize coding is hard.

Buy the domain name for the snazzy website you're going to build.

Look into hiring a developer on UpWork and quickly realize it's cost prohibitive.

Give up. Again.

Noah Kagan, The Million Dollar Weekend

I've done most of this list. Not all of it, but enough that reading it felt like being called out by name. The domain buying especially. I've bought domains for ideas that never went anywhere. They're still renewing.

Where I'm At

I'm still in the wantrepreneur camp. I know that. Reading that Noah Kagan excerpt didn't magically change anything, it just made the pattern visible.

The conundrum for me isn't really about entrepreneurship at all. It's about the gap between having ideas and doing something with them. I know the theory. I've read the books, watched the videos, followed the people. Knowing the theory hasn't made me ship anything.

I don't have a neat resolution here. I'm still working through it. Maybe writing about it is a small step. Maybe it's just another form of thinking about the app instead of building it. I genuinely don't know.

Straico Has Great Models But No Streaming, So I Built a Proxy

Luke Manning — Wed, 15 Apr 2026 15:08:54 +0000

I use OpenCode as my main AI coding tool. I switched from Claude Code after Anthropic started going after open source projects and I kept hitting session limits on my subscription extremely fast.

OpenCode works with any OpenAI-compatible API. Straico gives me access to Claude, GPT, Gemini, DeepSeek, and a bunch more through a single API key. Cheap too. Problem is, Straico's API is missing two things OpenCode needs: streaming responses and function calling.

Without streaming, OpenCode just hangs. Never gets a response. But Straico keeps eating tokens on their end anyway. Without function calling, the AI can't use tools like reading files or running bash commands. Both are non-negotiable for an agentic coding tool.

So I built a proxy. It sits between OpenCode and Straico, translating requests and responses to fill in the gaps.

OpenCode
  → localhost:8000 (my proxy)
    → Straico API

What started as "just simulate streaming and inject tool definitions" turned into a surprisingly full-featured thing. The codebase is at github.com/ManningWorks/DOAI-Proxy.

The Architecture I Ended Up With

I didn't start with a provider pattern. I started with four files: server.js, streaming.js, tools.js, utils.js. But once I started thinking about the possibility of adding other providers down the line, I refactored into something cleaner.

The provider pattern lives in providers/. BaseProvider is an abstract class that handles the interface contract and retry logic. StraicoProvider extends it with Straico-specific request/response transformation. ProviderFactory instantiates the right one based on the PROVIDER_TYPE env var.

Right now only Straico exists, but the factory already has stubs for OpenAI and Anthropic. The ADDING_PROVIDERS.md doc in the repo lays out how to add a new one.

The other modules:

server.js - Express server, routing, auth, request lifecycle
streaming.js - SSE simulation with two modes
tools.js - Tool injection and response parsing
utils.js - Logging, formatting, log rotation
utils/model-limits.js - Fetches context limits from Straico's API
summarizer.js - Conversation summarization for long sessions
scripts/sync-opencode-config.js - Syncs model list to OpenCode config

Streaming Without Streaming

Straico returns the full response at once. No SSE. No chunks. The proxy has to fake it.

Two modes: none and smart.

none is what I'd recommend as default. It sends the entire response in one SSE chunk, then the [DONE] marker. Fast, no formatting issues, still technically SSE.

smart is more interesting. It splits the response into chunks with delays to simulate real streaming. The naive approach is responseText.match(new RegExp('.{1,15}', 'g')) and that kind of works. But it breaks markdown. Split mid-bold, mid-code-block, mid-backtick and the rendering glitches.

So smartChunkText() in streaming.js looks for safe boundaries. It prefers splitting on newlines, then whitespace. It also checks for markdown delimiters and extends the chunk to avoid splitting them. There's a max size limit (targetSize * 10) to prevent infinite extension.

// streaming.js - simplified version of the boundary logic
for (let i = end; i > start; i--) {
  if (text[i] === '\n') {
    return i + 1;
  }
}
for (const delim of ['**', '__', '`']) {
  const delimStart = text.indexOf(delim, start);
  if (delimStart !== -1 && delimStart < end) {
    const delimEnd = delimStart + delim.length;
    if (delimEnd > end) {
      return Math.min(delimEnd, text.length, start + maxSize);
    }
  }
}

Default is 15 characters per chunk with 80ms delay. That feels about right for most models. Configurable via STREAM_CHUNK_SIZE and STREAM_DELAY_MS env vars.

I set STREAM_MODE=none as the recommended default. smart works but it's more of a showcase thing. The boundary detection catches most cases but I wouldn't trust it with complex nested markdown.

Function Calling via Prompt Injection

Straico doesn't support function calling natively. The workaround: inject tool definitions into the system prompt and parse the AI's response to detect tool calls.

injectToolsIntoSystem() in tools.js appends a formatted list of available tools to the system message:

You have access to the following tools:
- bash: Run bash commands
- read: Read file contents

When you need to use a tool, format your response like this:
TOOL_CALL: <tool_name>
ARGUMENTS: <json_arguments>

There's a sentinel comment () to prevent double-injection if the same messages get processed twice.

The tricky part is parsing. Different models output tool calls in different formats. I ended up with four parsers that run in sequence:

Minimax XML - <minimax:tool_call> with <invoke> tags
Claude XML - <invoke name="..."> with <parameter_list> tags
OpenAI Native - JSON with "tool_calls": [...] embedded in the response
Text Format - The TOOL_CALL: / ARGUMENTS: format from the injection prompt

Each parser tries to extract tool calls from the response text. The first one that succeeds wins. This was a gradual thing. I started with just the text format parser. Then Minimax models returned XML. Then Claude models returned different XML. Then some models returned JSON that looked like OpenAI's format. Four parsers later and it handles most cases.

The text format parser was the hardest to get right. Matching TOOL_CALL: tool_name ARGUMENTS: {json} seems simple until the JSON contains nested objects, strings with braces, or the model forgets the space between the tool name and ARGUMENTS. The implementation tracks brace depth to find where the JSON actually ends:

for (let i = argsStartIndex; i < responseText.length; i++) {
  const char = responseText[i];
  if (char === '{') braceCount++;
  else if (char === '}') {
    braceCount--;
    if (braceCount === 0) {
      argsEndIndex = i + 1;
      foundClosingBrace = true;
      break;
    }
  }
}

The proxy also validates tool calls against the list of available tools. If the model invents a tool that doesn't exist, it gets filtered out. If all tool calls are invalid, the response is treated as regular text.

Tool Call Streaming

OpenCode expects tool calls to arrive as SSE chunks, same as regular text. streamToolCalls() in streaming.js sends an init chunk with the tool name and ID, then an args chunk with the arguments, then a final chunk with finish_reason: 'tool_calls'. Each chunk has a small delay (20ms, 10ms, 20ms) to feel like actual streaming.

Conversation Summarization

This one sneaked up on me. Straico has model context limits. Some models have 8k tokens, some have 128k. OpenCode sends the entire conversation history with every request. In a long coding session, that history grows fast.

summarizer.js checks if the estimated token count is approaching the model's limit. When it hits a configurable threshold (default 70% of the model's word_limit), it takes all but the most recent messages, sends them to Straico for summarization, and replaces them with a single summary message.

The summarization itself uses Straico's smart_llm_selector with pricing_method: balance, so it picks a cheap model for the summary. Configurable via SUMMARIZATION_MODEL.

I'm still not 100% sure this is the right approach. The summary is lossy. Sometimes the model needs context from earlier messages that the summary glossed over. But without it, long sessions just fail with context limit errors. Tradeoff.

Model Limits and Validation

utils/model-limits.js fetches all available models from Straico's /models endpoint at startup. It caches their context limits (word_limit) and max output tokens (max_output). The proxy uses this to validate incoming requests. If estimated_input_tokens + max_tokens > word_limit, it rejects the request with a 400 error before even hitting Straico.

The model list is also exposed at /v1/models so OpenCode can discover what's available. There's an admin endpoint at /v1/admin/refresh-models to force a refresh if Straico adds new models.

The sync script (scripts/sync-opencode-config.js) goes one step further. It fetches the model list from Straico, then updates ~/.config/opencode/opencode.json with all chat-type models. The Docker entrypoint runs this script before starting the server, so the model list is always current.

Authentication

Four modes, controlled by AUTH_MODE:

required - Needs PROXY_API_KEY, rejects requests without it. Default in production.
optional - Uses the key if set, warns if not. Default in development.
disabled - No auth. For isolated environments.
external - Trusts an external auth header. For when the proxy sits behind an API gateway or service mesh.

The key comparison uses crypto.timingSafeEqual to prevent timing attacks. Took me a moment to realise I needed buffer length checks too, since timingSafeEqual throws if the buffers are different lengths.

Retry and Graceful Shutdown

BaseProvider.makeRequestWithRetry() wraps every API call with exponential backoff. Retries on 429, 5xx, and network errors (ECONNREFUSED, ECONNRESET, ETIMEDOUT). Default is 3 attempts with a 1-second base delay.

Graceful shutdown was one of those things I didn't think about until I ran into issues. When Docker sends SIGTERM, the proxy stops accepting new requests and waits for active ones to drain. There's a timeout (default 30 seconds) after which it force-exits. Without this, long-running streaming responses would get cut off mid-chunk when the container restarted.

Docker Setup

The Dockerfile uses node:18-alpine and an entrypoint script. The entrypoint runs the OpenCode config sync, then starts the server.

Docker Compose mounts two volumes. The .env file for config. And ~/.config/opencode so the sync script can write to the OpenCode config file.

volumes:
  - ./.env:/app/.env:ro
  - ~/.config/opencode:/root/.config/opencode

One thing I got wrong initially was the Dockerfile CMD. I had CMD ["node", "server.js"] which meant the config sync never ran. Switched to ENTRYPOINT ["/app/docker-entrypoint.sh"] and that fixed it. Small thing, but it meant every container restart would have stale model lists.

The Straico-Specific Quirks

Straico's API is mostly OpenAI-compatible but with some differences that caught me out.

Tool result messages use role: "tool" in OpenAI format. Straico doesn't support that role. The proxy converts them to role: "user" with a [Tool Result]: prefix. Same with assistant messages that contain tool calls. Those get converted to the text format the injection prompt expects.

Empty assistant messages get filtered out entirely. Some models return an assistant message with empty content before making a tool call. Straico chokes on those.

There's a TOOL_RESULT_MAX_LENGTH env var that truncates large tool outputs. Some tool results (file reads, command output) can be massive. Without truncation, they blow out the context window and the next request fails.

The proxy also normalises messages. OpenAI sends content as arrays of objects (text parts, image parts, system reminders). The proxy flattens those into plain strings and strips out <system-reminder> tags. Straico doesn't know what to do with the array format.

What I'd Do Differently

The provider pattern is solid but I'd start with it from the beginning rather than refactoring into it. The four-file structure worked fine until I wanted to add features that crossed module boundaries. The abstraction would have saved me some reshuffling.

The smart streaming mode is neat but I'd think harder about whether it's worth the complexity. The boundary detection handles most markdown but not all edge cases. none mode is faster and more reliable. I use none day to day.

The summarization feature is the part I'm least confident about. It works, but the lossy compression means sometimes context gets dropped at exactly the wrong moment. I might revisit this with a sliding window approach instead of a hard summarize-and-replace.

Where It Stands

The proxy handles:

All 90+ Straico models through a single endpoint
Streaming simulation (both modes)
Function calling with four parser strategies
Conversation summarization for long sessions
Model context validation
Authentication with four modes
Retry with exponential backoff
Graceful shutdown with request draining
Docker deployment with automatic model sync

It runs on my machine and OpenCode talks to it at http://localhost:8000/v1. Works well enough that I don't think about it most of the time. Which is exactly what a proxy should do.

The code is on GitHub if you want to look or use it. Or add a provider. The architecture supports it.