What I'm watching this week: OpenClaw, TRELLIS.2, Gemma 4, and two more

#ai #opensource #machinelearning #webdev

Five projects I pulled up and actually thought about this week, four of them open source. This isn't a comprehensive roundup, just what caught my attention as someone running three AI-curated directory sites on a $25/month budget.

OpenClaw: the most-starred repo in GitHub history

openclaw/openclaw just crossed 373,000 stars, making it the most-starred software project in GitHub history, overtaking React. It's a self-hosted AI assistant that routes to any model — local or API — and connects to over 50 messaging platforms: WhatsApp, Signal, iMessage, Slack, Discord, Telegram, and more, all from a single config file. Created by Peter Steinberger, it has 1,200+ contributors.

From my directory-site angle, OpenClaw is exactly the kind of project that drives "Open Alternative To" traffic. People search for "self-hosted alternatives to [Claude API / ChatGPT]", and what they actually want is something like this: a local-first agent they control. A star count past 370,000 without a polished marketing page tells me the latent demand is real. I'm going to add it to the OSS directory and watch whether search engines index the page.

Claude Mythos Preview at 93.9% SWE-bench Verified

Anthropic published Claude Mythos Preview this week, which now leads the SWE-bench Verified leaderboard at 93.9%. For reference: Claude Opus 4.8 is at 88.6%, Claude Opus 4.7 Adaptive at 87.6%, and six months ago 80% was the ceiling. The trajectory is steep.

I'm running Claude Haiku 4.5 for my ETL and content generation — cost and latency matter more to me than ceiling performance. But I'm watching this number for a different reason: as coding agents approach 95%, the interesting question shifts from "can it write correct code?" to "can it understand architecture trade-offs without prompting?" SWE-bench doesn't measure that, and I don't know when that bar will move.

Gemma 4: Apache 2.0, agentic, multimodal, four sizes

Google released Gemma 4 in early April and made further announcements at Google I/O in May. The family: Effective 2B, Effective 4B, 26B MoE, and 31B Dense. All under Apache 2.0. The 31B model ranks #3 on the Arena AI open-model leaderboard; the 26B is #6.

What makes it relevant for my stack: native function calling and structured JSON output are built in from day one, not retrofitted. The Effective 2B (E2B) model is designed for edge inference, which raises the question of whether it could replace Claude Haiku for my nightly ETL without a cost increase. The honest answer is I don't know. It depends on inference latency under load, and I haven't run that benchmark. I'll do it once I have a stable GPU budget to test against.
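For what it's worth, here is roughly the shape of the latency-under-load benchmark I have in mind. It's a minimal sketch, not something I've run against Gemma 4: `fake_model` is a hypothetical stand-in that just sleeps, and the concurrency level and payloads are placeholders. Swapping in a real inference call is where the actual work would be.

```python
import math
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def timed_call(call, payload):
    """Return the wall-clock seconds a single call takes."""
    start = time.perf_counter()
    call(payload)
    return time.perf_counter() - start


def benchmark(call, payloads, concurrency):
    """Run all payloads through `call` with a fixed number of worker threads."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(lambda p: timed_call(call, p), payloads))
    latencies.sort()
    # Nearest-rank 95th percentile.
    p95_index = max(0, math.ceil(0.95 * len(latencies)) - 1)
    return {
        "n": len(latencies),
        "p50": statistics.median(latencies),
        "p95": latencies[p95_index],
        "max": latencies[-1],
    }


def fake_model(payload):
    """Hypothetical stand-in for a real inference request; it only sleeps."""
    time.sleep(0.01)
    return {"ok": True}


if __name__ == "__main__":
    stats = benchmark(fake_model, list(range(20)), concurrency=4)
    print(stats)
```

The numbers that would matter for a nightly ETL are p95 and max at the concurrency the job actually uses, since one slow tail request can hold up the whole batch.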

LLaDA2.0-Uni: discrete diffusion at 100B parameters

InclusionAI/LLaDA2.0-Uni (Ant Group) is a unified multimodal model — text generation, image understanding, and image generation — built on discrete diffusion instead of autoregressive decoding. The 100B flash variant uses MoE, making it the largest discrete diffusion model published so far.

The architecture differs from the autoregressive models that dominate today: during training, tokens are randomly masked at varying rates; at inference you start from an all-masked sequence and run the reverse process, unmasking tokens over a series of steps. I'm not suggesting this is production-ready for most use cases today. But as a directional signal ("you don't have to do autoregressive decoding to get strong results at scale") it's worth tracking. If this line of research matures over the next 12–18 months, it changes the inference cost curve in ways that matter for people running batch AI pipelines cheaply.
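To make the mask-then-unmask idea concrete, here is a toy sketch. It is not LLaDA2.0-Uni's code or API: `predict` is a hypothetical stand-in for the model, and revealing positions at random is an assumption made for simplicity (a real sampler would likely choose which positions to reveal in a smarter way).

```python
import random

MASK = "<mask>"


def forward_mask(tokens, mask_rate, rng):
    """Training-side corruption: mask each token independently with probability mask_rate."""
    return [MASK if rng.random() < mask_rate else t for t in tokens]


def reverse_unmask(length, predict, steps, rng):
    """Inference: start fully masked and fill in a share of positions at each step."""
    seq = [MASK] * length
    for step in range(steps):
        masked = [i for i, t in enumerate(seq) if t == MASK]
        if not masked:
            break
        # Spread the remaining masked positions evenly over the remaining steps
        # (ceiling division), so the final step fills in whatever is left.
        remaining_steps = steps - step
        k = -(-len(masked) // remaining_steps)
        for i in rng.sample(masked, k):
            # `predict` sees the partially filled sequence and the position to fill.
            seq[i] = predict(seq, i)
    return seq


if __name__ == "__main__":
    rng = random.Random(0)
    target = "the cat sat on the mat".split()
    print(forward_mask(target, 0.5, rng))
    # Toy predictor that already knows the answer; a real model would infer it.
    print(reverse_unmask(len(target), lambda seq, i: target[i], steps=3, rng=rng))
```

The point of the sketch is the loop structure: several positions get filled per step, so the number of model passes is set by `steps` rather than by sequence length. That is the property that could shift batch inference costs if the approach matures.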

TRELLIS.2: image-to-3D in three seconds, MIT license

Microsoft released TRELLIS.2-4B on HuggingFace under MIT license. It converts a single image to a fully textured 3D asset: 512³ resolution in roughly 3 seconds, 1024³ in about 17 seconds, on an NVIDIA A100. Minimum 24GB VRAM.

I don't have an immediate use for 3D generation in my stack. But I'm watching because it fits the same pattern as the open-source video generation wave from late 2025: once the compute floor drops below "GPU you already own," the tool stops being a research artifact and becomes infrastructure. TRELLIS.2 isn't there yet at 24GB minimum, but the MIT license and HuggingFace availability suggest Microsoft is betting on community adoption rather than API gating. My guess is that the next version will run on a 16GB card.

Part of an ongoing 6-month experiment running three AI-curated directory sites. The technical claims here are real; this article was AI-assisted.