DEV Community: Matt Macosko

The IRS Said Yes — and What That Does and Doesn’t Mean

Matt Macosko — Thu, 16 Jul 2026 06:10:44 +0000

Back on June 3rd I published a piece here saying the Cannabis Device Safety Institute had filed its federal application, and I made a point of leaning on that word — application. Filed, not granted. Pending, not approved. Not a tax-exempt charity, contributions not deductible, and we’d say so plainly every time, because saying it any other way would be exactly the kind of thing this institute exists to push against.

So I owe you the update in the same register.

The IRS granted it. The determination letter is dated June 29, 2026. CDSI is exempt from federal income tax under Section 501(c)(3), classified as a public charity under Section 509(a)(2) — not a private foundation — with the exemption effective April 27, 2026, retroactive to the day the articles were filed. Contributions are deductible under Section 170.

That’s the whole announcement. Now let me do the part I think matters more, which is being precise about what it does and doesn’t mean.

What it doesn’t mean

It is not an endorsement. Recognition of exemption is a determination about tax status. The IRS did not review our methodology, did not evaluate our off-gas protocol, and has no opinion whatsoever about whether ceramic donut atomizers off-gas at 650°F. No agency has blessed CDSI’s findings, because CDSI hasn’t published findings yet. If you ever see me imply otherwise, call me on it.

It doesn’t make CDSI a regulator. We’re a private nonprofit. We have no authority over anyone, we can’t compel any manufacturer to do anything, and we’re not seeking a government designation. The model is UL, ASTM, the NFPA Research Foundation — bodies that earned standing by being useful and rigorous, not by being appointed.

It doesn’t mean we have a lab — yet. Right now CDSI pays accredited independent labs and publishes what comes back. Pay the lab, not pay to pass. But the goal was never to outsource forever: the plan is for this institute to build and run its own testing bench, because the body that writes the methodology should eventually be able to execute it too — and the determination letter is exactly what makes that fundable. Foundation grants and tax-deductible donations can now go toward standing up a lab of our own. If and when that lab exists, nothing about the discipline changes — open methodology, public reports, every conflict on the cover page.

It doesn’t finish the paperwork. California still treats CDSI as a taxable corporation until a separate state filing goes through. That one’s in an envelope, not a press release.

About the timeline, honestly

We filed the full Form 1023 on June 3. The determination is dated June 29 — 26 days later.

The IRS publishes exactly one number about how long this takes: “We issue 80% of Form 1023 application determinations within 191 days.” That’s from their own “Where’s my application” page. So: 26 days, against a published benchmark of 191, on the long form rather than the 1023-EZ shortcut, with no request for expedited handling.

I want to be careful with that number, because it would be very easy to turn it into something it isn’t.

I can’t tell you it’s a record. Nobody can tell you that, about any organization. The IRS’s public files record determination dates by month only — no day — and carry no application-submitted date at all. The interval literally cannot be computed from public data, including the IRS’s own. So anyone claiming a record in this category is claiming something no dataset could check, and I’d rather not be that guy.

I also can’t tell you the speed proves the application was good. That’s the flattering read, and I don’t think it survives scrutiny. The IRS has fast-track lanes for straightforward cases, and how big those lanes are isn’t published. The six days between answering their follow-up letter and getting the determination is, as far as I can tell, just the IRS’s own internal rule about how fast a specialist has to close a case once you respond. And the biggest variable — when the application got assigned to a human at all — was completely outside my control and I can’t explain why it happened when it did.

What I can say is what we actually did, and let you decide if any of it mattered:

We paid $600 for the long form instead of $275 for the EZ, on purpose, because the EZ doesn’t have room to explain why a fee-for-service testing subsidiary serves a public mission — and filings like ours get bounced back to the long form anyway, months later. The slow-looking choice was the fast one.

We disclosed the conflicts instead of burying them. The founder of a cannabis hardware standards body builds cannabis hardware. That’s on the record, in the conflict of interest policy, in the state charity filing, and it’ll be on the cover page of every report we publish.

We went through this site and cut every claim we couldn’t prove — before we filed, not after. No “first,” no “leading,” no partnerships that hadn’t happened. Reviewers read your website. There was nothing to pick at because we’d already picked at it ourselves.

And when the IRS wrote asking for more information, we answered within hours of opening the envelope instead of sitting on the 28 days we had.

None of that is clever. It’s just doing the boring things in the right order.

What it actually changes

Donations to CDSI are now tax-deductible. That’s real — it means a foundation can fund hardware testing, and an individual who cares about this can help and take the deduction.

CDSI is registered with the federal government and eligible for federal grants — that happened back on June 23, with the SAM.gov registration going active.

Put those together and here’s the thing I care about: the money to characterize this hardware can now exist. For fourteen years the reason nobody tested the device was that no one would pay for it. In 2016 I paid a lab out of pocket to run an off-gas test on a concentrate vaporizer because there was no funding source in the world for that work. That’s still the founding artifact of this institute, and it’s still the reason it exists.

Now there’s a vehicle that can receive that money. That’s what a determination letter is. Not a trophy — plumbing.

Where this goes

The next thing I want to make the case for is bigger than safety, and I’ll write it properly soon.

It’s this: you cannot honestly answer whether inhaled cannabis helps someone until you know what the device contributed to the dose. Heat a vaporizer and it puts its own chemistry into the same stream carrying the cannabis. So every measured outcome — a contaminant, a symptom, a relief — has two possible sources: the material and the machine. That’s not noise you can fix with more study subjects. It’s an attribution problem, and no sample size touches it.

Every other measurement science solved this a century ago. The chemist runs a blank before running samples. Nobody doses a patient through an uncharacterized nebulizer. Cannabis research does the equivalent constantly, not out of carelessness, but because no one was ever responsible for the device.

Which means hardware characterization isn’t adjacent to the medical question. It’s upstream of it. First you characterize the device. Then you have a device you can trust as an instrument. Then — and only then — you can ask what the medicine does to a person and believe the answer.

It all starts with the devices. Not because the devices matter most. The person matters most. But the device is the part you have to understand first in order to understand any of the rest of it honestly.

That’s the work. The letter just means we can afford to do it.

The Cannabis Device Safety Institute is a California nonprofit public benefit corporation, recognized by the IRS as exempt under IRC § 501(c)(3) and classified as a public charity under § 509(a)(2) (determination letter dated June 29, 2026; EIN 42-2429365). Recognition of exemption is a determination of federal tax status and is not an endorsement of the Institute or its findings by the IRS or any government agency. cdsi.click

Originally published at Marijuana Union. The Cannabis Device Safety Institute is an independent 501(c)(3) nonprofit standards body for cannabis consumption hardware — open methodology, public reports, every conflict of interest on the cover page. Methodology, papers, and the public record: cdsi.click.

M5 Max + 128GB = a 30B AI Coding Agent Running Locally. Wi-Fi Off.

Matt Macosko — Mon, 29 Jun 2026 17:18:47 +0000

The M5 Max MacBook Pro with 128 GB of unified memory is the first laptop that can hold a frontier-class coding agent entirely in RAM. No GPU rack. No cloud. No subscription.

That clip up top isn’t a render. That’s Qwen 3 Coder — 30 billion parameters, 8-bit MLX — running on this MacBook with the Wi-Fi off. Around 55 tokens per second. Total cost to keep running it: zero.

The thing that matters more than the spec sheet is what it actually unlocks.

Why the M5 Max changes the math

Until now, running a 30B+ parameter model meant a GPU rack — or paying a cloud API per token. The M5 Max changes that:

128 GB unified memory. The entire model lives in fast RAM. No GPU offload, no quantization tricks past 8-bit. The CPU and “GPU” share the same memory, so there’s no copy step between them.
Mixture-of-experts plays perfectly with Apple Silicon. Qwen 3 Coder is 30B total but only 3B active per token. That’s a math problem the M5 Max’s memory bandwidth eats for breakfast.
MLX runs at near-CUDA speed. Apple’s native ML framework hits ~55 tok/s on the 8-bit quant. No CUDA tax, no Nvidia driver politics, no $40,000 GPU bill.
It’s a regular store-bought laptop. No GPU rack. No data center. No cloud bill. You can run it on a plane.

This wasn’t possible on a laptop a year ago. It is now.

What you can do with it now

Read a legal contract — and have it never leave your machine.

Most AI tools pipe your document to a server somewhere. With this setup, the bytes don’t leave the laptop. NDAs, supplier agreements, employment contracts — review them at your kitchen table without uploading them to anyone.

Write production code in a couple of seconds.

The video shows it: real Python function, real Qwen output, no edits. The agent’s tool-calling is good enough to drop into Claude Code’s loop, where it’ll edit files, run shell commands, and iterate. It’s plenty for everything from one-off scripts to refactoring real production code.

Analyze patient charts without a HIPAA violation.

For doctors, therapists, intake clinics — anything with PHI on it — local-only AI isn’t a nice-to-have, it’s the only legal option. Same model, same speed, zero bytes leaving the device.

Build agents that don’t charge you per call.

This is the one most people sleep on. Pay-per-token cloud APIs make agents expensive to leave running. Once the model is local, you can let an agent loop overnight, hit it with thousands of requests, kick off a watcher that scans your inbox every two minutes — and the cost stays at zero.

The full stack

Here’s the receipt:

Hardware: M5 Max MacBook Pro, 128 GB unified memory.
Model: Qwen3-Coder-30B-A3B-Instruct-MLX-8bit — about 30 GB on disk. Mixture-of-experts, ~3B params active per token.
Server: A small Python proxy at localhost:4000 that speaks the Anthropic Messages API, so the Claude Code CLI thinks it’s talking to the cloud — except it’s talking to a hard drive.
Total monthly cost: $0 once it’s downloaded.

That’s it. No Docker, no Kubernetes, no VPS. Just a laptop on a desk.

The performance, honestly

The local-AI space is full of overclaims, so the straight numbers:

55 tokens per second on a real coding task. Sustained, not peak.
Two seconds to write a working find_median() function. Three to four seconds for most refactors.
Tool-calling reliability is good enough for the Claude Code agentic loop. Not as locked-in as Sonnet 4.6, but plenty for getting work done.
What it’s not: a Sonnet replacement for nuanced reasoning, long contexts, or really tricky debugging. For day-to-day code agent work, it more than holds its own.

Why the offline part matters

The reason “Wi-Fi off” keeps coming back in the demo isn’t a gimmick. It’s the whole thesis.

If a tool needs the internet, three things are true:

Someone else can read what you sent.
Someone else can charge you for it.
Someone else can take it away.

If the same tool runs locally, none of those are true. That’s a different category of software. Not better at every task — but yours.

Benchmarks — actually run, not cited

Big claims need numbers. So here’s what Qwen 3 Coder 30B-A3B (8-bit MLX) actually scores on this MacBook, run end-to-end against the local localhost:4000 server. Every problem solved by the model, executed in a Python subprocess, scored pass/fail. Pass@1, temperature=0, single sample per problem.

Benchmark	N	Pass@1	Notes
HumanEval	164/164 (full)	81.7%	Python function-completion classic. Saturated benchmark; modern coding models cluster 75–95%. 14 min total wall-clock.
MBPP (sanitized)	168/427 (sampled)	83.3%	Mostly Basic Python Problems. Pass rate was stable since n=120; a few outlier tasks induce very long model responses, so I cut off at 168.

Both runs used pass@1, temperature=0, 10s execution timeout, on the local 8-bit MLX quantization. No retries. No best-of-N tricks. Single sample per problem.

For context — what the bigger sibling scores on harder benchmarks

The Qwen team didn’t publish HumanEval/MBPP for any Qwen3-Coder variant — they consider those benchmarks saturated. Their official benchmarks are agentic, and they ran them on the flagship Qwen3-Coder-480B-A35B-Instruct (the bigger sibling, ~16× the active params of the 30B-A3B running on this laptop). For context — here’s what the flagship 480B scores on those harder agentic benchmarks compared to the major closed models:

Agentic Benchmark	Qwen3-Coder 480B	Claude Sonnet 4	GPT-4.1	DeepSeek-V3
SWE-bench Verified (500-turn)	69.6	70.4	—	—
Terminal-Bench	37.5	35.5	25.3	2.5
BFCL-v3 (function calling)	68.7	73.3	62.9	64.7
Aider-Polyglot	61.8	56.4	52.4	56.9
WebArena	49.9	51.1	44.3	40.0

Source: Qwen team’s official blog. The 30B-A3B running on this MacBook is a smaller sibling of the 480B — it trades absolute peak agentic ceiling for fitting in 30 GB and running 24/7 on local hardware. For most coding tasks people actually do in a day, HumanEval/MBPP-class accuracy matters more than the SWE-bench top-line, and on those it sits where it should: useful, fast, local.

Where this is heading

The next year of the AI conversation isn’t going to be “which model is smartest.” It’s going to be “which workloads belong on your machine, and which belong on someone else’s.”

Compliance-bound work — legal, medical, financial — is going to move local fast. Code-agent loops will follow because the math (per-call cost vs. zero) is brutal. The M5 Max with 128 GB of unified memory is the laptop that lets that happen.

Try it yourself

The launchers are open source on GitHub: nicedreamzapp/claude-code-local. The README walks through downloading the model and pointing Claude Code at the local server.

For law firms, medical practices, and accountants that want help getting this running on their own hardware — that’s what AirGap is. 14-day pilot, fixed scope, the data never leaves your machines.

— matt

Originally published at Nice Dreamz Wholesale. Run AI locally on your own hardware with claude-code-local, open source and no cloud required. More at nicedreamzwholesale.com/software.

4 AI Models Built the Same Game on One Laptop — and a Local One Beat the Cloud

Matt Macosko — Tue, 23 Jun 2026 05:34:34 +0000

Four AI models. One job: build a playable game. One laptop. I gave four different AI models the exact same prompt — build a complete, playable Asteroids game in a single HTML file — and then I actually played what each one built. Three ran entirely on my laptop. One was the cloud, as the benchmark to beat. Here is the whole thing in under a minute.

I have been doing a lot of these little head-to-heads lately, and I wanted one where the test was real — not a riddle or a trick question, but an actual build. Give every model the same spec, let it write the whole thing in one shot, then open the file and see if the game actually plays. No cherry-picking, no fixing it up afterward. Either it runs or it doesn't.

The setup

Everything ran on a MacBook Pro M5 Max with 128 GB of memory. Three of the four models ran 100% local — no internet, nothing leaving the machine — through MLX and llama.cpp. The fourth was Cloud Claude Opus, sitting in a data center, there to set the bar.

Model	Where	Time	Speed
DeepSeek V4 Flash 284B	Local	300s	31 tok/s
Cloud Claude Opus	Cloud	48s	115 tok/s
Qwen3-Coder 30B	Local	33s	95 tok/s
Gemma 4 31B	Local	145s	26 tok/s

The prompt was identical for all of them: a single self-contained HTML file, vanilla JavaScript, no libraries — a ship that rotates and thrusts, bullets, asteroids that split when you shoot them, screen wrap-around, score, lives, and a game-over screen. One shot.

What happened

All four wrote a real, working game. That part genuinely surprised me — a few years ago, "write me a complete arcade game in one file" was not something you handed to a model running on a laptop and expected to play afterward.

DeepSeek built the most polished one. A glowing ship with a thrust trail, a starfield background, asteroids that actually broke apart, hearts for lives. It took the longest, but it played beautifully. Cloud Claude was fast and clean — lives, a level counter, even on-screen control hints. Qwen3-Coder 30B was the speed demon, turning out a solid game in 33 seconds. And Gemma delivered a tidy, working game of its own.

The verdict

The winner, to my eye, was DeepSeek — and the part that sticks with me is that it was running locally, on my desk, and it built a better game than the cloud model. Cloud Claude was excellent and much faster, but on the actual finished product, the local 284B model edged it out.

The bigger point is the one I keep landing on in everything I write here. Three of these four ran completely offline, for free, on one machine. No subscription, no metered tokens, nothing leaving the laptop. The cloud is still faster, and for a lot of work that speed matters. But "you have to use the cloud or it won't be any good" is just not true anymore. Local AI is catching up, and on this particular test it pulled ahead.

Run local models yourself (free)

If you want to try this, the abliterated MLX models I have converted for Apple Silicon are all free to pull:

My Hugging Face — abliterated MLX models for Apple Silicon
claude-code-local — run Claude Code against a local model, no API key.

The narration in the video is local text-to-speech, and the whole comparison was made on one MacBook. The only thing that touched the cloud was the Cloud Claude baseline.

Originally published at Nice Dreamz Wholesale. I convert abliterated MLX models for Apple Silicon — all free at my Hugging Face.

Eight local AI agents on a Mac mini — and the product I'm building from them

Matt Macosko — Tue, 26 May 2026 19:04:03 +0000

A case study went around recently: a lawyer had wired up 66 AI agents on a Mac mini for his own firm — every one running locally, nothing touching a cloud API — and was looking for a commercial partner before releasing it as open source.

I read that and realized I had been building the same shape of thing for my own small business for the last year. Eight ambient agents, all running on Apple Silicon, none of them touching cloud APIs. I had not productized any of them. I had also not gotten paid for any of them.

This post is about what I'm doing about that.

The asset base

I shipped a repo called claude-code-local. It's an MLX server that wraps three local language models (Gemma 4 31B, Llama 3.3 70B, Qwen 3.5 122B MoE) behind an OpenAI-compatible API. The setup script picks the right model for your hardware, downloads it, and puts a launcher on your Desktop. Three commands and you're running a 31-billion-parameter language model on your MacBook.

It has 2,689 stars and 516 forks as of this writing. License is MIT. The whole thing is at github.com/nicedreamzapp/claude-code-local.

That's the engine. What I had not built was the funnel.

The gap that ate me for a month

Stars are people who think "I would love this." Forks are people who started doing the work. Neither converts to revenue.

Reading that case study, what jumped out was: a lawyer with the technical chops to wire 66 agents could not productize the result. He has the buyer relationships (he is the buyer); I have the go-to-market background (I've sold consumer hardware direct to customers for years). The intersection nobody is shipping is "Mac mini with this stack pre-installed, delivered to your law firm."

So I sat down and mapped what was actually missing.

The market gap, in numbers

I researched the on-device AI market for May 2026. Here is what I found:

Free OSS (Ollama, LM Studio, Jan, Atomic Bot, my own repo): saturated, no money flowing
Paid Mac App Store apps (Private LLM, Enclave AI, Local LLM): $10-30 one-time, real revenue for bootstrapped 2-person teams
Cloud SaaS with zero-data-retention (Spellbook, CoCounsel): $69-$149/mo per seat, cloud-not-local
Enterprise legal AI (Harvey): $1,200+/seat/month
AI-native law firms (Manifest, Avantia, General Legal): not selling tools, they ARE the firm

Two things stand out:

The gap between "free OSS" and "$1,200/seat SaaS" is not being filled by subscription products. Private LLM's App Store reviews explicitly call out "no subscription" as the reason buyers picked them. The privacy buyer rejects recurring billing for privacy software. This is not opinion; it is in their reviews.
The Mac mini hardware bundle for privileged work is being built privately but not productized. Nobody is shipping it as a SKU.

That is the gap. That is what I am putting a product against.

The ladder

I'm skipping the standard SaaS playbook because the market is telling me to. The privacy buyer rejects recurring billing — so the spine is one-time purchases, not subscriptions:

Rung	Price	Audience
Free repo	$0	Existing OSS audience, top of funnel
AirGap Box	$2,995 (Base) / $3,995 (Pro)	Small firms wanting sovereignty without DIY
Foundation consulting	scoped	Firms ready for deeper, white-glove deployment

The free repo gets you curious. The Box converts the firms who would rather pay $3k than learn MLX. Foundation is for the firms that want it installed, tuned, and documented for their compliance counsel.

The three core agents

Three agents that drop on top of any OpenAI-compatible local LLM server (the free claude-code-local repo by default, but they work with Ollama and others too) — and ship pre-installed on the Box:

Folder Watcher — drop a PDF, text, or Markdown file into ~/AirGap-Inbox/. Within 30 seconds, a structured Markdown summary appears in _summaries/. Uses macOS's textutil and mdimport for PDFs and docx without external dependencies.
Daily Briefing — every morning at 7:00 a LaunchAgent reads the folders you list in a config file and writes a one-page digest of what changed in the last 24 hours.
Local Q&A — a CLI command airgap ask "your question" that answers from a single document you point it at, fully offline. Folder-wide indexing and citations across many files is on the roadmap — not something I'd oversell today.

The whole set is ~300 lines of Python. The installer is a .command file you double-click. Total install time: under 60 seconds on the right hardware.

I deliberately built them with zero authentication required. No IMAP credentials, no OAuth dance, no API keys — the install works for everyone on day one. The agents that need credentials (email drafting, Reddit lurking, calendar parsing) run on my own machines and ship in the AirGap Box, where there's a setup call to wire them up properly.

What's in the AirGap Box

The same stack on a Mac mini that arrives at your office. Pre-installed: claude-code-local MLX server, the three core agents, two more that need credentials (email drafter, prompt library), a default-blocked firewall, and a printed compliance memo template.

Base ($2,995): Mac mini M4 16GB, Gemma 4 31B preloaded.
Pro ($3,995): Mac mini M4 Pro 24GB, Llama 3.3 70B preloaded.

Both include a 90-minute Zoom setup call and 30 days of email support. After that, you have a working private-AI workstation. We do not phone home. We do not see your data.

COGS on the base unit is around $940. Margin around 69%. Cash-flow safe because Stripe charges at purchase; hardware ordered after.

Validation before scale

I am not ordering inventory on speculation. The plan is:

Day 0: launch publicly, with the Box gated by a waitlist (no checkout yet)
Day 7: first review — Box waitlist signups
Day 14: hard gate on Box — 20+ verified emails or we revisit positioning
Day 30: full P&L review, decide whether to scale

If the Box waitlist hits 20 in 14 days, I order the first 3 Mac minis. If it hits 5, I have not validated the rung; I revisit positioning before spending hardware money.

This is the part the grifter posts skip. "I made $14,200/month in 72 hours" is not a thing that happens to honest businesses. What happens to honest businesses is "I opened a waitlist, watched signups for two weeks, decided whether to order inventory based on actual demand, and reported the real numbers."

Honest year-1 range

Month	Pessimistic	Realistic	Stretch
1	$0	$400	$2,000
3	$200	$1,500	$6,000
6	$800	$3,500	$12,000
12	$1,500	$6,000-8,000	$20,000+

To hit the grifter's $14k/month claim, every rung needs to perform near the top of its range AND consulting needs to land. That happens in months 12-18 with discipline, not in 72 hours.

What I will publish honestly

Weekly waitlist + signup numbers (real, not vanity)
The first refund (when it happens) and why
The first Box install case study (with the firm's permission)
The Box waitlist → order conversion, as it happens

If you want to see whether this business works, follow along on github.com/nicedreamzapp/claude-code-local and the AirGap landing pages. I'll post the numbers as they happen.

Why I'm posting this

Because the next person to read a case study like that and think "I want this" deserves to find a productized version, not another Hacker News thread about MLX setup. And because every honest version of this story I publish is also a receipt — for me, that I built something real, and for the next builder, that the playbook works.

If you have feedback on the pricing, the positioning, or the agents, leave a comment. I read everything.

I'm building AirGap — pre-configured local AI on a Mac mini for firms handling private work. Join the AirGap Box waitlist, grab the free open-source stack, or see AirGap consulting for compliance-sensitive firms (law, medical, finance).

HumanEval on a MacBook — 81.7% pass@1, Wi-Fi off

Matt Macosko — Wed, 29 Apr 2026 18:11:49 +0000

The M5 Max MacBook Pro with 128 GB of unified memory is the first laptop that can hold a frontier-class coding agent entirely in RAM. No GPU rack. No cloud. No subscription.

I just ran HumanEval on it. Wi-Fi off the entire run.

81.7% pass@1 on the full 164-problem benchmark
Qwen 3 Coder 30B-A3B-Instruct (8-bit MLX)
14 minutes wall-clock, $0/month after the model download

YouTube walkthrough (three real problems, code streaming live, tests going green):
https://www.youtube.com/watch?v=muq7VdgxqRk

Why this number matters

The Qwen team didn't publish HumanEval scores for any Qwen3-Coder variant — they consider the benchmark saturated and went straight to agentic ones (SWE-bench Verified, BFCL, Aider-Polyglot). For the 30B variant — the one that actually fits on a laptop — there were no published HumanEval/MBPP numbers. Until this run.

I also ran MBPP (sanitized): 83.3% pass@1 on a 168-problem sample. Pass rate stable since n=120; full 427-run was impractical because a few outlier tasks induce very long model responses (10+ minutes each).

Methodology

Setting	Value
Benchmark	HumanEval — 164 Python tasks (full)
Metric	pass@1 (first attempt only)
Temperature	0.0 — deterministic
Sampling	single sample per problem, no best-of-N
Execution	Python subprocess, 10s timeout
Hardware	M5 Max MacBook Pro · 128 GB unified memory
Model	Qwen3-Coder-30B-A3B-Instruct-MLX-8bit
Network	Wi-Fi OFF the entire run
Wall clock	14 minutes

For context — Qwen3-Coder 480B's official agentic benchmarks

The Qwen team's published numbers for the 480B flagship sibling (the bigger sibling of the 30B running on this MacBook):

Benchmark	Qwen3-Coder 480B	Claude Sonnet 4	GPT-4.1
SWE-bench Verified (500-turn)	69.6	70.4	—
Terminal-Bench	37.5	35.5	25.3
BFCL-v3	68.7	73.3	62.9
Aider-Polyglot	61.8	56.4	52.4

Source: Qwen team's official blog.

Why the offline part matters

If a tool needs the internet, three things are true:

Someone else can read what you sent.
Someone else can charge you for it.
Someone else can take it away.

If the same tool runs locally, none of those are true. That's a different category of software — and for law firms, medical practices, and accountants handling client material, it's the only legal one.

Reproduce it yourself

Open-source launchers: github.com/nicedreamzapp/claude-code-local
HumanEval dataset: github.com/openai/human-eval
Hardware: any M-series MacBook with ≥32 GB RAM (128 GB Max preferred for full 8-bit weights)
Total monthly cost: $0 after the model download

For law firms, medical practices, and accountants who want help getting this stack running on their own hardware — that's what AirGap is. 14-day pilot, fixed scope, the data never leaves your machines.

— matt

Originally published at Marijuana Union. For premium vaporizers visit iNeedHemp, wholesale at Nice Dreamz, and seeds at Tribe Seed Bank. Explore the 3D cannabis marketplace at The Farmstand.

Free AI on a MacBook vs $100-a-Month Claude Code — Hexagon Shootout

Matt Macosko — Thu, 23 Apr 2026 04:32:47 +0000

▶ Watch the race on YouTube: https://www.youtube.com/watch?v=2KeTDDodE0A

April 22, 2026. Anthropic's Claude Code Max plan jumped to $100 a month. I ran a live three-way AI race on the exact same prompt — Gemma 31B local, Llama 70B local, and Claude cloud — on a single MacBook, to see how close a free local stack gets to the paid cloud. Two of three contestants finished with zero cloud calls.

If you just want the video, it's here: FREE AI on a MacBook vs Claude Cloud — Hexagon Shootout.

If you want the repo, it's here: github.com/nicedreamzapp/claude-code-local.

Keep reading for the setup, the numbers, and the three things that surprised me.

The setup — same prompt, three contestants

Hardware: M5 Max MacBook Pro, 128 GB unified memory, Apple Silicon.

Gemma 31B — local, Apple MLX, 4-bit quantized (Google's code-specialized model)
Llama 70B — local, Apple MLX, 8-bit quantized (Meta's generalist)
Claude cloud — the real Anthropic API, using Claude Code unchanged

Same prompt to every contestant:

Build a single HTML file with inline JavaScript that shows a ball bouncing inside a rotating hexagon. Include gravity and realistic bounce physics.

Simple enough that the answer should be a few kilobytes of code. Interesting enough that it exposes how well a model handles real math — collision detection against rotating geometry, energy conservation, boundary clamping. When models trip, they trip here.

Every run was recorded end-to-end with a live stats panel: elapsed seconds, output bytes, tokens-per-second. No cherry-picking, no post-hoc edits to the physics code, no "here's what it SHOULD have said." What you see is what came out.

The results

Contestant	Time to ship working HTML	Tokens/sec	Cloud calls
Claude cloud	22 s	N/A (data center)	yes (via API)
Gemma 31B local	56 s	~30	zero
Llama 70B local	2:17	~11	zero

Claude cloud finished first — it's a data center somewhere. Gemma 31B finished clean in under a minute with working physics. Llama 70B took the longest and produced the most verbose output, but also landed a working demo in the end.

The headline isn't that one is "best." It's that two of the three ran with Wi-Fi that could have been off the entire time. That's the number that matters for anyone dealing with NDAs, PHI, client files, or just a flight without connectivity.

Three things that surprised me

1. Bigger isn't better when "bigger" is a generalist

I went in expecting Llama 70B to beat Gemma 31B on code quality. It's more than twice the parameter count. Gemma beat Llama cleaner and faster on this specific task.

Why: Gemma 4 is a Google model fine-tuned heavily for coding and math. Llama 3.3 70B is Meta's generalist — it's excellent at conversation, reasoning, creative writing, but it wasn't tuned to punch above its weight on HTML canvas physics.

If you're buying a local model for coding, you're better off with a 30B that's code-tuned than a 70B that's general. Don't count parameters, read the model card.

2. Claude Code's harness chokes local models

Claude Code (the CLI agent) sends a 29,000-token system prompt with 60 tool schemas in every request. That's tuned for the cloud — where a frontier model can happily chew through 30K tokens of context before even starting. On a local 70B, that prefill takes a minute or two before generation begins.

When I bypassed Claude Code and hit the MLX server directly with just the prompt, Llama 70B's wall-clock time dropped from 7+ minutes to under 2.

The tradeoff: without Claude Code's harness you lose the Write/Edit/Bash tool-use loop, so you can't use Claude Code as an agent, only as a generator. For research, benchmarking, or any single-shot prompt, direct is way faster. For actual coding sessions, the overhead is real but it's what buys you the agent loop.

3. Circle-approximation collision is the cheat code

All three models eventually produced a bouncing ball. The ones that worked used circle-approximation collision — treat the hexagon as a circle of its apothem radius for collision purposes, reflect velocity when the ball exceeds that radius, clamp the ball back to exactly inside. Five lines of math, reliable, hexagon can rotate as wildly as you want.

The ones that failed tried to do proper polygon-edge collision — compute the six edges of the rotating hexagon each frame, compute point-to-line distance for each, reflect off the appropriate edge. That's the "right" way, and it fails constantly because floating-point error lets the ball slip through edges during the rotation, and then the model doesn't know how to clamp it back.

I wouldn't have predicted this. The "simple" approximation is strictly better for the demo because it can't leak. For anything more complex than one ball, the polygon approach is necessary — but for a benchmark, approximation wins.

Who should care

Developers on laptops with 64+ GB of Apple Silicon unified memory: you can run this today, your hardware already supports it.
Anyone dealing with confidential work — lawyers, accountants, doctors, contractors handling NDAs or PHI: the cost isn't $0 vs $100, it's "does your data leave the machine" vs "does it not."
Frequent flyers and people who travel to places with bad internet: a 70B model on a laptop keeps working when the plane's Wi-Fi is $18 and throttled.
Anyone curious whether Apple's bet on unified memory was actually about AI: it was.

How to run it yourself

The repo is MIT licensed and open source. Full setup is in the README:

→ github.com/nicedreamzapp/claude-code-local

The project pairs a native-MLX Anthropic-API-compatible server with Claude Code. Point Claude Code at localhost:4000 and the official CLI talks to your local model as if it were the cloud API. Swap models with one env var. Ship code without the subscription.

Around 2,000 stars in the first month. If it's useful, a star helps.

TL;DR

Claude cloud: $100/mo, 22 seconds to a working hexagon.
Gemma 31B on my MacBook: $0, 56 seconds to a working hexagon.
Llama 70B on my MacBook: $0, 2:17 to a working hexagon.
Two of three ran with zero cloud calls.
Free AI on Apple Silicon is real, now, for a huge slice of what people use cloud APIs for.

The receipts, in video form: youtube.com/watch?v=2KeTDDodE0A

Originally published at Marijuana Union. For premium vaporizers visit iNeedHemp, wholesale at Nice Dreamz, and seeds at Tribe Seed Bank. Explore the 3D cannabis marketplace at The Farmstand.

"It Comes Out Of The Gate Very Fast": Disclosure Day Is An Action Movie

Matt Macosko — Mon, 20 Apr 2026 08:03:25 +0000

The moment Universal released the December 2025 teaser — wide Kansas sky, a meteorologist tilting her head, one note of John Williams score — the internet settled on an idea of what Disclosure Day was going to be. Slow. Sparse. Grown-up Spielberg. The Close Encounters of 2026. A film where the camera dwells on faces looking up, and we watch the sky go strange.

That idea was half right. Per Empire's exclusive for the June 2026 issue, Spielberg has other plans for the first 30 minutes.

"This movie comes out of the gate very fast. People who are expecting another slow-burn first act — this is not that movie."— Steven Spielberg to Empire

What We Know About the Opening

Based on the CinemaCon footage, the Super Bowl trailer, and the Empire cover package, the opening stretch of Disclosure Day includes:

A cold open in medias res. The first image, per reporters who saw the CinemaCon reel, is not a Kansas cornfield. It's a door being kicked in.

Josh O'Connor's fugitive run. Daniel Kellner already has the disclosure file when we meet him. He is not discovering anything in act one. He is running with it. This is a huge structural shift from how contact films usually work — the secret is already out, and the movie is about containment.

A car-chase-onto-a-train sequence. Confirmed by IMDb trivia and hinted at by O'Connor himself ("a car chase that is going to melt people"). The action staging is reportedly why Janusz Kamiński's second unit was in New Jersey for eleven weeks.

The Kansas City weather broadcast. The "click" sequence with Emily Blunt — previously assumed to be the film's quiet centerpiece — is actually in the first act. It's the inciting event, not the climax.

Why Spielberg Pivoted

David Koepp's prior Spielberg collaborations — Jurassic Park, War of the Worlds, Indiana Jones and the Crystal Skull — are all structured around the escalation of chase and threat. Koepp is not a meditative writer. He is a propulsion writer.

Spielberg telling Empire that the audience expectations have "caught up" to where the culture is means something specific: in 2026, the public already knows there are congressional UAP hearings happening. They already know Grusch testified. The movie does not need to spend 45 minutes establishing that something strange is going on in the sky. The audience is already there. So Spielberg is skipping that act and starting with the consequences.

The Close Encounters Comparison Breaks

If Close Encounters of the Third Kind spent half its runtime building to the Devils Tower meeting, Disclosure Day inverts it. The contact is the premise, not the ending. The film is about what happens to Margaret Fairchild, Daniel Kellner, Noah Scanlon, and a handful of other ordinary people once the signal has arrived and the cover has failed.

That is why Blunt's quote about "questions posed by Close Encounters" being "answered" works. Disclosure Day doesn't repeat the 1977 film's arc. It picks up where that film ended — and runs.

What It Means for the Box Office

Universal's tracking reportedly pushed for a more actioned-up back half of the marketing campaign after CinemaCon. Expect the next trailer — which Variety says is locked for early May — to lead with O'Connor running, cars flipping, and Firth's Wardex team closing in. The "look up at the sky" imagery isn't going away. It's just no longer the only mode. Disclosure Day is a summer action movie with a philosophical third act, not the other way around.

Sources

Empire — Disclosure Day Is An Action Movie That Comes Out Of The Gate Very Fast

Art Threat — Action-Packed Sci-Fi Thriller

Gizmodo — Mysterious Main Characters

Disclosure Day opens in theaters and IMAX on June 12, 2026.

Originally published at Disclosure Day Hub — the fan-built resource tracking Steven Spielberg's UFO film (June 12, 2026). Explore the full news hub, cast guide, and interview archive.

Inside Empire's Disclosure Day Cover Story: Every Quote That Matters

Matt Macosko — Mon, 20 Apr 2026 08:03:23 +0000

Empire's June 2026 issue hits newsstands this week with Disclosure Day on the cover and a feature package inside that does something Universal's marketing has not so far: it tells us who the characters actually are, what the film feels like, and why everyone involved keeps using the word "reckoning."

The headlines from the piece are already everywhere — Emily Blunt saying the film answers Close Encounters questions, Spielberg calling it an action movie that "comes out of the gate very fast." But the full interview set is richer than the pull-quotes, and lays out the strongest picture we have of the film two months out.

Steven Spielberg

Director

"I've been waiting a long time to tell a story where the visitors don't come as the answer to our loneliness — they come as the answer to a question we're finally ready to ask."On the film's central idea

"I always said I guarantee life in the universe. What I couldn't do in 1977, and what I can do now, is show what happens when that guarantee stops being abstract."On 50 years since Close Encounters

"David [Koepp] and I rewrote the third act three times during production. Every time a congressional hearing happened, we adjusted. The world kept getting closer to the movie."On the real-world UAP conversation

Emily Blunt — Margaret Fairchild

Emily Blunt

Meteorologist / Conduit

"There are definitely questions posed by Close Encounters that are answered in Disclosure Day."The quote that launched a thousand Reddit threads

"Margaret is a local Kansas City weather anchor. She is extremely ordinary until extremely un-ordinary things start happening through her. That's the movie."On her character

"The broadcast scene took five days to shoot. Steven wouldn't tell me what sound I was going to make until the morning of. I had to show up and let my body become it."On the now-famous "click" sequence

Josh O'Connor — Daniel Kellner

Josh O'Connor

Cybersecurity expert / Whistleblower

"Daniel works for the agency that has been keeping the secret. He's young, he's a little arrogant, he's been told he can handle it, and then he handles it and realizes the people he works for never should have been handling it."On his character's arc

"It's old-school Spielberg. I say this to everyone. It feels like the movies that made me want to act. There is a car chase in this film that is going to melt people."On the film's tone

"Colin plays my boss. So you know going in one of us isn't making it out."On working with Colin Firth

Colin Firth — Noah Scanlon

Colin Firth

CEO of Wardex

"Noah is not a villain. He is a man who was handed a file in 1987 and told he was now responsible for the most consequential secret in human history. What he does with it over forty years — that is the film."On his character

"The chair scene that everyone has been talking about from the first-look images — I will not tell you what it is. I will tell you that it is not what you think."On the 'mind control device' image

"You don't say no to Steven. You just don't. He called me on a Sunday, explained the film in fifteen minutes, and I agreed on the call."On being cast

Colman Domingo

[Role kept under wraps by Universal]

"I bawled reading the script. I bawled again on set. I will probably bawl in the theater. This is a movie about humanity being seen."On his emotional response

"My character walks into the third act and the floor drops. That's all I can say. Steven made me promise."On third-act secrecy

Eve Hewson — Jane Blankenship

Eve Hewson

Daniel's girlfriend

"Jane is the audience. She loves Daniel and she has no idea what he's carrying. When she finds out, she has to decide if the world is worth saving — or if she just wants him safe. That choice is the heart of the film."On her character

David Koepp — Screenwriter

David Koepp

Writer (Jurassic Park, War of the Worlds)

"Steven told me he wanted a thriller that worked on its own — you don't need to know anything about Close Encounters, UAPs, or the AARO reports to follow it. But if you do, there's a second film playing underneath the first one."On the film's dual-layer design

"I keep getting asked if Disclosure Day is a Close Encounters sequel. My honest answer: not officially. But the same gate is open in both films."On the Close Encounters connection

The Takeaway

The Empire cover story accomplishes what six months of trailer drops couldn't: it makes the film feel like a piece of writing rather than a piece of hype. Every cast member circles the same word — "reckoning" — and every one of them declines to say what the third act actually is. Between Blunt's Close Encounters line, Firth's "not what you think" on the chair scene, and Koepp's "same gate is open in both films," the reading public now has enough to triangulate, and not enough to spoil.

That's exactly where Universal wants the conversation sitting on April 20, 2026. Eight weeks to release.

Sources

Empire — Emily Blunt: Disclosure Day Answers Close Encounters Questions

Empire — An Action Movie That Comes Out of the Gate Very Fast

Gizmodo — Who the Mysterious Main Characters Are

Space.com — Everything We Know

Disclosure Day opens in theaters and IMAX on June 12, 2026.

Originally published at Disclosure Day Hub — the fan-built resource tracking Steven Spielberg's UFO film (June 12, 2026). Explore the full news hub, cast guide, and interview archive.

Spielberg Just Pitched a UFO Theory That Rewrites the Whole Movie: They're Us. From the Future.

Matt Macosko — Mon, 20 Apr 2026 08:02:50 +0000

If you were reading the CinemaCon writeups for the alien reveal, the standing ovation, or the "more truth than fiction" quote, you might have missed it. Tucked into Colman Domingo's back-and-forth with Spielberg on the Caesars stage was a short, almost offhand remark that has since detonated across UFO Twitter, r/UFOs, and the Nimitz-incident podcast circuit.

Per reporters in the room (Gold Derby, Bleeding Cool), Spielberg laid out what he described as a "hopeful" theory: that the unexplained phenomena showing up in Navy gun-cam footage, in Peruvian skies, off the coast of Catalina — all of it — aren't visitors from another star system. They're us. Traveling back in time.

"The hopeful theory is that what people are calling UAPs are actually humans, further down the timeline, coming back to visit the past. Think about what that means. We made it. We're still here."

The Theory, In Plain Language

The "future humans" hypothesis has been floating around UFO research circles for a while (see Dr. Michael Masters' 2019 book Identified Flying Objects), but it has never really broken into mainstream coverage. The idea:

Why UAPs Might Be Time Travelers, Not Aliens

UAP occupants consistently described as humanoid — two arms, two legs, bilateral symmetry. Unusual for evolutionary convergence across star systems, trivial for our own descendants.
UAPs don't announce themselves. They observe. That fits better with anthropologists studying a culture than with an expeditionary force.
Classic UAP behavior — appearing near nuclear sites, population centers, historical inflection points — maps cleanly onto "historians visiting the turning points of their own past."
If they're us-from-the-future, the non-interference pattern isn't inexplicable. It's the temporal-mechanics equivalent of not stepping on your own grandfather.

Why This Reframes Disclosure Day

Emily Blunt told Empire for the June 2026 issue that "there are definitely questions posed by Close Encounters that are answered in Disclosure Day." If you assume the visitors in the 1977 film were extraterrestrials, the statement is impossible — different films, different stories. But if the visitors in Close Encounters are us, and Disclosure Day is the movie where that finally gets confirmed, then the whole cross-film continuity works.

Consider the leaked details we already have:

Emily Blunt plays a meteorologist who becomes a "conduit" — speaking in unearthly clicks live on air. If the visitors are human, the clicks aren't an alien language. They're compressed information from a human-descended protocol.
Josh O'Connor plays a whistleblower running from Wardex, the government contractor. Cover-up makes sense if what's being covered isn't "aliens exist" but "time travel is real."
Colin Firth, head of Wardex, is seen in the Empire first-look strapped to a mind-control device. Could easily be a temporal-communications rig.

The Spielberg Pattern

Spielberg has been leaning on "hopeful" for years when asked about aliens — contrasted against the "they come to destroy us" posture of Independence Day or War of the Worlds (his own version notwithstanding). In 1977 the visitors brought the Roy Neary pilots home. In 1982 E.T. just wanted a ride.

A future-humans resolution is the most Spielberg possible landing for this movie: the aliens aren't the other. They're the future version of the audience. The third act isn't contact. It's a reunion.

Should We Take It Seriously?

Maybe. Maybe not. Spielberg is a showman and this is a marketing cycle. He floated the theory with a grin. But he also specifically said he has been protecting the third act from leaks — and then chose, on his first-ever CinemaCon stage, to hand the internet a theory that maps cleanly onto the third act of his film. That's a very expensive way to be random.

If "more truth than fiction" is the marketing line, then "it's us from the future" might be the plot.

Sources

Gold Derby — New Footage and Time Travel Theory

Bleeding Cool — Disclosure Day Will Answer Questions

Yahoo/Variety — CinemaCon 2026 Recap

Disclosure Day opens in theaters and IMAX on June 12, 2026.

Originally published at Disclosure Day Hub — the fan-built resource tracking Steven Spielberg's UFO film (June 12, 2026). Explore the full news hub, cast guide, and interview archive.

"More Truth Than Fiction": The Spielberg CinemaCon Quote Everyone's Trying to Decode

Matt Macosko — Mon, 20 Apr 2026 08:02:48 +0000

Spielberg has been asked about UFOs in public for five decades. He has been careful, charming, and elusive every time. He'd smile, reference the "guarantee" of life in the universe, and leave the door slightly ajar. What he did not do — until April 15, 2026 — was tell a room full of movie-industry professionals that a fictional Universal tentpole starring Emily Blunt was closer to a documentary than a screenplay.

That's the line he crossed at Caesars Palace.

What He Actually Said

The exact phrasing, captured by multiple outlets in the room: "I've been curious ever since I was a little kid about what's happening in the night sky, what's happening in the sky during the daytime. What I can tell you is that there is more truth than fiction in this film."

He said it calmly. He didn't walk it back. He pivoted into the footage reel, and the room audibly inhaled.

Why This Is a Bigger Deal Than It Sounds

Close Encounters (1977) had real UFO researcher J. Allen Hynek as a technical advisor. War of the Worlds (2005) leaned into post-9/11 anxiety. E.T. was openly autobiographical. But across every interview cycle for every one of those films, Spielberg has stopped short of saying "this is what I actually believe is going on."

At SXSW in March 2026 he got closer. He said he has "a very strong suspicion that we are not alone here on Earth right now — and I made a movie about that." That was already the most direct thing he'd ever said.

"More truth than fiction" is a step past suspicion. It is a production-level claim about the content of the film.

What the Cast Has Quietly Been Saying

Look at the quotes the cast has given over the last six months and the CinemaCon line stops looking like showmanship:

"There are definitely questions posed by Close Encounters that are answered in Disclosure Day."— Emily Blunt, Empire June 2026 issue

"I finished reading the script and I bawled. I thought it was one of the most beautiful scripts about our humanity."— Colman Domingo, Entertainment Weekly, late 2025

"Ethan believes the public deserves to know. But 'the truth' is more complicated than he realizes."— Josh O'Connor, GQ interview

Separately those are normal press quotes. Laid on top of "more truth than fiction," they read differently. Koepp's screenplay — per Spielberg's framing — isn't a speculative fiction. It's a dramatization of something the director believes is real, structured as a thriller so people will actually watch it.

The Congressional Hearing Context

Spielberg didn't make this film in a vacuum. The 2023 Grusch whistleblower testimony, the ongoing AARO reports, the 2024 Schumer UAP amendment — the real-world UAP conversation has moved from Coast to Coast AM to the Senate Intelligence Committee in under five years. David Koepp has said in multiple interviews that the script was rewritten three times during production specifically to keep pace with what was being declassified in real time.

That's what Spielberg is pointing at when he says "truth." Not little green men. The fact that something real is being disclosed, slowly, by the U.S. government — and that Disclosure Day dramatizes a version of what happens when "slowly" stops.

The CinemaCon Strategy

Universal didn't stop Spielberg from saying it. That's the other tell. Studio comms teams rehearse these appearances for weeks. The "more truth than fiction" line is either a bombshell leak or, far more likely, the exact line the studio wanted him to deliver to the theater owners at the start of the final 60-day sell. It's a marketing claim disguised as a confession — and it's going to print on every trade outlet between now and June 12.

Sources

"More Truth Than Fiction" — CinemaCon coverage (wire)

THR — Spielberg: "I Have a Strong Suspicion We're Not Alone" (SXSW)

Empire — Emily Blunt on Close Encounters Answers

Disclosure Day opens in theaters and IMAX on June 12, 2026.

Originally published at Disclosure Day Hub — the fan-built resource tracking Steven Spielberg's UFO film (June 12, 2026). Explore the full news hub, cast guide, and interview archive.

Spielberg Walks the CinemaCon Stage for the First Time — and Shows Us the Alien

Matt Macosko — Mon, 20 Apr 2026 07:51:28 +0000

For 40+ years of CinemaCon — the annual Las Vegas confab where studios sell the summer to theater owners — Steven Spielberg has never personally taken the stage. Other filmmakers went. He sent the movies. On the evening of April 15, 2026, at Caesars Palace, that finally changed.

Cast member Colman Domingo walked out first. He introduced his director to the room. The ovation lasted long enough to be noticed by every trade reporter in the house. Then Spielberg sat down, and Universal rolled the longest sustained look at Disclosure Day that anyone outside the cutting room has seen.

What the Footage Showed

According to reporters in the room from The Hollywood Reporter, Deadline, and Variety, the reel was built around a first-look clip of the alien itself. No full-body reveal — Spielberg made a point about protecting the third act — but enough to land.

Universal followed the clip with a montage that felt much more action-forward than the Super Bowl trailer. Emily Blunt's Margaret Fairchild, Kansas City meteorologist, on set in front of the weather green screen. Josh O'Connor's Daniel Kellner on the run. Colin Firth's Noah Scanlon — head of the government contractor Wardex — giving the kind of cold-eyed briefing that every Spielberg movie eventually needs a villain to give.

The Third-Act Rule

During the conversation with Domingo, Spielberg kept coming back to one thing: he does not want the last forty minutes of this movie on the internet before June 12.

"There are surprises in the third act that I would like audiences to experience with the lights down and their phones off. That's the only ask I have."— Steven Spielberg, CinemaCon 2026

This is the same director who 48 years ago sent Close Encounters of the Third Kind out without so much as a photograph of the mothership in the press kit. He is clearly applying the same playbook here.

What Was New At CinemaCon

First look at the alien (partial — no full reveal)
Confirmation the film "comes out of the gate very fast" — an action opening, not a slow build
Spielberg's time-travel theory about UAPs (see our breakdown)
Spielberg telling the room the film has "more truth than fiction" in it
Spielberg calling on Hollywood to stop making sequels and make original movies instead

"Make Original Movies"

Before the footage rolled, Spielberg used the platform for something none of the tracking pieces predicted: a direct speech to the theater owners about originality. Per Variety's report, he told the room that if Hollywood keeps stacking the calendar with sequels and the same Marvel title "over and over and over again," then films like Disclosure Day — original stories with original stars — become the rare chance for audiences to experience "something which is precious."

Coming from the director who essentially invented the modern blockbuster, the message landed. Multiple outlets described the theater-owner response as the loudest applause of the day.

Why This Matters for the Campaign

Before April 15, Universal's marketing had been all tease: the Super Bowl spot in February, the Colin Firth mind-control still in January, the Emily Blunt possession voice leak. CinemaCon flipped the tone. Spielberg went on the record. The star showed up. The alien got a screen. And a standing ovation from the people who actually sell the tickets gave the film the kind of trade-press tailwind no paid marketing can buy.

Universal has scheduled the next marketing beat — the final trailer — for early May. Presale tickets follow shortly after. By the time Disclosure Day opens on June 12, 2026, the alien we glimpsed at Caesars will be the most-discussed creature design of the year.

Sources

The Hollywood Reporter — Spielberg Debuts First Look at Movie's Alien

Deadline — Spielberg Unveils New Footage at CinemaCon

Variety — Spielberg Tells Hollywood to Make Original Movies

Bleeding Cool — Disclosure Day Will Answer Questions, Pose Questions

Disclosure Day opens in theaters and IMAX on June 12, 2026.

Originally published at Disclosure Day Hub — the fan-built resource tracking Steven Spielberg's UFO film (June 12, 2026). Explore the full news hub, cast guide, and interview archive.

DEV Community: Matt Macosko

The IRS Said Yes — and What That Does and Doesn’t Mean

What it doesn’t mean

About the timeline, honestly

What it actually changes

Where this goes

M5 Max + 128GB = a 30B AI Coding Agent Running Locally. Wi-Fi Off.

Why the M5 Max changes the math

What you can do with it now

The full stack

The performance, honestly

Why the offline part matters

Benchmarks — actually run, not cited

For context — what the bigger sibling scores on harder benchmarks

Where this is heading

Try it yourself

4 AI Models Built the Same Game on One Laptop — and a Local One Beat the Cloud

The setup

What happened

The verdict

Run local models yourself (free)

Eight local AI agents on a Mac mini — and the product I'm building from them

The asset base

The gap that ate me for a month

The market gap, in numbers

The ladder

The three core agents

What's in the AirGap Box

Validation before scale

Honest year-1 range

What I will publish honestly

Links

Why I'm posting this

HumanEval on a MacBook — 81.7% pass@1, Wi-Fi off

Why this number matters

Methodology

For context — Qwen3-Coder 480B's official agentic benchmarks

Why the offline part matters

Reproduce it yourself

Free AI on a MacBook vs $100-a-Month Claude Code — Hexagon Shootout

The setup — same prompt, three contestants

The results

Three things that surprised me

1. Bigger isn't better when "bigger" is a generalist

2. Claude Code's harness chokes local models

3. Circle-approximation collision is the cheat code

Who should care

How to run it yourself

TL;DR

"It Comes Out Of The Gate Very Fast": Disclosure Day Is An Action Movie

What We Know About the Opening

Why Spielberg Pivoted

The Close Encounters Comparison Breaks

What It Means for the Box Office

Sources

Inside Empire's Disclosure Day Cover Story: Every Quote That Matters

Steven Spielberg

Steven Spielberg

Emily Blunt — Margaret Fairchild

Emily Blunt

Josh O'Connor — Daniel Kellner

Josh O'Connor

Colin Firth — Noah Scanlon

Colin Firth

Colman Domingo

Colman Domingo

Eve Hewson — Jane Blankenship

Eve Hewson

David Koepp — Screenwriter

David Koepp

The Takeaway

Sources

Spielberg Just Pitched a UFO Theory That Rewrites the Whole Movie: They're Us. From the Future.

The Theory, In Plain Language

Why UAPs Might Be Time Travelers, Not Aliens

Why This Reframes Disclosure Day

The Spielberg Pattern

Should We Take It Seriously?

Sources

"More Truth Than Fiction": The Spielberg CinemaCon Quote Everyone's Trying to Decode