Source: This article summarizes the ~103-minute video by @DeRonin_. All credit goes to the original creator.
Introduction
The OpenAI Codex desktop app is an AI agent platform that goes far beyond coding assistance: it handles design, document creation, research, and automation all in one place. This article is a Japanese-to-English summary of the complete ~103-minute guide video.
Key Features of the Codex Desktop App
Here are the features introduced in the video.
Project Management & File Organization
Codex manages chats in "project" units, each linked 1-to-1 with a local folder on your computer. Files created by the agent are automatically saved inside an outputs/ subfolder of the project folder, and any file in the folder can be referenced with @filename. You can open the folder in Finder at any time from the side panel.
Parallel Multi-tasking
Multiple chat threads can run simultaneously. While one agent is executing, you can start new tasks in separate chats. A blue dot notification appears when a task completes.
Skills & Plugins
Skills are "reusable recipes." Plugins are installable packages that bring those recipes into Codex. Hundreds of pre-built plugins exist (Google Calendar, Gmail, Figma, Remotion, etc.), and you can create custom skills by combining external APIs with the skill creator keyword. Skills are invoked with /skill-name or @skill-name in subsequent sessions.
Automations
Set recurring tasks with plain language: "Every Friday at 4pm, summarize this week's calendar and email it to me." The Automations tab shows a list, allows test runs, and supports natural-language edits.
Computer Control
The agent literally controls your mouse and keyboard. It can operate any GUI app — including Xcode builds and browser interactions — even without an API.
In-App Image Generation
Generate images from prompts and use them directly in your workflow. The video demos generating product photos (a futuristic shoe brand) and 10 iOS app icon options, with transparent backgrounds.
Steer (Real-Time Steering)
While the agent is running, you can paste text or a screenshot and redirect it immediately using the "Steer" button. Unlike normal queuing, this injects your prompt mid-execution.
Terminal Integration (Claude Code)
For design-heavy tasks, launch Claude Code from the integrated terminal:
claude --dangerously-skip-permissions
The video shows switching to Claude Code when Codex's design quality hits its limits — landing pages and slide decks came out significantly better.
Canva Export
Click the Canva icon next to any generated PowerPoint file to open it directly in Canva for final polish.
Skills vs. Plugins — The Difference
The video used an Excalidraw skill to auto-generate this diagram:
| | Skill | Plugin |
|---|---|---|
| Definition | Reusable workflow package for a specific task | Installable unit that adds capabilities to Codex |
| Role | A recipe that tells Codex how to execute a workflow | Bundles skills, apps, MCP Servers, and integrations |
| Purpose | Reliable execution recipe | Provides access to connected systems and tools |
Note: A simple way to remember it: Skill = reusable recipe. Plugin = installable package that brings the recipe into Codex.
Design Tool Integration (Paper / Figma)
Codex integrates with Paper (Alpha), a Figma-like design tool built specifically for AI agent workflows.
Demo flow:
- "Using the Noo Shoo logo image, build a landing page directly on this Paper board"
- Codex calls Paper MCP, selects the transparent hero image
- Auto-decides design direction: editorial-tech, warm near-black neutral, cyan accent
- Builds 4 sections automatically: Hero, Performance Strip, Product Story, CTA/Footer
Note: Paper is designed specifically for AI agent collaboration. It's more intuitive than direct Figma editing for generative design tasks.
Automations in Practice
Creating an automation is as simple as typing "do X every Y." The video demos two:
Weekly Calendar Summary
After connecting Google Calendar and Gmail plugins: "Every Friday at 4pm, summarize this week's events and email them to me." That's it — the automation is registered and the next scheduled run is visible immediately.
Monthly YouTube Report
After building a YouTube Researcher skill (SuperData API): "On the last day of every month, analyze that month's videos and generate a Word doc with hook analysis and a view-count table." Delivered automatically every month.
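A schedule such as "the last day of every month" cannot be expressed as a fixed day number, because months differ in length. How Codex resolves the phrase internally is not shown in the video; the following is only a minimal Python sketch of the date logic, and both function names are illustrative.

```python
import calendar
from datetime import date


def last_day_of_month(year: int, month: int) -> date:
    """Return the final calendar day of the given month."""
    # monthrange returns (weekday of the 1st, number of days in the month).
    days_in_month = calendar.monthrange(year, month)[1]
    return date(year, month, days_in_month)


def is_last_day_of_month(today: date) -> bool:
    """True when `today` is the day a month-end task should fire."""
    return today == last_day_of_month(today.year, today.month)
```

A daily check such as `is_last_day_of_month(date.today())` would then trigger the report only at month end, including February 29 in leap years.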
Part 2 Highlight: Building 6 Projects in Parallel
In the second half of the video, six projects for the Chorus app (an AI agent learning platform) were built simultaneously.
| Task | Tools Used |
|---|---|
| iOS App (design + implementation) | Swift · Xcode · Supabase · Mobile Design Skill |
| Web Landing Page | Tally · React · Claude Code · Vercel |
| Launch Video | Remotion Plugin · Claude Code |
| Investor Deck | PowerPoint Skill · Claude Code · Canva |
| X Post Automation | Typefully Skill |
| Project Plan | Markdown (checklist) |
The key insight: once you send instructions to an agent, don't wait; move to the next task immediately. This "serial tasking" pattern (issuing instructions one after another while the agents themselves run in parallel) is what makes AI-era multitasking work.
Detailed Video Guide
Part 1 — Mastering the Basics
Downloading & Project Management
Search "Codex app download" in your browser and download from chatgpt.com. The first-time interface looks like ChatGPT, but the depth is entirely different.
Codex's biggest feature is project management tied to local folders. Before starting a chat, you specify which folder to work in. That folder becomes the "project," and all agent-created files are auto-saved to its outputs/ subfolder.
From the side panel you can open the folder in Finder or reference files inside with @filename. Even with 30+ projects, Command+G lets you search by chat name or content instantly.
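To make the folder layout concrete, here is a small Python sketch that looks up an "@filename" mention inside a project folder, including its outputs/ subfolder. It only illustrates the layout described above; it is not how Codex resolves mentions, and `resolve_mention` is a hypothetical helper.

```python
import tempfile
from pathlib import Path
from typing import Optional


def resolve_mention(project_dir: Path, mention: str) -> Optional[Path]:
    """Find the file an "@filename" mention points to inside a project folder."""
    name = mention[1:] if mention.startswith("@") else mention
    # Search the whole project tree, which includes the outputs/ subfolder.
    for path in sorted(project_dir.rglob("*")):
        if path.is_file() and path.name == name:
            return path
    return None


# Demo on a throwaway folder that mimics the layout described above.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    (project / "outputs").mkdir()
    (project / "outputs" / "report.md").write_text("# Weekly report\n")
    found = resolve_mention(project, "@report.md")
    print(found.relative_to(project))
```

The sketch returns the first match in sorted order, so two files with the same name in different subfolders would need a fuller path to tell apart.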
Set permissions to "Full Access" so the agent works without approval prompts. The video recommends the GPT-5.4 model and the Extra High effort level as defaults.
How to Use Skills and Plugins
The core distinction:
Skills are "recipes" — instructions telling the agent how to execute a specific task. Plugins are installable packages that bring those recipes into Codex. Think of a plugin as "the container for a skill."
To learn what a new plugin can do, open a new chat and ask: "@Figma tell me everything you can do with this plugin." Click the caret (▼) in the response to see the agent's reasoning.
Hands-On: Automating with Google Calendar + Gmail
Installing the Google Calendar plugin takes seconds: Plugins → Google Calendar → sign in via browser.
Once connected, this entire workflow happens in one conversation:
- "List this week's events for me" → Full calendar response
- "Email me a weekly summary" → Sent via Gmail immediately
- "Make this an automation for every Friday at 4pm" → Weekly task registered
The Automations tab shows next run time, status, and a test button. You can edit automations later with natural language.
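The "next run time" for a schedule like "every Friday at 4pm" is simple date arithmetic. The sketch below shows one way to compute it in Python; Codex's own scheduler is not documented in the video, so the function name and its defaults are assumptions made for illustration.

```python
from datetime import datetime, timedelta

FRIDAY = 4  # datetime.weekday(): Monday is 0, so Friday is 4


def next_weekly_run(now: datetime, weekday: int = FRIDAY, hour: int = 16) -> datetime:
    """Return the first `weekday` at `hour`:00 that falls strictly after `now`."""
    candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    # Move forward to the target weekday (0 days if today is already that day).
    candidate += timedelta(days=(weekday - now.weekday()) % 7)
    # If that moment has already passed (or is right now), use next week's slot.
    if candidate <= now:
        candidate += timedelta(days=7)
    return candidate
```

For example, called on Wednesday, April 1, 2026 at 09:00, it returns Friday, April 3, 2026 at 16:00; called on that Friday at 17:00, it returns April 10.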
Generating Designs with Figma and Paper MCP
The Figma plugin's main use case is converting an existing Figma board into code. It's less suited to having the AI generate designs and place them into Figma.
That's where Paper (Alpha) comes in:
"Using the new shoe PNG (no background), create a landing page on the open Paper board"
↓
Codex calls Paper MCP, decides design direction
↓
Auto-builds: Hero · Performance Strip · Product Story · CTA
Right-click a chat and select "Open in mini window" to float the chat sidebar while working in Paper.
Codex also has a Steer feature. Normally, typing a prompt while the agent is running queues it for later. Clicking "Steer" injects your message immediately — great for pasting a screenshot and saying "this button is overlapping, fix it while you work."
Building a Custom Skill: YouTube Researcher
Step 1: Find an API
Ask Codex: "Suggest the top 5 APIs for pulling YouTube transcripts." SuperData, Transcript API, YouTube Transcript.io, and others are returned. SuperData offers 100 free requests/month.
Step 2: Create the skill
In a new chat:
Use skill creator to build a skill that uses the SuperData API
to fetch and summarize the latest 10 videos from any YouTube channel.
API key: [paste here]
The skill creator keyword activates Codex's skill-building mode.
Step 3: Use the skill
Open a new chat and type YouTube Researcher:
Look up Riley Brown's latest 10 YouTube videos,
pull the transcripts, and create a document.
Include which videos performed well, a hook analysis, and thumbnails.
The resulting report showed "Claude is taking over (high urgency, large market shift)" and "Claude Code Leak" as top-performing hooks. Vibe coding content underperformed. Then this was automated: "On the last day of every month, run this skill and create a Word report."
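The view-count table in such a report amounts to sorting videos by views and rendering the rows. The Python sketch below assumes the skill has already fetched titles and view counts; the `Video` type and `views_table` function are hypothetical, and no real channel data is used.

```python
from typing import List, NamedTuple


class Video(NamedTuple):
    title: str
    views: int


def views_table(videos: List[Video]) -> str:
    """Render a Markdown table of videos, highest view count first."""
    ranked = sorted(videos, key=lambda video: video.views, reverse=True)
    lines = ["| Rank | Title | Views |", "|---|---|---|"]
    for rank, video in enumerate(ranked, start=1):
        # The ":," format spec adds thousands separators, e.g. 2500 becomes 2,500.
        lines.append(f"| {rank} | {video.title} | {video.views:,} |")
    return "\n".join(lines)
```

Calling `views_table([Video("Placeholder A", 10), Video("Placeholder B", 2500)])` lists "Placeholder B" first; the hook analysis itself is left to the agent.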
Part 2 — Building 6 Projects Simultaneously
Note: The following demos use "Chorus" (an AI agent learning iOS app) as the subject. The core idea: send instructions, then move to the next task without waiting.
Planning the Project
Create a "My New Business" folder and start a new project. Create the plan as a Markdown file:
[Attach screenshot]
"Create a checklist-style plan from this.
Items: iOS app · Web landing page · Mobile app design ·
Launch video · Investor deck · X post automation.
Include the app idea at the top."
Chorus concept: A platform for learning about AI agents. Compare agent tools, access a copy-paste skill library, and learn fundamentals — all in an iOS app.
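The plan file requested above is plain Markdown with one checkbox per deliverable. As a rough sketch of that structure (the exact file Codex produced is not shown in the video, and both function names are illustrative), it could be generated and updated in Python like this:

```python
from typing import Iterable


def build_plan(idea: str, items: Iterable[str]) -> str:
    """Render a Markdown plan: the app idea first, then one unchecked box per item."""
    lines = ["# Project plan", "", idea, ""]
    lines.extend(f"- [ ] {item}" for item in items)
    return "\n".join(lines) + "\n"


def check_off(plan: str, item: str) -> str:
    """Mark one item as done by switching its box from [ ] to [x]."""
    # Matching the trailing newline avoids touching items that share a prefix.
    return plan.replace(f"- [ ] {item}\n", f"- [x] {item}\n", 1)


plan = build_plan(
    "Chorus: an iOS app for learning about AI agents.",
    ["iOS app", "Web landing page", "Mobile app design",
     "Launch video", "Investor deck", "X post automation"],
)
print(check_off(plan, "iOS app"))
```

Keeping the plan as a checklist gives every parallel chat the same short file to read and update.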
iOS App: From Design to TestFlight
Screen Design Skill
Export the workflow from claude.ai/design, paste it into Codex, and say "create a mobile design skill that does the same thing." That's all it takes to build the custom skill.
"Use the mobile design skill to create Chorus app screens
in basic Apple style"
A prototype link appears showing Learn · Platforms · Skills · Saved in a 4-tab layout.
Xcode Build
"Create a new Swift mobile app project called Chorus.
For now, just display 'Hello, this is Chorus' in the center.
Open the Xcode project when done."
Press Play in Xcode to run each build in the iOS Simulator (or on a physical device). After integrating the screen designs, connect Supabase.
Supabase + Authentication
Supabase is the de facto database for AI-agent-built apps. After configuring the Supabase MCP server, restart Codex so the connection registers. Then prompt: "If connected, create all tables." Tables for skill categories, platforms, skills, and saved items are generated automatically.
Authentication was implemented with email + password (Google sign-in was initially tried, but Supabase's native email auth was the fastest path). Disable email confirmation in Supabase for instant login. The app eventually shipped to TestFlight.
Web Landing Page: Tally + React + Vercel
Form setup (tally.so)
Create a waitlist form with name and email fields, then copy the embed code.
Run as React app
"I'm using tally.so. Embed this form in the site and
run it locally as a React app. We'll design it later."
Styling with Claude Code
Codex struggles with design, so bring in Claude Code from the terminal:
claude --dangerously-skip-permissions
"Forget the current page styling completely.
Look at the Chorus app code and match its font and design.
Keep the Tally embed. Minimal text, simple, conversion-focused."
Claude Code improves it dramatically in minutes. Then: "Deploy to Vercel and give me the public link."
Launch Video: Remotion Plugin
Install the Remotion plugin and type @remotion in a new chat.
"Create a launch video for Chorus app.
Start with a test: take app screenshots (attached),
put them in iPhone mockups on a white background, and animate them.
Run it locally."
localhost:3031 opens a timeline editor. Specify timing as seconds.frames (e.g., 2.20 = 2 seconds, 20 frames).
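The seconds.frames notation maps to an absolute frame number once the frame rate is known. The frame rate is not stated in this summary, so the Python sketch below assumes 30 fps; `timecode_to_frames` is a hypothetical helper, not part of Remotion.

```python
def timecode_to_frames(timecode: str, fps: int = 30) -> int:
    """Convert a "seconds.frames" string such as "2.20" into a frame index."""
    # Parse as text: the float 2.20 equals 2.2, which would lose the frame count.
    seconds_text, _, frames_text = timecode.partition(".")
    seconds = int(seconds_text)
    frames = int(frames_text) if frames_text else 0
    if not 0 <= frames < fps:
        raise ValueError(f"frame part must be between 0 and {fps - 1}: {timecode!r}")
    return seconds * fps + frames


print(timecode_to_frames("2.20"))  # 2 * 30 + 20 = 80
```

Passing the value as a string matters here: "2.20" (20 frames) and "2.2" (2 frames) are different timecodes, but they are the same number once parsed as a float.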
Use Steer to adjust in real time. Turn on grid lines and give the agent exact coordinates (e.g., "X:1040, Y:540") for precise placement.
Hand off design-heavy sections (animation quality, color cards, cut timing) to Claude Code — the results are noticeably better. To add BGM, attach an MP3 and say "add this song at 50% volume."
Investor Deck: Chat Fork + Canva Export
Fork the chat
Right-click the mobile app chat → "Fork into Local." This creates a new chat with the same context. Rename it "Investor Deck."
"Analyze the app features, icons, and style, then
create a matching investor slide deck.
Use the PowerPoint skill.
Research what investors look for in April 2026 and match that style."
Refine with Claude Code
claude --dangerously-skip-permissions
"Review this deck and reduce text, add more visuals.
Add charts and diagrams to improve readability. Don't add slides."
Export to Canva
Click the Canva icon next to the PowerPoint file. It opens instantly in Canva for final 5–10% polish — add animations, adjust colors.
X Post Automation: Typefully Skill
Get a Typefully API key (Settings → API → Create new key), then:
"Search the Typefully API docs and build a skill that gives me
full control. Test it on the Riley Brown account
(use fruit emojis so I know which are yours).
API key: [paste here]"
Then automate it:
"Set up an automation to create 3 X post drafts every morning.
Use the Typefully control skill."
Final Results
| Task | Outcome |
|---|---|
| iOS App | Published to TestFlight (Learn · Platforms · Skills · Saved) |
| Web Landing Page | Live on Vercel · Tally form confirmed working |
| Launch Video | First draft complete (Remotion + Claude Code) |
| Investor Deck | Exported to Canva · Final polish done |
| X Post Automation | 3 daily drafts scheduled |
| Project Plan | All 6 items checked off |
Note: AI agents can take 1–2 hours on complex tasks. Instead of waiting, the key is to keep sending new instructions to new agents and move on. That's the core productivity skill of the AI era.



