DEV Community

naoki_JPN

The Complete Guide to OpenAI Codex Desktop: Skills, Plugins, Automations & Parallel Multitasking

Source: This article summarizes the ~103-minute video by @DeRonin_. All credit goes to the original creator.

Introduction

The OpenAI Codex desktop app is an AI agent platform that goes well beyond coding assistance: it handles design, document creation, research, and automation in one place. This article is a Japanese-to-English summary of the complete ~103-minute guide video.


Key Features of the Codex Desktop App

Here are the features introduced in the video.

Codex app feature overview (around 0:00)

Project Management & File Organization

Codex manages chats in "project" units, each linked 1-to-1 with a local folder on your computer. Files created by the agent are automatically saved inside an outputs/ subfolder of the project folder, and any file in the folder can be referenced with @filename. You can open the folder in Finder at any time from the side panel.

Parallel Multi-tasking

Multiple chat threads can run simultaneously. While one agent is executing, you can start new tasks in separate chats. A blue dot notification appears when a task completes.

Skills & Plugins

Skills are "reusable recipes." Plugins are installable packages that bring those recipes into Codex. Hundreds of pre-built plugins exist (Google Calendar, Gmail, Figma, Remotion, etc.), and you can create custom skills by combining external APIs with the skill creator keyword. Skills are invoked with /skill-name or @skill-name in subsequent sessions.

Automations

Set recurring tasks with plain language: "Every Friday at 4pm, summarize this week's calendar and email it to me." The Automations tab shows a list, allows test runs, and supports natural-language edits.

Computer Control

The agent literally controls your mouse and keyboard. It can operate any GUI app — including Xcode builds and browser interactions — even without an API.

In-App Image Generation

Generate images from prompts and use them directly in your workflow. The video demos generating product photos (a futuristic shoe brand) and 10 iOS app icon options, with transparent backgrounds.

Steer (Real-Time Steering)

While the agent is running, you can paste text or a screenshot and redirect it immediately using the "Steer" button. Unlike normal queuing, this injects your prompt mid-execution.

Terminal Integration (Claude Code)

For design-heavy tasks, launch Claude Code from the integrated terminal:

claude --dangerously-skip-permissions

The video shows switching to Claude Code when Codex's design quality hits its limits — landing pages and slide decks came out significantly better.

Canva Export

Click the Canva icon next to any generated PowerPoint file to open it directly in Canva for final polish.


Skills vs. Plugins — The Difference

The video used an Excalidraw skill to auto-generate this diagram:

Skills vs. Plugins diagram (around 10:00)

| | Skill | Plugin |
| --- | --- | --- |
| Definition | Reusable workflow package for a specific task | Installable unit that adds capabilities to Codex |
| Role | A recipe that tells Codex how to execute a workflow | Bundles skills, apps, MCP Servers, and integrations |
| Purpose | Reliable execution recipe | Provides access to connected systems and tools |

Note: Simple way to remember: Skill = reusable recipe. Plugin = installable package that brings the recipe into Codex.


Design Tool Integration (Paper / Figma)

Paper (Alpha) auto-generating a landing page (around 30:00)

Codex integrates with Paper (Alpha), a Figma-like design tool built specifically for AI agent workflows.

Demo flow:

  1. "Using the Noo Shoo logo image, build a landing page directly on this Paper board"
  2. Codex calls Paper MCP, selects the transparent hero image
  3. Auto-decides design direction: editorial-tech, warm near-black neutral, cyan accent
  4. Builds 4 sections automatically: Hero, Performance Strip, Product Story, CTA/Footer

Note: Paper is designed specifically for AI agent collaboration. It's more intuitive than direct Figma editing for generative design tasks.


Automations in Practice

Automations setup screen (around 35:00)

Creating an automation is as simple as typing "do X every Y." The video demos two:

Weekly Calendar Summary
After connecting Google Calendar and Gmail plugins: "Every Friday at 4pm, summarize this week's events and email them to me." That's it — the automation is registered and the next scheduled run is visible immediately.
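
To make the "every Friday at 4pm" schedule concrete, here is a minimal sketch of how the next run time could be computed. It is not how Codex schedules automations internally (the video does not show that); the function name `next_weekly_run` and the use of naive local-time datetimes are assumptions made for illustration.

```python
from datetime import datetime, timedelta

FRIDAY = 4  # datetime.weekday() counts Monday as 0, so Friday is 4


def next_weekly_run(now: datetime, weekday: int = FRIDAY, hour: int = 16) -> datetime:
    """Return the first weekday/hour slot strictly after `now` (hypothetical helper)."""
    days_ahead = (weekday - now.weekday()) % 7
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=hour, minute=0, second=0, microsecond=0
    )
    # If today is the target weekday but the slot has already passed, skip a week.
    if candidate <= now:
        candidate += timedelta(days=7)
    return candidate
```

Calling `next_weekly_run(datetime.now())` gives the upcoming Friday 16:00 slot, which is the kind of "next scheduled run" value the Automations tab displays.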

Monthly YouTube Report
After building a YouTube Researcher skill (SuperData API): "On the last day of every month, analyze that month's videos and generate a Word doc with hook analysis and a view-count table." Delivered automatically every month.
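
"The last day of every month" is a trickier trigger than a fixed weekday because month lengths vary. The sketch below shows one way to test for it with the standard library; the helper names are hypothetical and this is only an illustration of the date logic, not Codex's own scheduler.

```python
import calendar
from datetime import date


def last_day_of_month(year: int, month: int) -> date:
    """Return the final calendar day of the given month (handles leap years)."""
    _, days_in_month = calendar.monthrange(year, month)
    return date(year, month, days_in_month)


def is_monthly_run_day(today: date) -> bool:
    """True when `today` is the last day of its month (hypothetical trigger check)."""
    return today == last_day_of_month(today.year, today.month)
```

A daily check such as `is_monthly_run_day(date.today())` would fire on the 28th, 29th, 30th, or 31st as appropriate.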


Part 2 Highlight: Building 6 Projects in Parallel

In the second half of the video, six projects for the Chorus app (an AI agent learning platform) were built simultaneously.

| Task | Tools Used |
| --- | --- |
| iOS App (design + implementation) | Swift · Xcode · Supabase · Mobile Design Skill |
| Web Landing Page | Tally · React · Claude Code · Vercel |
| Launch Video | Remotion Plugin · Claude Code |
| Investor Deck | PowerPoint Skill · Claude Code · Canva |
| X Post Automation | Typefully Skill |
| Project Plan | Markdown (checklist) |

The key insight: once you send instructions to an agent, don't wait; move to the next task immediately. You hand out instructions one after another ("serial tasking") while the agents themselves run in parallel, and that is what makes AI-era multitasking work.
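
The "send, then move on" pattern maps closely onto ordinary asynchronous programming. The sketch below is an analogy only: `run_agent` is a hypothetical stand-in that sleeps instead of doing real work, and nothing here calls Codex. It shows every task being started before any result is awaited.

```python
import asyncio


async def run_agent(task: str, seconds: float) -> str:
    """Stand-in for a long-running agent thread; it only sleeps."""
    await asyncio.sleep(seconds)
    return f"{task}: done"


async def dispatch_all(tasks: dict[str, float]) -> list[str]:
    # Start every task right away; nothing is awaited until all are running.
    running = [asyncio.create_task(run_agent(name, secs)) for name, secs in tasks.items()]
    # Results come back in dispatch order, regardless of which finished first.
    return await asyncio.gather(*running)


if __name__ == "__main__":
    jobs = {"iOS app": 0.03, "Landing page": 0.01, "Launch video": 0.02}
    for line in asyncio.run(dispatch_all(jobs)):
        print(line)
```

Total wall-clock time is roughly that of the slowest task rather than the sum of all of them, which is the payoff the video is describing.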


Detailed Video Guide


Part 1 — Mastering the Basics

Downloading & Project Management

Search "Codex app download" in your browser and download from chatgpt.com. The first-time interface looks like ChatGPT, but the depth is entirely different.

Codex's biggest feature is project management tied to local folders. Before starting a chat, you specify which folder to work in. That folder becomes the "project," and all agent-created files are auto-saved to its outputs/ subfolder.

From the side panel you can open the folder in Finder or reference files inside with @filename. Even with 30+ projects, Command+G lets you search by chat name or content instantly.

Set permissions to "Full Access" so the agent works without approval prompts. The video recommends the GPT-5.4 model and the Extra High effort level as defaults.


How to Use Skills and Plugins

The core distinction:

Skills are "recipes" — instructions telling the agent how to execute a specific task. Plugins are installable packages that bring those recipes into Codex. Think of a plugin as "the container for a skill."

To learn what a new plugin can do, open a new chat and ask: @Figma tell me everything you can do with this plugin. Click the caret (▼) in the response to see the agent's reasoning.


Hands-On: Automating with Google Calendar + Gmail

Google Calendar integration and automation setup (around 15:00)

Installing the Google Calendar plugin takes seconds: Plugins → Google Calendar → sign in via browser.

Once connected, this entire workflow happens in one conversation:

  1. "List this week's events for me" → Full calendar response
  2. "Email me a weekly summary" → Sent via Gmail immediately
  3. "Make this an automation for every Friday at 4pm" → Weekly task registered

The Automations tab shows next run time, status, and a test button. You can edit automations later with natural language.


Generating Designs with Figma and Paper MCP

Figma's main use case is converting an existing Figma board into code. It's less suited for having the AI generate designs and place them into Figma.

That's where Paper (Alpha) comes in:

"Using the new shoe PNG (no background), create a landing page on the open Paper board"
↓
Codex calls Paper MCP, decides design direction
↓
Auto-builds: Hero · Performance Strip · Product Story · CTA

Right-click a chat and select "Open in mini window" to float the chat sidebar while working in Paper.

Codex also has a Steer feature. Normally, typing a prompt while the agent is running queues it for later. Clicking "Steer" injects your message immediately — great for pasting a screenshot and saying "this button is overlapping, fix it while you work."


Building a Custom Skill: YouTube Researcher

Step 1: Find an API

Ask Codex: "Suggest the top 5 APIs for pulling YouTube transcripts." SuperData, Transcript API, YouTube Transcript.io, and others are returned. SuperData offers 100 free requests/month.

Step 2: Create the skill

In a new chat:

Use skill creator to build a skill that uses the SuperData API
to fetch and summarize the latest 10 videos from any YouTube channel.
API key: [paste here]

The skill creator keyword activates Codex's skill-building mode.

Step 3: Use the skill

Open a new chat and type YouTube Researcher:

Look up Riley Brown's latest 10 YouTube videos,
pull the transcripts, and create a document.
Include which videos performed well, a hook analysis, and thumbnails.

The resulting report showed "Claude is taking over (high urgency, large market shift)" and "Claude Code Leak" as top-performing hooks. Vibe coding content underperformed. Then this was automated: "On the last day of every month, run this skill and create a Word report."
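
The video does not show how the skill decides which videos "performed well." As one plausible approach, the sketch below flags videos whose view count is above the channel's median and renders a view-count table. The median cutoff, the function name, and the table layout are all assumptions; the input is whatever titles and view counts the transcript API returns.

```python
from statistics import median


def view_count_table(videos: list[tuple[str, int]]) -> str:
    """Build a Markdown table sorted by views, flagging above-median videos.

    `videos` is a list of (title, views) pairs. The median cutoff is an
    assumed heuristic, not something taken from the video.
    """
    cutoff = median(views for _, views in videos)
    rows = ["| Video | Views | Above median |", "| --- | ---: | --- |"]
    for title, views in sorted(videos, key=lambda v: v[1], reverse=True):
        flag = "yes" if views > cutoff else "no"
        rows.append(f"| {title} | {views:,} | {flag} |")
    return "\n".join(rows)
```

Passing the ten most recent videos as `(title, views)` pairs yields a table that can be pasted straight into the monthly report.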


Part 2 — Building 6 Projects Simultaneously

Note: The following demos use "Chorus" (an AI agent learning iOS app) as the subject. The core idea: send instructions, then move to the next task without waiting.

Planning the Project

Create a "My New Business" folder and start a new project. Create the plan as a Markdown file:

[Attach screenshot]

"Create a checklist-style plan from this.
 Items: iOS app · Web landing page · Mobile app design ·
 Launch video · Investor deck · X post automation.
 Include the app idea at the top."

Chorus concept: A platform for learning about AI agents. Compare agent tools, access a copy-paste skill library, and learn fundamentals — all in an iOS app.
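
For reference, the checklist-style plan requested in the prompt above is just a Markdown file with task-list items. The sketch below builds one from the six items named in the prompt; the heading text and function name are assumptions, and the real file Codex produced is not shown in the video summary.

```python
def build_plan(idea: str, items: list[str]) -> str:
    """Return a Markdown checklist with the app idea at the top (hypothetical layout)."""
    lines = ["# Project Plan", "", idea, ""]
    # "- [ ]" is the Markdown task-list syntax for an unchecked item.
    lines += [f"- [ ] {item}" for item in items]
    return "\n".join(lines) + "\n"


PLAN_ITEMS = [
    "iOS app",
    "Web landing page",
    "Mobile app design",
    "Launch video",
    "Investor deck",
    "X post automation",
]

if __name__ == "__main__":
    print(build_plan("Chorus: a platform for learning about AI agents.", PLAN_ITEMS))
```

As each project finishes, changing `- [ ]` to `- [x]` checks the item off.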


iOS App: From Design to TestFlight

Screen Design Skill

Export the workflow from claude.ai/design, paste it into Codex, and say "create a mobile design skill that does the same thing." That's all it takes to build the custom skill.

"Use the mobile design skill to create Chorus app screens
 in basic Apple style"

A prototype link appears showing Learn · Platforms · Skills · Saved in a 4-tab layout.

Xcode Build

"Create a new Swift mobile app project called Chorus.
 For now, just display 'Hello, this is Chorus' in the center.
 Open the Xcode project when done."

Press Play in Xcode to run each build in the iOS Simulator (or on a physical device). After integrating the screen designs, connect Supabase.

Supabase + Authentication

The video treats Supabase as the de facto database for agent-built apps. After configuring the Supabase MCP, restart Codex so the connection registers. Then prompt: "If connected, create all tables." Tables for skill categories, platforms, skills, and saved items are generated automatically.
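
The summary names the four tables but not their columns. To give a feel for how they might relate, here is a rough sketch using an in-memory SQLite database as a local stand-in for Supabase's Postgres. Every column name and constraint below is an assumption made for illustration, not the schema from the video.

```python
import sqlite3

# Hypothetical schema: only the four table names come from the video.
SCHEMA = """
CREATE TABLE skill_categories (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE platforms (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE skills (
    id INTEGER PRIMARY KEY,
    category_id INTEGER NOT NULL REFERENCES skill_categories(id),
    title TEXT NOT NULL,
    body TEXT NOT NULL
);
CREATE TABLE saved_items (
    id INTEGER PRIMARY KEY,
    user_id TEXT NOT NULL,
    skill_id INTEGER NOT NULL REFERENCES skills(id),
    UNIQUE (user_id, skill_id)
);
"""


def create_schema(conn: sqlite3.Connection) -> None:
    """Create all four tables in one call."""
    conn.executescript(SCHEMA)


def table_names(conn: sqlite3.Connection) -> list[str]:
    """List the tables that exist, sorted alphabetically."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    return sorted(row[0] for row in rows)
```

The `saved_items` table links a user to a skill, which is what would back the app's Saved tab.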

Authentication was implemented with email + password (Google sign-in was initially tried, but Supabase's native email auth was the fastest path). Disable email confirmation in Supabase for instant login. The app eventually shipped to TestFlight.


Web Landing Page: Tally + React + Vercel

Form setup (tally.so)

Create a waitlist form with name and email fields, then copy the embed code.

Run as React app

"I'm using tally.so. Embed this form in the site and
 run it locally as a React app. We'll design it later."

Styling with Claude Code

Codex struggles with design, so bring in Claude Code from the terminal:

claude --dangerously-skip-permissions

"Forget the current page styling completely.
 Look at the Chorus app code and match its font and design.
 Keep the Tally embed. Minimal text, simple, conversion-focused."

Claude Code improves it dramatically in minutes. Then: "Deploy to Vercel and give me the public link."


Launch Video: Remotion Plugin

Building a motion graphic launch video with Remotion (around 55:00)

Install the Remotion plugin and type @remotion in a new chat.

"Create a launch video for Chorus app.
 Start with a test: take app screenshots (attached),
 put them in iPhone mockups on a white background, and animate them.
 Run it locally."

localhost:3031 opens a timeline editor. Specify timing as seconds.frames (e.g., 2.20 = 2 seconds, 20 frames).
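
The seconds.frames notation is easy to misread as a decimal number, so here is a small sketch of the arithmetic. It assumes a 30 fps composition (the summary does not state the frame rate) and a hypothetical helper name; adjust `fps` to match the actual project.

```python
def timecode_to_frames(timecode: str, fps: int = 30) -> int:
    """Convert 'seconds.frames' (e.g. '2.20') to an absolute frame index.

    The 30 fps default is an assumption; the video summary gives no frame rate.
    """
    seconds_part, _, frames_part = timecode.partition(".")
    seconds = int(seconds_part)
    frames = int(frames_part) if frames_part else 0
    if not 0 <= frames < fps:
        raise ValueError(f"frame part must be between 0 and {fps - 1}, got {frames}")
    return seconds * fps + frames
```

At 30 fps, `timecode_to_frames("2.20")` is 2 × 30 + 20 = 80 frames. The timecode is passed as a string on purpose: as a float, 2.20 collapses to 2.2 and the trailing zero that distinguishes 20 frames from 2 frames is lost.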

Use Steer to adjust in real time. Turn on grid lines and give the agent exact coordinates (e.g., "X:1040, Y:540") for precise placement.

Hand off design-heavy sections (animation quality, color cards, cut timing) to Claude Code — the results are noticeably better. To add BGM, attach an MP3 and say "add this song at 50% volume."


Investor Deck: Chat Fork + Canva Export

Fork the chat

Right-click the mobile app chat → "Fork into Local." This creates a new chat with the same context. Rename it "Investor Deck."

"Analyze the app features, icons, and style, then
 create a matching investor slide deck.
 Use the PowerPoint skill.
 Research what investors look for in April 2026 and match that style."

Refine with Claude Code

claude --dangerously-skip-permissions
"Review this deck and reduce text, add more visuals.
 Add charts and diagrams to improve readability. Don't add slides."

Export to Canva

Click the Canva icon next to the PowerPoint file. It opens instantly in Canva for final 5–10% polish — add animations, adjust colors.


X Post Automation: Typefully Skill

Get a Typefully API key (Settings → API → Create new key), then:

"Search the Typefully API docs and build a skill that gives me
 full control. Test it on the Riley Brown account
 (use fruit emojis so I know which are yours).
 API key: [paste here]"

Then automate it:

"Set up an automation to create 3 X post drafts every morning.
 Use the Typefully control skill."

Final Results

| Task | Outcome |
| --- | --- |
| iOS App | Published to TestFlight (Learn · Platforms · Skills · Saved) |
| Web Landing Page | Live on Vercel · Tally form confirmed working |
| Launch Video | First draft complete (Remotion + Claude Code) |
| Investor Deck | Exported to Canva · Final polish done |
| X Post Automation | 3 daily drafts scheduled |
| Project Plan | All 6 items checked off |

Note: AI agents can take 1–2 hours on complex tasks. Instead of waiting, the key is to keep sending new instructions to new agents and move on. That's the core productivity skill of the AI era.
