Claude Skills vs. OpenAI Codex / AgentKit: Building Scalable Multi-Agent Workflows

Author’s note:

Question: What are Claude Skills and how do they work? How do they compare to Codex skills? How do I set myself up to have multi-agent skills?

Context: How do Claude Skills differ from Claude agents? Does Codex have sub-agents and the like? What is the best way to use Claude skills/agents?

Executive Summary

As of January 2026, the landscape of AI orchestration has shifted from monolithic prompts to modular, filesystem-based architectures. Claude Skills represent a paradigm shift in how agents consume information: instead of stuffing context windows with static instructions, Skills allow agents to dynamically load “folders” of expertise—scripts, templates, and guides—only when needed ¹. This “progressive disclosure” architecture reduces token usage by up to 75% for document-heavy tasks compared to traditional prompting ¹ ².

Meanwhile, OpenAI has integrated its coding capabilities into the GPT-5.2-Codex model and launched AgentKit, a comprehensive platform for building agentic workflows ³ ⁴. While OpenAI focuses on a visual-first “Agent Builder” and robust managed infrastructure, Anthropic’s Skills emphasize portability and token efficiency via a local filesystem approach ⁴ ².

For developers, the key strategic move in 2026 is decoupling “knowledge” (Skills) from “reasoning” (Agents). This report details how to architect these systems, comparing Anthropic’s Skill-first approach with OpenAI’s AgentKit, and providing a roadmap for multi-agent orchestration.

1. Introduction & Core Terminology

To navigate the 2026 agent ecosystem, it is critical to distinguish between the “actor” (the agent) and the “capability” (the skill).

Claude Skills: Modular, filesystem-based directories containing instructions (SKILL.md), executable code, and resources. They are designed for “progressive disclosure,” meaning the model only sees the skill’s metadata until it decides to use it, at which point it loads the full instruction set ⁵ ¹.
Claude Agents: Autonomous loops built with the Claude Agent SDK (formerly Claude Code SDK). These agents act as orchestrators that can use tools, execute terminal commands, and delegate tasks to sub-agents or Skills ⁶ ⁷.
OpenAI Codex (GPT-5.2-Codex): “Codex” is no longer just a research preview but a specialized model within the GPT-5 family (gpt-5.2-codex). It is optimized for repo-scale reasoning and is typically orchestrated via OpenAI’s Agents SDK or AgentKit rather than a standalone “skills” platform ³.
AgentKit: OpenAI’s integrated suite for building agents, comprising a visual Agent Builder, the Agents SDK (open-source), and ChatKit for UI embedding ⁴.

2. How Claude Skills Work Under the Hood

The defining feature of Claude Skills is progressive disclosure. In traditional RAG or prompting, you might inject 10 pages of documentation into the context window “just in case.” With Skills, the process is efficient and dynamic.

The Filesystem Architecture

A Skill is simply a folder on a virtual machine (VM) or local filesystem. It typically contains:

SKILL.md: The entry point containing metadata (name, description) and high-level instructions ¹.
Scripts: Executable files (e.g., clean_data.py) that perform deterministic actions ¹.
Resources: Static files like templates or database schemas ².

The Execution Flow

Discovery: At startup, Claude loads only the name and description of available skills into its system prompt. This consumes minimal tokens (approx. 30 tokens per skill) ¹.
Triggering: When a user request matches a skill’s description (e.g., “Create a slide deck”), Claude uses a bash tool to read the full SKILL.md file ¹.
Execution: If the skill requires running code (e.g., generating a PPTX file), Claude executes the bundled script in a sandboxed environment. Crucially, only the script’s output enters the context window, not the code itself, keeping the context clean ².
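
The discovery and triggering steps above can be sketched in plain Python. This is an illustrative model of progressive disclosure, not Anthropic's implementation: the function names (parse_frontmatter, discover_skills, load_skill_body), the simple "key: value" frontmatter parser, and the convention that the folder name equals the skill name are all assumptions made for the example.

```python
from pathlib import Path


def parse_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split a SKILL.md string into (metadata, body).

    Assumes a leading block delimited by '---' lines that holds simple
    'key: value' pairs; real YAML frontmatter can be richer than this.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    meta: dict[str, str] = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            body = "\n".join(lines[i + 1:]).strip()
            return meta, body
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}, text  # no closing delimiter: treat everything as body


def discover_skills(skills_dir: Path) -> dict[str, str]:
    """Discovery: collect only name -> description for every skill folder."""
    index: dict[str, str] = {}
    for skill_md in sorted(skills_dir.glob("*/SKILL.md")):
        meta, _ = parse_frontmatter(skill_md.read_text(encoding="utf-8"))
        if "name" in meta and "description" in meta:
            index[meta["name"]] = meta["description"]
    return index


def load_skill_body(skills_dir: Path, name: str) -> str:
    """Triggering: read the full instructions only once a skill is chosen."""
    # Hypothetical convention: the folder name equals the skill name.
    text = (skills_dir / name / "SKILL.md").read_text(encoding="utf-8")
    _, body = parse_frontmatter(text)
    return body
```

Only the small index returned by discover_skills would sit in the prompt at startup; load_skill_body is called later, and only for the one skill whose description matched the request.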

Supported Platforms

Skills are available across the entire Anthropic ecosystem:

Claude.ai: Pre-built skills (PDF, Office) are active by default; custom skills can be uploaded ⁵.
Claude API: Developers can attach skills to API requests using the container parameter ².
Claude Code (CLI): Skills are auto-discovered from the .claude/skills directory ⁸.

3. Building & Deploying a Skill: End-to-End Example

This section demonstrates how to create a custom Skill that fills PDF forms—a task that combines instruction following with code execution.

Step 1: Structure the Directory

Create a folder named pdf-form-filler containing your instruction file and python script.

pdf-form-filler/
├─ SKILL.md
└─ fill_form.py
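
If you prefer to script this step, the layout above can be created with the standard library. A minimal sketch; scaffold_skill is a hypothetical helper name, and the files it creates are empty placeholders to be filled in during the next steps.

```python
from pathlib import Path


def scaffold_skill(parent: Path, name: str, script_name: str) -> Path:
    """Create <parent>/<name>/ containing SKILL.md and one script file."""
    skill_dir = parent / name
    skill_dir.mkdir(parents=True, exist_ok=True)
    for filename in ("SKILL.md", script_name):
        path = skill_dir / filename
        if not path.exists():  # never overwrite existing work
            path.touch()
    return skill_dir
```

For the example in this section, the call would be scaffold_skill(Path("."), "pdf-form-filler", "fill_form.py").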

Step 2: Define the Skill Metadata (`SKILL.md`)

The description is critical; it acts as the “trigger” for the model ⁸.

---
name: pdf-form-filler
description: Fill PDF forms using the company-standard template. Trigger when users ask to "fill form", "complete application", or "generate PDF".
---

# PDF Form Filler Instructions

When the user provides data for a form:
1. Map the user's data to the fields defined in `fill_form.py`.
2. Execute the python script to generate the PDF.
3. Verify the output file exists before confirming to the user.
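
Because the description doubles as the trigger, it can be worth linting the name and description fields of the SKILL.md above before deploying. The sketch below is one way to do that. The length limits and the name pattern are assumptions for illustration and should be confirmed against the current Agent Skills documentation; the "says when to use it" check is a heuristic of this example, not a rule from any specification.

```python
import re

# Assumed limits for illustration; confirm against the current docs.
MAX_NAME_LEN = 64
MAX_DESCRIPTION_LEN = 1024
NAME_PATTERN = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def check_skill_metadata(name: str, description: str) -> list[str]:
    """Return a list of problems; an empty list means the metadata looks OK."""
    problems: list[str] = []

    if not name:
        problems.append("name is empty")
    elif len(name) > MAX_NAME_LEN:
        problems.append(f"name is longer than {MAX_NAME_LEN} characters")
    elif not NAME_PATTERN.match(name):
        problems.append("name must use lowercase letters, digits, and hyphens")

    if not description.strip():
        problems.append("description is empty")
    elif len(description) > MAX_DESCRIPTION_LEN:
        problems.append(f"description is longer than {MAX_DESCRIPTION_LEN} characters")
    # Heuristic only: a description that never says when to use the skill
    # gives the model little to match a request against.
    elif not re.search(r"\b(when|use|trigger)\b", description, re.IGNORECASE):
        problems.append("description does not say when to use the skill")

    return problems
```

The pdf-form-filler metadata shown above passes these checks, since its description names explicit trigger phrases.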

Step 3: Invoke via the API

To use this skill programmatically, upload it and then reference its skill ID in the container parameter of the request. Note the requirement for specific beta headers ² ⁹.

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Skills are a beta feature, so the call goes through client.beta.messages and
# passes the beta flags via `betas` (sent as the anthropic-beta header).
response = client.beta.messages.create(
    # Assumption: substitute any model that supports Skills
    model="claude-sonnet-4-5-20250929",
    max_tokens=1024,
    betas=[
        "code-execution-2025-08-25",
        "skills-2025-10-02",
        "files-api-2025-04-14",
    ],
    messages=[
        {"role": "user", "content": "Generate a quarterly report PDF for Q3 sales of $50k."}
    ],
    # The code execution tool is required for Skills to run scripts
    tools=[{"type": "code_execution_20250825", "name": "code_execution"}],
    # Attach the skill by reference. A custom skill must be uploaded first;
    # "pdf-form-filler" is a placeholder for the skill ID returned by the upload.
    container={
        "skills": [
            {"type": "custom", "skill_id": "pdf-form-filler", "version": "latest"}
        ]
    },
)

4. Claude Skills vs. Agents vs. OpenAI Codex

Understanding the distinction between these tools is vital for selecting the right architecture.

Comparison Matrix

Feature	Claude Skills	Claude Agents (SDK)	OpenAI AgentKit / Codex
Primary Role	Knowledge/Capability: “How to do X” (Passive)	Orchestrator: “Decide to do X” (Active)	Platform: End-to-end build & deploy
Architecture	Filesystem folders (`SKILL.md`) ¹	Python/TS SDK loop ⁶	Visual Builder + SDK ⁴
Context Strategy	Progressive Disclosure: Loads on demand ²	Full context + tool definitions	Full workflow context loaded upfront ⁴
Code Execution	Sandboxed VM (Bash/Python) ²	Local or Remote (via MCP) ⁷	Managed Infrastructure ³
Portability	Open Standard (`agentskills.io`) ¹	SDK-specific	Open Standard (`AGENTS.md`) ³
Best For…	Reusable, static workflows (e.g., “Review PR”)	Complex, dynamic reasoning loops	Visual workflow design & rapid deployment

Key Differences

Skills vs. Agents: A Skill is a tool that an Agent uses. You wouldn’t build a “Skill” to manage a long-running conversation; you would build an Agent that uses a “Memory Skill” or “Database Skill” to accomplish its tasks ⁸.
Claude vs. Codex: OpenAI’s “Codex” is now integrated into the GPT-5.2 model family. It doesn’t have a distinct “Codex Skills” product. Instead, OpenAI promotes AgentKit and the Agents SDK as the way to harness Codex’s capabilities ³. However, OpenAI supports the AGENTS.md and Skills open standards, allowing for some interoperability ³.

5. Multi-Agent Architecture with Sub-Agents

For complex tasks, a single agent often becomes overwhelmed by context. The solution is multi-agent orchestration using sub-agents.

How Sub-Agents Work

In the Claude ecosystem, sub-agents are specialized instances that run in their own isolated context window.

Isolation: They do not see the main agent’s full history, only the specific task delegated to them. This prevents context pollution ¹⁰.
Specialization: You can restrict a sub-agent’s tools. For example, a “Code Reviewer” sub-agent might have read-only access, while a “Fixer” sub-agent has edit permissions ¹¹.
No Infinite Nesting: Sub-agents cannot spawn their own sub-agents ¹⁰.
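
The three properties above can be made concrete with a toy model. The classes below are hypothetical stand-ins written for this illustration, not part of the Claude Agent SDK; they only show what isolation, tool restriction, and the nesting limit mean for the data each party sees.

```python
from dataclasses import dataclass, field


@dataclass
class SubAgent:
    """Hypothetical stand-in for a sub-agent; not an SDK class."""
    name: str
    tools: frozenset[str]                              # specialization
    context: list[str] = field(default_factory=list)  # isolated history

    def use_tool(self, tool: str) -> str:
        if tool not in self.tools:
            raise PermissionError(f"{self.name} may not use {tool}")
        self.context.append(f"used {tool}")
        return f"{self.name}: {tool} ok"

    def spawn(self, *_args: object) -> None:
        # Mirrors the "no infinite nesting" rule.
        raise RuntimeError("sub-agents cannot spawn sub-agents")


@dataclass
class Orchestrator:
    """Hypothetical main agent that owns the long conversation history."""
    history: list[str] = field(default_factory=list)

    def delegate(self, agent: SubAgent, task: str, tool: str) -> str:
        agent.context = [task]        # only the task crosses over
        result = agent.use_tool(tool)
        self.history.append(result)   # only the result comes back
        return result
```

A read-only reviewer built as SubAgent("reviewer", frozenset({"Read"})) never sees the orchestrator's history, and a request to use "Edit" raises PermissionError.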

Parallelization Strategy

One of the most powerful features of the Claude Agent SDK is the ability to run sub-agents in parallel. This can drastically reduce wall-clock time for multi-step tasks.

Example: Parallel Code Review Pipeline

Instead of running checks sequentially, an orchestrator agent can spawn three sub-agents simultaneously:

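# Sketch using the Claude Agent SDK (pip install claude-agent-sdk).
# Verify class and tool names against the SDK version you have installed.
import asyncio
from claude_agent_sdk import AgentDefinition, ClaudeAgentOptions, query

# Define specialized sub-agents; `tools` limits what each one may use
agents = {
    "sec-scan": AgentDefinition(description="Check for vulnerabilities",
                                prompt="Report security issues only.", tools=["Read", "Grep"]),
    "linter": AgentDefinition(description="Check PEP 8 compliance",
                              prompt="Run flake8 and report style issues.", tools=["Read", "Bash"]),
    "tester": AgentDefinition(description="Run unit tests",
                              prompt="Run pytest and summarize failures.", tools=["Read", "Bash"]),
}

async def main():
    # "Task" lets the orchestrator delegate to the sub-agents defined above
    options = ClaudeAgentOptions(agents=agents, allowed_tools=["Read", "Grep", "Bash", "Task"])
    prompt = ("Review auth_module.py. Run the sec-scan, linter, and tester "
              "sub-agents in parallel, then summarize their findings.")
    # The orchestrator decides how to delegate; messages stream back as they arrive
    async for message in query(prompt=prompt, options=options):
        print(message)

asyncio.run(main())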

Note: Parallel execution logic is supported via the SDK’s agents parameter and async patterns ¹¹.
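
The wall-clock benefit of the fan-out pattern can be seen without the SDK at all. In the sketch below, run_subagent is a hypothetical stub in which a sleep stands in for a real sub-agent call; the delays are arbitrary values chosen for the demonstration.

```python
import asyncio
import time


async def run_subagent(name: str, task: str, seconds: float) -> str:
    """Stub for one sub-agent call; the sleep stands in for model latency."""
    await asyncio.sleep(seconds)
    return f"{name}: finished '{task}'"


async def review_in_parallel() -> list[str]:
    # gather() runs the three coroutines concurrently and returns their
    # results in the order the coroutines were passed in.
    return await asyncio.gather(
        run_subagent("SecScan", "Scan auth_module.py", 0.3),
        run_subagent("Linter", "Lint auth_module.py", 0.2),
        run_subagent("Tester", "Test auth_module.py", 0.1),
    )


start = time.perf_counter()
results = asyncio.run(review_in_parallel())
elapsed = time.perf_counter() - start

for line in results:
    print(line)
print(f"elapsed: {elapsed:.2f}s (the three delays sum to 0.60s if run one by one)")
```

Total time tracks the slowest task rather than the sum of all three, which is the effect an orchestrator gets when its sub-agents run concurrently.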

6. Setup Guide: How to Set Yourself Up

To implement multi-agent skills effectively, follow this setup checklist:

1. Environment Configuration

Enable Code Execution: In your Claude settings (or API headers), ensure code execution is enabled. This is a prerequisite for Skills ⁵ ².
Directory Structure: Create a .claude/skills directory in your project root. This is where the Claude CLI and SDK look for custom skills ².

2. Define Your Agents

Use the agents parameter in the SDK or define them as markdown files in .claude/agents/.

Best Practice: Give every sub-agent a clear, distinct description. Claude uses this description to decide when to route a task to that sub-agent ¹¹.
Tool Restriction: Explicitly list the tools each sub-agent may use (the tools field of its definition) to minimize security risks (e.g., deny bash access to a “Research” agent) ¹⁰.
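
For the markdown-file route, a small generator keeps descriptions and tool lists consistent across agents. This is a sketch under stated assumptions: render_agent_file is a hypothetical helper, and the layout it emits (frontmatter with name, description, and a comma-separated tools line, followed by the system prompt) should be checked against the current Claude Code documentation.

```python
def render_agent_file(name: str, description: str, tools: list[str], prompt: str) -> str:
    """Build the text of a sub-agent definition file.

    Assumed layout: YAML-style frontmatter (name, description, tools as a
    comma-separated list) followed by the system prompt as the body.
    """
    if not description.strip():
        raise ValueError("a clear description is needed for routing")
    if not tools:
        raise ValueError("list the permitted tools explicitly")
    header = "\n".join([
        "---",
        f"name: {name}",
        f"description: {description}",
        f"tools: {', '.join(tools)}",
        "---",
    ])
    return f"{header}\n\n{prompt.strip()}\n"
```

The returned string can be written to .claude/agents/<name>.md. Requiring a non-empty tool list is a design choice of this sketch, so that a "Research" agent cannot silently end up with bash access.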

3. Adopt the Open Standard

To future-proof your work, structure your skills according to the Agent Skills open standard (agentskills.io). This ensures your skills remain portable if you switch between Anthropic’s and OpenAI’s ecosystems in the future ¹ ³.

7. Bottom Line

Use Claude Skills when you have repeatable, procedural knowledge (e.g., “How to format our weekly report”) that involves reference files or scripts. Their progressive disclosure architecture makes them significantly more token-efficient than pasting instructions into prompts ¹ ².
Use Sub-Agents when you need to isolate context or run tasks in parallel. If a task requires exploring 50 files, delegate it to a sub-agent so your main conversation doesn’t get cluttered with the file contents ¹⁰ ¹¹.
Watch the Convergence: Both Anthropic (via agentskills.io) and OpenAI (via AGENTS.md) are converging on open standards. Building your skills as modular, filesystem-based resources is the safest long-term bet for interoperability ¹ ³.

Recommendation: Start by auditing your most frequent workflows. Convert the static documentation for these workflows into Skills (SKILL.md). Then, build a simple “Orchestrator Agent” using the SDK to call these skills. This “Skill-First” approach scales better than trying to build one giant “do-it-all” agent.