The complete guide to Claude Code setup (Opus 4.6, Sonnet 4.6, Haiku 4.5). 1M token context window. 100+ hours saved. 25 hook events. Agent teams and task management. Production-tested patterns for skills, hooks, and MCP integration.
Anthropic published a series of cookbooks and documentation covering skill design, agent orchestration patterns, and plugin development. This chapter distills the practical patterns most relevant to Claude Code power users. Sources: claude-cookbooks-docs, claude-code-docs, and Anthropic engineering blog (March 2026).
Skills and slash commands can now override the session’s effort level:
---
name: my-skill
description: When to invoke this skill
effort: low # low | medium | high
---
Guidelines:
low: Quick lookups, reference checks, status queries (~1-2 tool calls)medium: Standard workflows, file modifications, moderate analysis (~3-10 tool calls)high: Complex analysis, multi-file changes, architectural decisions (~10+ tool calls)context: forkHeavy skills should run in a fresh subagent to avoid polluting the main conversation context:
---
name: deep-analysis
description: Run comprehensive code analysis
context: fork # Runs in isolated subagent
effort: high
allowed-tools: Read, Grep, Glob, Bash
---
When to use context: fork:
When NOT to use it:
Skills that perform irreversible actions (deploy, send messages, delete data) should require manual invocation:
---
name: production-deploy
description: Deploy to production
disable-model-invocation: true # MUST be /deploy, never auto-triggered
effort: medium
---
Keep SKILL.md under 500 lines. Move detailed reference material to companion files:
~/.claude/skills/my-skill/
├── SKILL.md # <500 lines — instructions + key info
├── reference.md # Detailed docs (loaded on-demand by Claude)
├── examples.md # Usage examples
└── scripts/ # Helper scripts
Claude reads SKILL.md when the skill triggers. It reads reference.md/examples.md only if it needs more detail — this is lazy loading by design.
The agent reasons about its next step, takes an action, observes the result, and repeats:
Think → Act → Observe → Think → Act → Observe → ... → Answer
Claude Code implementation: This is the default behavior. Claude naturally follows ReAct when given tools. Optimize by:
maxTurns to prevent runaway loopsA lead agent delegates subtasks to specialized workers:
Orchestrator (main context)
├── Worker 1: Database analysis (subagent)
├── Worker 2: API investigation (subagent)
└── Worker 3: Frontend check (subagent)
Claude Code implementation: Use the Agent/Task tool with specialized subagents:
# In orchestrator prompt:
# 1. Read the plan
# 2. Delegate each task via Task() with file paths
# 3. Collect results and verify
# 4. Never accumulate worker outputs in orchestrator context
Key rule: Orchestrator stays lean (<20% context). Each worker gets a fresh window.
A dedicated research agent explores broadly, then reports findings:
---
name: research-agent
description: Deep codebase research with comprehensive findings
model: sonnet
tools: Read, Grep, Glob, WebFetch
effort: high
maxTurns: 20
disallowedTools: [Write, Edit] # Read-only research
---
The research agent can explore freely without risking accidental modifications.
For fast, cheap operations (classification, routing, simple lookups):
---
name: classifier
description: Quick intent classification
model: haiku
effort: low
maxTurns: 5
---
Best for: Triage, categorization, yes/no decisions, data extraction from structured sources.
my-plugin/
├── .claude-plugin/
│ └── plugin.json # Manifest (required)
├── skills/
│ └── my-skill/SKILL.md # Skills (NOT inside .claude-plugin/)
├── agents/
│ └── my-agent.md # Agents
├── hooks/
│ └── hooks.json # Hook definitions
├── .mcp.json # MCP server configs
└── settings.json # Default settings
Critical: Skills, agents, and hooks go at the plugin root — NOT inside .claude-plugin/.
# Use ${CLAUDE_PLUGIN_DATA} for state that survives plugin updates
# /plugin uninstall prompts before deleting it
Declare plugins inline in settings.json instead of external repos:
{
"enabledPlugins": {
"my-plugin@source": true
}
}
When context fills (~95%), Claude compresses the conversation:
PostCompact hook for context re-injectionRe-inject critical context that may be lost during compaction:
{
"hooks": {
"PostCompact": [{
"hooks": [{
"type": "command",
"command": "echo 'Key context: [project-specific reminders here]'"
}]
}]
}
}
When compacting, preserve:
- Full list of all modified files
- Summary of architecture decisions made
- Test commands that passed/failed
- Current task progress and next steps
Pre-warm the cache with context you’ll use repeatedly:
| Category | Budget | Notes |
|---|---|---|
| System prompt | ~7K | Fixed, cached |
| CLAUDE.md + rules | <20K | Keep under 200 lines each |
| Auto memory | <5K | First 200 lines of MEMORY.md |
| Skill descriptions | ~2% | Loaded for routing decisions |
| MCP tool descriptions | Variable | Use tool search for large registries |
| Working context | Remaining | Conversation + file reads |
75% rule: Checkpoint at 75% context usage — quality degrades past this point.
effort: frontmatter — reduces token usage for simple skills (new in 2.1.80)context: fork for heavy analysis skills — keeps main context cleandisable-model-invocation: true for side-effect skills (deploy, send, delete)