Every message you send to Claude Code carries the same baseline context: your CLAUDE.md, skill descriptions, agent descriptions, and MCP tool definitions. Understanding what loads when, and what it costs in tokens, is the difference between a responsive setup and one that silently drops skills or wastes budget on duplicates. This chapter breaks down the context cost of each extension point, explains the skill description budget, and provides real-world measurements from a production project.
Purpose: Understand and optimize context consumption across Claude Code extension points

Source: Claude Code documentation + production measurements (LIMOR AI: 224 skills, 52 agents, 5 MCP servers)

Difficulty: Intermediate

Time: 30 minutes to audit, 1-2 hours to optimize
Claude Code’s context window is finite. Every extension point you configure consumes some portion of it. The critical insight is that most costs are per-message, not per-session. A bloated CLAUDE.md or excessive skill descriptions eat into your available context on every single turn.
Key principle: Minimize always-loaded context. Push details into on-demand loading (full skill content, agent bodies, on-demand files).
This table shows what loads when for each Claude Code extension point:
| Extension Point | When It Loads | Per-Message Cost | Notes |
|---|---|---|---|
| CLAUDE.md + .claude/rules/ | Every message | Full content | Biggest fixed cost. Keep lean. |
| Skill descriptions | Every message | Subject to 2% budget | Only the YAML frontmatter description field |
| Skill full content | On invocation only | Full .md body | Loaded when user or model triggers the skill |
| Agent descriptions | Every message | Full description field | From frontmatter of each .claude/agents/*.md |
| Agent full content | Only when spawned | Loaded into subagent context | Via Task() tool – does not affect main context |
| Hooks | Zero context cost | None | Shell scripts, run externally, no token impact |
| MCP tool list | Every message | Tool name + description + schema | Grows with number of MCP servers and tools |
Claude Code enforces a budget on how much total skill description text is loaded per message. Skills that exceed the budget are silently excluded – no warning, no error, they simply do not appear to the model.
2% of the context window = approximately 16,000 characters
This budget covers the combined description field from all skill files across all levels (enterprise, personal, project).
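As a rough cross-check of the 16,000 figure, the sketch below derives a character budget from a window size. It assumes a 200,000-token context window and about 4 characters per token (the ratio the token estimates later in this chapter work out to). Both numbers are assumptions rather than documented constants, and `budget_chars` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Hypothetical helper: estimate the skill description budget in characters.
# Assumptions: the budget is 2% of the window, and 1 token is ~4 characters.
budget_chars() {
  local window_tokens=$1
  local chars_per_token=${2:-4}
  echo $(( window_tokens * 2 / 100 * chars_per_token ))
}

budget_chars 200000   # assumed 200k-token window; prints 16000
```

If your model's window or tokenizer ratio differs, the estimate shifts accordingly; treat the measured behavior of your own setup as the authority.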
To raise the budget, set the SLASH_COMMAND_TOOL_CHAR_BUDGET environment variable in your shell profile:

```bash
# In ~/.bashrc or ~/.zshrc
export SLASH_COMMAND_TOOL_CHAR_BUDGET=40000
```
This increases the budget to 40,000 characters. Useful when you have a large skill library, but be aware it steals from your available context for conversation.
If your project has 100+ skills (user + project level combined), the native 16k budget is likely insufficient. Measure total description characters before deciding:
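```bash
# Sum the description length of every SKILL.md.
# Only the first "description:" line is counted; multi-line YAML values are not handled.
total=0
while IFS= read -r -d '' f; do
  desc=$(sed -n 's/^description: *//p' "$f" | head -1 | sed 's/^"//;s/"$//')
  total=$((total + ${#desc}))
done < <(find ~/.claude/skills .claude/skills -maxdepth 3 -name "SKILL.md" -print0 2>/dev/null)
echo "Total: $total chars (budget: ${SLASH_COMMAND_TOOL_CHAR_BUDGET:-16000})"
```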
Real example: A project with 213 skills (22 user-level + 182 project-level + 9 plugin) had 35,415 chars of descriptions (about 221% of the native 16,000-character budget). Without the 40,000 override, 56% of skills were silently excluded. The fix was a single line in ~/.bashrc:

```bash
export SLASH_COMMAND_TOOL_CHAR_BUDGET=40000
```
Warning: If you previously used a custom pre-prompt hook to bypass the budget (e.g., top-N injection), and that hook was removed, the override becomes your only safety net. Do not remove it without measuring first.
Each skill’s description field in its YAML frontmatter should follow this pattern:
```yaml
---
description: "Deploy to GCP Cloud Run staging or production. Use when deploying, checking revisions, or routing traffic."
---
```
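Rules: state what the skill does in one sentence, then list the triggers after "Use when". Keep implementation details in the skill body.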
Good:

```yaml
description: "Validate database schema against Sacred Commandments. Use when checking table structure, column types, or Golden Rule compliance."
```

Bad:

```yaml
description: "This skill helps with database work. It has patterns for employees, shifts, orders, and more. It covers Sacred Commandments I, IV, VIII, XII, and XIV with detailed examples and SQL queries."
```
The bad example wastes budget on details that belong in the skill body, not the description.
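A check like the following could catch weak descriptions before they reach the budget. This is a minimal sketch: `lint_description` is a hypothetical helper, the 200-character ceiling is an assumed threshold, and requiring the literal phrase "Use when" is a convention taken from the examples above, not something Claude Code enforces.

```bash
# Hypothetical linter for a single description string.
# Assumed rules: at most 200 characters, and a literal "Use when" trigger clause.
lint_description() {
  local desc=$1
  local max=${2:-200}
  local status=0
  if [ "${#desc}" -gt "$max" ]; then
    echo "too long: ${#desc} chars (max $max)"
    status=1
  fi
  case $desc in
    *"Use when"*) ;;
    *) echo "missing 'Use when' trigger clause"; status=1 ;;
  esac
  return "$status"
}

lint_description "This skill helps with database work." || echo "description needs work"
```

The function prints one line per problem and returns non-zero if any were found, so it can gate a commit hook or CI step.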
At high utilization (>90%), adding a single skill can silently drop others. Monitor and enforce:
```bash
# Measure current budget usage
total=0
while IFS= read -r -d '' f; do
  # Take only the first description line so stray matches in the body are not counted
  desc=$(grep -m1 "^description:" "$f" | sed 's/^description: *//;s/^"//;s/"$//')
  total=$((total + ${#desc}))
done < <(find ~/.claude/skills .claude/skills -name "SKILL.md" -print0 2>/dev/null)
echo "Budget: $total / ${SLASH_COMMAND_TOOL_CHAR_BUDGET:-16000} chars"
```
What to do when budget is tight:

- Move unused skills to ~/.claude/skills-disabled/ (not loaded, but recoverable)
- Set disable-model-invocation: true for user-only skills (removes them from the budget)
- Move skills from ~/.claude/skills/ (global) to .claude/skills/ (per-project) so they only load where needed

Real example: A project hit 98.7% budget (39,480/40,000 chars). Four skills unrelated to the active project were moved to skills-disabled/, bringing usage to 93.4% with 2,639 chars headroom.
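The utilization figures in the example above can be reproduced with a small helper. This is a sketch only: `budget_status` is a hypothetical function, and the 90% warning threshold simply mirrors the ">90%" guidance in this section.

```bash
# Hypothetical helper: print utilization and headroom, and return non-zero
# when usage is above the warning threshold (default 90%).
budget_status() {
  local used=$1
  local budget=${2:-${SLASH_COMMAND_TOOL_CHAR_BUDGET:-16000}}
  local warn_pct=${3:-90}
  local tenths=$(( used * 1000 / budget ))   # utilization in tenths of a percent (truncated)
  local headroom=$(( budget - used ))
  printf '%d.%d%% used, %d chars headroom\n' $(( tenths / 10 )) $(( tenths % 10 )) "$headroom"
  [ "$tenths" -le $(( warn_pct * 10 )) ]
}

budget_status 39480 40000 || echo "above 90%: trim or move skills"
```

Feeding it the total from the measurement script turns the check into a pass/fail gate, for example in a pre-commit hook.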
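Two mechanisms reduce what a skill costs in the main context. The first removes the skill from the description budget entirely; the second isolates the skill body but leaves the description in the budget.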
1. disable-model-invocation: true
```yaml
---
name: my-manual-skill
description: "Generate weekly report for Slack."
disable-model-invocation: true
---
```
This makes the skill a user-only slash command. The model cannot invoke it autonomously, and it does not count against the description budget. Use this for skills that should only run when the user explicitly types /my-manual-skill.
2. context: fork
```yaml
---
name: heavy-analysis
description: "Run deep code analysis across the entire repository."
context: fork
---
```
This runs the skill in an isolated subagent context, separate from the main conversation. The skill’s full content loads into the fork, not the main context window. The description still counts against the budget, but the body content is isolated.
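When measuring the budget, skills marked `disable-model-invocation: true` can be skipped, since the text above says they do not count. The sketch below shows one way to do that. Both function names are hypothetical, and the check assumes the flag sits on its own line in plain `key: value` form (it scans the whole file, not only the frontmatter).

```bash
# Succeeds if the skill file counts toward the description budget,
# i.e. it does not set disable-model-invocation: true.
counts_toward_budget() {
  ! grep -Eq '^disable-model-invocation:[[:space:]]*true[[:space:]]*$' "$1"
}

# Sum description lengths for the given skill files, skipping user-only skills.
budgeted_chars() {
  local total=0 f desc
  for f in "$@"; do
    counts_toward_budget "$f" || continue
    desc=$(grep -m1 '^description:' "$f" | sed 's/^description: *//;s/^"//;s/"$//')
    total=$(( total + ${#desc} ))
  done
  echo "$total"
}
```

For example, `budgeted_chars ~/.claude/skills/*/SKILL.md` would report only the characters that the model-invocable skills contribute.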
Keep each SKILL.md file under 500 lines and move longer reference material into supporting files:

```text
~/.claude/skills/
  deploy-workflow-skill.md      # Under 500 lines
  deploy-workflow-skill/
    checklist.md                # Supporting file
    cloud-run-commands.md       # Supporting file
```
The supporting files have zero context cost until the skill reads them during execution.
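To find skills that have outgrown the 500-line guideline, a scan like this may help. It is a sketch: `oversized_skills` is a hypothetical helper, and it assumes skills are stored as files named SKILL.md under the directories you pass in.

```bash
# List SKILL.md files longer than a line limit.
# Usage: oversized_skills LIMIT DIR...   (missing directories are skipped silently)
oversized_skills() {
  local limit=$1 f lines
  shift
  find "$@" -name 'SKILL.md' 2>/dev/null | while IFS= read -r f; do
    lines=$(( $(wc -l < "$f") ))   # arithmetic expansion strips wc's padding
    if [ "$lines" -gt "$limit" ]; then
      echo "$f: $lines lines"
    fi
  done
}
```

A typical call would be `oversized_skills 500 ~/.claude/skills .claude/skills`; anything it prints is a candidate for splitting into supporting files.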
Agents (.claude/agents/*.md) have their descriptions loaded on every message. Unlike skills, there is no formal character budget – but the cost is real.
Agent loading follows a priority order. When two agents share the same name, only the higher-priority one loads:
Priority (highest to lowest):
1. Managed agents (Anthropic-provided)
2. CLI flag agents (--agent flag)
3. Project-level agents (.claude/agents/ in repo)
4. User-level agents (~/.claude/agents/)
5. Plugin agents
Common waste: Defining an agent at both user-level (~/.claude/agents/deploy-agent.md) and project-level (.claude/agents/deploy-agent.md). The project-level agent wins, but both descriptions may load depending on implementation. Keep agents at one level only.
Keep agent descriptions short and action-focused:
```yaml
# Good: 85 characters
description: "Deploy to GCP Cloud Run. Use for staging/production deployments and traffic routing."

# Bad: 240 characters
description: "This agent is a deployment specialist that handles all aspects of deploying to Google Cloud Platform Cloud Run including staging deployments, production deployments, traffic routing, health checks, timeout configuration, and rollback procedures."
```
The model needs to know WHEN to spawn the agent, not everything it can do.
Avoid agents whose job is to optimize or enforce rules on other agents. These “meta-agents” add description overhead to every message while providing marginal value. Put enforcement rules in .claude/rules/ instead. Note that, per the table above, .claude/rules/ content loads with CLAUDE.md on every message, so keep those rules short.
These measurements come from LIMOR AI, a production project with 224 skills, 52 agents, and 5 MCP servers. They illustrate real-world budget pressure.
| Component | Count | Total Description Chars | Approx Tokens |
|---|---|---|---|
| Skills | 224 | 42,165 | ~10,541 |
| Agents | 52 | 9,476 | ~2,369 |
| Duplicate agents (user + project) | 11 | ~2,400 | ~600 |
| MCP tools | 70 | ~14,000 | ~3,500 |
| Total per-message overhead | | ~68,041 | ~17,010 |
Problem: 42,165 skill description characters exceeded even the overridden 40,000 character budget. Some skills were silently excluded from model awareness.
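The token column in the table above is consistent with a simple rule of thumb of about 4 characters per token. The sketch below reproduces the totals under that assumption. `approx_tokens` is a hypothetical helper, and real tokenizer counts will differ.

```bash
# Rough token estimate: ~4 characters per token (heuristic, not a tokenizer).
approx_tokens() {
  echo $(( $1 / 4 ))
}

# skills + agents + duplicate agents + MCP tools, from the table above
total_chars=$(( 42165 + 9476 + 2400 + 14000 ))
echo "chars:  $total_chars"                       # chars:  68041
echo "tokens: $(approx_tokens "$total_chars")"    # tokens: 17010
```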
| Action | Savings |
|---|---|
| Trimmed skill descriptions (removed verbose explanations) | 42,165 -> 37,800 chars (-10%) |
| Removed 18 low-value agents | 52 -> 34 agents (-35%) |
| Eliminated 11 duplicate agents | ~2,400 chars freed |
| Added disable-model-invocation: true to 15 user-only skills | ~3,800 chars removed from budget |
| Result | Skills under 38k budget, agents under 7k chars |
```bash
# Count total skill description bytes (first description line per file;
# wc -c also counts one newline per skill)
find ~/.claude/skills/ .claude/skills/ -name "*.md" -print0 2>/dev/null | \
  xargs -0 grep -h -m1 "^description:" | \
  sed 's/^description: *//;s/^"//;s/"$//' | wc -c

# Count agents and their description lengths (key and quotes stripped)
find ~/.claude/agents/ .claude/agents/ -name "*.md" 2>/dev/null | \
  while IFS= read -r f; do
    desc=$(grep -m1 "^description:" "$f" | sed 's/^description: *//;s/^"//;s/"$//')
    echo "$(basename "$f"): ${#desc} chars"
  done

# Find duplicate agent names across user and project levels
comm -12 \
  <(ls ~/.claude/agents/*.md 2>/dev/null | xargs -I{} basename {} | sort) \
  <(ls .claude/agents/*.md 2>/dev/null | xargs -I{} basename {} | sort)
```
When multiple skills share the same name, the highest-priority level wins:
1. Enterprise skills (organization-managed)
2. Personal skills (~/.claude/skills/)
3. Project skills (.claude/skills/)
When multiple agents share the same name:
1. Managed agents (Anthropic-provided)
2. CLI flag agents (--agent)
3. Project-level agents (.claude/agents/)
4. User-level agents (~/.claude/agents/)
5. Plugin agents
Use this checklist to audit your Claude Code context costs:
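- Is CLAUDE.md lean, with details moved to .claude/rules/ or on-demand files?
- Are total skill description characters within budget (16,000 by default, or SLASH_COMMAND_TOOL_CHAR_BUDGET)?
- Do user-only skills set disable-model-invocation: true?
- Do heavy skills use context: fork for isolation?
- Are enforcement rules in .claude/rules/ instead of agents?

Anthropic’s skill design guide states: “The context window is a public good. Skills share the context window with everything else.”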
This means every character you put into CLAUDE.md, skill descriptions, agent descriptions, and MCP tool schemas competes with the actual work Claude needs to do – reading files, reasoning about code, and generating responses.
The default assumption: Claude is already very smart. Only add context that Claude does not already have. If a rule is common sense (e.g., “write clean code”), it wastes budget without adding value.
Instead of loading everything upfront, use a layered approach:
```text
Level 1: Description (per-message, ~100-200 chars)
  → Tells the model WHEN to use a skill/agent
  → Always loaded, always costs tokens

Level 2: Skill Body (on-invocation, ~2-8k chars)
  → Tells the model HOW to execute the task
  → Only loaded when skill activates

Level 3: Supporting Files (on-demand, unlimited)
  → Deep reference data, examples, large configs
  → Only loaded via Read tool during execution
```
Token savings example: A deployment skill with 500 lines of content costs ~2k tokens at Level 2 – but only when triggered. The same content in CLAUDE.md would cost ~2k tokens on EVERY message.
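To make the difference concrete, the arithmetic can be sketched as below. It uses a deliberately simplified model that counts each load once. The session shape (50 messages, 2 skill invocations) is an assumption chosen for illustration, and `session_cost` is a hypothetical helper.

```bash
# Simplified model: total tokens = tokens per load * number of loads.
session_cost() {
  echo $(( $1 * $2 ))
}

# ~2,000 tokens of deployment content over an assumed 50-message session
echo "in CLAUDE.md (loaded on all 50 messages): $(session_cost 2000 50) tokens"
echo "as a skill (invoked twice):               $(session_cost 2000 2) tokens"
```

Under these assumptions the always-loaded version costs 100,000 tokens against 4,000 for the on-demand version.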
Skills that reference executable scripts should treat them as black boxes:
```markdown
# In SKILL.md:
## Usage
Run `scripts/validate.sh --help` for available options.
DO NOT read the script source into context.
```
Why: Scripts can be very large (hundreds of lines). Reading them into context wastes tokens on implementation details the model doesn’t need. Instead, the model runs --help to discover usage, then executes the script.
Rule: All scripts referenced by skills MUST support --help for self-documentation.
Token savings: A 200-line bash script = ~800 tokens saved by not reading it into context.
Source: Anthropic webapp-testing skill uses this pattern for its test runner scripts.
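A script that follows the `--help` rule might be structured like the skeleton below. This is an illustrative sketch, not the actual webapp-testing script: the `--strict` option, the exit codes, and the placeholder validation step are all assumptions.

```bash
#!/usr/bin/env bash
# Hypothetical scripts/validate.sh skeleton: --help prints usage and returns
# before any real work, so the model can discover options without reading the source.
usage() {
  cat <<'EOF'
Usage: validate.sh [--strict] <target>

Options:
  --strict   Treat warnings as errors (illustrative option)
  --help     Show this help and exit
EOF
}

main() {
  local strict=0
  while [ $# -gt 0 ]; do
    case $1 in
      --help)   usage; return 0 ;;
      --strict) strict=1; shift ;;
      --)       shift; break ;;
      -*)       echo "unknown option: $1" >&2; usage >&2; return 2 ;;
      *)        break ;;
    esac
  done
  # A target is required; print usage to stderr and fail if it is missing.
  [ $# -ge 1 ] || { usage >&2; return 2; }
  echo "validating $1 (strict=$strict)"   # placeholder for the real checks
}

# Entry point when saved as scripts/validate.sh:
#   main "$@"
```

Keeping the usage text in one heredoc means the help output stays the single source of truth for the script's interface.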