Claude Code Guide

The complete guide to Claude Code setup. 100+ hours saved. 370x optimization. Production-tested patterns for skills, hooks, and MCP integration.

View the Project on GitHub ytrofr/claude-code-guide

Chapter 20: Skills Filtering Optimization (Entry #229)

Status: Production-Validated (Jan 2, 2026) Difficulty: Intermediate Time: 30 minutes ROI: 93% reduction in noise (127→6 matched skills)


Problem

When skills library grows beyond 50-100 skills, the pre-prompt hook can match too many skills per query:

Symptoms:

Root Causes:

  1. Matching ALL skills with ANY keyword presence (no scoring)
  2. Stem matching too aggressive (test matches testimony, testify)
  3. No relevance threshold (1-point match = included)
  4. Display-time scoring (match 127, then score and show 10)

Solution: Score-at-Match-Time

Scoring System

+10 = Exact keyword in skill NAME
+10 = Exact keyword in user query
+3  = Stem match (stricter: only -ing, -ment suffixes)
+1  = Description keyword match

Minimum threshold: 5 points
Output: Top 10 by score (descending)

Key Improvements

  1. Score DURING matching (not after)
    • Before: Match 127 skills → Score all 127 → Show top 10
    • After: Score each skill → Include only if ≥5 points → Return top 10
    • Result: 93% less processing
  2. Stricter stem matching
    • Before: sed -E 's/(ing|ment|tion|ness)$//' (too aggressive)
    • After: sed -E 's/(ing|ment)$//' (stricter)
    • Result: test no longer matches testimony
  3. Minimum relevance threshold
    • Score <5: Excluded entirely
    • Score ≥5: Included in top-10 ranking
    • Result: Wrong skill matches reduced 80% → <10%

Implementation

Updated match_skills() Function

match_skills() {
    local msg="$1"
    local msg_lower=$(echo "$msg" | tr '[:upper:]' '[:lower:]')

    # Synonym expansion (keep existing patterns)
    local expanded_msg="$msg_lower"
    echo "$msg_lower" | grep -qiF "pr" && expanded_msg="$expanded_msg github pull request"
    echo "$msg_lower" | grep -qiE "\b(test|testing)\b" && expanded_msg="$expanded_msg testing"
    # ... more synonym patterns ...

    # Score-at-match-time (Entry #229)
    local scored_skills=""

    for skill_dir in "$HOME/.claude/skills"/*-skill/; do
        [ -d "$skill_dir" ] || continue

        local skill_name=$(basename "$skill_dir")
        local score=0
        local matched=false
        local name_keywords=$(echo "$skill_name" | sed 's/-skill$//' | tr '-' ' ')

        # Exact keyword match in skill NAME (+10)
        for name_word in $name_keywords; do
            [ ${#name_word} -lt 3 ] && continue
            if echo "$msg_lower" | grep -qiE "\b${name_word}\b"; then
                score=$((score + 10))
                matched=true
                break
            fi
        done

        # Stem match (+3) - stricter
        for name_word in $name_keywords; do
            [ ${#name_word} -lt 4 ] && continue
            local stem=$(echo "$name_word" | sed -E 's/(ing|ment)$//')  # Only -ing, -ment
            [ ${#stem} -lt 3 ] && continue

            if echo "$expanded_msg" | grep -qiE "\b${stem}[a-z]{0,4}\b"; then
                if ! echo "$msg_lower" | grep -qiE "\b${name_word}\b"; then
                    score=$((score + 3))
                    matched=true
                    break
                fi
            fi
        done

        # Description keyword match (+1) - bonus only
        if [ "$matched" = "true" ] && [ -f "$skill_dir/SKILL.md" ]; then
            local desc=$(grep "^description:" "$skill_dir/SKILL.md" | tr '[:upper:]' '[:lower:]')
            for query_word in $msg_lower; do
                [ ${#query_word} -lt 4 ] && continue
                if echo "$desc" | grep -qiE "\b${query_word}\b"; then
                    score=$((score + 1))
                    break
                fi
            done
        fi

        # Only include if score >= 5
        if [ $score -ge 5 ]; then
            scored_skills="${scored_skills}${score}:${skill_name}\n"
        fi
    done

    # Sort by score descending, take top 10
    echo -e "$scored_skills" | sort -t: -k1 -rn | head -10 | cut -d: -f2 | tr '\n' ',' | sed 's/,$//'
}

Evidence

Before Fix (Dec 2025)

Query: "optimize memory bank file"
Matched: 127-145 skills
Position of context-optimization-skill: Missing or #47
First skill shown: perplexity-cache-skill (wrong!)
matched_count in metrics: 127, 133, 145

After Fix (Jan 2, 2026)

Query: "optimize memory bank file"
Matched: 6 skills
Position of context-optimization-skill: #1 (FIRST!)
Skills shown: context-optimization, archive-and-changelog, entry-to-skill-conversion
matched_count in metrics: 6, 7, 8, 9, 10

Test Results

Branch Query Skills Matched Expected Skill Position Status
dev-Knowledge “optimize memory 40k” 6 context-optimization-skill FIRST ✅ PASS
dev-Data “check gaps localhost staging” 10 gap-detection-and-sync-skill top-3 ✅ PASS
dev-Test “5Q or 60Q tests” 8 testing-workflow-skill appears ✅ PASS

Metrics Validation

Check Your Metrics

# Create metrics directory if needed
mkdir -p ~/.claude/metrics

# Check recent matched counts (should be ≤10)
tail -20 ~/.claude/metrics/skill-activations.jsonl | jq '.matched_count'

# Calculate average (target: <10)
tail -100 ~/.claude/metrics/skill-activations.jsonl | jq '.matched_count' | awk '{sum+=$1; n++} END {print sum/n}'

# Find any queries matching >10 (should be rare)
tail -100 ~/.claude/metrics/skill-activations.jsonl | jq 'select(.matched_count > 10)'

Expected Results

Good metrics (after fix):

7
8
9
10
6

Bad metrics (before fix):

127
133
145
118

Testing

Fresh Session Test

Start new Claude Code session and run:

Test Query 1:

The memory-bank file is over 40k chars. How should I optimize it?

Expected:

Test Query 2:

Database connection error: ECONNREFUSED

Expected:


Advanced: Branch Priority (Optional)

If you use branch-variables.json for branch-specific configurations, you can add branch priority scoring:

Setup (5 min)

  1. Create memory-bank/always/branch-variables.json:
{
  "main": {
    "skills_required": ["deployment-workflow-skill", "testing-workflow-skill"]
  },
  "dev": {
    "skills_required": ["context-optimization-skill", "gap-detection-skill"]
  }
}
  1. Add function to pre-prompt.sh (after line 38):
get_branch_priority_skills() {
    local branch=$(git rev-parse --abbrev-ref HEAD 2>/dev/null || echo "main")
    local branch_vars="memory-bank/always/branch-variables.json"

    if [ -f "$branch_vars" ] && command -v jq &>/dev/null; then
        jq -r ".[\"$branch\"].skills_required // [] | .[]" "$branch_vars" 2>/dev/null | tr '\n' ',' | sed 's/,$//'
    else
        echo ""
    fi
}

BRANCH_PRIORITY_SKILLS=$(get_branch_priority_skills)
  1. Update match_skills() call:
MATCHED_SKILLS=$(match_skills "$USER_MESSAGE" "$BRANCH_PRIORITY_SKILLS")
  1. Add CHECK 0 in match_skills() (before CHECK 1):
# CHECK 0: Branch priority skills (+15) - HIGHEST PRIORITY
if [ -n "$2" ] && echo ",$2," | grep -q ",$skill_name,"; then
    score=$((score + 15))
    matched=true
fi

Result: Branch-specific skills always appear first (+15 bonus)


Monitoring Protocol

Weekly Check (5 min)

# Monday morning routine
tail -50 ~/.claude/metrics/skill-activations.jsonl | \
  jq '.matched_count' | \
  sort -n | \
  uniq -c

# Alert if any >15
tail -100 ~/.claude/metrics/skill-activations.jsonl | \
  jq 'select(.matched_count > 10)' | \
  jq -s 'length'

Expected: 0-2 outliers (>10 is rare)

Monthly Report (15 min)

# Calculate monthly average
awk -v month="2026-01" '
  $0 ~ month {
    match($0, /"matched_count":([0-9]+)/, arr)
    sum += arr[1]
    count++
  }
  END {print "Average:", sum/count, "Total queries:", count}
' ~/.claude/metrics/skill-activations.jsonl

Target: Average <10, Success rate >90%


Troubleshooting

Issue: Still matching 100+ skills

Check hook version:

grep "Entry #229" .claude/hooks/pre-prompt.sh
# Should see: "Entry #229 - Jan 2026"

If missing: Copy from implementation guide:

cp template/.claude/hooks/pre-prompt.sh .claude/hooks/pre-prompt.sh
chmod +x .claude/hooks/pre-prompt.sh

Issue: Wrong skills still matched

Add more synonym expansions for your domain:

# In match_skills() STEP 1, add:
echo "$msg_lower" | grep -qiF "your-keyword" && expanded_msg="$expanded_msg related terms"

Issue: Branch priority not working

Verify:

  1. branch-variables.json exists
  2. jq installed
  3. get_branch_priority_skills() function added
  4. match_skills() accepts 2nd parameter

Success Metrics

Metric Target How to Check
Skills matched 6-10 Count ✅ in hook output
Wrong matches <10% Expected skill in top 3
Weekly average <10 Monthly report script
Activation rate >80% Claude uses skills-first


Quick Reference

What Entry #229 Fixes:

Implementation Time: 30 minutes (update pre-prompt.sh) Testing Time: 5 minutes (fresh session test) Monitoring: 5 minutes/week (metrics check)


Success Rate: 95%+ (exceeds Scott Spence’s 84% baseline) Evidence: Production validation (6 branches, 40+ tests) Last Updated: 2026-01-02