This file contains 4 hands-on exercises designed to reinforce core CCA exam concepts through practical implementation. Each exercise includes an objective, detailed steps, and links to relevant domain notes.
Exercise 1: Build a Multi-Tool Agent with Escalation Logic
Objective: Practice agentic loop design, tool integration, structured error handling, and escalation patterns. By the end, you'll have a working agent that makes intelligent decisions about when to handle issues autonomously vs. escalate to humans.
Context: You're building a customer support agent that processes refund requests. The agent has access to get_customer, lookup_order, process_refund, and escalate_to_human tools. The challenge is implementing decision logic and error handling that improves first-contact resolution while maintaining safety.
Steps:
- Define 3-4 MCP Tools with Detailed Descriptions
- Create tool definitions for:
  - `get_customer` (input: customer_name or email; returns: customer_id, account_status, history)
  - `lookup_order` (input: customer_id + order_id; returns: order details, damage report, purchase date)
  - `process_refund` (input: customer_id, order_id, reason, amount; returns: confirmation or error)
  - `escalate_to_human` (input: reason, context; returns: ticket_id)
- Include in each tool description: input formats (required vs. optional fields), example queries, edge cases, and clear boundaries (when NOT to use)
- Example: "lookup_order: Requires customer_id (from get_customer) and order_id. Does NOT accept customer name. Edge case: fails if order_id is malformed. Use only after customer identity is verified."
- Document what each tool should NOT do (explicit boundaries reduce ambiguity)
- Implement an Agentic Loop Checking `stop_reason`
- Create a loop that processes user messages and calls tools until `stop_reason = "end_turn"`
- Implement each tool call with its response handling
- Log all tool calls and responses to understand agent behavior
- Add a max iteration limit (e.g., 10 calls) to prevent infinite loops
- Example structure: `while stop_reason != "end_turn" and iterations < 10: call_tool() → check_response() → continue_loop()`
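The loop structure above can be sketched in Python. This is a minimal, self-contained sketch: `fake_model` is a hypothetical stand-in for a real model client, so the control flow is runnable on its own; a real loop would inspect the `stop_reason` on the API response instead.

```python
# Sketch of the agentic loop; fake_model is a hypothetical stand-in for a
# real model client so the control flow is runnable on its own.

def fake_model(messages):
    # Pretend the model requests a tool twice, then ends its turn.
    tool_turns = sum(1 for m in messages if m["role"] == "tool")
    if tool_turns < 2:
        return {"stop_reason": "tool_use", "tool": "lookup_order"}
    return {"stop_reason": "end_turn", "text": "Refund processed."}

def run_agent(user_message, max_iterations=10):
    messages = [{"role": "user", "content": user_message}]
    tool_log = []  # log every tool call to understand agent behavior
    for _ in range(max_iterations):
        response = fake_model(messages)
        if response["stop_reason"] == "end_turn":
            return response["text"], tool_log
        tool_log.append(response["tool"])
        messages.append({"role": "tool", "content": f"result of {response['tool']}"})
    # Iteration cap reached: fail safe rather than loop forever.
    return "escalated: iteration limit reached", tool_log
```

The iteration cap turns a potential infinite loop into an explicit escalation, which is the safety property this step is testing.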
- Add Structured Error Responses
- Define an error response format: `{errorCategory: "validation_error" | "timeout" | "business_rule" | "unknown", isRetryable: boolean, message: string, suggestion: string}`
- Implement error handling in tool responses (e.g., if `process_refund` returns amount > $500, set errorCategory = "business_rule", isRetryable = false, suggestion = "escalate to supervisor")
- Test error responses: invalid customer IDs, orders not found, refund amounts out of policy
- Ensure the agent interprets these structured errors and decides whether to retry or escalate
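Once errors follow that shape, the retry-or-escalate decision reduces to a small function. A sketch, using the category names from the format above:

```python
def handle_tool_error(error):
    # error is a structured response of the form:
    # {errorCategory, isRetryable, message, suggestion}
    if error["isRetryable"]:
        return "retry"
    if error["errorCategory"] == "business_rule":
        # Non-retryable policy violations go to a human.
        return "escalate"
    return "report_failure"
```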
- Implement a Hook for Business Rule Enforcement (Block Refunds Above a Threshold)
- Create a PreToolUse hook that intercepts `process_refund` calls (the hook must run before execution in order to block it)
- Add logic: if amount > $500, block the call and return a structured error instead of executing
- Ensure that even if the agent tries to call process_refund directly, it cannot bypass the $500 limit without escalation
- This demonstrates how hooks enforce deterministic business rules that prompts alone cannot guarantee
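The guard logic itself is simple. The sketch below uses a hypothetical in-process hook interface for clarity (real Claude Code hooks are external commands that receive tool-call JSON on stdin, but the decision logic is the same):

```python
REFUND_LIMIT = 500.0  # business rule from the exercise

def pre_tool_use_hook(tool_name, tool_input):
    # Deterministic guard: runs before the tool executes, so the model
    # cannot talk its way past the limit.
    if tool_name == "process_refund" and tool_input.get("amount", 0) > REFUND_LIMIT:
        return {
            "allowed": False,
            "error": {
                "errorCategory": "business_rule",
                "isRetryable": False,
                "message": f"Refund amount exceeds the ${REFUND_LIMIT:.0f} limit",
                "suggestion": "escalate to supervisor",
            },
        }
    return {"allowed": True}
```

Because the hook is code, not a prompt, the $500 rule holds even if the model's instructions are ignored or jailbroken.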
- Test with Multi-Concern Messages
- Create test messages with multiple concerns: "My order arrived damaged AND I want to return it for a refund AND I have questions about my account"
- Verify the agent uses get_customer first to establish identity
- Verify it can handle sequential concerns (damage → lookup_order → process_refund)
- Test edge case: agent tries to call lookup_order before get_customer (should be blocked by hook or tool description)
- Verify the agent escalates when appropriate (e.g., refund > $500, damaged item requires supervisor approval)
Domains Reinforced:
- Agentic Architecture & Orchestration (agentic loops, escalation patterns, tool ordering)
- Tool Design & MCP Integration (tool definitions, descriptions, boundaries)
- Context Management & Reliability (error handling, structured responses)
Exercise 2: Configure Claude Code for a Team Development Workflow
Objective: Practice setting up a complete Claude Code environment with project-level configuration, path-specific rules, team commands, and MCP server integration. You'll learn how CLAUDE.md hierarchies, rules, and commands scale across a team.
Context: Your team has 3 developers working on a full-stack app with frontend (React), backend (Python Flask), and infrastructure (Terraform). Different areas have different coding conventions, dependencies, and tooling. You want Claude Code to automatically apply the right conventions in each area without developers thinking about it.
Steps:
- Create Project-Level CLAUDE.md
- Create `/CLAUDE.md` at the project root with global context: "This is a full-stack app. Frontend: React + Jest. Backend: Python 3.11 + Pytest. Infra: Terraform + AWS. Common conventions: error handling, logging, testing."
- Add instructions for when to ask for clarification: "Before writing infrastructure code, ask if this is for dev/staging/prod. Before database migrations, clarify impact scope."
- This file applies globally to all work in the project
Create .claude/rules/ with YAML Frontmatter Glob Patterns
- Create `/app/frontend/.claude/rules/frontend.yaml` with frontmatter `path: "frontend/**"` and rules for React (components in `src/components/`, tests in `__tests__/`, use Jest, import styles as modules)
- Create `/app/backend/.claude/rules/backend.yaml` with frontmatter `path: "backend/**"` and rules for Python (Flask conventions, tests in `tests/`, use pytest, type hints required)
- Create `/infra/.claude/rules/terraform.yaml` with frontmatter `path: "infra/**"` and rules for Terraform (variables in `variables.tf`, outputs in `outputs.tf`, comments required on all resources)
- Test that these rules apply automatically regardless of file location (e.g., `/app/frontend/hooks/useData.js` matches `frontend/**`)
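A rule file matching the description above might look like the following sketch. The frontmatter key (`path`) is taken from the exercise text; verify the exact rule-file schema against your Claude Code version before relying on it:

```yaml
# /app/frontend/.claude/rules/frontend.yaml (sketch)
---
path: "frontend/**"
---
# Applied automatically to any file matching the glob:
# - React components live in src/components/
# - Tests live in __tests__/ and use Jest
# - Import styles as CSS modules
```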
- Create a Project-Scoped Skill with Context and Allowed Tools
- Create `/app/.claude/skills/SKILL.md` with a skill definition, e.g., a "code-review" skill
- Set `context: fork_session` so the skill creates a new session for isolated code analysis
- Set `allowed-tools: ["Read", "Write", "Bash", "Grep"]` to limit what the skill can do (no deletion, no external API calls)
- Document when developers should invoke this skill: "Use /code-review before opening a PR to catch issues early"
- This demonstrates how to create safe, repeatable team processes
- Configure an MCP Server in .mcp.json with Env Var Expansion, Plus a Personal Server
- Create `/.mcp.json` at the project root with: `servers: [{"type": "stdio", "command": "mcp-server-github", "env": {"GITHUB_TOKEN": "${GITHUB_TOKEN}", "GITHUB_ORG": "my-org"}}]`
- Set up environment variable expansion so `GITHUB_TOKEN` is loaded from the developer's shell environment (not hardcoded)
- Create `~/.claude.json` (personal config) with a personal MCP server: `{"servers": [{"type": "stdio", "command": "mcp-server-personal-tools"}]}`
- Test that both project servers (from `.mcp.json`) and personal servers (from `~/.claude.json`) are available to Claude
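Written out as a file, the project-level config might look like this. It is shown using the `mcpServers` map keyed by server name (the shape Claude Code's `.mcp.json` uses, which differs slightly from the inline `servers:` snippet in the step); `mcp-server-github` is the placeholder command from the exercise:

```json
{
  "mcpServers": {
    "github": {
      "type": "stdio",
      "command": "mcp-server-github",
      "env": {
        "GITHUB_TOKEN": "${GITHUB_TOKEN}",
        "GITHUB_ORG": "my-org"
      }
    }
  }
}
```

The `${GITHUB_TOKEN}` syntax is what keeps the secret out of version control: the value is resolved from each developer's shell environment at load time.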
- Test Plan Mode vs. Direct Execution
- Run a complex task in plan mode: "Refactor the user authentication system to support OAuth 2.0" and observe plan mode exploring the codebase, identifying files, and designing the approach
- Run the same task in direct execution mode (without plan) and observe the difference in approach
- Run a simple task ("Add a console.log statement to App.js") in direct execution and verify it completes quickly without planning
- Note that plan mode is for architectural decisions; direct execution is for straightforward changes
Domains Reinforced:
- Claude Code Configuration & Workflows (CLAUDE.md, .claude/rules/, .claude/commands/, .claude/skills/)
- Claude Code Architect (hierarchies, path-specific rules, skill definition)
- Tool Design & MCP Integration (MCP server configuration, environment variables)
Exercise 3: Build a Structured Data Extraction Pipeline
Objective: Practice JSON schema design, tool_use in structured output, validation-retry loops, and batch processing at scale. You'll build a pipeline that reliably extracts data from varied document formats and handles failures gracefully.
Context: Your company processes invoices from different vendors. Each invoice has a different format, but you need to extract: vendor_name, invoice_number, total_amount, line_items, tax_amount, payment_terms. Some invoices are scanned PDFs (text-only), others are digital. You need to handle extraction failures and route low-confidence extractions for human review.
Steps:
- Define Extraction Tool with JSON Schema
- Create a tool schema with:
  - Required fields: `vendor_name` (string), `invoice_number` (string), `total_amount` (number)
  - Optional fields: `line_items` (array of {description, quantity, unit_price}), `tax_amount` (number), `payment_terms` (string)
  - Enum with "other": `document_type: enum ["invoice", "receipt", "quote", "other"]`, which allows "other" when a document doesn't fit the standard types
  - Nullable fields: `po_number?: string | null` (the field is present but the value might be missing from the document)
- This schema allows flexible handling of varied document formats
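Following the Messages API tool convention of a JSON Schema under `input_schema`, the extraction tool could be sketched as (field names from the exercise; tighten constraints as your documents require):

```json
{
  "name": "extract_invoice",
  "description": "Extract structured fields from an invoice document.",
  "input_schema": {
    "type": "object",
    "properties": {
      "vendor_name": {"type": "string"},
      "invoice_number": {"type": "string"},
      "total_amount": {"type": "number"},
      "line_items": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "description": {"type": "string"},
            "quantity": {"type": "number"},
            "unit_price": {"type": "number"}
          }
        }
      },
      "tax_amount": {"type": "number"},
      "payment_terms": {"type": "string"},
      "document_type": {"type": "string", "enum": ["invoice", "receipt", "quote", "other"]},
      "po_number": {"type": ["string", "null"]}
    },
    "required": ["vendor_name", "invoice_number", "total_amount"]
  }
}
```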
- Implement a Validation-Retry Loop
- Create a loop that sends a document to Claude with the extraction tool
- First attempt: Extract data from the document using the JSON schema
- Validation: Check if required fields are present and reasonable (e.g., total_amount > 0)
- On failure (e.g., missing vendor_name, total_amount = 0): Send the document + failed extraction + validation error back to Claude with a retry prompt: "The extraction failed because vendor_name is missing. Re-examine the document and try again. Look for company letterhead, signatures, or 'Bill From' sections."
- On success: Return the extracted data
- Test with documents where vendor_name is in unusual locations (footer, watermark, small print)
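The validation-retry loop can be sketched as follows. The `extract_fn` callable is a stand-in for a real model call, which keeps the retry logic testable on its own:

```python
def validate(extraction):
    # Return a list of validation errors; an empty list means valid.
    errors = []
    if not extraction.get("vendor_name"):
        errors.append("vendor_name is missing")
    if extraction.get("total_amount", 0) <= 0:
        errors.append("total_amount must be > 0")
    return errors

def extract_with_retry(document, extract_fn, max_attempts=3):
    prompt = document
    for _ in range(max_attempts):
        extraction = extract_fn(prompt)
        errors = validate(extraction)
        if not errors:
            return extraction
        # Feed the failure back so the model can re-examine the document.
        prompt = (f"{document}\n\nPrevious extraction failed: {'; '.join(errors)}. "
                  "Re-examine the document, including letterhead, signatures, "
                  "and 'Bill From' sections.")
    return None  # still failing after retries: route to human review
```

Returning `None` after exhausting retries is the hand-off point to the human review routing built later in this exercise.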
- Add Few-Shot Examples for Varied Document Formats
- Create examples showing extraction from:
- A standard digital invoice with clear fields
- A scanned PDF with skewed text and OCR errors
- A receipt with minimal information
- An invoice with multiple line items and taxes
- Each example shows: input (document text), expected output (JSON), and any special handling (e.g., "Tax was listed as '10% of subtotal,' calculate from subtotal amount")
- Include edge cases: "If vendor_name is in a logo, use nearby text; if no clear vendor found, use 'other'."
- These examples improve extraction accuracy across document types
- Design Batch Processing with the Message Batches API
- Create a batch of 100 invoices to process
- Use `custom_id` for tracking: `invoice_12345` maps back to the invoice ID
- Implement batch request chunking: if a document is oversized (> 100 KB), split it into smaller chunks before batching
- Calculate the processing SLA: the Batch API completes within 24 hours and costs 50% less, making it suitable for overnight batch processing
- Implement failure handling by custom_id: if `invoice_12345` fails (validation error), retry that specific invoice with additional context
- Example: process 100 invoices in one batch, track failures by custom_id, and queue failed invoices for retry with augmented prompts
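Failure handling by custom_id reduces to mapping batch results back to invoice IDs. In this sketch the result dicts and the `status` values are simplified placeholders; check the Batches API reference for the actual per-request result shape:

```python
def collect_retries(batch_results):
    # batch_results: one simplified entry per request,
    # e.g. {"custom_id": "invoice_12345", "status": "errored"}.
    # Because custom_id encodes the invoice ID, failures map straight
    # back to a retry queue of invoice IDs.
    retry_queue = []
    for result in batch_results:
        if result["status"] != "succeeded":
            retry_queue.append(result["custom_id"].removeprefix("invoice_"))
    return retry_queue
```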
- Implement Human Review Routing with Field-Level Confidence Scores
- Add a confidence scoring step after extraction: For each field, score 0.0-1.0 based on how clear the document was
- Example: `{vendor_name: 0.95, invoice_number: 0.9, total_amount: 0.85, tax_amount: 0.3}`
- Route to human review if:
- Any field has confidence < 0.7 (unclear extraction)
- Validation warning but not error (e.g., vendor_name found but in unusual location)
- Document type = "other" (doesn't fit standard format)
- Store human review feedback as examples for future improvements
- This creates a feedback loop where hard cases improve the system over time
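The three routing rules above can be written as a single predicate (threshold and field names taken from the exercise):

```python
CONFIDENCE_THRESHOLD = 0.7

def needs_human_review(extraction, confidences, warnings):
    # Route to a human if the document type fell through to "other",
    # if validation produced warnings (soft failures), or if any
    # field-level confidence score is below the threshold.
    if extraction.get("document_type") == "other":
        return True
    if warnings:
        return True
    return any(score < CONFIDENCE_THRESHOLD for score in confidences.values())
```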
Domains Reinforced:
- Prompt Engineering & Structured Output (JSON schema, structured output, tool_use)
- Context Management & Reliability (validation, error handling, retry logic)
- Claude Code Configuration & Workflows (batch processing, API selection)
Exercise 4: Design and Debug a Multi-Agent Research Pipeline
Objective: Practice orchestrating subagents with proper context passing, error propagation, and provenance tracking. You'll build a research system where a coordinator delegates work to subagents and synthesizes conflicting information.
Context: You're building a research system to answer "What are the latest developments in AI safety?" A coordinator breaks this into subagent tasks, gathers results, and synthesizes findings. Challenges: subagents might return contradictory data, some subagents time out, and the synthesis must track sources.
Steps:
- Build Coordinator with 2+ Subagents (AgentDefinition + Task Tool)
- Create a coordinator AgentDefinition that can call the Task tool to spawn subagents
- Define 2-3 subagents:
- Subagent 1: "Find recent academic papers on AI safety alignment"
- Subagent 2: "Find industry reports and policy statements on AI safety"
- (Optional) Subagent 3: "Find news articles on recent AI safety incidents"
- Set `allowedTools` in the coordinator to include Task (to call subagents) and Read/Grep (to synthesize)
- Create the coordinator prompt: "You are coordinating research on AI safety. Use the Task tool to ask subagents for specific information. Their allowedTools are limited to web search and document reading. Synthesize their findings into a coherent report."
- Document what each subagent CAN do: "Subagent 1 can: search papers, read PDFs, cite sources. Subagent 1 cannot: run code, access databases."
- Implement Parallel Subagent Execution
- In the coordinator's first response, make multiple Task tool calls in one turn (not sequential)
- Example:
  - Task: "Search for papers on AI alignment published in the last 6 months. Return results with title, author, date, URL."
  - Task: "Find policy statements from OpenAI, Anthropic, and DeepMind on AI safety. Return title, date, key points."
  - Task: "Search news for AI safety incidents in the last 3 months. Return headline, date, summary."
- This is more efficient than waiting for each subagent to complete before starting the next
- Verify all Task calls execute in parallel (not sequentially) by checking timestamps
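The fan-out pattern can be simulated with `asyncio`; here `run_subagent` is a stand-in for a Task tool call (not the real SDK), and the point is that all three launch before any of them completes:

```python
import asyncio

async def run_subagent(task_prompt):
    # Stand-in for a Task tool call; a real coordinator would dispatch
    # a subagent here and await its report.
    await asyncio.sleep(0.01)
    return f"findings for: {task_prompt}"

async def coordinator(tasks):
    # Fire all Task calls in one turn and gather results concurrently,
    # rather than awaiting each subagent sequentially.
    return await asyncio.gather(*(run_subagent(t) for t in tasks))

results = asyncio.run(coordinator(["papers", "policy", "news"]))
```

With sequential awaits the total latency would be the sum of subagent latencies; with `gather` it is roughly the maximum.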
- Design Structured Subagent Output
- Define the format for subagent responses:

  ```json
  {
    "claim": "String describing the finding",
    "evidence": ["Supporting fact 1", "Supporting fact 2"],
    "source_url": "URL or citation",
    "publication_date": "ISO 8601 date",
    "confidence": 0.8
  }
  ```

- This structure makes synthesis easier (evidence is explicitly separated from claims)
- Include in subagent prompts: "Format each finding as: claim, evidence, source URL, date. Confidence 0.0-1.0."
- Test with varied outputs: some subagents return full structured data, others return partial data (no URL), others return prose that needs parsing
- Implement Error Propagation
- Simulate a subagent timeout: Subagent 2 times out after 30 seconds trying to fetch a policy document
- Expect a structured error: `{errorType: "timeout", attemptedQuery: "OpenAI policy on AI safety", partialResults: ["found 1 policy from Anthropic"], suggestedAlternatives: ["search Anthropic and DeepMind only", "search in smaller time increments"]}`
- Verify the coordinator receives this error context and decides whether to:
- Proceed with partial results (Anthropic + DeepMind, skip OpenAI)
- Retry Subagent 2 with narrower scope
- Escalate and note "OpenAI policy not available"
- Test with multiple subagent failures and verify the coordinator can work with partial results
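The coordinator's three options map onto a small decision function over the structured error (field names from the example above; the priority order is one reasonable choice, not the only one):

```python
def handle_subagent_error(error):
    # error: {errorType, attemptedQuery, partialResults, suggestedAlternatives}
    if error.get("partialResults"):
        # Option 1: proceed with what we have and note the gap.
        return ("proceed_with_partial", error["partialResults"])
    if error.get("suggestedAlternatives"):
        # Option 2: retry with a narrower scope suggested by the subagent.
        return ("retry_narrower", error["suggestedAlternatives"][0])
    # Option 3: escalate and note what was unavailable.
    return ("escalate", error["attemptedQuery"])
```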
- Test with Conflicting Source Data
- Create test case: Subagent 1 finds paper saying "Alignment approach X is infeasible," Subagent 2 finds industry statement saying "Approach X is promising"
- Verify synthesis preserves both findings with attribution: "Academia argues X is infeasible (cite: Paper 1), while industry reports X is promising (cite: Statement 1). This discrepancy reflects different evaluation methodologies."
- Do NOT let the coordinator pick one side; ensure both perspectives are preserved
- Test with 3+ conflicting sources and verify the synthesis presents them clearly rather than averaging or contradicting
Domains Reinforced:
- Agentic Architecture & Orchestration (coordinator design, subagent orchestration, task decomposition, error propagation)
- Context Management & Reliability (context passing, error handling, partial results)
- Claude Agent SDK (Task tool, AgentDefinition, allowedTools, stop_reason)
How to Use These Exercises
- Do them in order: Exercises 1-4 progress from single-agent patterns → team configuration → data extraction → multi-agent orchestration
- Time allocation: 1-2 hours per exercise
- Test thoroughly: Include test cases that should fail (to verify error handling) and test cases that should succeed
- Document your setup: Take screenshots or notes of your final configurations; these are good interview preparation materials
- Relate to real work: After each exercise, think about how you'd apply it to a real project you've worked on