Testing Best Practices

Purpose and Scope

This page provides practical guidance for testing ADK agents, tools, and services. It covers unit testing, integration testing, and ADK-specific testing patterns to help developers maintain code quality and prevent regressions.

For information about the formal evaluation framework (EvalSet, EvalCase, metrics), see Evaluation Framework. For details on evaluation metrics, see Evaluation Metrics. For conformance testing capabilities, see Conformance Testing.


Testing Strategy Overview

ADK testing follows a multi-layered approach: fast, isolated unit tests that rely on mocks and fixtures; integration and end-to-end tests driven through the Runner and the ADK Web development UI; and systematic quality assessment through the evaluation framework.

Sources: CONTRIBUTING.md79-126 README.md173-186


Unit Testing

Framework and Structure

ADK uses pytest as the testing framework. Unit tests are located under tests/unittests/ following existing naming conventions.

Key Requirements:

  • Tests should be fast and isolated
  • Use mocks or fixtures for external dependencies
  • Include docstrings or comments for complex scenarios
  • Cover new features, edge cases, error conditions, and typical use cases

Sources: CONTRIBUTING.md84-101

Running Unit Tests

Sources: CONTRIBUTING.md173-186 README.md173-178
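
Unit tests can be run directly with pytest from the repository root. A minimal sketch, assuming the tests/unittests/ layout described above (the equivalent command line is pytest tests/unittests -q):

```python
# Minimal sketch: invoke pytest on the unit test directory programmatically,
# equivalent to running `pytest tests/unittests -q` from the repository root.
import sys

import pytest

if __name__ == "__main__":
    sys.exit(pytest.main(["tests/unittests", "-q"]))
```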

Testing Agent Components

When testing agents, focus on isolating the agent logic from LLM calls and service dependencies:

Testing Patterns:

  • Use in-memory services to avoid external dependencies
  • Mock LLM responses to test agent logic deterministically
  • Verify event streams contain expected event types and content
  • Test state transitions across invocations
  • Test error conditions and edge cases

Sources: CONTRIBUTING.md89-101
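
A minimal sketch of the mocking pattern above; route_request and llm_client are hypothetical names standing in for your own agent wiring, so the example stays self-contained and deterministic:

```python
# Minimal sketch: isolate agent logic from the LLM by replacing the model
# call with a mock. `route_request` and `llm_client` are hypothetical names
# used only for illustration; substitute your own agent wiring.
from unittest import mock

import pytest


async def route_request(llm_client, user_text: str) -> str:
    """Toy 'agent logic': ask the model for a label, then branch on it."""
    label = await llm_client.classify(user_text)
    return "escalate" if label == "complaint" else "answer"


@pytest.mark.asyncio
async def test_route_request_escalates_complaints():
    llm_client = mock.AsyncMock()
    llm_client.classify.return_value = "complaint"  # deterministic LLM output

    decision = await route_request(llm_client, "This product broke on day one")

    assert decision == "escalate"
    llm_client.classify.assert_awaited_once()
```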

Testing Tools

Tools should be tested independently from agents:

| Test Aspect | What to Test | Example |
|---|---|---|
| Execution | Tool executes successfully with valid inputs | Tool returns expected result |
| Input Validation | Tool handles invalid inputs gracefully | Raises appropriate errors |
| Authentication | OAuth2 flows work correctly | Credential exchange succeeds |
| Confirmation | Confirmation flow triggers when required | User approval requested |
| Error Handling | Tool handles failures and timeouts | Returns error information |
| Tool Context | ToolContext provides session access | State read/write works |

Sources: CONTRIBUTING.md89-101 src/google/adk/tools/base_tool.py
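
Building on the table above, a minimal sketch of testing a tool in isolation; get_weather is a hypothetical function tool used only for illustration:

```python
# Minimal sketch: exercise a function tool directly, without an agent.
# `get_weather` is a hypothetical tool used only for illustration.
import pytest


def get_weather(city: str) -> dict:
    """Toy tool: returns a canned forecast, rejects empty input."""
    if not city:
        raise ValueError("city must be a non-empty string")
    return {"city": city, "forecast": "sunny"}


def test_get_weather_returns_expected_result():
    result = get_weather("Paris")
    assert result == {"city": "Paris", "forecast": "sunny"}


def test_get_weather_rejects_empty_input():
    with pytest.raises(ValueError):
        get_weather("")
```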

Testing Services

Service implementations should be tested with focus on data persistence and retrieval:

Testing Considerations:

  • Test with in-memory implementations for speed
  • Verify state hierarchy (app, user, session, temp)
  • Test state delta extraction
  • Test event filtering and history management
  • Test artifact versioning

Sources: CONTRIBUTING.md89-101
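
A sketch of a session-service round-trip test covering the points above. It assumes InMemorySessionService exposes async create_session/get_session keyed by app_name, user_id, and session_id, as in current releases; verify the exact signatures against the version you are using:

```python
# Sketch of a session-service round-trip test. Assumes InMemorySessionService
# exposes async create_session/get_session keyed by app_name, user_id, and
# session_id; verify against the current ADK API before relying on it.
import pytest

from google.adk.sessions import InMemorySessionService


@pytest.mark.asyncio
async def test_session_state_round_trip():
    service = InMemorySessionService()

    created = await service.create_session(
        app_name="test_app",
        user_id="user_1",
        state={"user:theme": "dark"},  # user-scoped key in the state hierarchy
    )

    fetched = await service.get_session(
        app_name="test_app",
        user_id="user_1",
        session_id=created.id,
    )

    assert fetched is not None
    assert fetched.state.get("user:theme") == "dark"
```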

Mocking and Fixtures

Best Practices:

  • Use fixtures to avoid external dependencies
  • Mock LLM calls to eliminate non-determinism and API costs
  • Use InMemorySessionService and InMemoryArtifactService for testing
  • Create reusable fixtures for common test data
  • Mock tool execution with mock_tool_output in eval cases

Sources: CONTRIBUTING.md96-99 src/google/adk/evaluation/evaluation_constants.py25
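
A sketch of reusable fixtures that wrap the in-memory services; the import paths for InMemorySessionService and InMemoryArtifactService reflect current releases and should be checked against your installed version:

```python
# Sketch: reusable pytest fixtures for in-memory services, so individual
# tests never touch external storage. Import paths reflect current releases;
# verify them against the version you are using.
import pytest

from google.adk.artifacts import InMemoryArtifactService
from google.adk.sessions import InMemorySessionService


@pytest.fixture
def session_service():
    return InMemorySessionService()


@pytest.fixture
def artifact_service():
    return InMemoryArtifactService()


@pytest.fixture
def test_message():
    """Reusable piece of common test data."""
    return {"role": "user", "text": "hello"}
```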


Integration and End-to-End Testing

Manual Testing with ADK Web

The development UI provides interactive testing capabilities:

Testing Steps:

  1. Start the development server: adk web path/to/agent
  2. Navigate to the UI in a browser
  3. Test conversation flows and agent behavior
  4. Inspect events, state, and traces
  5. Capture screenshots for documentation

Sources: CONTRIBUTING.md110-116 README.md134-138

Testing with Runner

The Runner class enables programmatic testing of complete agent workflows:

Example Testing Flow:

  1. Set up test environment with required services
  2. Initialize Runner with agent and configuration
  3. Invoke agent with test messages
  4. Iterate through event stream
  5. Assert expected events, content, and state changes

Sources: CONTRIBUTING.md117-124
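
A sketch of the flow above. It assumes the Runner, Agent, InMemorySessionService, and google.genai.types interfaces shown here; because it drives a real model, keep it in a sparingly run integration suite or substitute a mocked model for unit tests:

```python
# Sketch: drive a complete agent workflow through the Runner and assert on
# the event stream. Class names and signatures reflect current releases;
# verify them against the version you are using. This calls a real model,
# so keep it in an integration suite or substitute a mock model.
import pytest

from google.adk.agents import Agent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.genai import types


@pytest.mark.asyncio
async def test_agent_answers_greeting():
    session_service = InMemorySessionService()
    agent = Agent(name="greeter", model="gemini-2.0-flash",  # example model
                  instruction="Reply with a short greeting.")
    runner = Runner(agent=agent, app_name="test_app",
                    session_service=session_service)

    session = await session_service.create_session(
        app_name="test_app", user_id="user_1")

    final_text = None
    async for event in runner.run_async(
        user_id="user_1",
        session_id=session.id,
        new_message=types.Content(role="user",
                                  parts=[types.Part(text="hello")]),
    ):
        if event.is_final_response() and event.content and event.content.parts:
            final_text = event.content.parts[0].text

    assert final_text  # behavior pattern, not an exact string
```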

Testing Agent Deployments

Testing Workflow:

Sources: CONTRIBUTING.md202-229


ADK-Specific Testing Best Practices

Testing Asynchronous Code

ADK makes extensive use of async/await patterns, so tests need to exercise asynchronous code correctly; a minimal sketch follows the list below:

Best Practices:

  • Mark async tests with @pytest.mark.asyncio
  • Always await coroutines
  • Use async for to iterate event streams
  • Test concurrent operations with asyncio.gather()
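
A minimal sketch of these patterns; fake_event_stream is a hypothetical stand-in for an agent's asynchronous event stream:

```python
# Minimal sketch: patterns for testing async ADK-style code with pytest-asyncio.
import asyncio

import pytest


async def fake_event_stream():
    """Stand-in for an agent's async event stream (hypothetical)."""
    for text in ("thinking", "final answer"):
        yield text


@pytest.mark.asyncio
async def test_event_stream_is_consumed_with_async_for():
    events = [event async for event in fake_event_stream()]
    assert events[-1] == "final answer"


@pytest.mark.asyncio
async def test_concurrent_operations_with_gather():
    async def work(n: int) -> int:
        await asyncio.sleep(0)
        return n * 2

    results = await asyncio.gather(work(1), work(2), work(3))
    assert results == [2, 4, 6]
```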

Testing with Different Service Backends

Test your agent with multiple service implementations to ensure portability:

| Service | Test With | Purpose |
|---|---|---|
| Session | InMemorySessionService | Fast unit tests |
| Session | DatabaseSessionService (SQLite) | Persistence testing |
| Artifact | InMemoryArtifactService | Fast unit tests |
| Artifact | FileArtifactService | Local file storage |
| Memory | InMemoryMemoryService | Fast unit tests |

Testing Pattern:

  • Start with in-memory services for unit tests
  • Add database tests for persistence validation
  • Use parametrized tests to run against multiple backends

Sources: CONTRIBUTING.md89-101
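
A sketch of the parametrized-backend pattern. It assumes InMemorySessionService and DatabaseSessionService (constructed from a SQLite URL) as in current releases; verify the constructor arguments against your installed version:

```python
# Sketch: run the same test against multiple session-service backends using
# a parametrized fixture. Assumes InMemorySessionService and
# DatabaseSessionService(db_url=...) as in current releases; verify the
# constructor arguments against the version you are using.
import pytest

from google.adk.sessions import DatabaseSessionService, InMemorySessionService


@pytest.fixture(params=["in_memory", "sqlite"])
def session_service(request, tmp_path):
    if request.param == "in_memory":
        return InMemorySessionService()
    return DatabaseSessionService(db_url=f"sqlite:///{tmp_path}/sessions.db")


@pytest.mark.asyncio
async def test_sessions_persist_across_backends(session_service):
    created = await session_service.create_session(
        app_name="test_app", user_id="user_1")
    fetched = await session_service.get_session(
        app_name="test_app", user_id="user_1", session_id=created.id)
    assert fetched is not None
```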

Testing Tool Confirmation Flows

Test human-in-the-loop (HITL) confirmation scenarios:

Test Considerations:

  • Verify ConfirmationRequestEvent is emitted
  • Test approval path (confirmation=True)
  • Test denial path (confirmation=False)
  • Test modified input scenarios
  • Verify tool execution resumes correctly after confirmation

Sources: README.md49
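
A minimal sketch of approval- and denial-path testing. confirm_and_run and request_confirmation are hypothetical helpers standing in for whatever confirmation hook your tool uses; they are not ADK APIs:

```python
# Sketch of approval/denial-path testing with a hypothetical confirmation
# gate. `confirm_and_run` and `request_confirmation` do not come from ADK;
# they stand in for whatever confirmation hook your tool uses.
from unittest import mock

import pytest


async def confirm_and_run(tool, args, request_confirmation):
    """Run `tool` only if the human-in-the-loop callback approves it."""
    approved = await request_confirmation(tool.__name__, args)
    if not approved:
        return {"status": "denied"}
    return {"status": "ok", "result": tool(**args)}


def delete_file(path: str) -> str:
    return f"deleted {path}"


@pytest.mark.asyncio
@pytest.mark.parametrize("approved,expected_status", [(True, "ok"), (False, "denied")])
async def test_confirmation_paths(approved, expected_status):
    request_confirmation = mock.AsyncMock(return_value=approved)

    outcome = await confirm_and_run(delete_file, {"path": "/tmp/x"}, request_confirmation)

    request_confirmation.assert_awaited_once()  # confirmation was requested
    assert outcome["status"] == expected_status
```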

Testing LLM Interactions

Testing code that integrates with LLMs requires special considerations:

Best Practices:

  • Unit tests: Mock LLM responses for deterministic, fast tests
  • Integration tests: Use real API calls sparingly for critical flows
  • Evaluation tests: Use the evaluation framework for quality assessment
  • Cost management: Use smaller models or mock responses for frequent tests
  • Non-determinism: Accept that real LLM responses vary; test behavior patterns, not exact strings
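
A minimal sketch of these practices; answer_question and llm are hypothetical names used only for illustration:

```python
# Sketch: mock the model entirely for unit tests, and assert on behavior
# patterns rather than exact strings. `answer_question` and `llm` are
# hypothetical names used for illustration.
from unittest import mock

import pytest


async def answer_question(llm, question: str) -> str:
    return await llm.generate(f"Answer briefly: {question}")


@pytest.mark.asyncio
async def test_answer_mentions_the_topic():
    llm = mock.AsyncMock()
    llm.generate.return_value = "Paris is the capital of France."

    answer = await answer_question(llm, "What is the capital of France?")

    # Behavior pattern: the topic appears in the answer; no exact-match assertion.
    assert "paris" in answer.lower()
```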

Test Organization and Coverage

Test File Structure

Naming Conventions:

  • Test files: test_<module>_<feature>.py
  • Test classes: Test<Feature>
  • Test methods: test_<scenario>()
  • Use descriptive names that explain what is being tested

Sources: CONTRIBUTING.md93-94
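
A brief skeleton illustrating these conventions; the file name and scenarios are hypothetical:

```python
# Sketch of the naming conventions above, as they might appear in
# tests/unittests/test_sessions_state.py (hypothetical file and scenarios).
class TestSessionState:
    def test_user_scoped_keys_are_isolated_per_user(self):
        ...

    def test_temp_keys_are_not_persisted(self):
        ...
```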

Coverage Requirements

Requirements:

  • New features must include comprehensive test coverage
  • Cover edge cases and error conditions
  • Aim for high code coverage, particularly for critical paths
  • Don't sacrifice test quality for coverage percentage

Sources: CONTRIBUTING.md90-92


Continuous Integration

CI Test Execution

CI Pipeline:

  1. Unit tests run automatically on PR creation
  2. Code style checks enforce formatting
  3. Build verification ensures package integrity
  4. All checks must pass before merge

Sources: README.md5 CONTRIBUTING.md190-194

Pre-Commit Testing

Before submitting a PR:

  • Run the unit test suite locally and confirm all tests pass
  • Format the code with autoformat.sh
  • Verify that the package wheel builds successfully

Sources: CONTRIBUTING.md187-194


Testing Checklist

For Pull Requests

| Category | Requirement | Status |
|---|---|---|
| Unit Tests | Added/updated unit tests | |
| Unit Tests | All tests pass locally | |
| Coverage | New code has test coverage | |
| E2E Testing | Manual E2E testing completed | |
| E2E Testing | Screenshots/logs provided | |
| Documentation | Test plan described in PR | |
| Code Quality | Code formatted (autoformat.sh) | |
| Build | Wheel builds successfully | |
Sources: .github/pull_request_template.md22-48

Testing Plan Template

For each PR, include a testing plan that describes the unit tests added or updated, the manual end-to-end testing performed, and any screenshots or logs that demonstrate the change working as intended.

Sources: .github/pull_request_template.md22-38


Common Testing Anti-Patterns

Anti-Patterns to Avoid

What Not To Do:

  • ❌ Tests that require external API keys or services
  • ❌ Tests that take more than a few seconds
  • ❌ Tests that fail intermittently
  • ❌ Tests without clear assertions
  • ❌ Tests that depend on execution order
  • ❌ Tests that modify global state

Sources: CONTRIBUTING.md96-99


Summary

Effective testing in ADK requires:

  1. Comprehensive unit tests with mocks and fixtures for fast, isolated testing
  2. Integration tests using the Runner and ADK Web for complete workflow validation
  3. Evaluation framework for systematic quality assessment
  4. Proper async testing with pytest.mark.asyncio
  5. Service backend testing to ensure portability
  6. CI/CD integration with automated test execution

By following these best practices, you ensure that your ADK agents are reliable, maintainable, and production-ready.

Sources: CONTRIBUTING.md79-126 README.md142-146