Test-driven development with an AI agent for TypeScript development

Last updated Feb 13, 2026 · Published Feb 14, 2026

The content here is under the Attribution 4.0 International (CC BY 4.0) license

The emergence of AI coding assistants has created opportunities to enhance test-driven development practices. Research by Baudry and Monperrus (2024) demonstrates that generative AI can produce accurate test data, suggesting potential for broader TDD support. In a previous post, I explored the literature and how well TDD principles match AI capabilities. This guide demonstrates how to approach Copilot as an AI agent specifically for TDD workflows, covering configuration and practical implementation.

Understanding AI agents in the context of TDD

AI coding agents are tools that assist developers by generating code, suggesting implementations, and automating repetitive tasks. They are also known for generating large amounts of code, which can lead to cognitive overload.

When applied to TDD, these agents can support the red-green-refactor cycle (Beck, 2002) by generating test cases, suggesting implementations, and identifying edge cases.

The integration of AI in TDD aligns with the practice’s core principles. TDD emphasizes writing tests before implementation, and AI agents can accelerate this process while maintaining the discipline of test-first development.

Prerequisites

Before configuring Copilot for TDD, ensure you have:

  • Visual Studio Code or compatible IDE with Copilot support
  • GitHub Copilot (or another coding assistant; this guide focuses on Copilot)
  • Basic understanding of TDD principles (red, green, refactor cycle)
  • Jest installed in your project (for other programming languages, you should have the corresponding testing framework installed); see the configuration sketch below
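
If Jest is not yet configured for TypeScript, a minimal setup might look like the following (a sketch assuming ts-jest as the transformer; adjust options to your project):

// jest.config.ts: a minimal sketch assuming ts-jest is installed
import type { Config } from 'jest';

const config: Config = {
  preset: 'ts-jest',          // compile TypeScript test files on the fly
  testEnvironment: 'node',
  testMatch: ['**/__tests__/**/*.test.ts'],
};

export default config;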

Step 1: Setting up GitHub Copilot for TDD

GitHub Copilot provides context-aware code suggestions. To optimize it for TDD workflows:

Installation

  1. Install the GitHub Copilot extension in Visual Studio Code
  2. Sign in with your GitHub account that has Copilot access
  3. Verify the extension is active

To follow along, we will use a katas playground repository, specifically the solutions/string-calculator subfolder, which is written in TypeScript.

Configuration for TDD

Create a .github/copilot-instructions.md file in your repository root:

This repository contains a TypeScript codebase focused on TDD katas.

Create a .github/agents/tdd-coach.md file in your repository root with the following content to guide Copilot’s suggestions towards TDD patterns:

---
name: TDD Coach
description: An AI agent that assists developers in following test-driven development practices. Provides suggestions for writing tests, implementing code to pass tests, and refactoring while maintaining code behavior. Offers guidance on test design and test quality, especially for test smells.
---

Focus on the following instructions:
- Test case generation: Suggests test cases based on comments and code context
- Implementation suggestions: Provides minimal code implementations to pass tests
- Refactoring support: Offers suggestions for improving code quality while maintaining test coverage
- Edge case identification: Recommends additional test cases for edge scenarios
- Follows TDD principles: Encourages the red-green-refactor cycle and test-first development. At each step, waits for user input
- Provides feedback on test quality and coverage

When writing tests:
- Follow the Arrange-Act-Assert pattern
- Use descriptive test names that explain the behavior being tested
- Write one assertion per test when possible
- Consider edge cases and boundary conditions
- Use appropriate test doubles (mocks, stubs, fakes) based on the testing strategy

When suggesting implementations:
- Write minimal code to pass the current test
- Avoid over-engineering solutions
- Follow SOLID principles
- Consider existing patterns in the codebase

Using Copilot in the TDD cycle

Red phase (Writing failing tests):

  1. Start by writing a test that you trust; take the time to write it with the dependencies and the assertion you want
  2. Run the tests and see it fail (true red phase)

Example workflow in TypeScript with Jest:

test('should calculate the sum of two positive numbers', () => {
  const calculator = new StringCalculator();
  const result = calculator.add('2,3');
  expect(result).toBe(5);
});

Run the test suite and confirm it fails for the right reason: the actual result differs from the expected value (or the class is not implemented yet).
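
At this stage the StringCalculator class may not exist yet. A stub like the one below (hypothetical; names and paths are illustrative) lets the test compile while still failing:

export class StringCalculator {
  add(numbers: string): number {
    // Deliberately unimplemented so the test fails for the right reason
    throw new Error('Not implemented');
  }
}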

Green phase (Implementing code):

  1. Navigate to the implementation file
  2. Write a function signature or class definition
  3. Use Copilot to suggest the minimal implementation (see the sketch after this list)
  4. Run tests to verify the implementation passes
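
For the test above, the minimal implementation Copilot suggests might look like this (a sketch, not necessarily its exact output):

export class StringCalculator {
  add(numbers: string): number {
    // Just enough behavior to turn the failing test green
    return numbers
      .split(',')
      .map(Number)
      .reduce((sum, n) => sum + n, 0);
  }
}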

Refactor phase:

  1. Use Copilot Chat to ask for refactoring suggestions
  2. Prompt: “Refactor this method to improve readability while maintaining behavior”
  3. Review suggestions and apply incrementally (an example sketch follows)
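
For example, asking Copilot Chat to refactor the add method above might produce a suggestion along these lines (illustrative; actual output varies):

export class StringCalculator {
  add(numbers: string): number {
    return this.parse(numbers).reduce((sum, n) => sum + n, 0);
  }

  // Naming the parsing step improves readability without changing behavior
  private parse(numbers: string): number[] {
    return numbers.split(',').map(Number);
  }
}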

Step 2: Copilot Chat for Test Strategy (Optional)

While inline Copilot suggestions are the primary workflow, you can optionally use Copilot Chat (in VS Code or GitHub.com) for higher-level test strategy discussion before implementation.

Using Copilot Chat for test design

When planning test strategy, open Copilot Chat and ask:

@workspace I need to implement a validation function for email addresses.
Based on the project's testing patterns, what test cases should I consider?
Consider edge cases, invalid formats, and boundary conditions.

Copilot Chat analyzes your codebase context and suggests test cases aligned with your project’s patterns. However, the primary workflow relies on inline Copilot suggestions rather than chat-based interaction.
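
You can then turn the suggested cases into a test skeleton. The sketch below assumes a hypothetical isValidEmail function; the cases shown are illustrative:

import { isValidEmail } from '../src/isValidEmail';

describe('isValidEmail', () => {
  it('should accept a standard address', () => {
    expect(isValidEmail('user@example.com')).toBe(true);
  });

  it('should reject an address without an @ symbol', () => {
    expect(isValidEmail('user.example.com')).toBe(false);
  });

  it('should reject an empty string', () => {
    expect(isValidEmail('')).toBe(false);
  });
});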

Alternative: Using Claude as the LLM

If you prefer Claude’s reasoning capabilities, some Copilot-compatible IDEs (through enterprise policies) may allow model selection. The instructions remain identical: configure .github/copilot-instructions.md as shown in Step 1, and the workflow proceeds with whichever model is configured.

Step 3: Copilot-Driven TDD Workflow

Core workflow pattern

  1. Define behavior with comments - Write descriptive test names and requirement comments to clarify intent
  2. Let Copilot suggest test structure - Accept or modify inline suggestions for test bodies
  3. Review and adjust - Verify test quality, assertions, and edge case coverage
  4. Implement minimally with Copilot - Let Copilot suggest minimal implementations to pass tests
  5. Run and validate - Execute tests and verify behavior matches requirements
  6. Refactor with Copilot support - Use Copilot’s suggestions to improve code quality while maintaining test coverage

Step 4: Practical Example - Calculator with Copilot TDD

To illustrate the complete Copilot-driven TDD workflow:

Phase 1: Test Case Design

Start with a comment describing test intent:

// Test: Calculator should add two positive numbers
// Expected: 2 + 3 = 5
test('should add two positive numbers', () => {
  // Copilot suggests the implementation pattern
  const calc = new Calculator();
  const result = calc.add(2, 3);
  expect(result).toBe(5);
});

// Test: Calculator should handle division by zero
// Expected: throws error
test('should handle division by zero', () => {
  // Copilot auto-completes based on pattern
  const calc = new Calculator();
  expect(() => calc.divide(5, 0)).toThrow('Division by zero');
});

Copilot recognizes the comment pattern and suggests test structure automatically.

Phase 2: Test Implementation

Let Copilot complete remaining test cases based on the established pattern.
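
For instance, after the add and divide tests, Copilot might propose further cases that follow the same structure (subtract and multiply are hypothetical methods here):

test('should subtract two numbers', () => {
  const calc = new Calculator();
  expect(calc.subtract(5, 3)).toBe(2);
});

test('should multiply two numbers', () => {
  const calc = new Calculator();
  expect(calc.multiply(4, 3)).toBe(12);
});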

Phase 3: Red Phase - Verify Tests Fail

Run tests to confirm they fail before implementing Calculator class.

Phase 4: Green Phase - Implementation

Navigate to Calculator.ts and start typing:

class Calculator {
  add(a: number, b: number): number {
    // Copilot suggests: return a + b;
  }

  divide(a: number, b: number): number {
    // Copilot suggests:
    // if (b === 0) throw new Error('Division by zero');
    // return a / b;
  }
}

Copilot provides minimal implementations that pass tests.
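
Accepting those suggestions yields an implementation along these lines (a sketch):

class Calculator {
  add(a: number, b: number): number {
    return a + b;
  }

  divide(a: number, b: number): number {
    // Guard against division by zero, as required by the failing test
    if (b === 0) throw new Error('Division by zero');
    return a / b;
  }
}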

Phase 5: Refactor Phase

After all tests pass, refactor with Copilot suggestions:

// Consider: Extract validation logic
private validateDivisor(b: number): void {
  if (b === 0) throw new Error('Division by zero');
}

Copilot suggests improvements while maintaining test coverage.
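
The divide method then delegates to the extracted helper, keeping the tests green (a sketch):

divide(a: number, b: number): number {
  this.validateDivisor(b);
  return a / b;
}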

Custom Agent Configurations with Copilot

For teams adopting Copilot-driven TDD, establish shared guidelines:

Repository-level instructions (.github/copilot-instructions.md):

This file is the primary configuration for team TDD standards. Store tested patterns and conventions here:

# Project TDD Standards

## Testing Framework and Configuration

Testing framework: Jest
Test location: __tests__/ directory
Naming convention: [component].test.ts

## Required Test Structure

- Describe blocks for each component/function
- Nested describe blocks for different scenarios
- Test names should complete the phrase "it should..."
- Use AAA pattern: Arrange, Act, Assert

## Mocking Strategy

- Use jest.mock() for external dependencies
- Prefer dependency injection over global mocks
- Clean up mocks in afterEach blocks
- Mock API responses consistently across tests

## Copilot-Driven TDD Patterns

When using Copilot inline suggestions:
- Start test files with comment describing test intent
- Let Copilot generate test structure based on established patterns
- Review suggestions for domain-specific edge cases
- Maintain 80%+ mutation testing score

## Common Test Patterns

### Testing API endpoints
- Test success path first
- Then test each error scenario
- Mock external service calls
- Verify status codes and response structures

### Testing async operations
- Use async/await syntax
- Test success and rejection paths
- Use jest.useFakeTimers() for timeout scenarios
- Clean up timers in afterEach

### Generating test fixtures
- Create reusable factory functions
- Place factories in __fixtures__/ directory
- Reference factories by descriptive names

This file becomes Copilot’s context for all inline suggestions across your team, ensuring consistent TDD patterns.
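
As a concrete illustration, a test file following these standards might look like the sketch below (UserService, apiClient, and fetchUser are hypothetical names):

import { UserService } from '../src/userService';
import { apiClient } from '../src/apiClient';

// External dependency mocked per the mocking strategy above
jest.mock('../src/apiClient');

describe('UserService', () => {
  afterEach(() => {
    jest.clearAllMocks(); // clean up mocks between tests
  });

  describe('when the API call succeeds', () => {
    it('should return the user profile', async () => {
      // Arrange
      (apiClient.get as jest.Mock).mockResolvedValue({ id: 1, name: 'Ada' });
      const service = new UserService(apiClient);

      // Act
      const user = await service.fetchUser(1);

      // Assert
      expect(user.name).toBe('Ada');
    });
  });
});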

Limitations and Considerations with GitHub Copilot

While Copilot enhances TDD workflows, awareness of limitations is essential:

Copilot suggestions require critical evaluation:

  • Copilot analyzes patterns from your codebase and training data, but doesn’t understand domain semantics
  • Generated tests may miss business-critical edge cases specific to your application
  • Human expertise remains necessary for comprehensive test design

Context window limitations:

  • Copilot works with the current file and limited surrounding context
  • Complex refactorings or cross-module patterns may not be suggested optimally
  • For multi-file architectural decisions, manual design is often necessary

Over-reliance risks:

  • Developers may accept suggestions without critical evaluation
  • Generated code patterns may work but introduce subtle assumptions
  • Test coverage metrics can be misleading if Copilot-generated tests don’t test meaningful behavior

Domain-specific challenges:

  • Copilot excels with common patterns (CRUD operations, utility functions, API handlers)
  • Domain-specific validation logic, complex algorithms, or business processes benefit from manual design
  • Proprietary business rules should be explicitly coded rather than suggested by Copilot

Complementary to expertise, not a replacement:

  • TDD requires understanding of testing principles, design patterns, and domain knowledge
  • Copilot accelerates implementation of well-understood patterns but does not replace developer judgment
  • The red-green-refactor cycle is most effective when developers actively design tests, not passively accept suggestions

Research on AI-assisted software development (Baudry & Monperrus, 2024) indicates that while AI can generate accurate code patterns, human oversight of suggestion quality remains critical.

Measuring effectiveness

Track these metrics to evaluate AI agent impact on TDD workflows:

  • Test quality: Mutation testing scores measure how well tests detect introduced defects. A mutation testing score above 80% indicates robust test coverage.
  • Development velocity: Time from test writing to passing implementation. This metric helps identify whether AI assistance reduces development friction.
  • Refactoring frequency: Number of refactoring cycles enabled by test coverage. Higher frequency suggests developers feel confident making changes with test safety nets.
  • Code quality: Metrics such as cyclomatic complexity (aim for values below 10 per method) and code duplication (target less than 3% duplication).

Compare these metrics before and after agent adoption to assess quantifiable impact. Research suggests that TDD practices reduce defect rates (Beck, 2002), and AI assistance should maintain or improve these outcomes.

References

  1. Baudry, B., & Monperrus, M. (2024). Generative AI for Test Data Generation. IEEE Software, 41(3), 34–41.
  2. Beck, K. (2002). Test Driven Development: By Example. Addison-Wesley Professional.
