
Characterization Tests - Understanding Legacy Code Through Real Examples

Last updated Feb 11, 2026 Published Feb 11, 2026

The content here is under the Attribution 4.0 International (CC BY 4.0) license


Characterization tests are a technique for understanding and documenting the behavior of existing code, particularly legacy code that lacks proper documentation or tests. Unlike traditional tests that verify correctness against specifications, characterization tests capture the current behavior of the system as it exists today.

What Are Characterization Tests?

A characterization test captures the current behavior of code without making assumptions about whether that behavior is correct. The primary goal is to understand what the code does, not what it should do. This approach provides a safety net when you need to refactor or modify legacy code.

Working Effectively with Legacy Code

Michael Feathers introduced the concept of characterization tests in his book, Working Effectively with Legacy Code (Feathers, 2004). The technique helps developers understand and document the behavior of legacy code before making changes. By capturing the current behavior, you create a foundation for safe refactoring and gradual improvement.

When to Use Characterization Tests

In my experience working with legacy systems, characterization tests have proven valuable in several recurring scenarios. Often, I have inherited codebases that lacked any form of automated tests or meaningful documentation. In these situations, the absence of guidance made it nearly impossible to make changes with confidence. Characterization tests became my primary tool for mapping out what the code actually did, rather than what I assumed it should do.

Another common challenge is deciphering complex business logic that has evolved over years, sometimes decades, and is deeply embedded in legacy code. By writing characterization tests, I could systematically document the observed behavior, which not only helped me understand the system but also provided a safety net for future modifications.

When refactoring, the risk of breaking existing functionality is always present, especially in systems with unknown dependencies or side effects. Characterization tests allowed me to refactor incrementally, ensuring that the system’s current behavior remained intact throughout the process.

There have also been times when I needed to explore code to identify subtle bugs or edge cases that were not immediately obvious. Characterization tests enabled me to capture these behaviors as I discovered them, turning exploration into executable documentation.

Finally, during modernization efforts, characterization tests have been essential for protecting against regressions. They serve as a living specification of the legacy system, making it possible to evolve the codebase with greater confidence and less risk.

Techniques for Writing Characterization Tests

1. Start with Known Inputs and Outputs

Begin with the simplest cases where you can predict or observe the output:

test('empty list returns empty result', () => {
  const result = processData([]);
  expect(result).toEqual([]);
});

2. Introduce Coverage Tools

Use code coverage tools to identify untested paths:

# TypeScript with Jest
npm test -- --coverage

# TypeScript with Vitest
npm run test:coverage

3. Write Tests for Edge Cases

Once you understand the basic behavior, explore edge cases:

describe('edge cases', () => {
  it('handles null input', () => {
    const result = processData(null);
    // Capture actual behavior, even if it throws
  });
  
  it('handles empty string', () => {
    const result = processData('');
    expect(result).toBe(/* observed value */);
  });
  
  it('handles extremely large inputs', () => {
    const largeInput = 'x'.repeat(10000);
    const result = processData(largeInput);
    // Document the behavior
  });
});
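
One way to fill in the placeholder comments above is to record what happened, whether a return value or an exception, as plain data and then pin that observation in an assertion. The following is a minimal sketch, not a prescribed helper: `Observation`, `observe`, and `legacyLength` are hypothetical names introduced for illustration, and `legacyLength` stands in for whatever legacy function you are exploring.

```typescript
// Outcome of calling legacy code: either a returned value or a thrown error.
type Observation<T> =
  | { kind: 'returned'; value: T }
  | { kind: 'threw'; errorName: string };

// Runs the action and records the outcome instead of letting an exception escape.
function observe<T>(action: () => T): Observation<T> {
  try {
    return { kind: 'returned', value: action() };
  } catch (error) {
    const errorName = error instanceof Error ? error.name : typeof error;
    return { kind: 'threw', errorName };
  }
}

// Hypothetical legacy function (an assumption for this sketch): it never checks for null.
function legacyLength(input: string | null): number {
  return (input as string).length;
}

const nullCase = observe(() => legacyLength(null)); // threw TypeError
const emptyCase = observe(() => legacyLength(''));  // returned 0
```

In a Jest test the recorded observation can then be asserted with `toEqual`, which documents the throwing case just as explicitly as the returning one.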

4. Use Property-Based Testing for Systematic Exploration

Instead of using loops in tests, which can obscure which input caused a failure and are often treated as a test smell, prefer property-based testing to systematically explore a wide range of inputs. Property-based testing frameworks generate diverse input data and check that certain properties always hold, making it easier to discover edge cases and unexpected behaviors.

For TypeScript, libraries like fast-check can be used:

import fc from 'fast-check';

test('legacyFunction maintains invariants for all integers', () => {
  fc.assert(
    fc.property(fc.integer(), (value) => {
      const result = legacyFunction(value);
      // Assert properties about result
      // e.g., expect(typeof result).toBe('number');
    })
  );
});

Loops in tests can hide which input caused a failure and make test output harder to interpret. Property-based testing provides systematic, reproducible exploration and better diagnostics.

Golden Master Tests vs. Characterization Tests

Golden Master (or Approval) tests are a variant where you capture the entire output as a snapshot and compare future executions against it.

Example: Golden Master Test

test('captures complete JSON output', () => {
  const input = { id: 1, name: 'Test' };
  const output = complexTransformation(input);
  
  expect(output).toMatchSnapshot();
});
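
Under the hood, a golden master is a recorded output compared against a fresh run. The sketch below shows that idea without a test runner. It assumes a hypothetical `complexTransformation` and keeps the master as an in-memory string; in practice the master would live in a file under version control. `TransformInput`, `recordMaster`, and `diffAgainstMaster` are illustrative names.

```typescript
interface TransformInput { id: number; name: string }

// Hypothetical stand-in for the legacy function under test.
function complexTransformation(input: TransformInput): { key: string; label: string } {
  return { key: `item-${input.id}`, label: input.name.toUpperCase() };
}

// Serializes the outputs for a fixed list of inputs into one deterministic string.
function recordMaster(inputs: TransformInput[]): string {
  return inputs
    .map((input) => `${JSON.stringify(input)} => ${JSON.stringify(complexTransformation(input))}`)
    .join('\n');
}

// Returns a description of every line that differs between the master and a fresh recording.
function diffAgainstMaster(master: string, current: string): string[] {
  const masterLines = master.split('\n');
  const currentLines = current.split('\n');
  const lineCount = Math.max(masterLines.length, currentLines.length);
  const differences: string[] = [];
  for (let i = 0; i < lineCount; i++) {
    if (masterLines[i] !== currentLines[i]) {
      differences.push(`line ${i + 1}: expected ${masterLines[i]} but got ${currentLines[i]}`);
    }
  }
  return differences;
}

const sampleInputs: TransformInput[] = [{ id: 1, name: 'Test' }, { id: 2, name: 'other' }];
const goldenMaster = recordMaster(sampleInputs); // recorded once, before refactoring
const differences = diffAgainstMaster(goldenMaster, recordMaster(sampleInputs)); // empty while behavior is unchanged
```

The line-by-line diff is a deliberate choice: when the master covers many inputs, it points at the specific input whose output changed.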

Pros and Cons

Golden Master Tests:

Pros: Quick to set up, captures complex outputs completely, detects any change
Cons: Does not improve understanding, creates brittle tests, difficult to maintain snapshots

Characterization Tests:

Pros: Improves code understanding, documents specific behaviors, easier to maintain
Cons: Requires more effort to write, may miss some edge cases initially

For long-term maintainability, characterization tests that document specific behaviors are preferable to golden master tests.

Refactoring with Characterization Tests

Once you have comprehensive characterization tests, you can refactor safely. Common refactoring strategies include:

Extract Method

// Before
function processPayment(
  amount: number, 
  method: string, 
  customer: Customer
): PaymentResult {
  // ... complex logic ...
}

// After
function processPayment(
  amount: number, 
  method: string, 
  customer: Customer
): PaymentResult {
  if (!isValidAmount(amount)) {
    return invalidAmountResult();
  }
  
  if (method === 'credit_card') {
    return processCreditCard(amount, customer);
  }
  
  if (method === 'bank_transfer') {
    return processBankTransfer(amount);
  }
  
  return unknownMethodResult();
}

function isValidAmount(amount: number): boolean {
  return amount > 0;
}

Replace Conditional with Polymorphism

// Before: Single class with conditionals
class PaymentProcessor {
  process(method: string, amount: number, customer: Customer): Result {
    if (method === 'credit_card') {
      // credit card logic
    } else if (method === 'bank_transfer') {
      // bank transfer logic
    }
  }
}

// After: Strategy pattern
interface PaymentMethod {
  process(amount: number, customer: Customer): Result;
}

class CreditCardPayment implements PaymentMethod {
  process(amount: number, customer: Customer): Result {
    // credit card logic
  }
}

class BankTransferPayment implements PaymentMethod {
  process(amount: number, customer: Customer): Result {
    // bank transfer logic
  }
}
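
The strategy classes still need something that selects one of them from the `method` string. One possible sketch uses a lookup table keyed by method name. The `Customer` and `Result` shapes, the placeholder messages, and the `paymentMethods` registry are assumptions for illustration. The unknown-method branch keeps the legacy fallback so existing characterization tests continue to pass.

```typescript
interface Customer { creditScore?: number; creditLimit?: number }
interface Result { success: boolean; message: string }

interface PaymentMethod {
  process(amount: number, customer: Customer): Result;
}

class CreditCardPayment implements PaymentMethod {
  process(_amount: number, _customer: Customer): Result {
    // Placeholder for the credit card logic moved out of the conditional.
    return { success: true, message: 'Payment processed' };
  }
}

class BankTransferPayment implements PaymentMethod {
  process(_amount: number, _customer: Customer): Result {
    // Placeholder for the bank transfer logic moved out of the conditional.
    return { success: true, message: 'Transfer initiated' };
  }
}

// A Map avoids accidental hits on inherited object keys such as 'constructor'.
const paymentMethods = new Map<string, PaymentMethod>([
  ['credit_card', new CreditCardPayment()],
  ['bank_transfer', new BankTransferPayment()],
]);

class PaymentProcessor {
  process(method: string, amount: number, customer: Customer): Result {
    const strategy = paymentMethods.get(method);
    if (!strategy) {
      return { success: false, message: 'Unknown payment method' };
    }
    return strategy.process(amount, customer);
  }
}
```

Because `PaymentProcessor.process` keeps its original signature, the characterization tests written against the conditional version can run unchanged against this one.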

Introduce Parameter Object

// Before
function calculateDiscount(
  customerAge: number,
  customerTier: string,
  orderTotal: number,
  orderDate: Date
): number {
  // complex logic
}

// After
interface DiscountContext {
  customerAge: number;
  customerTier: string;
  orderTotal: number;
  orderDate: Date;
}

function calculateDiscount(context: DiscountContext): number {
  // same logic, clearer interface
}

Mutation Testing to Validate Test Quality

After writing characterization tests, use mutation testing to verify they actually protect against changes:

# TypeScript with Stryker
npx stryker run

Mutation testing introduces small changes (mutations) to your code and checks if your tests catch them. A high mutation score indicates your tests effectively capture the behavior.
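
To make this concrete, the sketch below hand-writes one mutant of the `isValidAmount` helper from the Extract Method example. A mutation testing tool generates mutants like this automatically; `isValidAmountMutant` and `killsMutant` are hypothetical names used only here. A suite that checks only typical amounts cannot tell the original and the mutant apart, while a boundary check at zero can.

```typescript
// Original helper, as extracted earlier.
function isValidAmount(amount: number): boolean {
  return amount > 0;
}

// Hand-written mutant: the boundary operator `>` is changed to `>=`.
function isValidAmountMutant(amount: number): boolean {
  return amount >= 0;
}

// A mutant is "killed" when at least one test input produces a different result than the original.
function killsMutant(testInputs: number[]): boolean {
  return testInputs.some((amount) => isValidAmount(amount) !== isValidAmountMutant(amount));
}

const typicalInputsOnly = killsMutant([100, -5]); // false: the mutant survives
const withBoundary = killsMutant([100, -5, 0]);   // true: the boundary case kills it
```

Surviving mutants are a useful to-do list: each one points at a behavior the characterization tests have not yet pinned down.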

Practical Exercise: The Gilded Rose Kata

The Gilded Rose Kata is a classic exercise for practicing characterization tests. It presents a legacy inventory system with complex business rules and no tests.

Getting Started

Clone the kata repository in TypeScript
Read the requirements document to understand the business rules
Write characterization tests to capture current behavior
Use coverage tools to ensure all paths are tested
Refactor the code while keeping tests green
Add the new feature (Conjured items) using TDD

# Clone the repository
git clone https://github.com/emilybache/GildedRose-Refactoring-Kata.git

# Choose TypeScript directory
cd GildedRose-Refactoring-Kata/TypeScript
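
For the characterization step, one common way to cover many paths quickly is to generate every combination of a few item names, `sellIn` values, and `quality` values, then run each combination through the kata's update routine and record the result. The sketch below only builds the combinations and does not call the kata code. `ItemInput` and `allCombinations` are hypothetical names, and the boundary values chosen are assumptions you would tune using the coverage report.

```typescript
interface ItemInput { itemName: string; sellIn: number; quality: number }

// Builds the cartesian product of the three input dimensions.
function allCombinations(itemNames: string[], sellIns: number[], qualities: number[]): ItemInput[] {
  const combinations: ItemInput[] = [];
  for (const itemName of itemNames) {
    for (const sellIn of sellIns) {
      for (const quality of qualities) {
        combinations.push({ itemName, sellIn, quality });
      }
    }
  }
  return combinations;
}

const itemInputs = allCombinations(
  ['Aged Brie', 'Sulfuras, Hand of Ragnaros', 'Backstage passes to a TAFKAL80ETC concert', 'Elixir of the Mongoose'],
  [-1, 0, 5, 10, 11],
  [0, 1, 49, 50],
);
// 4 names x 5 sellIn values x 4 quality values = 80 inputs
```

Passing the generated inputs to `test.each` keeps every combination visible as its own test case, so a failure names the exact item that changed.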

Sensible Defaults for Characterization Tests

Start Small: Begin with simple, predictable cases before tackling complex scenarios
Document Surprises: When you discover unexpected behavior, add comments explaining what you found
Separate Concerns: Distinguish between tests that capture intended behavior and those that capture bugs
Use Descriptive Names: Test names should clearly describe the behavior being characterized
Refactor Tests: Once you understand the code, refactor tests to be more maintainable
Delete Obsolete Tests: After refactoring, remove tests that no longer serve a purpose

Common Pitfalls

Over-reliance on Snapshots

Avoid using snapshot testing for everything. Snapshots are useful for complex outputs but should be supplemented with specific assertions for critical behaviors.

Testing Implementation Details

Focus on observable behavior, not internal implementation. Tests should survive refactoring.

Ignoring Test Maintainability

Even characterization tests need to be readable and maintainable. Refactor them as you gain understanding.

Not Updating Tests After Refactoring

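As you refactor and improve the code, update tests to reflect the new, clearer structure. When a legacy behavior is deliberately changed or a captured bug is fixed, replace the characterization test with one that states the intended behavior instead of leaving the old expectation in place.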

Frontend vs Backend Characterization: Key Differences

Characterization tests apply across the entire stack, but their focus and techniques differ significantly between frontend and backend code.

Backend Characterization

Backend characterization tests focus on the behavior of business logic, data transformations, and state management. They are typically:

Input-focused, testing direct function calls with various inputs and assertions on return values
Fast to execute, since they operate on pure functions or isolated services
Deterministic, with clear and repeatable scenarios
Concerned with logic correctness, edge cases, and error handling

Backend tests rarely need to worry about visual representation or UI state. The assertions are straightforward: verify that given input X, the function returns output Y or that a side effect occurs as expected.

Frontend Characterization

Frontend characterization tests focus on user interactions, component state, and DOM rendering. They are typically:

Interaction-focused, testing how components respond to user actions (clicks, typing, form submissions)
Potentially slower, since they involve rendering and DOM manipulation
More visually-oriented, often using snapshot testing to capture entire UI states
Concerned with component behavior, accessibility, and user experience

Frontend tests must contend with the complexity of UI state, event handling, and asynchronous rendering. Assertions on specific text or attributes can be brittle, so snapshot testing is often used to capture the entire rendered output, making it easier to detect unintended side effects from refactoring.

When to Use Snapshots vs Specific Assertions

Use specific assertions for critical behaviors that represent business requirements (e.g., “when amount is invalid, show an error”)
Use snapshots to characterize the overall component structure and rendering in various states
Combine both approaches for comprehensive characterization

The key difference is that backend tests validate logic, while frontend tests validate both logic and presentation. Characterization tests on the frontend should capture the current rendering behavior as a baseline for safe refactoring.

Real Example 1: Understanding a Payment Processor (TypeScript)

Pure TypeScript Example

Initial Code

interface Customer {
  creditScore?: number;
  creditLimit?: number;
}

interface PaymentResult {
  success: boolean;
  message: string;
  fee?: number;
  processingTime?: string;
}

function processPayment(
  amount: number, 
  method: string, 
  customer: Customer
): PaymentResult {
  let result: PaymentResult = { success: false, message: '' };
  
  if (amount <= 0) {
    result.message = 'Invalid amount';
    return result;
  }
  
  if (method === 'credit_card') {
    if (!customer.creditScore || customer.creditScore < 600) {
      result.message = 'Credit declined';
      return result;
    }
    if (!customer.creditLimit || amount > customer.creditLimit) {
      result.message = 'Exceeds credit limit';
      return result;
    }
    result.success = true;
    result.message = 'Payment processed';
    result.fee = amount * 0.029 + 0.30;
  } else if (method === 'bank_transfer') {
    if (amount < 1) {
      result.message = 'Minimum transfer is $1';
      return result;
    }
    result.success = true;
    result.message = 'Transfer initiated';
    result.processingTime = '3-5 days';
  } else {
    result.message = 'Unknown payment method';
  }
  
  return result;
}

Characterization Tests

describe('processPayment characterization tests', () => {
  describe('credit card payments', () => {
    it('processes valid credit card payment with fee', () => {
      const customer: Customer = { creditScore: 700, creditLimit: 5000 };
      const result = processPayment(100, 'credit_card', customer);
      
      expect(result.success).toBe(true);
      expect(result.message).toBe('Payment processed');
      expect(result.fee).toBeCloseTo(3.20, 2);
    });
    
    it('declines payment when credit score is below 600', () => {
      const customer: Customer = { creditScore: 599, creditLimit: 5000 };
      const result = processPayment(100, 'credit_card', customer);
      
      expect(result.success).toBe(false);
      expect(result.message).toBe('Credit declined');
      expect(result.fee).toBeUndefined();
    });
    
    it('declines payment exceeding credit limit', () => {
      const customer: Customer = { creditScore: 700, creditLimit: 50 };
      const result = processPayment(100, 'credit_card', customer);
      
      expect(result.success).toBe(false);
      expect(result.message).toBe('Exceeds credit limit');
    });
  });
  
  describe('bank transfer payments', () => {
    it('processes bank transfer with processing time', () => {
      const customer: Customer = {};
      const result = processPayment(100, 'bank_transfer', customer);
      
      expect(result.success).toBe(true);
      expect(result.message).toBe('Transfer initiated');
      expect(result.processingTime).toBe('3-5 days');
    });
    
    it('rejects bank transfer below minimum amount', () => {
      const customer: Customer = {};
      const result = processPayment(0.50, 'bank_transfer', customer);
      
      expect(result.success).toBe(false);
      expect(result.message).toBe('Minimum transfer is $1');
    });
  });
  
  describe('edge cases discovered through exploration', () => {
    it('rejects a zero amount', () => {
      const customer: Customer = { creditScore: 700, creditLimit: 5000 };
      const result = processPayment(0, 'credit_card', customer);
      
      expect(result.success).toBe(false);
      expect(result.message).toBe('Invalid amount');
    });
    
    it('handles unknown payment method', () => {
      const customer: Customer = {};
      const result = processPayment(100, 'paypal', customer);
      
      expect(result.success).toBe(false);
      expect(result.message).toBe('Unknown payment method');
    });
  });
});

These characterization tests reveal several behaviors:

Credit card payments include a fee calculation (2.9% + $0.30)
Credit score threshold is exactly 600
Bank transfers have a minimum of $1, so amounts between $0 and $1 pass the initial validation but are rejected for bank transfers only; credit card payments have no such minimum
The function returns different structures depending on the payment method
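
These observations suggest a few boundary probes that the suite above does not yet pin down. The sketch below repeats `processPayment` from the Initial Code section in compact form so it runs on its own, then records four behaviors. The variable names are illustrative. Whether each behavior is intended is a question for the business; the tests only record what happens.

```typescript
interface Customer { creditScore?: number; creditLimit?: number }
interface PaymentResult { success: boolean; message: string; fee?: number; processingTime?: string }

// Same logic as the legacy function above, copied so this sketch is self-contained.
function processPayment(amount: number, method: string, customer: Customer): PaymentResult {
  const result: PaymentResult = { success: false, message: '' };
  if (amount <= 0) { result.message = 'Invalid amount'; return result; }
  if (method === 'credit_card') {
    if (!customer.creditScore || customer.creditScore < 600) { result.message = 'Credit declined'; return result; }
    if (!customer.creditLimit || amount > customer.creditLimit) { result.message = 'Exceeds credit limit'; return result; }
    result.success = true;
    result.message = 'Payment processed';
    result.fee = amount * 0.029 + 0.30;
  } else if (method === 'bank_transfer') {
    if (amount < 1) { result.message = 'Minimum transfer is $1'; return result; }
    result.success = true;
    result.message = 'Transfer initiated';
    result.processingTime = '3-5 days';
  } else {
    result.message = 'Unknown payment method';
  }
  return result;
}

const boundaryCustomer: Customer = { creditScore: 600, creditLimit: 100 };

// A score of exactly 600 and an amount equal to the limit are both accepted.
const atBoundaries = processPayment(100, 'credit_card', boundaryCustomer);
// Amounts between 0 and 1 pass the initial validation and succeed by credit card...
const smallCardPayment = processPayment(0.5, 'credit_card', boundaryCustomer);
// ...but the same amount is rejected for bank transfers.
const smallTransfer = processPayment(0.5, 'bank_transfer', {});
// NaN is neither <= 0 nor < 1, so it slips through both checks.
const nanTransfer = processPayment(NaN, 'bank_transfer', {});
```

The `NaN` case is the kind of surprise worth a comment in the real test file: the function reports a successful transfer for an amount that is not a number.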

Real Example 2: Characterization Tests in a ReactJS Frontend

Suppose you have a legacy ReactJS component that processes user payments and you want to characterize its behavior before refactoring. The approach is similar, but you use React Testing Library and Jest for the frontend context.

// PaymentForm.tsx (production code)
import React, { useState } from 'react';

const PaymentForm: React.FC = () => {
  const [amount, setAmount] = useState('');
  const [method, setMethod] = useState('');
  const [message, setMessage] = useState('');

  const handleSubmit = (e: React.FormEvent) => {
    e.preventDefault();
    if (!amount || isNaN(Number(amount)) || Number(amount) <= 0) {
      setMessage('Invalid amount');
      return;
    }
    if (method === 'credit_card') {
      setMessage('Payment processed');
    } else if (method === 'bank_transfer') {
      setMessage('Transfer initiated');
    } else {
      setMessage('Unknown payment method');
    }
  };

  return (
    <form onSubmit={handleSubmit}>
      <label htmlFor="amount">Amount</label>
      <input
        id="amount"
        name="amount"
        type="number"
        value={amount}
        onChange={e => setAmount(e.target.value)}
      />
      <label htmlFor="method">Method</label>
      <input
        id="method"
        name="method"
        type="text"
        value={method}
        onChange={e => setMethod(e.target.value)}
      />
      <button type="submit">Submit</button>
      {message && <div>{message}</div>}
    </form>
  );
};

export default PaymentForm;

// Characterization tests for PaymentForm (test code)
import React from 'react';
import { render, screen } from '@testing-library/react';
import userEvent from '@testing-library/user-event';
import PaymentForm from './PaymentForm';

describe('PaymentForm characterization tests', () => {
  it('renders and submits invalid amount', async () => {
    render(<PaymentForm />);
    const user = userEvent.setup();
    await user.type(screen.getByLabelText(/amount/i), '0');
    await user.click(screen.getByRole('button', { name: /submit/i }));
    expect(document.body).toMatchSnapshot();
  });

  it('renders and submits valid credit card payment', async () => {
    render(<PaymentForm />);
    const user = userEvent.setup();
    await user.type(screen.getByLabelText(/amount/i), '100');
    await user.type(screen.getByLabelText(/method/i), 'credit_card');
    await user.click(screen.getByRole('button', { name: /submit/i }));
    expect(document.body).toMatchSnapshot();
  });
});

This approach allows you to characterize the observable behavior of frontend components, ensuring you can refactor with confidence.

Real Example 3: Import Users Feature

This example is based on the Import Users Kata, which simulates importing user data from external sources with hidden business rules.

Initial Code (TypeScript)

interface User {
  email: string;
  name: string;
  age: number;
}

interface Database {
  userExists(email: string): boolean;
  save(user: User): void;
}

class ImportResult {
  private imported: string[] = [];
  private skipped: string[] = [];
  private errors: string[] = [];
  
  addImported(email: string): void {
    this.imported.push(email);
  }
  
  addSkipped(email: string): void {
    this.skipped.push(email);
  }
  
  addError(message: string): void {
    this.errors.push(message);
  }
  
  getImportedCount(): number {
    return this.imported.length;
  }
  
  getSkippedCount(): number {
    return this.skipped.length;
  }
  
  getErrorCount(): number {
    return this.errors.length;
  }
  
  getErrors(): string[] {
    return this.errors;
  }
}

class UserImporter {
  constructor(private database: Database) {}
  
  importUsers(userLines: string[]): ImportResult {
    const result = new ImportResult();
    
    for (const line of userLines) {
      const parts = line.split(',');
      
      if (parts.length < 3) {
        result.addError(`Invalid format: ${line}`);
        continue;
      }
      
      const email = parts[0].trim();
      const name = parts[1].trim();
      const ageStr = parts[2].trim();
      
      if (!email.includes('@')) {
        result.addError(`Invalid email: ${email}`);
        continue;
      }
      
      const age = parseInt(ageStr, 10);
      if (isNaN(age)) {
        result.addError(`Invalid age for ${email}`);
        continue;
      }
      
      if (age < 18) {
        result.addError(`User must be 18 or older: ${email}`);
        continue;
      }
      
      if (this.database.userExists(email)) {
        result.addSkipped(email);
        continue;
      }
      
      const user: User = { email, name, age };
      this.database.save(user);
      result.addImported(email);
    }
    
    return result;
  }
}
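
One hidden rule worth probing is how lenient the age parsing is. `parseInt` reads leading digits and ignores whatever follows, so some malformed ages are silently accepted. The sketch below isolates that step; `parseAgeLikeImporter` is a hypothetical helper that mirrors the `trim` and `parseInt(ageStr, 10)` calls in `importUsers`, and `observedAges` is an illustrative name.

```typescript
// Mirrors how importUsers turns the third CSV field into a number.
function parseAgeLikeImporter(rawAge: string): number {
  return parseInt(rawAge.trim(), 10);
}

const observedAges = {
  plain: parseAgeLikeImporter(' 42 '),         // 42
  trailingText: parseAgeLikeImporter('18abc'), // 18, so the line would be imported
  decimal: parseAgeLikeImporter('17.9'),       // 17, so the user is rejected as under 18
  empty: parseAgeLikeImporter(''),             // NaN, reported as an invalid age
};
```

Each of these is a candidate for a characterization test against `importUsers` itself, with a comment noting whether the leniency looks intentional.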