Table of contents
- What Are Characterization Tests?
- When to Use Characterization Tests
- Techniques for Writing Characterization Tests
- Golden Master Tests vs. Characterization Tests
- Refactoring with Characterization Tests
- Mutation Testing to Validate Test Quality
- Practical Exercise: The Gilded Rose Kata
- Sensible Defaults for Characterization Tests
- Common Pitfalls
- Frontend vs Backend Characterization: Key Differences
- Real Example 1: Understanding a Payment Processor (TypeScript)
- Real Example 2: Characterization Tests in a ReactJS Frontend
- Real Example 3: Import Users Feature
- Resources
- References
Characterization Tests - Understanding Legacy Code Through Real Examples
The content here is under the Attribution 4.0 International (CC BY 4.0) license
Characterization tests are a technique for understanding and documenting the behavior of existing code, particularly legacy code that lacks proper documentation or tests. Unlike traditional tests that verify correctness against specifications, characterization tests capture the current behavior of the system as it exists today.
What Are Characterization Tests?
A characterization test captures the current behavior of code without making assumptions about whether that behavior is correct. The primary goal is to understand what the code does, not what it should do. This approach provides a safety net when you need to refactor or modify legacy code.
Working Effectively with Legacy Code
Michael Feathers introduced the concept of characterization tests in his book, Working Effectively with Legacy Code (Feathers, 2004). The technique helps developers understand and document the behavior of legacy code before making changes. By capturing the current behavior, you create a foundation for safe refactoring and gradual improvement.
When to Use Characterization Tests
In my experience working with legacy systems, characterization tests have proven valuable in several recurring scenarios. Often, I have inherited codebases that lacked any form of automated tests or meaningful documentation. In these situations, the absence of guidance made it nearly impossible to make changes with confidence. Characterization tests became my primary tool for mapping out what the code actually did, rather than what I assumed it should do.
Another common challenge is deciphering complex business logic that has evolved over years, sometimes decades, and is deeply embedded in legacy code. By writing characterization tests, I could systematically document the observed behavior, which not only helped me understand the system but also provided a safety net for future modifications.
When refactoring, the risk of breaking existing functionality is always present, especially in systems with unknown dependencies or side effects. Characterization tests allowed me to refactor incrementally, ensuring that the system’s current behavior remained intact throughout the process.
There have also been times when I needed to explore code to identify subtle bugs or edge cases that were not immediately obvious. Characterization tests enabled me to capture these behaviors as I discovered them, turning exploration into executable documentation.
Finally, during modernization efforts, characterization tests have been essential for protecting against regressions. They serve as a living specification of the legacy system, making it possible to evolve the codebase with greater confidence and less risk.
Techniques for Writing Characterization Tests
1. Start with Known Inputs and Outputs
Begin with the simplest cases where you can predict or observe the output:
test('empty list returns empty result', () => {
const result = processData([]);
expect(result).toEqual([]);
});
2. Introduce Coverage Tools
Use code coverage tools to identify untested paths:
# TypeScript with Jest
npm test -- --coverage
# TypeScript with Vitest
# (assumes package.json defines a "test:coverage" script, e.g. "vitest run --coverage")
npm run test:coverage
3. Write Tests for Edge Cases
Once you understand the basic behavior, explore edge cases:
describe('edge cases', () => {
it('handles null input', () => {
const result = processData(null);
// Capture actual behavior, even if it throws
});
it('handles empty string', () => {
const result = processData('');
expect(result).toBe(/* observed value */);
});
it('handles extremely large inputs', () => {
const largeInput = 'x'.repeat(10000);
const result = processData(largeInput);
// Document the behavior
});
});
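When the legacy code throws, the exception itself is the behavior worth pinning down. One way to record it is a small helper that turns each call into a plain value, whether the call returned or threw. The sketch below is only an illustration: `legacyParse` is a hypothetical stand-in for the code under test, and `capture` and `Outcome` are names invented here, not part of any library.

```typescript
// Hypothetical legacy function, used only to illustrate the helper below.
function legacyParse(input: string | null): number {
  if (input === null) {
    throw new TypeError('input is required');
  }
  return input.length === 0 ? -1 : input.length;
}

// What a single call did: it either returned a value or threw an error.
type Outcome<T> =
  | { kind: 'returned'; value: T }
  | { kind: 'threw'; errorName: string; message: string };

// Runs fn and records the outcome instead of letting the exception escape.
function capture<T>(fn: () => T): Outcome<T> {
  try {
    return { kind: 'returned', value: fn() };
  } catch (error) {
    const e = error instanceof Error ? error : new Error(String(error));
    return { kind: 'threw', errorName: e.name, message: e.message };
  }
}

// Observed outcomes become plain data that a test can assert on.
const nullOutcome = capture(() => legacyParse(null));
const emptyOutcome = capture(() => legacyParse(''));
```

Inside a Jest test, the recorded value can be compared with toEqual, so a thrown error is documented the same way as a return value.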
4. Use Property-Based Testing for Systematic Exploration
Loops in tests are often considered a test smell because they can obscure which input caused a failure. Property-based testing is a better way to systematically explore a wide range of inputs: the framework generates diverse input data and checks that certain properties always hold, which makes it easier to discover edge cases and unexpected behaviors.
For TypeScript, libraries like fast-check can be used:
import fc from 'fast-check';
test('legacyFunction maintains invariants for all integers', () => {
fc.assert(
fc.property(fc.integer(), (value) => {
const result = legacyFunction(value);
// Assert properties about result
// e.g., expect(typeof result).toBe('number');
})
);
});
When a property fails, fast-check reports the failing input, shrunk toward a minimal counterexample, along with the seed needed to replay the run. A hand-written loop, by contrast, can hide which input caused the failure and makes test output harder to interpret.
Golden Master Tests vs. Characterization Tests
Golden Master (or Approval) tests are a variant where you capture the entire output as a snapshot and compare future executions against it.
Example: Golden Master Test
test('captures complete JSON output', () => {
const input = { id: 1, name: 'Test' };
const output = complexTransformation(input);
expect(output).toMatchSnapshot();
});
Pros and Cons
Golden Master Tests:
- Pros: Quick to set up, captures complex outputs completely, detects any change
- Cons: Does not improve understanding, creates brittle tests, difficult to maintain snapshots
Characterization Tests:
- Pros: Improves code understanding, documents specific behaviors, easier to maintain
- Cons: Requires more effort to write, may miss some edge cases initially
For long-term maintainability, characterization tests that document specific behaviors are preferable to golden master tests.
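To make the golden master idea concrete without a snapshot library, the sketch below records one transcript line per input combination and compares a refactored version against that recording. It rests on assumptions: `legacyLabel` and `refactoredLabel` are hypothetical functions, and `buildTranscript` is a helper invented for this illustration. In practice the master would usually be stored in a file rather than in a variable.

```typescript
// Hypothetical legacy function whose output we want to freeze.
function legacyLabel(quantity: number, tier: string): string {
  let label = tier.toUpperCase() + ':' + quantity;
  if (quantity > 10 && tier === 'gold') {
    label += ':bulk';
  }
  return label;
}

// Builds a text transcript with one line per combination of inputs.
function buildTranscript(
  fn: (quantity: number, tier: string) => string,
  quantities: number[],
  tiers: string[]
): string {
  const lines: string[] = [];
  for (const quantity of quantities) {
    for (const tier of tiers) {
      lines.push(`${quantity},${tier} => ${fn(quantity, tier)}`);
    }
  }
  return lines.join('\n');
}

const quantities = [1, 10, 11];
const tiers = ['gold', 'silver'];

// Step 1: record the legacy output once. This is the golden master.
const goldenMaster = buildTranscript(legacyLabel, quantities, tiers);

// Step 2: a candidate refactoring must reproduce the master exactly.
function refactoredLabel(quantity: number, tier: string): string {
  const suffix = quantity > 10 && tier === 'gold' ? ':bulk' : '';
  return `${tier.toUpperCase()}:${quantity}${suffix}`;
}

const candidate = buildTranscript(refactoredLabel, quantities, tiers);
const matchesMaster = candidate === goldenMaster;
```

The transcript detects any difference, but it does not say which rule changed. That is the trade-off listed in the pros and cons above.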
Refactoring with Characterization Tests
Once you have comprehensive characterization tests, you can refactor safely. Common refactoring strategies include:
Extract Method
// Before
function processPayment(
amount: number,
method: string,
customer: Customer
): PaymentResult {
// ... complex logic ...
}
// After
function processPayment(
amount: number,
method: string,
customer: Customer
): PaymentResult {
if (!isValidAmount(amount)) {
return invalidAmountResult();
}
if (method === 'credit_card') {
return processCreditCard(amount, customer);
}
if (method === 'bank_transfer') {
return processBankTransfer(amount);
}
return unknownMethodResult();
}
function isValidAmount(amount: number): boolean {
return amount > 0;
}
Replace Conditional with Polymorphism
// Before: Single class with conditionals
class PaymentProcessor {
process(method: string, amount: number, customer: Customer): Result {
if (method === 'credit_card') {
// credit card logic
} else if (method === 'bank_transfer') {
// bank transfer logic
}
}
}
// After: Strategy pattern
interface PaymentMethod {
process(amount: number, customer: Customer): Result;
}
class CreditCardPayment implements PaymentMethod {
process(amount: number, customer: Customer): Result {
// credit card logic
}
}
class BankTransferPayment implements PaymentMethod {
process(amount: number, customer: Customer): Result {
// bank transfer logic
}
}
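The "After" version leaves open how a strategy gets chosen for a given method string. One possible wiring, sketched here under assumptions, is a lookup table that keeps the original `PaymentProcessor.process` signature, so existing characterization tests can keep calling it unchanged. The `Customer` and `Result` shapes and the method bodies are placeholders; the fallback message is the one from the payment processor example later in this article.

```typescript
// Placeholder shapes for illustration; real ones come from the codebase.
interface Customer { creditScore?: number; creditLimit?: number }
interface Result { success: boolean; message: string }

interface PaymentMethod {
  process(amount: number, customer: Customer): Result;
}

class CreditCardPayment implements PaymentMethod {
  process(_amount: number, _customer: Customer): Result {
    // credit card logic would live here
    return { success: true, message: 'Payment processed' };
  }
}

class BankTransferPayment implements PaymentMethod {
  process(_amount: number, _customer: Customer): Result {
    // bank transfer logic would live here
    return { success: true, message: 'Transfer initiated' };
  }
}

// A Map avoids accidental hits on inherited object keys such as "toString".
const paymentMethods = new Map<string, PaymentMethod>([
  ['credit_card', new CreditCardPayment()],
  ['bank_transfer', new BankTransferPayment()],
]);

// Same public signature as before, so callers and tests do not change.
class PaymentProcessor {
  process(method: string, amount: number, customer: Customer): Result {
    const strategy = paymentMethods.get(method);
    if (!strategy) {
      return { success: false, message: 'Unknown payment method' };
    }
    return strategy.process(amount, customer);
  }
}
```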
Introduce Parameter Object
// Before
function calculateDiscount(
customerAge: number,
customerTier: string,
orderTotal: number,
orderDate: Date
): number {
// complex logic
}
// After
interface DiscountContext {
customerAge: number;
customerTier: string;
orderTotal: number;
orderDate: Date;
}
function calculateDiscount(context: DiscountContext): number {
// same logic, clearer interface
}
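Changing a signature breaks every caller at once, including the characterization tests. A common way to soften this is to keep the old positional signature as a thin wrapper that delegates to the new one. The sketch below assumes invented discount rules and a hypothetical name, `calculateDiscountFor`, for the context-based function; neither comes from the original code.

```typescript
interface DiscountContext {
  customerAge: number;
  customerTier: string;
  orderTotal: number;
  orderDate: Date;
}

// New entry point. The rules are invented for illustration only.
function calculateDiscountFor(context: DiscountContext): number {
  let percent = 0;
  if (context.customerTier === 'gold') {
    percent += 10;
  }
  if (context.customerAge >= 65) {
    percent += 5;
  }
  return (context.orderTotal * percent) / 100;
}

// Old positional signature, kept as a wrapper so that existing callers
// and characterization tests keep working during the migration.
function calculateDiscount(
  customerAge: number,
  customerTier: string,
  orderTotal: number,
  orderDate: Date
): number {
  return calculateDiscountFor({ customerAge, customerTier, orderTotal, orderDate });
}

const viaOldSignature = calculateDiscount(70, 'gold', 200, new Date('2024-01-15'));
const viaContext = calculateDiscountFor({
  customerAge: 70,
  customerTier: 'gold',
  orderTotal: 200,
  orderDate: new Date('2024-01-15'),
});
```

Once every caller uses the context object, the wrapper can be deleted and the tests moved to the new signature.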
Mutation Testing to Validate Test Quality
After writing characterization tests, use mutation testing to verify they actually protect against changes:
# TypeScript with Stryker
npx stryker run
Mutation testing introduces small changes (mutations) to your code and checks if your tests catch them. A high mutation score indicates your tests effectively capture the behavior.
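The idea can be shown by hand with a single mutant. The sketch below takes `isValidAmount` from the Extract Method example and writes one mutated copy, the kind of small operator change a mutation tool generates automatically. The `Check` type and the two suites are invented for this illustration and are not how Stryker represents tests.

```typescript
// Original predicate, as in the Extract Method example.
function isValidAmount(amount: number): boolean {
  return amount > 0;
}

// Hand-written mutant: ">" replaced by ">=".
function isValidAmountMutant(amount: number): boolean {
  return amount >= 0;
}

// A check receives an implementation and reports whether it behaves as pinned.
type Check = (isValid: (amount: number) => boolean) => boolean;

// Weak suite: typical values only, nothing at the boundary.
const weakSuite: Check[] = [
  (isValid) => isValid(100) === true,
  (isValid) => isValid(-5) === false,
];

// Strong suite: adds the boundary value 0.
const strongSuite: Check[] = [
  ...weakSuite,
  (isValid) => isValid(0) === false,
];

function passes(suite: Check[], impl: (amount: number) => boolean): boolean {
  return suite.every((check) => check(impl));
}

// A mutant is "killed" when at least one check fails against it.
const weakKillsMutant = !passes(weakSuite, isValidAmountMutant);
const strongKillsMutant = !passes(strongSuite, isValidAmountMutant);
```

Both suites pass against the original function, but only the one with the boundary check notices the mutant. A surviving mutant usually points at a missing characterization test.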
Practical Exercise: The Gilded Rose Kata
The Gilded Rose Kata is a classic exercise for practicing characterization tests. It presents a legacy inventory system with complex business rules and no tests.
Getting Started
- Clone the kata repository in TypeScript
- Read the requirements document to understand the business rules
- Write characterization tests to capture current behavior
- Use coverage tools to ensure all paths are tested
- Refactor the code while keeping tests green
- Add the new feature (Conjured items) using TDD
# Clone the repository
git clone https://github.com/emilybache/GildedRose-Refactoring-Kata.git
# Choose TypeScript directory
cd GildedRose-Refactoring-Kata/TypeScript
Sensible Defaults for Characterization Tests
- Start Small: Begin with simple, predictable cases before tackling complex scenarios
- Document Surprises: When you discover unexpected behavior, add comments explaining what you found
- Separate Concerns: Distinguish between tests that capture intended behavior and those that capture bugs
- Use Descriptive Names: Test names should clearly describe the behavior being characterized
- Refactor Tests: Once you understand the code, refactor tests to be more maintainable
- Delete Obsolete Tests: After refactoring, remove tests that no longer serve a purpose
Common Pitfalls
Over-reliance on Snapshots
Avoid using snapshot testing for everything. Snapshots are useful for complex outputs but should be supplemented with specific assertions for critical behaviors.
Testing Implementation Details
Focus on observable behavior, not internal implementation. Tests should survive refactoring.
Ignoring Test Maintainability
Even characterization tests need to be readable and maintainable. Refactor them as you gain understanding.
Not Updating Tests After Refactoring
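As you refactor and improve the code, reorganize the tests to reflect the new, clearer structure rather than mirroring the legacy one. The behavior they pin down should change only when you deliberately decide to fix a bug or alter a rule.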
Frontend vs Backend Characterization: Key Differences
Characterization tests apply across the entire stack, but their focus and techniques differ significantly between frontend and backend code.
Backend Characterization
Backend characterization tests focus on the behavior of business logic, data transformations, and state management. They are typically:
- Input-focused, testing direct function calls with various inputs and assertions on return values
- Fast to execute, since they operate on pure functions or isolated services
- Deterministic, with clear and repeatable scenarios
- Concerned with logic correctness, edge cases, and error handling
Backend tests rarely need to worry about visual representation or UI state. The assertions are straightforward: verify that given input X, the function returns output Y or that a side effect occurs as expected.
Frontend Characterization
Frontend characterization tests focus on user interactions, component state, and DOM rendering. They are typically:
- Interaction-focused, testing how components respond to user actions (clicks, typing, form submissions)
- Potentially slower, since they involve rendering and DOM manipulation
- More visually-oriented, often using snapshot testing to capture entire UI states
- Concerned with component behavior, accessibility, and user experience
Frontend tests must contend with the complexity of UI state, event handling, and asynchronous rendering. Asserting on specific text or attributes can be brittle; snapshot testing instead captures the entire rendered output, which makes it easier to detect unintended changes introduced by refactoring.
When to Use Snapshots vs Specific Assertions
- Use specific assertions for critical behaviors that represent business requirements (e.g., “when amount is invalid, show an error”)
- Use snapshots to characterize the overall component structure and rendering in various states
- Combine both approaches for comprehensive characterization
The key difference is that backend tests validate logic, while frontend tests validate both logic and presentation. Characterization tests on the frontend should capture the current rendering behavior as a baseline for safe refactoring.
Real Example 1: Understanding a Payment Processor (TypeScript)
Pure TypeScript Example
Initial Code
interface Customer {
creditScore?: number;
creditLimit?: number;
}
interface PaymentResult {
success: boolean;
message: string;
fee?: number;
processingTime?: string;
}
function processPayment(
amount: number,
method: string,
customer: Customer
): PaymentResult {
let result: PaymentResult = { success: false, message: '' };
if (amount <= 0) {
result.message = 'Invalid amount';
return result;
}
if (method === 'credit_card') {
if (!customer.creditScore || customer.creditScore < 600) {
result.message = 'Credit declined';
return result;
}
if (!customer.creditLimit || amount > customer.creditLimit) {
result.message = 'Exceeds credit limit';
return result;
}
result.success = true;
result.message = 'Payment processed';
result.fee = amount * 0.029 + 0.30;
} else if (method === 'bank_transfer') {
if (amount < 1) {
result.message = 'Minimum transfer is $1';
return result;
}
result.success = true;
result.message = 'Transfer initiated';
result.processingTime = '3-5 days';
} else {
result.message = 'Unknown payment method';
}
return result;
}
Characterization Tests
describe('processPayment characterization tests', () => {
describe('credit card payments', () => {
it('processes valid credit card payment with fee', () => {
const customer: Customer = { creditScore: 700, creditLimit: 5000 };
const result = processPayment(100, 'credit_card', customer);
expect(result.success).toBe(true);
expect(result.message).toBe('Payment processed');
expect(result.fee).toBeCloseTo(3.20, 2);
});
it('declines payment when credit score is below 600', () => {
const customer: Customer = { creditScore: 599, creditLimit: 5000 };
const result = processPayment(100, 'credit_card', customer);
expect(result.success).toBe(false);
expect(result.message).toBe('Credit declined');
expect(result.fee).toBeUndefined();
});
it('declines payment exceeding credit limit', () => {
const customer: Customer = { creditScore: 700, creditLimit: 50 };
const result = processPayment(100, 'credit_card', customer);
expect(result.success).toBe(false);
expect(result.message).toBe('Exceeds credit limit');
});
});
describe('bank transfer payments', () => {
it('processes bank transfer with processing time', () => {
const customer: Customer = {};
const result = processPayment(100, 'bank_transfer', customer);
expect(result.success).toBe(true);
expect(result.message).toBe('Transfer initiated');
expect(result.processingTime).toBe('3-5 days');
});
it('rejects bank transfer below minimum amount', () => {
const customer: Customer = {};
const result = processPayment(0.50, 'bank_transfer', customer);
expect(result.success).toBe(false);
expect(result.message).toBe('Minimum transfer is $1');
});
});
describe('edge cases discovered through exploration', () => {
it('rejects a zero amount', () => {
const customer: Customer = { creditScore: 700, creditLimit: 5000 };
const result = processPayment(0, 'credit_card', customer);
expect(result.success).toBe(false);
expect(result.message).toBe('Invalid amount');
});
it('handles unknown payment method', () => {
const customer: Customer = {};
const result = processPayment(100, 'paypal', customer);
expect(result.success).toBe(false);
expect(result.message).toBe('Unknown payment method');
});
});
});
These characterization tests reveal several behaviors:
- Credit card payments include a fee calculation (2.9% + $0.30)
- Credit score threshold is exactly 600
- Bank transfers have a minimum of $1, but amounts between $0 and $1 pass the initial validation (amount <= 0) and are only rejected later, with a different message
- The function returns different structures depending on the payment method
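The tests above stay on one side of each threshold, so the exact boundaries are still unpinned. The sketch below probes them directly. `processPayment` is repeated from the Initial Code so the block runs on its own, and the `observed` object is just a convenient container invented for this illustration.

```typescript
interface Customer { creditScore?: number; creditLimit?: number }
interface PaymentResult {
  success: boolean;
  message: string;
  fee?: number;
  processingTime?: string;
}

// Copy of processPayment from the Initial Code above, behavior unchanged.
function processPayment(amount: number, method: string, customer: Customer): PaymentResult {
  const result: PaymentResult = { success: false, message: '' };
  if (amount <= 0) {
    result.message = 'Invalid amount';
    return result;
  }
  if (method === 'credit_card') {
    if (!customer.creditScore || customer.creditScore < 600) {
      result.message = 'Credit declined';
      return result;
    }
    if (!customer.creditLimit || amount > customer.creditLimit) {
      result.message = 'Exceeds credit limit';
      return result;
    }
    result.success = true;
    result.message = 'Payment processed';
    result.fee = amount * 0.029 + 0.30;
  } else if (method === 'bank_transfer') {
    if (amount < 1) {
      result.message = 'Minimum transfer is $1';
      return result;
    }
    result.success = true;
    result.message = 'Transfer initiated';
    result.processingTime = '3-5 days';
  } else {
    result.message = 'Unknown payment method';
  }
  return result;
}

// Boundary probes: each entry records the message the current code returns.
const observed = {
  scoreExactly600: processPayment(100, 'credit_card', { creditScore: 600, creditLimit: 5000 }).message,
  scoreOfZero: processPayment(100, 'credit_card', { creditScore: 0, creditLimit: 5000 }).message,
  amountEqualsLimit: processPayment(5000, 'credit_card', { creditScore: 700, creditLimit: 5000 }).message,
  missingCreditLimit: processPayment(100, 'credit_card', { creditScore: 700 }).message,
  transferOfExactlyOne: processPayment(1, 'bank_transfer', {}).message,
  negativeTransfer: processPayment(-1, 'bank_transfer', {}).message,
};
```

With the current code, a score of exactly 600 and an amount equal to the limit are both accepted, while a missing credit limit is reported as 'Exceeds credit limit'. Each of these is worth a named test before refactoring.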
Real Example 2: Characterization Tests in a ReactJS Frontend
Suppose you have a legacy ReactJS component that processes user payments and you want to characterize its behavior before refactoring. The approach is similar, but you use React Testing Library and Jest for the frontend context.
// PaymentForm.tsx (production code)
import React, { useState } from 'react';
const PaymentForm: React.FC = () => {
const [amount, setAmount] = useState('');
const [method, setMethod] = useState('');
const [message, setMessage] = useState('');
const handleSubmit = (e: React.FormEvent) => {
e.preventDefault();
if (!amount || isNaN(Number(amount)) || Number(amount) <= 0) {
setMessage('Invalid amount');
return;
}
if (method === 'credit_card') {
setMessage('Payment processed');
} else if (method === 'bank_transfer') {
setMessage('Transfer initiated');
} else {
setMessage('Unknown payment method');
}
};
return (
<form onSubmit={handleSubmit}>
<label htmlFor="amount">Amount</label>
<input
id="amount"
name="amount"
type="number"
value={amount}
onChange={e => setAmount(e.target.value)}
/>
<label htmlFor="method">Method</label>
<input
id="method"
name="method"
type="text"
value={method}
onChange={e => setMethod(e.target.value)}
/>
<button type="submit">Submit</button>
{message && <div>{message}</div>}
</form>
);
};
export default PaymentForm;
// PaymentForm.test.tsx (test code; the file name is assumed)
import React from 'react';
import { render, screen } from '@testing-library/react';
import userEvent from '@testing-library/user-event';
import PaymentForm from './PaymentForm';
describe('PaymentForm characterization tests', () => {
it('renders and submits invalid amount', async () => {
render(<PaymentForm />);
const user = userEvent.setup();
await user.type(screen.getByLabelText(/amount/i), '0');
await user.click(screen.getByRole('button', { name: /submit/i }));
expect(document.body).toMatchSnapshot();
});
it('renders and submits valid credit card payment', async () => {
render(<PaymentForm />);
const user = userEvent.setup();
await user.type(screen.getByLabelText(/amount/i), '100');
await user.type(screen.getByLabelText(/method/i), 'credit_card');
await user.click(screen.getByRole('button', { name: /submit/i }));
expect(document.body).toMatchSnapshot();
});
});
This approach allows you to characterize the observable behavior of frontend components, ensuring you can refactor with confidence.
Real Example 3: Import Users Feature
This example is based on the Import Users Kata, which simulates importing user data from external sources with hidden business rules.
Initial Code (TypeScript)
interface User {
email: string;
name: string;
age: number;
}
interface Database {
userExists(email: string): boolean;
save(user: User): void;
}
class ImportResult {
private imported: string[] = [];
private skipped: string[] = [];
private errors: string[] = [];
addImported(email: string): void {
this.imported.push(email);
}
addSkipped(email: string): void {
this.skipped.push(email);
}
addError(message: string): void {
this.errors.push(message);
}
getImportedCount(): number {
return this.imported.length;
}
getSkippedCount(): number {
return this.skipped.length;
}
getErrorCount(): number {
return this.errors.length;
}
getErrors(): string[] {
return this.errors;
}
}
class UserImporter {
constructor(private database: Database) {}
importUsers(userLines: string[]): ImportResult {
const result = new ImportResult();
for (const line of userLines) {
const parts = line.split(',');
if (parts.length < 3) {
result.addError(`Invalid format: ${line}`);
continue;
}
const email = parts[0].trim();
const name = parts[1].trim();
const ageStr = parts[2].trim();
if (!email.includes('@')) {
result.addError(`Invalid email: ${email}`);
continue;
}
const age = parseInt(ageStr, 10);
if (isNaN(age)) {
result.addError(`Invalid age for ${email}`);
continue;
}
if (age < 18) {
result.addError(`User must be 18 or older: ${email}`);
continue;
}
if (this.database.userExists(email)) {
result.addSkipped(email);
continue;
}
const user: User = { email, name, age };
this.database.save(user);
result.addImported(email);
}
return result;
}
}
Characterization Tests
describe('UserImporter characterization tests', () => {
let importer: UserImporter;
let database: InMemoryDatabase;
beforeEach(() => {
database = new InMemoryDatabase();
importer = new UserImporter(database);
});
it('imports valid user', () => {
const users = ['john@example.com, John Doe, 25'];
const result = importer.importUsers(users);
expect(result.getImportedCount()).toBe(1);
expect(result.getErrorCount()).toBe(0);
expect(database.userExists('john@example.com')).toBe(true);
});
it('rejects user under 18', () => {
const users = ['young@example.com, Young User, 17'];
const result = importer.importUsers(users);
expect(result.getImportedCount()).toBe(0);
expect(result.getErrorCount()).toBe(1);
expect(result.getErrors()[0]).toContain('must be 18 or older');
});
it('skips existing users', () => {
database.save({ email: 'existing@example.com', name: 'Existing', age: 30 });
const users = ['existing@example.com, Existing User, 30'];
const result = importer.importUsers(users);
expect(result.getImportedCount()).toBe(0);
expect(result.getSkippedCount()).toBe(1);
});
it('handles invalid email format', () => {
const users = ['notanemail, John Doe, 25'];
const result = importer.importUsers(users);
expect(result.getImportedCount()).toBe(0);
expect(result.getErrorCount()).toBe(1);
expect(result.getErrors()[0]).toContain('Invalid email');
});
it('handles invalid age format', () => {
const users = ['john@example.com, John Doe, not-a-number'];
const result = importer.importUsers(users);
expect(result.getImportedCount()).toBe(0);
expect(result.getErrorCount()).toBe(1);
expect(result.getErrors()[0]).toContain('Invalid age');
});
it('handles insufficient fields', () => {
const users = ['john@example.com, John'];
const result = importer.importUsers(users);
expect(result.getImportedCount()).toBe(0);
expect(result.getErrorCount()).toBe(1);
expect(result.getErrors()[0]).toContain('Invalid format');
});
it('processes mixed valid and invalid users', () => {
const users = [
'john@example.com, John Doe, 25',
'invalid-email, Jane Doe, 30',
'bob@example.com, Bob Smith, 35'
];
const result = importer.importUsers(users);
expect(result.getImportedCount()).toBe(2);
expect(result.getErrorCount()).toBe(1);
});
});
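The tests above rely on an `InMemoryDatabase` that is not shown. A minimal sketch of such a test double, assuming it only needs to satisfy the `Database` interface from the Initial Code, could look like this. The `User` and `Database` interfaces are repeated so the block is self-contained.

```typescript
interface User {
  email: string;
  name: string;
  age: number;
}

interface Database {
  userExists(email: string): boolean;
  save(user: User): void;
}

// In-memory fake keyed by email. An assumed implementation for tests only.
class InMemoryDatabase implements Database {
  private readonly users = new Map<string, User>();

  userExists(email: string): boolean {
    return this.users.has(email);
  }

  save(user: User): void {
    // Saving the same email twice overwrites the earlier record.
    this.users.set(user.email, user);
  }
}
```

A fake like this keeps the characterization tests fast and deterministic, and lets the 'skips existing users' test seed data through the same save method the importer uses.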
Resources
- Working Effectively with Legacy Code by Michael Feathers (the definitive guide to characterization tests)
- Emily Bache’s YouTube Channel (practical kata walkthroughs)
- Gilded Rose Kata (practice characterization tests)
- Import Users Kata (realistic legacy code scenario)
- Stryker Mutation Testing (TypeScript mutation testing)
References
- Feathers, M. (2004). Working effectively with legacy code. Prentice Hall Professional.