Refactoring

Last updated Jan 19, 2025 Published Jan 5, 2025

The content here is under the Attribution 4.0 International (CC BY 4.0) license

Refactoring has recently received a lot of attention in our ongoing project. Many non-technical staff members perceive refactoring as a major project that requires significant time and resources. However, that is only one of the approaches available in the industry. The goal here is to explore the different strategies for maintaining source code. The post is split into two parts: Part I covers the concepts and approaches, and Part II builds on them with examples.

Part I

What is Refactoring?

Refactoring is the process of restructuring existing computer code without changing its external behavior. It aims to improve the nonfunctional attributes of the software. Refactoring is crucial because it helps maintain code quality, making it easier to understand, maintain, and extend.

Why refactor?

Refactoring should be driven by the economics of software development: we refactor because we want the next change to the software to be cheaper to implement than it is today. Beck (2023) highlights that “What makes sense for us to do as programmers may go contrary to the nature of money. When geeky imperatives clash with money imperatives, money wins. Eventually.” In that sense, finding the balance between monetary reasons and long-term sustainability is a game of competing forces.

The cost associated with unit testing

Discussions about the cost of refactoring also bring up the cost of unit tests, since unit tests act as the safety net that enables refactoring. The work of Ellims et al. (2006) points to a misunderstanding of the costs associated with unit testing. The paper argues that the benefits of unit testing are often underestimated.

Despite the monetary game, refactoring is not just about making code look nice; it is about improving the structure to reduce complexity, enhance readability, and make the code easier to maintain. It should be an integral part of the development process, not a separate project. Refactoring is a continuous process that should be done regularly. According to Fritzsch (2024), around 70% of software developers spend time on the maintenance of existing systems. Hermans (2021) argues that developers spend more time reading and understanding code than writing it.

Refactoring often reduces the cost of future changes, as the code becomes easier to understand and modify. Modifying code is at the core of evolving business requirements, and software should enable that dynamic rather than prevent it. Doing so may pay back in reduced costs in the future.

Common Misconceptions

One common misconception is that refactoring is a major project that needs to be planned and executed separately. This thinking is flawed because it overlooks the continuous nature of refactoring. Treating refactoring as a separate project can lead to increased technical debt and reduced code quality over time.

Refactoring is not an exact flow to follow; it is a learning process. While facing unfamiliar code, refactoring helps developers learn, because each change deepens their understanding of the source code. It may take many refactorings, and the code may look worse for a while, before it reaches a better state.

Rewrite or refactor?

Another misconception is that refactoring is the same as rewriting code. While rewriting involves starting from scratch, refactoring focuses on improving existing code without changing its external behavior. Rewriting can be a risky and expensive endeavor, often leading to more problems than it solves. Refactoring, on the other hand, is a safer and more incremental approach that allows developers to make continuous improvements to the codebase.

Slides of the talk "Refactoring at software crafters Madrid"

In June my colleague Javier and I gave a talk at the Software Crafters Madrid meetup about refactoring. The slides are available on Speaker Deck. The talk covered the topics described in this post, with a focus on the practical aspects of refactoring in a real-world project.

Approaches to refactoring

Safety net

Martin Fowler, in his seminal book “Refactoring: Improving the Design of Existing Code,” advocates for a continuous approach to refactoring. According to Fowler, refactoring should be an integral part of the development process, performed regularly and incrementally. This approach ensures that code remains clean and maintainable, preventing the accumulation of technical debt (Fowler, 2018). However, working on code and moving it to a better state requires other techniques that help developers check its behaviour at the moment of the change. To that end, there are katas that use the “characterization testing” approach.

Characterization testing, also known as golden master testing or approval testing, originated as a way to describe and protect the actual behavior of existing legacy software when no formal specification or tests exist. Michael Feathers coined the term. The goal is to capture the current behavior of the system so that future changes can be verified against this baseline, ensuring no unintended changes occur. Unlike traditional tests that assert expected behavior, characterization tests verify that behavior remains consistent with the observed legacy behavior, making them change detectors rather than correctness validators.
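To make the idea concrete, here is a minimal sketch of a characterization test written by hand, without any approval testing library. The function `legacy_shipping_cost`, its rules, and the chosen inputs are all hypothetical, invented only for illustration. In a real project the recorded output would usually live in a file rather than in a variable.

```python
import json

def legacy_shipping_cost(weight_kg: float, express: bool) -> float:
    """Hypothetical legacy function whose rules nobody fully remembers."""
    cost = 4.0 if weight_kg <= 1 else 4.0 + (weight_kg - 1) * 1.5
    if express:
        cost = cost * 2 + 1
    return round(cost, 2)

# Inputs chosen to exercise every branch we can see in the code.
INPUTS = [(0.5, False), (1.0, True), (3.0, False), (3.0, True)]

def snapshot(func) -> str:
    """Serialize the observed outputs so they can be compared later."""
    return json.dumps([[w, e, func(w, e)] for w, e in INPUTS])

# Step 1: record the current behaviour once, before touching the code.
GOLDEN_MASTER = snapshot(legacy_shipping_cost)

def characterization_test(func) -> bool:
    """Step 2: after each change, the output must match the recording."""
    return snapshot(func) == GOLDEN_MASTER

def refactored_shipping_cost(weight_kg: float, express: bool) -> float:
    """A restructured version that must behave exactly like the original."""
    base = 4.0 + max(weight_kg - 1, 0) * 1.5
    return round(base * 2 + 1 if express else base, 2)

print(characterization_test(refactored_shipping_cost))
```

Note that the test never states what the shipping cost should be. It only detects that the output changed, which is exactly the "change detector" role described above.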

Approval testing vs Characterization testing
The wording varies depending on the source, but both terms refer to the same key idea: checking that the output has not changed.

When working on professional source code, the safety net of characterization tests might not exist. In that case, there are a few approaches to start refactoring:

  1. Characterize the code first - Before changing anything in the existing code, create the characterization tests and run them to confirm the current behaviour.
  2. Write the test first - Here we take the opposite direction: we describe the observed behaviour with tests that check the rules we expect to be fulfilled.
  3. Refactor only with safe refactorings - When you don’t have characterization or other tests before refactoring, Fowler advises taking very small, safe steps: each refactoring move is designed to be minimal and unlikely to break behavior.

While the first two approaches are more common, the third one is a good starting point when working with legacy code or when tests are not available. The key is to ensure that each refactoring step is small and manageable, allowing for easy identification of any issues that may arise. This approach helps maintain the integrity of the code while gradually improving its structure and readability. At events like Software Crafters Madrid, we often discuss these approaches and share experiences on how to effectively implement them in real-world projects.
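As an illustration of the second approach, the sketch below writes tests that state the rules we expect to hold, instead of recording whatever the code currently returns. The function `loyalty_discount` and its business rules are hypothetical, and the standard `unittest` module is used only as one possible test framework.

```python
import unittest

def loyalty_discount(years_as_customer: int, order_total: float) -> float:
    """Hypothetical existing code that we want to refactor later."""
    if years_as_customer >= 5:
        return order_total * 0.10
    if years_as_customer >= 1:
        return order_total * 0.05
    return 0.0

class LoyaltyDiscountRules(unittest.TestCase):
    # Each test names one rule we expect the code to fulfil.
    def test_new_customers_get_no_discount(self):
        self.assertEqual(loyalty_discount(0, 200.0), 0.0)

    def test_customers_after_one_year_get_five_percent(self):
        self.assertAlmostEqual(loyalty_discount(1, 200.0), 10.0)

    def test_customers_after_five_years_get_ten_percent(self):
        self.assertAlmostEqual(loyalty_discount(5, 200.0), 20.0)

def run_rules() -> bool:
    """Run the rule tests and report whether all of them passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(LoyaltyDiscountRules)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

If one of these tests fails on the untouched code, that is useful information too: either the rule was misunderstood or the code has a bug, and both are worth knowing before restructuring anything.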

Tools at hand

There are several tools available to help with refactoring, such as IDEs with built-in refactoring support, static code analysis tools, and automated testing frameworks. These tools can help identify areas of the code that need refactoring, automate repetitive refactoring tasks, and ensure that the code remains functional after changes are made.

ApprovalsJs and StrykerJs

In a blog post by Codesai, the author discusses how to use ApprovalsJs and StrykerJs in WebStorm to facilitate refactoring. The post focuses on building a safety net for refactoring by using these tools to create and run characterization tests.
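StrykerJs is a mutation testing tool: it makes small changes to the code (mutants) and checks whether the tests notice. The sketch below imitates that idea by hand to show why it helps when judging a safety net. It does not use the StrykerJs API, and the function, the mutant, and both test suites are hypothetical.

```python
def is_adult(age: int) -> bool:
    """Hypothetical production code."""
    return age >= 18

def is_adult_mutant(age: int) -> bool:
    """A hand-made mutant: the operator >= was changed to >."""
    return age > 18

def weak_suite(func) -> bool:
    """Passes for the original, but never checks the boundary value."""
    return func(30) is True and func(5) is False

def strong_suite(func) -> bool:
    """Adds the boundary case, so the mutant is detected."""
    return weak_suite(func) and func(18) is True

# A mutant "survives" when the suite still passes against it.
print("weak suite lets the mutant survive:", weak_suite(is_adult_mutant))
print("strong suite kills the mutant:", not strong_suite(is_adult_mutant))
```

A surviving mutant suggests a gap in the safety net: a refactoring could introduce that same change by accident and no test would fail.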

The types

The book The Programmer’s Brain (Hermans, 2021) treats readability as a key aspect of refactoring. The book emphasizes that readability is not just about making code look nice, but also about making it easier to understand and maintain. This aligns with Fowler’s approach, where the goal of refactoring is to improve the design of existing code without changing its external behavior. The same idea of readability is used here to decide the order in which one type of refactoring should be applied over another.

Readability

Readability refactorings focus on improving the clarity and understandability of the code. These refactorings include renaming variables and methods to more meaningful names, breaking down large methods into smaller ones, and simplifying complex conditional statements. The goal is to make the code more readable and easier to follow, which ultimately leads to better maintainability and collaboration among team members. The premise is that readable code is easier to understand, which reduces the cognitive load on developers and makes it easier to spot bugs or improvements (Hermans, 2021).
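A small before-and-after sketch of these moves is shown below: renaming, extracting a helper, and replacing nested conditionals with a guard clause. The pricing rule and all names are hypothetical, chosen only to illustrate the refactorings.

```python
def calc(d, t):
    # Before: terse names and nested conditionals hide the intent.
    if d is not None:
        if d > 0:
            if t == "p":
                return d * 0.8
            else:
                return d
        else:
            return 0
    else:
        return 0

PREMIUM = "p"
PREMIUM_RATE = 0.8

def is_valid_amount(amount) -> bool:
    """Extracted helper that gives the validity check a name."""
    return amount is not None and amount > 0

def price_for_customer(amount, customer_type):
    # After: a guard clause, descriptive names and named constants.
    if not is_valid_amount(amount):
        return 0
    if customer_type == PREMIUM:
        return amount * PREMIUM_RATE
    return amount
```

Both functions return the same results for the same inputs. Only the effort needed to read them differs.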

Design

Design refactorings focus on improving the overall structure and organization of the code. These refactorings include extracting classes or methods, introducing design patterns, and reorganizing code to follow the Single Responsibility Principle (SRP). The goal is to create a more modular and flexible codebase that can adapt to changing requirements. Design refactorings help reduce coupling between components, making it easier to modify or extend the code without affecting other parts of the system.

Design refactorings are not exclusively about readability, but rather about improving the design of the code to make it extensible. They often involve more significant changes to the code structure, such as introducing new classes or methods, or reorganizing existing code to follow best practices and design patterns.
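As a rough sketch of a design refactoring, the example below shows the result of an Extract Class step guided by the Single Responsibility Principle: aggregation and presentation live in separate classes, so a new output format can be added without touching the calculation. The sales domain and every class name are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Sale:
    product: str
    amount: float

class SalesTotals:
    """Only knows how to aggregate sales."""
    def by_product(self, sales: list[Sale]) -> dict[str, float]:
        totals: dict[str, float] = {}
        for sale in sales:
            totals[sale.product] = totals.get(sale.product, 0.0) + sale.amount
        return totals

class PlainTextReport:
    """Only knows how to present totals as text."""
    def render(self, totals: dict[str, float]) -> str:
        return "\n".join(f"{name}: {total:.2f}" for name, total in sorted(totals.items()))

class CsvReport:
    """A second format, added without changing SalesTotals."""
    def render(self, totals: dict[str, float]) -> str:
        return "\n".join(f"{name},{total:.2f}" for name, total in sorted(totals.items()))

sales = [Sale("book", 12.5), Sale("pen", 1.5), Sale("book", 7.5)]
totals = SalesTotals().by_product(sales)
print(PlainTextReport().render(totals))
print(CsvReport().render(totals))
```

Before this kind of split, a single class would typically both compute and format, so every new report format would mean editing the class that also holds the calculation.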

Performance

Performance refactorings focus on optimizing the code for better performance. These refactorings include optimizing algorithms, reducing memory usage, and improving the efficiency of data structures. The goal is to make the code run faster and use fewer resources, which is especially important in performance-critical applications.

Performance refactorings are often necessary when the codebase has grown over time and performance issues have emerged. They can involve significant changes to the code, such as replacing inefficient algorithms with more efficient ones, or restructuring data to improve access times. However, performance refactorings should be done carefully, as they can introduce new bugs or make the code harder to understand if not done properly. Of the types listed here, they are the last ones to be done.

In addition to the safety net for refactoring, performance refactoring might require additional tools or techniques, such as fitness functions to automate the process of measuring performance improvements (Ford et al., 2017).
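One simple way to picture a performance fitness function is an automated check that fails when an operation exceeds a time budget. The sketch below is a simplified interpretation of that idea, not an implementation taken from Ford et al. The duplicate-finding functions, the data size, and the budget value are all hypothetical.

```python
import time

def find_duplicates_slow(items):
    """Before: membership checks on lists make this quadratic."""
    seen, duplicates = [], []
    for item in items:
        if item in seen and item not in duplicates:
            duplicates.append(item)
        seen.append(item)
    return duplicates

def find_duplicates_fast(items):
    """After: sets give constant-time membership checks on average."""
    seen, reported, duplicates = set(), set(), []
    for item in items:
        if item in seen and item not in reported:
            reported.add(item)
            duplicates.append(item)
        seen.add(item)
    return duplicates

def within_time_budget(func, data, budget_seconds: float) -> bool:
    """Fitness function: does one call finish inside the budget?"""
    start = time.perf_counter()
    func(data)
    return time.perf_counter() - start <= budget_seconds

data = list(range(5000)) + list(range(100))
# The behaviour check comes first, then the performance check.
print(find_duplicates_fast(data) == find_duplicates_slow(data))
print(within_time_budget(find_duplicates_fast, data, budget_seconds=1.0))
```

Wall-clock checks like this are sensitive to the machine they run on, so in practice the budget needs some slack and the check is often run several times.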

A playlist of videos about refactoring

While searching for material about refactoring, I stumbled upon a few videos that changed the way I refactor. To keep track of them, I made a playlist on YouTube with all of them.

Part II

In this second part, the goal is to show examples from real code bases and build on the previous part to combine theory with practice.

Resources

References

  1. Hermans, F. (2021). The Programmer’s Brain: What Every Programmer Needs to Know About Cognition. Manning Publications.
  2. Fowler, M. (2018). Refactoring: Improving the Design of Existing Code (2nd ed.). Addison-Wesley Professional.
  3. Ford, N., Parsons, R., & Kua, P. (2017). Building Evolutionary Architectures: Support Constant Change. O’Reilly Media.
  4. Beck, K. (2023). Tidy First?: A Personal Exercise in Empirical Software Design. Pearson.
  5. Ellims, M., Bridges, J., & Ince, D. C. (2006). The economics of unit testing. Empirical Software Engineering, 11, 5–31.
  6. Fritzsch, J. (2024). Architectural refactoring to microservices: a quality-driven methodology for modernizing monolithic applications.
