Migrating to Microservice Databases by Edson Yanaga - migration and technical challenges

Last updated: May 4, 2024 - Published: Nov 11, 2021

The content here is under the Attribution 4.0 International (CC BY 4.0) license

Microservices are a popular subject among developers and businesses: the idea of scaling services independently and moving different streams of work forward at the same time is appealing. However, with this shift in how we architect applications, the challenges and tradeoffs should be carefully analyzed.

I like books about strategies for migrating legacy systems and bringing them up to date technologically. I have written about testing strategies for code bases that are big but have no tests. A book describing possible migrations related to databases is something I value as well - and, spoiler alert, this one is to the point and shows different strategies for dealing with data.

In this post, I summarize what I understood and the insights I gained while reading the book “Migrating to Microservice Databases” by Edson Yanaga [1].

Zero downtime

The book touches on this particular subject, which is now the de facto standard in the industry: users expect no downtime for any upgrade. Even we as developers don’t want to be interrupted by an upgrade. Such an expectation comes with technical challenges.

Containers, on the other hand, have become the standard way to overcome such challenges. They allow developers to run applications as standalone units on top of orchestration tools such as Kubernetes, which won the container race and is the standard for managing containers at scale. One of the possible strategies for deploying applications with zero downtime is the Blue/Green deployment.

In the book, Yanaga also explores Canary deployments and A/B testing, strategies commonly used in microservices architectures.

Evolving your schema

One of the key points of this book for me is the care for the data. To a certain extent, without data the application can’t do much: data is the core of any business. That brings the attention to the database, or wherever you might store your data.

Another concern is how to evolve the database to keep up with business needs. Any application that a business uses requires changes, in the code and also in the database. For source code, we have been using SVN, Git, and now containers to distribute and version our work for years; the database, in this regard, has been forgotten. The database is therefore also part of the zero downtime plan: depending on which change is applied, it can lock the application. Regardless of the database you choose, the approach recommended in the book is to use a migration tool and version the migrations alongside the code in the same repository.
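To make the idea of versioned migrations concrete, here is a minimal sketch of what tools such as Flyway or Liquibase do under the hood: migrations live next to the code, each one has a version number, and a bookkeeping table records which versions have already been applied. The table and column names are illustrative, and SQLite stands in for whatever database you actually use.

```python
import sqlite3

# Illustrative versioned migrations, as a migration tool would load them
# from files checked into the code repository (e.g. V1__..., V2__...).
MIGRATIONS = [
    (1, "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)"),
    (2, "ALTER TABLE customer ADD COLUMN email TEXT"),
]

def migrate(conn):
    # Track which versions have already been applied.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER PRIMARY KEY)"
    )
    applied = {v for (v,) in conn.execute("SELECT version FROM schema_version")}
    for version, sql in MIGRATIONS:
        if version not in applied:
            conn.execute(sql)
            conn.execute(
                "INSERT INTO schema_version (version) VALUES (?)", (version,)
            )
    conn.commit()

conn = sqlite3.connect(":memory:")
migrate(conn)
migrate(conn)  # re-running is a no-op: each version is applied exactly once
```

Because the applied versions are recorded in the database itself, the same command can run on every environment (and on every deploy) and only the missing migrations execute.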

Add column migration

As mentioned in the book, this is the simplest database migration: the goal is to add a column of any type, often triggered by the need to store more data in a table. The recommended approach is:

  1. Add the column and avoid the not null constraint, as it would break existing insert/update statements
  2. Read from the column, assuming that the value might be null
  3. Update the new column’s value
  4. The new version of the code reads from and writes to the new column

Step 3 is optional; it depends on whether the requirements allow a nullable value in the table.
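The steps above can be sketched with SQLite standing in for the real database; the `customer` table and `email` column are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customer (name) VALUES ('Alice')")

# Step 1: add the column as nullable, so the old code's INSERT/UPDATE
# statements keep working while both versions are running.
conn.execute("ALTER TABLE customer ADD COLUMN email TEXT")

# Step 2: readers must tolerate NULL until every row is populated.
email = conn.execute("SELECT email FROM customer WHERE name = 'Alice'").fetchone()[0]
assert email is None

# Step 3 (optional): backfill a default if the rules forbid nullable values.
conn.execute("UPDATE customer SET email = 'unknown@example.com' WHERE email IS NULL")

# Step 4: the new code version reads from and writes to the column directly.
conn.execute("UPDATE customer SET email = 'alice@example.com' WHERE name = 'Alice'")
conn.commit()
```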

Rename column migration

Renaming a column is a bit trickier, as it might already hold data that the application uses. In this scenario, Yanaga recommends the following:

  1. Add a column, same as step 1 in the previous strategy
  2. The new code version writes to both columns - at this stage, some rows will still have null values in the new column
  3. Update the new column with the old column’s values
  4. The new version of the code reads from and writes to the new column only
  5. Delete the old column

As you may have noticed, this is the same as adding a new column, except that at the end we remove the old one.
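A sketch of the rename, again with hypothetical names (`fullname` being renamed to `full_name`); the final drop of the old column is deferred to a later release, as the delete column section discusses:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, fullname TEXT)")
conn.execute("INSERT INTO customer (fullname) VALUES ('Alice')")

# Step 1: add the new column (nullable, as in the add column migration).
conn.execute("ALTER TABLE customer ADD COLUMN full_name TEXT")

# Step 2: the new code version writes to both columns.
conn.execute("INSERT INTO customer (fullname, full_name) VALUES ('Bob', 'Bob')")

# Step 3: backfill the new column from the old one for pre-existing rows.
conn.execute("UPDATE customer SET full_name = fullname WHERE full_name IS NULL")

# Step 4: readers switch to the new column only.
names = [r[0] for r in conn.execute("SELECT full_name FROM customer ORDER BY id")]

# Step 5 (a later release, once nothing reads fullname): drop the old column.
conn.commit()
```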

Change the type of column migration

Changing the type of a column, as Yanaga mentions in his book, is not much different from the rename column migration:

  1. Add a column with the new type
  2. The new code version writes to both columns - at this step both columns exist, the new one holding the value in the desired type
  3. Update the new column with the value from the old one, converted to the desired type
  4. The new version of the code reads from and writes to the new column only
  5. Delete the old column

Delete column migration

Interestingly enough, what Yanaga states in this section is one of the most important insights I took from his book. To quote:

Never delete a column in your database when you’re releasing a new version.

Migrating to Microservice Databases: From Relational Monolith to Distributed Data, Edson Yanaga, p. 30

Playing it safe is a good idea for any kind of application, and that makes deleting a column a dangerous move. Instead of deleting it right away when releasing new code, Yanaga suggests the following:

  1. Stop reading the column’s value
  2. Stop writing to the column
  3. Only then, delete it
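Steps 1 and 2 are pure application-code releases; only the last step touches the schema. A minimal sketch of that final step, with a hypothetical unused `fax` column (note that SQLite only supports `DROP COLUMN` from version 3.35 on, so the sketch guards for that):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, fax TEXT)")
conn.execute("INSERT INTO customer (name) VALUES ('Alice')")

# Releases 1 and 2 already shipped: no code reads or writes fax anymore.
# Only now is it safe to physically drop the column.
if sqlite3.sqlite_version_info >= (3, 35, 0):
    conn.execute("ALTER TABLE customer DROP COLUMN fax")
conn.commit()

cols = [row[1] for row in conn.execute("PRAGMA table_info(customer)")]
```

Keeping the drop as its own late release means any version of the code still running in production can never hit a missing-column error.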

Refactoring Databases

ThoughtWorks released a podcast episode called “Refactoring Databases — or Evolutionary Database Design”, a good companion to this subject: it covers how to identify opportunities for improvement, minimize risks, and optimize the performance of your database.

References

  [1] E. Yanaga, Migrating to Microservice Databases: From Relational Monolith to Distributed Data. O’Reilly Media, 2017.