Django Migrations with Init Containers on OpenShift

Chris Hambridge · Published in ITNEXT · Aug 12, 2019



As you iterate on and grow your application, you will inevitably need to add to or alter your database schema. In the Django ecosystem, migrations are a familiar part of the application life cycle. As you move into production, you will have to confront the operational impact of migrations as new versions of your application, with their associated database changes, are rolled out.


As new versions of your application are deployed, changes to the database can mean potential outages. We need to understand both how to migrate the database safely and how to do so in a way that avoids an outage entirely, by utilizing rolling deployments.

Additionally, we can decouple the main application from the migration phase of deployment to allow for the fastest possible application startup. Decoupling migrations from the normal application flow means you can quickly scale out your application without wasting cycles checking for migrations that have already run.

Django migrations on OpenShift

A common mechanism for deploying a Django application on OpenShift is a DeploymentConfig. A DeploymentConfig lets you define your deployed pods and trigger updates based on image and configuration changes. It also lets you define a deployment strategy, such as the rolling deployment mentioned above.

Deployment Config Schema
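A minimal sketch of such a DeploymentConfig might look like the following; the application name, project, and image references are hypothetical placeholders:

```yaml
apiVersion: apps.openshift.io/v1
kind: DeploymentConfig
metadata:
  name: django-app              # hypothetical application name
spec:
  replicas: 2
  strategy:
    type: Rolling               # rolling deployment strategy
  selector:
    app: django-app
  template:
    metadata:
      labels:
        app: django-app
    spec:
      containers:
        - name: django-app
          image: myproject/django-app:latest   # hypothetical image
          ports:
            - containerPort: 8080
  triggers:
    - type: ConfigChange        # redeploy on configuration changes
    - type: ImageChange         # redeploy on new image pushes
      imageChangeParams:
        automatic: true
        containerNames:
          - django-app
        from:
          kind: ImageStreamTag
          name: django-app:latest
```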

Data migration approach with rolling deployments

A rolling deployment strategy can allow you to avoid outages as you migrate your data. In a rolling deployment, the older version (N-1) of your application keeps running while the new version comes online, so there is never a gap in your API's ability to respond to users.

Example Rolling Strategy
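A rolling strategy can be tuned with `rollingParams`; a sketch with commonly used values (the specific numbers here are illustrative, not recommendations):

```yaml
strategy:
  type: Rolling
  rollingParams:
    updatePeriodSeconds: 1    # wait between individual pod updates
    intervalSeconds: 1        # wait between deployment status polls
    timeoutSeconds: 600       # give up if the rollout stalls this long
    maxUnavailable: 25%       # how many old pods may be down at once
    maxSurge: 25%             # how many extra new pods may be created
```

Because `maxUnavailable` keeps a portion of the old (N-1) pods serving traffic throughout the rollout, the database changes applied during the update must not break that older version.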

This also means you must be mindful of how you apply migrations. Changes you make in a new version cannot be destructive; they must be additive, so that the older version does not fail while the rolling update takes place and the database is altered. This additive approach doesn't mean you keep growing your database tables, columns, etc. forever; it means you may need to take a two-step approach. First, you keep the old tables/columns and add the new ones along with the new code, which no longer references the old tables/columns you want to remove. Then you deliver a second update whose migrations remove the old tables/columns the new code no longer uses. This two-step flow of updates keeps you always up while still keeping your data structures concise and reflective of your application's use.

Drawbacks to data migrations in your startup flow

Now that we understand a safe, always-up approach to migrating the database, we still need to understand how to migrate it efficiently during deployment. Many examples of Django deployments fold the migration step into the startup flow, either as a command in the container's entrypoint or as a step in the source-to-image (S2I) run command. While inserting the database migration as a step in the application startup command is a logical approach, it ultimately slows application startup, and over time, with many migrations, this can become quite costly. The greatest impact on startup performance occurs when you scale out your application, perhaps with a horizontal pod autoscaler.
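The coupled pattern typically looks something like this entrypoint script (the project module and port are hypothetical); note that every replica repeats the migration check on startup:

```shell
#!/bin/bash
# Typical entrypoint that couples migrations to application startup.
set -e

# Every pod replica re-runs this on startup, even when all
# migrations have already been applied.
python manage.py migrate --noinput

# Only then does the application server start.
exec gunicorn myproject.wsgi:application --bind 0.0.0.0:8080
```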

Example Horizontal Pod Autoscaler
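A sketch of a horizontal pod autoscaler targeting the DeploymentConfig above; the name and thresholds are hypothetical:

```yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: django-app            # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps.openshift.io/v1
    kind: DeploymentConfig
    name: django-app
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 80   # scale out above 80% CPU
```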

In an instance where your pod is scaling out due to a burst in traffic and you have hit one of your scale triggers (a memory or CPU threshold), the speed at which the new pod can start up is crucial. If your pod's startup logic includes checks for migrations that have already executed, you are wasting startup time and may fail to start the pod fast enough to handle the increased load. Failing to handle the increased load means your users may be met with the dreaded 502 Bad Gateway error, and if it was a small burst in traffic, it may simply have been a waste of resources. How do we meet the needs of data migration and fast startup that allows for rapid scale-out? Next we'll look at Init Containers as a possible solution.

What is an Init Container?

An Init Container is a short-lived container that runs before the rest of the pod is deployed. Here you can run utilities or custom code. An Init Container does not even have to use the same image as the deploying application. This means you could slim down the application image by excluding tools, which is good for both build speed and security, or your Init Container image may be small if it only needs to perform a small set of functions, like shell scripts, when it's triggered. It's also possible to have multiple Init Containers, so you can isolate functional steps as necessary and decouple items to reduce code maintenance.

Migrations as an Init Container

As seen in the last section, we can solve our migration issue using an Init Container. Essentially, you can reuse the same container image you previously defined and use the Django migration command (`python manage.py migrate`) as your container command.

Sample Init Container Spec
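A sketch of how the init container could fit into the DeploymentConfig's pod template, reusing the application image; the image and secret names are hypothetical:

```yaml
spec:
  template:
    spec:
      initContainers:
        - name: migrations
          image: myproject/django-app:latest   # same image as the app
          command:
            - python
            - manage.py
            - migrate
            - --noinput
          envFrom:
            - secretRef:
                name: django-db-credentials    # hypothetical DB settings
      containers:
        - name: django-app
          image: myproject/django-app:latest
          ports:
            - containerPort: 8080
```

The application container is not started until the `migrations` init container exits successfully, so the migration is guaranteed to run first, exactly once per rollout, rather than on every pod startup.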

This allows you to decouple migrations from your normal application startup logic while still ensuring they run before your application is deployed.

Synopsis

Hopefully this story has introduced you to an approach for Django migrations that focuses on staying always up and enabling a quick scale-out. We looked at a two-step approach to migrations that allows for a rolling deployment strategy. We also discussed the drawbacks of running migrations in the container's startup entrypoint and how an Init Container can decouple migration execution from the normal application startup flow to enable a snappy scale-out of your application.



Software Engineer at Red Hat. Passionate about devOps and cloud native technologies.