Caching Python Dependencies for Production Stability on OpenShift

Chris Hambridge · Published in ITNEXT · Jan 25, 2019



Are you prepared? You’ve got a critical fix for your Python application to deploy to production, but PyPI is down. Isolating risks related to your dependencies is critical to maximizing stability and availability in production. If a simple step could minimize your risk of a dependency outage, and it came with the added bonus of faster builds, wouldn’t you take it?


While a PyPI outage may not seem likely, it does happen. You may ask, “Aren’t there mirrors for that case?” Yes, there are mirrors for PyPI. Does your operations team know about them? Does it know how to point your application at a working mirror? Wouldn’t it be better if they didn’t need to worry about it at all, because you’ve created a local mirror with the dependencies your applications actually use? Dependency caching is exactly what devpi (specifically devpi-server) delivers.

Running devpi on OpenShift

Getting devpi-server running on OpenShift is a simple five-step process, thanks to the open source community. The openshift-devpi project has done the legwork by providing the deployment and service definitions as well as the container image.

git clone https://github.com/saily/openshift-devpi.git
oc login https://<your-master-node>:8443
oc new-project devpi
oc create -f openshift/deployment.yaml
oc create -f openshift/service.yaml
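Once the objects are created, a quick check confirms that the server is up and that the service is exposing the expected port. This assumes the deployment and service are both named devpi, as in the openshift-devpi definitions:

# Verify the devpi-server pod is running
oc get pods -n devpi

# Verify the service and its cluster-internal port (3141)
oc get svc devpi -n devpi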

Utilizing devpi on OpenShift

Now that you have devpi-server up and running on OpenShift, how do you make use of it? The benefits of devpi come into play when you build images on OpenShift, especially if you are employing Source-to-Image (s2i) to create your image streams.

I’m going to assume that since you are developing with Python and deploying on OpenShift, you are keeping up with development trends and using Pipenv for dependency management. If you aren’t using Pipenv yet, please take a look, as it’s truly a great tool. Pipenv has the PIPENV_PYPI_MIRROR environment variable for specifying an alternate mirror. To harness the devpi now set up on your cluster, you need to pass the service endpoint into the PIPENV_PYPI_MIRROR variable. If you deployed devpi-server to the example project, devpi, then you would provide http://devpi.devpi.svc.cluster.local:3141 as the value for local cluster access. Alternatively, you could define a route that gives you external access and supply that route URL.

To pass in the environment variable you must define it as part of your BuildConfig; see the linked example. Now when you create the application you can just supply this as a parameter; if the parameter is not provided, PyPI will be the default. With all this in place you should also see a significant improvement in build times. You can also adjust the DEVPI_MIRROR_CACHE_EXPIRY value on the devpi-server to more (or less) than the default of a day, depending on how frequently your dependencies change.
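For illustration, here is a minimal sketch of one way to wire this up from the command line, assuming an existing s2i BuildConfig; the name my-python-app is a placeholder:

# Set the mirror on an existing s2i BuildConfig (name is a placeholder)
oc set env bc/my-python-app PIPENV_PYPI_MIRROR=http://devpi.devpi.svc.cluster.local:3141

# Kick off a build that resolves its dependencies through devpi
oc start-build my-python-app

The first build still reaches out to PyPI through devpi to populate the cache; subsequent builds that request the same packages are served from the local mirror.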

Synopsis

Hopefully this story has introduced you to a helpful tool, devpi, and given you a short set of steps to harden your production environment against dependency outages. We discussed how to get devpi-server running on OpenShift in five simple steps. Next we looked at how image creation can make use of devpi and get a significant speed-up. While this is a helpful setup for a production cluster, it can also be used in a development cluster or on a local cluster. If you are a developer with OpenShift running on your local system, you can even make use of the performance improvements when doing local development outside of OpenShift, if you add an external route and provide the mirror when installing dependencies.
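For that local-development scenario, a rough sketch might look like the following; the route host is whatever OpenShift assigns, so treat it as a placeholder:

# Expose devpi outside the cluster and look up the assigned route host
oc expose service devpi -n devpi
oc get route devpi -n devpi

# Point Pipenv at the route for installs run outside the cluster
export PIPENV_PYPI_MIRROR=http://<devpi-route-host>
pipenv install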


