Deploying new features that require changes to the database is a technical challenge in any application. To quote Jeff Atwood from Coding Horror: “When I ask development teams whether their database is under version control, I usually get blank stares.”
The database is a critical part of your application. If you deploy version 2.0 of your application against version 1.0 of your database, what do you get? A broken application. And that’s why your database should always be under source control right next to your application code. You deploy the app, and you deploy the database. Like peanut butter and chocolate, they are two great tastes that taste great together.
The Database Is Under Source Control
At Kaltura, we have kept the schema definition scripts under source control for a while. This allows you to checkout the latest Kaltura Community Edition, and install a new working server easily. Terrific! Enjoy your new Kaltura deployment.
But what can you do if you already have an existing Kaltura server running? How can we help you get the new and exciting features added to the latest Kaltura version? Upgrading to the latest version of the code and the database is not enough. We also need to migrate your existing data to the new version, change configuration files and even update 3rd party softwares we are using in the server.
In Eagle we created an update script that runs all SQL and PHP updates on the server and all SQL updates on our Kaltura Data Warehouse (DWH), which is a good start for our future upgrade process.
Our upgrade script needs to know which version of schema is already installed, and then understand and automatically apply the updates to the schema. Luckily, it does. As part of the Kaltura development cycle we deploy features to a variety of servers all the time. To deploy a feature to QA, integration, and finally into our SaaS offering, as easy as possible, we developed a method that allows us to automatically update a server to a newer version of Kaltura, and you get to benefit from this as well.
Automatic Change Management That Doesn’t Hurt
In the Kaltura open-source spirit, I’m going to describe how we manage schema changes easily in the DWH.
First, we have put in place a strong convention for defining schemas. The initial schema definition is stored in the dwh/ddl directory. Inside, we place all changes under dwh/ddl/migrations/ and create a directory named DATE_VERSION_DESCRIPTION that contains the changes, where VERSION is a positive number used to define the order of the change.
Second, we have created a table in the database called version_management, which holds the list of changes we have already applied for this deployment.
Third, we have written an update to automatically discover and apply new changes:
- Gets the latest version of the code and schema from the source control
- Checks the database for a table called version_management, and finds the latest applied changes from the database.
- Finds new changes in the dwh/ddl/migration directory that have a version number larger than the latest applied change in the database.
For each new change:
- Execute all the SQL scripts inside the change, by alphabetical order. If the order matters, we make sure to number them accordingly.
- Insert a new row into the version_management table with the new change id and the file name. Tracking file names in the database allows us to add more scripts to a change before we finish development and tests. This is also a good way to recover from failure, if such occurs.