In this blog post I’m going to skim the surface of a very large subject: Continuous Integration, or CI.
Where to Start
CI is a process, but it’s not an all-or-nothing proposition. There are levels of CI that can be achieved. As the title of this blog post suggests, you can start small and build on your process. The most difficult aspect of getting to a CI environment is the natural resistance of people who have been running your company for years. This resistance occurs because companies usually start out small, and software is easy when it’s small. It’s more forgiving. Manual deployment is not painful. Unfortunately, by the time an organization discovers that it needs to do something, the manual deployment process is at a disastrous level.
To get the process rolling, identify what can be done quickly. Each manual process that your organization is performing that can be easily and cheaply automated will save time in the future. As you implement more and more automation, you will begin to see results as operations become smoother.
I’m going to identify some low-hanging fruit that can be picked in any company that creates software and deploys it on a regular basis. First, developers should always use version control. There are many products available, including free versions that are very good (Bitbucket is one example that allows free private repositories). By using a hosted service such as GitHub or Bitbucket you will get a historical record of changes in your software and you’ll get off-site backup protection. Disaster recovery just got a bit easier.
Once your software is regularly checked in by developers, then it’s time to try out some build systems.
The Build Server
Your first level of CI is to have a system that builds your software whenever a change is checked in or merged, then notifies all developers if a build is broken. At this point, process and rules must be put in place to ensure that the build gets fixed right away. The longer a build goes without being fixed, the more difficult it becomes to find the problem. Some companies force developers to stay late to fix their build (since it can affect other developers); other companies have rules that allow the build to be broken for a maximum tolerable time. This all depends on the company and the number of developers involved.
Once a build system is in place, your software should use unit tests to ensure a change in the software does not break a previously established feature. The build server must run these unit tests every time the build is completed and the build should be rejected if the unit tests don’t pass. This is also something that must be fixed by developers right away.
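As a minimal sketch of what the build server would run, here is a unit test written with Python’s built-in `unittest` module. The `apply_discount` function and its rules are hypothetical stand-ins for one of your established features, and `run_tests_for_build` is an assumed helper whose return value a build script could use as its exit code.

```python
import unittest


def apply_discount(price, percent):
    """Hypothetical business rule, used only for illustration."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)


class ApplyDiscountTests(unittest.TestCase):
    def test_typical_discount(self):
        self.assertEqual(apply_discount(200.0, 25), 150.0)

    def test_rejects_invalid_percent(self):
        with self.assertRaises(ValueError):
            apply_discount(200.0, 150)


def run_tests_for_build():
    """Return 0 when every test passes and 1 otherwise."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ApplyDiscountTests)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return 0 if result.wasSuccessful() else 1
```

A build step that ends with `sys.exit(run_tests_for_build())` fails whenever a test fails, which is the signal most build servers use to reject a build.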
Many version control packages allow a pre-build check to be used. As you assemble your CI environment, somewhere down the road you’ll want to incorporate some sort of pre-build system that doesn’t allow the software to be checked in unless it builds (and possibly passes the unit tests). This will prevent change sets that are broken from being checked into your version control system.
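If your version control system is Git, one way to approximate this is a `pre-commit` hook, which Git runs before each commit and which aborts the commit when it exits with a non-zero status. The sketch below is only an illustration: the commands in `CHECKS` are assumptions and should be replaced with your real build and test commands.

```python
import subprocess
import sys

# Assumed commands; replace them with your real build and test commands.
CHECKS = [
    [sys.executable, "-m", "compileall", "-q", "."],
    [sys.executable, "-m", "unittest", "discover", "-q"],
]


def check_passes(command):
    """Run one command and report whether it exited with status 0."""
    return subprocess.run(command).returncode == 0


def first_failing_check(checks):
    """Return the first command that fails, or None if every check passes."""
    for command in checks:
        if not check_passes(command):
            return command
    return None


def hook_exit_code(checks=CHECKS):
    """Return 1 (reject the commit) if any check fails, otherwise 0."""
    failed = first_failing_check(checks)
    if failed is not None:
        print("Commit rejected, check failed:", " ".join(failed), file=sys.stderr)
        return 1
    return 0
```

Saved as an executable `.git/hooks/pre-commit` script that ends with `sys.exit(hook_exit_code())`, this stops a broken change set on the developer’s machine. A hook like this is local to each clone, so the build server is still needed as the final check.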
Your next phase, and probably a more difficult one, is to automate your deployment. I’m assuming that you can acquire a development server or environment that mimics your production environment. Once you scale up, you’ll need to add a quality assurance (QA) environment for testing and some sort of staging environment. Before you set up too many environments, you should get an automated deployment in place. Jenkins is a good starting point, though you can get by with just PowerShell or batch files. Initially you can automate the process of preparing your deployment package and then manually switch out existing directories with the automatically prepared directories. Once you are comfortable with that process you can automate the process of backing up the current environment and deploying the new one. Keep in mind that you should have a roll-back mechanism that works quickly. For a web server the process would look something like this:
- Create the directory with all the new files from your build server.
- Copy config files from the existing production environment.
- Stop the web server (in a web farm, perform this operation for one server at a time).
- Rename the existing website directory (give it a date so you can keep multiple backups if needed).
- Rename the new directory to the name that your web server expects.
- Start your web server.
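The steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a finished deployment tool: `deploy_release` is a hypothetical name, and `stop_server` and `start_server` are callables you would supply yourself, since the way to stop and start a web server depends on your platform. The staged directory is assumed to already contain the files from the build server.

```python
import shutil
from datetime import datetime
from pathlib import Path


def deploy_release(staged, live, config_files, stop_server, start_server):
    """Swap the staged directory into place and return the backup path."""
    staged, live = Path(staged), Path(live)

    # Copy config files from the existing production directory.
    for name in config_files:
        shutil.copy2(live / name, staged / name)

    # Date-stamped backup name so several backups can coexist.
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    backup = live.with_name(f"{live.name}-{stamp}")

    stop_server()
    try:
        live.rename(backup)
        try:
            staged.rename(live)
        except OSError:
            backup.rename(live)  # put the old site back if the swap fails
            raise
    finally:
        start_server()
    return backup
```

Keeping the returned backup path is what makes a quick roll-back possible later.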
For web farms, you can deploy your software in the middle of the day by adding steps to your process to take one server out of the farm (mark it unhealthy or disable it in your load balancer). Wait for the traffic to bleed off, then shut the server down. Continue with the directory switch. Then start the server back up and put it back in the farm. Then perform the same operation on the next web server, continuing through all servers in the farm.
To enhance this operation you can put a stop after the first web server and allow testing to be performed before authorizing the continuation of the deployment to other web servers in the farm.
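The rolling process, including the stop after the first server, could be sketched as follows. Every callable here (`take_out`, `wait_for_drain`, `deploy`, `put_back`, `approve_rest`) is a hypothetical hook that you would implement against your own load balancer and servers; the function only captures the ordering of the steps.

```python
def rolling_deploy(servers, take_out, wait_for_drain, deploy, put_back, approve_rest):
    """Deploy to one server at a time and return the servers that were updated."""
    updated = []
    for index, server in enumerate(servers):
        take_out(server)        # mark unhealthy or disable in the load balancer
        wait_for_drain(server)  # let existing traffic bleed off
        deploy(server)          # stop, switch directories, start
        put_back(server)        # return the server to the farm
        updated.append(server)

        # Pause after the first server so it can be tested before continuing.
        if index == 0 and len(servers) > 1 and not approve_rest(server):
            break
    return updated
```

If `approve_rest` returns `False`, only the first server carries the new release, so a roll-back touches a single machine.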
Next you’ll need to make sure you have a roll-back plan. Roll-back can be accomplished by stopping your web server, putting the old directory back and starting your web server. This process should be tested on a test or development system first. It needs to be perfected because you’ll need it when something bad happens.
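A roll-back along those lines might look like the sketch below. As before, `roll_back` is a hypothetical name and the `stop_server` and `start_server` callables are assumptions you would replace with your own commands. The failed release is renamed rather than deleted so that it can be examined afterwards.

```python
from datetime import datetime
from pathlib import Path


def roll_back(live, backup, stop_server, start_server):
    """Restore the backup directory and return where the failed release went."""
    live, backup = Path(live), Path(backup)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    failed = live.with_name(f"{live.name}-failed-{stamp}")

    stop_server()
    try:
        live.rename(failed)  # keep the bad release for later analysis
        backup.rename(live)  # put the old directory back
    finally:
        start_server()
    return failed
```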
If your software doesn’t log exceptions, then you’ll need to add some sort of catch-all logging (like ELMAH). You should at least log errors and send them to a text file or to an email address. Be aware that if you are adding logging to an old legacy system that has had no logging in the past, your email system must either be robust enough to handle the load or be able to dump older emails when the in-box is too full. Otherwise, you’ll find yourself with buggy software and an email system that is down. For text files, you’ll need to make sure that logging is set up to roll over (create a new file) when the text file gets too big. Find out what the maximum file size is for the editor of your choice. For products such as Notepad++, you’ll want to keep the files under 50 megabytes in size. Sublime can handle larger files, but a file will load slowly as it approaches 500 megabytes in size. You’ll also want to limit the number of roll-over files to prevent your hard drive from filling up and crashing your system.
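ELMAH is a .NET tool, but the roll-over idea is the same in any language. As an illustration, here is a minimal sketch using `RotatingFileHandler` from Python’s standard library. The function name, the logger name, and the default of five roll-over files are assumptions; the 50 megabyte limit mirrors the editor guideline above.

```python
import logging
from logging.handlers import RotatingFileHandler


def build_error_logger(path, max_bytes=50 * 1024 * 1024, backup_count=5,
                       name="site.errors"):
    """Return a logger that writes errors to a size-limited, rotating file."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.ERROR)
    logger.propagate = False
    if not logger.handlers:  # avoid attaching a second handler on repeat calls
        handler = RotatingFileHandler(
            path, maxBytes=max_bytes, backupCount=backup_count, encoding="utf-8"
        )
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

With these settings the log never occupies more than roughly `max_bytes * (backup_count + 1)` of disk space, because the oldest roll-over file is deleted when a new one is created. Calling `logger.exception("unhandled error")` inside a catch-all `except` block records the stack trace as well.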
Once you have established a logging system, analyze what errors are being logged. Focus on the largest quantity of one type of error and fix the problem in your software. Eventually, you’ll get into obscure errors that occur in situations where a web bot hits your website with incorrect parameters (or something of that nature).
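Finding the most frequent error type can be automated with a few lines of Python. The sketch below assumes a log format in which each error line contains `ERROR`, then the exception type, then a colon; that format and the sample lines are made up for illustration, so adjust the pattern to match your own logs.

```python
import re
from collections import Counter

# Assumed line format: "<timestamp> ERROR <ExceptionType>: <message>"
ERROR_LINE = re.compile(r"ERROR\s+(?P<kind>[\w.]+):")


def most_common_errors(lines, top=3):
    """Count error lines by exception type, most frequent first."""
    counts = Counter()
    for line in lines:
        match = ERROR_LINE.search(line)
        if match:
            counts[match.group("kind")] += 1
    return counts.most_common(top)


sample = [
    "2024-01-05 10:00:01 ERROR NullReferenceException: object was null",
    "2024-01-05 10:00:02 INFO request served",
    "2024-01-05 10:00:03 ERROR FormatException: bad input string",
    "2024-01-05 10:00:04 ERROR NullReferenceException: object was null",
    "2024-01-05 10:00:05 ERROR TimeoutException: upstream timed out",
    "2024-01-05 10:00:06 ERROR NullReferenceException: object was null",
    "2024-01-05 10:00:07 ERROR FormatException: bad input string",
]
print(most_common_errors(sample, top=2))
# prints [('NullReferenceException', 3), ('FormatException', 2)]
```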
Next, you’ll want to log events that occur on your APIs. Logging such events can reveal aspects of your software that you never knew existed. Scenarios where an object is null after a call to an API can occur if the parameters are unexpected. These bugs can be fixed to prevent 500 errors from tying up your resources. It can also help prevent memory leaks and stuck web server processes.
The last aspect of CI that you should focus on involves tests such as load testing. Load tests can be performed during off-hours or on a system that is isolated from your production system. Code coverage can be measured to determine if your unit tests are adequate. Be aware that code coverage can be misleading: a large quantity of poorly designed unit tests can report high coverage while checking less than a small quantity of well-designed unit tests.
Integration testing can also be automated. This is a complex subject in its own right, but there are ways to script a process to perform gets and posts on APIs and web sites to probe for 500 errors. Any manual test that is performed more than once should become a candidate for automation. Manual regression tests are time-consuming (i.e., expensive) and prone to errors. Your test suite should consist of a collection of small individual test packages that can be run separately or in parallel. Eventually, you’ll get to a point where you can perform a full regression test at night and reduce the amount of manual testing to new systems or tricky sections of code. This type of testing can be brittle if not done properly, so be aware that you will need to identify what can be automated and what will need to be done manually.
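A very small version of such a probe, limited to gets, can be written with Python’s standard library. This is a sketch under assumptions: `status_of` and `find_server_errors` are hypothetical names, the list of URLs is yours to supply, and connection failures (as opposed to HTTP error responses) are deliberately left to raise so that an unreachable server is not mistaken for a healthy one.

```python
import urllib.error
import urllib.request


def status_of(url, timeout=10):
    """Perform a GET and return the HTTP status code, including 4xx and 5xx."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urlopen raises for error responses; the code is still what we want.
        return error.code


def find_server_errors(urls, get_status=status_of):
    """Return (url, status) pairs for every URL that answers with a 5xx code."""
    failures = []
    for url in urls:
        code = get_status(url)
        if code >= 500:
            failures.append((url, code))
    return failures
```

The `get_status` parameter exists so the probing logic can be tested without a network, and so it can be swapped for a function that performs posts.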
One important aspect of CI is the practice of deploying small chunks of code often instead of large chunks rarely. When you deploy software often (as in several times a week or even several times a day), you’ll gain confidence in your ability to deploy quality code. The deployment process becomes routine and your automated processes will become hardened and reliable. Feedback from your logging should indicate if something went wrong (in which case you can roll back) or if your improvements have made an impact. Feedback from your customers can also be addressed quickly since your turn-around time consists of analysis and programming followed by your automated testing and deployment process. If your testing and deployment process takes a month, then developers will have forgotten many details of the features they programmed by the time those features are deployed.
Other Candidates for Automation
Resetting the passwords of the databases used by your website should be easy to do. Sometimes development continues at a fevered pace, and only later does someone discover that two dozen (or more) databases are accessed from connection strings in various web.config files scattered across different servers. You should at least keep a list of where the passwords are stored so you can change them all quickly.
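Building that list can itself be scripted. The sketch below walks a directory tree, finds web.config files, and reports the name of each entry under `connectionStrings`. It assumes the usual ASP.NET layout of `<configuration><connectionStrings><add name="..." connectionString="..." />`, and `list_connection_strings` is a hypothetical helper name. It reports only the file path and entry name, never the connection string itself, so the inventory does not become another place where passwords are stored.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def list_connection_strings(root_dir):
    """Return sorted (config_path, entry_name) pairs found under root_dir."""
    found = []
    for path in Path(root_dir).rglob("*.config"):
        if path.name.lower() != "web.config":
            continue
        try:
            tree = ET.parse(str(path))
        except ET.ParseError:
            continue  # skip malformed files rather than abort the inventory
        for entry in tree.getroot().findall("./connectionStrings/add"):
            found.append((str(path), entry.get("name", "")))
    return sorted(found)
```

Run against each server’s web root, this gives you a starting inventory of which files need to change when a database password is reset.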
Third-party connections can also fall into this category. If you connect to an outside service, you should keep track of where that information is stored and how to change it. It only takes one rogue programmer to make life miserable for a programming shop whose credentials live in hundreds of config files scattered everywhere. If you need to keep programmers out of your production system then you’ll need a method of changing passwords often (like once a month).