Summary
In this blog post I’m going to talk about legacy code and my experience with transitioning from legacy systems to a new model.
Background
I’ve worked for quite a few companies now and I’ve converted many systems from antiquated legacy code into something more modern, like C#, PHP, Object Oriented, MVC, etc. Legacy code that I have worked with included languages like VB.Net, Basic, FoxPro, MS Access, C, ASP and other systems. My experience has been that a company will hire a “programmer” with few qualifications, other than the fact that they can write code and make small programs work. Once the software created by this person (and it’s almost always one person at first) grows to the point where it becomes an overwhelming burden on the programmer and starts to drag down the company, some changes need to be made. Usually this is where the professionals get hired and the real work begins. Don’t get me wrong, I’m not bragging here. I was once that guy who produced some code in Basic in a non-structured layout with no design in hand. That’s a fun job as long as you can get out before you have to add features to the mess that was created.
Sometimes legacy code exists because it is so big, so expensive or so risky to upgrade that a company will hang on to it forever. I can tell horror stories about a company that used Fortran for their accounting, with thousands of rules that applied to their union-negotiated benefits, all customized for the company on a mainframe computer that needed to be fixed in time for the year 2000. I’m certain that software was well designed. Unfortunately, the company had two Fortran programmers who had full-time jobs maintaining that software and no plans for replacing the software in the near future. To top that off, the programmers who maintained that system must have been close to retirement. I’m not sure what that company did, but I’m betting that events forced their hand.
Initial Handling of Legacy Code
First of all, I think developers need to think like a doctor. Doctors have the Hippocratic Oath: “First do no harm”. Let’s face some facts. First, the legacy code is running the business; otherwise, you could just throw it away. Second, the legacy code in place might be the only actual “documentation” of what it was intended to do. It’s always best if there are design documents, because then you can read those and find out what the original intent was. If there are meeting notes, that’s good too. Otherwise, the only thing left (assuming the original programmer is long gone) is the software itself. Does it have comments? Probably not. Are variable names descriptive? That usually doesn’t happen either. You might need to do some refactoring, but I would do some recon work before attempting anything.
One of the biggest mistakes inexperienced programmers make is ignoring version control. So it’s probably also true that you can’t look at earlier versions of the legacy code you are about to work on. The first step in all of this is to make sure you have the entire source code package checked into a repository and have some mechanism in place to track changes. This is critical, because you’ll probably make a mistake and need to roll back.
Converting Code – My Experiences
I’ve never worked on legacy code that contained unit tests. Can unit tests be created to test the legacy code? This is always a huge challenge. It would be nice if a unit or integration test could be applied to legacy code without changing the code. That would make it easy to convert. Once the unit tests were in place, you could convert the code and use the same unit tests to verify that the changed code matched the output of the original code. Unfortunately, legacy code is usually tightly coupled and many times it’s tied into other systems that cannot be accessed safely from a unit test environment. You might be able to perform some sort of end-to-end automated tests. This could become a time-sink in itself, and my experience is that the company keeps making demands to enhance the existing legacy code while you’re attempting to fix it.
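When a piece of the legacy logic can be called in isolation, one way to get there is a characterization (or “golden master”) test: record what the old code returns for a set of inputs, then replay those inputs against the converted code. Below is a minimal sketch in Python. The `legacy_price` function, its discount rule and the file layout are all hypothetical; they stand in for whatever routine you can actually isolate.

```python
import json
from pathlib import Path


def legacy_price(quantity, unit_cost):
    """Hypothetical stand-in for an existing legacy routine."""
    total = quantity * unit_cost
    if quantity > 100:
        total = total * 0.9  # the kind of undocumented rule legacy code hides
    return round(total, 2)


def record_golden(func, cases, path):
    """Run func over the cases once and save whatever it returns."""
    results = [{"args": list(args), "result": func(*args)} for args in cases]
    Path(path).write_text(json.dumps(results, indent=2))


def check_against_golden(func, path):
    """Return (args, expected, actual) for every case where func disagrees."""
    recorded = json.loads(Path(path).read_text())
    mismatches = []
    for entry in recorded:
        actual = func(*entry["args"])
        if actual != entry["result"]:
            mismatches.append((entry["args"], entry["result"], actual))
    return mismatches
```

You would call `record_golden` once against the old code, keep the resulting file in version control, and run `check_against_golden` against the converted routine after every change. The test does not say whether the old behavior was right, only that the new code still matches it.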
One of the methods I used when I converted BTA’s software from two un-normalized databases (using SQL and FoxPro) into one normalized Oracle database server was to write a data converter and re-write the software. This was quite a painful experience and I would not recommend this technique unless your company has full buy-in and the enhancement requests can be reduced to a minimum. In the case of BTA, they had such a data quality issue that fixing it was their customers’ number one demand. It took about a year to complete and the roll-out was difficult, but it worked. In order to be effective I hired two contractors to maintain the existing system and write the converter. My remaining developers worked on the new system. The converter was run at least once a week to refresh our new system database and to find and fix any normalization issues that cropped up in the legacy system. Once the new system was on-line and the initial bugs were worked out, everything ran noticeably smoother. Having one code base that was well structured helped a lot. At that time unit tests were not available and most of our code was in PHP and C.
Years later BTA ran into another issue. This time it involved the demise of Borland. Borland made a great product at one time called C++ Builder. This was C++ that had forms, allowing the developer to create a stand-alone program in C++ without programming and connecting all the resources necessary to handle events, windows, menus, etc. The product worked the way C# does today. Visual C++ did not have this capability, which made it difficult and time-consuming to build an application. Unfortunately, C++ Builder went through some changes and the 2006 version was too buggy for our developers to use. So we tested Visual Studio 2005 with C# and discovered that we could quickly convert our software from C++ Builder into Visual C# and have an improved product. So we did, and it cost us about 4 months of conversion time using one developer. This legacy code was easy to convert because we had design documentation, we did not change the database and we could transition from the old to the new system because both worked at the same time. We just continued to use the new software and fix bugs that cropped up until it was better than the legacy software.
About one year after the conversion to C#, BTA ran into an issue that had happened many times before. We were a Unix shop (Oracle and PHP), but a smaller company with little possibility of major growth in the near future. We could not keep a Unix expert on staff full-time and there were very few contract companies in the area that could cover our issues. In addition, BTA took advantage of a one-time tax change that allowed them to collect a large amount of cash that year. So I did some research into what it would take to transform BTA’s Unix-based systems into Microsoft-based systems. This was outside-the-box thinking, and I proposed that we convert our Oracle server to MS SQL, which cost us next to nothing since the licensing was comparable and the conversion of our software to point to SQL instead of Oracle was very minor (our system had very few stored procedures and field types were kept to a minimum). Our biggest headache would be converting a large PHP web site into C# and IIS. I wrote some helper classes in C# that mimicked the PHP functions that C# did not reproduce exactly (like PHP’s substring functions). Then I estimated what it would take in man-hours to convert the entire website into C# and I was tasked with finding a contracting company that could perform the job. The job involved copying the PHP code and pasting it into the code-behind of a C# .NET application, then removing the dollar signs on variables and declaring all variables for the page. Then there was some cleanup for each page. In the end it took about 4-6 months with 3 contractors at a cost of about $140k.
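To give a feel for what such a helper has to do, here is a minimal sketch of a function that mimics PHP’s `substr()`. The original helpers were written in C# and are not shown in this post; this is a Python illustration only, and the name `php_substr` is made up. It follows the PHP 8 rules as I understand them: a negative offset counts from the end, a negative length drops characters from the end, and out-of-range values give an empty string instead of an error. One known difference is that PHP’s `substr()` counts bytes, while this sketch counts Python characters.

```python
def php_substr(text, offset, length=None):
    """Approximate PHP 8 substr() semantics (hypothetical helper)."""
    size = len(text)

    # A negative offset counts back from the end, clamped to the start.
    if offset < 0:
        start = max(size + offset, 0)
    else:
        start = min(offset, size)

    if length is None:
        end = size                         # no length: run to the end
    elif length < 0:
        end = max(size + length, start)    # drop characters from the end
    else:
        end = min(start + length, size)    # take at most `length` characters

    return text[start:end]
```

With a helper like this in place, pasted code that relied on calls such as `substr($s, -3, 1)` can keep its behavior without each call site being rewritten by hand, which matters when the native substring routine throws on out-of-range arguments.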
BTA’s benefit was a code base that was entirely C#. After that point, we were able to merge many common objects between our stand-alone application and our website. We were also able to find network administrators with a Microsoft background more easily than Unix administrators. In addition, and this was our biggest improvement to developer productivity, we were using Visual Studio for all of our code, with built-in tools for refactoring code, syntax highlighting, auto-formatting and other capabilities we didn’t have with PHP at the time.
At DealerOn the situation is different. We have a solid database. It might not be the most optimal database design, but it’s organized pretty well and it doesn’t suffer from many integrity issues. What we currently have is a platform in VB.Net that we are slowly converting into C#.Net and eventually we will convert that into an MVC environment. There is a lot of progress in this area of legacy code replacement. Our biggest headache lies with our CMS (Content Management System) software that was built in classic ASP. Our plan is to build a new system (which is already in progress) to utilize C# with MVC and build it with all the new features. We are very averse to adding features to the legacy CMS. Unfortunately, progress is slow and success is uncertain. I am not a big fan of this method of conversion. It only works if a company can staff up and build the new system as quickly as possible. Otherwise the technology used on the new system will become obsolete and possibly be abandoned.
Going Forward
I anticipate that advances in software development technology will bring us to a point where the legacy code will have unit tests. At the very least it will be possible to add unit tests. This will make the job of converting legacy code easier because there will be a more reliable method of verifying the new code. I don’t think the culture of businesses will change such that they will spend the money to hire the best developer when they start their custom software projects. So we’ll always be saddled with code that is tightly coupled, poorly designed and not documented. Therefore, there will always be a painful legacy code conversion project in the future. How you handle these projects will determine how successful you are in converting the code. DealerOn’s platform code conversion is one of the better solutions I’ve seen. We will have a mix of VB and C# for some time until we get to a critical mass of C# vs. VB and we do a final “clean-up” push to convert the remaining VB code. At this time the conversion from VB to C# is not costing the company much money since it is only done when a new enhancement is implemented. This limits the amount of code converted at one time and allows a smaller amount of code to be tested before the next conversion takes place.
If you have the option of running two systems side-by-side, it can be a good solution, especially in accounting situations where verification is necessary. If you can mix legacy and modern code, that is a good solution in situations where you are transforming a website. If you have a faulty database design, then you’re dealing with a more serious situation and any solution can get expensive fast.
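For the side-by-side case, the verification step can be as simple as comparing the totals each system produces for the same period. This is a minimal sketch, assuming both systems can export a total per account; the `reconcile` function and the account keys are hypothetical.

```python
from decimal import Decimal


def reconcile(legacy_totals, new_totals, tolerance=Decimal("0.00")):
    """Return {account: (legacy, new)} for every account that disagrees.

    An account missing from one side is reported with None on that side.
    """
    differences = {}
    for account in sorted(set(legacy_totals) | set(new_totals)):
        old = legacy_totals.get(account)
        new = new_totals.get(account)
        if old is None or new is None or abs(old - new) > tolerance:
            differences[account] = (old, new)
    return differences
```

An empty result for several periods in a row is reasonable evidence that the new system can take over. `Decimal` is used here so that currency amounts compare exactly instead of picking up floating-point noise.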
Reasons to Convert Legacy Code
Conversion of code from one language to another or from one database system to another is expensive and does not necessarily benefit the company. You should never convert a system just to change to the latest and coolest language. The only time you should convert to a new language is if there is some sort of structural reason for converting. In my experience, I converted a FoxPro database system into Oracle and PHP. Why? Because FoxPro was not a web language and BTA had an application to allow roofing contractors to enter their bid prices. The FoxPro application they had was designed to be packaged into an installer shipped by floppy (later downloaded by FTP), and installed on the contractor’s PC with a subset of data to bid on. Then the contractor was required to export their bid prices and email the zip file to BTA for analysis with other contractors. By building a website, all of the headache of preparing data, shipping software, dealing with installation issues (contractors do not buy top-of-the-line PCs) and importing corrupt data vanished overnight. Now contractors get a login id and password. They log in, enter their prices and log out. It goes right into a database server. FoxPro is a stand-alone database system with limited security capabilities and no web capabilities. This needed to change and our target was Oracle and PHP. If MySQL had been a mature product in the late 90s we probably would have opted for MySQL. Today, I would investigate MongoDB and work up an estimate based on that technology.
Converting from PHP to C# was a very rare example of a conversion that could have been avoided. It would not have been necessary if BTA had been a growing company that could afford a Unix administrator on staff, or if PHP had had better tools at the time. Today, I probably would look at Ruby and Python.
Another consideration you should factor into taking the conversion plunge: What languages do your developers know? Are you going to have to re-staff? Switching from PHP to C# did not break the hearts of my team of developers at BTA. Some developers might take issue with converting from a Unix environment to a Microsoft environment (or the other direction). I personally have no allegiance to either, but I’m a rare developer with extensive knowledge of both types of infrastructures.
Finally, if you have a valid reason and the benefits outweigh the costs, then don’t hesitate. Set down your plan and get to work. The sooner you get through the pain of conversion the sooner you can move on to better things.
It is a never-ending battle with legacy code. What you have today will be legacy a year or two down the road. One problem in our industry is that people don't look at profitability. Change for change's sake is not a good thing. However, we have to continuously look at improvement with an eye toward future extension. As far as system-wide design is concerned, this article is a good read: http://msdn.microsoft.com/en-us/magazine/dn451442.aspx
That really is a good read. DealerOn has been leaning toward the SOA framework. The goal is to divide the system into smaller systems that can be upgraded/rebuilt in future versions in any language or OS. For this company the SOA model works; I'm not sure how well it would have worked for BTA's system.