The purpose of this blog post is to describe how legacy code gets perpetuated years beyond its useful life and how to put a stop to it. I will also make my case for why this process needs to be stopped. I will attempt to target this article to managers as well as the developers who are continuing the creation of legacy code. I would like to make a disclaimer up front that my intent is not to insult anybody. My intent is to educate people, to get them out of their shell and thinking about newer technologies and why those technologies were invented.
First, I’m going to give a bit of my own history as a developer so there is some context to this blog post.
I have been a developer since 1977 or 78 (too long ago to remember the exact year I wrote my first Basic program). I learned Basic. Line numbered Basic. I joined the Navy in 1982 and I was formally educated on how to repair minicomputers, specifically the UYK-20 and the SNAP-II. In those days you troubleshot down to the circuit level (and sometimes replaced a chip). While I was in the Navy, the Apple Macintosh was introduced and I bought one because it fit in the electronics storage cabinet in the transmitter room on the ship (which I had a key to). I programmed with Microsoft Basic and I wanted to write a game or two. My first game was a battleship game that had graphical capabilities (and use of the mouse, etc.). It didn’t take long before the line numbers became a serious problem and I finally gave in and decided to look at other languages. I was very familiar with Basic syntax, so switching was like learning to speak a foreign language. It was going to slow me down.
I stopped at the computer store (that really was “the day”), and I saw Mac Pascal in a box on the shelf and the back of the box had some sample code. It looked similar to Basic and I bought it. I got really good at Pascal. Line numbers were a thing of the past. In fact I used Pascal until I was almost out of college. At that time the University of Michigan was teaching students to program using Pascal (specifically Borland Pascal). Object oriented programming was just starting to enter the scene and several instructors actually taught OOP concepts such as encapsulation and polymorphism. This was between 1988 and 1994.
The reason I used Pascal for so long was that the Macintosh built-in functions used Pascal headers. The reason I abandoned Pascal was that the World Wide Web was invented around that time and everything unixish was in C. I liked C and my first C programs were written in Borland C.
What’s my Beef with Legacy Programmers?
If your development knowledge ended with ASP and/or VB without learning and using a unit testing framework, the MVC framework (or equivalent), ORMs, Test Driven Development and SOLID principles, then you probably are oblivious to how much easier it is to program within a modern environment. This situation happens because programmers focus on solving a problem with the tools they have in their tool box. If a programmer doesn’t spend the time to learn new tools, then they will always apply the same set of tools to the problem. These are the programmers that I am calling Legacy Programmers.
Legacy Programmers, who am I talking about?
First, let’s describe the difference between self-taught and college educated developers. I get a lot of angry responses about developers who have a degree and can’t program. There are a lot of them. This does not mean that the degree is the problem and it also should not lead one to believe that a developer without a degree will be guaranteed to be better than a degree carrying developer. Here’s a Venn diagram to demonstrate the pool of developers available:
The developers that we seek in order to create successful software are in the intersection of the degree/non-degree programmers. This diagram is not intended to indicate that there are more or fewer of either kind of developer in the intersection called solid developers. In my experience, there are more college degree carrying developers in this range due to the fact that most solid developers will be wise enough to realize that they need to get the piece of paper that states that they have a minimum level of competence. It’s unfortunate that colleges are churning out so many really bad developers, but not obtaining the degree usually indicates that the individual is not motivated to expand their knowledge (there are exceptions).
OK, now for a better Venn diagram of the world of developers (non-unix developers):
In the world of Microsoft language developers there are primarily VB and C# developers. Some of these developers only know VB (and VB Script) as indicated by the large blue area. I believe these individuals outnumber the total C# programmers judging by the amount of legacy code I’ve encountered over the years, but I could be wrong on this assumption. The C# programmers are shown in red, and the number of individuals who know C# and not VB is small. That’s due to the fact that C# programmers don’t typically come from an environment where C# is their first language. In the VB circle, people who learned VB and not C# are normally self-taught (colleges don’t typically teach VB). Most of the developers that know VB and C# come from the C# side and learn VB, or if they are like me, they were self-taught before they obtained a degree and ended up with knowledge of both languages.
The legacy programmers I’m talking about in this blog post fall into the blue area and do not know C#.
Where am I Going With This?
OK, let’s cut to the chase. In my review of legacy code involving VB.Net and VB Script (AKA Classic ASP) I have discovered that the developers who built the code do not understand OOP patterns, SOLID principles, Test Driven Development, MVC, etc. Most of the code in the legacy category fits the type of code I used to write in the early 90s before I discovered how to modularize software using OOP patterns. I forced myself to learn the proper way to break a program into objects. I forced myself to develop software using TDD methods. I forced myself to learn MVC (and I regret not learning it when it first came out). I did this because these techniques solved a lot of development issues. These techniques help to contain bugs, enhance debugging capabilities, reduce transient errors and make it easier to enhance software without breaking existing features (using unit tests to perform regression testing). If you have no idea what I’m talking about, or maybe you’ve heard the terms and you have never actually used these techniques in your daily programming tasks, you’re in trouble. Your career is coming to an end unless you learn now.
Let’s talk about some of these techniques and why they are so important. First, you need to understand Object Oriented Programming. The basic idea of this approach is that an object is built around the data that you are working on (I’m not talking about database data, I’m talking about a small atomic data item, like an address or personnel information or maybe a checking account). The data is contained inside the object and then methods are built to act on this data. The object itself knows all about the data that is acted on, and external objects that use this object do not need to understand the nuances of the data (like how to dispose of allocated resources or how to keep a list properly ordered). This allows the developer that creates the object to hide details, debug the methods that act on the data and not have to worry about another object corrupting the data or not using it correctly. It also makes the software modular.
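To make the checking account idea concrete, here is a minimal sketch. It is written in Python only to keep it short; the same shape carries over to C#. The class name, the method names and the decision to store money as whole cents are all my own assumptions for illustration.

```python
class CheckingAccount:
    """A small object that owns its data and guards every change to it."""

    def __init__(self, owner, opening_balance_cents=0):
        if opening_balance_cents < 0:
            raise ValueError("opening balance cannot be negative")
        self._owner = owner
        self._balance_cents = opening_balance_cents
        self._history = []  # kept in the order the transactions happened

    @property
    def balance_cents(self):
        # Read-only from the outside: there is no setter.
        return self._balance_cents

    def deposit(self, amount_cents):
        if amount_cents <= 0:
            raise ValueError("deposit must be positive")
        self._balance_cents += amount_cents
        self._history.append(("deposit", amount_cents))

    def withdraw(self, amount_cents):
        if amount_cents <= 0:
            raise ValueError("withdrawal must be positive")
        if amount_cents > self._balance_cents:
            raise ValueError("insufficient funds")
        self._balance_cents -= amount_cents
        self._history.append(("withdrawal", amount_cents))

    def history(self):
        # Hand back a copy so a caller cannot corrupt the real list.
        return list(self._history)
```

Code that uses the account only ever calls deposit, withdraw and history. It cannot overdraw the balance or reorder the history, so any bug in those rules has to live inside this one class.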
On a grander scale is a framework called MVC (Model View Controller). This is not the only framework available, but it is the most common web development framework in Microsoft Visual Studio. What this framework does is give a clean separation between the C# (or VB) code and the web view code (which is typically written in HTML, jQuery and possibly Razor). ASP mixes all the business logic in with the view code and there are no controllers. In MVC, the controllers wire up the business logic with the view code. Typically the view will communicate with the controller through AJAX calls, which gives the web-based interface a smooth feel. The primary reason for breaking code up in this fashion is to be able to put the business logic in a test harness and wrap unit tests around each feature that your program performs.
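Here is a rough sketch of that separation, stripped of any real web framework. It is plain Python, not ASP.NET MVC, and every name in it (order_total_cents, render_total, OrderController, the shape of the request dictionary) is hypothetical. The point is only to show which piece is allowed to know about what.

```python
import json


# Model: business logic only. No HTML, no HTTP.
def order_total_cents(items):
    """items is a list of (unit_price_cents, quantity) pairs."""
    return sum(price * qty for price, qty in items)


# View: presentation only. It receives a finished number.
def render_total(total_cents):
    return "<p>Total: ${:.2f}</p>".format(total_cents / 100)


# Controller: wires a request to the model, then picks a response format.
class OrderController:
    def _items(self, request):
        return [(i["price_cents"], i["qty"]) for i in request["items"]]

    def show_total(self, request):
        # Full page request: answer with rendered HTML.
        return render_total(order_total_cents(self._items(request)))

    def total_json(self, request):
        # AJAX-style request: answer with data and let the page update itself.
        return json.dumps({"total_cents": order_total_cents(self._items(request))})
```

Because order_total_cents knows nothing about HTML or requests, it can be dropped straight into a test harness, which is the payoff described above.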
Unit testing is very important. It takes a lot of practice to perform Test Driven Development (TDD) and it’s easier to develop your code first and then create unit tests, until you learn the nuances of unit testing, object mocking and dependency injection. Once you have learned about mocking and dependency injection, you’ll realize that it is more efficient to create the unit tests first, then write your code to pass the test. After your code is complete, each feature should be matched up with a set of unit tests so that any future changes can be made with the confidence that you (or any other developer) will not break previously defined features. Major refactoring can be done in code designed this way because any major change that breaks the code will show up in the failure of one or more unit tests.
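For anyone who has not seen mocking and dependency injection in practice, here is a small sketch using Python's built-in unittest and unittest.mock modules. The WelcomeService class and its mailer dependency are made up for this example; in C# the same idea would use a testing framework and a mocking library of your choice.

```python
import unittest
from unittest.mock import Mock


class WelcomeService:
    """Business logic under test. The mailer is injected, not created here."""

    def __init__(self, mailer):
        self._mailer = mailer

    def register(self, email):
        if "@" not in email:
            raise ValueError("invalid email address")
        self._mailer.send(email, "Welcome!")
        return True


class WelcomeServiceTests(unittest.TestCase):
    def test_register_sends_one_welcome_email(self):
        mailer = Mock()  # stands in for a real mail server
        WelcomeService(mailer).register("pat@example.com")
        mailer.send.assert_called_once_with("pat@example.com", "Welcome!")

    def test_register_rejects_bad_address_without_sending(self):
        mailer = Mock()
        with self.assertRaises(ValueError):
            WelcomeService(mailer).register("not-an-address")
        mailer.send.assert_not_called()
```

Saved to a file, these tests run with python -m unittest. No email is ever sent, because the service receives a mock instead of building its own mailer. If a later change breaks either rule, one of these tests fails before the code ships.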
ORMs (Object Relational Mappers) are becoming the technique to use for querying data from a database. An ORM with LINQ is a cleaner way to access a database than ADO or a DataSet. One aspect of an ORM that makes it powerful is the fact that a query written in LINQ can use the context sensitive editor functions of Visual Studio to avoid syntax errors. The result set is contained in an object with properties, which produces code that is easier to read.
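The sketch below is not a real ORM and it is not LINQ. It is a hand-rolled mapping in Python on top of the standard sqlite3 module, and the table, the Customer class and the CustomerTable helper are all invented for illustration. It only shows the end result an ORM gives you: rows come back as objects with named properties instead of as positional columns.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Customer:
    id: int
    name: str
    city: str


class CustomerTable:
    """Tiny stand-in for what an ORM generates: rows in, typed objects out."""

    def __init__(self, connection):
        self._db = connection

    def in_city(self, city):
        rows = self._db.execute(
            "SELECT id, name, city FROM customer WHERE city = ? ORDER BY name",
            (city,),
        )
        return [Customer(*row) for row in rows]


def demo_database():
    # In-memory database with made-up sample rows.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
    db.executemany(
        "INSERT INTO customer (name, city) VALUES (?, ?)",
        [("Zoe", "Detroit"), ("Adam", "Detroit"), ("Maria", "Chicago")],
    )
    return db
```

Calling code then reads customer.name rather than row[1], which is the readability gain I am talking about. A real ORM also writes the SQL for you and lets the editor check the query as you type.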
APIs (Application Programming Interfaces) and SOA (Service Oriented Architecture) are the new techniques. These are not just buzzwords that sound cool. These were invented to solve an issue that legacy code has: You are stuck with the language you developed your entire application around. By using Web APIs to separate your view from your business logic, you can reuse your business logic for multiple interfaces, like mobile applications, custom mini-applications, mash-ups with 3rd party software, etc. The MVC framework is already set up to organize your software in this fashion. To make the complete separation, you can create two MVC projects, one containing the view components and one containing the model and controller logic. Then your HTML and jQuery code can access your controllers in the same way they would if they were in the same project (using Web API). However, different developers can work on different parts of the project. A company can assign developers to define and develop the APIs to provide specific data. Then developers/graphic artists can develop the view logic independently. Once the software is written, other views can be designed to connect to the APIs that have been developed, such as reports or mobile. Other APIs can be designed using other languages, including languages such as Python or Ruby running on a unix (or Linux) machine. The view can still communicate with the API because the common language will be either JSON or XML.
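A small sketch of why JSON as the common language matters. The HTTP transport is left out on purpose, and the function names, the account data and the two views are hypothetical. What it shows is one piece of business logic that returns JSON text and two unrelated front ends that consume it.

```python
import json


def get_account_summary(account_id):
    """Hypothetical API endpoint body: returns JSON text, never HTML."""
    accounts = {42: {"owner": "Pat", "balance_cents": 130000}}  # stand-in for real storage
    account = accounts.get(account_id)
    if account is None:
        return json.dumps({"error": "not found"})
    return json.dumps({"id": account_id, **account})


def web_view(payload):
    # A browser front end turns the JSON into markup.
    data = json.loads(payload)
    return "<h1>{}</h1><p>${:.2f}</p>".format(data["owner"], data["balance_cents"] / 100)


def mobile_view(payload):
    # A mobile front end turns the very same JSON into a plain label.
    data = json.loads(payload)
    return "{}: ${:.2f}".format(data["owner"], data["balance_cents"] / 100)
```

Neither view knows or cares what language produced the payload, so the API behind it could be rewritten in C#, Ruby or anything else without touching them.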
Another aspect of legacy code that makes enhancements difficult is the use of tightly coupled code. There is a set of five design principles called SOLID. This is not the only set of principles around, but it is a very good one. By learning and applying SOLID to any software development project, you can avoid the problems of tightly coupled code, procedures or methods that perform more than one task, untestable code, etc.
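Here is a sketch of two of those ideas, single responsibility and depending on an abstraction instead of a concrete class. It is in Python, and the names OrderSource, InvoiceCalculator and InMemoryOrders, along with the integer tax rate, are assumptions made up for this example.

```python
from typing import Protocol


class OrderSource(Protocol):
    """The abstraction the calculator depends on."""

    def orders_for(self, customer_id: int) -> list: ...


class InvoiceCalculator:
    """One job only: compute an invoice total. It never opens a database."""

    def __init__(self, source: OrderSource, tax_rate_percent: int):
        self._source = source
        self._tax_rate_percent = tax_rate_percent

    def total_cents(self, customer_id):
        subtotal = sum(self._source.orders_for(customer_id))
        # Integer math on cents; fractions of a cent are dropped.
        return subtotal + subtotal * self._tax_rate_percent // 100


class InMemoryOrders:
    """One possible source. A database-backed one could replace it unchanged."""

    def __init__(self, data):
        self._data = data

    def orders_for(self, customer_id):
        return self._data.get(customer_id, [])
```

The tightly coupled version of this would build its own database connection inside total_cents. Here the calculator can be handed any object that has an orders_for method, which is what keeps it testable and easy to change.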
The last issue is the use of VB itself. I have seen debates of VB vs. C#, and VB has all the features of C#, etc. etc. Unfortunately, VB is not Microsoft’s flagship language; C# is. This is made obvious by the fact that many of the C# features in Visual Studio are only now going to come to the VB world in Visual Studio 2015. The other issue with VB is that it is really a legacy language with baggage left over from the 1980s. VB was adapted to be object oriented, not designed to be an object oriented language. C#, on the other hand, is only an OOP language. If you’re searching for code on the internet there is a lot more MVC and Web API code in C# than in VB. This trend is going to continue and VB will become the “Fortran” of the developer world. Don’t say I didn’t warn ya!
If you are developing software and are not familiar with the techniques I’ve described so far, you need to get educated fast. I have kept up with the technology because I’m a full-blooded nerd and I love to solve development issues. I evolved my knowledge because I was frustrated with producing code that contained a lot of bugs and was difficult to enhance later on. I learned each of these techniques over time and have applied them with a lot of success. If I learn a new technique and it doesn’t solve my issue, I will abandon it quickly. However, I have had a lot of success with the techniques that I’ve described in this blog post. You don’t need to take on all of these concepts at once, but start with C# and OOP. Then work your way up to unit testing, TDD and then SOLID.