Using Scripts

Summary

In this post I’m going to show how you can improve developer productivity by steering developers to use scripts where it makes sense.

Setting up IIS

As a back-end developer, I spend a lot of time standing up and configuring new APIs.  One of the tools I use to reduce the number of hours it takes to get an API up and running is PowerShell.  Personally, the word “PowerShell” makes my skin crawl.  Why?  Because it’s a scripting language with a syntax that feels like something built by Dr. Frankenstein.  Since I haven’t memorized each and every syntax nuance of PowerShell, I rely on a lot of Google searches.  Fortunately, after several years of use, I’ve become familiar with some of its capabilities and I can save a lot of time when I create IIS sites.

Now you’re probably wondering where I save my time, since the script has to be written and the site only needs to be set up once.  The time savings come when I have to change something minor or when I have to establish the site in another environment.  In the case of another environment, I can change the path name or URL to match the destination environment and run my script to create all the pieces necessary to run my API.

Before I get into the script, I’m going to go through the steps to create an IIS site for a .Net Core 2.0 Web API.

Step 1: Set up the Application Pool.

  • Open IIS and navigate to the Application Pool node.
  • Right-click and add.
  • Give your app pool a name that matches your site, so you can identify it quickly.  This will save you troubleshooting time.
  • For .Net Core, you need to set the .Net Framework Version to “No Managed Code”.

Step 2: Set up the IIS site.

  • Right-click on the “Sites” node and select “Add Web Site”.
  • I usually name my site the same as the URL, or at least the sub-domain of the URL, so I can find it quickly.  Again, this name is not used by the system; it is only used when I have to troubleshoot, and saving time troubleshooting is the number one priority.
  • Set the path to point to the root of your publish directory (make sure you have done a local publish from Visual Studio before performing this step).
  • Type in the host name.  This is the URL of your site.  If you are just testing locally, you can make up a URL that you’ll need to add to the Hosts file.
  • Select the Application Pool that you created earlier.

Step 3 (optional): Set up the Hosts file.  Use this step only if you are setting up a local website for testing purposes.

  • Navigate to C:\Windows\System32\drivers\etc
  • Edit the “hosts” file.  You might have to edit it with Administrator rights.
  • Add your URL to the hosts file: “127.0.0.1       MyDotNetWebApi.com”

Now try to visualize performing this process for each environment that your company uses.  For me, that comes out to about half a dozen environments.  In addition, each developer who needs your API set up on their PC will have to go through the same configuration.  Here’s where the time-saving comes in.  Create the PowerShell script first, and test the script.  Never create the site by hand.  Then use the script for each environment.  Provide the script to other developers so they can set up their own local copy.  This can be accomplished by posting the script on a wiki page or checking the script into your version control system with the code.

Here’s what an example PowerShell script would look like:

# if you get an error when executing this script, comment the line below to exclude the WebAdministration module
Import-Module WebAdministration

#setup all IIS sites here
$iisAppList = 
    "MyDotNetWebApi,MyDotNetWebApi.franksurl.com,c:\myapicodedirectory,", # use "v4.0" for non-core apps
    "testsite2,testsite2.franksurl.com,c:\temp,v4.0"
    


# setup the app pools and main iis websites
foreach ($appItem in $iisAppList)
{
    $temp = $appItem.split(',')
    
    $iisAppName = $temp[0]
    $iisUrl = $temp[1]
    $iisDirectoryPath = $temp[2]
    $dotNetVersion = $temp[3]
    
    #navigate to the app pools root
    cd IIS:\AppPools\

    if (!(Test-Path $iisAppName -pathType container))
    {
        #create the app pool
        $appPool = New-Item $iisAppName
        $appPool | Set-ItemProperty -Name "managedRuntimeVersion" -Value $dotNetVersion
    }
    
    
    #navigate to the sites root
    cd IIS:\Sites\
    
    if (!(Test-Path $iisAppName -pathType container))
    {
        #create the site
        $iisApp = New-Item $iisAppName -bindings @{protocol="http";bindingInformation=":80:" + $iisUrl} -physicalPath $iisDirectoryPath
        $iisApp | Set-ItemProperty -Name "applicationPool" -Value $iisAppName
        
        Write-Host $iisAppName "completed."
    }
}

c:

You can change the sites listed at the top of the script.  The app pool is set up first, followed by the IIS web site.  Each section tests to see if the app pool or site is already set up (in which case it skips that step), so you can run the PowerShell script again without causing errors.  Keep the script in a safe location; then you can add to the list and re-run the PowerShell script.  If you need to recreate your environment, you can create all sites with one script.

If you delete all your IIS sites and app pools you might run into the following error:

New-Item : Index was outside the bounds of the array.

To fix this “issue” create a temporary web site in IIS (just use a dummy name like “test”).  Run the script, then delete the dummy site and its app pool.  The error is caused by a bug where IIS is trying to create a new site ID.

Setting a Directory to an Application

There are times when you need to convert a directory in your website into its own application.  To do this in IIS, you would perform the following steps:

  • Expand the website node
  • Right-click on the directory that will be converted and select “Convert to Application”
  • Click “OK”

To perform this operation automatically in a script, add the following code after creating your IIS sites above (just before the “c:” line of code):

$iisAppList = 
    "MyDotNetWebApi,MyAppDirectory,MyAppDirectory,MyDotNetWebApi\MyAppDirectory,"

foreach ($appItem in $iisAppList)
{
    $temp = $appItem.split(',')

    $iisSiteName = $temp[0]
    $iisAppName = $temp[1]
    $iisPoolName = $temp[2]
    $iisPath = $temp[3]
    $dotNetVersion = $temp[4]

    cd IIS:\AppPools\

    if (!(Test-Path $iisPoolName -pathType container))
    {
        #create the app pool
        $appPool = New-Item $iisPoolName
        $appPool | Set-ItemProperty -Name "managedRuntimeVersion" -Value $dotNetVersion
    }

    cd IIS:\Sites\
    
    # remove and re-apply any IIS applications
    if (Get-WebApplication -Site $iisSiteName -Name $iisAppName)
    {
        Remove-WebApplication -Site $iisSiteName -Name $iisAppName
    }

    ConvertTo-WebApplication -PSPath $iisPath -ApplicationPool $iisPoolName
}

Now add any applications to the list.  The first parameter is the name of the IIS site.  The second parameter is the application name.  The third parameter is the pool name (this script will create a new pool for the application).  The fourth parameter is the path to the folder.  The last parameter is the .Net version (use v4.0 if this application is not a .Net Core project).

For the above script to run, you’ll need to create a blank directory called: C:\myapicodedirectory\MyAppDirectory

Now execute the script and notice that MyAppDirectory has been turned into an application.

You can add as many applications to each IIS website as you need by adding to the list.

The code above creates an application pool first (if it doesn’t already exist).  It then removes the application from the site and converts the directory into an application for the specified site.  This script can also be executed multiple times without causing duplicates or errors.

If you run into problems executing your script, you might have to run it as an Administrator.  I usually start PowerShell in Administrator mode, navigate to the directory containing the script and then execute the script.  This allows me to see any errors in the console window.  If you right-click on the ps1 file and run it with PowerShell, your script could fail and exit before you can read the error message.

Feel free to copy the scripts from above and build your own automated installation scripts.

 

XML Serialization

Summary

In this post I’m going to demonstrate the proper way to serialize XML and set up unit tests using xUnit and .Net Core.  I will also be using Visual Studio 2017.

Generating XML

JSON is rapidly taking over as the data encoding standard of choice.  Unfortunately, government agencies are decades behind the technology curve and XML is going to be around for a long time to come.  One of the largest industries still using XML for a majority of their data transfer encoding is the medical industry.  Documents required by Meaningful Use are mostly encoded in XML.  I’m not going to jump into the gory details of generating a CCD.  Instead, I’m going to keep this really simple.

First, I’m going to show a method of generating XML that I’ve seen many times.  Usually coded by a programmer with little or no formal education in Computer Science.  Sometimes programmers just take a short-cut because it appears to be the simplest way to get the product out the door.  So I’ll show the technique and then I’ll explain why it turns out that this is a very poor way of designing an XML generator.

Let’s say for instance we wanted to generate XML representing a house.  First we’ll define the house as a record that contains its square footage.  That will be the only data point assigned to the house record (I mentioned this was going to be simple, right?).  Inside of the house record will be a list of walls and a list of roofs (assume a house could have two or more roofs, like a tri-level configuration).  Next, I’m going to make a list of windows for the walls.  The window block will have a “Type” that is a free-form string input and the roof block will also have a “Type” that is a free-form string.  That is the whole definition.

public class House
{
  public List<Wall> Walls = new List<Wall>();
  public List<Roof> Roofs = new List<Roof>();
  public int Size { get; set; }
}

public class Wall
{
  public List<Window> Windows { get; set; }
}

public class Window
{
  public string Type { get; set; }
}

public class Roof
{
  public string Type { get; set; }
}

The “easy” way to create XML from this is to use the StringBuilder and just build XML tags around the data in your structure.  Here’s a sample of the possible code that a programmer might use:

public class House
{
  public List<Wall> Walls = new List<Wall>();
  public List<Roof> Roofs = new List<Roof>();
  public int Size { get; set; }

  public string Serialize()
  {
    var @out = new StringBuilder();

    @out.Append("<?xml version=\"1.0\" encoding=\"utf-8\"?>");
    @out.Append("<House xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\">");

    foreach (var wall in Walls)
    {
      wall.Serialize(ref @out);
    }

    foreach (var roof in Roofs)
    {
      roof.Serialize(ref @out);
    }

    @out.Append("<size>");
    @out.Append(Size);
    @out.Append("</size>");

    @out.Append("</House>");

    return @out.ToString();
  }
}

public class Wall
{
  public List<Window> Windows { get; set; }

  public void Serialize(ref StringBuilder @out)
  {
    if (Windows == null || Windows.Count == 0)
    {
      @out.Append("<wall />");
      return;
    }

    @out.Append("<wall>");
    foreach (var window in Windows)
    {
      window.Serialize(ref @out);
    }
    @out.Append("</wall>");
  }
}

public class Window
{
  public string Type { get; set; }

  public void Serialize(ref StringBuilder @out)
  {
    @out.Append("<window>");
    @out.Append("<Type>");
    @out.Append(Type);
    @out.Append("</Type>");
    @out.Append("</window>");
  }
}

public class Roof
{
  public string Type { get; set; }

  public void Serialize(ref StringBuilder @out)
  {
    @out.Append("<roof>");
    @out.Append("<Type>");
    @out.Append(Type);
    @out.Append("</Type>");
    @out.Append("</roof>");
  }
}

The example I’ve given is a rather clean example.  I have seen XML generated with much uglier code.  This is the manual method of serializing XML.  One almost obvious weakness is that the output produced is a straight line of XML, which is not human-readable.  In order to allow human readable XML output to be produced with an on/off switch, extra logic will need to be incorporated that would append the newline and add tabs for indents.  Another problem with this method is that it contains a lot of code that is unnecessary.  One typo and the XML is incorrect.  Future editing is hazardous because tags might not match up if code is inserted in the middle and care is not taken to test such conditions.  Unit testing something like this is an absolute must.

The proper (and ultimately easier) method is to use the XML serializer.  To produce the correct output, it is sometimes necessary to add attributes to the properties of the objects to be serialized.  Here is the object definition that produces the same output:

public class House
{
  [XmlElement(ElementName = "wall")]
  public List<Wall> Walls = new List<Wall>();

  [XmlElement(ElementName = "roof")]
  public List<Roof> Roofs = new List<Roof>();

  [XmlElement(ElementName = "size")]
  public int Size { get; set; }
}

public class Wall
{
  [XmlElement(ElementName = "window")]
  public List<Window> Windows { get; set; }

  public bool ShouldSerializeWindows()
  {
    return Windows != null;
  }
}

public class Window
{
  public string Type { get; set; }
}

public class Roof
{
  public string Type { get; set; }
}

In order to serialize the above objects into XML, you use the XmlSerializer object:

public static class CreateXMLData
{
  public static string Serialize(this House house)
  {
    var xmlSerializer = new XmlSerializer(typeof(House));

    var settings = new XmlWriterSettings
    {
      NewLineHandling = NewLineHandling.Entitize,
      IndentChars = "\t",
      Indent = true
    };

    using (var stringWriter = new Utf8StringWriter())
    {
      using (var writer = XmlWriter.Create(stringWriter, settings))
      {
        xmlSerializer.Serialize(writer, house);
      }

      return stringWriter.GetStringBuilder().ToString();
    }
  }
}

You’ll also need to create a Utf8StringWriter Class:

public class Utf8StringWriter : StringWriter
{
  public override Encoding Encoding
  {
    get { return Encoding.UTF8; }
  }
}

Unit Testing

I would recommend unit testing each section of your XML.  Test with sections empty as well as containing one or more items.  You want to make sure you capture instances of null lists or empty items that should not generate XML output.  If there are any special attributes, make sure that the XML generated matches the specification.  For my unit testing, I stripped newlines and tabs to compare with a sample XML file that is stored in my unit test project.  As a first attempt, I created a helper for my unit tests:

public static class XmlResultCompare
{
  public static string ReadExpectedXml(string expectedDataFile)
  {
    var assembly = Assembly.GetExecutingAssembly();
    using (var stream = assembly.GetManifestResourceStream(expectedDataFile))
    {
      using (var reader = new StreamReader(stream))
      {
        return reader.ReadToEnd().RemoveWhiteSpace();
      }
    }
  }

  public static string RemoveWhiteSpace(this string s)
  {
    s = s.Replace("\t", "");
    s = s.Replace("\r", "");
    s = s.Replace("\n", "");
    return s;
  }
}

If you look carefully, I’m compiling my XML test data right into the unit test dll.  Why am I doing that?  The company that I work for, as well as most serious companies, uses continuous integration tools such as a build server.  The problem with a build server is that your files might not end up in the same directory location on the build server as they are on your PC.  To ensure that the test files are there, compile them into the dll and reference them from the namespace using Assembly.GetExecutingAssembly().  To make this work, you’ll have to mark your XML test files as an Embedded Resource (click on the xml file and change the Build Action property to Embedded Resource).  To access the files, which are contained in a virtual directory called “TestData”, you’ll need to use the namespace, the virtual directory and the full file name:

XMLCreatorTests.TestData.XMLHouseOneWallOneWindow.xml

Now for a sample unit test:

[Fact]
public void TestOneWallNoWindow()
{
  // one wall, no windows
  var house = new House { Size = 2000 };
  house.Walls.Add(new Wall());

  Assert.Equal(XmlResultCompare.ReadExpectedXml("XMLCreatorTests.TestData.XMLHouseOneWallNoWindow.xml"), house.Serialize().RemoveWhiteSpace());
}

Notice how I filled in the house object with the size and added one wall.  The ReadExpectedXml() method removes whitespace automatically, so it’s important to remove it from the serialized version of the house as well in order to match.

Where to Get the Code

As always, you can go to my GitHub account and download the sample application (click here).  I would recommend downloading the application and modifying it as a test to see how all the pieces work.  Add a unit test to see if you can match your expected XML with the output of the XML serializer.


Environment Configuration

Introduction

We’ve all modified and saved variables in the web.config and app.config files of our applications.  I’m going to discuss issues that occur when a company maintains multiple environments.

Environments

When I talk about environments, I’m talking about entire systems that mimic the production system of a company.  An example is a development environment, a quality environment and sometimes a staging environment.  There may be other environments as well for purposes such as load testing and regression testing.  Typically a development or QA environment is smaller than the production environment, but all the web, database, etc. servers should be represented.

The purpose of each environment is to provide the ability to test your software during different stages of development.  For this reason it’s best to think of software development like a factory with a pipe-line of steps that must be performed before delivering the final product.  The software developer might create software on his/her local machine and then check in changes to a version control system (like Git or TFS).  Then the software is installed on the development environment using a continuous integration/deployment system like TeamCity, BuildMaster or Jenkins.  Once the software is installed on the development system, then developer-level integration testing can begin.  Does it work with other parts of the software?  Does it work with real servers (like IIS)?

Once a feature is complete and it works in the development environment, it can be scheduled to be quality checked or QA’d.  The feature branch can be merged and deployed to the QA environment and tested by QA personnel and regression scripts can be run.  If the software is QA complete, then it can be merged up to the next level.  Load testing is common in larger environments.  A determination must be made if the new feature causes a higher load on the system than the previous version.  This may be accomplished through the use of an environment to contain the previous version of code.  Baseline load numbers may also be maintained to be used as a comparison for future load testing.  It’s always best to keep a previous system because hardware can be upgraded and that will change the baseline numbers.

Once everything checks out, then your software is ready for deployment.  I’m not going to go into deployment techniques in this blog post (there are so many possibilities).  I’ll leave that for another post.  For this post, I’m going to dig into configuration of environments.

Setting up the physical or virtual hardware should be done in a fashion that mimics the production system as closely as possible.  If the development, QA and production systems are all different, then it’ll be difficult to get software working in each environment.  This is a waste of resources and needs to be avoided at all costs.  Most systems today use virtual servers on a host machine, making it easy to set up an identical environment.  The goal in a virtual environment is to automate the setup and tear-down of servers so environments can be created fresh when needed.

The Web.Config and App.Config

One issue with multiple environments is the configuration parameters in the web.config and app.config files (and now the appsettings.json for .net core apps).  There are other config files that might exist, like the nlog.config for nlog setup, but they all fall into the same category: They are specific for each environment.

There are many solutions available.  BuildMaster provides variable injection: a template web.config is set up in BuildMaster and variables contain the data to be inserted for each deployment type.  Visual Studio has a capability called web.config transformation, where several web.config files can be set up in addition to a common web.config and merged when different configurations are built.  PowerShell can be used to replace text in a web.config: either one PowerShell script per environment, or a web.config with commented sections where PowerShell removes the comment lines for the section that applies to the environment being deployed to.

These are all creative ways of dealing with this problem, but they all lack a level of security that is needed if your company has isolation between your developers and the production environment.  At some point your production system becomes so large that you’ll need an IT department to maintain the live system.  That’s the point where you need to keep developers from tweaking production code whenever they feel the need.  Some restrictions must come into play.  One restriction is the passwords used by the databases.  Production databases should be accessible for only those in charge of the production system.

What this means is that the config parameters cannot be checked into your version control system.  They’ll need to be applied when the software is deployed.  One method is to put placeholder parameters in place of the values, to be filled in at the end of deployment.  Then a list of variables and their respective values can exist on a server in each environment.  That list would be specific to the environment and can be used by PowerShell on the deployment server to replace the placeholders in the web.config file.

There is a system called Zookeeper that can hold all of your configuration parameters and be accessed centrally.  The downside is that you’ll need a cluster of servers to provide the throughput for a large system, plus another potential central point of failure.  The complexity of your environment just increased for the sole purpose of keeping track of configuration parameters.

Local Environments

Local environments are a special case problem.  Each software developer should have their own local environment to use for their development efforts.  By creating a miniature environment on the software developer’s desktop/laptop system, the developer has full flexibility to test code without the worry of destroying data that is used by other developers.  An automated method of refreshing the local environment is necessary.  Next comes the issue of configuring the local environment.  How do you handle local configuration files?

Solution 1: Manually alter local config files.

The developer needs to exclude the config files from every check-in; otherwise one developer’s local changes could end up in the version control software.

Solution 2: Manually alter local config files and exclude from check-in by adding to ignore settings.

If there are any automatic updates to the config by Visual Studio, those changes will not be checked into your version control software.

Solution 3: Create config files with replaceable parameters and include a script to replace them as a post build operation.

Same issue as solution 1: the files could get checked in to version control and that would wipe out the changes.

Solution 4: Move all config settings to a different config file.  Only maintain automatic settings in web.config and app.config.

This is a cleaner solution because the local config file can be excluded from version control.  A location (like a wiki page or the version control) must contain a copy of the local config file with the parameters to be replaced.  Then a script must be run the first time to populate the parameters or the developer must manually replace the parameters to match their system.  The web.config and app.config would only contain automatic parameters.

One issue with this solution is that it would be difficult to convert legacy code to use this method.  A search and replace for each parameter must be performed, or you can wrap the ConfigurationManager object and implement your own AppSettings lookup that reads the values from the custom config file (ditto for the database connection settings).
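As a rough illustration, here is a minimal sketch of such a wrapper.  The LocalSettings.config file name and the LocalConfig class are made-up names for this example; the wrapper falls back to the normal ConfigurationManager values when no local override exists:

using System.Collections.Generic;
using System.Configuration;
using System.IO;
using System.Linq;
using System.Xml.Linq;

// Hypothetical wrapper: reads overrides from a local, non-versioned config file
// and falls back to the standard web.config/app.config appSettings.
public static class LocalConfig
{
  private static readonly Dictionary<string, string> _overrides = LoadOverrides("LocalSettings.config");

  private static Dictionary<string, string> LoadOverrides(string path)
  {
    if (!File.Exists(path))
    {
      return new Dictionary<string, string>();
    }

    // Expected file format: <appSettings><add key="..." value="..." /></appSettings>
    return XDocument.Load(path)
      .Descendants("add")
      .ToDictionary(e => (string)e.Attribute("key"), e => (string)e.Attribute("value"));
  }

  public static string AppSetting(string key)
  {
    return _overrides.TryGetValue(key, out var value)
      ? value
      : ConfigurationManager.AppSettings[key];
  }
}

A call such as LocalConfig.AppSetting("CacheServer") then behaves the same on a developer PC and in a deployed environment; only the local override file differs.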

Why Use Config Files at all?

One of the questions I see a lot is the question about using the config file in the first place.  Variables used by your software can be stored in a database.  If your configuration data is specific to the application, then a table can be set up in your database to store application config data.  This can be cached and read all at once by your application.  There is only one catch: what about your database connection strings?  At least one connection string will be needed, and probably a connection string to your caching system (if you’re using Redis or Memcached or something).  That connection string will be used to read the application config variables, like other database connection strings, etc.  Keep this method in mind if you hope to keep your configuration rat’s nest down to something manageable.  Each environment would read its config variables from the database that belongs to it.
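Here is a minimal sketch of that idea.  It assumes a hypothetical AppConfig table with Key and Value columns; only the connection string has to come from outside, and everything else is read once and cached:

using System.Collections.Generic;
using System.Data.SqlClient;

// Hypothetical example: load all application settings from an AppConfig table
// (Key/Value columns) once at startup and cache them in a dictionary.
public class DatabaseConfig
{
  private readonly Dictionary<string, string> _settings = new Dictionary<string, string>();

  public DatabaseConfig(string connectionString)
  {
    using (var connection = new SqlConnection(connectionString))
    using (var command = new SqlCommand("SELECT [Key], [Value] FROM AppConfig", connection))
    {
      connection.Open();
      using (var reader = command.ExecuteReader())
      {
        while (reader.Read())
        {
          _settings[reader.GetString(0)] = reader.GetString(1);
        }
      }
    }
  }

  public string this[string key] => _settings[key];
}

The one connection string passed into the constructor is still the secret that has to be kept out of version control.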

The issues listed in the local environment are still valid.  In other words, each environment would need its own connection, and the production environment (and possibly other environments) connection would need to be kept secret from developers.

Custom Solutions

There are other solutions to the config replacement problem.  The web.config can be parsed by a custom program and each config parameter value replaced from a list of replacement values keyed by the setting name (the appSettings key, for example).  Then you can maintain a localized config settings file for each environment.  This technique has the added bonus of not breaking if someone checks their local copy into the version control software.

Here is a code snippet to parse the appSettings section of the web.config file:

// read the web.config file
var document = new XmlDocument();
document.Load(fileName);

var result = document.SelectNodes("//configuration/appSettings");

// note: you might also need to check for "//configuration/location/appSettings"

foreach (XmlNode childNodes in result)
{
    foreach (XmlNode keyItem in childNodes)
    {
        if (keyItem.Attributes != null)
        {
            if (keyItem.Attributes["key"] != null && keyItem.Attributes["value"] != null)
            {
                var key = keyItem.Attributes["key"].Value;

                // replace your value here
                keyItem.Attributes["value"].Value = LookupValue(key);
            }
        }
    }
}

// save back the web.config file
document.Save(fileName);

The same technique can be used for .json files to read, deserialize, alter then save.
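For example, here is a minimal sketch using Newtonsoft.Json.  The file name, the key names and the LookupValue() helper are assumptions carried over from the snippet above:

// requires Newtonsoft.Json: using System.IO; using Newtonsoft.Json.Linq;

// read the appsettings.json file
var json = JObject.Parse(File.ReadAllText(fileName));

// replace a value (the section and key names are just examples)
json["ConnectionStrings"]["DefaultConnection"] = LookupValue("DefaultConnection");

// save back the appsettings.json file
File.WriteAllText(fileName, json.ToString());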

Conclusion

Most aspects of system deployment have industry-wide standards.  Managing configuration information isn’t one of them.  I am not sure why a standard has not been established.  Microsoft provides a method of setting up multiple environment configuration files, but it does not solve the issue of securing a production environment from the developers.  It also does not work when the developer executes their program directly from Visual Studio.  The transform operation only occurs when a project is published.

BuildMaster has a good technique.  That product can be configured to deploy different config files for each environment.  This product does not deploy to the developer’s local environment, so that is still an issue.  The final issue with this method is that automatically added parameters will not show up and must be added by hand.

Jenkins doesn’t seem to have any capability to handle config files (if you know of a plug-in for Jenkins, leave a comment, I’d like to try it out).  Jenkins leaves the dev-ops person with the task of setting up scripts to modify the config parameters during deployment/promotion operations.

If you have other ideas that work better than these, please leave a comment or drop me an email.  I’d like to hear about your experience.

 

Ransomware

My wife and I recently took a vacation in Nevada so we could hike trails in parks surrounding Las Vegas.  To get a sense of what we did on vacation you can check out the pictures on my hiking blog by clicking here (I lined up the post dates to be the date when we visited each park).  This article is not about the fun we had on our vacation.  This is about a ransomware attack that occurred while I was on vacation.

I’m a software engineer.  That means that I know how to hack a computer and I also know how to protect myself from hackers.  But I’m not immune to making mistakes.  The mistake I made was that I have passwords that I haven’t changed in forever.  Like many large companies, Apple had been hacked about a year ago and their entire database of passwords was obtained and distributed throughout the hacking community (yeah, “community”, that’s what they are).  The Apple hack involved their cloud service, which I didn’t pay much attention to, because I don’t use their cloud storage.  What I didn’t pay attention to was that their cloud services play a part in some of the iPhone and iPad security.

If you clicked the link above and started looking through my pictures, you’ll notice that the first place we visited was Death Valley.  I discovered that it was supposed to be a record 120 degrees that day and I wanted to know what it would feel like to stand outside in 120 degree weather!  Yeah, I’m crazy like that.  As it turned out it got up to 123 degrees according to the thermometer built into the Jeep that we rented.  Dry heat or not, 123 degrees was HOT!

I use my Canon Rebel xti for the photos that I post on my hiking blog, but I also take pictures with my iPhone in order to get something on my Facebook account as well as get a few panoramas.  When we arrived at the sand dunes it was just before noon local time or before 3PM Eastern time.  My photos indicate they were taken around 2:07, but the camera time is off by more than an hour and the camera is on Eastern time.

I took a few pictures of the dunes and then I pulled out my iPhone and was about to take pictures when I got a warning that my phone was locked and I needed to send an email to get instructions on how to pay for access to my iPhone.  So I used my usual pin number to unlock my iPhone and it worked correctly.  I was annoyed, but I moved on.  I thought it was some “clever” advertisement or spam notification.

When we returned to the resort, I sat down on the couch and started to use my iPad to do some reading.  My iPad had the same message on the front and a pin number was set up.  Unfortunately, for me, I never set a pin number on my iPad because I only use it at home to surf the web, read a book and maybe play a game.  What the hackers did was setup a pin number on my iPad.  What an annoyance.  Ransomware on any of my devices is no more worrisome than a rainy day.  It’s more of an irritation than anything.  I have off-site backups for my desktop machine and I know how to restore the entire machine in a few hours.  Hacking my iPad while I was on vacation (and the second day of vacation to boot), was really annoying.  Primarily because I don’t have access to all of my adapters, computers and tools.  My wife has a tiny laptop that we use for minor stuff.  It has a grand total of 64 gigabytes of storage space.  So she installed iTunes on it (with some difficulty) and we restored the iPad and got everything back to normal.

After returning from vacation, I cleaned out all of my spam emails from the past couple of weeks and discovered the iCloud notification emails.

It appears that someone manually logged into my iCloud account, enabled lost mode and put in a message, for both of my devices.  The iPhone was first, which was locked by pin number, so they couldn’t change that.  The iPad, however, was not setup with a pin number, so they went in and set their own.  Or so I assumed, when I saw it was asking for a 6-digit pin.  Apparently, the pin that shows up is the pin that is set when the device is first setup.  My pin was not the same for the iPad as I used on my iPhone (which was what I tried when I first saw it appear).

My wife and I changed the passwords on our iCloud accounts while we were at the resort and she turned on two-factor authentication for iCloud.  Of course, that is a bit of a problem if I lose my iPhone, but it prevents anyone from hacking my iCloud account.

One thing that makes me wonder… how was Apple storing my password?  Are the passwords stored in clear text?  Are they encrypted with an algorithm that allows the password to be decrypted?  That seems foolish.  Maybe Apple was using something like a weak MD5 hash and the hacked database was cracked using a brute-force method like this: 25-GPU cluster cracks every standard Windows password in <6 hours.  I know that the correct password was used to log in to iCloud using a browser.  The notification sent to my email account proves it.

How to Protect Yourself

The first level of protection that I have is that I assume I will get hacked.  From that assumption, I have plans in place to reduce any damages that can occur.  First, I have an off-site backup system that backs up everything I can’t replace on my desktop computer.  Pictures, documents, etc.  They are all backed up.  Some of my software is on GitHub so I don’t worry about backing up my local repository directory.  I have backup systems in place on my blogs and my website.

Next in line is the two-factor system.  This is probably one of the best ways to protect yourself.  Use your phone as your second factor and protect your phone from theft.  If someone steals your phone, they probably don’t have your passwords.  If someone has your passwords, they don’t have your phone.  If you see messages arrive at your phone with a second factor pin number, then you need to change the password for the account that requested it.

Next, you should turn on notifications of when someone logs into your account (if the feature is available).  Like the notifications about my iCloud being used in the emails above, I can see that someone accessed my account when I wasn’t around.  If someone is silently logging into your account, a lot more damage can be done before you figure out what is going on.

If you’re using email as your second factor, you need to protect your email account as though it was made of gold.  Change your email password often, in case the provider has been hacked.  Your email account is most likely used as a method of resetting your password on other sites.  So if a hacker gets into your email account, they can guess at other sites that you might have accounts and reset your password to get in.  I have my own urls and hosts so I create and maintain my own email system.  If my email system gets hacked it’s 100% my fault.

Disable unused accounts.  If you’re like me, you have hundreds of web accounts for stores and sites that you signed up for.  Hey, they were “free” right?  Unfortunately, your passwords are out there and any one site can get hacked.  You can’t keep track of which sites got hacked last week.  Keep a list of sites that you have accounts on.  Review that list at least annually and delete accounts on sites you no longer use.  If the site doesn’t allow you to delete your account, then go in and change the password to something that is completely random and long (like 20 characters or more depending on what the site will allow).

Use a long password if possible.  Just because the minimum password is 8 characters doesn’t mean you need to come up with an 8 character password.  If sites allow you to use 30 characters, then make something up.  There is an excellent XKCD comic demonstrating password strengths: click here.  For companies providing websites with security, I would recommend you allow at least 256 characters for passwords.  Allow your customers to create a really strong password.  Storage is cheap.  Stolen information is expensive.

Don’t use the same password for everything.  That’s a bit obvious, but people do crazy things all the time.  The problem with one password for all is that any site that gets hacked means a hacker can get into everything you have access to.  It also means you need to change all of your passwords.  If you use different passwords or some sort of theme (don’t make the theme obvious), then you can change your most important passwords often and the passwords to useless sites less often.

Last but not Least…

Don’t pay the ransom!  If you pay money, what happens if you don’t get the unlock key?  What happens if you unlock your computer and it gets ransomed again?  Plan for this contingency now.  Paying a ransom only funds a criminal organization.  The more money they make performing these “services” the more likely they will continue the practice.  I like to think of these people as telemarketers: if nobody paid, then they would all be out of work.  Since telemarketing continues to this day, someone, somewhere is buying something.  Don’t keep the ransomware cycle going.

 

Documentation

Why is there No Documentation?

I’m surprised at the number of developers who don’t create any documentation.  There are ways to self-document code and there are packages to add automatic help to an API to document the services.  Unfortunately, that’s not enough.  I’ve heard all the arguments:

  • It chews up programming time.
  • It rapidly becomes obsolete.
  • The code should explain itself.
  • As a programmer, I’m more valuable if I keep the knowledge to myself.

Let me explain what I mean by documentation.  What should you document outside of your code?  Every time you create a program you usually create a design for it.  If it’s just drawings on a napkin, then it needs to go someplace where other developers can access it if they need to work on your software.  Configuration parameters need documentation.  Not a lot, just enough to describe what the purpose is.  Installation?  If there is some trick to making your program work in a production, QA or staging environment, then you should document it.  Where is the source code located?  Is there a deployment package?  Was there a debate on the use of one technology over another?

So what happens when you have no documentation?  First, you need to find the source code.  Hopefully it’s properly named and resides in the current repository.  Otherwise, you may be forced to dig through directories on old servers, or in some cases the source might not be available at all.  If the source is not available, your options are limited: decompile, rewrite or work with what is built.  Looking at source code written by a programmer who no longer works at your company is a common occurrence (did that programmer think not documenting made him/her valuable?).  Usually such code is tightly coupled and contains poorly named methods and variables with no comments.  So here are the arguments for why you should do documentation:

  • Reverse engineering software chews up programming time.
  • Most undocumented code is not written to be self-explanatory.
  • Attempting to figure out why a programmer wrote a program the way he/she did can be difficult and sometimes impossible.
  • Programmers come and go no matter how little documentation exists.

It’s easy to go overboard with documentation.  This can be another trap.  Try to keep your documentation to just the facts.  Don’t write long-winded literature.  Keep it technical.  Start with lists of notes.  Expand as needed.  Remove any obsolete documentation.

Getting Your Documentation Started

The first step to getting your documentation started is to decide on a place to store it.  The best option is a wiki of some sort.  I prefer Confluence or GitHub.  They both have clean formatting and are easy to edit and drop in pictures/screenshots.

So you have a wiki setup and it’s empty.  Next, create some subjects.  If you have several software projects in progress, start with those.  Create a subject for each project and load up all the design specifications.  If your development team is performing a retrospective, type it directly into the wiki.  If there is a debate or committee meeting to discuss a change or some nuance with the software, type it into the wiki.  They can just be raw historical notes.

Next, add “documentation” as a story point to your project, or add it to each story.  This should be a mandatory process.  Make documentation part of the development process.  Developers can just add a few notes, or they can dig in and do a brain-dump.  Somewhere down the road a developer not involved in the project will need to add an enhancement or fix a bug.  That developer will have a starting point.

Another way to seed the wiki is to create subjects for each section of your existing legacy code and just do a dump of notes in each section.  Simple information off the top of everyone’s head is good enough.  The wiki can be reworked at a later date to make things more organized.  Divide and conquer.  If a developer has fixed a bug in a subsystem that nobody understands, that developer should input their knowledge into the wiki.  This will save a lot of time when another developer has to fix a bug in that system and it will prevent your developers from becoming siloed.

You Have Documentation – Now What?

One of the purposes of your technical documentation is to train new people.  This is something that is overlooked a lot.  When a new developer is hired, that person can get up to speed faster if they can just browse a wiki full of technical notes.  With this purpose in mind, you should expand your wiki to include instructions on how to setup a desktop/laptop for a development environment.  You can also add educational material to get a developer up to speed.  This doesn’t mean that you need to type in subjects on how to write an MVC application.  You should be able to link to articles that can be used by new developers to hone their skills.  By doing this, you can keep a new person busy while you coordinate your day to day tasks, instead of being tied down to an all-day training session to get that person up to speed.

Your documentation should also contain a subject on your company development standards.  What frameworks are acceptable?  What processes must be followed before introducing new technologies to the system?  Coding standards?  Languages that can be used?  Maybe a statement of goals that have been laid down.  What is the intended architecture of your system?  If your company has committees to decide what the goal of the department is, then maybe the meeting minutes would be handy.

Who is Responsible for Your Documentation

Everyone should be responsible.  Everyone should participate.  Make sure you keep backups in case something goes wrong or someone makes a mistake.  Most wiki software is capable of tracking revisions.  Documentation should be treated like version control.  Don’t delete anything!  If you want to hide subjects that have been deprecated, then create a subject at the bottom for all your obsolete projects.  When a project is deprecated, move the wiki subject to that folder.  Someday, someone might ask a question about a feature that used to exist.  You can dig up the old subject and present what used to exist if necessary.  This is especially handy if you deprecated a subsystem that was a problem child due to its design.  If someone wants to create that same type of mess, they can read about the lessons learned from the obsolete subsystem.  The answer can be: “We tried that, it didn’t work and here’s why…” or it can be: “We did that before and here’s how it used to work…”

If you make the task of documenting part of the work that is performed, developers can add documentation as software is created.  Developers can modify documentation when bugs are fixed or enhancements are made.  Developers can remove or archive documentation when software is torn down or replaced.  The documentation should be part of the software maintenance cycle.  This will prevent the documentation from getting out of sync with your software.

 

Three Tier Architecture

There is a lot of information on the Internet about the three-tier architecture, three-tier structure and other names for the concept of breaking a program into tiers.  The current system design paradigm is to break your software into APIs and the three-tier architecture still applies.  I’m going to try and explain the three-tier architecture from the point of practicality and explain the benefits of following this structure.  But first, I have to explain what happens when a hand-full of inexperienced programmers run in and build a system from the ground up…

Bad System Design

Every seasoned developer knows what I’m talking about.  It’s the organically grown, not very well planned system.  Programming is easy.  Once a person learns the basic syntax, the world is their oyster!  Until the system gets really big.  There’s a tipping point where tightly-coupled monolithic systems become progressively more difficult and time consuming to enhance.  Many systems that I have worked on were well beyond that point when I started working on them.  Let me show a diagram of what I’m talking about:

Technically, there is no firm division between front-end and back-end code.  The HTML/JavaScript is usually embedded in back-end code and there is typically business code scattered between the HTML.  Sometimes systems like this contain a lot of stored procedures, which do nothing more than marry you to the database that was first used to build the application.  Burying business code in stored procedures also has the additional financial burden of ensuring you are performing your processing on the product with the most expensive licensing fees.  When your company grows, you’ll be forced to purchase more and more licenses for the database in order to keep up.  This proceeds exponentially and you’ll come to a point where any tiny change in a stored procedure, function or table causes your user traffic to overwhelm the hardware that your database runs on.

I oftentimes joke about how I would like to build a time machine for no other reason than to go back in time, sit down with the developers of the system I am working on and tell them what they should do, heading it off before it becomes a very large mess.  I suspect that this craziness occurs because companies are started by non-programmers who hook up with some young and energetic programmer with little real-world experience who can make magic happen.  Any programmer can get the seed started.  Poor programming practices and bad system design don’t show up right away.  A startup company might only have a few hundred users at first.  Hardware is cheap, SQL Server licenses seem reasonable, everything is working as expected.  I also suspect that those developers move on when the system becomes too difficult to manage.  They move on to another “new” project that they can start badly all over again.  Either that, or they learn their lesson and the next company they work at is lucky to get a programmer with knowledge of how not to write a program.

Once the software gets to the point that I’ve described, then it takes programmers like me to fix it.  Sometimes it takes a lot of programmers with my kind of knowledge to fix it.  Fixing a system like this is expensive and takes time.  It’s a lot like repairing a jet while in flight.  The jet must stay flying while you shut down one engine and upgrade it to a new one.  Sound like fun?  Sometimes it is, usually it’s not.

Three-Tier Basics

In case you’re completely unfamiliar with the three-tier system, here is the simplified diagram:

It looks simple, but the design is a bit nuanced.  First of all, the HTML, JavaScript, front-end frameworks, etc. must be contained in the front-end box.  You need isolation from the back-end or middle-tier.  The whole purpose of the front-end is to handle the presentation or human interface part of your system.  The back-end or middle-tier is all the business logic.  It all needs to be contained in this section.  It must be loosely coupled and unit tested, preferably with an IoC container like Autofac.  The database must be nothing more than a container for your data.  Reduce special features as much as possible.  Your caching system is also located in this layer.
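To show what that loose coupling looks like in practice, here is a minimal Autofac sketch; the interface and class names are made up for this example:

using Autofac;

// Hypothetical business-layer interface and implementation.
public interface IOrderService
{
  decimal CalculateTotal(int orderId);
}

public class OrderService : IOrderService
{
  public decimal CalculateTotal(int orderId)
  {
    // Business logic lives here, isolated from the front-end and from database details.
    return 0m;
  }
}

public static class ContainerConfig
{
  public static IContainer Build()
  {
    var builder = new ContainerBuilder();

    // Register the implementation against the interface so callers depend only on
    // IOrderService and can be unit tested with a mock or fake implementation.
    builder.RegisterType<OrderService>().As<IOrderService>();

    return builder.Build();
  }
}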

The connection between the front-end and back-end is usually an API connection using REST.  You can pass data back and forth between these two layers using JSON or XML or just perform “get”, “post”, “delete” and “put” operations.  If you treat your front-end as a system that communicates with another system called your back-end, you’ll have a successful implementation.  You’ll still have hardware challenges (like network bandwidth and server instances), but those can be solved much quicker and cheaper than rewriting software.
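As a concrete example, here is a minimal sketch of the kind of REST endpoint the front-end would call; this assumes ASP.NET Core, and the route, controller and model names are hypothetical:

using Microsoft.AspNetCore.Mvc;

// Hypothetical data transfer object returned to the front-end as JSON.
public class OrderDto
{
  public int Id { get; set; }
  public decimal Total { get; set; }
}

[Route("api/orders")]
public class OrdersController : Controller
{
  // GET api/orders/5 - the front-end requests data and receives JSON.
  [HttpGet("{id}")]
  public IActionResult Get(int id)
  {
    return Ok(new OrderDto { Id = id, Total = 0m });
  }

  // POST api/orders - the front-end posts JSON, which is bound to OrderDto.
  [HttpPost]
  public IActionResult Post([FromBody] OrderDto order)
  {
    return Ok();
  }
}

The front-end only knows about the URL and the JSON shape, so either side can be replaced or scaled without rewriting the other.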

The connection between the back-end and database has another purpose.  Your goal should be to make sure your back-end is database technology independent as much as possible.  You want the option of switching to a database with cheap licensing costs.  If you work hard up-front, you’ll get a pay-back down the road when your company expands to a respectable size and the database licensing cost starts to look ugly.

What About APIs?

The above diagram looks like a monolithic program at first glance.  If you follow the rules I already laid out, you’ll end up with one large monolithic program.  So there’s one more level of separation you must be aware of.  You need to logically divide your system into independent APIs.  You can split your system into a handful of large APIs or hundreds of smaller APIs.  It’s better to build a lot of smaller APIs, but that can depend on what type of system is being built and how many logical boxes you can divide it into.  Here’s an example of a very simple system divided into APIs:

This is not a typical way to divide your APIs.  Typically, an API can share a database with another API and the front-end can be separate from the API itself.  For now, let’s talk about the advantages of this design as I’ve shown.

  1. Each section of your system is independent.  If a user decides to consume a lot of resources by executing a long-running task, it won’t affect any other section of your system.  You can contain the resource problem.  In the monolithic design, any long-running process will kill the entire system and all users will experience the slow-down.
  2. If one section of your system requires heavy resources, then you can allocate more resources for that one section and leave all other sections the same.  In other words, you can expand one API to be hosted by multiple servers, while other APIs are each on one server.
  3. Deployment is easy.  You only deploy the APIs that are updated.  If your front-end is well isolated, then you can deploy a back-end piece without the need for deployment of your front-end.
  4. Your technology can be mixed.  You can use different database technologies for each section.  You can use different programming languages for each back-end or different frameworks for each front-end.  This also means that you have a means to convert some or all of your system to a Unix hosted system.  A new API can be built using Python or PHP and that API can be hosted on a Linux virtual box.  Notice how the front-end should require no redesign as well as your database.  Just the back-end software for one subsection of your system.

Converting a Legacy System

If you have a legacy system built around the monolithic design pattern, you’re going to want to take steps as soon as possible to get into a three-tier architecture.  You’ll also want to build any new parts using an API design pattern.  Usually it takes multiple iterations to remove the stored procedures and replace the code with decoupled front-end and back-end code.  You’ll probably start with something like this:

In this diagram the database is shared between the new API and the legacy system, which is still just a monolithic program.  Notice how stored procedures are avoided by the API on the right side.  All the business logic must be contained in the back-end so it can be unit tested.  Eventually, you’ll end up with something like this:

Some of your APIs can have their own data while others rely on the main database.  The monolithic section of your system should start to shrink.  The number of stored procedures should shrink.  This system is already easier to maintain than the complete monolithic system.  You’re still saddled with the monolithic section and the gigantic database with stored procedures.  However, you now have sections of your system that are independent and easy to maintain and deploy.  Another possibility is to do this:

In this instance the front-end is consistent.  One framework with common JavaScript can be contained as a single piece of your system.  This is OK because your front-end should not contain any business logic.

The Front-End

I need to explain a little more about the front-end that many programmers are not aware of.  Your system design goal for the front-end is to assume that your company will grow so large that you’re going to have front-end specialists.  These people should be artists who work with HTML, CSS and other front-end languages.  The front-end designer is concerned with usability and aesthetics.  The back-end designer is concerned about accuracy and speed.  These are two different skill sets.  The front-end person should be more of a graphic designer while the back-end person should be a programmer with knowledge of scalability and system performance.  Small companies will hire a programmer to perform both tasks, but a large company must begin to divide their personnel into distinct skill-sets to maximize the quality of their product.

Another overlooked aspect of the front-end is that it is going to become stale.  Somewhere down the road your front-end is going to be ugly compared to the competition.  If your front-end code is nothing more than HTML, CSS, JavaScript and maybe some frameworks, you can change the look and feel of the user interface with minimum disruption.  If you have HTML and JavaScript mixed into your business logic, you’ve got an uphill battle to try and upgrade the look and feel of your system.

The Back-End

When you connect to a database the common and simple method is to use something like ODBC or ADO.  Then SQL statements are sent as strings with parameters to the database directly.  There are many issues with this approach and the current solution is to use an ORM like Entity Framework, NHibernate or even Dapper.  Here’s a list of the advantages of an ORM:

  1. The queries are in LINQ and most errors can be found at compile time.
  2. The context can be easily changed to point to another database technology.
  3. If you include mappings that match your database, you can detect many database problems at compile time, like child to parent relationship issues (attempting to insert a child record with no parent).
  4. An ORM can break dependency with the database and provide an easy method of unit testing.

As I mentioned earlier, you must avoid stored procedures, functions and any other database-specific features.  Don’t back yourself into a corner just because MS SQL Server has a feature that makes an enhancement easy.  If your system is built around a set of stored procedures, you’ll be in trouble if you want to switch from MS SQL to MySQL, or from MS SQL to Oracle.
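To make points 1 and 2 above concrete, here is a minimal Entity Framework Core sketch; the Customer entity, the context and the connection string are invented for this example:

using System.Linq;
using Microsoft.EntityFrameworkCore;

// Hypothetical entity and context used only to illustrate the ORM advantages above.
public class Customer
{
  public int Id { get; set; }
  public string Name { get; set; }
}

public class ShopContext : DbContext
{
  public DbSet<Customer> Customers { get; set; }

  protected override void OnConfiguring(DbContextOptionsBuilder options)
  {
    // Switching database technology is mostly a matter of changing this one line
    // (for example UseSqlServer vs. UseSqlite), not rewriting SQL strings.
    options.UseSqlServer("Server=.;Database=Shop;Trusted_Connection=True;");
  }
}

public static class CustomerQueries
{
  public static string[] GetNamesStartingWith(ShopContext context, string prefix)
  {
    // LINQ query: a typo in a property name is caught at compile time,
    // unlike a typo buried in a hand-built SQL string.
    return context.Customers
      .Where(c => c.Name.StartsWith(prefix))
      .Select(c => c.Name)
      .ToArray();
  }
}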

Summary

I’m hoping that this blog post is read by a lot of entry-level programmers.  You might have seen the three-tier architecture mentioned in your school book or on a website and didn’t realize what it was all about.  Many articles get into the technical details of how to implement a three-tier architecture using C# or some other language, glossing over the big picture of “why” it’s done this way.  Be aware that there are also other multi-tier architectures that can be employed.  Which technique you use doesn’t really matter as long as you know why it’s done that way.  When you build a real system, you have to be aware of what the implications of your design are going to be five or ten years from now.  If you’re just going to write some code and put it into production, you’ll run into a brick wall before long.  Keep these techniques in mind when you’re building your first system.  It will pay dividends down the road when you can enhance your software just by modifying a small API and tweaking some front-end code.


Dear Computer Science Majors…

Introduction

It has been a while since I wrote a blog post directed at newly minted Computer Science Majors.  In fact, the last time I wrote one of these articles was in 2014.  So I’m going to give all of you shiny-new Computer Scientists a leg-up in the working world by telling you some inside information about what companies need from you.  If you read through my previous blog post (click here) and read through this post, you’ll be ahead of the pack when you submit your resume for that first career-starting job.

Purpose of this Post

First of all, I’m going to tell you my motivation for creating these posts.  In other words: “What’s in it for Frank.”  I’ve been programming since 1978 and I’ve been employed as a software engineer/developer since 1994.  One of my tasks as a seasoned developer is to review submitted programming tests, create programming tests, read resumes, submit recommendations, interview potential developers, etc.  By examining the code submitted by a programming test, I can tell a lot about the person applying for a job.  I can tell how sophisticated they are.  I can tell if they are just faking it (i.e. they just Googled results, made it work and don’t really understand what they are doing).  I can tell if the person is interested in the job or not.  One of the trends that I see is that there is a large gap between what is taught in colleges and what is needed by a company.  That gap has been increasing for years.  I would like to close the gap, but it’s a monstrous job.  So YOU, the person reading this blog that really wants a good job, must do a little bit of your own leg-work.  Do I have your attention?  Then read on…

What YOU Need to Do

First, go to my previous blog post on this subject and take notes on the following sections:

  • Practice
  • Web Presence

For those who are still in school and will not graduate for a few more semesters, start doing this now:

  • Programming competitions
  • Resume Workshops
  • Did I mention: Web Presence?

You’ll need to decide which track you’re going to follow and try to gain deep knowledge in that area.  Don’t go out and learn a hundred different frameworks, twenty databases and two dozen in-vogue languages.  Stick to something that is in demand and narrow your focus to a level where you can gain useful knowledge.  Ultimately you’ll need to understand a subject well enough to make something work.  You’re not going to be an expert; that takes years of practice and a few failures.  If you can learn a subject well enough to speak about it, then you’re light-years ahead of the average newly minted BS-degree holder.

Now it’s time for the specifics.  You need to decide if you’re going to be a Unix person or a .Net person.  I’ve done both and you can cross over.  It’s not easy to cross over, but I’m proof that it can happen.  If you survive and somehow end up programming as long as I have, then you’ll have experience with both.  Your experience will not be even between the two sides.  It will be weighted toward one end or the other.  In my case, my experience is weighted toward .Net because that is the technology that I have been working on more recently.

If you’re in the Unix track, I’m probably not the subject expert on which technologies you need to follow.  Python, Ruby, which frameworks, unit testing, you’ll need to read up and figure out what is in demand.  I would scan job sites such as Glass Door, Indeed, LinkedIn, Stack Exchange or any other job sites just to see what is in demand.  Look for entry level software developer positions.  Ignore the pay or what they are required to do and just take a quick tally of how many companies are asking for Python, PHP, Ruby, etc.  Then focus on some of those.

If you’re in the .Net track, I can tell you exactly what you need to get a great paying job.  First, you’re going to need to learn C#.  That is THE language of .Net.  Don’t let anybody tell you otherwise.  Your college taught you Java?  No problem, your language knowledge is already 99% there.  Go to Microsoft’s website and download the free version of Visual Studio (the Community version) and install it.  Next, you’ll need a database and that is going to be MS SQL Server.  Don’t bother with MS Access.  There is a free version of SQL Server as well.  In fact, the Developer edition is fully functional, but you probably don’t need to download and install that.  When you install Visual Studio, the Express edition of SQL Server is normally installed with it.  You can gain real database knowledge from that version.

Follow this list:

  • Install Visual Studio Community.
  • Check for a pre-installed version of MS SQL Server Express.
  • Go out and sign up for a GitHub account.  Go ahead, I’ll wait (click here).
  • Download and install SourceTree (click here).

Now you have the minimum tools to build your knowledge.  Here’s a list of what you need to learn, using those tools:

  • How to program in C# using a simple console application.
  • How to create simple unit tests.
  • Create an MVC website, starting with the template site.
  • How to create tables in MS SQL Server.
  • How to insert, delete, update and select data in MS SQL Server.
  • How to create POCOs, fluent mappings and a database context in C# (see the sketch after this list).
  • How to troubleshoot a website or API (learn some basic IIS knowledge).
  • How to create a repository on GitHub.
  • How to check-in your code to GitHub using SourceTree.
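
To give you a feel for the POCO, fluent mapping and database context item, here’s a minimal sketch using Entity Framework Core.  The Product table and its column names are made up for illustration; the fluent API calls (ToTable, HasKey, HasMaxLength) are the kind of mappings I’m talking about.

using Microsoft.EntityFrameworkCore;

// POCO - a plain class that mirrors a database table.
public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class StoreContext : DbContext
{
    public DbSet<Product> Products { get; set; }

    protected override void OnConfiguring(DbContextOptionsBuilder options)
    {
        // Placeholder connection string - change it to match your environment.
        options.UseSqlServer("Server=.;Database=MyStore;Trusted_Connection=True;");
    }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // Fluent mappings - describe the table so mismatches show up early.
        modelBuilder.Entity<Product>(entity =>
        {
            entity.ToTable("Product");
            entity.HasKey(p => p.Id);
            entity.Property(p => p.Name).HasMaxLength(50).IsRequired();
        });
    }
}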

That would pretty much do it.  The list above will take about a month of easy work or maybe a hard-driven weekend.  If you can perform these tasks and talk intelligently about them, you’ll have the ability to walk into a job.  In order to seal-the-deal, you’ll have to make sure this information is correctly presented on your resume.  So how should you do that?

First, make sure you polish your projects: remove any commented-out code and any unused or dead code.  If there are tricky areas, put in some comments.  Make sure you update your “Read-Me” file on GitHub for each of your projects.  Put your GitHub URL near the top of your resume.  If I see a programmer with a URL to a GitHub account, that programmer has already earned some points in my informal scale of who gets the job.  I usually stop reading the resume and go right to the GitHub account and browse their software.  If you work on a project for some time, you can check in your changes as you progress.  This is nice for me to look at, because I can see how much effort you are putting into your software.  If I check the history and I see the first check-in was just a blank solution followed by several check-ins that show the code being refactored and re-worked into a final project, I’m going to be impressed.  That tells me that you’re conscientious enough to get your code checked in and protected from loss immediately.  Don’t wait for the final release.  Building software is a lot like producing sausage.  The process is messy, but the final product is good (assuming you like sausage).

If you really want to impress me and, by extension, any seasoned programmer, create a technical blog.  Your blog can be somewhat informal, but you need to make sure you express your knowledge of the subject.  A blog can be used as a tool to secure a job.  It doesn’t have to get a million hits a day to be successful.  In fact, if your blog only receives hits from companies that are reading your resume, it’s a success.  You see, the problem with the resume is that it doesn’t allow me to get into your head.  It’s just a sheet of paper (or more if you have a job history) with the bare minimum information on it.  It’s usually just to get you past the HR department.  In the “olden” days, when resumes were mailed with a cover letter, the rule was one page.  Managers would not have time to read novels, so they wanted the potential employee to narrow down their knowledge to one page.  Sort of a summary of who you are in the working world.  This piece of paper is compared against a dozen or hundreds of other single-page resumes to determine which handful of people would be called in to be interviewed.  Interviews take a lot of physical time, so the resume reading needs to be quick.  That has changed over the years and the old rules don’t apply to the software industry as a whole.  Even though technical resumes can go on for two or more pages, the one-page resume still applies for new graduates.  If you are sending in a resume that I might pick up and read, I don’t want to see that you worked at a Walmart check-out counter for three years, followed by a gig at the car wash.  If you had an intern job at a tech company where you got some hands-on programming experience, I want to see that.  If you got an internship with Google filing their paperwork, I don’t care.

Back to the blog.  What would I blog about if I wanted to impress a seasoned programmer?  Just blog about your experience with the projects you are working on.  It can be as simple as “My First MVC Project”, with a diary format like this:

Day 1, I created a template MVC project and started digging around.  Next I modified some text in the “View” to see what would happen.  Then I started experimenting with the ViewBag object.  That’s an interesting little object.  It allowed me to pass data between the controller and the view.

And so on…

Show that you did some research on the subject.  Expand your knowledge by adding a feature to your application.  Minor features like search, column sorting and page indexing are important.  They demonstrate that you can take an existing program and extend it to do more.  When you enter the working world, 99% of what you will create will be a feature added to code you never wrote.  If your blog demonstrates that you can extend existing code, even code you wrote yourself, I’ll be impressed.

Taking the Test

Somewhere down the line, you’re going to generate some interest.  There will be a company out there that will want to start the process and the next step is the test.  Most companies require a programming test.  At this point in my career the programming test is just an annoyance.  Let’s call it a formality.  As a new and inexperienced programmer, the test is a must.  It will be used to determine if you’re worth someone’s time to interview.  Now I’ve taken many programming tests and I’ve been involved in designing and testing many different types of programming tests.  The first thing you need to realize is that different companies have different ideas about what to test.  If it was up to me, I would want to test your problem solving skills.  Unfortunately, it’s difficult to test for that skill without forcing you to take some sort of test that may ask about a subject you don’t know.  I’ve seen tests that allow the potential hire to use any language they want.  I’ve also seen tests that give deliberately vague specifications and are rated according to how creative the solution is.  So here are some pointers for passing the test:

  • If it’s a timed test, try to educate yourself on the subjects you know will be on the test before it starts.
  • If it’s not a timed test, spend extra time on it.  Make it look like you spent some time to get it right.
  • Keep your code clean.
    • No “TODO” comments.
    • No commented code or dead-code.
    • Don’t leave code that is not used (another description for dead-code).
    • Follow the naming convention standards, no cryptic variable names (click here or here for examples).
  • If there is extra credit, do it.  The dirty secret is that this is a trick to see if you’re going to be the person who does just the minimum, or you go the extra mile.
  • Don’t get too fancy.
    • Don’t show off your knowledge by manually coding a B-tree structure when a built-in sort (List’s .Sort() or LINQ’s .OrderBy()) will do.
    • Don’t perform something that is obscure just to look clever.
    • Keep your program as small as possible.
    • Don’t add any “extra” features that are not called for in the specification (unless the instructions specifically tell you to be creative).

When you’re a student in college, you are required to analyze algorithms and decide which is more efficient in terms of memory use and CPU speed.  In the working world, you are required to build a product that must be delivered in a timely manner.  Does it matter if you use the fastest algorithm?  It might not really matter.  It will not make a difference if you can’t deliver a working product on time just because you spent a large amount of your development time on a section of code that is only built for the purpose of making the product a fraction faster.  Many companies will need a product delivered that works.  Code can be enhanced later.  Keep that in mind when you’re taking your programming test.  Your program should be easy to follow so another programmer can quickly enhance it or repair bugs.

The Interview

For your interview, keep it simple.  You should study up on general terms, in case you’re asked.  Make sure you understand these terms:

  • Dependency injection
  • Polymorphism
  • Encapsulation
  • Single purpose
  • Model/View/Controller
  • REST
  • Base Class
  • Private/public methods/classes
  • Getters/Setters
  • Interface
  • Method overloading
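
If a few of these terms are fuzzy, here’s a tiny made-up C# example that touches several of them at once: an interface, a base class, encapsulation through getters/setters, method overloading and polymorphism.

using System;

public interface IShape                      // interface
{
    double Area();
}

public abstract class Shape : IShape         // base class
{
    public string Name { get; set; }         // getter/setter (encapsulation)

    public abstract double Area();           // overridden below (polymorphism)
}

public class Circle : Shape
{
    private readonly double _radius;         // private field, public behavior

    public Circle(double radius)
    {
        _radius = radius;
    }

    public override double Area()            // polymorphism in action
    {
        return Math.PI * _radius * _radius;
    }

    public double Area(double scale)         // method overloading
    {
        return Area() * scale;
    }
}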

Here’s a great place to do your study (click here).  These are very basic concepts and you should have learned them in one of your object oriented programming classes.  Just make sure you haven’t forgotten about them.  Make sure you understand the concepts that you learned from any projects that you checked into GitHub.  If you learned some unit testing, study the terms.  Don’t try to act like an expert for your first interview.  Just admit the knowledge that you have.  If I interview you and you have nothing more than a simple understanding of unit testing, I’m OK with that.  All it means is that there is a base-line of knowledge that you can build on with my help.

Wear a suit, unless explicitly specified that you don’t need one.  At a minimum, you need to dress one step better than the company dress policy.  I’m one of the few who can walk into an interview with red shoes, jeans and a technology T-shirt and get a job.  Even though I can get away with such a crazy stunt, I usually show up in a really nice suit.  To be honest, I only show up in rags when I’m lukewarm about a job and I expect to be wooed.  If I really like the company, I look sharp.  The interviewers can tell me to take off my tie if they think I’m too stuffy.  If you’re interviewing for your first job, wear a business suit.

Don’t BS your way through the interview.  If you don’t know something, just admit it.  I ask all kinds of questions to potential new hires just to see “if” by chance they know a subject.  I don’t necessarily expect the person to know the subject and it will not have a lot of bearing on the acceptance or rejection of the person interviewing.  Sometimes I do it just to find out what their personality is like.  If you admit that you know SQL and how to write a query, I’m going to hand you a dry-erase marker and make you write a query to join two tables together.  If you pass that, I’m going to give you a hint that I want all records from the parent table to show up even if it doesn’t have child records.  If you don’t know how to do a left-outer join, I’m not going to hold it against you.  If you are able to write a correct or almost correct left join, I’ll be impressed.  If you start performing a union query or try to fake it with a wild guess, I’ll know you don’t know.  I don’t want you to get a lucky guess.  I’m just trying to find out how much I’m going to have to teach you after you’re hired.  Don’t assume that another candidate is going to get the job over you just because they know how to do a left outer join.  That other candidate might not impress me in other ways that are more important.  Just do the best you can and be honest about it.
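
The interviewer will want the SQL version on the whiteboard, but if it helps to see the idea in C#, here’s a left outer join expressed in LINQ with made-up Parent and Child classes: every parent shows up, even the ones with no child records.

using System;
using System.Collections.Generic;
using System.Linq;

public class Parent { public int Id { get; set; } public string Name { get; set; } }
public class Child { public int ParentId { get; set; } public string Name { get; set; } }

public static class LeftJoinExample
{
    public static void Print(List<Parent> parents, List<Child> children)
    {
        var results =
            from parent in parents
            join child in children on parent.Id equals child.ParentId into kids
            from child in kids.DefaultIfEmpty()      // this is what makes it a LEFT join
            select new
            {
                Parent = parent.Name,
                Child = child == null ? "(no children)" : child.Name
            };

        foreach (var row in results)
        {
            Console.WriteLine(row.Parent + " - " + row.Child);
        }
    }
}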

Don’t worry about being nervous.  I’m still nervous when I go in for an interview and I really have no reason to be.  It’s natural.  Don’t be insulted if the interviewer dismisses you because they don’t think you’ll be a fit for their company.  You might be a fit and their interview process is lousy.  Of course, you might not be a fit.  The interviewer knows the company culture and they know the type of personality they are looking for.  There are no hard-and-fast rules for what an interviewer is looking for.  Every person who performs an interview is different.  Every company is different.

What Does a Career in Software Engineering Look Like?

This is where I adjust your working world expectations.  This will give you a leg-up on what you should focus on as you work your first job and gain experience.  Here’s the general list:

  • Keep up on the technologies.
  • Always strive to improve your skills.
  • Don’t be afraid of a technology you’ve never encountered.

Eventually you’re going to get that first job.  You’ll get comfortable with the work environment and you’ll be so good at understanding the software you’ve been working on that you won’t realize the world of computer programming has gone off on a different track.  You won’t be able to keep up on everything, but you should be able to recognize a paradigm shift when it happens.  Read.  Read a lot.  I’m talking about blogs, tech articles, whatever interests you.  If your company is having issues with the design process, do some research.  I learned unit testing because my company at the time had a software quality issue from the lack of regression testing.  The company was small and we didn’t have QA people to perform manual regression testing, so bugs kept appearing in subsystems that were not under construction.  Unit testing solved that problem.  It was difficult to learn how to do unit testing correctly.  It was difficult to apply unit testing after the software was already built.  It was difficult to break the dependencies that were created from years of adding enhancements to the company software.  Ultimately, the software was never 100% unit tested (if I remember correctly, it was around 10% when I left the company), but the unit tests that were applied had a positive effect.  When unit tests are used while the software is being developed, they are very effective.  Now that the IOC container is main-stream, dependencies are easy to break and unit tests are second nature.  Don’t get complacent about your knowledge.  I have recently interviewed individuals who have little to no unit testing experience and they have worked in the software field for years.  Now they have to play catch-up, because unit testing is a requirement, not an option.  Any company not unit testing their software is headed for bankruptcy.

APIs are another paradigm.  This falls under system architecture paradigms like SOA and Microservices.  The monolithic application is dying a long and slow death.  Good riddance.  Large applications are difficult to maintain.  They are slow to deploy.  Dependencies are usually everywhere.  Breaking a system into smaller chunks (called APIs) can ease the deployment and maintenance of your software.  This shift from monolithic design to APIs started to occur years ago.  I’m still stunned at the number of programmers that have zero knowledge of the subject.  If you’ve read my blog, you’ll know that I’m a big fan of APIs.  I have a lot of experience designing, debugging and deploying APIs.

I hope I was able to help you out.  I want to see more applicants that are qualified to work in the industry.  There’s a shortage of software developers who can do the job and that problem is getting worse every year.  The job market for seasoned developers is really good, but the working world is tough because there is a serious shortage of knowledgeable programmers.  Every company I’ve worked for has a difficult time filling a software developer position and I don’t see that changing any time in the near future.  That doesn’t mean that there is a shortage of Computer Science graduates each year.  What it means is that there are still too many people graduating with a degree who just don’t measure up.  Don’t be that person.

Now get started on that blog!


The Case for Unit Tests

Introduction

I’ve written a lot of posts on how to unit test, break dependencies, mock objects, create fakes, and use dependency injection and IOC containers.  I am a huge advocate of writing unit tests.  Unit tests are not the solution to everything, but they do solve a large number of problems that occur in software that is not unit tested.  In this post, I’m going to build a case for unit testing.

Purpose of Unit Tests

First, I’m going to assume that the person reading this post is not sold on the idea of unit tests.  So let me start by defining what a unit test is and what is not a unit test.  Then I’ll move on to defining the process of unit testing and how unit tests can save developers a lot of time.

A unit test is a tiny, simple test on a method or logic element in your software.  The goal is to create a test for each logical purpose that your code performs.  For a given “feature” you might have a hundred unit tests (more or less, depending on how complex the feature is).  For a method, you could have one, a dozen or hundreds of unit tests.  You’ll need to make sure you can cover different cases that can occur for the inputs to your methods and test for the appropriate outputs.  Here’s a list of what you should unit test:

  • Fence-post inputs.
  • Obtain full code coverage.
  • Nullable inputs.
  • Zero or empty string inputs.
  • Illegal inputs.
  • Representative set of legal inputs.

Let me explain what all of this means.  Fence-post inputs are dependent on the input data type.  If you are expecting an integer, what happens when you input a zero?  What about the maximum possible integer (int.MaxValue)?  What about minimum integer (int.MinValue)?

Obtain full coverage means that you want to make sure you hit all the code that is inside your “if” statements as well as the “else” portion.  Here’s an example of a method:

public class MyClass
{
    public int MyMethod(int input1)
    {
        if (input1 == 0)
        {
            return 4;
        }
        else if (input1 > 0)
        {
            return 2;
        }
        return input1;
    }
}

How many unit tests would you need to cover all the code in this method?  You would need three:

  1. Test with input1 = 0, that will cover the code up to the “return 4;”
  2. Test with input = 1 or greater, that will cover the code to “return 2;”
  3. Test with input = -1 or less, that will cover the final “return input1;” line of code.

That will get you full coverage.  In addition to those three tests, you should account for min and max int values.  This is a trivial example, so min and max tests are overkill.  For larger code you might want to make sure that someone doesn’t break your code by changing the input data type.  Anyone changing the data type from int to something else would get failed unit tests that will indicate that they need to review the code changes they are performing and either fix the code or update the unit tests to provide coverage for the redefined input type.
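
Here’s roughly what those three tests look like, using MSTest’s [TestMethod] attribute in the same style as the other test snippets in these posts; the min and max value tests mentioned above would follow the same pattern.

[TestMethod]
public void my_method_zero_input_returns_four()
{
    var myObject = new MyClass();
    Assert.AreEqual(4, myObject.MyMethod(0));
}

[TestMethod]
public void my_method_positive_input_returns_two()
{
    var myObject = new MyClass();
    Assert.AreEqual(2, myObject.MyMethod(1));
}

[TestMethod]
public void my_method_negative_input_returns_the_input()
{
    var myObject = new MyClass();
    Assert.AreEqual(-1, myObject.MyMethod(-1));
}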

Nullable data can be a real problem.  Many programmers don’t account for all null inputs.  If you are using an input type that can have null data, then you need to account for what will happen to your code when it receives that input type.

The number zero can have bad consequences.  If someone adds code and the input is in the denominator, then you’ll get a divide by zero error, and you should catch that problem before your code crashes.  Even if you are not performing a divide, you should probably test for zero, to protect a future programmer from adding code to divide and cause an error.  You don’t necessarily have to provide code in your method to handle zero.  The example above just returns the number 4.  But, if you setup a unit test with a zero for an input, and you know what to expect as your output, then that will suffice.  Any future programmer that adds a divide with that integer and doesn’t catch the zero will get a nasty surprise when they execute the unit tests.

If your method allows input data types like “string”, then you should check for illegal characters.  Does your method handle carriage returns?  Unprintable characters?  What about an empty string?  Strings can be null as well.

Don’t forget to test for your legal data.  The three tests in the previous example test for three different legal inputs.

Fixing Bugs

The process of creating unit tests should occur as you are creating objects.  In fact, you should constantly think in terms of how you’re going to unit test your object before you start writing your object.  Creating software is a lot like a sausage factory, and even I write objects before unit tests as well as the other way around.  I prefer to create an empty object and some proposed methods that I’ll be creating.  Just a small shell with maybe one or two methods that I want to start with.  Then I’ll think up unit tests that I’ll need ahead of time.  Then I add some code and that might trigger a thought for another unit test.  The unit tests go with the code that you are writing and it’s much easier to write the unit tests before or just after you create a small piece of code.  That’s because the code you just created is fresh in your mind and you know what it’s supposed to do.

Now you have a monster that was created over several sprints.  Thousands of lines of code and four hundred unit tests.  You deploy your code to a Quality environment and a QA person discovers a bug.  Something you would have never thought about, but it’s an easy fix.  Yeah, it was something stupid, and the fix will take about two seconds and you’re done!

Not so fast!  If you find a bug, create a unit test first.  Make sure the unit test triggers the bug.  If this is something that blew up one of your objects, then you need to create one or more unit tests that feeds the same input into your object and forces it to blow up.  Then fix the bug.  The unit test(s) should pass.

Now why did we bother?  If you’re a seasoned developer like me, there have been numerous times that another developer unfixes your bug fix.  It happens so often, that I’m never surprised when it does happen.  Maybe your fix caused an issue that was unreported.  Another developer secretly fixes your bug by undoing your fix, not realizing that they are unfixing a bug.  If you put a unit test in to account for a bug, then a developer that unfixes the bug will get an error from your unit test.  If your unit test is named descriptively, then that developer will realize that he/she is doing something wrong.  This episode just performed a regression test on your object.
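
As a sketch, reusing the MyClass example from above and a made-up bug number, a regression test for a bug fix might look like this.  The descriptive name is what warns the next developer exactly which fix they just undid.

[TestMethod]
public void my_method_min_value_input_returns_the_input_bug_4721()
{
    // Bug 4721 is a hypothetical ticket number - use whatever your tracker assigns.
    var myObject = new MyClass();
    Assert.AreEqual(int.MinValue, myObject.MyMethod(int.MinValue));
}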

Building Unit Tests is Hard!

At first unit tests are difficult to build.  The problem with unit testing has more to do with object dependency than with the idea of unit testing.  First, you need to learn how to write code that isn’t tightly coupled.  You can do this by using an IOC container.  In fact, if you’re not using an IOC container, then you’re just writing legacy code.  Somewhere down the line, some poor developer is going to have to “fix” your code so that they can create unit tests.

The next most difficult concept to overcome is learning how to mock or fake an object that is not being unit tested.  These can be devices, like database access, file I/O, SMTP drivers, etc.  For devices, learn how to use interfaces and wrappers.  Then you can use Moq to mock those wrappers in your unit tests.

Unit Tests are Small

You need to be conscious of what you are unit testing.  Don’t create a unit test that checks a whole string of objects at once (unless you want to consider those as integration tests).  Limit your unit tests to the smallest amount of code you need in order to test your functionality.  No need to be fancy.  Just simple.  Your unit tests should run fast.  Many slow running unit tests bring no benefit to the quality of your product.  Developers will avoid running unit tests if it takes 10 minutes to run them all.  If your unit tests are taking too long to run, you’ll need to analyze what should be scaled back.  Maybe your program is too large and should be broken into smaller pieces (like APIs).

There are other reasons to keep your unit tests small and simple: Some day one or more unit tests are going to fail.  The developer modifying code will need to look at the failing unit test and analyze what it is testing.  The quicker a developer can analyze and determine what is being tested, the quicker he/she can fix the bug that was caused, or update the unit test for the new functionality.  A philosophy of keeping code small should translate into your entire programming work pattern.  Keep your methods small as well.  That will keep your code from being nested too deep.  Make sure your methods serve a single purpose.  That will make unit testing easier.

A unit test only tests methods of one object.  The only time you’ll break other objects’ unit tests is if you change your object’s public interface (its constructors, public methods or their parameters).  If you change something in a private method, only unit tests for the object you’re working on will fail.

Run Unit Tests Often

For a continuous integration environment, your unit tests should run right after you build.  If you have a build server (and you should), your build server must run the unit tests.  If your tests do not pass, then the build needs to be marked as broken.  If you only run your unit tests after you end your sprint, then you’re going to be in for a nasty surprise when hundreds of unit tests fail and you need to spend days trying to fix all the problems.  Your programming pattern should be: Type some code, build, test, repeat.  If you test after each build, then you’ll catch mistakes as you make them.  Your failing unit tests will be minimal and you can fix your problem while you are focused on the logic that caused the failure.

Learning to Unit Test

There are a lot of resources on the Internet for the subject of unit testing.  I have written many blog posts on the subject that you can study by clicking on the following links:


Mocking Your File System

Introduction

In this post, I’m going to talk about basic dependency injection and mocking a method that is used to access hardware.  The method I’ll be mocking is System.IO.Directory.Exists().

Mocking Methods

One of the biggest headaches with unit testing is that you have to make sure you mock any objects that your method under test is calling.  Otherwise your test results could be dependent on something you’re not really testing.  As an example for this blog post, I will show how to apply unit tests to this very simple program:

class Program
{
    static void Main(string[] args)
    {
        var myObject = new MyClass();
        Console.WriteLine(myObject.MyMethod());
        Console.ReadKey();
    }
}

The object that is used above is:

public class MyClass
{
    public int MyMethod()
    {
        if (System.IO.Directory.Exists("c:\\temp"))
        {
            return 3;
        }
        return 5;
    }
}

Now, we want to create two unit tests to cover all the code in the MyMethod() method.  Here’s an attempt at one unit test:

[TestMethod]
public void test_temp_directory_exists()
{
    var myObject = new MyClass();
    Assert.AreEqual(3, myObject.MyMethod());
}

The problem with this unit test is that it will pass if your computer contains the c:\temp directory.  If your computer doesn’t contain c:\temp, then it will always fail.  If you’re using a continuous integration environment, you can’t control if the directory exists or not.  To compound the problem, you really need to test both possibilities to get full test coverage of your method.  Adding a unit test to cover the case where c:\temp does not exist would guarantee that one test will pass and the other will fail.

The newcomer to unit testing might think: “I could just add code to my unit tests to create or delete that directory before the test runs!”  Except, that would be a unit test that modifies your machine.  The behavior would destroy anything you have in your c:\temp directory if you happen to use that directory for something.  Unit tests should not modify anything outside the unit test itself.  A unit test should never modify database data.  A unit test should not modify files on your system.  You should avoid creating physical files if possible, even temp files because temp file usage will make your unit tests slower.

Unfortunately, you can’t just mock System.IO.Directory.Exists().  The way to get around this is to create a wrapper object, then inject the object into MyClass and then you can use Moq to mock your wrapper object to be used for unit testing only.  Your program will not change, it will still call MyClass as before.  Here’s the wrapper object and an interface to go with it:

public class FileSystem : IFileSystem
{
  public bool DirectoryExists(string directoryName)
  {
    return System.IO.Directory.Exists(directoryName);
  }
}

public interface IFileSystem
{
    bool DirectoryExists(string directoryName);
}

Your next step is to provide an injection point into your existing class (MyClass).  You can do this by creating two constructors, the default constructor that initializes this object for use by your method and a constructor that expects a parameter of IFileSystem.  The constructor with the IFileSystem parameter will only be used by your unit test.  That is where you will pass along a mocked version of your filesystem object with known return values.  Here are the modifications to the MyClass object:

public class MyClass
{
    private readonly IFileSystem _fileSystem;

    public MyClass(IFileSystem fileSystem)
    {
        _fileSystem = fileSystem;
    }

    public MyClass()
    {
        _fileSystem = new FileSystem();
    }

    public int MyMethod()
    {
        if (_fileSystem.DirectoryExists("c:\\temp"))
        {
            return 3;
        }
        return 5;
    }
}

This is the point where your program should operate as normal.  Notice how I did not need to modify the original call to MyClass that occurred at the “Main()” of the program.  The MyClass() object will create a FileSystem wrapper instance (through its default constructor) and use that object instead of calling System.IO.Directory.Exists() directly.  The result will be the same.  The difference is that now, you can create two unit tests with mocked versions of IFileSystem in order to test both possible outcomes of the existence of “c:\temp”.  Here is an example of the two unit tests:

[TestMethod]
public void test_temp_directory_exists()
{
    var mockFileSystem = new Mock<IFileSystem>();
    mockFileSystem.Setup(x => x.DirectoryExists("c:\\temp")).Returns(true);

    var myObject = new MyClass(mockFileSystem.Object);
    Assert.AreEqual(3, myObject.MyMethod());
}

[TestMethod]
public void test_temp_directory_missing()
{
    var mockFileSystem = new Mock<IFileSystem>();
    mockFileSystem.Setup(x => x.DirectoryExists("c:\\temp")).Returns(false);

    var myObject = new MyClass(mockFileSystem.Object);
    Assert.AreEqual(5, myObject.MyMethod());
}

Make sure you include the NuGet package for Moq.  You’ll notice that in the first unit test, we’re testing MyClass with a mocked up version of a system where “c:\temp” exists.  In the second unit test, the mock returns false for the directory exists check.

One thing to note: You must provide a matching input on x.DirectoryExists() in the mock setup.  If it doesn’t match what is used in the method, then you will not get the results you expect.  In this example, the directory being checked is hard-coded in the method and we know that it is “c:\temp”, so that’s how I mocked it.  If there is a parameter that is passed into the method, then you can mock some test value, and pass the same test value into your method to make sure it matches (the actual test parameter doesn’t matter for the unit test, only the results).
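
For example, if the directory name were passed in as a parameter (a hypothetical variation of the class above), the mock setup and the call just need to agree on the value.  Moq’s It.IsAny<string>() is an alternative when the exact value isn’t the point of the test.

// Hypothetical variation of MyClass where the directory to check is a parameter.
public class MyClassWithParameter
{
    private readonly IFileSystem _fileSystem;

    public MyClassWithParameter(IFileSystem fileSystem)
    {
        _fileSystem = fileSystem;
    }

    public int MyMethod(string directoryName)
    {
        return _fileSystem.DirectoryExists(directoryName) ? 3 : 5;
    }
}

[TestMethod]
public void test_directory_exists_with_parameter()
{
    var mockFileSystem = new Mock<IFileSystem>();

    // Match the exact value the test passes in below...
    mockFileSystem.Setup(x => x.DirectoryExists("d:\\logs")).Returns(true);
    // ...or accept any string when the specific value doesn't matter:
    // mockFileSystem.Setup(x => x.DirectoryExists(It.IsAny<string>())).Returns(true);

    var myObject = new MyClassWithParameter(mockFileSystem.Object);
    Assert.AreEqual(3, myObject.MyMethod("d:\\logs"));
}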

Using an IOC Container

This sample is setup to be extremely simple.  I’m assuming that you have existing .Net legacy code and you’re attempting to add unit tests to the code.  Normally, legacy code is hopelessly un-unit testable.  In other words, it’s usually not worth the effort to apply unit tests because of the tightly coupled nature of legacy code.  There are situations where legacy code is not too difficult to add unit testing.  This can occur if the code is relatively new and the developer(s) took some care in how they built the code.  If you are building new code, you can use this same technique from the beginning, but you should also plan your entire project to use an IOC container.  I would not recommend refactoring an existing project to use an IOC container.  That is a level of madness that I have attempted more than once with many man-hours of wasted time trying to figure out what is wrong with the scoping of my objects.

If your code is relatively new and you have refactored to use constructors as your injection points, you might be able to adapt to an IOC container.  If you are building your code from the ground up, you need to use an IOC container.  Do it now and save yourself the headache of trying to figure out how to inject objects three levels deep.  What am I talking about?  Here’s an example of a program that is tightly coupled:

class Program
{
    static void Main(string[] args)
    {
        var myRootClass = new MyRootClass();

        myRootClass.Increment();

        Console.WriteLine(myRootClass.CountExceeded());
        Console.ReadKey();
    }
}
public class MyRootClass
{
  readonly ChildClass _childClass = new ChildClass();

  public bool CountExceeded()
  {
    if (_childClass.TotalNumbers() > 5)
    {
        return true;
    }
    return false;
  }

  public void Increment()
  {
    _childClass.IncrementIfTempDirectoryExists();
  }
}

public class ChildClass
{
    private int _myNumber;

    public int TotalNumbers()
    {
        return _myNumber;
    }

    public void IncrementIfTempDirectoryExists()
    {
        if (System.IO.Directory.Exists("c:\\temp"))
        {
            _myNumber++;
        }
    }

    public void Clear()
    {
        _myNumber = 0;
    }
}

The example code above is very typical legacy code.  The “Main()” calls the first object called “MyRootClass()”, then that object calls a child class that uses System.IO.Directory.Exists().  You can use the previous example to unit test the ChildClass for the cases when c:\temp exists and when it doesn’t exist.  When you start to unit test MyRootClass, there’s a nasty surprise.  How do you inject your directory wrapper into that class?  If you have to inject class wrappers and mocked classes for every child class of a class, the constructor of a class could become incredibly large.  This is where IOC containers come to the rescue.

As I’ve explained in other blog posts, an IOC container is like a dictionary of your objects.  When you create your objects, you must create a matching interface for the object.  The index of the IOC dictionary is the interface name that represents your object.  Then you only call other objects using the interface as your data type and ask the IOC container for the object that is in the dictionary.  I’m going to make up a simple IOC container object just for demonstration purposes.  Do not use this for your code, use something like AutoFac for your IOC container.  This sample is just to show the concept of how it all works.  Here’s the container object:

public class IOCContainer
{
  private static readonly Dictionary<string,object> ClassList = new Dictionary<string, object>();
  private static IOCContainer _instance;

  public static IOCContainer Instance => _instance ?? (_instance = new IOCContainer());

  public void AddObject<T>(string interfaceName, T theObject)
  {
    ClassList.Add(interfaceName,theObject);
  }

  public object GetObject(string interfaceName)
  {
    return ClassList[interfaceName];
  }

  public void Clear()
  {
    ClassList.Clear();
  }
}

This object is a singleton object (global object) so that it can be used by any object in your project/solution.  Basically it’s a container that holds all pointers to your object instances.  This is a very simple example, so I’m going to ignore scoping for now.  I’m going to assume that all your objects contain no special dependent initialization code.  In a real-world example, you’ll have to analyze what is initialized when your objects are created and determine how to setup the scoping in the IOC container.  AutoFac has options of when the object will be created.  This example creates all the objects before the program starts to execute.  There are many reasons why you might not want to create an object until it’s actually used.  Keep that in mind when you are looking at this simple example program.

In order to use the above container, we’ll need to use the same FileSystem object and interface from the previous program.  Then create an interface for MyRootClass and ChildClass.  Next, you’ll need to go through your program and find every location where an object is instantiated (look for the “new” keyword).  Replace those instances like this:

public class ChildClass : IChildClass
{
    private int _myNumber;
    private readonly IFileSystem _fileSystem = (IFileSystem)IOCContainer.Instance.GetObject("IFileSystem");

    public int TotalNumbers()
    {
        return _myNumber;
    }

    public void IncrementIfTempDirectoryExists()
    {
        if (_fileSystem.DirectoryExists("c:\\temp"))
        {
            _myNumber++;
        }
    }

    public void Clear()
    {
        _myNumber = 0;
    }
}

Instead of creating a new instance of FileSystem, you’ll ask the IOC container to give you the instance that was created for the interface called IFileSystem.  Notice how there is no injection in this object.  AutoFac and other IOC containers have facilities to perform constructor injection automatically.  I don’t want to introduce that level of complexity in this example, so for now I’ll just pretend that we need to go to the IOC container object directly for the main program as well as the unit tests.  You should be able to see the pattern from this example.

Once all your classes are updated to use the IOC container, you’ll need to change your “Main()” to setup the container.  I changed the Main() method like this:

static void Main(string[] args)
{
    ContainerSetup();

    var myRootClass = (IMyRootClass)IOCContainer.Instance.GetObject("IMyRootClass");
    myRootClass.Increment();

    Console.WriteLine(myRootClass.CountExceeded());
    Console.ReadKey();
}

private static void ContainerSetup()
{
    IOCContainer.Instance.AddObject<IChildClass>("IChildClass",new ChildClass());
    IOCContainer.Instance.AddObject<IMyRootClass>("IMyRootClass",new MyRootClass());
    IOCContainer.Instance.AddObject<IFileSystem>("IFileSystem", new FileSystem());
}

Technically the MyRootClass object does not need to be included in the IOC container since no other object is dependent on it.  I included it to demonstrate that all objects should be inserted into the IOC container and referenced from the instance in the container.  This is the design pattern used by IOC containers.  Now we can write the following unit tests:

[TestMethod]
public void test_temp_directory_exists()
{
    var mockFileSystem = new Mock<IFileSystem>();
    mockFileSystem.Setup(x => x.DirectoryExists("c:\\temp")).Returns(true);

    IOCContainer.Instance.Clear();
    IOCContainer.Instance.AddObject("IFileSystem", mockFileSystem.Object);

    var myObject = new ChildClass();
    myObject.IncrementIfTempDirectoryExists();
    Assert.AreEqual(1, myObject.TotalNumbers());
}

[TestMethod]
public void test_temp_directory_missing()
{
    var mockFileSystem = new Mock<IFileSystem>();
    mockFileSystem.Setup(x => x.DirectoryExists("c:\\temp")).Returns(false);

    IOCContainer.Instance.Clear();
    IOCContainer.Instance.AddObject("IFileSystem", mockFileSystem.Object);

    var myObject = new ChildClass();
    myObject.IncrementIfTempDirectoryExists();
    Assert.AreEqual(0, myObject.TotalNumbers());
}

[TestMethod]
public void test_root_count_exceeded_true()
{
    var mockChildClass = new Mock<IChildClass>();
    mockChildClass.Setup(x => x.TotalNumbers()).Returns(12);

    IOCContainer.Instance.Clear();
    IOCContainer.Instance.AddObject("IChildClass", mockChildClass.Object);

    var myObject = new MyRootClass();
    myObject.Increment();
    Assert.AreEqual(true,myObject.CountExceeded());
}

[TestMethod]
public void test_root_count_exceeded_false()
{
    var mockChildClass = new Mock<IChildClass>();
    mockChildClass.Setup(x => x.TotalNumbers()).Returns(1);

    IOCContainer.Instance.Clear();
    IOCContainer.Instance.AddObject("IChildClass", mockChildClass.Object);

    var myObject = new MyRootClass();
    myObject.Increment();
    Assert.AreEqual(false, myObject.CountExceeded());
}

In these unit tests, we put the mocked up object used by the object under test into the IOC container.  I have provided a “Clear()” method to reset the IOC container for the next test.  When you use AutoFac or other IOC containers, you will not need the container object in your unit tests.  That’s because IOC containers like the one built into .Net Core and AutoFac use the constructor of the object to perform injection automatically.  That makes your unit tests easier because you just use the constructor to inject your mocked up object and test your object.  Your program uses the IOC container to magically inject the correct object according to the interface used by your constructor.

Using AutoFac

Take the previous example and create a new constructor for each class and pass the interface as a parameter into the object like this:

private readonly IFileSystem _fileSystem;

public ChildClass(IFileSystem fileSystem)
{
    _fileSystem = fileSystem;
}

Instead of asking the IOC container for the object that matches the interface IFileSystem, I have only setup the object to expect the fileSystem object to be passed in as a parameter to the class constructor.  Make this change for each class in your project.  Next, change your main program to include AutoFac (NuGet package) and refactor your IOC container setup to look like this:

static void Main(string[] args)
{
    IOCContainer.Setup();

    using (var myLifetime = IOCContainer.Container.BeginLifetimeScope())
    {
        var myRootClass = myLifetime.Resolve<IMyRootClass>();

        myRootClass.Increment();

        Console.WriteLine(myRootClass.CountExceeded());
        Console.ReadKey();
    }
}

public static class IOCContainer
{
    public static IContainer Container { get; set; }

    public static void Setup()
    {
        var builder = new ContainerBuilder();

        builder.Register(x => new FileSystem())
            .As<IFileSystem>()
            .PropertiesAutowired()
            .SingleInstance();

        builder.Register(x => new ChildClass(x.Resolve<IFileSystem>()))
            .As<IChildClass>()
            .PropertiesAutowired()
            .SingleInstance();

        builder.Register(x => new MyRootClass(x.Resolve<IChildClass>()))
            .As<IMyRootClass>()
            .PropertiesAutowired()
            .SingleInstance();

        Container = builder.Build();
    }
}

I have ordered the builder.Register commands from the innermost to the outermost object classes.  This is not really necessary since the resolve will not occur until the IOC container is called by the object to be used.  In other words, you can define the MyRootClass first, followed by FileSystem and ChildClass, or in any order you want.  The Register command is just storing your definition of which physical object will be represented by each interface and which dependencies it depends on.
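
One more note on scoping, since everything above uses SingleInstance(): AutoFac has other lifetime options.  A rough sketch of the common ones, using the same classes as above, looks like this.

var builder = new ContainerBuilder();

builder.RegisterType<FileSystem>()
    .As<IFileSystem>()
    .SingleInstance();               // one shared instance for the life of the container

builder.RegisterType<ChildClass>()
    .As<IChildClass>()
    .InstancePerDependency();        // a brand new instance every time it is resolved

builder.RegisterType<MyRootClass>()
    .As<IMyRootClass>()
    .InstancePerLifetimeScope();     // one instance per BeginLifetimeScope() block

var container = builder.Build();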

Now you can cleanup your unit tests to look like this:

[TestMethod]
public void test_temp_directory_exists()
{
    var mockFileSystem = new Mock<IFileSystem>();
    mockFileSystem.Setup(x => x.DirectoryExists("c:\\temp")).Returns(true);

    var myObject = new ChildClass(mockFileSystem.Object);
    myObject.IncrementIfTempDirectoryExists();
    Assert.AreEqual(1, myObject.TotalNumbers());
}

[TestMethod]
public void test_temp_directory_missing()
{
    var mockFileSystem = new Mock<IFileSystem>();
    mockFileSystem.Setup(x => x.DirectoryExists("c:\\temp")).Returns(false);

    var myObject = new ChildClass(mockFileSystem.Object);
    myObject.IncrementIfTempDirectoryExists();
    Assert.AreEqual(0, myObject.TotalNumbers());
}

[TestMethod]
public void test_root_count_exceeded_true()
{
    var mockChildClass = new Mock<IChildClass>();
    mockChildClass.Setup(x => x.TotalNumbers()).Returns(12);

    var myObject = new MyRootClass(mockChildClass.Object);
    myObject.Increment();
    Assert.AreEqual(true, myObject.CountExceeded());
}

[TestMethod]
public void test_root_count_exceeded_false()
{
    var mockChildClass = new Mock<IChildClass>();
    mockChildClass.Setup(x => x.TotalNumbers()).Returns(1);

    var myObject = new MyRootClass(mockChildClass.Object);
    myObject.Increment();
    Assert.AreEqual(false, myObject.CountExceeded());
}

Do not include the AutoFac NuGet package in your unit test project.  It’s not needed.  Each object is isolated from all other objects.  You will still need to mock any injected objects, but the injection occurs at the constructor of each object.  All dependencies have been isolated so you can unit test with ease.

Where to Get the Code

As always, I have posted the sample code up on my GitHub account.  This project contains four different sample projects.  I would encourage you to download each sample and experiment/practice with them.  You can download the samples by following the links listed here:

  1. MockingFileSystem
  2. TightlyCoupledExample
  3. SimpleIOCContainer
  4. AutoFacIOCContainer

Vintage Hardware – The IBM 701

The Basics

The IBM 701 computer was one of IBM’s first commercially available computers.  Here are a few facts about the 701:

  • The 701 was introduced in 1953.
  • IBM sold only 30 units.
  • It used approximately 4,000 tubes.
  • Memory was stored on 72 CRT tubes called Williams Tubes with a total capacity of 2,048 words.
  • The word size was 36 bits wide.
  • The CPU contained two programmer accessible registers: Accumulator, multiplier/quotient register.

Tubes

There were two types of electronic switches in the early 1950’s: the relay and the vacuum tube.  Relays were unreliable, mechanical, noisy, power-hungry and slow.  Tubes were unreliable and power-hungry.  Compared to the relay, tubes were fast.  One of the limitations on how big and complex a computer could be in those days was determined by the reliability of the tube.  Tubes had a failure rate of 0.3 percent per 1,000 hours (in the late 40’s, early 50’s).  That means that in 1,000 hours of use, 0.3 percent of the tubes will fail.  That’s pretty good if you’re talking about a dozen tubes.  At that failure rate, the 701 could expect about a dozen tube failures per 1,000 hours (0.3% x 4,000 tubes = 12 failures), or roughly one failure every 83 hours on average, assuming an even distribution of failures.

Tubes use higher voltage levels than semiconductors, which means a lot more heat and more power usage.  Tubes are also much slower than semiconductors.  By today’s standards, tubes are incredibly slow.  Most tube circuit times were measured in microseconds instead of nanoseconds.  A typical add instruction on the IBM 701 took 60 microseconds to complete.

The main memory was composed of Williams tubes.  These are small CRT tubes that stored data by exploiting the delay time that it took phosphor to fade.  Instead of using the visible properties of phosphor to store data, a secondary emission effect is used: increasing the beam to a threshold causes a charge to occur.  A thin conducting plate is mounted to the front of the CRT.  When the beam hits a point where the phosphor is already lit up, no charge is induced on the plate.  This is read as a one.  When the beam hits a place where there was no lit phosphor, a current will flow, indicating a zero.  The read operation causes the bit to be overwritten as a one, so the data must be re-written as it is read.  Wikipedia has an entire article on the Williams tube: Williams tube wiki.  Here’s an example of the dot pattern on a Williams tube:

By National Institute of Standards and Technology – National Institute of Standards and Technology, Public Domain

The tubes that were used for memory storage didn’t have visible phosphor, so a technician could plug in a phosphor tube in parallel and see the image, as pictured above, for troubleshooting purposes.

Williams tube memory was called electrostatic storage and it had an access time of 12 microseconds.  Each tube stored 1,024 bits of data for one bit of the data bus.  To form a parallel data bus, 36 tubes together represented the full data path for 1,024 words of data.  The data is stored on the tube in 32 columns by 32 rows.  If you count the columns and rows in the tube pictured above, you’ll see that it is storing 16 by 16, representing a total of 256 bits of data.  The system came with 72 tubes total for a memory size of 2,048 words.  Since the word size is 36 bits wide, 2,048 words is equal to 9,216 bytes of storage.

Here’s a photo of one drawer containing two Williams tubes:

By www.Computerhistory.org

Williams tubes used electromagnets to steer the electron beam to an xy point on the screen.  What this means is that the entire assembly must be carefully shielded to prevent stray magnetic fields from redirecting the beam.  In the picture above you can see the black shielding around each of the long tubes to prevent interference between the two tubes.

In order to address the data from the tube, the addressing circuitry would control the x and y magnets to steer to the correct dot, then read or write the dot.  If you want to learn more about the circuitry that drives an electrostatic memory unit, you can download the schematics in PDF here.

One type of semiconductor was used in this machine and that was the germanium diode.  The 701 used a total of 13,000 germanium diodes.

CPU

The CPU or Analytical Control Unit could perform 33 calculator operations as shown in this diagram:

Here is the block diagram of the 701:

Instructions are read from the electrostatic storage into the memory register, where they are directed to the instruction register.  Data that is sent to external devices must go through the multiplier/quotient register.  Data read from external devices (i.e. tape, drum or card reader) is loaded into the multiplier/quotient register.  As you can see from the block diagram, there is a switch that feeds inputs from the tape, drum, card reader or the electrostatic memory.  This is a physical switch on the front panel of the computer.  The computer operator would change the switch position to the desired input before starting the machine.  Here’s a photo of the front panel; you can see the round switch near the lower left (click to zoom):

By Dan – Flickr: IBM 701, CC BY 2.0

The front panel also has switches and lights for each of the registers so the operator could manually input binary into the accumulator or enter a starting address (instruction counter).  Notice how the data width of this computer is wider than the address width.  Only 12 bits are needed to address all 2,048 words of memory.  If you look closely, you’ll also notice that the lights are grouped in octal sets (3 lights per group).  The operator can key in data that is written as octal numbers (0-7) without trying to look at a 12-bit number of ones and zeros.  The switches for entering data are grouped gray and white in octal groupings as well.

There is a front panel control to increment the machine cycle.  An operator or technician could troubleshoot a problem by executing one machine cycle at a time with one press of the button per machine cycle.  For a multiply instruction the operator could push the button 38 times to complete the operation or examine the partial result as each cycle was completed.  The machine also came with diagnostic programs that could be run to quickly identify a physical problem with the machine.  A set of 16 manuals was provided to assist the installation, maintenance, operation and programming of the 701.  The computer operator normally only operated the start, stop and reset controls as well as the input and output selectors on the machine.  The instruction counter and register controls are normally used by a technician to troubleshoot problems.

The CPU was constructed using a modular technique.  Each module or “Pluggable Unit” had a complete circuit on it and could be replaced all at once by a technician.  The units looked like this:

Image from Pinterest

All of these units were aligned together into one large block:

(By Bitsavers, IBM 701)

Notice how all the tubes face the front of the machine.  One of the first things to burn out on a tube is the heating element, so by facing the tubes toward the front, a technician could quickly spot any burned-out tubes and replace them.  If all the tubes were glowing, the technician would need to run diagnostics and narrow down which tube or tubes were not working.  Another reason to design the modules with all tubes facing forward is that the technician could grab a tube, pull it out of its socket and put it into a tube tester to determine which tube really was bad.

My experience with tubes dates back to the late ’70s, when people wanted to “fix” an old TV set (usually one built in the early ’60s).  I would look for an unlit tube, read the tube number off the side and run down to the electronics shop to buy a new one.  If that failed, my next troubleshooting step was to pull all the tubes (after the TV cooled off), put them in a box and take them to the electronics store, where they had a tube tester (they wanted to sell new tubes, so they provided a self-help tester right in the store).  I would plug each tube into the tester, flip through the index pages for the tube type being tested, adjust the controls according to the instructions and push the test button.  If the tube was bad, I bought a new one.  Other components such as diodes, resistors and capacitors rarely went bad; those were the next things on the troubleshooting list after all the tubes were determined to be good.

One other note about troubleshooting tubes: the tube tester had a meter that showed the test current, and the manual gave a range of acceptable values (minimum and maximum) based on the tube manufacturer’s specifications.  For some devices, a tube would not work even though it was within specification.  We referred to such a tube as a “weak” tube (in other words, it might not be able to pass as much current as the circuit needed), so a judgment call had to be made about whether to replace it.

Here’s an example of the type of tube tester I remember using:

(PDX Retro)

All of the cabinets were designed to be small enough to ride up an elevator and fit through a normal-sized office door (see: Buchholz, IBM 701 System Design, Page 1285, Construction and Design Details).  The entire system was divided into separate cabinets for this very purpose, and cables were run through the floor to connect the cabinets together, much like the wiring in modern server rooms.

Storage Devices

The 701 could be configured with a drum storage unit and a tape storage unit.  The IBM 731 storage unit contained two physical drums organized as four logical drums, each able to store 2,048 full words of information (for a total of 8,192 words, equal to about 82,000 decimal digits).  Each logical drum read and wrote 36 bits at a time.  The drum spun at 2,929 RPM with a density of 50 bits to the inch, giving 2,048 bits around the drum for each track, and seek time could take up to 1,280 microseconds.  The storage capacity was intentionally designed to match the capacity of the electrostatic memory, so the entire memory could be dumped onto a drum or a drum could be read back into the electrostatic memory.
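
As a rough sanity check on those capacity numbers (rough because a 36-bit word holds only about ten decimal digits), here’s the arithmetic as a PowerShell snippet:

# IBM 731 drum capacity compared to the electrostatic memory
$wordsPerLogicalDrum = 2048     # each logical drum holds one full electrostatic memory image
$logicalDrums        = 4
$totalWords          = $wordsPerLogicalDrum * $logicalDrums   # 8,192 words
$approxDigitsPerWord = 10                                     # a 36-bit word holds roughly 10 decimal digits
$totalWords * $approxDigitsPerWord                            # 81,920 -- roughly the 82,000 digits quoted above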

Inputs

The primary input device for the 701 was the punch card reader.  Computers in the ’50s were designed as batch machines: it cost a lot of money to run a computer room, so it was more economical to batch jobs and keep the computer processing data continuously.  It would have been too expensive to have an operator typing data into the computer and saving it on a storage device like we do today, and time-sharing systems had not been invented yet (this machine was too slow and too small for time sharing anyway).  To prepare a batch job, operators or programmers would use a keypunch machine to type their program and data onto punch cards (usually in a different room).  The keypunch machine was an electro-mechanical device that looked like a giant typewriter (IBM 026):

By Columbia University, The IBM 026 Keypunch

A punch card reader was connected as an input to the 701 in order to read in a batch of punch cards.  Cards could be read at a rate of 150 cards per minute.  Each card holds one line of a program, up to 72 characters wide; if the cards are used as binary input, each card can hold 24 full words.  A program could be punched onto a stack of cards and then loaded into the card reader for execution.  Programmers used a lot of common routines in their programs, so they would punch a routine onto a small stack of cards and include that stack with their main program to be run in sequence (see: Buchholz, IBM 701 System Design, Page 1274, Card programming).  They could then reuse the common routine cards with other jobs.  The stacks of cards were the equivalent of today’s “files”.

By IBM Archives, IBM 711 Punch Card Reader
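
The 24-words-per-card figure mentioned above falls out of the card geometry: a standard punch card has 12 rows of punch positions, and the 701 used 72 of the columns for data.  A quick check in PowerShell:

# Binary capacity of a punch card as used by the 701
$columns        = 72                    # data columns used per card
$rows           = 12                    # punch rows on a standard card
$punchPositions = $columns * $rows      # 864 punch positions (bits) per card
$punchPositions / 36                    # 24 full 36-bit words per card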

Outputs

The 701 was normally configured with a line printer that could print 150 lines per minute.  Information to be printed was transferred from the multiplier/quotient register to 72 thyratrons (a type of current-amplifier tube) connected directly to the printer.  The thyratrons were located inside the Analytical Control Unit and were shared by the printer and the card punch (they activated either the print magnets or the punch magnets).  Data to be printed came directly from the electrostatic storage and had to be converted by the program into decimal form before going to the printer.

Sources

I have provided links from each embedded photo above to the sources (with the exception of diagrams I copied from the sources below).  You may click on a photo to go to the source and obtain more information.  For more detailed information about the IBM 701, I would recommend clicking through these sites/documents: