Dear Computer Science Majors…

Introduction

It has been a while since I wrote a blog post directed at newly minted Computer Science majors.  In fact, the last time I wrote one of these articles was in 2014.  So I’m going to give all of you shiny-new Computer Scientists a leg-up in the working world by telling you some inside information about what companies need from you.  If you read my previous blog post (click here) along with this one, you’ll be ahead of the pack when you submit your resume for that first career-starting job.

Purpose of this Post

First of all, I’m going to tell you my motivation for creating these posts.  In other words: “What’s in it for Frank.”  I’ve been programming since 1978 and I’ve been employed as a software engineer/developer since 1994.  One of my tasks as a seasoned developer is to review submitted programming tests, create programming tests, read resumes, submit recommendations, interview potential developers, etc.  By examining the code submitted for a programming test, I can tell a lot about the person applying for a job.  I can tell how sophisticated they are.  I can tell if they are just faking it (i.e. they just Googled results, made it work and don’t really understand what they are doing).  I can tell if the person is interested in the job or not.  One of the trends that I see is that there is a large gap between what is taught in colleges and what is needed by a company.  That gap has been increasing for years.  I would like to close the gap, but it’s a monstrous job.  So YOU, the person reading this blog who really wants a good job, must do a little bit of your own leg-work.  Do I have your attention?  Then read on…

What YOU Need to Do

First, go to my previous blog post on this subject and take notes on the following sections:

  • Practice
  • Web Presence

For those who are still in school and will not graduate for a few more semesters, start doing this now:

  • Programming competitions
  • Resume Workshops
  • Did I mention: Web Presence?

You’ll need to decide which track you’re going to follow and try to gain deep knowledge in that area.  Don’t go out and learn a hundred different frameworks, twenty databases and two dozen in-vogue languages.  Stick to something that is in demand and narrow your focus to a level where you can gain useful knowledge.  Ultimately you’ll need to understand a subject well enough to make something work.  You’re not going to be an expert; that takes years of practice and a few failures.  If you can learn a subject well enough to speak about it, then you’re light-years ahead of the average newly minted BS graduate.

Now it’s time for the specifics.  You need to decide if you’re going to be a Unix person or a .Net person.  I’ve done both and you can cross over.  It’s not easy to cross over, but I’m proof that it can happen.  If you survive and somehow end up programming as long as I have, then you’ll have experience with both.  Your experience will not be evenly split between the two sides.  It will be weighted toward one end or the other.  In my case, my experience is weighted toward .Net because that is the technology that I have been working on more recently.

If you’re in the Unix track, I’m probably not the subject expert on which technologies you need to follow.  Python, Ruby, frameworks, unit testing: you’ll need to read up and figure out what is in demand.  I would scan job sites such as Glassdoor, Indeed, LinkedIn, Stack Exchange or any other job sites just to see what is in demand.  Look for entry-level software developer positions.  Ignore the pay or what they are required to do and just take a quick tally of how many companies are asking for Python, PHP, Ruby, etc.  Then focus on some of those.

If you’re in the .Net track, I can tell you exactly what you need to get a great-paying job.  First, you’re going to need to learn C#.  That is THE language of .Net.  Don’t let anybody tell you otherwise.  Your college taught you Java?  No problem, your language knowledge is already 99% there.  Go to Microsoft’s website and download the free version of Visual Studio (the Community edition) and install it.  Next, you’ll need a database and that is going to be MS SQL Server.  Don’t bother with MS Access.  There is a free version of SQL Server as well.  In fact the Developer edition is fully functional, but you probably don’t need to download and install that.  When you install Visual Studio, SQL Server Express is normally installed with it.  You can gain real database knowledge from that version.

Follow this list:

  • Install Visual Studio Community.
  • Check for a pre-installed version of MS SQL Server Express.
  • Go out and sign up for a GitHub account.  Go ahead, I’ll wait (click here).
  • Download and install SourceTree (click here).

Now you have the minimum tools to build your knowledge.  Here’s a list of what you need to learn, using those tools:

  • How to program in C# using a simple console application.
  • How to create simple unit tests.
  • Create an MVC website, starting with the template site.
  • How to create tables in MS SQL Server.
  • How to insert, delete, update and select data in MS SQL Server.
  • How to create POCOs, fluent mappings and a database context in C# (see the sketch after this list).
  • How to troubleshoot a website or API (learn some basic IIS knowledge).
  • How to create a repository on GitHub.
  • How to check-in your code to GitHub using SourceTree.
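
To give you a feel for the POCO item above, here is a minimal sketch of a POCO, a fluent mapping and a database context.  This assumes Entity Framework 6 and a made-up Customer table, so treat it as a starting point rather than the one true way to do it:

using System.Data.Entity;
using System.Data.Entity.ModelConfiguration;

// POCO: a plain class that represents one row of the (made-up) Customer table.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Fluent mapping: tells Entity Framework how the POCO maps to the table.
public class CustomerConfiguration : EntityTypeConfiguration<Customer>
{
    public CustomerConfiguration()
    {
        ToTable("Customer");
        HasKey(c => c.Id);
        Property(c => c.Name).HasMaxLength(50).IsRequired();
    }
}

// Database context: the gateway between your C# code and MS SQL Server.
public class MyDatabaseContext : DbContext
{
    // "MyConnectionString" is a made-up connection string name from App.config.
    public MyDatabaseContext() : base("name=MyConnectionString")
    {
    }

    public DbSet<Customer> Customers { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        modelBuilder.Configurations.Add(new CustomerConfiguration());
    }
}

Once you can wire up something like this and query it with LINQ, you’ll be able to talk about POCOs, mappings and contexts with confidence.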

That would pretty much do it.  The list above will take about a month of easy work or maybe a hard-driven weekend.  If you can perform these tasks and talk intelligently about them, you’ll have the ability to walk into a job.  In order to seal-the-deal, you’ll have to make sure this information is correctly presented on your resume.  So how should you do that?

First, make sure you polish your projects: remove any commented-out code and any unused or dead code.  If there are tricky areas, put in some comments.  Make sure you update your “Read-Me” file on GitHub for each of your projects.  Put your GitHub URL near the top of your resume.  If I see a programmer with a URL to a GitHub account, that programmer has already earned some points on my informal scale of who gets the job.  I usually stop reading the resume and go right to the GitHub account and browse their software.  If you work on a project for some time, you can check in your changes as you progress.  This is nice for me to look at, because I can see how much effort you are putting into your software.  If I check the history and I see the first check-in was just a blank solution followed by several check-ins that show the code being refactored and re-worked into a final project, I’m going to be impressed.  That tells me that you’re conscientious enough to get your code checked in and protected from loss immediately.  Don’t wait for the final release.  Building software is a lot like producing sausage.  The process is messy, but the final product is good (assuming you like sausage).

If you really want to impress me and, by extension, any seasoned programmer, create a technical blog.  Your blog can be somewhat informal, but you need to make sure you express your knowledge of the subject.  A blog can be used as a tool to secure a job.  It doesn’t have to get a million hits a day to be successful.  In fact, if your blog only receives hits from companies that are reading your resume, it’s a success.  You see, the problem with the resume is that it doesn’t allow me into your head.  It’s just a sheet of paper (or more if you have a job history) with the bare minimum information on it.  It’s usually just there to get you past the HR department.  In the “olden” days, when resumes were mailed with a cover letter, the rule was one page.  Managers did not have time to read novels, so they wanted the potential employee to narrow down their knowledge to one page.  Sort of a summary of who you are in the working world.  This piece of paper is compared against a dozen or hundreds of other single-page resumes to determine which handful of people would be called in to be interviewed.  Interviews take a lot of physical time, so the resume reading needs to be quick.  That has changed over the years and the old rules don’t apply to the software industry as a whole.  Even though technical resumes can go on for two or more pages, the one-page resume still applies for new graduates.  If you are sending in a resume that I might pick up and read, I don’t want to see that you worked at a Walmart check-out counter for three years, followed by a gig at the car wash.  If you had an intern job at a tech company where you got some hands-on programming experience, I want to see that.  If you got an internship with Google filing their paperwork, I don’t care.

Back to the blog.  What would I blog about if I wanted to impress a seasoned programmer?  Just blog about your experience with the projects you are working on.  It can be as simple as “My First MVC Project”, with a diary format like this:

Day 1, I created a template MVC project and started digging around.  Next I modified some text in the “View” to see what would happen.  Then I started experimenting with the ViewBag object.  That’s an interesting little object.  It allowed me to pass data between the controller and the view.

And so on…

Show that you did some research on the subject.  Expand your knowledge by adding a feature to your application.  Minor features like search, column sorting and page indexing are important.  They demonstrate that you can take an existing program and extend it to do more.  When you enter the working world, 99% of what you will create will be a feature added to code you never wrote.  If your blog demonstrates that you can extend existing code, even code you wrote yourself, I’ll be impressed.
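
As a rough illustration of what I mean by a minor feature, here’s a sketch of a search filter added to an existing MVC controller action.  It assumes ASP.NET MVC 5 and uses a made-up in-memory list instead of a real data source, so adapt it to whatever your project actually uses:

using System.Collections.Generic;
using System.Linq;
using System.Web.Mvc;

public class CustomerController : Controller
{
    // Made-up in-memory data; a real project would pull this from a repository or database context.
    private static readonly List<string> CustomerNames = new List<string> { "Adams", "Baker", "Carter" };

    // Adding an optional searchText parameter extends the existing Index action without rewriting it.
    public ActionResult Index(string searchText)
    {
        var names = CustomerNames.AsEnumerable();

        if (!string.IsNullOrEmpty(searchText))
        {
            names = names.Where(n => n.Contains(searchText));
        }

        return View(names.ToList());
    }
}

A feature like that is small, but it shows that you can read the existing controller and view, and extend them without breaking anything.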

Taking the Test

Somewhere down the line, you’re going to generate some interest.  There will be a company out there that will want to start the process, and the next step is the test.  Most companies require a programming test.  At this point in my career the programming test is just an annoyance.  Let’s call it a formality.  As a new and inexperienced programmer, the test is a must.  It will be used to determine if you’re worth someone’s time to interview.  Now I’ve taken many programming tests and I’ve been involved in designing and testing many different types of programming tests.  The first thing you need to realize is that different companies have different ideas about what to test.  If it was up to me, I would want to test your problem-solving skills.  Unfortunately, it’s difficult to test for that skill without forcing you to take some sort of test that may ask for knowledge in a subject that you don’t have.  I’ve seen tests that allow the potential hire to use any language they want.  I’ve also seen tests with deliberately vague specifications that are rated according to how creative the solution is.  So here are some pointers for passing the test:

  • If it’s a timed test, try to educate yourself on the subjects you know will be on the test before it starts.
  • If it’s not a timed test, spend extra time on it.  Make it look like you spent some time to get it right.
  • Keep your code clean.
    • No “TODO” comments.
    • No commented-out code or dead code.
    • Don’t leave code that is never called (another form of dead code).
    • Follow naming-convention standards; no cryptic variable names (click here or here for examples).
  • If there is extra credit, do it.  The dirty secret is that this is a trick to see if you’re going to be the person who does just the minimum, or the person who goes the extra mile.
  • Don’t get too fancy.
    • Don’t show off your knowledge by manually coding a B-tree structure instead of using the built-in “.Sort()” method or LINQ’s “.OrderBy()”.
    • Don’t perform something that is obscure just to look clever.
    • Keep your program as small as possible.
    • Don’t add any “extra” features that are not called for in the specification (unless the instructions specifically tell you to be creative).

When you’re a student in college, you are required to analyze algorithms and decide which is more efficient in terms of memory use and CPU speed.  In the working world, you are required to build a product that must be delivered in a timely manner.  Does it matter if you use the fastest algorithm?  Often it doesn’t.  The fastest algorithm will not make a difference if you can’t deliver a working product on time because you spent a large amount of your development time making one section of code a fraction faster.  Most companies need a product delivered that works.  Code can be enhanced later.  Keep that in mind when you’re taking your programming test.  Your program should be easy to follow so another programmer can quickly enhance it or repair bugs.

The Interview

For your interview, keep it simple.  You should study up on general terms, in case you’re asked.  Make sure you understand these terms (a short code sketch illustrating a few of them follows the list):

  • Dependency injection
  • Polymorphism
  • Encapsulation
  • Single purpose
  • Model/View/Controller
  • REST
  • Base Class
  • Private/public methods/classes
  • Getters/Setters
  • Interface
  • Method overloading
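
If you want a quick refresher, here’s a tiny sketch (my own example, nothing from a real code base) that touches several of these terms at once: an interface, a base class, encapsulation through a private field with a getter/setter, polymorphism through an override, and method overloading:

// Interface: a contract that any shape must fulfill.
public interface IShape
{
    double Area();
}

// Base class with a private field (encapsulation) exposed through a getter/setter property.
public abstract class Shape : IShape
{
    private string _name;

    public string Name
    {
        get { return _name; }
        set { _name = value; }
    }

    public abstract double Area();
}

public class Rectangle : Shape
{
    public double Width { get; set; }
    public double Height { get; set; }

    // Polymorphism: the override supplies the rectangle-specific behavior.
    public override double Area()
    {
        return Width * Height;
    }

    // Method overloading: same name, different parameter list.
    public void Resize(double scale)
    {
        Width *= scale;
        Height *= scale;
    }

    public void Resize(double widthScale, double heightScale)
    {
        Width *= widthScale;
        Height *= heightScale;
    }
}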

Here’s a great place to do your study (click here).  These are very basic concepts and you should have learned them in one of your object-oriented programming classes.  Just make sure you haven’t forgotten about them.  Make sure you understand the concepts that you learned from any projects that you checked into GitHub.  If you learned some unit testing, study the terms.  Don’t try to act like an expert for your first interview.  Just be honest about the knowledge that you have.  If I interview you and you have nothing more than a simple understanding of unit testing, I’m OK with that.  All it means is that there is a base-line of knowledge that you can build on with my help.

Wear a suit, unless it is explicitly specified that you don’t need one.  At a minimum, you need to dress one step better than the company dress policy.  I’m one of the few who can walk into an interview with red shoes, jeans and a technology T-shirt and get a job.  Even though I can get away with such a crazy stunt, I usually show up in a really nice suit.  To be honest, I only show up in rags when I’m lukewarm about a job and I expect to be wooed.  If I really like the company, I look sharp.  The interviewers can tell me to take off my tie if they think I’m too stuffy.  If you’re interviewing for your first job, wear a business suit.

Don’t BS your way through the interview.  If you don’t know something, just admit it.  I ask all kinds of questions of potential new hires just to see “if” by chance they know a subject.  I don’t necessarily expect the person to know the subject and it will not have a lot of bearing on the acceptance or rejection of the person being interviewed.  Sometimes I do it just to find out what their personality is like.  If you admit that you know SQL and how to write a query, I’m going to hand you a dry-erase marker and make you write a query to join two tables together.  If you pass that, I’m going to give you a hint that I want all records from the parent table to show up even if they don’t have child records.  If you don’t know how to do a left outer join, I’m not going to hold it against you.  If you are able to write a correct or almost correct left join, I’ll be impressed.  If you start writing a union query or try to fake it with a wild guess, I’ll know you don’t know.  I don’t want you to get a lucky guess.  I’m just trying to find out how much I’m going to have to teach you after you’re hired.  Don’t assume that another candidate is going to get the job over you just because they know how to do a left outer join.  That other candidate might not impress me in other ways that are more important.  Just do the best you can and be honest about it.

Don’t worry about being nervous.  I’m still nervous when I go in for an interview and I really have no reason to be.  It’s natural.  Don’t be insulted if the interviewer dismisses you because they don’t think you’ll be a fit for their company.  You might be a fit and their interview process is lousy.  Of course, you might not be a fit.  The interviewer knows the company culture and they know the type of personality they are looking for.  There are no hard-and-fast rules for what an interviewer is looking for.  Every person who performs an interview is different.  Every company is different.

What Does a Career in Software Engineering Look Like?

This is where I adjust your working world expectations.  This will give you a leg-up on what you should focus on as you work your first job and gain experience.  Here’s the general list:

  • Keep up on the technologies.
  • Always strive to improve your skills.
  • Don’t be afraid of a technology you’ve never encountered.

Eventually you’re going to get that first job.  You’ll get comfortable with the work environment and you’ll be so good at understanding the software you’ve been working on that you won’t realize the world of computer programming has gone off on a different track.  You won’t be able to keep up with everything, but you should be able to recognize a paradigm shift when it happens.  Read.  Read a lot.  I’m talking about blogs, tech articles, whatever interests you.  If your company is having issues with the design process, do some research.  I learned unit testing because my company at the time had a software quality issue from the lack of regression testing.  The company was small and we didn’t have QA people to perform manual regression testing, so bugs kept appearing in subsystems that were not under construction.  Unit testing solved that problem.  It was difficult to learn how to do unit testing correctly.  It was difficult to apply unit testing after the software was already built.  It was difficult to break the dependencies that were created from years of adding enhancements to the company software.  Ultimately, the software was never 100% unit tested (if I remember correctly, it was around 10% when I left the company), but the unit tests that were applied had a positive effect.  When unit tests are used while the software is being developed, they are very effective.  Now that the IOC container is mainstream, dependencies are easy to break and unit tests are second nature.  Don’t get complacent about your knowledge.  I have recently interviewed individuals who have little to no unit testing experience and they have worked in the software field for years.  Now they have to play catch-up, because unit testing is a requirement, not an option.  Any company not unit testing their software is headed for bankruptcy.

APIs are another paradigm.  This falls under system architecture paradigms like SOA and Microservices.  The monolithic application is dying a long and slow death.  Good riddance.  Large applications are difficult to maintain.  They are slow to deploy.  Dependencies are usually everywhere.  Breaking a system into smaller chunks (called APIs) can ease the deployment and maintenance of your software.  This shift from monolithic design to APIs started to occur years ago.  I’m still stunned at the number of programmers that have zero knowledge of the subject.  If you’ve read my blog, you’ll know that I’m a big fan of APIs.  I have a lot of experience designing, debugging and deploying APIs.

I hope I was able to help you out.  I want to see more applicants who are qualified to work in the industry.  There’s a shortage of software developers who can do the job and that problem is getting worse every year.  The job market for seasoned developers is really good, but the working world is tough because there is a serious shortage of knowledgeable programmers.  Every company I’ve worked for has had a difficult time filling a software developer position and I don’t see that changing any time in the near future.  That doesn’t mean that there is a shortage of Computer Science graduates each year.  What it means is that there are still too many people graduating with a degree who just don’t measure up.  Don’t be that person.

Now get started on that blog!

The Case for Unit Tests

Introduction

I’ve written a lot of posts on how to unit test, break dependencies, mock objects, create fakes, use dependency injection and work with IOC containers.  I am a huge advocate of writing unit tests.  Unit tests are not the solution to everything, but they do solve a large number of problems that occur in software that is not unit tested.  In this post, I’m going to build a case for unit testing.

Purpose of Unit Tests

First, I’m going to assume that the person reading this post is not sold on the idea of unit tests.  So let me start by defining what a unit test is and what is not a unit test.  Then I’ll move on to defining the process of unit testing and how unit tests can save developers a lot of time.

A unit test is a tiny, simple test on a method or logic element in your software.  The goal is to create a test for each logical purpose that your code performs.  For a given “feature” you might have a hundred unit tests (more or less, depending on how complex the feature is).  For a method, you could have one, a dozen or hundreds of unit tests.  You’ll need to make sure you can cover different cases that can occur for the inputs to your methods and test for the appropriate outputs.  Here’s a list of what you should unit test:

  • Fence-post inputs.
  • Obtain full code coverage.
  • Nullable inputs.
  • Zero or empty string inputs.
  • Illegal inputs.
  • Representative set of legal inputs.

Let me explain what all of this means.  Fence-post inputs are dependent on the input data type.  If you are expecting an integer, what happens when you input a zero?  What about the maximum possible integer (int.MaxValue)?  What about minimum integer (int.MinValue)?

Obtaining full coverage means that you want to make sure you hit all the code that is inside your “if” statements as well as the “else” portion.  Here’s an example of a method:

public class MyClass
{
    public int MyMethod(int input1)
    {
        if (input1 == 0)
        {
            return 4;
        }
        else if (input1 > 0)
        {
            return 2;
        }
        return input1;
    }
}

How many unit tests would you need to cover all the code in this method?  You would need three (written out as code after this list):

  1. Test with input1 = 0, that will cover the code up to the “return 4;”
  2. Test with input = 1 or greater, that will cover the code to “return 2;”
  3. Test with input = -1 or less, that will cover the final “return input1;” line of code.
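
Here’s what those three tests might look like in MSTest (the test names are my own; name yours however your team prefers):

using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class MyClassTests
{
    [TestMethod]
    public void my_method_returns_four_when_input_is_zero()
    {
        var myObject = new MyClass();
        Assert.AreEqual(4, myObject.MyMethod(0));
    }

    [TestMethod]
    public void my_method_returns_two_when_input_is_positive()
    {
        var myObject = new MyClass();
        Assert.AreEqual(2, myObject.MyMethod(1));
    }

    [TestMethod]
    public void my_method_returns_input_when_input_is_negative()
    {
        var myObject = new MyClass();
        Assert.AreEqual(-1, myObject.MyMethod(-1));
    }
}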

That will get you full coverage.  In addition to those three tests, you should account for min and max int values.  This is a trivial example, so min and max tests are overkill.  For larger code you might want to make sure that someone doesn’t break your code by changing the input data type.  Anyone changing the data type from int to something else would get failed unit tests that will indicate that they need to review the code changes they are performing and either fix the code or update the unit tests to provide coverage for the redefined input type.

Nullable data can be a real problem.  Many programmers don’t account for all null inputs.  If you are using an input type that can have null data, then you need to account for what will happen to your code when it receives that input type.

The number zero can have bad consequences.  If someone adds code and the input is in the denominator, then you’ll get a divide by zero error, and you should catch that problem before your code crashes.  Even if you are not performing a divide, you should probably test for zero, to protect a future programmer from adding code to divide and cause an error.  You don’t necessarily have to provide code in your method to handle zero.  The example above just returns the number 4.  But, if you setup a unit test with a zero for an input, and you know what to expect as your output, then that will suffice.  Any future programmer that adds a divide with that integer and doesn’t catch the zero will get a nasty surprise when they execute the unit tests.

If your method allows input data types like “string”, then you should check for illegal characters.  Does your method handle carriage returns?  Unprintable characters?  What about an empty string?  Strings can be null as well.

Don’t forget to test for your legal data.  The three tests in the previous example test for three different legal inputs.
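
Here’s a small made-up example of covering null, empty and illegal string inputs.  The class and the rules it enforces are my own invention, purely to show the shape of the tests:

using Microsoft.VisualStudio.TestTools.UnitTesting;

// Made-up class: trims a name and strips carriage returns and line feeds.
public class NameFormatter
{
    public string CleanName(string name)
    {
        if (string.IsNullOrWhiteSpace(name))
        {
            return string.Empty;
        }
        return name.Trim().Replace("\r", "").Replace("\n", "");
    }
}

[TestClass]
public class NameFormatterTests
{
    [TestMethod]
    public void clean_name_handles_null_input()
    {
        var formatter = new NameFormatter();
        Assert.AreEqual(string.Empty, formatter.CleanName(null));
    }

    [TestMethod]
    public void clean_name_handles_empty_string()
    {
        var formatter = new NameFormatter();
        Assert.AreEqual(string.Empty, formatter.CleanName(""));
    }

    [TestMethod]
    public void clean_name_strips_carriage_returns_and_line_feeds()
    {
        var formatter = new NameFormatter();
        Assert.AreEqual("Joe", formatter.CleanName("Joe\r\n"));
    }
}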

Fixing Bugs

The process of creating unit tests should occur as you are creating objects.  In fact, you should constantly think in terms of how you’re going to unit test your object before you start writing your object.  Creating software is a lot like a sausage factory: even I sometimes write objects before unit tests, as well as the other way around.  I prefer to create an empty object and some proposed methods that I’ll be creating.  Just a small shell with maybe one or two methods that I want to start with.  Then I’ll think up unit tests that I’ll need ahead of time.  Then I add some code and that might trigger a thought for another unit test.  The unit tests go with the code that you are writing and it’s much easier to write the unit tests before or just after you create a small piece of code.  That’s because the code you just created is fresh in your mind and you know what it’s supposed to do.

Now you have a monster that was created over several sprints.  Thousands of lines of code and four hundred unit tests.  You deploy your code to a Quality environment and a QA person discovers a bug.  Something you would have never thought about, but it’s an easy fix.  Yeah, it was something stupid, and the fix will take about two seconds and you’re done!

Not so fast!  If you find a bug, create a unit test first.  Make sure the unit test triggers the bug.  If this is something that blew up one of your objects, then you need to create one or more unit tests that feeds the same input into your object and forces it to blow up.  Then fix the bug.  The unit test(s) should pass.

Now why did we bother?  If you’re a seasoned developer like me, there have been numerous times that another developer unfixes your bug fix.  It happens so often, that I’m never surprised when it does happen.  Maybe your fix caused an issue that was unreported.  Another developer secretly fixes your bug by undoing your fix, not realizing that they are unfixing a bug.  If you put a unit test in to account for a bug, then a developer that unfixes the bug will get an error from your unit test.  If your unit test is named descriptively, then that developer will realize that he/she is doing something wrong.  This episode just performed a regression test on your object.
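
As an example, a descriptively named regression test for the MyMethod() code shown earlier might look like this (the bug report number is made up):

using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class MyClassRegressionTests
{
    [TestMethod]
    public void my_method_returns_the_input_for_minimum_integer_bug_report_42()
    {
        // Hypothetical bug report 42: int.MinValue once produced the wrong result.
        // The descriptive name tells the next developer exactly which fix this test protects.
        var myObject = new MyClass();
        Assert.AreEqual(int.MinValue, myObject.MyMethod(int.MinValue));
    }
}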

Building Unit Tests is Hard!

At first unit tests are difficult to build.  The problem with unit testing has more to do with object dependency than with the idea of unit testing.  First, you need to learn how to write code that isn’t tightly coupled.  You can do this by using an IOC container.  In fact, if you’re not using an IOC container, then you’re just writing legacy code.  Somewhere down the line, some poor developer is going to have to “fix” your code so that they can create unit tests.

The next most difficult concept to overcome is learning how to mock or fake an object that is not being unit tested.  These can be external dependencies like database access, file I/O, SMTP drivers, etc.  For devices like these, learn how to use interfaces and wrappers.  Then you can use Moq to mock the wrappers in your unit tests.

Unit Tests are Small

You need to be conscious of what you are unit testing.  Don’t create a unit test that checks a whole string of objects at once (unless you want to consider those as integration tests).  Limit your unit tests to the smallest amount of code you need in order to test your functionality.  No need to be fancy.  Just simple.  Your unit tests should run fast.  Many slow running unit tests bring no benefit to the quality of your product.  Developers will avoid running unit tests if it takes 10 minutes to run them all.  If your unit tests are taking too long to run, you’ll need to analyze what should be scaled back.  Maybe your program is too large and should be broken into smaller pieces (like APIs).

There are other reasons to keep your unit tests small and simple: some day one or more unit tests are going to fail.  The developer modifying code will need to look at the failing unit test and analyze what it is testing.  The quicker a developer can analyze and determine what is being tested, the quicker he/she can fix the bug that was caused, or update the unit test for the new functionality.  A philosophy of keeping code small should translate into your entire programming work pattern.  Keep your methods small as well.  That will keep your code from being nested too deep.  Make sure your methods serve a single purpose.  That will make unit testing easier.

A unit test should only test the methods of one object.  The only time you’ll break other unit tests is if you change your object’s public interface (its constructor parameters or public method signatures).  If you change something in a private method, only the unit tests for the object you’re working on will fail.

Run Unit Tests Often

For a continuous integration environment, your unit tests should run right after you build.  If you have a build server (and you should), your build server must run the unit tests.  If your tests do not pass, then the build needs to be marked as broken.  If you only run your unit tests after you end your sprint, then you’re going to be in for a nasty surprise when hundreds of unit tests fail and you need to spend days trying to fix all the problems.  Your programming pattern should be: type some code, build, test, repeat.  If you test after each build, then you’ll catch mistakes as you make them.  Your failing unit tests will be minimal and you can fix your problem while you are focused on the logic that caused the failure.

Learning to Unit Test

There are a lot of resources on the Internet for the subject of unit testing.  I have written many blog posts on the subject that you can study by clicking on the following links:

Mocking Your File System

Introduction

In this post, I’m going to talk about basic dependency injection and mocking a method that is used to access hardware.  The method I’ll be mocking is System.IO.Directory.Exists().

Mocking Methods

One of the biggest headaches with unit testing is that you have to make sure you mock any objects that your method under test is calling.  Otherwise your test results could be dependent on something you’re not really testing.  As an example for this blog post, I will show how to apply unit tests to this very simple program:

class Program
{
    static void Main(string[] args)
    {
        var myObject = new MyClass();
        Console.WriteLine(myObject.MyMethod());
        Console.ReadKey();
    }
}

The object that is used above is:

public class MyClass
{
    public int MyMethod()
    {
        if (System.IO.Directory.Exists("c:\\temp"))
        {
            return 3;
        }
        return 5;
    }
}

Now, we want to create two unit tests to cover all the code in the MyMethod() method.  Here’s an attempt at one unit test:

[TestMethod]
public void test_temp_directory_exists()
{
    var myObject = new MyClass();
    Assert.AreEqual(3, myObject.MyMethod());
}

The problem with this unit test is that it will pass if your computer contains the c:\temp directory.  If your computer doesn’t contain c:\temp, then it will always fail.  If you’re using a continuous integration environment, you can’t control whether the directory exists or not.  To compound the problem, you really need to test both possibilities to get full test coverage of your method.  Adding a unit test to your test suite to cover the case where c:\temp doesn’t exist would guarantee that one test passes and the other fails.

The newcomer to unit testing might think: “I could just add code to my unit tests to create or delete that directory before the test runs!”  Except, that would be a unit test that modifies your machine.  The behavior would destroy anything you have in your c:\temp directory if you happen to use that directory for something.  Unit tests should not modify anything outside the unit test itself.  A unit test should never modify database data.  A unit test should not modify files on your system.  You should avoid creating physical files if possible, even temp files because temp file usage will make your unit tests slower.

Unfortunately, you can’t just mock System.IO.Directory.Exists().  The way to get around this is to create a wrapper object, then inject the object into MyClass and then you can use Moq to mock your wrapper object to be used for unit testing only.  Your program will not change, it will still call MyClass as before.  Here’s the wrapper object and an interface to go with it:

public class FileSystem : IFileSystem
{
  public bool DirectoryExists(string directoryName)
  {
    return System.IO.Directory.Exists(directoryName);
  }
}

public interface IFileSystem
{
    bool DirectoryExists(string directoryName);
}

Your next step is to provide an injection point into your existing class (MyClass).  You can do this by creating two constructors, the default constructor that initializes this object for use by your method and a constructor that expects a parameter of IFileSystem.  The constructor with the IFileSystem parameter will only be used by your unit test.  That is where you will pass along a mocked version of your filesystem object with known return values.  Here are the modifications to the MyClass object:

public class MyClass
{
    private readonly IFileSystem _fileSystem;

    public MyClass(IFileSystem fileSystem)
    {
        _fileSystem = fileSystem;
    }

    public MyClass()
    {
        _fileSystem = new FileSystem();
    }

    public int MyMethod()
    {
        if (_fileSystem.DirectoryExists("c:\\temp"))
        {
            return 3;
        }
        return 5;
    }
}

This is the point where your program should operate as normal.  Notice how I did not need to modify the original call to MyClass that occurred at the “Main()” of the program.  The MyClass() object will create a FileSystem wrapper instance and use that object instead of calling System.IO.Directory.Exists() directly.  The result will be the same.  The difference is that now, you can create two unit tests with mocked versions of IFileSystem in order to test both possible outcomes of the existence of “c:\temp”.  Here is an example of the two unit tests:

[TestMethod]
public void test_temp_directory_exists()
{
    var mockFileSystem = new Mock<IFileSystem>();
    mockFileSystem.Setup(x => x.DirectoryExists("c:\\temp")).Returns(true);

    var myObject = new MyClass(mockFileSystem.Object);
    Assert.AreEqual(3, myObject.MyMethod());
}

[TestMethod]
public void test_temp_directory_missing()
{
    var mockFileSystem = new Mock<IFileSystem>();
    mockFileSystem.Setup(x => x.DirectoryExists("c:\\temp")).Returns(false);

    var myObject = new MyClass(mockFileSystem.Object);
    Assert.AreEqual(5, myObject.MyMethod());
}

Make sure you include the NuGet package for Moq.  You’ll notice that in the first unit test, we’re testing MyClass with a mocked up version of a system where “c:\temp” exists.  In the second unit test, the mock returns false for the directory exists check.

One thing to note: You must provide a matching input on x.DirectoryExists() in the mock setup.  If it doesn’t match what is used in the method, then you will not get the results you expect.  In this example, the directory being checked is hard-coded in the method and we know that it is “c:\temp”, so that’s how I mocked it.  If there is a parameter that is passed into the method, then you can mock some test value, and pass the same test value into your method to make sure it matches (the actual test parameter doesn’t matter for the unit test, only the results).
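
For example, if MyMethod() were changed to accept the directory name as a parameter (a variation I made up for illustration), the mock setup and the call just need to use the same test value:

// Hypothetical variation of MyMethod() that accepts the directory to check.
public int MyMethod(string directoryName)
{
    if (_fileSystem.DirectoryExists(directoryName))
    {
        return 3;
    }
    return 5;
}

[TestMethod]
public void test_passed_directory_exists()
{
    var mockFileSystem = new Mock<IFileSystem>();
    mockFileSystem.Setup(x => x.DirectoryExists("c:\\anyfolder")).Returns(true);

    var myObject = new MyClass(mockFileSystem.Object);

    // The same test value is passed to the method, so the mock setup matches.
    Assert.AreEqual(3, myObject.MyMethod("c:\\anyfolder"));
}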

Using an IOC Container

This sample is set up to be extremely simple.  I’m assuming that you have existing .Net legacy code and you’re attempting to add unit tests to the code.  Normally, legacy code is hopelessly un-unit-testable.  In other words, it’s usually not worth the effort to apply unit tests because of the tightly coupled nature of legacy code.  There are situations where it is not too difficult to add unit tests to legacy code.  This can occur if the code is relatively new and the developer(s) took some care in how they built the code.  If you are building new code, you can use this same technique from the beginning, but you should also plan your entire project to use an IOC container.  I would not recommend refactoring an existing project to use an IOC container.  That is a level of madness that I have attempted more than once, with many man-hours of wasted time trying to figure out what is wrong with the scoping of my objects.

If your code is relatively new and you have refactored to use constructors as your injection points, you might be able to adapt to an IOC container.  If you are building your code from the ground up, you need to use an IOC container.  Do it now and save yourself the headache of trying to figure out how to inject objects three levels deep.  What am I talking about?  Here’s an example of a program that is tightly coupled:

class Program
{
    static void Main(string[] args)
    {
        var myRootClass = new MyRootClass();

        myRootClass.Increment();

        Console.WriteLine(myRootClass.CountExceeded());
        Console.ReadKey();
    }
}
public class MyRootClass
{
  readonly ChildClass _childClass = new ChildClass();

  public bool CountExceeded()
  {
    if (_childClass.TotalNumbers() > 5)
    {
        return true;
    }
    return false;
  }

  public void Increment()
  {
    _childClass.IncrementIfTempDirectoryExists();
  }
}

public class ChildClass
{
    private int _myNumber;

    public int TotalNumbers()
    {
        return _myNumber;
    }

    public void IncrementIfTempDirectoryExists()
    {
        if (System.IO.Directory.Exists("c:\\temp"))
        {
            _myNumber++;
        }
    }

    public void Clear()
    {
        _myNumber = 0;
    }
}

The example code above is very typical legacy code.  The “Main()” calls the first object called “MyRootClass()”, then that object calls a child class that uses System.IO.Directory.Exists().  You can use the previous example to unit test the ChildClass for the cases when c:\temp exists and when it doesn’t exist.  When you start to unit test MyRootClass, there’s a nasty surprise.  How do you inject your directory wrapper into that class?  If you have to inject class wrappers and mocked classes of every child class of a class, the constructor of a class could become incredibly large.  This is where IOC containers come to the rescue.

As I’ve explained in other blog posts, an IOC container is like a dictionary of your objects.  When you create your objects, you must create a matching interface for the object.  The index of the IOC dictionary is the interface name that represents your object.  Then you only call other objects using the interface as your data type and ask the IOC container for the object that is in the dictionary.  I’m going to make up a simple IOC container object just for demonstration purposes.  Do not use this for your code, use something like AutoFac for your IOC container.  This sample is just to show the concept of how it all works.  Here’s the container object:

public class IOCContainer
{
  private static readonly Dictionary<string,object> ClassList = new Dictionary<string, object>();
  private static IOCContainer _instance;

  public static IOCContainer Instance => _instance ?? (_instance = new IOCContainer());

  public void AddObject<T>(string interfaceName, T theObject)
  {
    ClassList.Add(interfaceName,theObject);
  }

  public object GetObject(string interfaceName)
  {
    return ClassList[interfaceName];
  }

  public void Clear()
  {
    ClassList.Clear();
  }
}

This object is a singleton object (global object) so that it can be used by any object in your project/solution.  Basically it’s a container that holds all pointers to your object instances.  This is a very simple example, so I’m going to ignore scoping for now.  I’m going to assume that all your objects contain no special dependent initialization code.  In a real-world example, you’ll have to analyze what is initialized when your objects are created and determine how to setup the scoping in the IOC container.  AutoFac has options of when the object will be created.  This example creates all the objects before the program starts to execute.  There are many reasons why you might not want to create an object until it’s actually used.  Keep that in mind when you are looking at this simple example program.
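
For reference, AutoFac expresses those creation options through its registration methods.  Here’s a quick sketch using the FileSystem class from this post (the registration methods are AutoFac’s; in a real project you would pick one of these lines, not all three):

// Requires the Autofac NuGet package.
var builder = new ContainerBuilder();

// One shared instance for the life of the container.
builder.RegisterType<FileSystem>().As<IFileSystem>().SingleInstance();

// A new instance every time IFileSystem is resolved.
builder.RegisterType<FileSystem>().As<IFileSystem>().InstancePerDependency();

// One instance per lifetime scope (per web request, for example).
builder.RegisterType<FileSystem>().As<IFileSystem>().InstancePerLifetimeScope();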

In order to use the above container, we’ll need to use the same FileSystem object and interface from the previous program.  Then create an interface for MyRootClass and ChildClass.  Next, you’ll need to go through your program and find every location where an object is instantiated (look for the “new” command).  Replace those instances like this:

public class ChildClass : IChildClass
{
    private int _myNumber;
    private readonly IFileSystem _fileSystem = (IFileSystem)IOCContainer.Instance.GetObject("IFileSystem");

    public int TotalNumbers()
    {
        return _myNumber;
    }

    public void IncrementIfTempDirectoryExists()
    {
        if (_fileSystem.DirectoryExists("c:\\temp"))
        {
            _myNumber++;
        }
    }

    public void Clear()
    {
        _myNumber = 0;
    }
}

Instead of creating a new instance of FileSystem, you’ll ask the IOC container to give you the instance that was created for the interface called IFileSystem.  Notice how there is no injection in this object.  AutoFac and other IOC containers have facilities to perform constructor injection automatically.  I don’t want to introduce that level of complexity in this example, so for now I’ll just pretend that we need to go to the IOC container object directly for the main program as well as the unit tests.  You should be able to see the pattern from this example.
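
For completeness, here’s a sketch of what the two interfaces and the refactored MyRootClass could look like.  The interfaces just mirror the public methods of each class, and MyRootClass pulls IChildClass out of the container the same way ChildClass pulls IFileSystem:

public interface IChildClass
{
    int TotalNumbers();
    void IncrementIfTempDirectoryExists();
    void Clear();
}

public interface IMyRootClass
{
    bool CountExceeded();
    void Increment();
}

public class MyRootClass : IMyRootClass
{
    // The child class instance now comes from the IOC container instead of a "new" statement.
    private readonly IChildClass _childClass = (IChildClass)IOCContainer.Instance.GetObject("IChildClass");

    public bool CountExceeded()
    {
        if (_childClass.TotalNumbers() > 5)
        {
            return true;
        }
        return false;
    }

    public void Increment()
    {
        _childClass.IncrementIfTempDirectoryExists();
    }
}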

Once all your classes are updated to use the IOC container, you’ll need to change your “Main()” to setup the container.  I changed the Main() method like this:

static void Main(string[] args)
{
    ContainerSetup();

    var myRootClass = (IMyRootClass)IOCContainer.Instance.GetObject("IMyRootClass");
    myRootClass.Increment();

    Console.WriteLine(myRootClass.CountExceeded());
    Console.ReadKey();
}

private static void ContainerSetup()
{
    IOCContainer.Instance.AddObject<IChildClass>("IChildClass",new ChildClass());
    IOCContainer.Instance.AddObject<IMyRootClass>("IMyRootClass",new MyRootClass());
    IOCContainer.Instance.AddObject<IFileSystem>("IFileSystem", new FileSystem());
}

Technically the MyRootClass object does not need to be included in the IOC container since no other object is dependent on it.  I included it to demonstrate that all objects should be inserted into the IOC container and referenced from the instance in the container.  This is the design pattern used by IOC containers.  Now we can write the following unit tests:

[TestMethod]
public void test_temp_directory_exists()
{
    var mockFileSystem = new Mock<IFileSystem>();
    mockFileSystem.Setup(x => x.DirectoryExists("c:\\temp")).Returns(true);

    IOCContainer.Instance.Clear();
    IOCContainer.Instance.AddObject("IFileSystem", mockFileSystem.Object);

    var myObject = new ChildClass();
    myObject.IncrementIfTempDirectoryExists();
    Assert.AreEqual(1, myObject.TotalNumbers());
}

[TestMethod]
public void test_temp_directory_missing()
{
    var mockFileSystem = new Mock<IFileSystem>();
    mockFileSystem.Setup(x => x.DirectoryExists("c:\\temp")).Returns(false);

    IOCContainer.Instance.Clear();
    IOCContainer.Instance.AddObject("IFileSystem", mockFileSystem.Object);

    var myObject = new ChildClass();
    myObject.IncrementIfTempDirectoryExists();
    Assert.AreEqual(0, myObject.TotalNumbers());
}

[TestMethod]
public void test_root_count_exceeded_true()
{
    var mockChildClass = new Mock<IChildClass>();
    mockChildClass.Setup(x => x.TotalNumbers()).Returns(12);

    IOCContainer.Instance.Clear();
    IOCContainer.Instance.AddObject("IChildClass", mockChildClass.Object);

    var myObject = new MyRootClass();
    myObject.Increment();
    Assert.AreEqual(true,myObject.CountExceeded());
}

[TestMethod]
public void test_root_count_exceeded_false()
{
    var mockChildClass = new Mock<IChildClass>();
    mockChildClass.Setup(x => x.TotalNumbers()).Returns(1);

    IOCContainer.Instance.Clear();
    IOCContainer.Instance.AddObject("IChildClass", mockChildClass.Object);

    var myObject = new MyRootClass();
    myObject.Increment();
    Assert.AreEqual(false, myObject.CountExceeded());
}

In these unit tests, we put the mocked up object used by the object under test into the IOC container.  I have provided a “Clear()” method to reset the IOC container for the next test.  When you use AutoFac or other IOC containers, you will not need the container object in your unit tests.  That’s because IOC containers like the one built into .Net Core and AutoFac use the constructor of the object to perform injection automatically.  That makes your unit tests easier because you just use the constructor to inject your mocked up object and test your object.  Your program uses the IOC container to magically inject the correct object according to the interface used by your constructor.

Using AutoFac

Take the previous example and create a new constructor for each class and pass the interface as a parameter into the object like this:

private readonly IFileSystem _fileSystem;

public ChildClass(IFileSystem fileSystem)
{
    _fileSystem = fileSystem;
}

Instead of asking the IOC container for the object that matches the interface IFileSystem, I have only setup the object to expect the fileSystem object to be passed in as a parameter to the class constructor.  Make this change for each class in your project.  Next, change your main program to include AutoFac (NuGet package) and refactor your IOC container setup to look like this:

static void Main(string[] args)
{
    IOCContainer.Setup();

    using (var myLifetime = IOCContainer.Container.BeginLifetimeScope())
    {
        var myRootClass = myLifetime.Resolve<IMyRootClass>();

        myRootClass.Increment();

        Console.WriteLine(myRootClass.CountExceeded());
        Console.ReadKey();
    }
}

public static class IOCContainer
{
    public static IContainer Container { get; set; }

    public static void Setup()
    {
        var builder = new ContainerBuilder();

        builder.Register(x => new FileSystem())
            .As<IFileSystem>()
            .PropertiesAutowired()
            .SingleInstance();

        builder.Register(x => new ChildClass(x.Resolve<IFileSystem>()))
            .As<IChildClass>()
            .PropertiesAutowired()
            .SingleInstance();

        builder.Register(x => new MyRootClass(x.Resolve<IChildClass>()))
            .As<IMyRootClass>()
            .PropertiesAutowired()
            .SingleInstance();

        Container = builder.Build();
    }
}

I have ordered the builder.Register commands from the inner-most to the outer-most object classes.  This is not really necessary since the resolve will not occur until the IOC container is called by the object to be used.  In other words, you can define the MyRootClass first, followed by FileSystem and ChildClass, or in any order you want.  The Register command is just storing your definition of which physical object will be represented by each interface and which dependencies it needs.

Now you can cleanup your unit tests to look like this:

[TestMethod]
public void test_temp_directory_exists()
{
    var mockFileSystem = new Mock<IFileSystem>();
    mockFileSystem.Setup(x => x.DirectoryExists("c:\\temp")).Returns(true);

    var myObject = new ChildClass(mockFileSystem.Object);
    myObject.IncrementIfTempDirectoryExists();
    Assert.AreEqual(1, myObject.TotalNumbers());
}

[TestMethod]
public void test_temp_directory_missing()
{
    var mockFileSystem = new Mock<IFileSystem>();
    mockFileSystem.Setup(x => x.DirectoryExists("c:\\temp")).Returns(false);

    var myObject = new ChildClass(mockFileSystem.Object);
    myObject.IncrementIfTempDirectoryExists();
    Assert.AreEqual(0, myObject.TotalNumbers());
}

[TestMethod]
public void test_root_count_exceeded_true()
{
    var mockChildClass = new Mock<IChildClass>();
    mockChildClass.Setup(x => x.TotalNumbers()).Returns(12);

    var myObject = new MyRootClass(mockChildClass.Object);
    myObject.Increment();
    Assert.AreEqual(true, myObject.CountExceeded());
}

[TestMethod]
public void test_root_count_exceeded_false()
{
    var mockChildClass = new Mock<IChildClass>();
    mockChildClass.Setup(x => x.TotalNumbers()).Returns(1);

    var myObject = new MyRootClass(mockChildClass.Object);
    myObject.Increment();
    Assert.AreEqual(false, myObject.CountExceeded());
}

Do not include the AutoFac NuGet package in your unit test project.  It’s not needed.  Each object is isolated from all other objects.  You will still need to mock any injected objects, but the injection occurs at the constructor of each object.  All dependencies have been isolated so you can unit test with ease.

Where to Get the Code

As always, I have posted the sample code up on my GitHub account.  This project contains four different sample projects.  I would encourage you to download each sample and experiment/practice with them.  You can download the samples by following the links listed here:

  1. MockingFileSystem
  2. TightlyCoupledExample
  3. SimpleIOCContainer
  4. AutoFacIOCContainer

Vintage Hardware – The IBM 701

The Basics

The IBM 701 computer was one of IBM’s first commercially available computers.  Here are a few facts about the 701:

  • The 701 was introduced in 1953.
  • IBM sold only 30 units.
  • It used approximately 4,000 tubes.
  • Memory was stored on 72 CRT tubes called Williams Tubes with a total capacity of 2,048 words.
  • The word size was 36 bits wide.
  • The CPU contained two programmer accessible registers: Accumulator, multiplier/quotient register.

Tubes

There were two types of electronic switches in the early 1950’s: the relay and the vacuum tube.  Relays were unreliable, mechanical, noisy, power-hungry and slow.  Tubes were unreliable and power-hungry.  Compared to the relay, tubes were fast.  One of the limitations on how big and complex a computer could be in those days was the reliability of the tube.  Tubes had a failure rate of 0.3 percent per 1,000 hours (in the late 40’s, early 50’s).  That means that in 1,000 hours of use, 0.3 percent of the tubes will fail.  That’s pretty good if you’re talking about a dozen tubes.  With 4,000 tubes, though, that works out to about 12 tube failures per 1,000 hours (0.003 x 4,000 tubes), or a failure roughly every 83 hours of operation, assuming an even distribution of failures.

Tubes use higher voltage levels than semiconductors and that means a lot more heat and more power usage.  It also means that tubes are much slower than semiconductors.  By today’s standards, tubes are incredibly slow.  Most tube circuit times were measured in microseconds instead of nanoseconds.  A typical add instruction on the IBM 701 took 60 microseconds to complete.

The main memory was composed of Williams tubes.  These are small CRT tubes that stored data by exploiting the delay time that it takes phosphor to fade.  Instead of using the visible properties of the phosphor to store data, a secondary emission effect can be produced by increasing the beam to a threshold, causing a charge to occur.  A thin conducting plate is mounted to the front of the CRT.  When the beam hits a point where the phosphor is already lit up, no charge is induced on the plate.  This is read as a one.  When the beam hits a place where there was no lit phosphor, a current will flow, indicating a zero.  The read operation causes the bit to be overwritten as a one, so the data must be re-written as it is read.  Wiki has an entire article on the Williams tube: Williams tube wiki.  Here’s an example of the dot pattern on a Williams tube:

By National Institute of Standards and Technology, Public Domain

The tubes that were used for memory storage didn’t have visible phosphor, therefore a technician could plug in a phosphor tube in parallel and see the image as pictured above for troubleshooting purposes.

Williams tube memory was called electrostatic storage and it had an access time of 12 microseconds.  Each tube stored 1,024 bits of data for one bit of the data bus.  To form a parallel data bus, 36 tubes together represented the full data path for 1,024 words of data.  The data is stored on the tube in 32 columns by 32 rows.  If you count the columns and rows in the tube pictured above, you’ll see that it is storing 16 by 16, representing a total of 256 bits of data.  The system came with 72 tubes total for a memory size of 2,048 words of memory.  Since the word size is 36 bits wide, 2,048 words is equal to 9,216 bytes of storage.

Here’s a photo of one drawer containing two Williams tubes:

By www.Computerhistory.org

Williams tubes used electromagnets to steer the electron beam to an xy point on the screen.  What this means is that the entire assembly must be carefully shielded to prevent stray magnetic fields from redirecting the beam.  In the picture above you can see the black shielding around each of the long tubes to prevent interference between the two tubes.

In order to address the data from the tube, the addressing circuitry would control the x and y magnets to steer to the correct dot, then read or write the dot.  If you want to learn more about the circuitry that drives an electrostatic memory unit, you can download the schematics in PDF here.

One type of semiconductor was used in this machine and that was the germanium diode.  The 701 used a total of 13,000 germanium diodes.

CPU

The CPU or Analytical Control Unit could perform 33 calculator operations as shown in this diagram:

Here is the block diagram of the 701:

Instructions are read from the electrostatic storage into the memory register, where they are directed to the instruction register.  Data that is sent to external devices must go through the multiplier/quotient register.  Data read from external devices (i.e. tape, drum or card reader) is loaded into the multiplier/quotient register.  As you can see from the block diagram, there is a switch that feeds inputs from the tape, drum, card reader or the electrostatic memory.  This is a physical switch on the front panel of the computer.  The computer operator would change the switch position to the desired input before starting the machine.  Here’s a photo of the front panel; you can see the round switch near the lower left (click to zoom):

By Dan – Flickr: IBM 701, CC BY 2.0

The front panel also has switches and lights for each of the registers so the operator could manually input binary into the accumulator or enter a starting address (instruction counter).  Notice how the data width of this computer is wider than the address width.  Only 12 bits are needed to address all 2,048 words of memory.  If you look closely, you’ll also notice that the lights are grouped in octal sets (3 lights per group).  The operator can key in data that is written as octal numbers (0-7) without trying to look at a 12-bit number of ones and zeros.  The switches for entering data are grouped gray and white in octal groupings as well.

There is a front panel control to increment the machine cycle.  An operator or technician could troubleshoot a problem by executing one machine cycle at a time with one press of the button per machine cycle.  For a multiply instruction the operator could push the button 38 times to complete the operation or examine the partial result as each cycle was completed.  The machine also came with diagnostic programs that could be run to quickly identify a physical problem with the machine.  A set of 16 manuals was provided to assist the installation, maintenance, operation and programming of the 701.  The computer operator normally only operated the start, stop and reset controls as well as the input and output selectors on the machine.  The instruction counter and register controls are normally used by a technician to troubleshoot problems.

The CPU was constructed using a modular technique.  Each module or “Pluggable Unit” had a complete circuit on it and could be replaced all at once by a technician.  The units looked like this:

Image from Pinterest (Explore Vacuum Tube, IBM, and more!)

Each of these units is aligned with the others into one large block:

(By Bitsavers, IBM 701)

Notice how all the tubes face the front of the machine.  One of the first things to burn out on a tube is the heating element.  By facing the tubes toward the front, a technician can quickly spot any burned-out tubes and replace them.  If all tubes are glowing, then the technician needs to run diagnostics and try to narrow down which tube or tubes are not working.  Another reason to design the modules with all tubes facing forward is that the technician can pull a tube out of its socket and put it into a tube tester to determine which tube really is bad.

My experience with tubes dates back to the late '70s, when people wanted to "fix" an old TV set (usually one built in the early '60s).  I would look for an unlit tube, read the tube number off the side and run down to the electronics shop to buy a new one.  If that failed, my next troubleshooting step was to pull all the tubes (after the TV cooled off), put them in a box and take them to the electronics store, which had a tube tester (they wanted to sell new tubes, so they provided a self-help tester right in the store).  I would plug a tube into the tester, flip through the index pages for the tube type being tested, adjust the controls according to the instructions and push the test button.  If the tube was bad, I bought a new one.  Other components such as diodes, resistors and capacitors rarely went bad; those were the next items on the troubleshooting list after all the tubes were determined to be good.  One other note about troubleshooting tubes: the tube tester had a meter showing the test current, and the manual gave a range of acceptable values (minimum and maximum) based on the tube manufacturer's specifications.  For some devices, a tube would not work even though it was within specifications.  We referred to such a tube as "weak" (in other words, it might not be able to pass as much current as the circuit needed), so a judgment call had to be made to replace it anyway.

Here’s an example of the type of tube tester I remember using:

(PDX Retro)

All cabinets were designed to be no larger than could be carried up an elevator and fit through a normal-sized office door (see: Buchholz IBM 701 System Design, Page 1285 Construction and Design Details).  The entire system was divided into separate cabinets for this very purpose.  Cables were run through the floor to connect cabinets together, much like the wiring in modern server rooms.

Storage Devices

The 701 could be configured with a drum storage unit and a tape storage unit.  The IBM 731 storage unit contained two physical drums organized as four logical drums able to store 2,048 full words of information each (for a total of 8,192 words, equal to 82,000 decimal digits).  Each logical drum reads and writes 36 bits in a row.  The drum spins at a rate of 2,929 RPM with a density of 50 bits to the inch.  There are 2,048 bits around the drum for each track.  Seek time could take up to 1,280 microseconds.  The storage capacity was intentionally designed to match the capacity of the electrostatic memory.  This was used to read all memory onto a drum or read a drum into the electrostatic memory.

Inputs

The primary input device for the 701 was the punch card reader.  Computers in the '50s were designed as batch machines.  It cost a lot of money to run a computer room, so it was more economical to batch jobs and keep the computer processing data continuously.  It would have been too expensive to have an operator typing data into the computer and saving it on a storage device like we do today.  Time-sharing systems had not been invented yet, and this machine was too slow and too small for time sharing anyway.  To perform batch jobs, operators or programmers would use a keypunch machine to type their program and data onto punch cards (usually in a different room).  The keypunch machine was an electro-mechanical device that looked like a giant typewriter (IBM 026):

By Columbia University, The IBM 026 Keypunch

A punch card reader is connected as an input to the 701 in order to read in a batch of punch cards.  Cards are read at a rate of 150 cards per minute.  Each card holds one line of a program, up to 72 characters wide.  If the punch cards are used as binary input, they can contain 24 full words per card.  A program can be punched onto a stack of cards and then loaded into the card reader for execution.  Programmers used a lot of common routines with their programs, so they would punch a routine onto a small stack of cards and include that stack with their main program to be run in sequence (see: Buchholz IBM 701 System Design, Page 1274 Card programming).  The common routine cards could then be saved and reused with other jobs.  The stacks of cards were the equivalent of today's "files".

By IBM Archives, IBM 711 Punch Card Reader

Outputs

The 701 was normally configured with a line printer that could print 150 lines per minute.  Information to be printed was transferred from the multiplier/quotient register to 72 thyratrons (a type of current amplifier tube) connected directly to the printer.  The thyratrons were located inside the Analytical Control Unit and were shared by the printer and the card punch (thyratrons activated the print magnets or the punch magnets).  Data to be printed came directly from the electrostatic storage and had to be converted by the program into decimal form before going to the printer.

Sources

I have provided links from each embedded photo above to the sources (with the exception of diagrams I copied from the sources below).  You may click on a photo to go to the source and obtain more information.  For more detailed information about the IBM 701, I would recommend clicking through these sites/documents:

 

Stored Procedures Vs. No Stored Procedures

There is a debate raging among developers: "Is it better to use stored procedures or not?"  At first glance this seems like a simple question, but there are some complicated implications behind it.  Here are the basic pros and cons of using stored procedures in your system:

Pros

  • You can isolate table changes from your front-end.
  • Updating business code is easy, just update the stored procedure while your system is running (assuming you already tested the changes off-line and everything is good).
  • Many queries can be reduced to one call to the database making your system faster.

Cons

  • Your business processing is performed by your database server, which is expensive (license fees are high compared to web servers).
  • Unit testing is all but impossible.
  • Version control is difficult.
  • You are married to your database server technology (MS SQL cannot be easily ported to Oracle).
  • Stored Procedure languages are not well structured.  This leads to spaghetti code.

Now, let’s expand a bit and discuss each of these issues:

Isolate Table Changes

It may seem like overkill to set up all of your select, insert, update and delete queries to call a stored procedure and have the stored procedure access the tables, but this can be a boon in situations where you might change your table structure in the future.  An example would be a table that becomes too wide, so you want to break it into two tables.  Your stored procedures can keep the same interface to your program while the code inside handles all the details of where the data is stored and read from.  Removing a field from a table can also be done safely: you can remove it inside the stored procedure but leave the parameter in place until you can root out all calls to the database and change them (or just leave the dead parameter).

Updating Business Code

Business code is the logic that computes and performs work on your database.  Your stored procedure might update one table, add an entry to a log table and remove corresponding records from another table.  Another use for a stored procedure is to pass filter and ordering information into a query for a list of data.  A dynamic query can be formed in a stored procedure and executed with all the parameters entered.  This relieves your front-end from the task of sorting and filtering data.  It can also reduce the amount of raw data returned from the database.
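
As a concrete illustration, here is a minimal sketch of what calling such a stored procedure from C# might look like.  The procedure name (dbo.GetProductList), its parameters and the connection string are all hypothetical, invented just for this example.

using System;
using System.Data;
using System.Data.SqlClient;

public static class ProductListCaller
{
    public static void PrintFilteredProducts()
    {
        // Hypothetical stored procedure that builds a dynamic query from filter/sort parameters.
        using (var conn = new SqlConnection("Server=YOURSQLNAME;Database=YourDb;Trusted_Connection=True;"))
        using (var cmd = new SqlCommand("dbo.GetProductList", conn))
        {
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.Parameters.AddWithValue("@NameFilter", "Doll");   // filter value
            cmd.Parameters.AddWithValue("@SortColumn", "Price");  // ordering choice

            conn.Open();
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                {
                    Console.WriteLine(reader["Name"]);
                }
            }
        }
    }
}

The front-end only knows the procedure name and its parameters; the query behind it can change without touching this code.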

The point here is that your business code might need a change or an enhancement.  If there is a bug, then it can be fixed and deployed on the fly.  A .Net application must be compiled and carefully deployed on servers that are not accessed during the time of deployment.  A stored procedure can be changed on the fly.  If your website is a large monolithic application, this capability becomes a larger “pro”.

One Call to the Database

If a record edit requires a log entry, then you’ll be forced to call your database twice.  If you use a stored procedure, the stored procedure code can make those calls, which would be performed from the database itself and not over the wire.  This can reduce the latency time to perform such an operation.

Now for the list of cons…

Business Processing is Being Performed by the Database

MS SQL Server and Oracle licensing is expensive.  Both licenses are based on the number of CPUs, and the price can get steep.  If you have a choice between performing your processing on a web server or a database server, it's a no-brainer: do it on the web server.  Even a Windows Server license for a web server is cheaper than a SQL Server license.  Initially your business will not feel the cost difference because your system is small, but once your system grows you'll see your costs climb steeply.

Unit Testing

Unit testing is not available for stored procedures.  You could probably get creative and set up a testing environment in Visual Studio, but it would be custom.  Without unit tests, you are not able to regression test.  If your stored procedures contain logic that allows different modes to occur, it can be difficult to properly test each mode.  An example is a stored procedure that performs filter and sorting operations.  If you are building a dynamic query using "if" statements, you have a combinatorial set of possible inputs.  How do you ensure that your dynamic query doesn't have a missing "and", a missing comma, or a union query with non-matching fields?  It's difficult.  If this logic is in your website code, you can wrap each combination of inputs in unit tests to provide regression testing when you add or change a filter or sorting option.
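
To make this concrete, here is a minimal sketch of the kind of regression test you can write when the dynamic query logic lives in your C# code instead of a stored procedure.  The QueryBuilder class and its Build() method are hypothetical, invented just to illustrate the idea.

using Microsoft.VisualStudio.TestTools.UnitTesting;

// Hypothetical builder that assembles a query from user-supplied filter/sort options.
public class QueryBuilder
{
    public string Build(string nameFilter, string sortColumn)
    {
        var sql = "SELECT Id, Name, Price FROM Product";
        if (!string.IsNullOrEmpty(nameFilter))
        {
            sql += " WHERE Name LIKE @NameFilter";
        }
        if (!string.IsNullOrEmpty(sortColumn))
        {
            sql += " ORDER BY " + sortColumn;
        }
        return sql;
    }
}

[TestClass]
public class QueryBuilderTests
{
    [TestMethod]
    public void Build_WithFilterAndSort_ProducesWhereAndOrderBy()
    {
        var sql = new QueryBuilder().Build("Doll", "Price");

        // A missing "AND" or comma is caught here instead of by a user at run time.
        Assert.AreEqual("SELECT Id, Name, Price FROM Product WHERE Name LIKE @NameFilter ORDER BY Price", sql);
    }

    [TestMethod]
    public void Build_WithNoInputs_ProducesPlainSelect()
    {
        Assert.AreEqual("SELECT Id, Name, Price FROM Product", new QueryBuilder().Build(null, null));
    }
}

Each combination of inputs gets its own test, so the suite becomes your regression safety net when you add or change an option.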

Version Control

No matter how you divide your code, you’ll need to make sure you version control your database so you can keep a history of your changes and match those changes with your code.  Visual Studio allows you to define all your database objects in a project (see SQL Server Database Project type).  There are tools available to allow you to create change scripts from two Team Foundation Server versions.  This can be used to update multiple databases.  Versioning a database is not a common practice and that is why I’ve put this under the “con” list instead of the “pro”.  For companies that keep their database definitions in a version control system, they can take this off the “con” list.

Married to Your Database Server

Till death do you part!  If you ever decide to switch from one database server technology to another, you'll discover how steep the hill is that you'll need to climb.  Each stored procedure will need to be converted by hand, one by one.  If your system doesn't have stored procedures, you'll have an easier time converting.  Minor differences in triggers and indexes can be an issue between Oracle and SQL Server, Oracle's recursive query syntax is different, and you might even have issues with left and right outer joins if you used the "(+)" symbol in Oracle.  Stored procedures, though, will be your Achilles heel.

Spaghetti Code

Writing stored procedures is a lot like writing in Classic ASP.  It's messy.  I see a lot of sloppy coding practices.  There is no standard for formatting queries or T-SQL code, and everybody has their own short-cuts.  Once a system grows to the point where it contains tens of thousands of stored procedures, you're faced with a mess that has no hope.  C# code has the luxury of being refactorable, which is a powerful capability for untangling code, and being able to break code into small, manageable chunks also helps.  If your database code is contained in a Visual Studio project, you can perform some of the same refactoring, but you can't test on the fly, so programmers prefer to change stored procedures on their test database, where refactoring is not available.

Conclusion

Are there more pros and cons?  Sure.  Every company has special needs for their database.  Some companies have a lot of bulk table to table processing that must be performed.  That screams stored procedures, and I would recommend sticking with that technique.  Other companies have a website front-end with large quantities of small transactions.  I would recommend those companies keep their business logic in their website code where they can unit test their logic.  In the end, you’ll need to take this list of pros and cons and decide which item to give more weight.  Your scale may tip one way or the other depending on which compromise you want to make.

 

Caching

Summary

This subject is much larger than the blog title suggests.  I’m going to discuss some basics of using a caching system to take the load off of your database and speed up your website.  I’ll also discuss how cache should be handled by your software and what the pitfalls of caching can be.

Caching Data Calls

First of all, you can speed up your database routines by using a cache system configured as a helper.  Here’s a diagram of what I mean by configuring a cache system as a helper:

Basically, what this abstract diagram shows is that the website front-end retrieves its data from the database by going through the cache server.  Technically, that is not how the system is set up.  The cache and the database are both accessed by the back-end, but the cache logic follows this flow-chart:

You’ll notice that the first place to check is the cache system.  Only if the data is not in the cache, does the program connect to the database and read the data from there.  If the data is read from the database, then it must be saved in the cache first, then returned to the calling method.

One other thing to note: it is vitally important to design and test your cache system so that it keeps working if the cache server is off or not responding.  If your cache server crashes, you don't want your website to crash; you want it to continue operating directly from the database.  Your website will operate more slowly, but it will operate.  This prevents your cache system from becoming a single point of failure.
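
Here is a minimal sketch of that flow in C#, including the fail-open behavior.  The ICacheClient interface, the key format and the 60-minute expire time are assumptions made just for this illustration; a real implementation would wrap whatever cache client library you use.

using System;

// Hypothetical cache abstraction; a real one would wrap Redis, Memcached, etc.
public interface ICacheClient
{
    string Get(string key);
    void Set(string key, string value, TimeSpan expiry);
}

public class ProductReader
{
    private readonly ICacheClient _cache;

    public ProductReader(ICacheClient cache)
    {
        _cache = cache;
    }

    public string GetProductName(int productId)
    {
        var key = "Product_" + productId;

        // 1. Check the cache first, but never let a cache outage take the site down.
        try
        {
            var cached = _cache.Get(key);
            if (cached != null)
            {
                return cached;
            }
        }
        catch (Exception)
        {
            // Cache server is down or unreachable: fall through to the database.
        }

        // 2. Cache miss (or cache failure): read from the database.
        var name = ReadNameFromDatabase(productId);

        // 3. Save the result for next time; again, ignore cache failures.
        try
        {
            _cache.Set(key, name, TimeSpan.FromMinutes(60));
        }
        catch (Exception)
        {
            // The site keeps working, just a little slower.
        }

        return name;
    }

    private string ReadNameFromDatabase(int productId)
    {
        // Placeholder for the real database call.
        return "Product " + productId;
    }
}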

Why is Caching Faster?

Programmers who have never dealt with caching systems before might ask this question.  Throwing an extra server in the middle and adding more code to every data read doesn't sound like a way to speed up a system.  The reason for the speed increase is that the cache system is typically RAM based, while databases are mostly hard-drive based (though all modern databases do some caching of their own).  Also, a Redis server costs pennies compared to a SQL Server in licensing costs alone.  By reducing the load on your SQL Server you can reduce the number of processors needed for your SQL instances and reduce your licensing fees.

This sounds so magical, but caching requires great care and can cause a lot of problems if not implemented correctly.

Caching Pitfalls to Avoid

One pitfall of caching is handling stale data.  If you are using an interactive interface with CRUD operations, you'll want to make sure you flush your cached objects on edits and deletes.  You only need to delete the cache key that relates to the data changed in the database, but this becomes complicated when the changed data shows up under more than one cache key.  An example is when you cache result sets instead of raw data: you might have one cache object that contains a list of products at a particular store (including the store name) and another cache object that contains product coupons offered by that store (also including the store name).  Cache is not a normalized structure; in this case each cached object is matched to the web page that needs the data.  Now think about what happens when the store name is changed for one reason or another.  If your software is responsible for clearing the cache when data changes, then the store name administration page must be aware of every cache key that could contain the store name and delete/expire those keys.  The quick and dirty method is to flush all the cache keys, but that could cause a slowdown of the system and is not recommended.
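
Here is a minimal sketch of that kind of targeted invalidation.  The ICacheInvalidator interface, the key names and the admin method are all invented for this example; the point is that the code changing the data has to know every key the data can appear under.

using System.Collections.Generic;

// Hypothetical cache abstraction with a delete/expire operation.
public interface ICacheInvalidator
{
    void Delete(string key);
}

public class StoreAdminService
{
    private readonly ICacheInvalidator _cache;

    public StoreAdminService(ICacheInvalidator cache)
    {
        _cache = cache;
    }

    public void RenameStore(int storeId, string newName)
    {
        SaveNewNameToDatabase(storeId, newName);

        // Every cached result set that can contain the store name must be expired.
        var relatedKeys = new List<string>
        {
            "StoreProducts_" + storeId,
            "StoreCoupons_" + storeId
        };

        foreach (var key in relatedKeys)
        {
            _cache.Delete(key);
        }
    }

    private void SaveNewNameToDatabase(int storeId, string newName)
    {
        // Placeholder for the real update.
    }
}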

It's also tempting to cache every result in your website.  While this could theoretically be done, there is a point where caching becomes more of a headache than a help.  One case is caching large result sets that constantly change.  For example, you might have a list of stores with sales figures that you refer to often.  The underlying query might be SQL intensive, so it's tempting to cache the results.  However, if the data is constantly changing, then each change must clear the cache, and the constant caching and expiring of keys can slow down your system.  Another example is caching a list of employees with a different cache key per sort or filter operation.  This can lead to a large number of cached sets that fill up your cache server and cause it to expire cached items early to make room for the constant stream of new data.  If you need to cache a list, cache the whole list and do your filtering and sorting in your website after reading it from the cache.  If there is too much data to cache, you can limit the cache time.

Sticky Cache

You can adjust the cache expire time to improve system performance.  If your cached data doesn't change very often, like a lookup table that is used everywhere in your system, then make the expire time infinite and handle all expiring through your interface, expiring the data only when it is changed.  This can be one of your greatest speed increases, especially for lookup tables that are hit by all kinds of pages, like your system settings or user rights matrix.  These lookup tables might get hit several times for each web page access.  If you can cache that lookup, then the cache system takes the hit instead of your database.  This type of caching is sometimes referred to as "sticky" because the cache keys stick in the system and never expire.
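
Here is a minimal sketch of sticky caching using the StackExchange.Redis client as an example.  The key name and the shape of the class are assumptions for illustration: the value is stored with no expire time at all, and only the admin screen that edits the data removes the key.

using StackExchange.Redis;

public class UserRightsCache
{
    private readonly IDatabase _redis;

    public UserRightsCache(IConnectionMultiplexer connection)
    {
        _redis = connection.GetDatabase();
    }

    // Save the lookup with no expiry ("sticky"): it stays until we remove it.
    public void SaveRightsMatrix(string json)
    {
        _redis.StringSet("UserRightsMatrix", json);
    }

    public string GetRightsMatrix()
    {
        return _redis.StringGet("UserRightsMatrix");
    }

    // Called only by the admin screen that edits user rights.
    public void InvalidateRightsMatrix()
    {
        _redis.KeyDelete("UserRightsMatrix");
    }
}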

Short Cache Cycles

The opposite extreme is to assign a short expire time.  This can be used where you cache the results of a list page.  The cached page might have an expire time of 5 minutes with a sliding expiration.  That allows a user to see the list and click the next-page button until they find what they are looking for.  After the user has found the piece of data to view or edit, the list is no longer needed: the expire time can kick in and expire the cached data, or the view/edit screen can expire the cache when the user clicks the link to go to that page.  This approach can also reduce database calls by caching just the raw list and doing the sorting and searching from the front-end.  When the user clicks a column header to sort by that column, the data can be re-read from the cache and sorted, instead of being read from the database.

This particular use of caching should only be used after careful analysis of your data usage.  You’ll need the following data points:

  • Total amount of data per cache key.  This will be the raw query of records that will be read by the user.
  • Total number of times the database is accessed on first arrival at the web page in question, per user.  This can be used to compute the memory used by the cache system.
  • Average number of round trips to the database the website uses when a user is performing a typical task.  This can be obtained by totaling the number of accesses to the database by each distinct user.

Multiple Cache Servers

If you get to the point where you need more than one cache server, there are many ways to divide up your cached items.  If you're using multiple database instances, it makes sense to use one cache server per instance (or one per two database instances, depending on usage).  Either way, you'll need a way to know which cache instance your cached data is located on.  If you have a load-balanced web farm with a round-robin scheme, you don't want to perform your caching per web server; such a setup would cause your users to get cache misses more often than hits, and you would duplicate your caching for most items.  It's best to think of this type of caching as being married to your database: each database should be matched up with one cache server.

If you have multiple database instances that you maintain for your customer data and a common database for system-related lookup information, it is advisable to set up caching for the common database first.  You'll get more bang for the buck and you'll reduce the load on your common system.  Your common data is usually where your sticky caching will be used.  If you're fortunate, you'll be able to cache all of your common data and only use your common database for reloading the cache when there are changes or outages.

Result Size to Cache

Let’s say you have a lookup table containing all the stores in your company.  This data doesn’t change very often and there is one administrative screen that edits this data.  So sticky caching is your choice.  Analysis shows that each website call causes a read from this table to lookup the store name from the id or some such information.  Do you cache the entire table as one cache item or do you cache each store as one cache key per store?

If your front-end only looks up a store by its id, then you can name your cache keys with the store id and it will be more efficient to store each record under its own key.  Loading the cache will take multiple reads from the database, but each time your front-end hits the cache, only the minimum amount of cached data is sent over the wire.

If your front-end searches or filters by store name, zip code, state, etc., then it's wiser to cache the whole table as one key.  Your front-end can pull all the cached data and perform filtering and sorting as needed.  This will also depend on data size.

If your data size is very large, then you might need to create duplicated cached data for each store by id, each zip code, each state, etc.  This would seem wasteful at first, but remember the data is not stored permanently.  It’s OK to store duplicate results in cache.  The object of cache is not to reduce wasted space but to reduce website latency and reduce the load on your database.
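
A rough sketch of the two keying strategies is shown below.  This is only a fragment: it assumes a _cache helper with a Get(key, minutes, loader) signature (like the one used in the Redis caching post later in this document), a _db data context with a Stores collection, and a Store POCO with Id and State properties.

// Option 1: one cache key per store - smallest possible payload per cache hit.
public Store GetStoreById(int storeId)
{
    return _cache.Get("Store_" + storeId, 60, () =>
    {
        return _db.Stores.Single(s => s.Id == storeId);
    });
}

// Option 2: one key for the whole table - filter and sort in the website
// after pulling the full cached list.
public List<Store> GetStoresByState(string state)
{
    var allStores = _cache.Get("Store_All", 60, () =>
    {
        return _db.Stores.ToList();
    });

    return allStores.Where(s => s.State == state).ToList();
}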

 

.Net MVC Project with AutoFac, SQL and Redis Cache

Summary

In this blog post I'm going to demonstrate a simple .Net MVC project that uses MS SQL Server to access data.  Then I'm going to show how to use Redis caching to cache your results and reduce the amount of traffic hitting your database.  Finally, I'm going to show how to use the AutoFac IOC container to tie it all together and how you can leverage inversion of control to break dependencies and unit test your code.

AutoFac

The AutoFac IOC container can be added to any .Net project using the NuGet manager.  For this project I created an empty MVC project and added a class called AutofacBootstrapper to the App_Start directory.  The class contains one static method called Run() just to keep it simple.  This class contains the container builder setup that is described in the instructions for AutoFac Quick Start: Quick Start.
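
Here is a rough sketch of what that bootstrapper can look like, based on the AutoFac MVC 5 quick-start pattern.  The MvcApplication reference assumes the default Global.asax application class of an MVC project; the component registrations shown later in this post would go where the comment indicates.

using System.Web.Mvc;
using Autofac;
using Autofac.Integration.Mvc;

public static class AutofacBootstrapper
{
    public static void Run()
    {
        var builder = new ContainerBuilder();

        // Register all MVC controllers in this assembly.
        builder.RegisterControllers(typeof(MvcApplication).Assembly);

        // ...component registrations (connection manager, cache, context, business logic) go here...

        var container = builder.Build();
        DependencyResolver.SetResolver(new AutofacDependencyResolver(container));
    }
}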

Next, I added .Net library projects to my solution for the following purposes:

BusinessLogic – This will contain the business classes that will be unit tested.  All other projects will be nothing more than wire-up logic.

DAC – Data-tier Application.

RedisCaching – Redis backed caching service.

StoreTests – Unit testing library

I’m going to intentionally keep this solution simple and not make an attempt to break dependencies between dlls.  If you want to break dependencies between modules or dlls, you should create another project to contain your interfaces.  For this blog post, I’m just going to use the IOC container to ensure that I don’t have any dependencies between objects so I can create unit tests.  I’m also going to make this simple by only providing one controller, one business logic method and one unit test.

Each .Net project will contain one or more objects and each object that will be referenced in the IOC container must use an interface.  So there will be the following interfaces:

IDatabaseContext – The Entity Framework database context object.

IRedisConnectionManager – The Redis connection manager provides a pooled connection to a redis server.  I’ll describe how to install Redis for windows so you can use this.

IRedisCache – This is the cache object that will allow the program to perform caching without getting into the ugly details of reading and writing to Redis.

ISalesProducts – This is the business class that will contain one method for our controller to call.
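
For reference, here is roughly what two of these interfaces might look like.  The exact signatures are assumptions based on how the objects are used in the rest of this post, not a copy of the sample solution.

using System;
using System.Collections.Generic;

// Cache abstraction: check the cache, otherwise run the supplied query and cache the result.
public interface IRedisCache
{
    T Get<T>(string key, int expireMinutes, Func<T> query);
}

// Business class consumed by the controller.
public interface ISalesProducts
{
    IEnumerable<string> Top10ProductNames();
}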

Redis Cache

In the sample solution there is a project called RedisCaching.  This contains two classes: RedisConnectionManager and RedisCache.  The connection manager object needs to be set up in the IOC container first.  It needs the Redis server IP address, which would normally be read from a config file; in the sample code I fed the IP address into the constructor at the IOC container registration stage.  The second part of the Redis caching is the actual cache object.  This uses the connection manager object and is set up in the IOC container next, using the previously registered connection manager as a parameter like this:

builder.Register(c => new RedisConnectionManager("127.0.0.1"))
    .As<IRedisConnectionManager>()
    .PropertiesAutowired()
    .SingleInstance();

builder.Register(c => new RedisCache(c.Resolve<IRedisConnectionManager>()))
    .As<IRedisCache>()
    .PropertiesAutowired()
    .SingleInstance();

In order to use the cache, just wrap your query with syntax like this:

return _cache.Get("ProductList", 60, () =>
{
  return (from p in _db.Products select p.Name);
});

The code between the { and } is the normal EF LINQ query, and its result must be returned from the anonymous function (the () => lambda).

The cache key name in the example above is "ProductList" and it will stay in the cache for 60 minutes.  The _cache.Get() method checks the cache first; if the data is there, it returns the data and moves on.  If the data is not in the cache, it calls the inner function, causing the EF query to be executed.  The result of the query is then saved to the cache server and returned.  This guarantees that any repeat of the query within the next 60 minutes will be served directly from the cache.  If you dig into the Get() method code you'll notice there are multiple try/catch blocks that handle the case where the Redis server is down: the inner query is executed and the result is returned anyway.  In a production situation your system would run a bit slower and your database would work harder, but the system keeps running.
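
If you're curious what a Get() method like that might contain, here is a simplified sketch using the StackExchange.Redis client and JSON serialization.  The real code in the sample solution goes through the connection manager and does more error handling; taking IConnectionMultiplexer directly here is a simplification for illustration.

using System;
using Newtonsoft.Json;
using StackExchange.Redis;

public class RedisCacheSketch
{
    private readonly IConnectionMultiplexer _connection;

    public RedisCacheSketch(IConnectionMultiplexer connection)
    {
        _connection = connection;
    }

    public T Get<T>(string key, int expireMinutes, Func<T> query)
    {
        // 1. Try the cache, but never let a Redis outage break the call.
        try
        {
            var cached = _connection.GetDatabase().StringGet(key);
            if (cached.HasValue)
            {
                return JsonConvert.DeserializeObject<T>(cached);
            }
        }
        catch (Exception)
        {
            // Redis is down: fall through and run the real query.
        }

        // 2. Cache miss: execute the inner EF query.
        var result = query();

        // 3. Save the result for next time, again ignoring cache failures.
        try
        {
            _connection.GetDatabase().StringSet(
                key,
                JsonConvert.SerializeObject(result),
                TimeSpan.FromMinutes(expireMinutes));
        }
        catch (Exception)
        {
        }

        return result;
    }
}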

A precompiled version of Redis for Windows can be downloaded from here: Service-Stack Redis.  Download the files into a directory on your computer (I used C:\redis) then you can open a command window and navigate into your directory and use the following command to setup a windows service:

redis-server --service-install

Please notice that there are two “-” in front of the “service-install” instruction.  Once this is setup, then Redis will start every time you start your PC.

The Data-tier

The DAC project contains the POCOs, the fluent configurations and the context object.  There is one interface for the context object and that’s for AutoFac’s use:

builder.Register(c => new DatabaseContext("Server=SQL_INSTANCE_NAME;Initial Catalog=DemoData;Integrated Security=True"))
    .As<IDatabaseContext>()
    .PropertiesAutowired()
    .InstancePerLifetimeScope();

The context string should be read from the configuration file before being injected into the constructor shown above, but I’m going to keep this simple and leave out the configuration pieces.

Business Logic

The business logic library is just one project that contains all the complex classes and methods that will be called by the API.  In a large application you might have two or more business logic projects.  Typically, though, you'll divide your application into independent APIs that each have their own business logic project as well as all the other wire-up projects shown in this example.  By dividing your application by function you'll be able to scale your services according to which function uses the most resources.  In summary, you'll put all the complicated code inside this project, and your goal is to apply unit tests that cover all combinations of features this business logic project contains.

This project will be wired up by AutoFac as well and it needs the caching and the data tier to be established first:

builder.Register(c => new SalesProducts(c.Resolve<IDatabaseContext>(), c.Resolve<IRedisCache>()))
    .As<ISalesProducts>()
    .PropertiesAutowired()
    .InstancePerLifetimeScope();

As you can see, the database context and the Redis caching are injected into the constructor of the SalesProducts class.  Typically, each class in your business logic project will be registered with AutoFac.  That ensures you can treat each object independently of the others for unit testing purposes.

Unit Tests

There is one sample unit test that exercises the SalesProducts.Top10ProductNames() method.  This test only covers the case where there are more than 10 products and the expected count is 10.  For effective testing, you should also test fewer than 10, zero, and exactly 10.  The database context is mocked using Moq.  The Redis caching system is faked using the interfaces supplied by StackExchange.  I chose to set up a dictionary inside the fake object to simulate a cached data point.  There is no check for cache expiration; this is only used to fake out the caching.  Technically, I could have mocked the caching and just made it return whatever went into it.  The fake cache can be effective in testing edit scenarios to ensure the cache is cleared when someone adds, deletes or edits a value.  The business logic should handle cache clearing, and a unit test should check for this case.
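
Here is a rough sketch of what such a test can look like.  It assumes the interfaces from this post, with IDatabaseContext exposing Products as an IQueryable<Product> (the real sample uses Entity Framework's DbSet, which needs a bit more mocking plumbing), and a Top10ProductNames() method returning the product names.  Treat it as an outline, not a copy of the sample solution.

using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using Moq;

[TestClass]
public class SalesProductsTests
{
    // Fake cache: always runs the inner query and remembers the result in a dictionary.
    private class FakeCache : IRedisCache
    {
        private readonly Dictionary<string, object> _store = new Dictionary<string, object>();

        public T Get<T>(string key, int expireMinutes, Func<T> query)
        {
            if (!_store.ContainsKey(key))
            {
                _store[key] = query();
            }
            return (T)_store[key];
        }
    }

    [TestMethod]
    public void Top10ProductNames_WithMoreThanTenProducts_ReturnsTen()
    {
        // 15 fake products; only 10 names should come back.
        var products = Enumerable.Range(1, 15)
            .Select(i => new Product { Id = i, Name = "Product " + i })
            .AsQueryable();

        var db = new Mock<IDatabaseContext>();
        db.Setup(d => d.Products).Returns(products);

        var logic = new SalesProducts(db.Object, new FakeCache());

        var result = logic.Top10ProductNames();

        Assert.AreEqual(10, result.Count());
    }
}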

Other Tests

You can test to see if the real Redis cache is working by starting up SQL Server Management Studio and running the SQL Server Profiler.  Clear the profiler, start the MVC application.  You should see some activity:

Then stop the MVC program and start it again.  There should be no change to the profiler because the data is coming out of the cache.

One thing to note: you cannot use IQueryable as the return type for your query.  It must be a list, because the data read from Redis is in JSON format and is de-serialized all at once; you can serialize and de-serialize a List<T> object.  I would also recommend adding a logger to the cache object to catch errors like this (since there are try/catch blocks).
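
In other words, the earlier ProductList example should materialize the query before it is cached.  Here is the same call with the list conversion added (a small tweak to the earlier snippet, not new functionality):

// Materialize the query; the JSON (de)serialization needs a concrete list, not an IQueryable.
return _cache.Get("ProductList", 60, () =>
{
  return (from p in _db.Products select p.Name).ToList();
});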

Another aspect of using an IOC container that you need to be conscious of is scope.  This can come into play when you deploy your application to a production environment.  Developers typically cannot easily test multi-user situations, so an object whose scope is too long can cause data to cross over between users.  If, for instance, you register your business logic as SingleInstance() but the list it returns is supposed to be specific to each user, then every user will end up with the data of the first person who accessed the API.  This can also happen if your API receives an ID with each call but the object only reads its data when the API first starts up.  This sample is so simple that it only contains one segment of data (the top 10 products); it doesn't matter who calls the API, they are all requesting the same data.

Other Considerations

This project is very minimalist, therefore, the solution does not cover a lot of real-world scenarios.

  • You should isolate your interfaces by creating a project just for all the interface classes.  This will break dependencies between modules or dlls in your system.
  • As I mentioned earlier, you will need to move all your configuration settings into the web.config file (or a corresponding config.json file).
  • You should think in terms of two or more instances of this API running at once (behind a load-balancer).  Will there be data contention?
  • Make sure you check for any memory leaks.  IOC containers can make your code logic less obvious.
  • Be careful of initialization code in an object that is started by an IOC container.  Your initialization might occur when you least expect it to.

Where to Get The Code

You can download the entire solution from my GitHub account by clicking here.  You'll need to change the database instance in the code and you'll need to set up a Redis server in order to use the caching feature.  A SQL Server script is provided so you can create a blank test database for this project.

 

DotNet Core vs. NHibernate vs. Dapper Smackdown!

The Contenders

Dapper

Dapper is a hybrid ORM.  This is a great ORM for those who have a lot of ADO legacy code to convert.  Dapper uses SQL queries and parameters just like ADO, but the parameters to a query can be simplified into POCOs, and select queries can be mapped directly into POCOs.  Converting legacy code can be done in steps: the initial pass converts ADO calls to Dapper, a later step adds POCOs, and then (if desired) queries can be changed into LINQ.  The speed difference in my tests shows that Dapper is better than my implementation of ADO for select queries, but slower for inserts and updates.  I would expect ADO to perform the best, but there is probably a performance penalty for using the data set adapter instead of the straight SqlCommand method.
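
As a quick illustration of the style, here is a minimal sketch of a Dapper select mapped straight into POCOs.  The table, columns and connection string are placeholders for this example rather than code from the test projects.

using System.Collections.Generic;
using System.Data.SqlClient;
using System.Linq;
using Dapper;

public class Product
{
    public int Id { get; set; }
    public int Store { get; set; }
    public string Name { get; set; }
    public decimal? Price { get; set; }
}

public static class DapperExample
{
    public static List<Product> GetProductsForStore(int storeId)
    {
        using (var connection = new SqlConnection("Server=YOURSQLNAME;Database=DemoData;Trusted_Connection=True;"))
        {
            // The anonymous object supplies the @StoreId parameter and each row maps to a Product POCO.
            return connection.Query<Product>(
                "select Id, Store, Name, Price from Product where Store = @StoreId",
                new { StoreId = storeId }).ToList();
        }
    }
}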

If you’re interested in Dapper you can find information here: Stack Exchange/Dapper.   Dapper has a NuGet package, which is the method I used for my sample program.

ADO

I rarely use ADO these days, with the exception of legacy code maintenance or if I need to perform some sort of bulk insert operation for a back-end system.  Most of my projects are done in Entity Framework, using the .Net Core or the .Net version.  This comparison doesn’t feel complete without including ADO, even though my smackdown series is about ORM comparisons.  So I assembled a .Net console application with some ADO objects and ran a speed test with the same data as all the ORM tests.

NHibernate

NHibernate is the .Net version of Hibernate.  This is an ORM that I used at a previous company I worked for.  At the time it was faster than Entity Framework 6 by a large amount.  The .Net Core version of Entity Framework has fixed the performance issues of EF, and it no longer makes sense to use NHibernate; I am providing the numbers in this test just for comparison purposes.  NHibernate is still faster than ADO and Dapper for everything except the select.  Both EF-7 and NHibernate are so close in performance that I would have to conclude they are the same.  The version of NHibernate used for this test is the latest as of this post (version 4.1.1 with Fluent 2.0.3).

Entity Framework 7 for .Net Core

I have updated the NuGet packages for .Net Core for this project and re-tested the code to make sure the performance has not changed over time.  The last time I did a smackdown with EF .Net Core I was using .Net Core version 1.0.0; now I'm using .Net Core 1.1.1.  There were no measurable changes in performance for EF .Net Core.

The Results

Here are the results side-by-side with the .ToList() method helper and without:

Test for Yourself!

First, you can download the .Net Core version by going to my GitHub account here and downloading the source.  There is a SQL script file in the source that you can run against your local MS SQL server to setup a blank database with the correct tables.  The NHibernate speed test code can also be downloaded from my GitHub account by clicking here. The ADO version is here.  Finally, the Dapper code is here.  You’ll want to open the code and change the database server name.

 

EntityFramework .Net Core Basics

In this post I’m going to describe the cleanest way to setup your Entity Framework code in .Net Core.  For this example, I’m going to create the database first, then I’ll run a POCO generator to generate the Plain Old C# Objects and their mappings.  Finally, I’ll put it together in a clean format.

The Database

Here is the SQL script you’ll need for the sample database.  I’m keeping this as simple as possible.  There are only two tables with one foreign key constraint between the two.  The purpose of the tables becomes obvious when you look at the table names.  This sample will be working with stores and the products that are in the stores.  Before you start, you’ll need to create a SQL database named DemoData.  Then execute this script to create the necessary test data:

delete from Product
delete from Store
go
DBCC CHECKIDENT ('[Store]', RESEED, 0);
DBCC CHECKIDENT ('[Product]', RESEED, 0);
go
insert into store (name) values ('ToysNStuff')
insert into store (name) values ('FoodBasket')
insert into store (name) values ('FurnaturePlus')

go
insert into product (Store,Name,price) values (1,'Doll',5.99)
insert into product (Store,Name,price) values (1,'Tonka Truck',18.99)
insert into product (Store,Name,price) values (1,'Nerf Gun',12.19)
insert into product (Store,Name,price) values (2,'Bread',2.39)
insert into product (Store,Name,price) values (1,'Cake Mix',4.59)
insert into product (Store,Name,price) values (3,'Couch',235.97)
insert into product (Store,Name,price) values (3,'Bed',340.99)
insert into product (Store,Name,price) values (3,'Table',87.99)
insert into product (Store,Name,price) values (3,'Chair',45.99)

POCO Generator

Next, you can follow the instructions at this blog post: Scaffolding ASP.Net Core MVC

There are a few tricks to making this work:

  1. Make sure you’re using Visual Studio 2015.  This example does not work for Visual Studio 2017.
  2. Create a throw-away project as a .Net Core Web Application.  Make sure you use “Web Application” and not “Empty” or “Web API”.

You'll need to be inside the directory that contains the project.json file where you copied the NuGet package text (see the article at the link above); the dotnet ef command will not work outside that directory.  One way to get there is to navigate to the directory in Windows Explorer, copy the path from the top of the explorer window, open a command window, type "cd ", paste the path and hit enter.  You'll end up with a command line like this:

dotnet ef dbcontext scaffold "Server=YOURSQLNAME;Database=DemoData;Trusted_Connection=True;" Microsoft.EntityFrameworkCore.SqlServer --output-dir Models

Once you have generated your POCOs, you’ll have a directory named Models inside the .Net Core MVC project:

 

 

If you look at Product.cs or Store.cs, you’ll see the POCO objects, like this:

public partial class Product
{
    public int Id { get; set; }
    public int Store { get; set; }
    public string Name { get; set; }
    public decimal? Price { get; set; }

    public virtual Store StoreNavigation { get; set; }
}

Next, you’ll find a context that is generated for you.  That context contains all your mappings:

public partial class DemoDataContext : DbContext
{
    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        optionsBuilder.UseSqlServer(@"Server=YOURSQLNAME;Database=DemoData;Trusted_Connection=True;");
    }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<Product>(entity =>
        {
            entity.Property(e => e.Name).HasColumnType("varchar(50)");
            entity.Property(e => e.Price).HasColumnType("money");
            entity.HasOne(d => d.StoreNavigation)
                .WithMany(p => p.Product)
                .HasForeignKey(d => d.Store)
                .OnDelete(DeleteBehavior.Restrict)
                .HasConstraintName("FK_store_product");
        });

        modelBuilder.Entity<Store>(entity =>
        {
            entity.Property(e => e.Address).HasColumnType("varchar(50)");
            entity.Property(e => e.Name).HasColumnType("varchar(50)");
            entity.Property(e => e.State).HasColumnType("varchar(50)");
            entity.Property(e => e.Zip).HasColumnType("varchar(50)");
        });
    }

    public virtual DbSet<Product> Product { get; set; }
    public virtual DbSet<Store> Store { get; set; }
}

At this point, you can copy this code into your desired project and it’ll work as is.  It’s OK to leave it in this format for a small project, but this format will become cumbersome if your project grows beyond a few tables.  I would recommend manually breaking up your mappings to individual source files.  Here is an example of two mapping files for the mappings of each table defined in this sample:

public static class StoreConfig
{
  public static void StoreMapping(this ModelBuilder modelBuilder)
  {
    modelBuilder.Entity<Store>(entity =>
    {
        entity.ToTable("Store");
        entity.Property(e => e.Address).HasColumnType("varchar(50)");
        entity.Property(e => e.Name).HasColumnType("varchar(50)");
        entity.Property(e => e.State).HasColumnType("varchar(50)");
        entity.Property(e => e.Zip).HasColumnType("varchar(50)");
    });
  }
}

And the product config file:

public static class ProductConfig
{
    public static void ProductMapping(this ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<Product>(entity =>
        {
            entity.ToTable("Product");
            entity.Property(e => e.Name).HasColumnType("varchar(50)");

            entity.Property(e => e.Price).HasColumnType("money");

            entity.HasOne(d => d.StoreNavigation)
                .WithMany(p => p.Product)
                .HasForeignKey(d => d.Store)
                .OnDelete(DeleteBehavior.Restrict)
                .HasConstraintName("FK_store_product");
        });
    }
}

Then change your context to this:

public partial class DemoDataContext : DbContext
{
    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        optionsBuilder.UseSqlServer(@"Server=YOURSQLNAME;Database=DemoData;Trusted_Connection=True;");
    }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.StoreMapping();
        modelBuilder.ProductMapping();
    }

    public virtual DbSet<Product> Product { get; set; }
    public virtual DbSet<Store> Store { get; set; }
}

Notice how the complexity of the mappings is moved out of the context definition and into individual configuration files.  It’s best to create a configuration directory to put those files in and further divide your config files from your POCO source files.  I also like to move my context source file out to another directory or project.  An example of a directory structure is this:

 

 

 

You can name your directories any name that suits your preferences.  I've seen DAC used to mean Data Access Control, and sometimes Repository makes more sense.  As you can see in my structure above, I grouped all of my database source files in the DAC directory; the Configuration sub-directory contains the mappings, the Domain directory has all of my POCOs, and I left the context in the DAC directory itself.  Another technique is to split these into different projects and use an IOC container to glue it all together.  You can use that technique to break dependencies between the dlls that are generated by your project.

 

Dot Net Core and NuGet Packages

One of the most frustrating changes made in .Net Core is the NuGet package manager.  I like the way the new package manager works; unfortunately, there are still a lot of bugs in it.  I call it the tyranny of intelligent software.  The software is supposed to do all the intelligent work for you, leaving you to do your work as a developer and create your product.  Unfortunately, one or more bugs cause errors to occur, and then you have to out-think the smart software and try to figure out what it was supposed to do.  When smart software works, it's magical.  When it doesn't, life as a developer can be miserable.

I'm going to show you how the NuGet manager works in .Net Core and I'll show you some tricks you'll need to get around problems that might arise.  I'm currently using Visual Studio 2015 Community.

The Project.json File

One of the great things about .Net Core is the new project.json file.  You can literally type or paste in the name of the NuGet package you want to include in your project and it will synchronize all the dlls that are needed for that package.  If you look closely, you'll see a triangle next to the file.  There is another file that is automatically maintained by the package manager called project.lock.json.  This file is excluded from TFS check-in because it can be automatically re-generated from the project.json file.  You can open it up and observe the thousands of lines of json data stuffed into it.  Sometimes this file contains old versions of dlls, especially if you created your project months ago and now you want to update your NuGet packages.  If the entries in the dependencies section of your project.json file are all flagged as errors, there could be a conflict in the lock file.  You can hold your cursor over any NuGet package and see what the error is, but sometimes that is not very helpful.

To fix this issue, you can regenerate the lock file.  Just delete the file from the solution explorer.  Visual Studio should automatically restore it.  If not, open up your package manager console window.  It should be at the bottom of Visual Studio, or you can go to "Tools -> NuGet Package Manager -> Package Manager Console".  Type "dotnet restore" in the console window and wait for it to complete.

The NuGet Local Cache

When the package manager brings in packages from the Internet, it keeps a copy of each package in a cache.  This is the first place the package manager will look for a package.  If you use the same package in another project, you'll notice that it doesn't take as much time to install as it did the first time.  The cache directory is under your user directory: go to c:\Users, then find the directory with your user name (the name you're currently logged in as, or the user name that was set up when you installed your OS).  There you'll see a folder named ".nuget".  Open that folder and drill down into "packages".  You should see about a billion folders with packages that you've installed since you started using .Net Core.  You can select all of these and delete them.  Then you can go back to your solution and restore packages.  It'll take longer than normal to restore all your packages because they must be downloaded from the Internet again.

An easier method to clearing this cache is to go to the package manager console and type in:

nuget locals all -clear

If you have your .nuget/packages folder open, you’ll see all the sub directories disappear.

If the nuget command does not work in Visual Studio, you'll have to download the NuGet.exe file from here.  Get the recommended latest version.  Then find your NuGet execution directory.  For VS2015 it is:

C:\Program Files (x86)\NuGet\Visual Studio 2015

Drop the EXE file in that directory (there is probably already a vsix file in there).  Then make sure that your system path contains the directory.  I use Rapid Environment Editor to edit my path; you can download and install that application from here.  Once you have added the directory to your PATH, exit Visual Studio and start it back up again.  Now your "nuget" command should work in the package manager console command line.

NuGet Package Sources

If you look at the package console window you’ll see a drop-down that normally shows “nuget.org”.  There is a gear icon next to the drop-down.  Click it and you’ll see the “Package Sources” window.  This window has the list of locations where NuGet packages will be searched for.  You can add your own local packages if you would like, and then add a directory to this list.  You can also update the list with urls that are shown at the NuGet site.  Go to www.nuget.org and look for the “NuGet Feed Locations” header.  Below that is a list of urls that you can put into the package sources window.  As of this blog post, there are two URLs:

Sometimes you'll get an error when the package manager attempts to update your packages.  If this occurs, it could be due to a broken url to a package site.  There is little you can do about the NuGet site itself: if it's down, you're out of luck.  Fortunately, that's a rare event.  For local package feeds, you can temporarily turn them off (assuming your project doesn't use any packages from your local feed).  To turn off a feed, go to the "Package Sources" window and uncheck its check box.  Just selecting one package feed from the drop-down does not prevent the package manager from checking, and failing on, a bad local feed.

Restart Visual Studio

One other trick that I’ve learned, is to restart Visual Studio.  Sometimes the package manager just isn’t behaving itself.  It can’t seem to find any packages and your project has about 8,000 errors consisting of missing dependencies.  In this instance, I’ll clear the local cache and then close Visual Studio.  Then re-open Visual Studio with my solution and perform a package restore.

Package Dependency Errors

Sometimes there are two or more versions of the same package in your solution.  This can cause dependency errors that are tough to find: one project references a newer version of a package than another project that your current project depends on.

To find these problems, you can right click on the solution and select “Manage NuGet Packages for Solution…” then click on each package name and look at the check boxes on the right.  If you see two different versions, update all projects to the same version:

Finally

I hope these hints save you a lot of time and effort when dealing with packages.  The problems that I’ve listed here have appeared in my projects many times.  I’m confident you’ll run into them as well.  Be sure and hit the like button if you found this article to be helpful.