Automated Deployment with .Net Core 2.0 Unit Tests

If you’re using an automated deployment system or continuous integration, you’ll need to get good at compiling and running your unit tests from the command line.  One of the issues I found with .Net Core was the difficulty in making xUnit work with Jenkins.  Jenkins has plug-ins for different types of unit testing frameworks, and support for MSTest is easy to implement.  There is no plug-in that makes xUnit work in Jenkins for .Net Core 1.  There is a plug-in for NUnit that works with the xUnit output if you convert the XML tags to match what the plug-in expects.  That’s where this PowerShell script becomes necessary:

https://blog.dangl.me/archive/unit-testing-and-code-coverage-with-jenkins-and-net-core/

If you’re attempting to use .Net Core 1 projects, follow the instructions at the link to make it work properly.

For .Net Core 2.0, there is an easier solution.  There is a logger switch that outputs a correctly formatted XML result file that can be used by the MSTest report runner in Jenkins.  You’ll need to be in the directory containing the project file for the unit tests you want to run, then execute the following:

dotnet test --logger "trx;LogFileName=mytests.trx"

Run this command for each unit test project you have in your solution and then use the MSTest runner:

This will pick up any .trx files and display the familiar unit test line chart.

The dotnet test command will run xUnit as well as MSTest tests, so you can mix and match test projects in your solution.  Both will produce the same XML-formatted .trx output file for consumption by Jenkins.

One note about the PowerShell script provided by Georg Dangl:

There are environment variables in the script that are only created when it is executed from Jenkins.  So you can’t test this script from outside of the Jenkins environment (unless you fake out all the variables before executing the script).  I would recommend modifying the script to convert all the $ENV variables into parameters passed into the script.  From Jenkins the variable names would be the same as they are in the script (like $ENV:WORKSPACE), but you can pass in a workspace path to the script if you want to test this script on your desktop.  Oftentimes I’ll test my scripts on my desktop/laptop first to make sure the script works correctly.  Then I might test it on the Jenkins server under my user account.  After that I test from the Jenkins job itself.  Otherwise, it can take a lot of man-hours to fix a PowerShell script by re-running a Jenkins job just to test the script.

 

 

Ransomware

My wife and I recently took a vacation in Nevada so we could hike trails in parks surrounding Las Vegas.  To get a sense of what we did on vacation you can check out the pictures on my hiking blog by clicking here (I lined up the post dates to be the date when we visited each park).  This article is not about the fun we had on our vacation.  This is about a ransomware attack that occurred while I was on vacation.

I’m a software engineer.  That means that I know how to hack a computer and I also know how to protect myself from hackers.  But I’m not immune to making mistakes.  The mistake I made was that I have passwords that I haven’t changed in forever.  Like all large companies, Apple had been hacked about a year ago and their entire database of passwords was obtained and distributed throughout the hacking community (yeah, “community”, that’s what they are).  The Apple hack involved their cloud service, which I didn’t pay much attention to, because I don’t use their cloud storage.  What I didn’t pay attention to was that their cloud services play a part in some of the iPhone and iPad security features.

If you clicked the link above and started looking through my pictures, you’ll notice that the first place we visited was Death Valley.  I discovered that it was supposed to be a record 120 degrees that day and I wanted to know what it would feel like to stand outside in 120 degree weather!  Yeah, I’m crazy like that.  As it turned out it got up to 123 degrees according to the thermometer built into the Jeep that we rented.  Dry heat or not, 123 degrees was HOT!

I use my Canon Rebel xti for the photos that I post on my hiking blog, but I also take pictures with my iPhone in order to get something on my Facebook account as well as get a few panoramas.  When we arrived at the sand dunes it was just before noon local time or before 3PM Eastern time.  My photos indicate they were taken around 2:07, but the camera time is off by more than an hour and the camera is on Eastern time.

I took a few pictures of the dunes and then I pulled out my iPhone and was about to take pictures when I got a warning that my phone was locked and that I needed to send an email to get instructions on how to pay for access to my iPhone.  So I used my usual pin number to unlock my iPhone and it worked correctly.  I was annoyed, but I moved on.  I thought it was some “clever” advertisement or spam notification.

When we returned to the resort, I sat down on the couch and started to use my iPad to do some reading.  My iPad had the same message on the front, and a pin number had been set up.  Unfortunately for me, I had never set a pin number on my iPad because I only use it at home to surf the web, read a book and maybe play a game.  What the hackers did was set up a pin number on my iPad.  What an annoyance.  Ransomware on any of my devices is no more worrisome than a rainy day.  It’s more of an irritation than anything.  I have off-site backups for my desktop machine and I know how to restore the entire machine in a few hours.  Hacking my iPad while I was on vacation (and on the second day of vacation to boot) was really annoying, primarily because I didn’t have access to all of my adapters, computers and tools.  My wife has a tiny laptop that we use for minor stuff.  It has a grand total of 64 gigabytes of storage space.  So she installed iTunes on it (with some difficulty) and we restored the iPad and got everything back to normal.

After returning from vacation, I cleaned out all of my spam emails for the past couple of weeks and discovered these emails:

It appears that someone manually logged into my iCloud account, enabled lost mode and put in a message, for both of my devices.  The iPhone was first, which was locked by a pin number, so they couldn’t change that.  The iPad, however, was not set up with a pin number, so they went in and set their own.  Or so I assumed when I saw it was asking for a 6-digit pin.  Apparently, the pin that shows up is the pin that is set when the device is first set up.  The pin I used on the iPad was not the same as the one I used on my iPhone (which was what I tried when I first saw it appear).

My wife and I changed the passwords on our iCloud accounts while we were at the resort, and she turned on two-factor authentication for iCloud.  Of course, that is a bit of a problem if I lose my iPhone, but it prevents anyone from hacking my iCloud account.

One thing that makes me wonder… how was Apple storing my password?  Are the passwords stored in clear text?  Are they encrypted with an algorithm that allows the password to be decrypted?  That seems foolish.  Maybe Apple was using something like a weak MD5 hash and the hacked database was cracked using a brute-force method like this: 25-GPU cluster cracks every standard Windows password in <6 hours.  I know that the correct password was used to log into iCloud using a browser.  The notification sent to my email account proves it.

How to Protect Yourself

The first level of protection that I have is that I assume I will get hacked.  From that assumption, I have plans in place to reduce any damages that can occur.  First, I have an off-site backup system that backs up everything I can’t replace on my desktop computer.  Pictures, documents, etc.  They are all backed up.  Some of my software is on GitHub so I don’t worry about backing up my local repository directory.  I have backup systems in place on my blogs and my website.

Next in line is the two-factor system.  This is probably one of the best ways to protect yourself.  Use your phone as your second factor and protect your phone from theft.  If someone steals your phone, they probably don’t have your passwords.  If someone has your passwords, they don’t have your phone.  If you see messages arrive at your phone with a second factor pin number, then you need to change the password for the account that requested it.

Next, you should turn on notifications of when someone logs into your account (if the feature is available).  Like the notifications about my iCloud being used in the emails above, I can see that someone accessed my account when I wasn’t around.  If someone is silently logging into your account, a lot more damage can be done before you figure out what is going on.

If you’re using email as your second factor, you need to protect your email account as though it were made of gold.  Change your email password often, in case the provider has been hacked.  Your email account is most likely used as a method of resetting your password on other sites.  So if a hacker gets into your email account, they can guess at other sites where you might have accounts and reset your passwords to get in.  I have my own URLs and hosts, so I create and maintain my own email system.  If my email system gets hacked it’s 100% my fault.

Disable unused accounts.  If you’re like me, you have hundreds of web accounts for stores and sites that you signed up for.  Hey, they were “free” right?  Unfortunately, your passwords are out there and any one site can get hacked.  You can’t keep track of which sites got hacked last week.  Keep a list of sites that you have accounts on.  Review that list at least annually and delete accounts on sites you no longer use.  If the site doesn’t allow you to delete your account, then go in and change the password to something that is completely random and long (like 20 characters or more depending on what the site will allow).

Use a long password if possible.  Just because the minimum password is 8 characters doesn’t mean you need to come up with an 8 character password.  If sites allow you to use 30 characters, then make something up.  There is an excellent XKCD comic demonstrating password strengths: click here.  For companies providing websites with security, I would recommend you allow at least 256 characters for passwords.  Allow your customers to create a really strong password.  Storage is cheap.  Stolen information is expensive.

Don’t use the same password for everything.  That’s a bit obvious, but people do crazy things all the time.  The problem with one password for all is that any site that gets hacked means a hacker can get into everything you have access to.  It also means you need to change all of your passwords.  If you use different passwords or some sort of theme (don’t make the theme obvious), then you can change your most important passwords often and the passwords to useless sites less often.

Last but not Least…

Don’t pay the ransom!  If you pay money, what happens if you don’t get the unlock key?  What happens if you unlock your computer and it gets ransomed again?  Plan for this contingency now.  Paying ransom only funds a criminal organization.  The more money they make performing these “services” the more likely they are to continue the practice.  I like to think of these people as telemarketers: if nobody paid, they would all be out of work.  Since telemarketing continues to this day, someone, somewhere is buying something.  Don’t keep the ransomware cycle going.

 

Documentation

Why is there No Documentation?

I’m surprised at the number of developers who don’t create any documentation.  There are ways to self-document code and there are packages to add automatic help to an API to document the services.  Unfortunately, that’s not enough.  I’ve heard all the arguments:

  • It chews up programming time.
  • It rapidly becomes obsolete.
  • The code should explain itself.
  • As a programmer, I’m more valuable if I keep the knowledge to myself.

Let me explain what I mean by documentation.  What should you document outside of your code?  Every time you create a program you usually create a design for it.  If it’s just drawings on a napkin, then it needs to go someplace where other developers can access it if they need to work on your software.  Configuration parameters need documentation.  Not a lot, just enough to describe what the purpose is.  Installation?  If there is some trick to making your program work in a production, QA or staging environment, then you should document it.  Where is the source code located?  Is there a deployment package?  Was there a debate on the use of one technology over another?

So what happens when you have no documentation?  First, you need to find the source code.  Hopefully it’s properly named and resides in the current repository.  Otherwise, you may be forced to dig through directories on old servers, or in some cases the source might not be available at all.  If the source is not available your options are limited: de-compile, rewrite or work with what is built.  Looking at source code written by a programmer who no longer works at your company is a common occurrence (did that programmer think not documenting made him/her valuable?).  Usually such code is tightly coupled and contains poorly named methods and variables with no comments.  So here are the arguments for why you should do documentation:

  • Reverse engineering software chews up programming time.
  • Most undocumented code is not written to be self-explanatory.
  • Attempting to figure out why a programmer wrote a program the way he/she did can be difficult and sometimes impossible.
  • Programmers come and go no matter how little documentation exists.

It’s easy to go overboard with documentation.  This can be another trap.  Try to keep your documentation to just the facts.  Don’t write long-winded literature.  Keep it technical.  Start with lists of notes.  Expand as needed.  Remove any obsolete documentation.

Getting Your Documentation Started

The first step to getting your documentation started is to decide on a place to store it.  The best option is a wiki of some sort.  I prefer Confluence or GitHub.  They both have clean formatting and are easy to edit and drop in pictures/screenshots.

So you have a wiki setup and it’s empty.  Next, create some subjects.  If you have several software projects in progress, start with those.  Create a subject for each project and load up all the design specifications.  If your development team is performing a retrospective, type it directly into the wiki.  If there is a debate or committee meeting to discuss a change or some nuance with the software, type it into the wiki.  They can just be raw historical notes.

Next, add “documentation” as a story point to your project, or add it to each story.  This should be a mandatory process.  Make documentation part of the development process.  Developers can just add a few notes, or they can dig in and do a brain-dump.  Somewhere down the road a developer not involved in the project will need to add an enhancement or fix a bug.  That developer will have a starting point.

Another way to seed the wiki is to create subjects for each section of your existing legacy code and just do a dump of notes in each section.  Simple information off the top of everyone’s head is good enough.  The wiki can be reworked at a later date to make things more organized.  Divide and conquer.  If a developer has fixed a bug in a subsystem that nobody understands, that developer should input their knowledge into the wiki.  This will save a lot of time when another developer has to fix a bug in that system and it will prevent your developers from becoming siloed.

You Have Documentation – Now What?

One of the purposes of your technical documentation is to train new people.  This is something that is overlooked a lot.  When a new developer is hired, that person can get up to speed faster if they can just browse a wiki full of technical notes.  With this purpose in mind, you should expand your wiki to include instructions on how to set up a desktop/laptop for a development environment.  You can also add educational material to get a developer up to speed.  This doesn’t mean that you need to type in subjects on how to write an MVC application.  You should be able to link to articles that can be used by new developers to hone their skills.  By doing this, you can keep a new person busy while you coordinate your day-to-day tasks, instead of being tied down to an all-day training session to get that person up to speed.

Your documentation should also contain a subject on your company development standards.  What frameworks are acceptable?  What processes must be followed before introducing new technologies to the system?  Coding standards?  Languages that can be used?  Maybe a statement of goals that have been laid down.  What is the intended architecture of your system?  If your company has committees to decide what the goal of the department is, then maybe the meeting minutes would be handy.

Who is Responsible for Your Documentation?

Everyone should be responsible.  Everyone should participate.  Make sure you keep backups in case something goes wrong or someone makes a mistake.  Most wiki software is capable of tracking revisions.  Documentation should be treated like version control.  Don’t delete anything!  If you want to hide subjects that have been deprecated, then create a subject at the bottom for all your obsolete projects.  When a project is deprecated, move the wiki subject to that folder.  Someday, someone might ask a question about a feature that used to exist.  You can dig up the old subject and present what used to exist if necessary.  This is especially handy if you deprecated a subsystem that was a problem-child due to its design.  If someone wants to create that same type of mess, they can read about the lessons learned from the obsolete subsystem.  The answer can be: “We tried that, it didn’t work and here’s why…” or it can be: “We did that before and here’s how it used to work…”

If you make the task of documenting part of the work that is performed, developers can add documentation as software is created.  Developers can modify documentation when bugs are fixed or enhancements are made.  Developers can remove or archive documentation when software is torn down or replaced.  The documentation should be part of the software maintenance cycle.  This will prevent the documentation from getting out of sync with your software.

 

Three Tier Architecture

There is a lot of information on the Internet about the three-tier architecture, three-tier structure and other names for the concept of breaking a program into tiers.  The current system design paradigm is to break your software into APIs, and the three-tier architecture still applies.  I’m going to try and explain the three-tier architecture from a practical standpoint and describe the benefits of following this structure.  But first, I have to explain what happens when a handful of inexperienced programmers run in and build a system from the ground up…

Bad System Design

Every seasoned developer knows what I’m talking about.  It’s the organically grown, not very well planned system.  Programming is easy.  Once a person learns the basic syntax, the world is their oyster!  Until the system gets really big.  There’s a tipping point where tightly-coupled monolithic systems become progressively more difficult and time-consuming to enhance.  Many systems that I have worked on were well beyond that point when I started working on them.  Let me show a diagram of what I’m talking about:

Technically, there is no firm division between front-end and back-end code.  The HTML/JavaScript is usually embedded in back-end code and there is typically business code scattered between the HTML.  Sometimes systems like this contain a lot of stored procedures, which do nothing more than marry you to the database that was first used to build the application.  Burying business code in stored procedures also carries the financial burden of performing your processing on the product with the most expensive licensing fees.  When your company grows, you’ll be forced to purchase more and more licenses for the database in order to keep up.  The cost grows rapidly, and you’ll come to a point where any tiny change to a stored procedure, function or table causes your user traffic to overwhelm the hardware that your database runs on.

I often joke about how I would like to build a time machine for no other reason than to go back in time, sit down with the developers of the system I am working on and tell them what they should do, to head it off before it becomes a very large mess.  I suspect that this craziness occurs because companies are started by non-programmers who hook up with some young and energetic programmer with little real-world experience who can make magic happen.  Any programmer can get the seed started.  Poor programming practices and bad system design don’t show up right away.  A startup company might only have a few hundred users at first.  Hardware is cheap, SQL Server licenses seem reasonable, everything is working as expected.  I also suspect that those developers move on when the system becomes too difficult to manage.  They move on to another “new” project that they can start badly.  Either that, or they learn their lesson and the next company they work at is lucky to get a programmer with knowledge of how not to write a program.

Once the software gets to the point that I’ve described, then it takes programmers like me to fix it.  Sometimes it takes a lot of programmers with my kind of knowledge to fix it.  Fixing a system like this is expensive and takes time.  It’s a lot like repairing a jet while in flight.  The jet must stay flying while you shut down one engine and upgrade it to a new one.  Sound like fun?  Sometimes it is, usually it’s not.

Three-Tier Basics

In case you’re completely unfamiliar with the three-tier system, here is the simplified diagram:

It looks simple, but the design is a bit nuanced.  First of all, the HTML, JavaScript, front-end frameworks, etc. must be contained in the front-end box.  You need isolation from the back-end or middle-tier.  The whole purpose of the front-end is to handle the presentation or human interface part of your system.  The back-end or middle-tier is all the business logic.  It all needs to be contained in this section.  It must be loosely coupled and unit tested.  Preferably with an IoC container like Autofac.  The database must be nothing more than a container for your data.  Reduce special features as much as possible.  Your caching system is also located in this layer.
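
To illustrate that loose coupling, here is a tiny sketch of how the middle-tier might be wired up with Autofac.  The interface and class names are invented for this example; the point is only that the business logic depends on an abstraction that a unit test can swap out for a fake.

using Autofac;

// Hypothetical abstraction over the data layer.
public interface IPriceRepository
{
    decimal GetPrice(int productId);
}

// Real implementation talks to the database; a test double replaces it in unit tests.
public class SqlPriceRepository : IPriceRepository
{
    public decimal GetPrice(int productId)
    {
        // database access would go here
        return 0m;
    }
}

// Business logic lives in the middle-tier and only knows about the interface.
public class PricingService
{
    private readonly IPriceRepository _repository;

    public PricingService(IPriceRepository repository)
    {
        _repository = repository;
    }

    public decimal PriceWithTax(int productId)
    {
        return _repository.GetPrice(productId) * 1.08m;
    }
}

public static class CompositionRoot
{
    public static IContainer Build()
    {
        var builder = new ContainerBuilder();
        builder.RegisterType<SqlPriceRepository>().As<IPriceRepository>();
        builder.RegisterType<PricingService>();
        return builder.Build();
    }
}

In a unit test, PricingService can be constructed with a fake IPriceRepository, so no database is needed to verify the business logic.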

The connection between the front-end and back-end is usually an API connection using REST.  You can pass data back and forth between these two layers using JSON or XML via “GET”, “POST”, “PUT” and “DELETE” operations.  If you treat your front-end as a system that communicates with another system called your back-end, you’ll have a successful implementation.  You’ll still have hardware challenges (like network bandwidth and server instances), but those can be solved much more quickly and cheaply than rewriting software.
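
Here is a minimal sketch of what the back-end side of that connection might look like as an ASP.NET Core controller.  The route, model and in-memory store are made up for illustration; a real back-end would call into the business layer instead of a static dictionary.

using System.Collections.Generic;
using Microsoft.AspNetCore.Mvc;

// Hypothetical data contract that travels between the layers as JSON.
public class OrderModel
{
    public int Id { get; set; }
    public string Description { get; set; }
}

[Route("api/orders")]
public class OrdersController : Controller
{
    // Stand-in for the business/data layers so the example is self-contained.
    private static readonly Dictionary<int, OrderModel> Store = new Dictionary<int, OrderModel>();

    [HttpGet("{id}")]
    public IActionResult Get(int id)
    {
        return Store.TryGetValue(id, out var order) ? Ok(order) : (IActionResult)NotFound();
    }

    [HttpPost]
    public IActionResult Post([FromBody] OrderModel model)
    {
        Store[model.Id] = model;
        return CreatedAtAction(nameof(Get), new { id = model.Id }, model);
    }
}

The front-end never needs to know what sits behind those two methods; it just performs GET and POST requests and works with the JSON that comes back.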

The connection between the back-end and database has another purpose.  Your goal should be to make sure your back-end is database technology independent as much as possible.  You want the option of switching to a database with cheap licensing costs.  If you work hard up-front, you’ll get a pay-back down the road when your company expands to a respectable size and the database licensing cost starts to look ugly.

What About APIs?

The above diagram looks like a monolithic program at first glance.  If you follow the rules I already laid out, you’ll end up with one large monolithic program.  So there’s one more level of separation you must be aware of.  You need to logically divide your system into independent APIs.  You can split your system into a handful of large APIs or hundreds of smaller APIs.  It’s better to build a lot of smaller APIs, but that can depend on what type of system is being built and how many logical boxes you can divide it into.  Here’s an example of a very simple system divided into APIs:

This is not a typical way to divide your APIs.  Typically, an API can share a database with another API and the front-end can be separate from the API itself.  For now, let’s talk about the advantages of this design as I’ve shown.

  1. Each section of your system is independent.  If a user decides to consume a lot of resources by executing a long-running task, it won’t affect any other section of your system.  You can contain the resource problem.  In the monolithic design, any long-running process will kill the entire system and all users will experience the slow-down.
  2. If one section of your system requires heavy resources, then you can allocate more resources for that one section and leave all other sections the same.  In other words, you can expand one API to be hosted by multiple servers, while other APIs are each on one server.
  3. Deployment is easy.  You only deploy the APIs that are updated.  If your front-end is well isolated, then you can deploy a back-end piece without the need for deployment of your front-end.
  4. Your technology can be mixed.  You can use different database technologies for each section.  You can use different programming languages for each back-end or different frameworks for each front-end.  This also means that you have a means to convert some or all of your system to a Unix-hosted system.  A new API can be built using Python or PHP and that API can be hosted on a Linux virtual box.  Notice that neither the front-end nor the database should require a redesign; only the back-end software for one subsection of your system changes.

Converting a Legacy System

If you have a legacy system built around the monolithic design pattern, you’re going to want to take steps as soon as possible to get into a three-tier architecture.  You’ll also want to build any new parts using an API design pattern.  Usually it takes multiple iterations to remove the stored procedures and replace the code with decoupled front-end and back-end code.  You’ll probably start with something like this:

In this diagram the database is shared between the new API and the legacy system, which is still just a monolithic program.  Notice how stored procedures are avoided by the API on the right side.  All the business logic must be contained in the back-end so it can be unit tested.  Eventually, you’ll end up with something like this:

Some of your APIs can have their own data while others rely on the main database.  The monolithic section of your system should start to shrink.  The number of stored procedures should shrink.  This system is already easier to maintain than the complete monolithic system.  You’re still saddled with the monolithic section and the gigantic database with stored procedures.  However, you now have sections of your system that are independent and easy to maintain and deploy.  Another possibility is to do this:

In this instance the front-end is consistent.  One framework with common JavaScript can be contained as a single piece of your system.  This is OK because your front-end should not contain any business logic.

The Front-End

I need to explain a little more about the front-end that many programmers are not aware of.  Your system design goal for the front-end is to assume that your company will grow so large that you’re going to have front-end specialists.  These people should be artists who work with HTML, CSS and other front-end languages.  The front-end designer is concerned with usability and aesthetics.  The back-end designer is concerned about accuracy and speed.  These are two different skill sets.  The front-end person should be more of a graphic designer while the back-end person should be a programmer with knowledge of scalability and system performance.  Small companies will hire a programmer to perform both tasks, but a large company must begin to divide their personnel into distinct skill-sets to maximize the quality of their product.

Another overlooked aspect of the front-end is that it is going to become stale.  Somewhere down the road your front-end is going to be ugly compared to the competition.  If your front-end code is nothing more than HTML, CSS, JavaScript and maybe some frameworks, you can change the look and feel of the user interface with minimum disruption.  If you have HTML and JavaScript mixed into your business logic, you’ve got an uphill battle to try and upgrade the look and feel of your system.

The Back-End

When you connect to a database the common and simple method is to use something like ODBC or ADO.  Then SQL statements are sent as strings with parameters to the database directly.  There are many issues with this approach and the current solution is to use an ORM like Entity Framework, NHibernate or even Dapper.  Here’s a list of the advantages of an ORM:

  1. The queries are in LINQ and most errors can be found at compile time.
  2. The context can be easily changed to point to another database technology.
  3. If you include mappings that match your database, you can detect many database problems at compile time, like child-to-parent relationship issues (attempting to insert a child record with no parent).
  4. An ORM can break dependency with the database and provide an easy method of unit testing.
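
As a sketch of what that looks like in practice, here is a minimal Entity Framework Core example, assuming the Microsoft.EntityFrameworkCore.SqlServer provider package.  The Customer class, table name and connection string are invented for illustration.

using System.Linq;
using Microsoft.EntityFrameworkCore;

// Hypothetical POCO: a plain class with no database-specific attributes.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class ShopContext : DbContext
{
    public DbSet<Customer> Customers { get; set; }

    protected override void OnConfiguring(DbContextOptionsBuilder options)
    {
        // Swapping UseSqlServer for another provider (SQLite, PostgreSQL, etc.) is a
        // one-line change here, which is the database independence argued for above.
        options.UseSqlServer("Server=.;Database=Shop;Trusted_Connection=True;");
    }

    protected override void OnModelCreating(ModelBuilder builder)
    {
        // Fluent mappings that match the database, so mismatches surface early.
        builder.Entity<Customer>().ToTable("Customer").HasKey(c => c.Id);
        builder.Entity<Customer>().Property(c => c.Name).HasMaxLength(100).IsRequired();
    }
}

public class Example
{
    public static void Run()
    {
        using (var db = new ShopContext())
        {
            // LINQ instead of SQL strings: a typo in a property name is a compile error.
            var names = db.Customers
                .Where(c => c.Name.StartsWith("A"))
                .Select(c => c.Name)
                .ToList();
        }
    }
}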

As I mentioned earlier, you must avoid stored procedures, functions and any other database-technology-specific features.  Don’t back yourself into a corner just because MS SQL Server has a feature that makes an enhancement easy.  If your system is built around a set of stored procedures, you’ll be in trouble if you want to switch from MS SQL to MySQL, or from MS SQL to Oracle.

Summary

I’m hoping that this blog post is read by a lot of entry-level programmers.  You might have seen the three-tier architecture mentioned in your school book or on a website and didn’t realize what it was all about.  Many articles get into the technical details of how to implement a three-tier architecture using C# or some other language, glossing over the big picture of “why” it’s done this way.  Be aware that there are also other multi-tier architectures that can be employed.  Which technique you use doesn’t really matter as long as you know why it’s done that way.  When you build a real system, you have to be aware of what the implications of your design are going to be five or ten years from now.  If you’re just going to write some code and put it into production, you’ll run into a brick wall before long.  Keep these techniques in mind when you’re building your first system.  It will pay dividends down the road when you can enhance your software just by modifying a small API and tweaking some front-end code.

 

Dear Computer Science Majors…

Introduction

It has been a while since I wrote a blog post directed at newly minted Computer Science Majors.  In fact, the last time I wrote one of these articles was in 2014.  So I’m going to give all of you shiny-new Computer Scientists a leg-up in the working world by telling you some inside information about what companies need from you.  If you read through my previous blog post (click here) and read through this post, you’ll be ahead of the pack when you submit your resume for that first career-starting job.

Purpose of this Post

First of all, I’m going to tell you my motivation for creating these posts.  In other words: “What’s in it for Frank.”  I’ve been programming since 1978 and I’ve been employed as a software engineer/developer since 1994.  One of my tasks as a seasoned developer is to review submitted programming tests, create programming tests, read resumes, submit recommendations, interview potential developers, etc.  By examining the code submitted for a programming test, I can tell a lot about the person applying for a job.  I can tell how sophisticated they are.  I can tell if they are just faking it (i.e. they just Googled results, made it work and don’t really understand what they are doing).  I can tell if the person is interested in the job or not.  One of the trends that I see is that there is a large gap between what is taught in colleges and what is needed by a company.  That gap has been increasing for years.  I would like to close the gap, but it’s a monstrous job.  So YOU, the person reading this blog who really wants a good job, must do a little bit of your own leg-work.  Do I have your attention?  Then read on…

What YOU Need to Do

First, go to my previous blog post on this subject and take notes on the following sections:

  • Practice
  • Web Presence

For those who are still in school and will not graduate for a few more semesters, start doing this now:

  • Programming competitions
  • Resume Workshops
  • Did I mention: Web Presence?

You’ll need to decide which track you’re going to follow and try to gain deep knowledge in that area.  Don’t go out and learn a hundred different frameworks, twenty databases and two dozen in-vogue languages.  Stick to something that is in demand and narrow your expertise to a level where you can gain useful knowledge.  Ultimately you’ll need to understand a subject well enough to make something work.  You’re not going to be an expert; that takes years of practice and a few failures.  If you can learn a subject well enough to speak about it, then you’re light-years ahead of the average newly minted BS graduate.

Now it’s time for the specifics.  You need to decide if you’re going to be a Unix person or a .Net person.  I’ve done both and you can cross over.  It’s not easy to cross over, but I’m proof that it can happen.  If you survive and somehow end up programming as long as I have, then you’ll have experience with both.  Your experience will not be even between the two sides.  It will be weighted toward one end or the other.  In my case, my experience is weighted toward .Net because that is the technology that I have been working on more recently.

If you’re in the Unix track, I’m probably not the subject expert on which technologies you need to follow.  Python, Ruby, which frameworks, unit testing — you’ll need to read up and figure out what is in demand.  I would scan job sites such as Glassdoor, Indeed, LinkedIn, Stack Exchange or any other job sites just to see what is in demand.  Look for entry-level software developer positions.  Ignore the pay or what they are required to do and just take a quick tally of how many companies are asking for Python, PHP, Ruby, etc.  Then focus on some of those.

If you’re in the .Net track, I can tell you exactly what you need to get a great-paying job.  First, you’re going to need to learn C#.  That is THE language of .Net.  Don’t let anybody tell you otherwise.  Your college taught you Java?  No problem, your language knowledge is already 99% there.  Go to Microsoft’s website and download the free version of Visual Studio (the Community edition) and install it.  Next, you’ll need a database and that is going to be MS SQL Server.  Don’t bother with MS Access.  There is a free version of SQL Server as well.  In fact, the Developer edition is fully functional, but you probably don’t need to download and install that.  When you install Visual Studio, the Express edition of SQL Server is normally installed with it.  You can gain real database knowledge from that version.

Follow this list:

  • Install Visual Studio Community.
  • Check for a pre-installed version of MS SQL Server Express.
  • Go out and sign up for a GitHub account.  Go ahead, I’ll wait (click here).
  • Download and install SourceTree (click here).

Now you have the minimum tools to build your knowledge.  Here’s a list of what you need to learn, using those tools:

  • How to program in C# using a simple console application.
  • How to create simple unit tests.
  • Create an MVC website, starting with the template site.
  • How to create tables in MS SQL Server.
  • How to insert, delete, update and select data in MS SQL Server.
  • How to create POCOs, fluent mappings and a database context in C#.
  • How to troubleshoot a website or API (learn some basic IIS knowledge).
  • How to create a repository on GitHub.
  • How to check-in your code to GitHub using SourceTree.
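
To make the first two items on that list concrete, here is roughly what they look like in practice.  The class and test below are invented for illustration; the test would live in a separate xUnit project (the “dotnet new xunit” template), or an MSTest project if you prefer.

// Program.cs - a simple console application with one piece of logic worth testing.
using System;

public static class TemperatureConverter
{
    public static double FahrenheitToCelsius(double fahrenheit)
    {
        return (fahrenheit - 32.0) * 5.0 / 9.0;
    }
}

public class Program
{
    public static void Main(string[] args)
    {
        Console.WriteLine(TemperatureConverter.FahrenheitToCelsius(98.6));
    }
}

// TemperatureConverterTests.cs - in a separate xUnit test project.
using Xunit;

public class TemperatureConverterTests
{
    [Fact]
    public void FreezingPointConvertsToZero()
    {
        Assert.Equal(0.0, TemperatureConverter.FahrenheitToCelsius(32.0), 5);
    }
}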

That would pretty much do it.  The list above will take about a month of easy work or maybe a hard-driven weekend.  If you can perform these tasks and talk intelligently about them, you’ll have the ability to walk into a job.  In order to seal the deal, you’ll have to make sure this information is correctly presented on your resume.  So how should you do that?

First, make sure you polish your projects and remove any commented-out code and any unused or dead code.  If there are tricky areas, put in some comments.  Make sure you update your “Read-Me” file on GitHub for each of your projects.  Put your GitHub URL near the top of your resume.  If I see a programmer with a URL to a GitHub account, that programmer has already earned some points on my informal scale of who gets the job.  I usually stop reading the resume and go right to the GitHub account and browse their software.  If you work on a project for some time, you can check in your changes as you progress.  This is nice for me to look at, because I can see how much effort you are putting into your software.  If I check the history and I see the first check-in was just a blank solution followed by several check-ins that show the code being refactored and re-worked into a final project, I’m going to be impressed.  That tells me that you’re conscientious enough to know to get your code checked in and protected from loss immediately.  Don’t wait for the final release.  Building software is a lot like producing sausage.  The process is messy, but the final product is good (assuming you like sausage).

If you really want to impress me, and by extension any seasoned programmer, create a technical blog.  Your blog can be somewhat informal, but you need to make sure you express your knowledge of the subject.  A blog can be used as a tool to secure a job.  It doesn’t have to get a million hits a day to be successful.  In fact, if your blog only receives hits from companies that are reading your resume, it’s a success.  You see, the problem with the resume is that it doesn’t allow me to see into your head.  It’s just a sheet of paper (or more if you have a job history) with the bare minimum information on it.  It’s usually just to get you past the HR department.  In the “olden” days, when resumes were mailed with a cover letter, the rule was one page.  Managers would not have time to read novels, so they wanted the potential employee to narrow down their knowledge to one page.  Sort of a summary of who you are in the working world.  This piece of paper is compared against a dozen or hundreds of other single-page resumes to determine which handful of people would be called in to be interviewed.  Interviews take a lot of physical time, so the resume reading needs to be quick.  That has changed over the years and the rules don’t apply to the software industry as a whole.  Even though technical resumes can go on for two or more pages, the one-page resume still applies for new graduates.  If you are sending in a resume that I might pick up and read, I don’t want to see that you worked at a Walmart check-out counter for three years, followed by a gig at the car wash.  If you had an intern job at a tech company where you got some hands-on programming experience, I want to see that.  If you got an internship with Google filing their paperwork, I don’t care.

Back to the blog.  What would I blog about if I wanted to impress a seasoned programmer?  Just blog about your experience with the projects you are working on.  It can be as simple as “My First MVC Project”, with a diary format like this:

Day 1, I created a template MVC project and started digging around.  Next I modified some text in the “View” to see what would happen.  Then I started experimenting with the ViewBag object.  That’s an interesting little object.  It allowed me to pass data between the controller and the view.

And so on…
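
If you want to show a snippet with an entry like that, something this small is enough.  This assumes the default ASP.NET Core MVC template; the controller comes from that template and the message text is made up:

// HomeController.cs - passing data from the controller to the view with ViewBag.
using Microsoft.AspNetCore.Mvc;

public class HomeController : Controller
{
    public IActionResult Index()
    {
        ViewBag.Message = "Hello from the controller";  // dynamic property, no separate model class needed
        return View();
    }
}

// Index.cshtml - the view reads the same property:
// <h2>@ViewBag.Message</h2>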

Show that you did some research on the subject.  Expand your knowledge by adding a feature to your application.  Minor features like search, column sorting and page indexing are important.  They demonstrate that you can take an existing program and extend it to do more.  When you enter the working world, 99% of what you will create will be a feature added to code you never wrote.  If your blog demonstrates that you can extend existing code, even code you wrote yourself, I’ll be impressed.

Taking the Test

Somewhere down the line, you’re going to generate some interest.  There will be a company out there that will want to start the process and the next step is the test.  Most companies require a programming test.  At this point in my career the programming test is just an annoyance.  Let’s call it a formality.  As a new and inexperienced programmer, the test is a must.  It will be used to determine if you’re worth someone’s time to interview.  Now I’ve taken many programming tests and I’ve been involved in designing and testing many different types of programming tests.  The first thing you need to realize is that different companies have different ideas about what to test.  If it were up to me, I would want to test your problem-solving skills.  Unfortunately, it’s difficult to test for that skill without forcing you to take some sort of test that may ask for knowledge in a subject that you don’t have.  I’ve seen tests that allow the potential hire to use any language they want.  I’ve also seen tests that give very loose specifications and are rated according to how creative the solution is.  So here are some pointers for passing the test:

  • If it’s a timed test, try to educate yourself on the subjects you know will be on the test before it starts.
  • If it’s not a timed test, spend extra time on it.  Make it look like you spent some time to get it right.
  • Keep your code clean.
    • No “TODO” comments.
    • No commented code or dead-code.
    • Don’t leave code that is not used (another description for dead-code).
    • Follow the naming convention standards, no cryptic variable names (click here or here for examples).
  • If there is extra credit, do it.  The dirty secret is that this is a trick to see if you’re going to be the person who does just the minimum, or you go the extra mile.
  • Don’t get too fancy.
    • Don’t show off your knowledge by manually coding a B-tree structure instead of using the built-in “.Sort()” method.
    • Don’t perform something that is obscure just to look clever.
    • Keep your program as small as possible.
    • Don’t add any “extra” features that are not called for in the specification (unless the instructions specifically tell you to be creative).

When you’re a student in college, you are required to analyze algorithms and decide which is more efficient in terms of memory use and CPU speed.  In the working world, you are required to build a product that must be delivered in a timely manner.  Does it matter if you use the fastest algorithm?  It might not really matter.  It will not make a difference if you can’t deliver a working product on time just because you spent a large amount of your development time on a section of code that is only built for the purpose of making the product a fraction faster.  Many companies will need a product delivered that works.  Code can be enhanced later.  Keep that in mind when you’re taking your programming test.  Your program should be easy to follow so another programmer can quickly enhance it or repair bugs.

The Interview

For your interview, keep it simple.  You should study up on general terms, in case you’re asked.  Make sure you understand these terms:

  • Dependency injection
  • Polymorphism
  • Encapsulation
  • Single purpose
  • Model/View/Controller
  • REST
  • Base Class
  • Private/public methods/classes
  • Getters/Setters
  • Interface
  • Method overloading

Here’s a great place to do your study (click here).  These are very basic concepts and you should have learned them in one of your object oriented programming classes.  Just make sure you haven’t forgotten about them.  Make sure you understand the concepts that you learned from any projects that you checked into GitHub.  If you learned some unit testing, study the terms.  Don’t try to act like an expert for your first interview.  Just admit the knowledge that you have.  If I interview you and you have nothing more than a simple understanding of unit testing, I’m OK with that.  All it means is that there is a base-line of knowledge that you can build on with my help.

Wear a suit, unless explicitly specified that you don’t need one.  At a minimum, you need to dress one step better than the company dress policy.  I’m one of the few who can walk into an interview with red shoes, jeans and a technology T-shirt and get a job.  Even though I can get away with such a crazy stunt, I usually show up in a really nice suit.  To be honest, I only show up in rags when I’m lukewarm about a job and I expect to be wooed.  If I really like the company, I look sharp.  The interviewers can tell me to take off my tie if they think I’m too stuffy.  If you’re interviewing for your first job, wear a business suit.

Don’t BS your way through the interview.  If you don’t know something, just admit it.  I ask all kinds of questions to potential new hires just to see “if” by chance they know a subject.  I don’t necessarily expect the person to know the subject and it will not have a lot of bearing on the acceptance or rejection of the person interviewing.  Sometimes I do it just to find out what their personality is like.  If you admit that you know SQL and how to write a query, I’m going to hand you a dry-erase marker and make you write a query to join two tables together.  If you pass that, I’m going to give you a hint that I want all records from the parent table to show up even if they don’t have child records.  If you don’t know how to do a left-outer join, I’m not going to hold it against you.  If you are able to write a correct or almost correct left join, I’ll be impressed.  If you start performing a union query or try to fake it with a wild guess, I’ll know you don’t know.  I don’t want you to get a lucky guess.  I’m just trying to find out how much I’m going to have to teach you after you’re hired.  Don’t assume that another candidate is going to get the job over you just because they know how to do a left outer join.  That other candidate might not impress me in other ways that are more important.  Just do the best you can and be honest about it.
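
For reference, here is the same left-outer-join idea expressed in LINQ, since C# is the language used elsewhere in this post; the parent/child data is invented.  On the whiteboard it would be a LEFT OUTER JOIN between the two tables.

using System;
using System.Linq;

public class Program
{
    public static void Main()
    {
        var parents = new[] { new { Id = 1, Name = "Acme" }, new { Id = 2, Name = "Globex" } };
        var children = new[] { new { ParentId = 1, Item = "Anvil" } };

        // join ... into + DefaultIfEmpty() is LINQ's left outer join:
        // every parent appears, even the ones with no child records.
        var rows = from p in parents
                   join c in children on p.Id equals c.ParentId into pc
                   from c in pc.DefaultIfEmpty()
                   select new { p.Name, Item = c == null ? "(none)" : c.Item };

        foreach (var row in rows)
        {
            Console.WriteLine($"{row.Name} - {row.Item}");
        }
    }
}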

Don’t worry about being nervous.  I’m still nervous when I go in for an interview and I really have no reason to be.  It’s natural.  Don’t be insulted if the interviewer dismisses you because they don’t think you’ll be a fit for their company.  You might be a fit and their interview process is lousy.  Of course, you might not be a fit.  The interviewer knows the company culture and they know the type of personality they are looking for.  There are no hard-and-fast rules for what an interviewer is looking for.  Every person who performs an interview is different.  Every company is different.

What Does a Career in Software Engineering Look Like?

This is where I adjust your working world expectations.  This will give you a leg-up on what you should focus on as you work your first job and gain experience.  Here’s the general list:

  • Keep up on the technologies.
  • Always strive to improve your skills.
  • Don’t be afraid of a technology you’ve never encountered.

Eventually you’re going to get that first job.  You’ll get comfortable with the work environment and you’ll be so good at understanding the software you’ve been working on that you won’t realize the world of computer programming has gone off on a different track.  You won’t be able to keep up on everything, but you should be able to recognize a paradigm shift when it happens.  Read.  Read a lot.  I’m talking about blogs, tech articles, whatever interests you.  If your company is having issues with the design process, do some research.  I learned unit testing because my company at the time had a software quality issue from the lack of regression testing.  The company was small and we didn’t have QA people to perform manual regression testing, so bugs kept appearing in subsystems that were not under construction.  Unit testing solved that problem.  It was difficult to learn how to do unit testing correctly.  It was difficult to apply unit testing after the software was already built.  It was difficult to break the dependencies that were created from years of adding enhancements to the company software.  Ultimately, the software was never 100% unit tested (if I remember correctly, it was around 10% when I left the company), but the unit tests that were applied had a positive effect.  When unit tests are used while the software is being developed, they are very effective.  Now that the IOC container is main-stream, dependencies are easy to break and unit tests are second nature.  Don’t get complacent about your knowledge.  I have recently interviewed individuals who have little to no unit testing experience and they have worked in the software field for years.  Now they have to play catch-up, because unit testing is a requirement, not an option.  Any company not unit testing their software is headed for bankruptcy.

APIs are another paradigm.  This falls under system architecture paradigms like SOA and Microservices.  The monolithic application is dying a long and slow death.  Good riddance.  Large applications are difficult to maintain.  They are slow to deploy.  Dependencies are usually everywhere.  Breaking a system into smaller chunks (called APIs) can ease the deployment and maintenance of your software.  This shift from monolithic design to APIs started to occur years ago.  I’m still stunned at the number of programmers that have zero knowledge of the subject.  If you’ve read my blog, you’ll know that I’m a big fan of APIs.  I have a lot of experience designing, debugging and deploying APIs.

I hope I was able to help you out.  I want to see more applicants who are qualified to work in the industry.  There’s a shortage of software developers who can do the job and that problem is getting worse every year.  The job market for seasoned developers is really good, but the working world is tough because there is a serious shortage of knowledgeable programmers.  Every company I’ve worked for has a difficult time filling a software developer position and I don’t see that changing any time in the near future.  That doesn’t mean that there is a shortage of Computer Science graduates each year.  What it means is that there are still too many people graduating with a degree who just don’t measure up.  Don’t be that person.

Now get started on that blog!

 

 

 

 

Computing PI

Today is March 14th, which means that it’s PI day!  So I’m going to talk about computing the number PI.

Years ago I wanted to use my computer to calculate the number PI, and I assumed that it was just a matter of getting a formula for PI and running through a loop that computed each digit for as many digits as I wanted.  Nope.  Not that easy.  The problem is that PI is not computed digit by digit from left to right; it’s computed with a formula, like everything else in this world, and you must create numbers and math functions that work with as many digits as you want to represent PI in.  For instance: if you wanted 1 million digits of PI, you need to be able to handle 1-million-digit numbers.  This includes the math functions such as add, multiply, subtract and whatever other functions you’ll need to compute PI.
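
To make that concrete, here is a minimal sketch that treats System.Numerics.BigInteger values as scaled fixed-point numbers and uses Machin’s formula to produce 1,000 digits.  This is not the algorithm from the article mentioned below (that one converges much faster); it is only meant to show the idea of doing the arithmetic with huge integers.

using System;
using System.Numerics;

class PiMachin
{
    // arctan(1/x) scaled by "unity", using the Taylor series 1/x - 1/(3x^3) + 1/(5x^5) - ...
    static BigInteger ArcTanInverse(int x, BigInteger unity)
    {
        BigInteger term = unity / x;   // first term: 1/x
        BigInteger sum = term;
        int n = 1;
        while (term != 0)
        {
            term /= (x * x);                              // next odd power of 1/x
            sum += (n % 2 == 0) ? term / (2 * n + 1)      // alternate the series signs
                                : -(term / (2 * n + 1));
            n++;
        }
        return sum;
    }

    static void Main()
    {
        int digits = 1000;                                  // small enough to finish quickly
        BigInteger unity = BigInteger.Pow(10, digits + 10); // ten guard digits for rounding

        // Machin's formula: pi = 16*arctan(1/5) - 4*arctan(1/239)
        BigInteger pi = 16 * ArcTanInverse(5, unity) - 4 * ArcTanInverse(239, unity);

        string text = pi.ToString();
        Console.WriteLine("3." + text.Substring(1, digits));
    }
}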

Recently I rediscovered the method of calculating PI and I stumbled across this article: How to calculate 1 million digits of pi.  What’s nice about this article is that it discusses the method of computing PI using C#.  Dot Net includes an arbitrary-precision number type called System.Numerics.BigInteger that can handle any number of digits, and that’s what is used in the article.  I copied the code, compiled it and ran it for different sizes of PI, which computed in the following times:

5000 digits = 0.05 seconds

10,000 digits = 0.2 seconds

100,000 digits = 17.748 seconds

1,000,000 digits = 31.5 minutes

Next I wanted to know how long it would take to compute PI to 10 million or 100 million digits.  So I plotted all my time estimates onto a graph in Excel and performed a curve fit (using the power curve):

I purposely left the 1,000,000-digit measurement out of the fit and set the “Forecast” to 1,000,000 to see if it came out close to 31.5 minutes.  As you can see from the above diagram, the number of seconds is about 1900, which is 31.66 minutes.  The curve-fitting formula is: y = 3E-09 * x^1.9495.

Now, for:

10,000,000 digits = 36 hours

100,000,000 digits = 136 days

1 billion digits = 33 years
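
If you want to reproduce those estimates, plugging the fitted curve into a few lines of C# gives the same ballpark numbers:

using System;

public class FitCheck
{
    // Fitted power curve from the Excel chart: seconds as a function of digits.
    static double Fit(double digits)
    {
        return 3e-9 * Math.Pow(digits, 1.9495);
    }

    public static void Main()
    {
        Console.WriteLine(Fit(10000000) / 3600.0);             // roughly 37 hours
        Console.WriteLine(Fit(100000000) / 86400.0);           // roughly 137 days
        Console.WriteLine(Fit(1000000000) / 86400.0 / 365.0);  // roughly 33 years
    }
}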

Oh what fun!  33 years!  I’m pretty sure I’ll have a faster computer long before that time is up.  I can’t even imagine running a personal computer 136 days straight.  I would have to “enhance” the program so that it can save intermediate values to the hard drive every once in a while so I can switch it off if I needed to, or to recover if the machine crashed or lost power.  Anyway, here’s 1 million digits of PI: pi_one_million_digits

 

Working on the Blog

I’m working on the blog today.  This is a bit different from actual “blogging.”  This task is a bit more like “installation and repair” and not so much “writing and publishing.”

I recently switched from Blogspot to WordPress.  Now I feel like a customer.  I know what it feels like to be a customer, because I’ve purchased and installed software and hardware for years.  I’ve been in the industry long enough to have upgraded products many times over and sometimes I go through the bad experience of trading one “problem” for another.  In the case of Blogger, the text editor is hideous.  The next irritation is the lack of built-in themes.  I’ve gone through the pain of attempting to install a 3rd-party theme into Blogger and that’s not always a successful endeavor.  Eventually, I decided that the pain of switching to another technology was less than the pain of staying and putting up with the problems.  Especially since Google has not upgraded their blog engine since I started using it (it has the feel of a dead or abandoned product).  I understand.  It’s free and it’s not Google’s core business.

I found a few blogging sites that were interesting, but they were expensive.  Then I realized that I could use WordPress on my own website host (GoDaddy .Net host if you’re wondering).  My hosting package is the unlimited plan so I can upload as many pictures as I want.  I figured, if I’m already paying for this, I should use more of it.  So I made the plunge.

I found a WordPress theme that was very clean and I found a plug-in that performs syntax highlighting.  The highlighting plug-in does not produce the correct colors for C#, but I think I can tweak the CSS when I get some time.  I also discovered that I can import Blogger formatted export files.  I imported all of my posts from Blogger.  I have been going through these and cleaning up the formatting snafus (like huge blank spaces, teeny-tiny text, etc.).  I’ll release these posts as I get time to clean them up.  My Blogspot site will remain active forever (unless Google gets frugal and decides to remove it).

Then the shininess started to wear off.  I like WordPress for my blog.  I’ll be staying right here, but I still have a few issues to resolve.  I have some minor issues like the syntax highlighting.  I finally found a plug-in to handle the email subscriptions.  For those who are looking for that feature, you’ll see it in the comments section (to see comments, click the blog header, scroll to the bottom, then put your cursor in the comment text box to see the subscription check boxes) or at the bottom of the page.  Next, I’ll need to install a backup package.  Then there’s Analytics…

I’ve used Google Analytics for years.  I’ve written APIs to pull massive amounts of data from Analytics, so I know how it’s all structured.  I just added another Analytics tracking property setting and obtained another ID from my current account.  Then I added the tracking JS to the header of my WordPress site.  That gets things started, but it’s not quite what I need.  At least I have an idea of the total traffic going to my URL (which is blog.frankdecaire.com).  What I don’t have is the kind of tracking data that Blogspot has.  Not that Blogspot had a lot of data, because it was lacking as well.  One of my missing metrics is how many people are hitting each blog article, but there are WordPress plug-ins that I might try (though I’m not sure they are geared for a blog setup).  I can also tweak the JS to insert some meta-data into the Analytics tracking code so I can sort by blog post.

Added Features

So far today, I added a plug-in called Jetpack.  This seems like a very nice plug-in, though technically I just installed it and have only looked at a few features.  There’s a feature to click a “like” button.  I use Facebook just like most people do, so I like that feature (no pun intended).  Please feel free to click the like button if you enjoyed reading one of my blog posts.  The number of likes on a post can drive the subjects that I steer towards, and it’s a bit more relevant than the Google Analytics report of how many people landed on a subject from a search engine.  Plus, you might come to my blog from a saved link and scroll down to a subject you really like, and I don’t get that information in Analytics.

The Jetpack plug-in also has the subscription check boxes.  I have been asked about my email subscription controls by people at my workplace and also a person who commented on this blog.  As I mentioned above, the comments only show up if you are in the article.  So click on the article title, then scroll to the bottom.  When you put your cursor in the text of the comment, two check boxes will appear just below the comment box.  One check box allows you to receive emails when posts are created or updated.  The other allows you to follow updates to the current comments (so you can leave a comment and get a notification when someone replies to you).  You’ll receive a confirmation email when you subscribe, so keep an eye out for that.  You need to leave a comment in order for the check-box to work.

There is also a subscribe section that appears at the very bottom where you can type in your email address and subscribe to the blog:

blog_subscribe

Now I can check that task off my list!

I have provided sharing links.  Two sets of sharing links, to be exact: one is used by Jetpack, and the other is from another plug-in called “Social,” which has more share sites available:

social_buttons

I have not tested either one of these plug-ins, so if you have issues, please leave me a comment or send me an email.  I’m not sure how important these are for my followers.

There’s an RSS feed link available.  I provide that primarily for myself.  My website uses the RSS feed to display the top posts:

latest_blog_posts

I wrote some C# code to look for the blog titles:

// Required namespaces (at the top of the file):
// using System.Collections.Generic;
// using System.ServiceModel.Syndication;
// using System.Xml.Linq;

List<BlogPostModel> latestBlogPosts = new List<BlogPostModel>();

string url = "http://blog.frankdecaire.com/feed";
XDocument feed = XDocument.Load(url);

// Load the WordPress RSS feed and pull out each post's title and link.
SyndicationFeed sf = SyndicationFeed.Load(feed.CreateReader());
if (sf != null)
{
    int totalPosts = 0;
    foreach (SyndicationItem si in sf.Items)
    {
        latestBlogPosts.Add(new BlogPostModel
        {
            Title = si.Title.Text,
            Url = si.Links[0].Uri.AbsoluteUri.Replace("#comment-form", "")
        });

        totalPosts++;

        // Only keep the most recent handful of posts.
        if (totalPosts > 8)
        {
            break;
        }
    }
}

I pass the results to the view in a ViewBag variable and display them there (in case you’re interested in the technical details of this “miraculous” piece of code… LOL).
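
In case you want to wire up something similar, here’s a minimal sketch of the rest of the plumbing.  The BlogPostModel shape is inferred from the snippet above, and the controller action and GetLatestBlogPosts helper names are just placeholders I made up:

// A simple model matching the properties used in the feed-reading code.
public class BlogPostModel
{
    public string Title { get; set; }
    public string Url { get; set; }
}

// Hypothetical MVC controller action: put the list in the ViewBag
// so the view can render each Title as a link to its Url.
public ActionResult Index()
{
    List<BlogPostModel> latestBlogPosts = GetLatestBlogPosts();  // the feed-reading code shown above
    ViewBag.LatestBlogPosts = latestBlogPosts;
    return View();
}

The view just loops over ViewBag.LatestBlogPosts and writes out a link for each post.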

Pros and Cons

One of the pros of maintaining a blog on Blogspot is that Google maintains the data forever.  I’m hosting this blog myself.  What that means is that if my financial situation goes South for any reason, or something happens to me (like I win the lottery and move to the beach), the information on Blogspot would remain while this blog would not.  My GoDaddy hosting would expire if I didn’t pay the annual fee (though technically, I pay for multiple years to save money).  I know that it is very annoying to find the right subject in a Google search only to arrive at a broken link.  So the minimum requirement is that I maintain my domain names.  This allows me to move to any other host and the search engine links will always be valid.

Another pro is that using Blogspot means I don’t have to worry about backups, upgrades, installations, etc.  I currently have a WordPress update and a couple of plug-in updates that are bugging me.  I can’t do the WordPress update until I have my backup plug-in working (next on my list).

I have listed a few Cons of Blogspot above.  Many of the Cons are due to the fact that I have less control over the blog itself when I’m using Blogspot.  Let’s face it, Blogspot was created for the masses.  Individuals with little to no computer knowledge can use Blogspot right out of the box without knowing what the word “host” means (that was actually appealing to me too).  I, however, have no such excuse.  If I didn’t know how to set up a host, WordPress or any of this technology, then I would have no credibility in writing blog posts under the subjects that I write about.  I have maintained a website and host since the mid-’90s.

If you’re setting up a new WordPress blog and you’re a technical guru like me, then I would recommend the following blog article for customizing: How to Customize WordPress (Step-by-Step).  I found it to be very helpful.  There is a lot of information on the subject.

Purpose of my Blog

I’m sure I’ve talked about this before, but I’ll describe my purpose in blogging again.  Initially, I intended to use the blog format to capture information that I learned about subjects for which I struggled to find a solution.  I figured, if I did it before, then I can always look up the information in my own blog.  It’ll be on-line, so I don’t have to worry about hauling around a thumb-drive or notepad of my notes.  It took me a couple of years to overcome my fear of posting something that might not be “perfect.”  I finally took the plunge and it was a bit of a rocky start.  Getting acclimated to creating, editing and posting took a lot of practice.  Now I know what writers mean when they talk about writing a lot of short stories just to get used to writing.  Blogging works the same way.

As of this post, I’ve been blogging since April of 2012, which is almost 5 years.  Some years have been better than others as you can tell if you look at the total posts for each month or year from my old blog:

blog_archive

I anticipate that this year will be a heavy posting year again (this will be my 5th post!).  I still use my blog to capture subjects that I might use as reference, but I also enjoy sharing information that is difficult to find on the web.  My most recent subject and resurrected hobby involves digital logic and home-brew computers.  In case you’ve never visited my website, I built a single-board computer from chips a long time ago:

8085_cpu_top

This hobby took a back seat when I graduated from college.  I recently stumbled across an article about a guy who built a computer out of tens of thousands of transistors, which brought my interest back.  That’s where my current cache of subjects is coming from… my hobby.

As I’m alluding to, it’s sometimes difficult to produce blog posts for this type of blog because I run into subject block.  This is similar to writer’s block, except it’s just a lack of interest in subjects that I am qualified to blog about.  Some of these subjects are technologies that I use at work.  Since I spend all day on these tasks at work, I’m ready to think about something else on my weekends.  I have a list of potential blog subjects that I use to jog my memory of what I should blog about.  That can be a good crutch in instances where I have time to blog, but can’t think of a specific subject.  Sometimes, I get into what my wife calls a blog funk.  That’s where I would like to blog, but I don’t want to spend time setting up the software to get the screenshots and code-snippets to create the blog post.  So I spend my time on something non-blog related.

The bottom line is that I’m blogging for me.  I don’t have any advertising on this site and I don’t intend to get paid to blog.  So don’t expect to see pop-ups or annoying animations along the sides of the blog.  I find those to be very distracting and I don’t expect anyone to have to suffer through that onslaught just to consume some technical information that I posted.

If you have a blog story you would like to share, feel free to post in the comments, or leave a link to your blog.

 

Legacy Storage

Summary

In this post I’m going to talk about legacy storage and the pitfalls of not upgrading and moving your data to newer technology.

A Little History

This post comes from some of my own experience.  My first commercial computer was the Macintosh.  I’m talking about the 128k original Macintosh with a 400k single-sided floppy drive.  The floppy was a 3.5 inch format, but the method of storage was different from the later drives.  The first Mac drives used a variable speed rotation to store the same amount of information on the inner tracks as the outer tracks of the floppy disk.  When the PC began to use the 3.5 inch floppy, the spindle speed was fixed and the floppy disk stored 360k per side.  Later, the Fat Mac (512k memory) and then the Mac Plus (SCSI interface, memory slots up to 4 meg) upgrades were introduced.  Somewhere in those upgrades I ended up with an 800k double-sided floppy drive.  This is the drive that I currently have in my Mac Plus.  I transitioned to a PC between 1991 and 1995.  I finally went full-on PC in 1995 because of Windows 95, which allowed me to wean myself off of the superior graphical interface of the Mac IIsi (the Mac II was on loan to me from my brother, who was overseas at the time).

Over the years of using my Mac (between 1984 and 1993ish) I built up a collection of floppies.  I also had a 40meg external SCSI drive that I had files on.  Over the past 4 years, I’ve been lamenting the fact that I did not copy all of my software off the Mac and onto a PC.  The dilemma came when I realized that I had no method to get from early-1990s technology to today’s technology.  SCSI version 1 is obsolete and I couldn’t find a hard drive to work on the Macintosh.  The floppies cannot be read by any floppy disk hardware today because of the variable speed technology used at that time.  Also, floppies went obsolete right after the super disk came out (and died a quiet death due to the hope that CDs would become the new storage method).  Now the preferred external storage device is the USB thumb drive.  The old Mac doesn’t know what a thumb drive is!  Network transfer?  That version of the Macintosh was pre-Internet.  When I connected to the Internet in the early 90s with that machine, I had a 1,200-baud modem (ah, the good ole’ days).

I’ve scoured the Internet in the past and came across sites that would convert disks for a fee, but I wanted to scan my disks and determine if they were salvageable.  Also, my disks are more than 20 years old now, which means that many are probably not readable any more (I’m actually surprised at how many still worked).  Late last year I came across this device: http://www.bigmessowires.com/floppy-emu/.  This circuit is an emulator that attaches to the external floppy port on a Macintosh (it works on Apple II computers as well).  It has a Micro SD card onto which you can write 20-megabyte external hard disk images that the Mac recognizes and will boot from.  Then there is a program for the PC called HFV Explorer: http://www.emaculation.com/doku.php/hfvexplorer that can mount the image files and navigate the Macintosh files in a window.  From that application you can drag data files out of the images and convert them.

So I promptly converted as many files as I could and discarded my old floppies (they were just collecting dust in my closet anyway).

I have copied all my files onto images and saved them on my PC hard drive.  These are backed up with my Internet backup tool, so there’s no chance of losing anything.  I have also converted some of the text-based files onto my PC.  I had a lot of Pascal programs that I wrote and it’s nice to be able to look back and reference what I did back then.

One type of file that is frustrating to convert is the Mac Draw file.  I have schematics that I drew on the Mac and saved in that format.  I currently use Visio, but Mac Draw (or Claris Draw) is not convertible into anything current.  I have a working version of Mac Draw and I can boot up my Mac Plus with the emulator and a 20 meg hard drive and open a Mac Draw file.  Unfortunately, there is no format that it saves in that can be used on a PC.  So I was forced to take pictures of the schematics on the screen with my camera (oh yeah, that turned out great!  No, not really) just so I don’t have to boot up the old Mac when I want to see the circuit.

How to Protect Your Files

As you’ve discovered from my story, and if you’re old enough to have used many iterations of computer technology, there are two problems with computer storage.  First, the technology itself becomes obsolete.  Second, the file formats become obsolete.  I’ve scoured the Internet looking for the binary format of Mac Draw.  This was a proprietary format, so it’s not available.  Also, the format went obsolete before the Internet really took off, so it wasn’t archived.  I’m sure the format is saved someplace in Apple’s legacy files.  There are some file converters that can convert Mac Draw files into newer formats, but they’re for newer Macintoshes, and I’d like to convert to MS Visio.  Maybe I’ll do my own hacking.

I’ve read articles where NASA has struggled with this issue in the past.  They have converted their space probe data from mainframe hard disks to laser discs, and I’m sure they are now using some sort of cloud storage.  Fortunately, mass storage is so cheap these days that if you can get your files there, then you have plenty of room to store them.  Cloud storage has the added benefit of never going obsolete, because the hard drives used are constantly upgraded by the provider (Amazon, Google, Microsoft, etc.).  Looks like NASA has a web portal to dig up any data you would like: https://www.nasa.gov/open/data.html  Very nice.

If you’re a programmer, like me, you’ll want to make sure you keep moving your files forward to newer storage.  Converting from one technology to another can be a real challenge.  If you switch between Unix, Mac or PC, you’ll need to figure out how to convert your files early on.  If you can’t, then you might be able to get an emulator.  For DOS programs, I use DOSBox.  I can use the DOSBox emulator to play all my old DOS games (such as Doom, Heretic, Descent, etc.).  There is a lot of interest in resurrecting retro arcade games and retro consoles.  The Raspberry PI has hundreds of retro games that can be downloaded and played.  So keep your games.

I don’t see any future where the USB hard drive will become obsolete, but I thought the same about the 3.5 inch floppy.  So avoid keeping your files on external devices.  Most people have enough hard drive storage these days that external device storage seems foolish.  I currently only use SD cards, Flash cards or USB drives to transfer files to and from my hard disk.  My most valuable files are the files that I create myself.  Those are either backed up to the Internet or I check them into GitHub.  Everything else is replaceable.  Always think about what would happen if your computer were destroyed in a fire or natural disaster.  What have you lost?  Make sure you have some method to recover your files.

Another type of file that needs some thought is the photo.  Photographs used to be stored in albums and shoe boxes (at least that’s where I put ’em!).  A few years ago I scanned all my old 35mm photos and stored them on my computer.  I had photos from the early 80’s when I was in the Navy.  These photos were turning yellow and they were stored in shoe boxes where nobody could enjoy them.  After I scanned them in, I converted some with Photoshop to correct the color and brightness.  I forgot how good those photos used to look.  More importantly, they are backed up on my Internet storage.  I also borrowed my parents’ photo albums and scanned those photos.  Here’s a picture of the Mackinac Bridge from 1963 (I was less than a year old at the time):

macinac_bridge

Organizing Your Files

The final problem you’ll run into is how to find your files.  This takes some organizational skills.  I have all of my digital photos stored by date and title.  When I come home from a hike and copy files from my Canon Rebel, I store the files in my photos directory (of course) in a new directory that starts with the year, month, then day (so they sort correctly).  Then I add a title so I can scan down and see what the photos represent.  For example, my wife and I visited the Smithsonian annex in Virginia on August 13th of last year, so my directory is: “2016-08-13 Smithsonian Annex”.  All of my raw photos go into that directory.  If I crop a photo or fix the color, I give the photo a new name and I might even put it in another directory.  The photo above is a reduced-size photo.  The original is over 2000 by 2000 pixels.  In the future I see the need for a database to track photos, but I haven’t figured out how to get the data organized in a fast and easy way.  The ultimate database would have different key words that can be searched on, like the location of the photo, who is in the photo, the date of the photo, who took the photo, etc.  Then I could search for all the photos of my wife and me when we were on vacation in July.  Currently, I have to scan through all the thumbnails and look for our faces.
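
If you want to generate those folder names consistently, a tiny helper will do it.  This is just a sketch of the naming convention I described above; the method name is made up:

// Build a photo folder name like "2016-08-13 Smithsonian Annex"
// so the directories sort chronologically by default.
public static string PhotoFolderName(DateTime visitDate, string title)
{
    return visitDate.ToString("yyyy-MM-dd") + " " + title;
}

// Example: PhotoFolderName(new DateTime(2016, 8, 13), "Smithsonian Annex")
// returns "2016-08-13 Smithsonian Annex".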

Over the years I’ve changed my methods of organizing files.  Some of those files are still in the same directories that they were in when I first created them.  I left them this way because I know where to find them.  I remember an old professor in college with piles of papers in his office claiming that he had a “system”.  He then proceeded to pull the exact paper out of the middle of what looked like a random pile.  Yeah, that’s me.  In fact, I have a root sub-directory called “d_drive”.  Back in the day, I owned a 1.9 Gig hard drive that I installed in my DOS machine.  Unfortunately, DOS at the time had a limit on the size of a partition on a disk, so I had to create 3 or 4 partitions on this hard drive, and one of the drive letters was D:.  This is where I kept all my source files for games that I was working on.  When I upgraded that machine to an 8 Gig hard drive, I just copied the partitions to sub-directories on the new drive.  Of course, I just named the directories “d_drive”, “e_drive”, etc.  That directory has not changed much since.  Its contents are currently on my new 4-terabyte drive.  There are too many sub-directories and files in there to try to re-organize it now.

Organize your files before you acquire too many files!  Eh, nobody does that.  I didn’t recognize that I needed a system until it got out of control.  Having files on floppies made things difficult as well.   I had a filing system back then, but it was still a bit painful to find a file.  Just be aware that you’ll fall into this trap eventually.

Obsolete Files

If you have files that you can’t read anymore because you don’t have the program that created them, save them anyway.  Search the Internet for a solution.  Someone out there probably has those files and has experienced the same problem.  Maybe a device, program or emulator will come along and save your files.  Storage space is cheap.  There is no need to throw away files unless you really want to get rid of them.  Try to keep your programs as well.  Sometimes that’s not feasible due to licensing issues.  Sometimes you’ll need to use an emulator (like DOSBox) to run your old programs.  Make sure you keep your license numbers in digital format in a location you can find quickly.  I have a directory containing files of everything that I have installed on my PC.  I keep all my license information in this directory (which incidentally has sub-directories labelled with the name of the software installed).  Steam has made it easier to keep track of games that I have purchased.  At least I can always get into my Steam account and re-install a game I previously purchased without digging around for the CD/DVD and the license number.  Other programs, such as Photoshop, need the license number to activate.

I will also download the install files and save those.  If I upgrade my computer hard drive and need to re-install, the installation medium might not be available to download for the version that I have a license for.  I’ve run into that problem before and it’s quite annoying to be forced to purchase a new version just because I can’t obtain the old installation material.

One last way I avoid obsolete material is that I have signed up for annual contracts on a couple of programs I’ve used over the years.  Microsoft Office is one of them.  I recently analyzed how many times I have upgraded (I used to upgrade every two or three years) and for the $99 a year price, I can get 5 copies of Office 365 and not deal with the process of purchase and upgrade.  You’ll need to analyze the cost and usage to determine if it’s worth the price to go that route.

Finally…

If you have unique ideas on how you keep ahead of the legacy battle, feel free to send me an email or leave a comment.  I’ll be happy to share your information with anyone who stumbles onto this blog.

 

Hello world!

I’m moving my blog from Google Blogspot to my own domain.  Almost anyone who has used Blogspot knows the problems.  Google does not maintain their blog engine.  I’ve been using Blogspot since April of 2012 and I’ve seen little change.  The javascript for the theme I used to use doesn’t work as well as it used to.  Clicking on a picture and then closing the picture on an iPad causes the blog to snap to the top.  Code highlighting is not an easy add-on.  Available themes are limited.  Blog customization is difficult.  So I finally reached my pain point and decided to look for another blog engine.  I stumbled across Ghost and I really like their blogging site, but their prices are high and I have two blogs.  I don’t make any money from my blogging, so there’s no real justification to move from a free blog site to a site that costs money.  I do, however, have a website that I am already paying for.  That website is hosted by GoDaddy.  Fortunately, GoDaddy has WordPress, and there was a Casper theme, which is the same default theme that Ghost uses.  Therefore, I can get the same quality of output as Casper and not pay a dime more for services that I already pay for.  Sweet!

WordPress has been around since the stone-age and it has come a long way since the last time I installed and configured it.  First, it was a click of a button to activate (after I set up a blog subdomain on my frankdecaire.com domain).  Then I added some plug-ins to do code highlighting.  Next, I scraped a blog post from Blogspot and tweaked the appearance.  The WordPress editor interface is a pleasure to use.  Blogspot has a bug involving inserted images: sometimes you can’t get the cursor to land after the image.  Another bug in Blogspot that drives me insane is when I add carriage returns and nothing happens.  Hitting publish or save and then re-editing the post shows the extra lines that I added but couldn’t see.  I no longer have to deal with all that mess.

My next task is to move all my old articles over from Blogspot.  Maybe I’ll just put a forwarding link there and just continue here… Decisions.  Decisions.

 

Source Control – Branching and Merging

Summary

In this blog post I’m going to talk about branching and merging your software in a multi-team development environment.

The Community Branch

A common mistake in branching is assuming that fewer branches are better.  The logic behind this assumption revolves around the idea that fewer branches translate to fewer merges.  Technically this is true, but each merge is very large and complex.  A larger number of small branches is easier to manage than a small number of large branches, and I’m going to show why this is true.

But first, I’m going to describe what I call the “community branch.”  The community branch is a branch that contains many different projects, shared by a large group of programmers.  The idea is that the branch is just an abstract place to put code (like a development branch) and that change sets can be moved around as though they are independent of each other.  

Here’s a simple example diagram of such a setup:


I want to stop and mention that I’m trying to explain a very complex “issue” here.  So I’m going to make this explanation as simple as possible and ignore any QA or stage testing process.  I’ll just assume that the process goes from development to deployment and demonstrate problems that occur when projects are mixed on one branch.

In the above diagram I have a main branch, which is the branch that contains the latest software deployed to the production system.  The first operation (point B) is a branch for the start of development work (C).  Two teams of developers are working on the software.  The first team completes some of their work and checks it into the development branch at point D.  Project team 2 begins work after this point (for simplicity of this example) and checks their software in at point E.  Project team 2 then merges their version into the main branch and the software is deployed.  Project team 1 is still working on their software and they continue to check in changes (point G).

Now for the problem in this scheme.  First, points D, E, and G are change sets.  If the change sets are all merged back to the main branch in the same order that they were checked into the development branch, then everything will work correctly.  Also, if the team merging to main always merges every change set from the beginning of the branch up to their point, everything will work out (assuming all of the software is ready to go).

However, if project 1 requires a refactoring of common code that project 2 is dependent on, then there is a problem.  If change set E is merged to main but relies on changes made in change set D, then the merge to main will be broken (even though everything worked on the development branch).  The only way to fix the merge to main is to manually make changes on the main branch to compensate for the modified common code on the development branch.  Change set D cannot be merged to main at this point because project 1 is only partially complete.

Every community branch will experience this issue if the branch is allowed to live long enough.  It’s unavoidable.  The problem gets worse as more undeployed changes accumulate on the branch.  Code might go undeployed because of long-running projects, projects placed on the back burner, or projects that get cancelled.


Project Branches

The correct method of branching is to isolate projects on their own branches.  Each team of programmers works on its own branch.  If a deployment to main is made from another project (or from bug fixes, hot fixes, etc.), then the main branch is re-merged into each project branch.  Each programming team must assign an owner to their project branch who must ensure that the branch is always up to date.  Once the project (or the sprint) is complete, the branch can be merged into main.

Here’s an example:

In this example project team 1 forms their own branch, which we will refer to as the project 1 branch.  Project team 2 forms their own branch, called the project 2 branch.  Each branch is always created from the main branch because that is the latest deployed software.  When project team 2 completes their sprint or project, they are authorized to merge with main.  The merge is made and, since there are no other changes on main, they should get a clean merge.  Then the software is deployed to the production system.  After deployment, each branch must be re-merged from main.  In this example, project team 1 will merge changes from main back to their own branch.  If there are any common object changes, they will need to be applied on the project 1 branch.  In this instance, project 1 will be ready to merge cleanly into the main branch after point G.  Any common objects changed by project 1 do not need to show up in the software created by project 2 until project 1 is deployed to main.  Project team 1 is also responsible for updating all software touched by its refactoring of common code.  Later down the line, project team 1 will merge with main and the merge will be clean.

Each time software is merged with main, all branches must re-merge from the main to obtain the latest changes that were deployed.  
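
I haven’t tied this post to any particular source control system (the “change set” terminology above leans toward TFS), but just to make the flow concrete, here is roughly what the project-branch cycle might look like in Git.  The branch names are made up for the example:

# Cut a project branch from main (the latest deployed code).
git checkout main
git checkout -b project-1

# ...project team 1 commits work on project-1...

# After another project deploys to main, re-merge main into the project branch.
git checkout project-1
git merge main

# When project 1 is complete and authorized for release, merge it into main.
git checkout main
git merge project-1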

Using a QA Branch

OK, so let’s show an example using a QA branch.  First, the QA branch is created from main and lives forever.  This is a special branch, and no work is to be performed directly on this branch.  The QA branch should only contain changes merged from other branches.  Once QA testing has been completed, then the results of the QA branch can be merged with main and deployed.  At the point of deployment, the QA branch should equal the main branch (i.e. a software source code comparison should be identical, minus any config files specific to the QA environment).  

Here’s an example:

In this example, all the rules from the previous example still apply.  When a deployment occurs, all live branches must re-merge from main (not from QA).  In the diagram above, project 1 starts first and the branch is created from main (C and D).  Project 2 starts after and it is also created from main (E and F).  Then each project in this example is completed and selected for the next deployment cycle (G and I).  Both branches are merged with QA (H and J).  Then a bug is detected during QA and project team 2 must fix this bug.  They must fix the bug in their branch and then merge their latest change sets down to QA for re-testing (K and L).  Once QA has been completed and the software is ready for release, it can be merged with main and deployed (M and N).

One other note: When a project is completed and deployed, the branch for that project should be closed.  If any bugs are detected after this point, they should be treated as bugs and not as part of the original project.  The reason for closing the branch is to reduce the maintenance required.  There is no reason to keep maintaining branches that belong to completed projects.

Other Branches

I’m sure by now you can visualize adding a permanent branch called “stage” that would perform a similar function to the QA branch.  In such a setup, the stage branch would be the destination of the QA branch upon completion of quality checking.  Alternatively, merging to stage could occur right after software is merged into QA and continuously merged from QA to stage as updates are made.  That would provide the ability to test in a QA environment and a stage environment in parallel.  The completed software would be merged from stage down to main for final deployment.

One wrench in the works is bugs.  Bug fixing is usually performed in a short period of time.  Many software shops will create a bug branch, fix bugs and merge these into QA or directly to main.  Bugs would need to be bundled together to prevent too many changes to main, which would trigger re-merges with project branches.  It’s recommended to perform bug fixes and merge them into the QA branch at the time when projects are first merged.  Then QA can be performed on the bugs and the projects together, and everything can be merged toward main at one time.  Once everything is merged into main and deployed, all changes can be re-merged back to any open project branches.

Hot fixes or emergency bugs also need to be accounted for.  Hot fixes can also have a special branch that lives forever.  This branch would probably bypass the QA branch, since a hot fix is usually something that must go out right away.  Once a hot fix is deployed, all branches, including QA, should re-merge the changes from main.

Potential Issues

One branching issue occurs when a database change must be performed.  A branch that requires a change in a database must somehow merge those changes into the QA environment database and then down to the main (production) database(s) at the right time.  I’m not going to cover this issue in this blog post because it applies just as much to the community branch scheme as to the project branching scheme.

Automated deployments must account for constantly changing branches.  The automated deployment system should be set up to allow any branch to be deployed to any environment.  The most efficient setup would involve virtual development environments that can be cloned from your production environment with nearly identical configurations.  This will reduce the amount of time it takes developers to fix problems related to differences in environments.  It also increases the success rate of the final deployment, since configuration differences are worked out at development, QA and staging time.  I’m not going to go into automated deployment systems in this article, but hard-coding an environment to a branch is bad practice and should be avoided.