Entity Framework 6 vs. LINQ-to-SQL smackdown!

Today, I’m doing my own performance testing on Entity Framework 6 and LINQ to SQL.  I created two tables in a sample SQL server database.  Here is the ERD for the two tables I created:

Next, I created two console application projects.  One to test LINQ to SQL and the other with EF-6.  You can download the LINQ to SQL test project here: linq2sqltest.zip and the EF6 test project here: ef6test.zip.

I started with the LINQ to SQL test using a basic insert loop like this:

for (int j = 0; j < 10; j++)
{
    for (int i = 0; i < 1000; i++)
    {
        person personRecord = new person()
        {
            first = firstnames[i],
            last = lastnames[i],
            department = 1
         };


         db.persons.InsertOnSubmit(personRecord);
    }
}

db.SubmitChanges();

I used the same code for the normal EF6 test:

for (int j = 0; j < 10; j++)
{
    for (int i = 0; i < 1000; i++)
    {
        person personRecord = new person()
        {
            first = firstnames[i],
            last = lastnames[i],
            department = 1
         };


         db.persons.Add(personRecord);
    }
}

db.SaveChanges();

I am using EF 6 version 6.0.1 as of this writing.  Microsoft has indicated that they are working on a bug fix version of EF6 that will be version 6.0.2.  There is no expected date when that version will become available, so keep your eyes open for this one.

In order to make the test measurable, I downloaded two text files full of first and last names with at least 1000 rows.  Then I ran the loop through 10 times to make it 10,000 rows of data inserted into the person table.  You can download the text files from the census like I did (if you want to include more names) by going to this stack overflow article and click on the suggested links: Raw list of person names.

I also attempted to move the db.SaveChanges(); inside the loop to see what effect that would have on the timing, and received the expected result of slower-than-dirt!  So I did some research to find a method to speed up the inserts and came across this stack overflow hint on bulk inserts: EF codefirst bulk insert.  By changing the configuration of the context before the inserts were performed, I was able to increase the insert speed significantly:

db.Configuration.AutoDetectChangesEnabled = false;
db.Configuration.ValidateOnSaveEnabled = false;

Here are the final results of all four tests:

The first test is the base-line LINQ-to-SQL.  I did not attempt to optimize this query (and I’m sure there is a way to make it go faster as well).  My interest was to see if I can make EF6 perform as fast or faster than straight LINQ-to-SQL.  The first test is the slow test where I put the SaveChanges() inside the loop (epic fail).  T2 is the second test, which is shown in the code above.  The third test (T3) is the code where I added the two configuration changing lines before running the insert loops.  All timings are in seconds.  As you can see from the results LINQ-to-SQL, un-optimized ran for 18.985 seconds to insert 10,000 records.  EF6 in test 3 ran for 2.371 seconds.  Just using the same code in EF6 produced a poor result at just over 46 seconds.  Be conscious of the time that EF6 is going to take some extra work to make it perform.

UPDATE: I have since written a test for NHibernate inserts and re-ran the above tests.  The LINQ-to-SQL test was able to perform the same inserts at a measured time of 3.49 seconds.  Because of this discrepency, I re-ran all tests several times and it seems that all the measurements are close except for the LINQ-to-SQL.  This is not a scientifically accurate test and I would recommend downloading my examples and run your own tests on your own hardware.

 

Entity Framework 6 Mocking and Unit Testing

Last week I attempted to mimic this article from Microsoft: Testing and Mocking Framework.  I ran into a problem involving some code that was in the article last week, but has since been removed (I’m assuming that Microsoft changed something in their EF6 framework or .Net and didn’t update the article before I tested it).  Anyway, I have now duplicated the unit tests that Microsoft demonstrates with my own table.

Two “gotchas” were discovered while attempting to make this work.  First, I had to pass the context into my object (instead of just using a “using” statement with the context I needed.  The reason for this is so the unit test can pass in a mock context.  The second “gotcha” was that the DataSet object inside the EF code needs to be virtual.  This was more difficult due to the fact that the code itself was auto-generated (because I used the EF designer to create it).  Fortunately, there is a reference in the article that states that “virtual” can be added to the T4 template used by EF (inside the <model_name>.Context.tt file).  I added the virtual keyword and my unit test worked like a charm.

The Details

So I wrote a console application and created one table.  The table was generated in MS SQL server.  I named the table “account” and put a couple of fields in it (the primary key field was set up as an “identity” field so it will generate a unique key upon insert).  Here’s the ERD:

Don’t laugh, I like to start off with really simple stuff.  Things seem to get complicated on their own.

Make sure you use NuGet to download the version 6.0.1 (or later) version of Entity Framework.  I just opened the NuGet window (see “Tools -> Library Package Manager -> Manager NuGet Packages for Solution…”), then I typed “Entity Framework” in the “Search Online” search box.

My console application main program looks like this:

using System;

namespace DatabaseTestConsole
{
    class Program
    {
        static void Main(string[] args)
        {
            UserRights rights = new UserRights(new DatabaseContext());

            string test = rights.LookupPassword(“test“);
            Console.WriteLine(test);
        }
    }
}

My entity container name is called “DatabaseContext”, I created the container using the project right-click, then “add” -> “new item”, then selecting “ADO .Net Entity Data Model”.  I added a connection and dragged my table to the EF Model Diagram.

Then I created a new class called “UserRights” (right-click on project, “add” -> “class”).  This is the content of the UserRights.cs file:

using System.Linq;

namespace DatabaseTestConsole
{
    public class UserRights
    {
        private DatabaseContext _context;


        public UserRights(DatabaseContext context)
        {
            _context = context;
        }


        public string LookupPassword(string userName)
        {
            var query = (from a in _context.accounts

                         where a.username == userName
                         select a).FirstOrDefault();
            return query.pass;
        }
    }
}

I manually added some data into my table and tested my program, just to make sure it worked.  Then I added a unit test source file (I named it “UnitTests.cs”), using the same “add -> class” that I used to create the UserRights.cs file above.  Then I added in two references and usings for unit testing and moq.  Here’s the entire source code for the test:

using System.Collections.Generic;
using System.Data.Entity;
using System.Linq;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using Moq;


namespace DatabaseTestConsole
{
    [TestClass]
    public class UnitTests
    {
        [TestMethod]
        public void TestQuery()
        {
            var data = new List<account>
            {
                new account { username = “test“,pass=”testpass1” },
                new account { username = “ZZZ“,pass=”testpass2” },
                new account { username = “AAA“,pass=”testpass3” },
            }.AsQueryable();


            var mockSet = new Mock<DbSet<account>>();
            mockSet.As<IQueryable<account>>().Setup(m => m.Provider)

                   .Returns(data.Provider);
            mockSet.As<IQueryable<account>>().Setup(m => m.Expression)

                   .Returns(data.Expression);
            mockSet.As<IQueryable<account>>().Setup(m => m.ElementType)

                   .Returns(data.ElementType);
            mockSet.As<IQueryable<account>>().Setup(m => m.GetEnumerator())

                   .Returns(data.GetEnumerator());

            var mockContext = new Mock<DatabaseContext>();
            mockContext.Setup(c => c.accounts).Returns(mockSet.Object);


            UserRights rights = new UserRights(mockContext.Object);

            Assert.AreEqual(“testpass1”, rights.LookupPassword(“test“),

                  “password for account test is incorrect“);
            Assert.AreEqual(“testpass2”, rights.LookupPassword(“ZZZ“),

                  “password for account ZZZ is incorrect“);
            Assert.AreEqual(“testpass3”, rights.LookupPassword(“AAA“),

                  “password for account AAA is incorrect“);
        }
    }
}

As you can see from the unit test (called “TestQuery”) above, three rows of data are inserted into the mocked up account table.  Then the mock context is setup and the UserRights object is executed to see if the correct result is read from the mock data.  If you want to test this for yourself, go ahead and copy the code segments from this article and put it into your own project.  Unit testing methods that perform a lot of database operations will be easy using this technique and I also plan to use this for end-to-end integration testing.


Update:

I have posted the code on GitHub, you can click https://github.com/fdecaire/MockingEF6 to download the code and try it yourself.

 

Getting the Show on the Road

Introduction

So I’ve been evaluating different ORM’s and I’ve decided to stick with Entity Framework.  I’ve done some testing with NHibernate and discovered that it is very difficult to get running.  I still need to spend more time to research NHibernate, but for now I think I’ll just run with Entity Framework.  One of the positives of EF over NHibernate is the visual tool that makes it easy to setup the data objects.  Our company is planning to employ interns starting in the Spring of 2014 and I’m thinking forward along the lines of using the KISS principle wherever I can.  Later, when things are moving again, I can take another hard look at NHibernate and determine if we want to switch to that ORM over NHibernate.

So Friday I began the tedious task of converting one of the LINQ-to-SQL subsystems to EF.  Most of it went smoothly.  We don’t currently have a lot of data access using an ORM, so now is the time to determine which tool we’re going to stick with.  In order to get around the namespace weakness of EF, we are going to put our database access inside it’s own project or subdirectory.  Our code will need to share tables that will be defined in one place.  I think the ability to refactor will assist us in weeding out deprecated tables and fields in the future.

One of my co-workers made me aware of an article called Performance Considerations for Entity Framework 5″.  This is quite lengthy and very detailed.  I would recommend putting it on your “favorites” list and keep it handy when you’re ready to use EF.  This article talks about EF version 5, but I’m sure they’ll update it for version 6 soon.  Here’s an interesting side article for unit testing and mocking EF6: Testing with a mocking framework (EF6 onwards).

So What’s the Point of Using an ORM?

Speed is not necessarily the only reason for using a different data access model.  In this case, the point is to catch SQL query mistakes at compile-time.  In the good-old-days when queries were sent back to the database as a string, any errors in SQL were only detected when the query executed (i.e. run-time).   

ORM’s create objects that represent the tables, fields and other components of the database so that the developer can write a query directly in code (like C# or VB).  The SQL statement written in code can have automatic syntax highlighting for errors and errors are detected at compile time.  Of course, it’s still up to the software developer to write a correct query.  At least this is one more step in reducing the amount of troubleshooting time a developer must take to get the software right.

Unit Testing

Unit testing is the latest new-hotness.  Technically, unit testing has been around for some time (so it’s not new, but it’s still hot).  In EF6 there is support for in-memory database mocking that can be used to test parts of your code.  In the past I have used test databases that I generated in a local copy of MS SQL server.  Using a test database means that I had to generate the tables and then tear them down when I was done.  By creating data in memory, the whole process takes less time and resources.

The reason why unit testing is so important in this instance is that many web applications are mostly database queries.  I know that the software that at my current company contains mostly code to access and manipulate data.  So unit tests that don’t test the queries that access the database are very limited.

I’m currently looking at the unit testing and mocking of EF described here: Testing with a mocking framework (EF6 onwards).  Specifically the “Testing query scenarios” section.  If you’re using Moq, make sure you download the NuGet package for Moq (it’s not included in the base install of Visual Studio 2012).  I’m hoping to be back with an example soon.  Stay tuned…

 

LINQ and Entity Framework … sigh

Our company has a ton of legacy code using VB and simple text based queries that are submitted to the database.  There are several disadvantages to this method versus new techniques like LINQ and EF.  First, syntax errors in the query itself can only be tested in run-time.  Second, syntax highlighting, automatic formatting and intellisense are not supported.

One of the improvements started earlier this year was the use of LINQ in place of string queries.  Immediately a flaw was discovered in the interface for setting up tables.  That flaw is the missing capability to synchronize changes made in tables in the corresponding database.  The developer is forced to delete the table and re-add the table in order to reflect any field changes.  This is ok, if there are no stored procedures that return a result set for that table.  Then you have to go to each stored procedure and switch to something else, replace the table, then switch the stored procedures back.  There are third-party add-ons that can synch the tables, but they seem to of the bloat-ware variety and are not very effective (some of them don’t work right).  A second “feature” of LINQ is that it creates a settings.settings file in the project to add the ability to change database settings from an interface.  This seems like a great feature until one realizes that it over writes the apps.config file when changes are saved.  The side effect is that comments in apps.config are removed.  Our company uses comments to switch between databases located on our local PC, staging server and the production server.  Powershell scripts comment and uncomment the sections for each of the connector strings to enable the correct database connector.  This goes out the window when the settings.settings control panel wipes out any commented connection strings (usually the staging server and production server settings). 

So we began looking at Entity Framework.  This was a dream.  It has a synchronize capability making it much easier to use.  I jumped in with both feet and started replacing LINQ setups with EF (they are compatible with each other, just keep the table names the same).  My plan was to create a device context of table definitions in each object directory.  This would create an organization where the tables that were used by the current object were in the same directory and any abandoned tables would not be used by some far off code not related to the object that might be refactored in the future.  Then I ran into a serious problem.  EF doesn’t recognize name spaces.  Yikes!  So I changed plans quick.  I placed the device context in a central directory under the project and decided we would just use one device context for all objects.  I further discovered that only one database can be used per device context.  While that hasn’t been an issue yet, it will be if we decide to use this for more of our larger objects (that query spanning two or more databases).  I theorized that we might be able to use nested “using” statements, but I haven’t tried it yet.  This issue, coupled with the namespace issue has made me put on the breaks until I can do further analysis.  I have also heard rumors that EF is not as efficient at queries as LINQ. 

Other issues that discovered is that the objects created for the tables that are dragged into the context may have an “s” appended to them.  This causes all kinds of confusion about the table names.  It’s also a headache to have to trace back to the device context to get the real table name to find it in the database (if you are troubleshooting code you are unfamiliar with).  I would prefer an ORM* that would create object names that matched the tables exactly, followed the standard namespace convention and had a visual interface that didn’t stomp on my app settings and was flexible enough to synchronize changes with the database (because we make database changes).  Am I asking for too much?

So now what’s my plan?  At this point, I’m going to put some time and effort into NHibernate (I have posted on that subject before).  Stay tuned for my results on this ORM.



*LINQ and EF are also known as ORMs (Object Relational Mappers).