What .Net Developers Should Know about MS SQL and IIS

Summary

In this post, I’m going to explain a couple of techniques and tools that every developer should know.  If you are just graduating from college and you are looking for a .Net job, learn these simple techniques and advance your career.  If you’re a mid-level to advanced .Net developer and you don’t know these tools and techniques yet, learn them.  They will save you time and give you an edge in building better software.

SQL Profiler

Let’s assume you have built this outstanding program.  It’s a work of art.  It uses .Net Core 2.0 with IOC containers, Entity Framework Core 2.0 and other advanced technologies.  You’re testing your web-based software and you notice a pause when you click on a button or a page is loading.  The first question that pops into your mind is: what is causing the slowdown?  Is it the database or the IIS server?  And what can I do about it?

Let’s eliminate or confirm the database first.  If you installed the profiler tool with SQL Server (I’m going to assume you did; otherwise you’ll need to re-run the installer and add it), go to the Tools menu in SQL Server Management Studio and select SQL Server Profiler.  A new window will open and you’ll need to connect to your database instance just as you would when opening the management studio itself.  Once the profiler is running, it’s time to exercise the page that you are having issues with.  Click the stop button and use the eraser to clear any records that have already appeared in the window.  Get to the point where you are about to click the button on your web page.  Then hit the run button in the profiler and hit the web page.  Once the page loads, hit the stop button in the profiler so nothing else gets recorded.  Now you have records to analyze.  You’ll be surprised at the number of database calls EF will perform.

I used this project as a demo for the screenshot coming up:

https://github.com/fdecaire/MVCWithAutoFacIOC

Download the project, run the database create script (store_product_database_setup.sql) and run the application.  You should see something like this:

As you can see, there is one call to the database and you can see the “select” command.  Click on the “BatchStarting” line and notice the detail in the window at the bottom:

Now you can copy that query and paste it into SQL Server Management Studio to test the exact query that Entity Framework is sending to MS SQL:

This will indicate if you are querying for too many fields, or if the total number of records queried is excessive.  If you discover that your query result was a million records and your interface only shows the top 10 records, then you need to tweak your LINQ query to only ask for 10 records.  In this tiny example we have three records.  Let’s make it ask for 2 records.  Here’s my original LINQ query:

(from s in _db.Stores select s).ToList();

I changed it to:

(from s in _db.Stores select s).Take(2).ToList();

Re-run the program, capture the data in profiler and this is what I get:

Notice the “Top(2)” difference in the query.  This is the kind of performance tuning you should be aware of.  It’s very easy to create C# code and LINQ queries and never understand what is really going on behind the scenes.  Entity Framework takes your LINQ query, turns it into a string containing a SELECT statement and transmits that to MS SQL.  MS SQL then runs the query against the database and returns the results so that EF can turn them back into a list of objects.  With SQL Profiler you can get into the nuts and bolts of what is really going on, and I would recommend running the profiler at least once after you have built your software and think it is ready for prime-time.  If you see a query pause, copy the profiled SQL query into management studio and see if you can speed up the query while still getting the results that you need.
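
For reference, the statement captured by the profiler for the Take(2) version has roughly the following shape; the exact column list and table name depend on how the Store entity is mapped, so treat this as an approximation rather than the exact text EF will emit:

SELECT TOP(2) [s].[Id], [s].[Name]
FROM [Stores] AS [s]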

Another tool you can use is the estimated execution plan tool.  The toolbar button looks like this:

This tool will break your query down into the pieces that will be executed to form the results.  In the case of my previous query, there is only one piece:

That piece of the query costs 100% of the execution time.  If your query included a union and maybe some sub-queries, this tool is very useful in determining which part of the query is costing you the most processing cycles.  Use this tool to decide which part of your query you want to focus your energy on.  Don’t waste time trying to optimize the portion of your query that only takes 2% of the execution time.  Maybe you can get that to go twice as fast, but the overall query will only be about 1% faster than before.  If you focus your energy on a section that takes 98% of your execution time, then you’ll be able to boost the performance in a noticeable manner.

Web Server Troubleshooting

If you’re using IIS, some day you’ll run into a problem where you don’t get any logs and your website or API crashes immediately (usually with a 500 error).  This is always a difficult problem to troubleshoot, until you realize that there are only a handful of problems that cause it.  The most common is an issue with the XML formatting in your web.config file.  I can’t tell you how many times I have been bitten by this problem!  The easiest way to test and troubleshoot this error is to open the IIS Manager control panel, select your website and then click on one of the icons that displays a section of your web.config file, like “Modules”:

If there is an error, the line number in the web.config file will be shown.  You’ll be able to look at the XML in the web.config and find your missing tag, extra quote or other stray symbol (sometimes it’s an “&”, “>” or “<” character inside your database connection string password or something similar).  Fix the web.config issue and go back to Modules again.  If there is another error, fix it and return again, until it works.
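
As a quick example (the server name, database and password here are made up), special XML characters inside a connection string have to be escaped as entities before the web.config will parse:

<connectionStrings>
  <!-- "&" becomes "&amp;", "<" becomes "&lt;" and ">" becomes "&gt;" inside attribute values -->
  <add name="DefaultConnection"
       connectionString="Server=MYSERVER;Database=MyDb;User Id=webuser;Password=p@ss&amp;word"
       providerName="System.Data.SqlClient" />
</connectionStrings>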

On .Net Core, there is an error log that can report startup errors.  This logging happens before your logging framework starts and is very handy for situations where you don’t get any logging at all.  When you publish your site to a folder in .Net Core (or Core 2.0), you’ll end up with a bunch of DLLs, some config files and a web.config file.  The web.config file is mostly empty and might look like this:

<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <system.webServer>
    <handlers>
      <add name="aspNetCore" path="*" verb="*" modules="AspNetCoreModule" resourceType="Unspecified" />
    </handlers>
    <aspNetCore processPath="dotnet" arguments=".\Website.dll" stdoutLogEnabled="false" stdoutLogFile=".\logs\stdout" />
  </system.webServer>
</configuration>

Set the “stdoutLogEnabled” attribute to true and change the “stdoutLogFile” parameter to point to a file location that you can find.  I usually set mine to “C:\logs\myapplication_logging_error.txt” or something like that.  Then I run the program until it crashes and check the C:\logs directory to see if the file exists.  If it does, it usually contains information about the crash that can be used to troubleshoot what is going on.  I’m assuming at this point in your troubleshooting that the website or API works from Visual Studio and you are having issues with the deployed application.  If you are having issues executing your application in Visual Studio, you should be able to zero in on the error in VS using breakpoints and other tools.
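
For example, the relevant line in the published web.config would end up looking something like this (the module typically appends a timestamp and process id to the file name, so look for any file in C:\logs starting with this prefix):

<aspNetCore processPath="dotnet" arguments=".\Website.dll" stdoutLogEnabled="true" stdoutLogFile="C:\logs\myapplication_logging_error.txt" />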

NLog has an error log for the error log.  This is where errors go when something is wrong inside the NLog code itself, usually caused by a configuration error.  At the top of your nlog.config file should be something like this:

<?xml version="1.0" encoding="utf-8" ?>
<nlog xmlns="http://www.nlog-project.org/schemas/NLog.xsd"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 autoReload="true"
 internalLogLevel="Warn"
 internalLogFile="C:\logs\myapplication_nlog.txt">

As you can see, there is an “internalLogFile” parameter.  Set this to a location and filename that you can find.  Then execute your program and see if that log file appears.  If it does, open it up and examine the contents.  Usually it’s an exception involving the NLog logger.

Application Pools

The application pool type must match the .Net version that you are using.  For .Net Core and Core 2.0, you’ll need to use “no managed code”.  For .Net, you’ll set your pool to “.Net Framework v4.0”.  If you are unfortunate enough to be using really obsolete legacy code, you can set it to “.Net Framework v2.0”.

When you deploy your Core 2.0 application, you may need to recycle the application pool.  You’ll run into a problem where you deploy new code and the “bug fix” or “new feature” does not show up.  Recycle the app pool and see if that fixes your problem.  The application pool might still be serving a copy of the previous program from memory.
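
You can recycle the pool from the IIS Manager, or from an elevated command prompt using appcmd (the pool name here is a placeholder; substitute your own):

%windir%\system32\inetsrv\appcmd recycle apppool /apppool.name:"MyAppPool"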

Sites

Inside your site settings, there is an ASP settings section:

You can turn on debugging flags inside here.  For your local development system, turn it all on:

Leave these settings off for any production system.  You do not want to advertise the line numbers of your code (and your code) when you have a crash.  For development purposes, you want to know all the data you can get from an exception.  If these settings are on, you’ll get the full dump of the exception message in your browser and then you can return to Visual Studio to find the exact line of code where the error occurred.

I hope this saves you some time when a frustrating deployment problem occurs.  The trick is to learn the most common causes of deployment problems so you don’t spend all day troubleshooting a tiny error instead of working on that next big chunk of code.

.Net MVC Project with AutoFac, SQL and Redis Cache

Summary

In this blog post I’m going to demonstrate a simple .Net MVC project that uses MS SQL Server to access data.  Then I’m going to show how to use Redis caching to cache your results and reduce the amount of traffic hitting your database.  Finally, I’m going to show how to use the AutoFac IOC container to tie it all together and how you can leverage inversion of control to break dependencies and unit test your code.

AutoFac

The AutoFac IOC container can be added to any .Net project using the NuGet manager.  For this project I created an empty MVC project and added a class called AutofacBootstrapper to the App_Start directory.  The class contains one static method called Run(), just to keep it simple.  This class contains the container builder setup that is described in AutoFac’s Quick Start guide.
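
The bootstrapper itself isn’t listed in this post, but a minimal sketch, roughly following the AutoFac MVC quick start, might look like the following (the registrations shown later in this post would go where the comment is):

using System.Web.Mvc;
using Autofac;
using Autofac.Integration.Mvc;

public static class AutofacBootstrapper
{
    public static void Run()
    {
        var builder = new ContainerBuilder();

        // Let Autofac build the MVC controllers in this assembly.
        builder.RegisterControllers(typeof(AutofacBootstrapper).Assembly);

        // ... component registrations (shown in the sections below) go here ...

        var container = builder.Build();

        // Route all MVC dependency resolution through Autofac.
        DependencyResolver.SetResolver(new AutofacDependencyResolver(container));
    }
}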

Next, I added .Net library projects to my solution for the following purposes:

BusinessLogic – This will contain the business classes that will be unit tested.  All other projects will be nothing more than wire-up logic.

DAC – Data-tier Application.

RedisCaching – Redis backed caching service.

StoreTests – Unit testing library

I’m going to intentionally keep this solution simple and not make an attempt to break dependencies between DLLs.  If you want to break dependencies between modules or DLLs, you should create another project to contain your interfaces.  For this blog post, I’m just going to use the IOC container to ensure that I don’t have any hard dependencies between objects so I can create unit tests.  I’m also going to keep this simple by only providing one controller, one business logic method and one unit test.

Each .Net project will contain one or more objects, and each object that will be referenced in the IOC container must be accessed through an interface.  So there will be the following interfaces (a rough sketch of their declarations follows the list):

IDatabaseContext – The Entity Framework database context object.

IRedisConnectionManager – The Redis connection manager provides a pooled connection to a Redis server.  I’ll describe how to install Redis for Windows so you can use this.

IRedisCache – This is the cache object that will allow the program to perform caching without getting into the ugly details of reading and writing to Redis.

ISalesProducts – This is the business class that will contain one method for our controller to call.
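
The member names in this sketch are assumptions based on how the interfaces are used later in this post (the Products set, the Get() cache wrapper and the Top10ProductNames() method); the real declarations are in the sample solution:

using System;
using System.Collections.Generic;
using System.Linq;
using StackExchange.Redis;

public class Product { public int Id { get; set; } public string Name { get; set; } }   // simplified stand-in for the EF POCO

public interface IDatabaseContext : IDisposable
{
    IQueryable<Product> Products { get; }    // entity set used by the business logic
}

public interface IRedisConnectionManager
{
    IDatabase GetDatabase();                 // hands out a connection from the pooled multiplexer
}

public interface IRedisCache
{
    T Get<T>(string keyName, int expireMinutes, Func<T> queryFunction);
}

public interface ISalesProducts
{
    List<string> Top10ProductNames();
}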

Redis Cache

In the sample solution there is a project called RedisCaching.  This contains two classes: RedisConnectionManager and RedisCache.  The connection manager object needs to be set up in the IOC container first.  It needs the Redis server IP address, which would normally be read from a config file.  In the sample code, I fed the IP address into the constructor at the IOC container registration stage.  The second part of the Redis caching is the actual cache object.  This uses the connection manager object and is set up in the IOC container next, using the previously registered connection manager as a parameter, like this:

builder.Register(c => new RedisConnectionManager("127.0.0.1"))
    .As<IRedisConnectionManager>()
    .PropertiesAutowired()
    .SingleInstance();

builder.Register(c => new RedisCache(c.Resolve<IRedisConnectionManager>()))
    .As<IRedisCache>()
    .PropertiesAutowired()
    .SingleInstance();

In order to use the cache, just wrap your query with syntax like this:

return _cache.Get("ProductList", 60, () =>
{
  return (from p in _db.Products select p.Name);
});

The code between the { and } represents the normal EF LINQ query.  The result must be returned from the anonymous function: () =>

The cache key name in the example above is “ProductList” and the result will stay in the cache for 60 minutes.  The _cache.Get() method will check the cache first; if the data is there, it returns the data and moves on.  If the data is not in the cache, it calls the inner function, causing the EF query to be executed.  The result of the query is then saved to the cache server and the result is returned.  This guarantees that any identical query within the next 60 minutes will be served directly from the cache.  If you dig into the Get() method code you’ll notice that there are multiple try/catch blocks that handle the case where the Redis server is down.  In that situation the inner query will be executed and the result will be returned.  In production your system would run a bit slower and you’ll notice your database working harder, but the system keeps running.

A precompiled version of Redis for Windows can be downloaded from here: Service-Stack Redis.  Download the files into a directory on your computer (I used C:\redis), then open a command window, navigate into that directory and use the following command to set up a Windows service:

redis-server --service-install

Please notice that there are two “-” characters in front of the “service-install” instruction.  Once this is set up, Redis will start every time you start your PC.

The Data-tier

The DAC project contains the POCOs, the fluent configurations and the context object.  There is one interface for the context object and that’s for AutoFac’s use:

builder.Register(c => new DatabaseContext("Server=SQL_INSTANCE_NAME;Initial Catalog=DemoData;Integrated Security=True"))
    .As<IDatabaseContext>()
    .PropertiesAutowired()
    .InstancePerLifetimeScope();

The connection string should be read from the configuration file before being injected into the constructor shown above, but I’m going to keep this simple and leave out the configuration pieces.
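
If you do want to pull it from configuration, a sketch of that registration (assuming a connection string entry named "DemoData" exists in the web.config) might look like this:

// web.config: <connectionStrings><add name="DemoData" connectionString="..." /></connectionStrings>
var connectionString = System.Configuration.ConfigurationManager
    .ConnectionStrings["DemoData"].ConnectionString;

builder.Register(c => new DatabaseContext(connectionString))
    .As<IDatabaseContext>()
    .PropertiesAutowired()
    .InstancePerLifetimeScope();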

Business Logic

The business logic library is just one project that contains all the complex classes and methods that will be called by the API.  In a large application you might have two or more business logic projects.  Typically, though, you’ll divide your application into independent APIs that each have their own business logic project as well as all the other wire-up projects shown in this example.  By dividing your application by function you’ll be able to scale your services according to which function uses the most resources.  In summary, you’ll put all the complicated code inside this project, and your goal is to apply unit tests that cover all combinations of features that this business logic project contains.

This project will be wired up by AutoFac as well and it needs the caching and the data tier to be established first:

builder.Register(c => new SalesProducts(c.Resolve<IDatabaseContext>(), c.Resolve<IRedisCache>()))
    .As<ISalesProducts>()
    .PropertiesAutowired()
    .InstancePerLifetimeScope();

As you can see, the database context and the Redis cache are injected into the constructor of the SalesProducts class.  Typically, each class in your business logic project will be registered with AutoFac.  That ensures that you can treat each object independently of the others for unit testing purposes.

Unit Tests

There is one sample unit test that exercises the SalesProducts.Top10ProductNames() method.  This test only covers the case where there are more than 10 products and the expected count is 10.  For effective testing, you should also test fewer than 10, zero, and exactly 10.  The database context is mocked using Moq.  The Redis caching system is faked using the interfaces supplied by StackExchange.  I chose to set up a dictionary inside the fake object to simulate a cached data point.  There is no check for cache expiration; this is only used to fake out the caching.  Technically, I could have mocked the caching and just made it return whatever went into it.  The fake cache can be effective in testing edit scenarios to ensure that the cache is cleared when someone adds, deletes or edits a value.  The business logic should handle cache clearing, and a unit test should check for this case.
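
A stripped-down sketch of that kind of test is shown below.  The fake cache class name and the exact setup are assumptions; the real test lives in the StoreTests project:

using System.Linq;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using Moq;

[TestClass]
public class SalesProductsTests
{
    [TestMethod]
    public void Top10ProductNames_ReturnsTenItems_WhenMoreThanTenProductsExist()
    {
        // Mock the EF context so no real database is needed.
        var products = Enumerable.Range(1, 15)
            .Select(i => new Product { Id = i, Name = "Product " + i })
            .AsQueryable();

        var contextMock = new Mock<IDatabaseContext>();
        contextMock.Setup(c => c.Products).Returns(products);

        // Fake cache: always misses and just runs the inner query (hypothetical test double).
        IRedisCache cache = new FakeRedisCache();

        var salesProducts = new SalesProducts(contextMock.Object, cache);

        var result = salesProducts.Top10ProductNames();

        Assert.AreEqual(10, result.Count);
    }
}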

Other Tests

You can test whether the real Redis cache is working by starting up SQL Server Management Studio and running the SQL Server Profiler.  Clear the profiler, then start the MVC application.  You should see some activity:

Then stop the MVC program and start it again.  There should be no change to the profiler because the data is coming out of the cache.

One thing to note: you cannot use IQueryable as the return type for your query.  It must be a list, because the data read from Redis is in JSON format and is de-serialized all at once.  You can serialize and de-serialize a List<T> object.  I would recommend adding a logger to the cache object to catch errors like this (since there are try/catch blocks that would otherwise swallow them).

Another aspect of using an IOC container that you need to be conscious of is scope.  This can come into play when you are deploying your application to a production environment.  Typically developers do not have the ability to easily test multi-user situations, so an object whose scope is too long can cause cross-over data.  If, for instance, you set your business logic to have a scope of SingleInstance() and you need your list to be specific to each user accessing your system, then you’ll end up serving everyone the data of the first person who accessed the API.  This can also happen if your API receives an ID for its data on each call: if the object only reads the data when the API first starts up, then you’ll have a problem.  This sample is so simple that it only contains one segment of data (top 10 products).  It doesn’t matter who calls the API, they are all requesting the same data.

Other Considerations

This project is very minimalist, therefore, the solution does not cover a lot of real-world scenarios.

  • You should isolate your interfaces by creating a project just for all the interface classes.  This will break dependencies between modules or dlls in your system.
  • As I mentioned earlier, you will need to move all your configuration settings into the web.config file (or a corresponding config.json file).
  • You should think in terms of two or more instances of this API running at once (behind a load-balancer).  Will there be data contention?
  • Make sure you check for any memory leaks.  IOC containers can make your code logic less obvious.
  • Be careful of initialization code in an object that is started by an IOC container.  Your initialization might occur when you least expect it to.

Where to Get The Code

You can download the entire solution from my GitHub account by clicking here.  You’ll need to change the database instance in the code and you’ll need to set up a Redis server in order to use the caching feature.  A SQL Server script is provided so you can create a blank test database for this project.

Entity Framework Core 1.0 Smackdown!

Summary

In this post I’m going to compare EF 7 or EF Core 1.0 with the previous version of Entity Framework.  I’m also going to throw in the scores for Fluent NHibernate just for kicks.

I’ve done a few of these head-to-head comparisons, mostly for my own curiosity, but also to give myself a chance to learn a new technology that I know I’ll be using for years to come.  Core is the beginning of a whole new paradigm for Microsoft and I’m happy with their decisions.  First, I have to commend them on their choice to include dependency injection right in their new .Net objects.  In addition to that, they have a new in-memory Entity Framework provider that makes it much quicker and easier to design unit tests.

Setup

To get started, I have the free version of Visual Studio 2015 (2015 Community).  Previous versions of Visual Studio came with free editions, but they have always been limited to a web version or a desktop version.  Community 2015 seems to have all the features of 2013 Professional.  The Professional version of 2015 has a lot of nice features, but the Community version is a great way to get students “hooked” on Visual Studio.

You’ll need to make sure you have all the updates for VS 2015 before you start.  You might also need to install PowerShell 3.0.  You can search Google (or Bing) for PowerShell and download the latest version for your OS.  I’m currently using Windows 7 on this machine.

To create a Core 1.0 application, your best resource is Microsoft’s on-line documentation.  I would start here, and go to the .Net Core download site.  Follow the directions from there.  I’m currently using the Tooling Preview 2.

EF 6.0 vs. EF 7.0 Core

I used my previous smackdown code to test the performance of EF 6.  I have a new machine, so I knew the timing would be different to start with and I needed a base-line.  Then I copied the performance object and all the POCO and context code I used in the EF 6 project to my new .Net Core project.  That’s when I discovered something fishy.  EF 7 was performing very badly for selects.  Inserts, updates and deletes were faster under EF 7, but a basic select was hundreds of times slower.  As I dug around with SQL Profiler, I discovered that there was no EF performance problem.  The problem was in the .ToList() conversion.  I’m not sure why this has become such a slug, but I didn’t want the .ToList() to skew my performance measurements.

Another problem I discovered was my logic involving the best of 5 runs.  I had coded it incorrectly, like this:

double smallest = -1;
for (int i = 0; i < 5; i++)
{
    double result = TestUpdate();

    if (smallest == -1)
    {
        smallest = result;
    }
    else
    {
        if (result < smallest)
        {
            result = smallest;
        }
    }
}
WriteLine("UPDATE:" + smallest);

Currently, I’m using the ReSharper Ultimate add-on by JetBrains.  This tool quickly identified that “result = smallest;” was never used.  That’s because it’s backwards; it should have been “smallest = result;”.  This bug doesn’t throw the results off by much, since the runs only vary by a small amount anyway, but it was not intended, so I fixed it in the old version as well as the new.  I also dragged out the old NHibernate test code, refactored it to mimic the Core 1.0 code as closely as possible and ran that to get a base-line for NHibernate.
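
For reference, the corrected comparison looks like this:

if (result < smallest)
{
    smallest = result;   // keep the fastest (smallest) time seen so far
}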

Here are the results:

As you can see, Microsoft has done a good job of improving the performance of Entity Framework.  I’m impressed.  NHibernate still beats the overall performance, but the difference is tiny and real-world results could go either way.

Update

I have revisited this problem and re-tested all three scenarios (EF6, Core, NHibernate) with the .ToList() method added to the select section.  One issue I ran into was a bug in my update, select and delete tests.  The first test was an insert test and I looped 5 times to get the best insert time.  That inserted 10,000 records each time, so the update, select and delete times were based on 50,000 records.  Not what I intended.  So I inserted code to delete the records for each loop of the insert test, and I also deleted and re-inserted 10,000 records in between each test (in case I ever run one test without the others).  The list above was updated to reflect the times without using .ToList() on the select statement.  The list below is with .ToList():

This indicates that Entity Framework 6 selects and brings in the list of data faster than Core or NHibernate.  I have scrutinized the code endlessly.  If anyone can find anything that looks odd in my code, be sure to leave a message and I’ll look into it and update this post to reflect any bug fixes.

Test for yourself!

I would recommend downloading my code and trying this on your own system.  First, you can download the .Net Core solution by going to GitHub here and downloading the source.  There is a SQL script file in the source that you can run against your local MS SQL server.  The NHibernate speed test code can also be downloaded from my GitHub account by clicking here.  The Entity Framework 6 code can be found by clicking here.  You can ignore the MySQL part; I didn’t use it for this test.

MySQL

Summary

Let’s do something different.  I’m going to install the latest version of MySQL, version 5.7, which is advertised to be 3x faster than before.  I’m going to compare some basic database operations between the 32-bit version of MySQL and the 64-bit version of Microsoft SQL Server 2014 (unfortunately, the MySQL MSI is only available in a 32-bit version).  I’ll be using Entity Framework for my basic CRUD operations.

Database Servers

Choosing a database server to build your software around is not very critical when you start a small project.  There are free versions of MS SQL, MySQL and Oracle that you can use.  There are also cheap versions that you can use with commercial projects until your software grows to enterprise level.  At the point where you need high availability, the licensing costs of these servers can be very critical to the cost of doing business.  Unfortunately, there is no easy way to convert between these database engines.  I have converted between Oracle and MS SQL Server in a rare setup involving no stored procedures.  Once a critical mass of stored procedures has been written, the game is over.  You’re locked in.  So keep this in mind when you choose which database you use and how you construct your system.  You may be forced to change technologies in the future.

With that in mind, I want to test MySQL against MS SQL and see how they compare.  I will not be doing any stored procedures.

The Test Data

I’m going to do a very simple test to get a feel for the differences between these two databases.  First, I’m not going to use any indexes.  Second, I’m going to use only two simple tables: Department, with one field, and Person, with three fields.  I’ll do one test with a foreign key constraint and one test without.  I’ll add an index to the foreign key of the person table in each database.  Both sets of tests will use EF 6.0 and I’ll use the same table objects for both databases.  Only the context will be different.
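
To give an idea of the setup, here is a rough sketch of the shared POCOs and the two contexts.  The connection string names are assumptions, and the fluent mappings and MySQL provider registration are left out:

using System.Data.Entity;

public class Department
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class Person
{
    public int Id { get; set; }
    public string First { get; set; }
    public string Last { get; set; }
    public int Department { get; set; }
}

// Same POCOs, two contexts; only the connection string (and provider) differ.
public class MsSqlContext : DbContext
{
    public MsSqlContext() : base("name=MsSqlConnection") { }
    public DbSet<Department> Departments { get; set; }
    public DbSet<Person> People { get; set; }
}

public class MySqlContext : DbContext
{
    public MySqlContext() : base("name=MySqlConnection") { }
    public DbSet<Department> Departments { get; set; }
    public DbSet<Person> People { get; set; }
}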


 
The first set of numbers shows the fastest timing for each operation out of 5 runs with no foreign key applied to the person table.  The second set of data shows the same tests with a foreign key applied.  There is definitely an advantage to using MS SQL Server, especially for select operations.  I also want to note that I turned change tracking off for the select operations in both SQL Server and MySQL.

To create the tables in MySQL use the following SQL statement:
 
CREATE TABLE `department` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(50) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=30 DEFAULT CHARSET=utf8;

CREATE TABLE `person` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `first` varchar(50) DEFAULT NULL,
  `last` varchar(50) DEFAULT NULL,
  `department` int(11) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `ix_department` (`department`),
  CONSTRAINT `fk_department` FOREIGN KEY (`department`) REFERENCES `department` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=436451 DEFAULT CHARSET=utf8;

To create the tables in MS SQL:

 CREATE TABLE [dbo].[Department](
    [id] [int] IDENTITY(1,1) NOT NULL,
    [name] [varchar](50) NULL,
 CONSTRAINT [PK_department] PRIMARY KEY CLUSTERED
(
    [id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

CREATE TABLE [dbo].[Person](
    [id] [int] IDENTITY(1,1) NOT NULL,
    [first] [varchar](50) NULL,
    [last] [varchar](50) NULL,
    [department] [int] NOT NULL,
 CONSTRAINT [PK_person] PRIMARY KEY CLUSTERED
(
    [id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

GO

ALTER TABLE [dbo].[Person]  WITH CHECK ADD  CONSTRAINT [fk_department] FOREIGN KEY([department])
REFERENCES [dbo].[Department] ([id])
GO

ALTER TABLE [dbo].[Person] CHECK CONSTRAINT [fk_department]
GO

Conclusions

This is a very small test and should not be used as evidence for choosing one database server over another.  If you are making a decision on which server to use, I would recommend expanding the tables used in this sample to include other field types.  I would also recommend attempting other query types, such as outer joins and subqueries.  If you have a small application written to use one database server or the other (and you don’t have any stored procedures), then you can copy your app and tweak the context to point to the other database.  Then generate a new database, insert some test data and compare.  See how your application performs.

One other test I would perform is to setup an Oracle instance and see how it compares with these in a side-by-side test.  The performance difference could be huge, but I would prefer something real-world over any claims made by the companies promoting their products.


Where to Get the Source Code

As always, I have uploaded the source code to my GitHub account.  You can download it by clicking here.  Make sure you fix up any connection strings before running the tests.  The MySQL section was set up to require a password, so you’ll have to match that with the password you set on your data server.

The Cloud

This post is going to be a bit different from the usual.  I’m going to talk a little about cloud computing.  Specifically about Amazon’s cloud and what you should consider if you are thinking about using a cloud-based system.  This article will be geared more toward the business end of cloud computing, though I’m going to describe some technical details up front.

Some History

When the cloud was first offered, I was a bit skeptical about using it.  At the time I was working for a company that hosted its own equipment.  Internet connection bandwidth was not like it is today (I think we had a 5 megabit connection back then).  The cloud was new and virtualization was new and expensive.  There were a lot of questions about how to do it.  If I were to start up a new system for a company like that today, I’d recommend the cloud.

Any IT person who installs equipment for a data center today knows about cloud computing.  I have now worked for two companies that hosted their equipment at a major data center, using virtual hosts.  The advantages of hosting equipment at a data center versus providing your own facility are numerous.  Off the top of my head: cooling, backup power, data bandwidth to the internet, and physical security at the site.  The cloud provides additional benefits: pay for equipment as you need it, and avoid the delay required to order new equipment.

Amazon Web Services (AWS)

I examined a few different cloud services and I’ll blog about other services as I get time to gain some experience.  The reason I started with AWS is that they have a 1-year free trial.  That is marketing genius right there!  First, they encourage software developers to sign up and learn their system.  That gets their foot in the door of companies that might start using cloud computing and abandon their physical data centers, all because they have developers on staff who already know the technology.  Second, a year is a lot of time to experiment.  A person can get really good at understanding the services, or they can attempt to build a new product on the service to see how it operates.

I signed up, and it does require a credit card to complete the sign-up.  That sets off a few alarms in the back of my head because, technically, they could charge my card without me knowing it.  So the first thing I did was find out where I can review any charges.  I also noticed that there are warning messages that tell me when I’m attempting to set up a service that does not apply to the free tier (which means that I’ll get charged).  The great unknown is what happens if I accidentally get a flood of traffic to a test application that I’ve posted?  I guess I’ll find out, or hopefully not.

Anyway, here’s what the billing screen looks like:


This is accessible from the drop-down menu above with your name on the account (“Frank DeCaire” for my account).

There are a lot of services on AWS and their control panel is rather large:

Where to start?  I started with Elastic Beanstalk.  Amazon uses the word “Elastic” in all kinds of services.  At first, I thought it was just a cute word they used to describe their product, the way that Microsoft uses the word “Azure”.  Then I began to read some documents on their services: the word “Elastic” refers to the fact that you can program your cloud to provision new servers or tear down servers according to trigger points.  So you can bring more servers on line if your load becomes too high.  Conversely, you can automatically tear down servers if the load gets too low (so you don’t have to pay for servers you don’t need during low-volume times).  This is where the term “Elastic” comes in: the number of servers you apply to your product is elastic.

Back to Beanstalk.  The Elastic Beanstalk application has a web server and an optional database server.  So I clicked into the Beanstalk app and created an IIS server (there are several web server types to choose from).  Then I added a SQL Server Express database under RDS.  The database server required an id and password.  Once that was created, there is a configuration details screen that contains a URL under the section named Endpoint.  This is the connection URL that can be used by SQL Server Management Studio.  Once connected, I was able to manipulate SQL Server the same as a local instance.  I created tables and inserted data to make sure it worked.


IIS

The IIS server control panel looks like this:



You can click on the blue link to pop-up the website url that points to this web server (or server farm).  I have intentionally obscured the id by replacing it with “abcdefgh”, so the id above will not work.  You’ll need to create your own account and a random id will be generated for your own server.

Next, you need to download the toolkit for Visual Studio (click here).  I installed it on VS 2015, so I know it works on the newest version of Visual Studio.  I also tested it on VS 2013.  There are a few gotchas that I ran into.  First, I ran into an error when attempting to deploy to AWS.  The error I received was that the URL validation failed (“Error during URL validation; check URL and try again”).  This turned out to be a false error.  What I discovered was that there was a permissions problem with access to IIS.  This can be found in the Identity and Access Management (IAM) console.  I had a user created, but I did not assign a group to the user.  The IAM console is rather complex and requires some head-scratching.  Stack Overflow is where I found the best answer for troubleshooting this issue:

aws-error-error-during-url-validation-check-url-and-try-again

My next problem gave an error of “The type initializer for ‘Microsoft.Web.Deployment.DeploymentManager’ threw an exception.”, which was just as cryptic.  As it turned out, there are registry entries that SQL Server doesn’t remove when uninstalling older versions of SQL Server, and they interfere with the deployment software in Visual Studio.  The keys are:

HKLM\Software\Microsoft\IIS Extensions\msdeploy\3\extensibility
HKLM\Software\Wow6432Node\Microsoft\IIS Extensions\msdeploy\3\extensibility


They both should be removed.  I also found that information on Stack Overflow:

Web deployment task failed. (The type initializer for ‘Microsoft.Web.Deployment.DeploymentManager’ threw an exception.)

At that point I was able to deploy my application and get a “Hello World” program running.  Once this capability is in place you can focus on the development process and not deal with configuration details until you need more capabilities.


Real World Application

Now that I have the basics down, I still need to test some of the other features of AWS (like their EC2 virtual servers).  However, I have enough knowledge to actually use AWS for a production system.  If you’re analyzing this service as a migration target for an existing system, then there are a lot of things you still need to consider.  The first thing you’ll need to do is find out how much it’ll cost to store the amount of data that you already use.  How much web traffic are you generating?  How many servers do you currently use?  These all go into the cost equation.  When you compute those costs, they should be lower than what you are currently paying for your equipment, data connection and facility.  If not, then you should not move your system.

If you are contemplating a start-up, you’ll have other factors to consider.  First and foremost, assuming you haven’t created your software yet, you’ll need to decide which web platform and database engine you’ll use.  If you haven’t worked at a company that has a large database system, you might not realize how much licenses can cost when you need to scale out.  In the early stages of development, priority might be placed on how easy it is to get the site up and running.  This will haunt you in the long run if your user base grows.  I would seriously consider using free or open-source software where you can.  AWS has MySQL and Apache with Java, Python or PHP.  Ruby is another option.  If you lock yourself into IIS and SQL Server, you’ll need to pay the extra licensing fees when your application outgrows the Express edition.  Once you have created thousands of stored procedures in SQL, you’re locked in, with a re-development cost that is astronomical or license fees that are almost as bad.

Another factor to contemplate in a start-up is the cost of getting your business going.  If you have seed capital, then you’re probably set for a fixed period of time.  If you are doing this on your own, then you’re probably worried about how much it will cost until you get enough customers to cover your fees.  You’ll need to compute this information ahead of time.  You need to ask yourself: “How many paying customers do I need in order to break even?”  If you are providing a two-tier website that has a free component (which is a great way to hook people) and a paid component that has powerful features, you’ll need to figure out what the ratio of paid vs. free customers will be.  If you’re conservative with your figures, you’ll come out ahead.  I would start with a 5%/95% split and compute what you need.  That means you’ll need to pay for 100% of your customers’ data and bandwidth usage, but you’ll only collect money from the 5% that are paying.  If you plan to sell advertisements, you’ll need to factor that in as well.

Now you’re probably thinking “how do I know what these numbers are going to be?”  Well, that’s where this free AWS service is handy.  If you’re clever, you’ll get your application up and running before you sign up for AWS, or if your application is expected to be small and easy to build, you can build it directly on AWS.  When you’re ready to do some usage testing, you can put it on line and get it into the search engines.  At first you’ll end up with 100% free users.  Your traffic should increase.  You’ll have to take an educated guess at what to charge for the advanced features.  Too much, and nobody will see the value.  Too cheap and you’ll go broke.  The ideal price point would be something that seems cheap for what the customer receives, but enough to cover costs and earn a profit.  What that price point is, depends on what your application does.

AWS has a system for taking credit card information and keeping track of accounting information.  You’ll need this type of system in order to keep track of who has paid and what they have paid for.  This service is called DevPay.  The goal is to automate the process of collecting payment information, activating accounts and deactivating accounts.  That’s a task that can overwhelm a person in no time if your product becomes successful.  Here’s the basic information on DevPay:

What is Amazon DevPay?


Other Considerations

Once you launch your application and it becomes established, you’ll need to consider your growth rate.  If your income is large enough, you can plan for new versions of your software according to how many developers you can keep on staff or contract.  In the cloud scenario, there is no need to pay for office space.  Technically, you can run the entire operation from your home.  Avoid adding the cost of an expensive facility until you really need it.  

Keep your eyes open for other cloud providers.  Google or Microsoft (and others) can provide equivalent services.  If their pricing structure makes your product cheaper to operate, consider porting to their cloud.  If you keep this in mind while you’re small, you can keep your application in a format that can be re-deployed quickly.  If you build in too many Amazon-specific features, you might be stuck until you can redesign a feature (yes, I mentioned this fact right after I talked about DevPay in the previous paragraph).  Another option is to use a cloud-provider-specific feature only long enough to design your own provider-independent version.  In other words, use DevPay for your application until you can hire developers or put in the development time to write your own (or possibly use another 3rd-party product).  Always keep your application capable of being moved.  Otherwise, you’ll be hostage to a provider that someday may become hostile to your business.

Deployment tools are another feature you should get familiar with.  Automate your deployment as much as possible.  AWS has deployment tools that allow the developer to clone a production web server in isolation and to deploy a development version of your application for testing purposes.  If you need to do a lot of manual steps to get your application tested and deployed, you’ll be wasting valuable developer time.  Time that is very expensive.

Get familiar with the security features.  If you hire outside contractors to perform maintenance or development tasks, you’ll need to be able to shut off their accounts quickly if something goes wrong.  Make sure you understand what capabilities you are giving to another person.  Don’t allow a rogue programmer to put in back-doors and open holes to the internet that you don’t know exist.  Always monitor what is going on with your system.

I could go on all day, but at this point you should go to the AWS site and sign up for free usage.  Get some experience.  Click here.  When you get a “Hello World” program deployed and working, try some new features.  I would also recommend seeking out other cloud products from other vendors.  Google and Microsoft come to mind but there are others like AT&T, EMC, IBM, etc.

Lazy Loading Data

Summary

In this blog post I’m going to demonstrate how to lazy load your data using Entity Framework and SQL Server.  The purpose is to write an object that returns program settings stored in a table, but reads from the table only once during the lifetime of the program.

The Data

In a database named “sampledata” I’m going to create a table that contains true/false settings for any number of named settings.  This is a simple example for the purposes of showing how to lazy load, so I will be “pretending” that there is a purpose for these settings even though the demo program doesn’t use any of them.

You can use the following SQL script to generate a table to run this program:

CREATE TABLE [dbo].[ProgramSetting](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [Name] [varchar](50) NOT NULL,
    [Setting] [bit] NOT NULL,
 CONSTRAINT [PK_ProgramSetting] PRIMARY KEY CLUSTERED
(
    [Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

INSERT INTO ProgramSetting (Name,Setting) values ('RequirePassword',1)
INSERT INTO ProgramSetting (Name,Setting) values ('ShowUserName',0)
INSERT INTO ProgramSetting (Name,Setting) values ('ShowUserPicture',0)
INSERT INTO ProgramSetting (Name,Setting) values ('AllowUserToChangePicture',1)

The Class
 
Now, I’m going to design an object that will be used throughout a program to read settings from the database and return the true/false value.  Here’s the basic class:

public class ConfigurationSettings
{
    private static List<string> _Data = null;
    private static List<string> Data
    {
        get
        {
            if (_Data == null)
            {
                using (var db = new ConfigDataContext())
                {
                    _Data = (from ps in db.ProgramSettings
                             where ps.Setting
                             select ps.Name).ToList();
                }
            }
            return _Data;
        }
    }

    public static bool RequirePassword
    {
        get
        {
            return Data.Contains("RequirePassword");
        }
    }

    public static bool ShowUserName
    {
        get
        {
            return Data.Contains("ShowUserName");
        }
    }

    public static bool ShowUserPicture
    {
        get
        {
            return Data.Contains("ShowUserPicture");
        }
    }

    public static bool AllowUserToChangePicture
    {
        get
        {
            return Data.Contains("AllowUserToChangePicture");
        }
    }
}

Now, if you look at the code carefully, you’ll see that there is a property that gets the raw data using LINQ and Entity Framework.  You can download the sample code to see the code for the context and the table.  For this blog post, all you need to know is that all the data from the ProgramSettings table is read at one time.  It will occur the first time a configuration parameter is read.  Why?  Because Data.Contains() calls the Data getter, which checks to see if the private static field named _Data is null.  The first time a parameter is read, this will be null and the data will be read into _Data.  The next parameter that is read will cause _Data to be returned without any call to the database.

The list that is maintained in the _Data variable is just a list of the names of configuration parameters that are set to true.  If a parameter is missing or set to false, then we return false to the calling program.  This can be convenient for new features that can be added to the code first.  They’ll be false until you put parameters in the database, and you won’t be stuck with a chicken-or-the-egg problem of when to insert the data into the ProgramSetting table.

Obviously, this works for data that is mostly static.  If you include this code in a stand-alone program and you change a configuration setting in your database, you’ll need to exit from any previously running programs before the changes will take effect.

Another possible problem involves the fact that your configuration parameter names are hard-coded as well as stored in the database.  That means that changing a name in the database, adding a name or deleting a name requires matching code changes.  This can get tedious if the list of configuration parameters is large.  You could write a program to generate the properties and just regenerate the parameter list every time you make changes to the database and then update your software.  You can also opt for a read method that takes the parameter name as a string.  Most likely, you’ll be making changes to your code anyway if you’re adding, deleting or modifying a configuration parameter, so it makes no real difference.
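
A sketch of that string-based approach, which trades compile-time checking for less boilerplate, could be as simple as:

    public static bool IsEnabled(string parameterName)
    {
        // Unknown or disabled parameters come back as false, just like the properties above.
        return Data.Contains(parameterName);
    }

    // Usage: if (ConfigurationSettings.IsEnabled("RequirePassword")) { ... }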

How to Get the Code

You can download the sample code at my GitHub account by clicking here.  Be sure to go to the context source file and change the database connection string to match your own SQL server.

Entity Framework 6.1.3 vs. Linq2SQL vs. NHibernate Smackdown!

Summary

I’ve done these ad-hoc ORM performance tests in the past.  In this test I’m going to re-test the latest Entity Framework (version 6.1.3) against the latest Linq-to-SQL version.  I’ll be using EF code-first this time to make sure there isn’t any extra overhead.  I’ll also be using Visual Studio 2013.

The Hardware and Software

This test was executed on an older PC with an Intel Core2 Duo CPU running at 3Ghz (E8400).  I’m using 8 Gigabytes of Ram and a solid-state hard drive.  This machine has Windows 10 (64 bit) and I’m using Visual Studio 2013 and Microsoft SQL Server 2014.

The Tests

I ran the tests 5 times each and chose the shortest time for each section (insert, update, delete and select).  I also added the NHibernate ORM just to make this a 3-way smack-down.  I performed the basic 4 CRUD operations (insert, update, select, delete) and compared results for each operation.

The Results

It appears that EF has been improved since the last time I’ve performed this test.  Entity Framework seems to be a bit faster than Linq-to-SQL this time.  However, NHibernate still outperforms both Entity Framework and Linq-to-SQL by a large margin (between 2x and almost 3x).

Here are the results:



Usual Disclaimer

I have to warn you that this experiment is unscientific.  I’m using a PC that I use for general purpose day-to-day operations.  What this means is that I have a virus checker and a backup program that can kick in at any time.  Not to mention the fact that I have automatic updates set to on.  To make this experiment more accurate, I would need to setup a stripped down PC and run comparison tests with no other programs running.  In other words, your results may vary.

So my advice is to download the source from this blog post, compile and run this test on your machine or the expected destination setup (i.e. against your company SQL server).  You can also increase the size of the data set to simulate your expected data volume.


Download the Code

You can go to my GitHub account by clicking here.  Download the zip file and unzip it to your visual studio project directory.  Don’t forget to search the code for the connection string (“My Connection String”) and replace with your own connection string.  This occurs in 3 files: SampleDataContext.cs, App.config and NHibernateSessionFactory.cs.

Data Caching with Redis and C#

Summary

Caching is a very large subject and I’m only going to dive into one small concept: using Redis, an abstract caching class, and a dummy caching class for unit testing, and showing how to put it all together into a simple but powerful caching system.

Caching Basics

The type of caching I’m talking about in this blog post involves the caching of data from a database that is queried repetitively.  The idea is that you would write a query to read your data and return the results to a web site, or an application that is under heavy load.  The data being queried might be something that is used as a look-up, or maybe it’s a sales advertisement that everybody visiting your website is going to see.  

The data request should check whether the results are already in the cache first.  If they are not, it should read the data from the database, copy it to the cache and then return the results.  After the first time this data is queried, the results will be in the cache and all subsequent queries will retrieve the results from the cache.  One trick to note is that the cache key name needs to be unique to the data set being cached, otherwise you’ll get a conflict.

Redis

Redis is free, powerful, and there is a lot of information available about this caching system.  Normally, you’ll install Redis on a Linux machine and then connect to that machine from your website software.  For testing purposes, you can use the Windows version of Redis by downloading this package at GitHub (click here).  Once you download the Visual Studio project, you can build it and there should be a directory named “x64”.  You can also download the MSI file from here, then install and run it directly.

Once the Redis server is up and running you can download the StackExchange.Redis client library for C#.  You’ll need to use “localhost:6379” for your connection string (assuming you left the default port of 6379 when you installed Redis).
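
A minimal connection sketch using the StackExchange.Redis client looks like this; the abortConnect=false option keeps the application from throwing at startup if the server is unreachable, which ties into the fall-back behavior discussed below:

using StackExchange.Redis;

// Connect once and reuse; the multiplexer is designed to be shared across the application.
ConnectionMultiplexer redis = ConnectionMultiplexer.Connect("localhost:6379,abortConnect=false");
IDatabase db = redis.GetDatabase();

db.StringSet("greeting", "hello");
string value = db.StringGet("greeting");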


Caching System Considerations

First, we want to be able to unit test our code without the unit tests connecting to Redis.  So we’ll need to be able to run a dummy Redis cache when we’re unit testing any method that includes a call to caching.

Second, we’ll need to make sure that if the Redis server fails, we can still run our program.  The program will hit the database every time and everything will run slower than with Redis running (otherwise, what’s the point), but it should keep working.

Third, we should abstract our caching class so that we can design another class that uses a different caching system besides Redis.  An example Windows caching system we could use instead is Memcached.

Last, we should use delegates to feed the query or method call into the cache Get method; then we can use the Get method like a wrapper around our existing query code.  This is really convenient if we are adding caching to a legacy system, since we can leave the queries in place and just add the cache Get wrappers.


CacheProvider Class

The CacheProvider class will be an abstract class set up as a singleton, with the instance pointing to the default caching implementation, in this case the RedisCache class (which I haven’t talked about yet).  The reason for this convoluted setup is that we will use the CacheProvider class inside our program and ignore the instance creation.  This causes the CacheProvider to use the RedisCache implementation.  For unit tests, we’ll override the CacheProvider instance in the unit test with the BlankCache class (which I also haven’t talked about yet).

Here’s the CacheProvider code:

public abstract class CacheProvider
{
    public static CacheProvider _instance;
    public static CacheProvider Instance
    {
        get
        {
            if (_instance == null)
            {
                _instance = new RedisCache();
            }
            return _instance;
        }
        set { _instance = value; }
    }

    public abstract T Get<T>(string keyName);
    public abstract T Get<T>(string keyName, Func<T> queryFunction);
    public abstract void Set(string keyName, object data);
    public abstract void Delete(string keyName);
}

I’ve provided methods to save data to the cache (Set), read data directly from the cache (Get) and a delete method to remove an item from the cache (Delete).  I’m only going to talk about the Get method that involves the delegate called “queryFunction”.


RedisCache Class

There is a link to download the full sample at the end of this blog post, but I’m going to show some sample snippets here.  The first is the Redis implementation of the Get method.  First, you’ll need to add the StackExchange.Redis client using NuGet.  Then you can connect to the Redis server and read/write values.

The Get method looks like this:

public override T Get<T>(string keyName, Func<T> queryFunction)
{
    byte[] data = null;

    if (redis != null)
    {
        data = db.StringGet(keyName);
    }

    if (data == null)
    {
        var result = queryFunction();

        if (redis != null)
        {
            db.StringSet(keyName, Serialize(result));
        }

        return result;
    }

    return Deserialize<T>(data);
}

The first thing that happens is that StringGet() is called.  This is the Redis client read method, and it is only called if the redis variable is not null.  The redis variable holds the connection multiplexer created when the instance is first constructed; if that connection fails, then all calls to Redis will be skipped.

After the attempt to read from Redis is made, the variable named data is checked for null.  If the read from Redis succeeded, there should be something in "data", which is deserialized and returned.  If data is null, the value is not cached and we need to call the delegate function to get results from the database and save them in the cache.

The call to StringSet() is where the results of the delegate are saved to the Redis server.  In this instance, the delegate returns the results we want (already as objects, not serialized data), so we need to serialize them when we send them to Redis, but we can return the delegate result to the caller as-is.

The last return is the one that occurs if we were able to get the results from Redis in the first place.  If both the Redis and the database servers are down, this method will fail, but the app will probably fail anyway.  You could add try/catch blocks to handle cases where the delegate fails, assuming your application can recover when the data doesn't come back from the database server and isn't already cached.
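
Although I'm only walking through Get in detail, the Set and Delete overrides declared on CacheProvider are much simpler.  The real versions are in the sample download; here is a sketch of what the Redis-backed implementations might look like (treat the details as assumptions):

public override void Set(string keyName, object data)
{
    if (redis != null)
    {
        // Serialize the object and store it under the key name
        db.StringSet(keyName, Serialize(data));
    }
}

public override void Delete(string keyName)
{
    if (redis != null)
    {
        // Remove the key from the Redis cache
        db.KeyDelete(keyName);
    }
}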

You can look at the serialize and deserialize methods in the sample code.  In this instance I serialized the data into a binary format.  You can also serialize to JSON if you prefer.  Just replace the serialize and deserialize methods with your own code.
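
As a rough guide, a binary Serialize/Deserialize pair might look like the sketch below.  This assumes (based on how both RedisCache and BlankCache call them) that the methods live on the CacheProvider base class and that the cached types are binary-serializable; the actual methods are in the sample download:

// Assumes: using System.IO; and using System.Runtime.Serialization.Formatters.Binary;
protected byte[] Serialize(object data)
{
    var formatter = new BinaryFormatter();
    using (var stream = new MemoryStream())
    {
        // Write the object graph into a memory stream and return the raw bytes
        formatter.Serialize(stream, data);
        return stream.ToArray();
    }
}

protected T Deserialize<T>(byte[] data)
{
    var formatter = new BinaryFormatter();
    using (var stream = new MemoryStream(data))
    {
        // Rebuild the object graph from the cached bytes
        return (T)formatter.Deserialize(stream);
    }
}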


Using the RedisCache Get Method

There are two general ways to use the Get method: Generic or Strict.  Here’s the Generic method:

var tempData = CacheProvider.Instance.Get("SavedQuery", () =>
{
    using (var db = new SampleDataContext())
    {
        return (from r in db.Rooms select r).ToList();
    }
});



For strict:

for (int i = 0; i < iterations; i++)
{
    List<Room> tempData = CacheProvider.Instance.Get<List<Room>>("SavedQuery2", () =>
    {
        using (var db = new SampleDataContext())
        {
            return (from r in db.Rooms select r).ToList();
        }
    });
}


In these examples you can see the LINQ query wrapped in a using statement that creates the database context.  This sample was coded with Entity Framework 6 using Code-First.  The query is wrapped in a delegate using the "() => {}" syntax.  You can do this with any queries you already have in place; just make sure the result set is returned from the delegate.  The tempData variable will contain the results of your query.


Using the BlankCache Class

There are two different ways you could implement a dummy cache class.  In one method, you would provide a Get() method that skips the caching part and always returns the result of the delegate.  You would be implementing an always-miss cache class.
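
A minimal sketch of that always-miss variant might look like this (the class name AlwaysMissCache is mine for illustration; the sample download only includes the dictionary-based BlankCache):

public class AlwaysMissCache : CacheProvider
{
    // Every call is treated as a cache miss: run the delegate and return its result untouched
    public override T Get<T>(string keyName, Func<T> queryFunction)
    {
        return queryFunction();
    }

    public override T Get<T>(string keyName)
    {
        return default(T);
    }

    public override void Set(string keyName, object data) { }
    public override void Delete(string keyName) { }
}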

The other method is to simulate a caching system by using a dictionary to store the cached data, which is how the BlankCache class mimics the Redis cache without a connection.  In this implementation we're making sure our code under test behaves properly when a cache system exists; we're not concerned about speed per se.  This method could have a negative side effect if your results are rather large, but for unit testing purposes you should not be working with large result sets.

In either BlankCache implementation, we are not testing the caching system.  The purpose is to use this class for unit testing other objects in the system.

A snippet of the BlankCache class is shown here:

public class BlankCache : CacheProvider
{
    // This is a fake caching system used to fake out unit tests
    private Dictionary<string, byte[]> _localStore = new Dictionary<string, byte[]>();

    public override T Get<T>(string keyName, Func<T> queryFunction)
    {
        if (_localStore.ContainsKey(keyName))
        {
            return Deserialize<T>(_localStore[keyName]);
        }
        else
        {
            var result = queryFunction();
            _localStore[keyName] = Serialize(result);
            return result;
        }
    }
}

As you can see, I used a dictionary to store byte[] data and used the same serialize and deserialize methods that I used with Redis.  I also simplified the Get method, since I know that I will always get a connection to the fake cache system (aka the Dictionary).

When using the CacheProvider from a unit test you can use this syntax:

CacheProvider.Instance = new BlankCache();

That will cause the singleton instance to point to the BlankCache class instead of Redis.
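
For example, a hypothetical MSTest fixture (the test class, method, and assertion below are made up for illustration and are not part of the sample download) could swap in the fake cache like this:

using System.Collections.Generic;
using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class CacheConsumerTests
{
    [TestInitialize]
    public void Setup()
    {
        // Point the singleton at the fake cache so the test never touches Redis
        CacheProvider.Instance = new BlankCache();
    }

    [TestMethod]
    public void Get_ReturnsDelegateResult_WithoutRedis()
    {
        var rooms = CacheProvider.Instance.Get<List<Room>>("Rooms", () => new List<Room> { new Room() });

        Assert.AreEqual(1, rooms.Count);
    }
}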


Getting the Sample Code

You can download the sample code from my GitHub account by clicking here.  Make sure you search for "<your data server name here>" and replace it with the name of your SQL Server instance (this is in the TestRedis project, under the DAL folder, inside the SampleDataContext.cs file).

If you didn't create the ApiUniversity demo database from any previous blog posts, you can create an empty database, then run the SQL code from the CreateApiUniversityDatabaseObjects.sql file included in the Visual Studio project.  The sample code was written in VS2013, but it should work in VS2012 as well.






Exporting XML From MS SQL Server

Summary

This is going to be a short post on how to query an MS SQL Server database and produce XML output.

The Query

There are two formats you can use when you want to convert a query result into XML: an element format and an attribute format.  The following query will select from a table named "product" and put each column value into its own element:

SELECT * FROM product FOR XML RAW ('product'), ROOT('data'), ELEMENTS

The results window will look something like this:

Click on the link in the results window and the output should look like this:

<data>
  <product>
    <ProductId>1</ProductId>
    <ProductType>1</ProductType>
    <Name>Matchbox Car</Name>
    <store>1</store>
  </product>
  <product>
    <ProductId>2</ProductId>
    <ProductType>1</ProductType>
    <Name>Baby Rattle</Name>
    <store>1</store>
  </product>
  <product>
    <ProductId>3</ProductId>
    <ProductType>1</ProductType>
    <Name>Lego Bricks</Name>
    <store>1</store>
  </product>
  <product>
    <ProductId>4</ProductId>
    <ProductType>2</ProductType>
    <Name>Steak</Name>
    <store>1</store>
  </product>
</data>

If you want to flatten out your XML, you can use this syntax:

SELECT * FROM product FOR XML RAW ('product')

Which will produce this output:

<product ProductId="1" ProductType="1" Name="Matchbox Car" store="1" />
<product ProductId="2" ProductType="1" Name="Baby Rattle" store="1" />
<product ProductId="3" ProductType="1" Name="Lego Bricks" store="1" />
<product ProductId="4" ProductType="2" Name="Steak" store="1" />

As you can see from the last example, you can leave the ROOT keyword off the query if you don't want the root data tags added.

Producing JSON

If you need to produce JSON, you will need to do it using another tool.  Use the FOR XML RAW command above, including the ROOT tag (the output must be well-formed).  Then go to this site (or any other free XML-to-JSON converter website): thomasfrank.se XML to JSON – a converter

Paste your XML into the window and hit the convert button.


Bulk Data Inserts

Summary

In this blog post I'm going to demonstrate how to do a bulk data insert using ADO.  I'll be adding code to my set of helper tools so that you can use the ADO database code at my GitHub account to write clean ADO database methods.

Bulk Inserts

The C# SqlClient library contains a class called SqlBulkCopy, which is described in this blog post: Bulk Insert into SQL Server using SqlBulkCopy in C#

Populating a DataTable starts with defining its columns, and each column definition normally takes several lines of repetitive code like this:

DataColumn dateColumn = new DataColumn();
dateColumn.DataType = Type.GetType("System.DateTime");
dateColumn.ColumnName = "SaleDate";
prodSalesData.Columns.Add(dateColumn);

I wanted to contain this in one helper method to avoid a lot of repetition, so I created this static class with an extension method to handle it:

public static class DataSetHelpers
{
    public static void AddDataColumn(this DataTable dataTable, string fieldName, string dataType)
    {
        DataColumn dataColumn = new DataColumn();
        dataColumn.DataType = Type.GetType(dataType);
        dataColumn.ColumnName = fieldName;
        dataTable.Columns.Add(dataColumn);
    }
}

Inside my ADODatabaseContext object, I added this method:

public void BulkInsert(DataTable dataTable)
{
    using (SqlBulkCopy s = new SqlBulkCopy(_db))
    {
        s.DestinationTableName = dataTable.TableName;
        s.BulkCopyTimeout = CommandTimeOut;

        foreach (var column in dataTable.Columns)
        {
            s.ColumnMappings.Add(column.ToString(), column.ToString());
        }

        s.WriteToServer(dataTable);
    }
}

Now the bulk insert consists of two parts: the DataTable definition and the DataTable population.  You could create a custom method in your code to populate each row with one method call, but that method would be specific to the table you are populating, so I'm going to leave that part expanded here (see the sketch after this class).  Here's the object I used to bulk insert into a table named "Store":

public class CreateStoreRecords
{
    private DataTable StoreData;

    private void SetupDataTable()
    {
        StoreData = new DataTable("Store");

        StoreData.AddDataColumn("Name", "System.String");
        StoreData.AddDataColumn("Address", "System.String");
        StoreData.AddDataColumn("City", "System.String");
        StoreData.AddDataColumn("State", "System.String");
        StoreData.AddDataColumn("Zip", "System.String");
    }

    public void InsertRecords()
    {
        SetupDataTable();

        using (var db = new ADODatabaseContext("Server=sqlserver;Integrated Security=True"))
        {
            DataRow StoreList = StoreData.NewRow();
            StoreList["Name"] = "Toys R Us";
            StoreList["Address"] = "123 Main St";
            StoreList["City"] = "Chicago";
            StoreList["State"] = "IL";
            StoreList["Zip"] = "12345";
            StoreData.Rows.Add(StoreList);

            StoreList = StoreData.NewRow();
            StoreList["Name"] = "Target";
            StoreList["Address"] = "5th Ave";
            StoreList["City"] = "New York";
            StoreList["State"] = "NY";
            StoreList["Zip"] = "33333";
            StoreData.Rows.Add(StoreList);

            db.BulkInsert(StoreData);
        }
    }
}
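
As mentioned above, if you'd rather populate each row with a single method call, a small helper specific to the Store table could be added to CreateStoreRecords.  This is just a sketch (AddStoreRow is a name I made up; it's not in the sample code):

private void AddStoreRow(string name, string address, string city, string state, string zip)
{
    // Build one row from the parameters and add it to the in-memory table
    DataRow row = StoreData.NewRow();
    row["Name"] = name;
    row["Address"] = address;
    row["City"] = city;
    row["State"] = state;
    row["Zip"] = zip;
    StoreData.Rows.Add(row);
}

With that in place, the two row blocks inside InsertRecords would collapse to:

AddStoreRow("Toys R Us", "123 Main St", "Chicago", "IL", "12345");
AddStoreRow("Target", "5th Ave", "New York", "NY", "33333");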

If you want to create the store table in MS SQL Server to match my sample code, then you can use this code in a query window to generate it:

CREATE TABLE [dbo].[Store](
    [id] [int] IDENTITY(1,1) NOT NULL,
    [Name] [varchar](50) NULL,
    [Address] [varchar](50) NULL,
    [City] [varchar](50) NULL,
    [State] [varchar](50) NULL,
    [Zip] [varchar](50) NULL,
 CONSTRAINT [PK_Store] PRIMARY KEY CLUSTERED
(
    [id] ASC
) WITH (PAD_INDEX = OFF,
        STATISTICS_NORECOMPUTE = OFF,
        IGNORE_DUP_KEY = OFF,
        ALLOW_ROW_LOCKS = ON,
        ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

You’ll have to replace the connection string to match your server and access rights (assuming you’re not using integrated security).
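
For example, a SQL-authentication connection string (the server, database, user, and password below are placeholders, not values from the sample) would look something like this:

Server=MyServerName;Database=MyDatabase;User Id=MyUser;Password=MyPassword;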

Where to Get The Code

You can download the sample code from above at my GitHub account by clicking here.