Data Caching with Redis and C#

Summary

Caching is a very large subject and I’m only going to dive into a small concept that uses Redis, an abstract caching class, and a dummy caching class for unit testing and show how to put it all together for a simple but powerful caching system.

Caching Basics

The type of caching I’m talking about in this blog post involves the caching of data from a database that is queried repetitively.  The idea is that you would write a query to read your data and return the results to a web site, or an application that is under heavy load.  The data being queried might be something that is used as a look-up, or maybe it’s a sales advertisement that everybody visiting your website is going to see.  

The data request should check to see if the data results are already in the cache first.  If they are not, then read the data from the database and copy to the cache and then return the results.  After the first time this data is queried, the results will be in the cache and all subsequent queries will retreive the results from the cache.  One trick to note is that the cached results name needs to be unique to the data set being cached, otherwise, you’ll get a conflict.

Redis

Redis is free, powerful and there is a lot of information about this caching system.  Normally, you’ll install Redis on a Linux machine and then connect to that machine from your website software.  For testing purposes, you can use the windows version of Redis, by downloading this package at GitHub (click here).   Once you download the Visual Studio project, you can build the project and there should be a directory named “x64”.  You can also download the MSI file from here.  Then you can install and run it directly.

Once the Redis server is up and running you can download the stack exchange redis client software for C#.  You’ll need to use “localhost:6379” for your connection string (assuming you left the default port of 6379, when you installed Redis).


Caching System Considerations

First, we want to be able to unit test our code without the unit tests connecting to Redis.  So we’ll need to be able to run a dummy Redis cache when we’re unit testing any method that includes a call to caching.

Second we’ll need to make sure that if the Redis server fails, then we can still run our program.  The program will hit the database every time and everything should run slower than with Redis running (otherwise, what’s the point), but it should run.

Third, we should abstract our caching class so that we can design another class that uses a different caching system besides Redis.  An example Windows caching system we could use instead is Memcached.

Last, we should use delegates to feed the query or method call to the cache get method, then we can use our get method like it’s a wrapper around our existing query code.  This is really convenient if we are adding caching to a legacy system, since we can leave the queries in place and just add the cache get wrappers.


CacheProvider Class

The CacheProvider class will be an abstract class that will be setup as a singleton pattern with the instance pointing to the default caching system.  In this case the RedisCache class (which I haven’t talked about yet).  The reason for this convoluted setup, is that we will use the CacheProvider class inside our program and ignore the instance creation.  This will cause the CacheProvider to use the RedisCache class implementation.  For unit tests, we’ll override the CacheProvider instance in the unit test using the BlankCache class (which I have not talked about yet).

Here’s the CacheProvider code:

public abstract class CacheProvider
{
    public static CacheProvider _instance;
    public static CacheProvider Instance
    {
        get
        {
            if (_instance == null)
            {
                _instance = new RedisCache();
            }
            return _instance;
        }
        set { _instance = value; }
    }

    public abstract T Get<T>(string keyName);
    public abstract T Get<T>(string keyName, Func<T> queryFunction);
    public abstract void Set(string keyName, object data);
    public abstract void Delete(string keyName);
}

I’ve provided methods to save data to the cache (Set), read data directly from the cache (Get) and a delete method to remove an item from the cache (Delete).  I’m only going to talk about the Get method that involves the delegate called “queryFunction”.


RedisCache Class

There is a link to download the full sample at the end of this blog post, but I’m going to show some sample snippets here.  The first is the Redis implementation of the Get.  First, you’ll need to add the Stack Exchange Redis client using NuGet.  Then you can use the connect to the Redis server and read/write values.

The Get method looks like this:

public override T Get<T>(string keyName, Func<T> queryFunction)
{
    byte[] data = null;

    if (redis != null)
    {
        data = db.StringGet(keyName);
    }

    if (data == null)
    {
        var result = queryFunction();

        if (redis != null)
        {
            db.StringSet(keyName, Serialize(result));
        }

        return result;
    }

    return Deserialize<T>(data);
}

The first thing that happens is the StringGet() method is called.  This is the Redis client read method.  This will only occur if the value of redis is not equal to null.  The redis value is the connection multiplexer connection that occurs when the instance is first created.  If this fails, then all calls to Redis will be skipped.

After an attempt to read from Redis is made, then the variable named data is checked for null.  If the read from Redis is successful, then there should be something in “data” and that will need to be deserialized and returned.  If this is null, then the data is not cached and we need to call the delegate function to get results from the database and save that in the cache.

The call to StringSet() is where the results of the delegate are saved to the Redis server.  In this instanced, the delegate is going to return the results we want (already deserialized).  So we need to serialize it when we send it to Redis, but we can just return the results from the delegate result.

The last return is the return that will occur if we were able to get the results from Redis in the first place.  If both the Redis and the database servers are down, then this method will fail, but the app will probably fail anyway.  You could include try/catch blocks to handle instances where the delegate fails, assuming you can recover in your application if your data doesn’t come back from the database server and it’s not cached already.

You can look at the serialize and deserialize methods in the sample code.  In this instanced I serialized the data into a binary format.  You can also serialize to JSON if you prefer.  Just replace the serialize and deserialize methods with your own code.


Using the RedisCache Get Method

There are two general ways to use the Get method: Generic or Strict.  Here’s the Generic method:

var tempData = CacheProvider.Instance.Get(“SavedQuery“, () =>
{
    using (var db = new SampleDataContext())
    {
        return (from r in db.Rooms select r).ToList();
    }
});



For strict:

for (int i = 0; i < iternations; i++)
{
    List<Room> tempData = CacheProvider.Instance.Get<List<Room>>(“SavedQuery2“, () =>
    {
        using (var db = new SampleDataContext())
        {
            return (from r in db.Rooms select r).ToList();
        }
    });
}


In these examples you can see the LINQ query with a generic database using statement wrapping the query.  This sample was coded in Entity Framework 6 using Code-First.  The query is wrapped in a function wrapper using the “() => {}” syntax.  You can do this with any queries that you already have in place and just make sure the result set is set to return from this.  The tempData variable will contain the results of your query.


Using the BlankCache Class

There are two different ways you could implement a dummy cache class.  In one method, you would provide a Get() method that skips the caching part and always returns the result of the delegate.  You would be implementing an always miss cache class.  

The other method is to simulate a caching system by using a dictionary object to store the cached data and implement the BlankCache class to mimic the Redis server cache without a connection.  In this implementation we making sure our code under test will behave properly if a cache system exists and we’re not concerned about speed per-say.  This method could have a negative side-effect if your results are rather large, but for unit testing purposes you should not be accessing large results.

In either BlankCache implementation, we are not testing the caching system.  The purpose is to use this class for unit testing other objects in the system.

A snippet of the BlankCache class is shown here:

public class BlankCache : CacheProvider
{
    // This is a fake caching system used to fake out unit tests
    private Dictionary<string, byte[]> _localStore = new Dictionary<string, byte[]>();

    public override T Get<T>(string keyName, Func<T> queryFunction)
    {
        if (_localStore.ContainsKey(keyName))
        {
            return Deserialize<T>(_localStore[keyName]);
        }
        else
        {
            var result = queryFunction();
            _localStore[keyName] = Serialize(result);
            return result;
        }
    }
}

As you can see, I used a dictionary to store byte[] data and used the same serialize and deserialize methods that I used with Redis.  I also simplified the Get method, since I know that I will always get a connection to the fake cache system (aka the Dictionary).

When using the CacheProvider from a unit test you can use this syntax:

CacheProvider.Instance = new BlankCache();

That will cause the singleton instance to point to the BlankCache class instead of Redis.


Getting the Sample Code

You can download the sample code from my GitHub account by clicking here.  Make sure you search for “<your data server name here>” and replace with the name of your SQL server database name (this is in the TestRedis project under the DAL folder inside the SampleDataContext.cs file).

If you didn’t create the ApiUniversity demo database from any previous blog posts, you can create an empty database, then copy the sql code from the CreateApiUniversityDatabaseObjects.sql file included in the Visual Studio code.  The sample code was written in VS2013, but it should work in VS2012 as well.





 

GPX (GPS Exchange Format) Files

Hiking

So one of the ways I pass the time is by hiking.  Of course, I’m also a data junkie, so I like to blog about my hiking experience (frankdecairehiking.blogspot.com) and I like to track information about the hike.  My wife invested in a data watch made by Garmin (the Forerunner 15).  It has a little GPS receiver built in and a USB port that allows the user to track their position and then download the data.  In the software that is used to download and synchronize the watch is an export function that will export to many types of files.  The GPX file is the most useful for me since it is an XML output of the GPS coordinates.

Here’s an example of actual data contained in the file:

As you can see there is an element called trkpt which is a track point and contains longitude and latitude information.  The elevation information is not provided by the watch, but the software inserts it from a database of elevation coordinates for each location on the Earth.  Google Maps can also insert this data when you import it into Google Maps.  Did I mention that you can upload this to Google Maps.  Oh yeah.  It’s sweet.

Before I describe what Google Maps can do with this data, let me also mention that the time stamp and other information can appear in this data set.  So in the above data, you can see that my wife was standing at GPS position 38.76018354669213,-78.28429399989545 (lat,long) at 6:03 pm zulu time (or 2:03 pm EST) on August 9th, 2015.


Google Maps

There are a lot of applications that can accept the GPX files, but Google Maps is rather nice for hiking maps.  Why?  Because Google Maps provides a Terrain map that shows the elevation information.  To create a custom map, you can sign into Google (assuming you have an account).  Then click on the triple line icon to the left of the search text box.  This will open a menu and you can select “My Maps”.  A drop-down will appear below the search text box and you can click the “Create” button (it has a pencil icon).  Then click the “Import” link that will appear, and a drag and drop screen appears:


Just drag your GPX file into this rectangle and Google Maps will import your data.

Next, you’ll want to change your base map.  There’s a drop-down arrow to the left of the “Base Map” text.  Click that and select which map type you want.  For hiking in the mountains it’s best to use the Terrain map.  You can also edit the map title, the names of the start and end points, etc.  When you have completed your map, you can make it visible to those who poses the link.  Like this: https://www.google.com/maps/d/edit?mid=z5denw_V-Nlo.kpnvxWGfEG6I

That’s a link to this map:


Another trick is to edit the data points.  You can delete as many trkpt elements as you’d like.  This is handy for cleaning up any double-lines, or if you wandered off the trail and you don’t want to show that data.  This also means that you can add points.  If you wanted to extend your trail to end at a point that you didn’t actually record, you can get the point on Google Maps by right-clicking at the location you want and select the “Where’s here?” menu item.  You’ll see a set of GPS coordinates below the search box.  Copy these into your xml and fix-up the tags.  Be aware that the points reported by Google Maps is in latitude then longitude.  So don’t mix up the first coordinate with the second coordinate, or your point might end up in Antarctica.

If you’re looking for the official GPX schema, click here.

If you’re looking for the official documentation, click here.


 

EF Code-First Stored Procedures With Parameters

Summary

In an earlier post I showed how to create a method that can access a stored procedure using Entity Framework Code-First.  In this post I’m going to show how to pass parameters and show a few tricks.

The Stored Procedure

First, you’ll need to create a new stored procedure in your APIUniversity database (see this post for the table definition and other stored procedure).

CREATE PROCEDURE [dbo].[select_one_room]
    @roomnum AS int
AS
BEGIN
    SELECT
id,name FROM Room WHERE roomnumber = @roomnum
END

New Method

Now create a new method named GetRoom:

public DbRawSqlQuery<RoomListResult> GetRoom(int roomNumber)
{
    var roomNumberParam = new SqlParameter(“roomnum“, roomNumber);

    return this.Database.SqlQuery<RoomListResult>(“EXEC select_one_room @roomnum“, roomNumberParam);
}

 First, you’ll notice that I added a SqlParameter for the parameter that will be fed into the SqlQuery call.  Second, you might notice that I had to put in the “EXEC” command in the stored procedure string.  Apparently, this can be left off, if you’re stored procedure doesn’t contain any parameters, but will complain with an error “Incorrect syntax near ‘select_one_room'” if it is not used in this case.

Make sure you put commas between multiple parameters.  You can also leave off the “@” symbol in the parameter name when used in the SqlParameter() method, but you’ll need the “@” symbol in the SqlQuery() method string.

To use this stored procedure, your query might look something like this:

using (var db = new MyContext())
{
    var results = db.GetRoom(1);

    foreach (var item in results)
    {
        Console.WriteLine(item.Name);
    }

    Console.ReadKey();
}

Bug in the Return Class Column Attribute

For tables, you can rename fields by using an attribute to declare what the column name is inside the database.  Here’s an example where I changed the “Room” class so that the “Name” field is now “RoomName”, but the field in the database remains “Name”:

[Table(“Room“)]
public class Room
{
    [Key]
    public int id { get; set; }

    [Column(“Name“)]
    public string RoomName { get; set; }
    public int RoomNumber { get; set; }
}

This works as expected.

Now, if you try this trick on the “RoomListResult” class, you’ll discover that the attribute is ignored.  This is a known bug (maybe sometime in the future I can amend this post if the bug gets fixed).  Your result will return null for any fields that do not match the field name returned in the stored procedure result.

 Where to Get the Sample Code

I have posted the sample code on my GitHub account.  You can download it by clicking here.  Be sure to search for “My Connection String” and replace with your own database connection string.

 

 

EF Code-First Stored Procedures

Summary

In this post I’m going to show the very basics of calling a stored procedure from Entity Framework using the Code-First technique.

The Database Setup

First, I’m going to create a database named “APIUniversity”.  If you’ve tried out the API projects from my previous posts, you probably already have this database in place and you can just use that.  Here’s the “Room” table:

CREATE TABLE [dbo].[Room](
    [id] [int] IDENTITY(1,1) NOT NULL,
    [Name] [varchar](50) NULL,
    [RoomNumber] [int] NULL,
 CONSTRAINT [PK_Room] PRIMARY KEY CLUSTERED
(
    [id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

Some data you can use to populate the table:

insert into room (name,roomnumber) values (‘Gym‘,1)
insert into room (name,roomnumber) values (‘Math Room‘,100)
insert into room (name,roomnumber) values (‘Chemistry Lab‘,101)
insert into room (name,roomnumber) values (‘Lecture 1‘,102)
insert into room (name,roomnumber) values (‘Lecture 2‘,103)
insert into room (name,roomnumber) values (‘Art Room‘,104)


And finally, here’s the SQL script to create the stored procedure that we’ll use:

CREATE PROCEDURE [dbo].[select_rooms]
AS
BEGIN

    SELECT id,name FROM Room
END


First, we’ll need a class to define the record that will be returned from this stored procedure:

public class RoomListResult
{
    public int id { get; set; }
    public string Name { get; set; }
}

You don’t need any “usings” for this code. 

Now you’ll need to add your stored procedure read method to your context:

using System;
using System.Data.Entity;
using System.Data.Entity.Infrastructure;

namespace CodeFirstSample
{
    public class MyContext : DbContext, IDisposable
    {
        public MyContext()
            : base(“My Connection String“)
        {

        }

        public DbRawSqlQuery<RoomListResult> RoomList()
        {
            return this.Database.SqlQuery<RoomListResult>(“select_rooms“);
        }
    }
}

Technically, you can perform your stored procedure access from within your linq query, but this method keeps your code clean when you start to add parameters to stored procedures.

Here’s the code you can test in your console application to see the results:

class Program
{
    static void Main(string[] args)
    {
        using (var db = new MyContext())
        {
            var results = db.RoomList();

            foreach (var item in results)
            {
                Console.WriteLine(item.Name);
            }

            Console.ReadKey();
        }
    }
}

You should see a list of room names like this:



I didn’t cover how to handle parameters. I’m saving that for a later post.

The Code

You can download the entire project at my GitHub account by clicking here.  Included in the project is the SQL script you’ll need to generate the table, stored procedure and populate your table with the seed data.

 

Entity Framework Code-First and Code Only

Summary

In this blog post I’m going to explain the basics of Entity Framework code-first.  I’ll show how to setup your context so that a database is not generated.

Code-First / Code-Only

If your database is already in place, then the most obvious method of using EF is to use Database First.  This is where you start a new EDMX file and generate the code that matches your database.  The problem with database first is that there are script files (tt files) based on the xml master file (edmx) that automatically generate your table and context code.  If you make any database changes, you have to run the update on the edmx file and force the script files to regenerate your code.  This can be a headache if you are deploying to a database that has a slightly different design than your test database.  It can also be a serious problem when you are using version control since two different versions of the edmx file are impossible to reconcile.  If you’re unsure what I’m talking about, right-click on the edmx file in your project and open as an xml file.  Scroll through and observe how the data is stored.

In code-first, you define your context and tables manually in code.  Then your database is generated when you run it the first time.  This guarantees that your database is defined the match your EF code.  The issue with this scheme is that you might already have a database designed and you might want more flexibility in your table definitions in code.  Once instance is a situation where you might design a module (dll) that is used for a specific purpose in your program and only needs to query a few fields from a hand-full of tables.  With code-first you can define exactly which fields you want to read and ignore any extra fields.  However, if your database gets generated from the minimal context, you’ll end up with a database that doesn’t contain all the fields you need.  This is where you need to disable the auto-generation part of code-first and turn your ORM into Code-Only.

Here’s an example of a code-first database with one simple table (you’ll need to add NuGet EF 6 to your project):

MyContext.cs file:

using System;
using System.Data.Entity;

namespace CodeFirstSample
{
    public class MyContext : DbContext, IDisposable
    {
        public MyContext()
            : base(“My Connection String“)
        {

        }

        // define tables here
        public DbSet<Room> Rooms { get; set; }
    }
}

Room.cs file:

using System.ComponentModel.DataAnnotations;
using System.ComponentModel.DataAnnotations.Schema;

namespace CodeFirstSample
{
    [Table(“Room“)]
    public class Room
    {
        [Key]
        public int id { get; set; }
        public string Name { get; set; }
        public int RoomNumber { get; set; }
    }
}

In the above example one table named “Room” will get created in a database named according to the connection string.  If you wish to prevent database creation from happening you can alter the context constructor like this:

public class MyContext : DbContext, IDisposable
{
    public MyContext()
        : base(“My Connection String“)
    {
        Database.SetInitializer<MyContext>(null);
    }

    // define tables here
    public DbSet<Room> Rooms { get; set; }
}

This will cause the context to skip the database generator.

When you’re laying out your code for your database it’s standard practice to create a sub-directory in your project and name it something like DAL (for data access layer) or similar.  Then create your context cs file and all your table cs files in that directory.  You can put all your tables in one file, but it’s easier to find code if your put one file per table and name the file the name of the table.

It’s also standard practice to pluralize the table name and leave the record instance or class name non-plural.  In the example above the record is “Room” and the table is “Rooms”.  If your table is named “Rooms” in your database, then you can alter the table attribute to match ([Table(“Rooms“)]).

The connection string can be a string, a variable or it can be read from your app.config or web.config file using the ConfigurationManager object like so:

private string ConnectionString = ConfigurationManager.ConnectionStrings[“My Connection“].ToString();

You’ll have to add a reference to System.Configuration and put in a using for System.Configuration in order to use the ConfigurationManager object.