
Working on a legacy system definitely has its challenges.

For example, it’s very common for there to be large amounts of important business logic encapsulated in the structures and features of the database. Usually this takes the form of things like stored procedures and functions, default values and triggers, and when you put them all together, they can provide a surprising amount of functionality for older applications.

While not ideal by today’s standards, this sort of approach is not necessarily terrible. If the pattern was followed consistently, at least all of the logic is in the same place, and legacy apps tend to have these magical global database connections anyway, so you can always get to the DB whenever you need to.

That is, until you start adding additional functionality in a more recent programming language, and you want to follow good development practices.

Like automated tests.

What The EF

If you’re using Entity Framework on top of an already existing database and you have stored procedures that you want (need?) to leverage, you have a few options.

The first is to simply include the stored procedure or function when you use the Entity Data Model Wizard in Visual Studio. This will create a function on the DbContext to call the stored procedure, and map the result set into some set of entities. If you need to change the entity return type, you can do that too, all you have to do is make sure the property names line up. This approach is useful when your stored procedures represent business logic, like calculations or projections.
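To make that concrete, the wizard-generated code ends up looking something like the following sketch. The context name, the stored procedure and the result entity here are all hypothetical, but the shape (an ObjectParameter per procedure parameter, and a call to ExecuteFunction) is roughly what you can expect to find in the generated context.

using System.Data.Entity;
using System.Data.Entity.Core.Objects;
using System.Data.Entity.Infrastructure;

public class OutstandingBalanceResult
{
    public int PropertyId { get; set; }
    public decimal Amount { get; set; }
}

public partial class LegacyDbContext : DbContext
{
    // Generated function import for a hypothetical dbo.CalculateOutstandingBalance procedure.
    public virtual ObjectResult<OutstandingBalanceResult> CalculateOutstandingBalance(int? propertyId)
    {
        var propertyIdParameter = propertyId.HasValue
            ? new ObjectParameter("PropertyId", propertyId)
            : new ObjectParameter("PropertyId", typeof(int));

        return ((IObjectContextAdapter)this).ObjectContext
            .ExecuteFunction<OutstandingBalanceResult>("CalculateOutstandingBalance", propertyIdParameter);
    }
}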

If the stored procedures in the database represent custom insert/update/delete functionality, then you can simply map the entity in question to its stored procedures. The default mapping statement will attempt to line everything up using a few naming conventions, but you also have the ability to override that behaviour and specify procedures and functions as necessary.
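If you’re using the Code First fluent API against the existing database, the mapping looks something like this sketch (the entity, procedure and parameter names are made up for illustration); calling MapToStoredProcedures with no arguments just relies on the default naming conventions instead.

protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
    modelBuilder.Entity<Invoice>()
        .MapToStoredProcedures(s => s
            .Insert(i => i.HasName("usp_Invoice_Insert").Parameter(p => p.Total, "total"))
            .Update(u => u.HasName("usp_Invoice_Update"))
            .Delete(d => d.HasName("usp_Invoice_Delete")));
}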

If you don’t want to encapsulate the usage of the stored procedures, you can also just use the SqlQuery and ExecuteSqlCommandAsync functions available on the DbContext.Database property, but that requires you to repeat the usage of magic strings (the stored procedure and function names) whenever you want to execute the functionality, so I don’t recommend it.
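For completeness, that direct approach looks something like the sketch below (the procedure and result names are hypothetical); note the magic strings that end up sprinkled wherever the calls are made.

// Somewhere deep in the application, with db being a DbContext instance.
// SqlParameter comes from System.Data.SqlClient.
var balances = db.Database
    .SqlQuery<OutstandingBalanceResult>(
        "EXEC dbo.CalculateOutstandingBalance @PropertyId",
        new SqlParameter("@PropertyId", propertyId))
    .ToList();

await db.Database.ExecuteSqlCommandAsync(
    "EXEC dbo.RecalculateBalances @AsAt",
    new SqlParameter("@AsAt", DateTime.UtcNow));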

So, in summary, it’s all very possible, and it will all work, up until you want to test your code using an in-memory database.

Which is something we do all the time. 

In Loving Memory

To prevent us from having to take a direct dependency on the DbContext, we lean towards using factories.

There are a few reasons for this, but the main one is that it makes it far easier to reason about DbContext scope (you make a context, you destroy a context) and to limit potential concurrency issues within the DbContext itself. Our general approach is to have one factory for connecting to a real database (i.e. ExistingLegacyDatabaseDbContextFactory) and then another for testing (like an InMemoryDbContextFactory, using Effort). They both share an interface (usually just the IDbContextFactory<TContext> interface), which is taken as a dependency as necessary, and the correct factory is injected whenever the object graph is resolved using our IoC container of choice.
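As a rough sketch of what that arrangement looks like (the LegacyDbContext name and its constructors are assumptions for illustration), the two factories end up being very small:

using System.Data.Entity.Infrastructure;

public class ExistingLegacyDatabaseDbContextFactory : IDbContextFactory<LegacyDbContext>
{
    public LegacyDbContext Create()
    {
        // Connection string for the real database comes from configuration.
        return new LegacyDbContext("name=LegacyDatabase");
    }
}

public class InMemoryDbContextFactory : IDbContextFactory<LegacyDbContext>
{
    public LegacyDbContext Create()
    {
        // Effort supplies a DbConnection backed entirely by memory.
        var connection = Effort.DbConnectionFactory.CreateTransient();
        return new LegacyDbContext(connection, contextOwnsConnection: true);
    }
}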

Long story short, we’re still using the same DbContext, we just have different ways of creating it, giving us full control over the underlying provider at the dependency injection level.

When we want to use an in-memory database, Effort will create the appropriate structures for us using the entity mappings provided, but it can’t create the stored procedures because it doesn’t know anything about them (except maybe their names). Therefore, if we use any of the approaches I’ve outlined above, the in-memory database will be fundamentally broken depending on which bits you want to use.

This is one of the ways that Entity Framework and its database providers are something of a leaky abstraction, but that is a topic for another day.

This is pretty terrible for testing purposes, because sometimes the code will work, and sometimes it won’t.

But what else can we do?

Abstract Art

This is one of those nice cases where an abstraction actually comes to the rescue, instead of just making everything one level removed from what you care about and ten times harder to understand.

Each stored procedure and function can easily have an interface created for it, as they all take some set of parameters and return either nothing or some set of results.

We can then have two implementations, one which uses a database connection to execute the stored procedure/function directly, and another which replicates the same functionality through Linq or something similar (i.e. using the DbContext). We bind the interface to the first implementation when we’re running on top of a real database, and to the DbContext specific implementation when we’re not. If a function calls another function in the database, you can replicate the same approach by specifying the function as a dependency on the Linq implementation, which works rather nicely.
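A minimal sketch of what that looks like for a single hypothetical calculation follows (the procedure, the Charges set and the result are all made up; the real ones obviously depend on the legacy database):

public interface ICalculateOutstandingBalance
{
    decimal Execute(int propertyId);
}

// Executes the legacy stored procedure directly. Bound when running against a real database.
public class StoredProcedureCalculateOutstandingBalance : ICalculateOutstandingBalance
{
    private readonly IDbContextFactory<LegacyDbContext> _factory;

    public StoredProcedureCalculateOutstandingBalance(IDbContextFactory<LegacyDbContext> factory)
    {
        _factory = factory;
    }

    public decimal Execute(int propertyId)
    {
        using (var db = _factory.Create())
        {
            // The magic string lives in exactly one place.
            return db.Database
                .SqlQuery<decimal>(
                    "EXEC dbo.CalculateOutstandingBalance @PropertyId",
                    new SqlParameter("@PropertyId", propertyId))
                .Single();
        }
    }
}

// Replicates the same logic via Linq against the entities. Bound when running in-memory.
public class LinqCalculateOutstandingBalance : ICalculateOutstandingBalance
{
    private readonly IDbContextFactory<LegacyDbContext> _factory;

    public LinqCalculateOutstandingBalance(IDbContextFactory<LegacyDbContext> factory)
    {
        _factory = factory;
    }

    public decimal Execute(int propertyId)
    {
        using (var db = _factory.Create())
        {
            // Hypothetical Charges set; the point is that this works on any provider, including Effort.
            return db.Charges
                .Where(c => c.PropertyId == propertyId && !c.Paid)
                .Sum(c => c.Amount);
        }
    }
}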

Of course, this whole song and dance still leaves us in a situation where the tests might do different things because there is no guarantee that the Linq based stored procedure implementation is the same as the one programmed into SQL Server.

So we write tests that compare the results returned from both for identical inputs, trusting the legacy implementation when differences are discovered.
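Those comparison tests are nothing fancy; the following NUnit-style sketch gives the idea (the seeding helper is hypothetical, and the test is intended to run against a real database connection):

[Test]
public void LinqImplementationMatchesStoredProcedureForIdenticalInputs()
{
    var factory = new ExistingLegacyDatabaseDbContextFactory();

    // Hypothetical helper that seeds a property with a known set of charges and returns its id.
    var propertyId = SeedPropertyWithCharges(factory);

    var fromProcedure = new StoredProcedureCalculateOutstandingBalance(factory).Execute(propertyId);
    var fromLinq = new LinqCalculateOutstandingBalance(factory).Execute(propertyId);

    // The legacy implementation is the source of truth, so the Linq version must agree with it.
    fromLinq.Should().Be(fromProcedure);
}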

Why bother at all though? I mean after everything is said and done, you now have two implementations to maintain instead of one, and more complexity to boot.

Other than the obvious case of “now we can write tests on an in-memory database that leverage stored procedures”, there are a few other factors in favour of this approach:

  • With a good abstraction in place, it’s more obvious what is taking a dependency on the stored procedures in the database
  • With a solid Linq based implementation of the stored procedure, we can think about retiring them altogether, putting the logic where it belongs (in the domain)
  • We gain large amounts of knowledge around the legacy stored procedures while building and testing the replacement, which makes them less mysterious and dangerous
  • We have established a strong pattern for how to get at some of the older functionality from our new and shiny code, leaving less room for sloppy implementations

So from my point of view, the benefits outweigh the costs.

Conclusion

When trying to leverage stored procedures and functions programmed into a database, I recommend creating interfaces to abstract their usage. You are then free to provide implementations of said interfaces based on the underlying database provider, which feels a lot more flexible than just lumping the function execution into whatever structures EF provides for that purpose. The approach does end up adding some additional complexity and effort, but the ability to ensure that tests can run without requiring a real database (which is slow and painful) is valuable enough, even if you ignore the other benefits.

Caveat: the approach probably wouldn’t work as well if there isn’t a good dependency injection setup in place, but the general concept is sound regardless.

To echo my opening statement, working with legacy code definitely has its own unique set of challenges. It’s nice in that way though, because solving those challenges can really make you think about how to provide a good solution within the boundaries and limitations that have already been established.

Like playing a game with a challenge mode enabled, except you get paid at the end.


We have the following simple entities, representing a fairly basic property management system.

public class Contact
{
    public int Id { get; set; }
}

public class ContactLink
{
    public int Id { get; set; }
    public int ContactId { get; set; }
    public string Type { get; set; }
    public int LinkedId { get; set; }
}

public class Owner
{
    public int Id { get; set; }
}

public class Tenant
{
    public int Id { get; set; }
    public int? PropertyId { get; set; }
}

public class Property
{
    public int Id { get; set; }
    public int? OwnerId { get; set; }
}

Each Property has zero or one Owners, but an Owner can be linked to multiple Contacts.

A Property can be Tenanted, and each Tenant can be linked to multiple Contacts.

The ContactLink class represents multiple arbitrary links to Contacts, and is how both the Owner and Tenant to Contact relationships are specified.

This model was not created with Entity Framework in mind, which unfortunately is something that you have to deal with when working on a legacy system. The EF model was put in place after many years of the database being the closest thing the software had to a domain model. The models that I’ve listed above are obviously not complete. They contain many more fields than those displayed, and have fields for segmentation because the database is multi-tenant. Anything not specified here is not useful for the purposes of illustrating the point of this post.

We need to answer a relatively simple question.

“Show me the Properties that are related to a Contact”

The goal is to answer that question while leaving as much of the work as possible up to the database. So ideally one call and minimal post processing.

Give Me All The Relationships

First things first, a test to ensure that our query works the way we expect it to. Make a database, seed it, run the query, validate the right number of results. Very straightforward.

public void WhenAContactIsRelatedToPropertiesThroughBothOwnerAndTenantRelationships_TheQueryReturnsTheCorrectNumberOfRows()
{
    using (var db = GetDbContext())
    {
        var contactId = RandomId();

        // Seed some data into the database. We use Builder objects and extension methods on DataSets to do
        // seeding, allowing us to fill a previously empty database with all the information we need for a test.
        // Specifically for this test, seed:
        // 1.) 1 Contact with the specified Id.
        // 2.) 1 Owner.
        // 3.) 2 Properties. One owned by the Owner that was just created.
        // 4.) 1 Tenant, for the property with no Owner.
        // 5.) 2 ContactLinks, linking the Contact to the Owner and Tenant.

        var target = new QueryEncapsulation();

        var results = target.Execute(db, contactId);

        results.Count().Should().Be(2);
    }
}

With the test written, all that’s left is to implement the query itself.

public class QueryEncapsulation
{
    public IEnumerable<LinkedProperty> Execute(DbContext db, int specifiedId)
    {
        return (from c in db.Contacts
            .Where(c => c.Id == specifiedId)
        from cl in db.ContactLinks
            .Where(cl => cl.ContactId == c.Id)
        from o in db.Owners
            .Where(o => o.Id == cl.LinkedId && cl.Type == "owner")
            .DefaultIfEmpty()
        from t in db.Tenants
            .Where(t => t.Id == cl.LinkedId && cl.Type == "tenant")
            .DefaultIfEmpty()
        from p in db.Properties
            .Where(p => p.OwnerId == o.Id || t.PropertyId == p.Id)
        select new LinkedProperty(o, t, p))
        .ToArray();
    }
}

Not super complicated, in fact it looks fairly similar to the query that you might put together in SQL. Join a bunch of tables together and then filter them as appropriate so only the valid links are included.

The usage of the DefaultIfEmpty method is the mechanism by which you accomplish Left Outer Joins in Linq. This is important because even though each Owner and Tenant map to a single property, as a result of the way the relationships are structured, and the direction we are starting from (the Contact), each “row” returned by this query will only contain either an Owner or a Tenant, never both. Without the DefaultIfEmpty, the query wouldn’t return anything.

Of course, the test fails.

Failure Is Just A Stepping Stone To Greatness

Instead of getting 2 results (one Property linked via the Owner and one Property linked via the Tenant) the query returns 3 results.

The Property linked to the Tenant is coming back twice. Once for the correct link (to the Tenant) and then again for the Owner, which is nonsense.

As far as I understand, this combination is still included because a completely unspecified Owner can be linked to a null Owner Id on the Property.

Intuitively, p.OwnerId == o.Id feels like it should throw a NullReferenceException when o is null. I’m still not exactly sure I understand why this is a valid result, but I know if I make a similar query in SQL using Left Outer Joins, that it will work the same way. Essentially I’m not being specific enough in my exclusion criteria and null is indeed equal to null.

The fix?

Add a check on the Property condition that checks for Owner being null.

The final query we’re left with is:

public class QueryEncapsulation
{
    public IEnumerable<LinkedProperty> Execute(DbContext db, int specifiedId)
    {
        return (from c in db.Contacts
            .Where(c => c.Id == specifiedId)
        from cl in db.ContactLinks
            .Where(cl => cl.ContactId == c.Id)
        from o in db.Owners
            .Where(o => o.Id == cl.LinkedId && cl.Type == "owner")
            .DefaultIfEmpty()
        from t in db.Tenants
            .Where(t => t.Id == cl.LinkedId && cl.Type == "tenant")
            .DefaultIfEmpty()
        from p in db.Properties
            .Where(p => (o != null && p.OwnerId == o.Id) || t.PropertyId == p.Id)
        select new LinkedProperty(o, t, p))
        .ToArray();
    }
}

Summary

Honestly, I didn’t expect this sort of behaviour from EF, but that’s not surprising. EF is an incredibly complicated beast and I’m barely familiar with the shallow end.

To try and understand the results, I had to build up the set by hand and account for all of the potential combinations, combined with whether or not it made sense for them to be included in the final results. Doing this helped to identify the fact that fields on what should be a null object created via DefaultIfEmpty seem to be nullable even though the underlying type on the model is not.

An interesting finding at least, and a good example of how sometimes it’s important to understand your queries conceptually.

Especially if they are giving weird results.


Update: I wrote the code for the seeder below outside of a development environment. It doesn’t work. I’ll try to revisit this post at a later date if I get a chance to implement a seeder class, but I’m currently using the Seed extension methods to great effect. My apologies to anyone who finds this post and is surprised when it doesn’t work.

Entity Framework and ORM’s in general have come a long way. Entity Framework in particular is pretty amazing now, compared to where it was 5 years ago. It was around then that my team made the decision to use NHibernate as our ORM, instead of EF. EF has obviously matured a lot since then, and seems to be the default choice now when working in the .NET world.

I’ve made a couple of posts on this blog involving Entity Framework and some of my adventures with it, one on creating test databases leveraging a scratch MSSQL instance in AWS and another on using a different EF provider to allow for in-memory databases.

One of the great things about working with ORMs is that your persisted data is just objects, which means you have far more control over it than you ever did before. No need to use SQL statements (or similar) to build up some test data, just create some objects, insert them and off you go.

This post is going to talk about a mechanism for creating those objects, specifically about the concept of seeding.

Also, obviously, all of the sub-titles will be puns based on seeds and seeding.

Just Tossing My Seed Around

Most of the time I see seeding functions built into the DbContext class. They are typically executed whenever the context is created, making sure that certain data is available.

To me this is a violation of the Single Responsibility Principle, because now you have a class that is responsible for both managing data access and for putting some subset of the data there in the first place. While this alone is definitely a good reason to have a dedicated seeder class, there are others as well:

  • If you have a hardcoded seed method inside your DbContext, it’s much harder to customise it to seed different data based on your current needs.
  • Commonly, seed implementations inside the DbContext are wasteful, always trying to seed into the database whenever you create a new DbContext. I’m in favour of using a DbContextFactory and creating a DbContext per operation (or at least per request), which can make the time spent dealing with seeding significant.

I find that the best way to think about seeding is to use the specification pattern (or at least the concept). You want to be able to create an object that describes how you want your data to look (or what capabilities you want your data to have), and then execute it. Let the object sort out the seeding as its dedicated function.

This works fairly well. You define a Seeder or DataSpecification class, and expose appropriate properties and methods on it to describe the data (like how many DbFoo entries do I want, how many DbBar entries, what they look like, etc). You implement a method that takes a DbContext of the appropriate type, and in that method you use the information supplied to create and save the appropriate entities.

If you follow this approach, you find that your Seeder can become very complicated very quickly, especially because its entire purpose is to be highly configurable. It’s also responsible for knowing how to construct many different varieties of objects, which is another violation of SRP.

I find SRP to be a pretty good guideline for handling class complexity. If you think about the responsibilities that your class has, and it has more than a few, then those responsibilities either need to be very tightly coupled, such that you couldn’t reasonably pull them apart, or you should really consider having more than one class. The downside of SRP is that you tend to have quite a lot of small classes, which is another form of complexity. The upside is that you have a lot of small, composable, modular classes, which are extremely useful once you get over that initial complexity bump for having many many classes.

Ready For Round Two

I didn’t like that my Seeder class had detailed knowledge about how to construct the various entities available from the DbContext. Plus it was huge and hard to understand at a glance.

The next step was to split the logic for how to create an entity into classes dedicated to that. I tend to use the naming convention of XBuilder for this purpose, and they all look very similar:

using System;

public interface IEntityBuilder<TEntity>
{
    TEntity Build();
}

public class DbFoo
{
    public int Id { get; set; }
    public string Do { get; set; }
    public string Re { get; set; }
}

public class DbFooBuilder : IEntityBuilder<DbFoo>
{
    private string _Do = "bananas";
    private string _Re = "purple";

    public DbFooBuilder WithDo(string v)
    {
        _Do = v;
        return this;
    }

    public DbFoo Build()
    {
        return new DbFoo()
        {
            Do = _Do,
            Re = _Re
        };
    }
}

As you can see, the builder features somewhat fluent syntax (the WithX methods) allowing you to chain calls to customise the constructed entity, but has sane defaults for all of the various properties that matter.

The Faker.Net package is handy here, for generating company names, streets, etc. You can also simply generate random strings for whatever properties require it, but it’s generally much better to generate real looking data than completely nonsensical data.
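As a sketch (the DbBar entity is made up, and the exact Faker.Net members shown are from memory, so treat them as placeholders for whatever the package actually exposes):

public class DbBar
{
    public int Id { get; set; }
    public string CompanyName { get; set; }
    public string Street { get; set; }
}

public class DbBarBuilder : IEntityBuilder<DbBar>
{
    // Realistic looking defaults. The specific Faker.Net calls here are assumptions; check the
    // package's actual API before relying on them.
    private string _CompanyName = Faker.Company.Name();
    private string _Street = Faker.Address.StreetAddress();

    public DbBarBuilder WithCompanyName(string v)
    {
        _CompanyName = v;
        return this;
    }

    public DbBar Build()
    {
        return new DbBar { CompanyName = _CompanyName, Street = _Street };
    }
}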

With the addition of dedicated builders for entities, the Seeder looks a lot better, being mostly dedicated to the concept of “how many” of the various entities. It could be improved though, because it’s difficult to use the Seeder to specify a subset of entities that meet certain criteria (like generate 10 DbFoo’s with their Do property set to “bananas”, and 200 where it’s set to “apples”).

We can fix that by providing some additional methods on the Seeder that allow you to customise the builders being used, instead of just letting the Seeder create X number of them to fulfil its “number of entities” requirement.

public class Seeder
{
    private readonly List<IEntityBuilder<DbFoo>> FooBuilders = new List<IEntityBuilder<DbFoo>>();

    public Seeder WithDbFoos<TEntityBuilder>(IEnumerable<TEntityBuilder> builders)
        where TEntityBuilder : IEntityBuilder<DbFoo>
    {
        FooBuilders.AddRange(builders.Cast<IEntityBuilder<DbFoo>>());

        return this;
    }

    public Seeder WithDbFoos<TEntityBuilder>(int number, Func<TEntityBuilder, TEntityBuilder> customise = null)
        where TEntityBuilder : IEntityBuilder<DbFoo>, new()
    {
        customise = customise ?? (b => b);
        var builders = Enumerable.Range(0, number).Select(i => customise(new TEntityBuilder()));
        return WithDbFoos(builders);
    }

    public void Seed(DbContext db)
    {
        foreach (var builder in FooBuilders)
        {
            db.Set<DbFoo>().Add(builder.Build());
        }
        db.SaveChanges();
    }
}

Much better and extremely flexible.
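Usage then reads almost like the requirement itself; something like this sketch (GetDbContext being whatever mechanism hands you a context):

var seeder = new Seeder()
    .WithDbFoos<DbFooBuilder>(10, b => b.WithDo("bananas"))
    .WithDbFoos<DbFooBuilder>(200, b => b.WithDo("apples"));

using (var db = GetDbContext())
{
    seeder.Seed(db);
}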

Bad Seed

I actually didn’t quite implement the Seeder as specified above, though I think it’s definitely a better model, and I will be implementing it in the near future.

Instead I implemented a series of builders for each of the entities I was interested in (just like above), and then wrote a generic Seed extension method for IDbSet:

using System;
using System.Collections.Generic;
using System.Data.Entity;
using System.Linq;

namespace Solavirum.Database.EF
{
    public interface IEntityBuilder<TEntity>
    {
        TEntity Build();
    }

    public static class SeedExtensions
    {
        private static Random _random = new Random();

        public static void Seed<TEntity, TBuilder>(this IDbSet<TEntity> set, Func<TBuilder, TBuilder> modifications = null, int number = 10)
            where TEntity : class
            where TBuilder : IEntityBuilder<TEntity>, new()
        {
            modifications = modifications ?? (a => a);

            for (int i = 0; i < number; i++)
            {
                var builder = new TBuilder();
                builder = modifications(builder);
                set.Add(builder.Build());
            }
        }

        public static T Random<T>(this IEnumerable<T> enumerable)
        {
            int index = _random.Next(0, enumerable.Count());
            return enumerable.ElementAt(index);
        }
    }
}

This is nice from a usability point of view, because I can seed any entity that has an appropriate builder, just by using the one method. The Random<T> method exists so I can get a random element out of a DbSet for linking purposes, if I need to (it was used in a method that I removed, dealing specifically with an entity with links to other entities).
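Usage looks something like the following sketch (assuming the context exposes an IDbSet<DbFoo> called DbFoos):

using (var db = GetDbContext())
{
    // Ten DbFoos with default builder values, plus a customised batch.
    db.DbFoos.Seed<DbFoo, DbFooBuilder>();
    db.DbFoos.Seed<DbFoo, DbFooBuilder>(b => b.WithDo("apples"), number: 200);

    db.SaveChanges();
}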

What I don’t like about it:

  • It’s difficult to supply dependencies to the seed method (unless you expose them in the method signature itself) because it’s inside a static class. This means supplying a logger of some description is hard.
  • The builders have to have parameterless constructors, again because it’s hard to supply dependencies. This isn’t so bad, because the builders are meant to be simple and easy to use, with sane default values.
  • Builders with dependencies on other entities (like a hypothetical DbFooBar class that has references to both a DbFoo and a DbBar) have to have their own Seed method in order to use entities that exist in the current DbContext. This isn’t a dealbreaker, but it does complicate things.

I think a well constructed Seeder class better encapsulates the concept, even though it’s nice to be able to just hit up a Seed method right off the IDbSet and have it all just work.

Conclusion

Being able to easily create data that meets certain criteria is an amazing tool when it comes to development and testing. Doing it in a provider agnostic way is even better, because you can push data into an in-memory database, an SQL database or a Postgres database, using the same code. In my opinion, the ability to sub out providers is one of the best parts of using an ORM (that and not having to deal with query/data manipulation languages directly).

Nothing that I’ve written above is particularly ground breaking, but it’s still very useful, and I highly recommend following the general strategy when working with a persistence layer via Entity Framework.

I hope you enjoyed all the terrible seed puns.

I regret nothing.


Object Relational Mappers (ORMs) are useful tools. If you don’t want to have to worry about writing the interactions to a persistence layer yourself, they are generally a good idea. Extremely powerful, they let you focus on describing the data that you want, rather than manually hacking out (and then maintaining!) the queries yourself, and help with change tracking, transactional constructs and other things.

Testing code that uses an ORM, however, is typically a pain. At least in my experience.

People typically respond to this pain by abstracting the usage of their ORM away, by introducing repositories or some other persistence strategy pattern. They use the ORM inside the repository, and then use the more easily mocked repository everywhere else, where it can be substituted with a much smaller amount of effort. There are other benefits to this approach, including the ability to model the domain more accurately (can abstract away the persistence structure) and the ability to switch out the ORM you use for some other persistence strategy without having to make lots of changes. Possibly.

The downside of creating an abstraction like the above is that you lose a lot of ORM specific functionality, which can be quite powerful. One of the most useful features of ORMs in C# is being able to write Linq queries directly against the persistence layer. Doing this allows for all sorts of great things, like only selecting the properties you want and farming out as much of the work as possible to the persistence layer (maybe an SQL database), rather than doing primitive queries and putting it all together in memory. If you do want to leverage that power, you are forced to either make your abstraction leaky, exposing bits of the ORM through it (which makes mocking it harder), or you have to write the needed functionality again yourself, except into your interface, which is duplicate work.

Both of those approaches (expose the ORM, write an abstraction layer) have their upsides and downsides so like everything in software it comes down to picking the best solution for your specific environment.

In the past, I’ve advocated creating an abstraction to help isolate the persistence strategy, usually using the Repository pattern. These days I’m not so sure that is the best way to go about it though, as I like to keep my code as simple as possible and the downsides of introducing another layer (with home grown functionality similar to but not quite the same as the ORM) have started to wear on me more and more.

EF You

I’ve recently started working on an application that uses Entity Framework 6, which is a new experience for me, as all of my prior experience with ORMs was via NHibernate, and to be brutally honest, there wasn’t much of it.

Alas, this application does not have very many tests, which is something that I hate, so I have been trying to introduce tests into the codebase as I add functionality and fix bugs.

I’m going to assume at this point that everyone who has ever done any programming and writes tests has tried to add tests into a codebase after the fact. It’s hard. It’s really hard. You have to resist the urge to rebuild everything, and try to find ways to add testability to an architecture that was never intended to be testable, without making too many changes, or people start to get uncomfortable.

I understand that discomfort. I mean that's one of the biggest reasons you have tests in the first place, so you can make changes without having to worry about breaking stuff. Without those tests, refactoring to introduce tests is viewed as a risky activity, especially when you first start doing it.

Anyway, I wanted to write an integration test for a particular piece of new functionality, to verify that everything worked end to end. I’ve written about what I consider an integration test before, but in essence it is any test that involves multiple components working together. These sorts of tests are usually executed with as many things configured and setup the same as the actual application, with some difficult or slow components that sit right at the boundaries being substituted. Persistence layers (i.e. databases) are a good thing to substitute, as well as non-local services, because they are slow (compared to in memory) and usually hard to setup or configure.

In my case I needed to find a way to remove the dependency on an external database, as well as a number of services. The services would be easy, because it’s relatively trivial to introduce an interface to encapsulate the required behaviour from a service, and then provide an implementation just for testing.

The persistence layer though…

This particular application does NOT use the abstraction strategy that I mentioned earlier. It simply exposes the ability to get a DbContext whenever something needs access to a persistent store.

A for Effort

Being that the application in question used EF6, I thought that it would be easy enough to leverage the Effort library.

Effort provides an in-memory provider for Entity Framework, allowing you to easily switch between whatever your normal provider is (probably SQL Server) for one that runs entirely in memory.

Notice that I said I thought that it would be easy to leverage Effort…

As is always the case with this sort of thing, the devil is truly in the details.

It was easy enough to introduce a factory to create the DbContext that the application used instead of using its constructor. This allowed me to supply a different factory for the tests, one that leveraged Effort’s in-memory provider. You accomplish this by making sure that there is a constructor for the DbContext that takes a DbConnection, and then use Effort to create one of its fancy in-memory connections.
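The test-only factory ends up being tiny; a sketch of it (using the DemoDbContext shown later in this post) looks like this:

public class EffortDbContextFactory : IDbContextFactory<DemoDbContext>
{
    public DemoDbContext Create()
    {
        // CreateTransient returns a DbConnection backed by an in-memory database that lives
        // for as long as the connection does.
        var connection = Effort.DbConnectionFactory.CreateTransient();
        return new DemoDbContext(connection, contextOwnsConnection: true);
    }
}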

On the first run of the test with the new in-memory provider, I got one of the least helpful errors I have ever encountered:

System.InvalidOperationException occurred: Sequence contains no matching element
  StackTrace:
       at System.Linq.Enumerable.Single[TSource](IEnumerable`1 source, Func`2 predicate)
       at System.Data.Entity.Utilities.DbProviderManifestExtensions.GetStoreTypeFromName(DbProviderManifest providerManifest, String name)
       at System.Data.Entity.ModelConfiguration.Configuration.Properties.Primitive.PrimitivePropertyConfiguration.Configure(EdmProperty column, EntityType table, DbProviderManifest providerManifest, Boolean allowOverride, Boolean fillFromExistingConfiguration)
       at System.Data.Entity.ModelConfiguration.Configuration.Properties.Primitive.PrimitivePropertyConfiguration.<>c__DisplayClass1.<Configure>b__0(Tuple`2 pm)
       at System.Data.Entity.Utilities.IEnumerableExtensions.Each[T](IEnumerable`1 ts, Action`1 action)
       at System.Data.Entity.ModelConfiguration.Configuration.Properties.Primitive.PrimitivePropertyConfiguration.Configure(IEnumerable`1 propertyMappings, DbProviderManifest providerManifest, Boolean allowOverride, Boolean fillFromExistingConfiguration)
       at System.Data.Entity.ModelConfiguration.Configuration.Properties.Primitive.BinaryPropertyConfiguration.Configure(IEnumerable`1 propertyMappings, DbProviderManifest providerManifest, Boolean allowOverride, Boolean fillFromExistingConfiguration)
       at System.Data.Entity.ModelConfiguration.Configuration.Types.StructuralTypeConfiguration.ConfigurePropertyMappings(IList`1 propertyMappings, DbProviderManifest providerManifest, Boolean allowOverride)
       at System.Data.Entity.ModelConfiguration.Configuration.Types.EntityTypeConfiguration.ConfigurePropertyMappings(DbDatabaseMapping databaseMapping, EntityType entityType, DbProviderManifest providerManifest, Boolean allowOverride)
       at System.Data.Entity.ModelConfiguration.Configuration.Types.EntityTypeConfiguration.Configure(EntityType entityType, DbDatabaseMapping databaseMapping, DbProviderManifest providerManifest)
       at System.Data.Entity.ModelConfiguration.Configuration.ModelConfiguration.ConfigureEntityTypes(DbDatabaseMapping databaseMapping, DbProviderManifest providerManifest)
       at System.Data.Entity.ModelConfiguration.Configuration.ModelConfiguration.Configure(DbDatabaseMapping databaseMapping, DbProviderManifest providerManifest)
       at System.Data.Entity.DbModelBuilder.Build(DbProviderManifest providerManifest, DbProviderInfo providerInfo)
       at System.Data.Entity.DbModelBuilder.Build(DbConnection providerConnection)
       at System.Data.Entity.Internal.LazyInternalContext.CreateModel(LazyInternalContext internalContext)
       at System.Data.Entity.Internal.RetryLazy`2.GetValue(TInput input)

Keep in mind, I got this error when attempting to begin a transaction on the DbContext. So the context had successfully been constructed, but it was doing…something…during the begin transaction that was going wrong. Probably initialization.

After a significant amount of reading, I managed to find some references to the fact that Effort doesn’t support certain SQL Server specific column types. Makes sense in retrospect, although at the time I didn’t even know you could specify provider specific information like that. I assume it was all based around automatic translation between CLR types and the underlying types of the provider.

There are a lot of entities in this application, and they all have a large number of properties. I couldn’t read through all of the classes to find what the problem was, and I didn’t even know exactly what I was looking for. Annoyingly, the error message didn’t say anything about what the actual problem was, as you can see above. So, back to first principles. Take everything out, and start reintroducing things until it breaks.

It turns out quite a lot of the already existing entities were specifying the types (using strings!) in the Column attribute of their properties. The main offenders were the “timestamp” and “money” data types, which Effort did not seem to understand.

Weirdly enough, Effort had no problems with the Timestamp attribute when specified on a property. It was only when the type “timestamp” was specified as a string in the Column attribute that errors occurred.

The issue here was of course that the type was string based, so the only checking that occurred, occurred at run-time. Because I had introduced a completely different provider to the mix, and the code was written assuming SQL Server, it would get to the column type in the initialisation (which is lazy, because it doesn’t happen until you try to use the DbContext) and when there was no matching column type returned by the provider, it would throw the exception above.

Be Specific

Following some advice on the Effort discussion board, I found some code that moved the SQL Server specific column types into their own attributes. These attributes would then only be interrogated when the connection of the DbContext was actually an SQL Server connection. Not the best solution, but it left the current behaviour intact, while allowing me to use an in-memory database for testing purposes.

Here is the attribute, it just stores a column type as a string.

public class SqlColumnTypeAttribute : Attribute
{
    public SqlColumnTypeAttribute(string columnType = null)
    {
        ColumnType = columnType;
    }

    public string ColumnType { get; private set; }
}

Here is the attribute convention, which EF uses to define a rule that will interpret attributes and change the underlying configuration.

public class SqlColumnTypeAttributeConvention : PrimitivePropertyAttributeConfigurationConvention<SqlColumnTypeAttribute>
{
    public override void Apply(ConventionPrimitivePropertyConfiguration configuration, SqlColumnTypeAttribute attribute)
    {
        if (!string.IsNullOrWhiteSpace(attribute.ColumnType))
        {
            configuration.HasColumnType(attribute.ColumnType);
        }
    }
}

Here is a demo DbContext showing how I used the attribute convention. Note that the code only gets executed if the connection is an SqlConnection.

public partial class DemoDbContext : DbContext
{
    public DemoDbContext(DbConnection connection, bool contextOwnsConnection = true)
        : base(connection, contextOwnsConnection)
    {
    
    }
    
    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        if (Database.Connection is SqlConnection)
        {
            modelBuilder.Conventions.Add<SqlColumnTypeAttributeConvention>();
        }
    }
}

Finally, here is the attribute being used in an entity. Previously this entity would have simply had a [Column(TypeName = "timestamp")] attribute on the RowVersion property, which causes issues with Effort.

public partial class Entity
{
    [Key]
    public int Id { get; set; }

    [SqlColumnType("timestamp")]
    [MaxLength(8)]
    [Timestamp]
    public byte[] RowVersion { get; set; }
}

Even though there were a lot of entities with a lot of properties, this was an easy change to make, as I could leverage a regular expression and find and replace.

Of course, it still didn’t work.

I was still getting the same error after making the changes above. I was incredibly confused for a while, until I did a search for “timestamp” and found an instance of the Column attribute where it supplied both the data type and the order. Of course, my regular expression wasn’t smart enough to pick this up, so I had to manually go through and split those two components (the Type, which Effort didn’t support, and the Order, which it did) wherever they occurred together. Luckily it was only about 20 places, so it was easy enough to fix.
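The split itself was mechanical; for a hypothetical property it looked something like this:

// Before: SQL Server type and column order combined in one attribute.
// [Column(TypeName = "timestamp", Order = 3)]
// public byte[] RowVersion { get; set; }

// After: the provider specific type moves to the custom attribute, the order stays behind.
[SqlColumnType("timestamp")]
[Column(Order = 3)]
[MaxLength(8)]
[Timestamp]
public byte[] RowVersion { get; set; }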

And then it worked!

No more SQL Server dependency for the integration tests, which means they are now faster and more controlled, with less hard to manage dependencies.

Of course, the trade-off for this is that the integration tests are no longer testing as close to the application as they could be, but that’s why we have functional tests as well, which run through the installed application, on top of a real SQL Server instance. You can still choose to run the integration tests with an SQL Server connection if you want, but now you can use the much faster and easier to manage in-memory database as well.

Conclusion

Effort is awesome. Apart from the problems caused by using SQL Server specific annotations on common entities, Effort was extremely easy to setup and configure.

I can’t really hold the usage of SQL Server specific types against the original developers though, as I can’t imagine they saw the code ever being run on a non-SQL Server provider. Granted, it would have been nice if they had isolated the SQL Server specific stuff from the core functionality, but that would have been unnecessary for their needs at the time, so I understand.

The biggest problem I ran into was the incredibly unhelpful error message coming from EF6 with regards to the unsupported types. If the exception had stated what the type was that couldn’t be found and for which property in which class, I wouldn’t have had to go to so much trouble to find out what the actual problem was.

It’s never good being confronted with an entirely useless exception message, and we have to always be careful to make sure that our exceptions fully communicate the problem so that they help future developers, instead of just getting in the way.

A little Effort goes a long way after all.