A Function Of Its Time

February 13, 2018

Working on a legacy system definitely has its challenges.

For example, it’s very common for there to be large amounts of important business logic encapsulated in the structures and features of the database. Usually this takes the form of things like stored procedures and functions, default values and triggers, and when you put them all together, they can provide a surprising amount of functionality for older applications.

While not ideal by today’s standards, this sort of approach is not necessarily terrible. If the pattern was followed consistently, at least all of the logic is in the same place, and legacy apps tend to have these magical global database connections anyway, so you can always get to the DB whenever you need to.

That is, until you start adding additional functionality in a more recent programming language, and you want to follow good development practices.

Like automated tests.

What The EF

If you’re using Entity Framework on top of an already existing database and you have stored procedures that you want (need?) to leverage, you have a few options.

The first is to simply include the stored procedure or function when you use the Entity Data Model Wizard in Visual Studio. This will create a function on the DbContext to call the stored procedure, and map the result set into some set of entities. If you need to change the entity return type, you can do that too; all you have to do is make sure the property names line up. This approach is useful when your stored procedures represent business logic, like calculations or projections.

If the stored procedures in the database represent custom insert/update/delete functionality, then you can simply map the entity in question to its stored procedures. The default mapping statement will attempt to line everything up using a few naming conventions, but you also have the ability to override that behaviour and specify procedures and functions as necessary.

If you don’t want to encapsulate the usage of the stored procedures, you can also just use the SqlQuery and ExecuteSqlCommandAsync functions available on the DbContext.Database property, but that requires you to repeat the usage of magic strings (the stored procedure and function names) whenever you want to execute the functionality, so I don’t recommend it.

So, in summary, it’s all very possible, and it will all work, up until you want to test your code using an in-memory database.

Which is something we do all the time.

In Loving Memory

To prevent us from having to take a direct dependency on the DbContext, we lean towards using factories.

There are a few reasons for this, but the main one is that it makes it far easier to reason about DbContext scope (you make a context, you destroy a context) and to limit potential concurrency issues within the DbContext itself. Our general approach is to have one factory for connecting to a real database (i.e. ExistingLegacyDatabaseDbContextFactory) and then another for testing (like an InMemoryDbContextFactory, using Effort). They both share an interface (usually just the IDbContextFactory<TContext> interface), which is taken as a dependency as necessary, and the correct factory is injected whenever the object graph is resolved using our IoC container of choice.

Long story short, we’re still using the same DbContext, we just have different ways of creating it, giving us full control over the underlying provider at the dependency injection level.
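To make the shape of that a bit more concrete, here is a minimal, language-neutral sketch written in Python rather than C#. Every name in it is hypothetical: ConnectionFactory stands in for IDbContextFactory<TContext>, a SQLite connection stands in for the DbContext, and the in-memory factory builds its tables from a schema script in roughly the way Effort builds them from the entity mappings.

```python
import sqlite3
from typing import Protocol


class ConnectionFactory(Protocol):
    """Hypothetical stand-in for the shared factory interface."""

    def create(self) -> sqlite3.Connection: ...


class FileDatabaseFactory:
    """Stand-in for the factory that connects to the real legacy database."""

    def __init__(self, path: str) -> None:
        self._path = path

    def create(self) -> sqlite3.Connection:
        return sqlite3.connect(self._path)


class InMemoryDatabaseFactory:
    """Stand-in for the test factory: each call builds a fresh in-memory database."""

    def __init__(self, schema: str) -> None:
        self._schema = schema

    def create(self) -> sqlite3.Connection:
        connection = sqlite3.connect(":memory:")
        # Create the structures up front so callers see the tables they expect.
        connection.executescript(self._schema)
        return connection


def count_customers(factory: ConnectionFactory) -> int:
    """Depends only on the interface: make a connection, use it, destroy it."""
    connection = factory.create()
    try:
        return connection.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
    finally:
        connection.close()
```

The calling code never knows which factory it was handed, which is the whole point: swapping the binding in the container is the only thing that changes between production and tests.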

When we want to use an in-memory database, Effort will create the appropriate structures for us using the entity mappings provided, but it can’t create the stored procedures because it doesn’t know anything about them (except maybe their names). Therefore, if we use any of the approaches I’ve outlined above, the in-memory database will be fundamentally broken depending on which bits you want to use.

This is one of the ways that Entity Framework and its database providers are something of a leaky abstraction, but that is a topic for another day.

This is pretty terrible for testing purposes, because sometimes the code will work, and sometimes it won’t.

But what else can we do?

Abstract Art

This is one of those nice cases where an abstraction actually comes to the rescue, instead of just making everything one level removed from what you care about and ten times harder to understand.

Each stored procedure and function can easily have an interface created for it, as they all take some set of parameters and return either nothing or some set of results.

We can then have two implementations, one which uses a database connection to execute the stored procedure/function directly, and another which replicates the same functionality through Linq or something similar (i.e. using the DbContext). We bind the interface to the first implementation when we’re running on top of a real database, and to the DbContext specific implementation when we’re not. If a function calls another function in the database, you can replicate the same approach by specifying the function as a dependency on the Linq implementation, which works rather nicely.
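As a rough illustration, again in Python rather than C# and with entirely made-up names, the pattern might look something like this. The order total function, the order_lines table and the tax calculation are all assumptions for the sake of the example, and an aggregate query stands in for the call to the legacy stored function.

```python
import sqlite3
from typing import Protocol, Sequence, Tuple


class OrderTotalFunction(Protocol):
    """Hypothetical interface for one stored function: parameters in, result out."""

    def execute(self, order_id: int) -> float: ...


class SqlOrderTotalFunction:
    """Runs the logic inside the database (stand-in for calling the stored function)."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._connection = connection

    def execute(self, order_id: int) -> float:
        row = self._connection.execute(
            "SELECT COALESCE(SUM(quantity * unit_price), 0) "
            "FROM order_lines WHERE order_id = ?",
            (order_id,),
        ).fetchone()
        return float(row[0])


class InCodeOrderTotalFunction:
    """Replicates the same logic in application code (the Linq-style implementation)."""

    def __init__(self, order_lines: Sequence[Tuple[int, int, float]]) -> None:
        # Each line is (order_id, quantity, unit_price).
        self._order_lines = order_lines

    def execute(self, order_id: int) -> float:
        return float(
            sum(qty * price for oid, qty, price in self._order_lines if oid == order_id)
        )


class InCodeOrderTotalWithTaxFunction:
    """A function that calls another function takes that function as a dependency."""

    def __init__(self, order_total: OrderTotalFunction, tax_rate: float) -> None:
        self._order_total = order_total
        self._tax_rate = tax_rate

    def execute(self, order_id: int) -> float:
        return self._order_total.execute(order_id) * (1 + self._tax_rate)
```

Because the tax function only knows about the interface, it will happily sit on top of either implementation of the order total.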

Of course, this whole song and dance still leaves us in a situation where the tests might do different things because there is no guarantee that the Linq based stored procedure implementation is the same as the one programmed into SQL Server.

So we write tests that compare the results returned from both for identical inputs, trusting the legacy implementation when differences are discovered.
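A minimal sketch of what such a comparison could look like, in Python and with hypothetical names throughout. The two rounding functions are invented purely to show the kind of subtle disagreement this sort of test is good at catching; they are not taken from any real system.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

TInput = TypeVar("TInput")
TResult = TypeVar("TResult")


def find_mismatches(
    legacy: Callable[[TInput], TResult],
    replacement: Callable[[TInput], TResult],
    inputs: Iterable[TInput],
) -> List[Tuple[TInput, TResult, TResult]]:
    """Return (input, legacy_result, replacement_result) for every disagreement."""
    mismatches = []
    for value in inputs:
        expected = legacy(value)  # the legacy implementation is the source of truth
        actual = replacement(value)
        if actual != expected:
            mismatches.append((value, expected, actual))
    return mismatches


def legacy_round(amount: float) -> int:
    """Hypothetical legacy rule: round half away from zero."""
    return int(amount + 0.5) if amount >= 0 else -int(-amount + 0.5)


def replacement_round(amount: float) -> int:
    """Hypothetical rewrite that accidentally uses round-half-to-even."""
    return round(amount)


# A test would assert that this list is empty; here it is not, and the
# legacy values tell us what the replacement needs to be fixed to return.
differences = find_mismatches(legacy_round, replacement_round, [0.5, 1.5, 2.4, 2.5])
```

In a real suite the legacy side would run against an actual database, so these comparison tests are the slow ones; everything else gets to stay in memory.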

Why bother at all though? I mean after everything is said and done, you now have two implementations to maintain instead of one, and more complexity to boot.

Other than the obvious case of “now we can write tests on an in-memory database that leverage stored procedures”, there are a few other factors in favour of this approach:

With a good abstraction in place, it’s more obvious what is taking a dependency on the stored procedures in the database
With a solid Linq based implementation of the stored procedure, we can think about retiring them altogether, putting the logic where it belongs (in the domain)
We gain large amounts of knowledge around the legacy stored procedures while building and testing the replacement, which makes them less mysterious and dangerous
We have established a strong pattern for how to get at some of the older functionality from our new and shiny code, leaving less room for sloppy implementations

So from my point of view, the benefits outweigh the costs.

Conclusion

When trying to leverage stored procedures and functions programmed into a database, I recommend creating interfaces to abstract their usages. You are then free to provide implementations of said interfaces based on the underlying database provider, which feels a lot more flexible than just lumping the function execution into whatever structures that EF provides for that purpose. The approach does end up adding some additional complexity and effort, but the ability to ensure that tests can run without requiring a real database (which is slow and painful) is valuable enough, even if you ignore the other benefits.

One caveat: the approach probably wouldn’t work as well if there aren’t good dependency injection systems in place, but the general concept is sound regardless.

To echo my opening statement, working with legacy code definitely has its own unique set of challenges. It’s nice in that way though, because solving those challenges can really make you think about how to provide a good solution within the boundaries and limitations that have already been established.

Like playing a game with a challenge mode enabled, except you get paid at the end.