
A few weeks ago I uploaded a post describing the usage of a simple BNF grammar to describe and parse the configuration for a feature in the software my team maintains. As I mentioned then, that post didn’t cover any of the detail around how the configuration worked, only how it was structured and used to construct an in-memory model that could be leveraged to push all of the necessary data to the external service.

This post will instead cover the other half of the story, the content of the configuration file beyond the structure.

Charging a Dynamo

As I said in the post I linked above, we recently integrated with an external service that allowed for the creation of various forms relevant to the Real Estate industry in Australia. Their model was to use an API to programmatically construct the form and to then use their website to confirm that the form was correct (and print it or email it or whatever other usage might be desired).

For the integration, we needed to supply the ability to pick the desired form from the available ones, and then supply all of the data that the form required, as sourced from our application.

The form selection was trivial, but filling in the necessary data was somewhat harder. Each form could be printed from various places within the application, so we had to put together a special intermediary domain model based on the current context. Once the model was constructed, we could easily extract the necessary fields/properties from it (depending on what was available) and then put them into a key/value store to be uploaded to the external API.

For a first version, the easiest way to approach the problem was to do it in C#. A domain model, a factory for creating it from different contexts and a simple extractor that built the key/value store.

The limitations of this approach are obvious, but the worst one is that it can’t be changed without releasing a new version of the software. We generally only do a few releases a year (for various reasons I don’t really want to get into), so this meant we would have a very limited ability to change in the face of the external service changing. New keys would be particularly brutal, because they would simply be lacking values until we did a new release, but fixing any errors in the released key mappings would also be difficult.

Our second version needed to be more flexible about how the key mappings were defined and updated, so we switched to a configuration model.

Dyno-mite!

Driving the key mappings from configuration added a whole bunch of complexity to the solution.

For one, we could no longer use the infinite flexibility of C# to extract what we needed from the intermediary domain model (and format it in the way we thought was best). Instead we needed to be able to provide some sort of expression evaluation that could be easily defined within text. The intent was that we would maintain the C# code that constructed the domain model based on the context of the operation, and then use a series of key mappings created from the configuration file to extract all of the necessary data to be pushed to the external API.

The second complication was that we would no longer be the only people defining key mappings (as we were when they were written in C#). An expected outcome of the improvements was that anyone with sufficient technical knowledge would be able to edit the mappings (or provide overrides to customers as part of our support process).

At first we thought that we might be able to publish a small exploration language which would allow for the definition of a simple expression to get values out of the intermediary domain model. Something similar to C# syntax (i.e. dot notation, like A.B.C). This would be relatively easy to define in text, and could be evaluated via Reflection.
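To make the idea concrete, here is a minimal sketch of what such a Reflection-based dot-notation evaluator might have looked like. This is not code we shipped; PropertyPath, SampleModel and SampleOwner are hypothetical names invented for the illustration.

```csharp
using System;
using System.Reflection;

public static class PropertyPath
{
    // Walks a dot-separated path such as "Owner.Name" one public property at a time.
    // Returns null if any step is missing or evaluates to null.
    public static object Resolve(object root, string path)
    {
        object current = root;
        foreach (string segment in path.Split('.'))
        {
            if (current == null) return null;
            PropertyInfo property = current.GetType().GetProperty(segment);
            if (property == null) return null;
            current = property.GetValue(current, null);
        }
        return current;
    }
}

// Hypothetical model types, used only for this illustration.
public class SampleOwner { public string Name { get; set; } }
public class SampleModel { public SampleOwner Owner { get; set; } }
```

Calling PropertyPath.Resolve(model, "Owner.Name") on a SampleModel whose owner is named Jane returns "Jane". Note that a sketch like this has no notion of indexers, concatenation or formatting; a segment such as Contacts[0] simply resolves to null.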

The more we looked at our original C# key mappings though, the more we realised that the simple property exploration approach would not be enough. We were selecting items from arrays, dynamically concatenating two or more properties together and formatting strings, none of which would be covered by the simple model. We were also somewhat apprehensive about using Reflection to evaluate the expressions. There were quite a few of them (a couple of hundred) and we knew from previous experience that Reflection could be slow. The last thing we wanted to do was make the feature slower.

Two of those problems could be solved by adding the ability to concatenate two or more expressions in the mapping definition (the <concatenated> term in the grammar defined in the previous post) and by offering a format string component for the expression (which just leverages String.Format).

Accessing items out of arrays/lists was something else entirely.

Dynamic Links

Rather than try to write some code to do the value extraction ourselves, we went looking for a library to do it for us.

We located two potential candidates, Flee and System.Linq.Dynamic.

Both of these libraries offered the ability to dynamically execute C# code obtained from a string, which would let us stick with C# syntax for the configuration file. Both were also relatively easy to use and integrate.

In the end, for reasons I can no longer remember, we went with System.Linq.Dynamic. I think this might have been because we were already using it for some functionality elsewhere (dynamic filtering and sorting, before we knew how to manipulate expression trees directly), so it made sense to reuse it.

A single line in the configuration file could now look like this:

Key_Name|DynamicLinq;Owner.Contacts[0].First;"{0} ";"unknown"

This line translates to “For the key named Key_Name, use the DynamicLinq engine, where the expression is to access the first name of the zeroth element of the contacts list associated with the current owner. This value should be formatted with a trailing space, and if any errors occur it should default to the string unknown”.
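Purely to illustrate which piece of the line is which, here is a naive sketch that pulls a line apart with string splitting. It is not how we parse the file (that is the job of the grammar from the previous post), and MappingLine and its properties are hypothetical names.

```csharp
using System;

public class MappingLine
{
    public string Key { get; private set; }
    public string Engine { get; private set; }
    public string Expression { get; private set; }
    public string Format { get; private set; }
    public string Default { get; private set; }

    // Naive split on '|' and ';'. It would break on quoted values that contain
    // those characters, which is one reason a real grammar is the better tool.
    public static MappingLine Parse(string line)
    {
        string[] keyAndRest = line.Split(new[] { '|' }, 2);
        string[] parts = keyAndRest[1].Split(';');
        return new MappingLine
        {
            Key = keyAndRest[0],
            Engine = parts[0],
            Expression = parts[1],
            Format = parts[2].Trim('"'),
            Default = parts[3].Trim('"')
        };
    }
}
```

For the example line, this gives a key of Key_Name, an engine of DynamicLinq, an expression of Owner.Contacts[0].First, a format of {0} followed by a space, and a default of unknown.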

The beauty of this is that we don’t have to handle any of the actual expression evaluation, just the bits around the edges (formatting and error handling).

After the configuration is parsed into an Abstract Syntax Tree by Irony, that tree is then converted into a series of classes that can actually be used to obtain a key/value store from the intermediary domain model. An interface (IEvaluationExpression) describes the commonality of the right side of the configuration line above, and there is an implementation of this interface for each of the types of expression supported by the grammar: DateExpression, LiteralExpression, ConcatenatedExpression and the one I actually want to talk about, DynamicLinqExpression.
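As a rough sketch of how that family of classes fits together, something like the following would do. The Evaluate signature matches the real interface, but the bodies of LiteralExpression and ConcatenatedExpression here are assumptions written for illustration, and the empty Model class is a stand-in for the real intermediary domain model.

```csharp
using System.Linq;

// Hypothetical stand-in for the intermediary domain model.
public class Model { }

public interface IEvaluationExpression
{
    string Evaluate(Model model);
}

// Always returns the same fixed text, regardless of the model.
public class LiteralExpression : IEvaluationExpression
{
    private readonly string _value;
    public LiteralExpression(string value) { _value = value; }
    public string Evaluate(Model model) { return _value; }
}

// Evaluates each child in order and joins the results into one string.
public class ConcatenatedExpression : IEvaluationExpression
{
    private readonly IEvaluationExpression[] _children;
    public ConcatenatedExpression(params IEvaluationExpression[] children) { _children = children; }
    public string Evaluate(Model model)
    {
        return string.Concat(_children.Select(child => child.Evaluate(model)));
    }
}
```

Because every expression type exposes the same Evaluate method, a concatenation can hold any mix of the other expression types without knowing what they are.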

This class is relatively simple. Its entire goal is to take a string that can be used to extract some value from an object and run it through the functionality supplied by System.Linq.Dynamic. It does some error handling and formatting as well, but its main purpose is to extract the value.

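using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Dynamic;

// Model and IEvaluationExpression are defined elsewhere in the application.
public class DynamicLinqExpression : IEvaluationExpression
{
    private readonly string _expression;
    private readonly string _default;
    private readonly string _format;

    // "default" is a reserved word in C#, so the parameter is named defaultValue.
    public DynamicLinqExpression(string expression, string defaultValue = null, string format = null)
    {
        _expression = expression;
        _default = defaultValue;
        // Fall back to a pass-through format so string.Format never receives null.
        _format = format ?? "{0}";
    }

    public string Evaluate(Model model)
    {
        IEnumerable query;
        try
        {
            // Wrap the single model in a queryable so the expression can be run against it.
            query = (new List<Model> { model }).AsQueryable().Select(_expression);
        }
        catch (Exception)
        {
            return _default;
        }

        List<dynamic> list;
        try
        {
            // Force evaluation of the query; failures in the expression surface here.
            list = query.Cast<dynamic>().ToList();
        }
        catch (Exception)
        {
            return _default;
        }

        try
        {
            // Exactly one result is expected because exactly one model went in.
            object value = list.Single();
            return value == null ? _default : string.Format(_format, value);
        }
        catch (Exception)
        {
            return _default;
        }
    }
}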

I’ve stripped out all of the logging and some other non-interesting pieces, so you’ll have to excuse the code. In reality we have some detailed logging that occurs at the various failure levels, which is why everything is spread out the way it is.

The important piece is the part where the incoming Model is converted into a Queryable, and System.Linq.Dynamic is used to query that using the supplied expression. A value is then extracted from the resulting queried enumerable, which is then formatted and returned as necessary.

Conclusion

Pushing off the majority of the value extraction let us focus on a nice structure to support all of the things that the configuration file needed to do that were outside of the “just extract a value from this object” scope. It also let us put more effort into the other parts of managing a configuration based approach to a complex problem, like the definition and usage of a grammar (another case of code I would rather not own if I didn’t have to).

The only weird thing left over is the fact that the DynamicLinqExpression has to do a bunch of collection transformations in order to be run on a single object. This leaves a somewhat sour taste in my mouth, but performance testing showed that it was well within the bounds that we needed for this particular feature.

In the end, I was mostly just happy that we didn’t have to maintain some convoluted (and likely slow) Reflection-based code that extracted fields and properties from an object model in some sort of vastly reduced mockery of C#.