
That thing that I complained about with Logstash and HTTP outputs a few weeks ago? Turns out it was actually a bug after all!

Not only that, but it was totally a race condition relating to the way the success/failure handlers of the Manticore HTTP library were being used, which was what I guessed, so go me. I guess even a broken clock is right once a day.

Long story short, don’t fire off an asynchronous request until you’ve got your asynchronous handlers in place to handle the result.

In their typical awesome style, the maintainers of the HTTP output plugin have already fixed the issue and released a new version (4.3.2) for the masses to consume.

Unfortunately, I don’t think this new version has been rolled into an official Logstash release yet, so if I want to use it I’ll have to make a custom redistributable package to use in Octopus Deploy. Not a huge issue, just a small pain. I’m sure it will be rolled into the official package soon.

Other than that piece of news, I don’t really have anything else, so I’m going to take this opportunity to muse on some of my experiences with asynchronous code.

Hellish Code

Depending on how it’s written, asynchronous code can be hard to reason about, and even harder to test.

Vanilla Node is a good example of bad asynchronous code. It’s incredibly easy to get into callback hell, where one function calls into another function, which has a handler that calls yet another function, which does some other thing, and so on. It’s even worse when you realise that the easiest way to accumulate state across so many asynchronous functions is to close over variables that are in scope when each function is created, which leaves you in a position where it’s hard to break things apart into nice, self-contained, easily testable components.

*shudder*

Encapsulating asynchronous operations into some sort of structure is a good counter to callback hell, and once we started using Promises in the ELB logs processor JavaScript, I found the quality of the code increased rapidly. It was still JavaScript, but at least you could reason about it.

Continuing on with a language that I actually like, C#, you’ve got a similar concept at play with tasks and the TPL (Task Parallel Library).

Before the TPL, you would probably just use raw threads to offload work (or maybe dedicated asynchronous functions or classes like BackgroundWorker), and you could easily get into the same sort of problems that you get into in vanilla Node. I mean, you had the benefit of working with a nice, structured language, but it was still difficult and prone to error.

The TPL introduced encapsulations around asynchronous operations called Tasks, which helped write asynchronous code that actually made sense.
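As a rough illustration of what that looks like (the work being done here is just a stand-in for something real), offloading an operation and chaining a follow-up with the TPL goes something like this:

var readTask = Task.Factory.StartNew(() =>
{
    // Pretend this is a slow read from disk or the network.
    return new[] { "first", "second", "third" };
});

// Continuations express "and then do this" without any manual thread plumbing.
var uploadTask = readTask.ContinueWith(previous =>
{
    foreach (var record in previous.Result)
    {
        Console.WriteLine("Uploading {0}", record);
    }
});

uploadTask.Wait();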

It did something else that was incredibly helpful as well.

It exposed the Task schedulers.

All The Control

Encapsulating operations as Tasks is all well and good, and in most cases you’re probably looking to run those operations asynchronously on another thread (or a pool of threads perhaps). I mean, that’s literally what the default TaskScheduler in C# does (the thread pool backed ThreadPoolTaskScheduler).

Of course, that makes testing hard, because now your main thread (which the test is running on) has to have some logic in it to wait for outstanding operations to complete before you can assert anything. This gets even more difficult when working with ViewModels, which require that certain updates be performed on the main/UI thread in order to work properly.

This is particularly important for WPF and PropertyChanged events, which only work properly if you raise them on the UI thread.

The good news is that you can write your own TaskScheduler and then inject it for testing purposes, assuming you’ve written your code in such a way that your Tasks are created with a specified scheduler.
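In practice that usually means the scheduler is a constructor (or method) parameter, defaulting to the thread pool in production and swapped out in tests. A minimal sketch (the class and the work it does are hypothetical) might look like this:

public class LogUploader
{
    private readonly TaskScheduler _scheduler;

    public LogUploader(TaskScheduler scheduler)
    {
        _scheduler = scheduler;
    }

    public Task Upload(string payload)
    {
        // The StartNew overload that takes a scheduler is the important bit;
        // production code hands in TaskScheduler.Default, tests hand in
        // something deterministic.
        return Task.Factory.StartNew(
            () => Console.WriteLine("Uploading {0}", payload),
            CancellationToken.None,
            TaskCreationOptions.None,
            _scheduler);
    }
}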

I’ve written a TaskScheduler that is not threaded at all (i.e. it just executes the task inline), one that accumulates operations to be executed at will (good for testing transitions) and another that allows you to run tasks on the WPF Dispatcher construct (i.e. the main/UI thread in WPF). I use all of them frequently while developing applications that rely on Tasks to do things.

They aren’t even all that hard to write really, as the following NonThreadedTaskScheduler shows:

public sealed class NonThreadedTaskScheduler : TaskScheduler
{
    // Instead of queueing the task somewhere else, just run it immediately
    // on the calling thread.
    protected override void QueueTask(Task task)
    {
        TryExecuteTask(task);
    }

    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued)
    {
        return TryExecuteTask(task);
    }

    // Nothing is ever actually queued, so there is never anything to report.
    protected override IEnumerable<Task> GetScheduledTasks()
    {
        return Enumerable.Empty<Task>();
    }

    public override int MaximumConcurrencyLevel
    {
        get { return 1; }
    }
}
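The “accumulate operations to be executed at will” variety mentioned above isn’t much harder; a rough sketch (the name and details are mine, not anything built in) looks something like this:

public sealed class ManualTaskScheduler : TaskScheduler
{
    private readonly List<Task> _pending = new List<Task>();

    protected override void QueueTask(Task task)
    {
        // Just remember the task; nothing runs until the test says so.
        lock (_pending) { _pending.Add(task); }
    }

    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued)
    {
        // Force everything through the queue so execution stays under the test's control.
        return false;
    }

    protected override IEnumerable<Task> GetScheduledTasks()
    {
        lock (_pending) { return _pending.ToArray(); }
    }

    // Called from the test to run whatever has accumulated so far.
    public void RunPendingTasks()
    {
        Task[] toRun;
        lock (_pending)
        {
            toRun = _pending.ToArray();
            _pending.Clear();
        }

        foreach (var task in toRun)
        {
            TryExecuteTask(task);
        }
    }
}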

Without the ability to change how tasks are executed, none of that would have been possible; Tasks would still be useful, but they would be much harder to actually use and test.
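For example, plugging the NonThreadedTaskScheduler into the hypothetical uploader from earlier makes a test completely deterministic:

// Everything runs inline on the test thread, so by the time the call
// returns there is nothing outstanding to wait for.
var scheduler = new NonThreadedTaskScheduler();
var uploader = new LogUploader(scheduler);

uploader.Upload("some payload").Wait();

// Assertions can happen immediately; no sleeps, no polling.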

Coincidentally, this is one of the reasons why I’m a little bit leery of the async/await keywords in C#, because I don’t think they expose any of the scheduling functionality, instead assuming that you are happy with the magical way that they work. Honestly though, my knowledge of async/await is fairly limited, so that might not be entirely correct.
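As far as I can tell, the only control you get out of the box is ConfigureAwait, which just says whether the continuation should come back to the captured context or not; there’s no overload that lets you hand in a scheduler of your choosing:

public async Task Save(string path, byte[] content)
{
    using (var stream = File.OpenWrite(path))
    {
        // ConfigureAwait(false) means "don't bother marshalling back to the
        // original context"; that's about the extent of the knobs available.
        await stream.WriteAsync(content, 0, content.Length).ConfigureAwait(false);
    }
}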

Conclusion

To conclude, asynchronous code is hard and its bugs are subtle and hard to detect.

It’s also one of the easiest, highest-impact ways to wring more performance out of a piece of software. If you’re doing a lot of things that naturally involve waiting (disk writes, network calls, etc.), you stand to gain an incredible amount of time by doing other work during those periods when you’re waiting on something external and out of your control.
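To make that concrete, a handful of independent network calls can complete in roughly the time of the slowest one if you start them all before waiting on any of them (the URLs are obviously just placeholders):

var client = new HttpClient();

// All three requests are in flight at the same time, so the waiting overlaps.
var downloads = new[]
{
    client.GetStringAsync("http://example.com/a"),
    client.GetStringAsync("http://example.com/b"),
    client.GetStringAsync("http://example.com/c"),
};

Task.WaitAll(downloads);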

Double edged sword really.

Like most software.