You Gotta Have Standards

October 9. 2018 0 Comments

After that brief leadership interlude, its time to get back into the technical stuff with a weird and incredibly frustrating issue that I encountered recently when building a small command-line application using .NET 4.7.

More specifically, it failed to compile on our build server citing problems with a dependency that it shouldn’t have even required.

So without further ado, on with the show.

One Build Process To Rule Them All

One of the rules I like to follow for a good build process is that it should be executable outside of the build environment.

There are a number of reasons for this rule, but the two that are most relevant are:

If something goes wrong during the build process you can try and run it yourself and see what’s happening, without having to involve the build server
As a result of having to execute the process outside of the build environment, its likely that the build logic will be encapsulated in source control, alongside the code

With the way that a lot of software compilation works though, it can be hard to create build processes that automatically bootstrap the necessary components on a clean machine.

For example, there is no real way to compile a .NET Framework 4.7 codebase without using software that has to be installed. As far as I know you have to use either MSBuild, Visual Studio or some other component to do the dirty work. .NET Core is a lot better in this respect, because its all command line driven and doesn’t feature any components that must be installed on the machine before it will work. All you have to do is bootstrap the self contained SDK.

Thus while the dream is for the build process to be painless to execute on a clean git clone (with the intent that that is exactly what the build server does), sometimes dreams don’t come true, no matter how hard you try.

For us, our build server comes with a small number of components pre-installed, including MSBuild, and then our build scripts rely on those components existing in order to work correctly. There is a little bit of fudging involved though so you don’t have to have exactly the same components installed locally, and it will dynamically find MSBuild for you.

This was exactly how the command-line application build process was working before I touched it.

Then I touched it and it all went to hell.

Missing Without A Trace

Whenever you go back to a component that hasn’t been actively developed for a while, you always have to decide whether or not you should go to the effort of updating its dependencies that are now probably out of date.

Of course, some upgrades are a lot easier to action than others (i.e. a NuGet package update is generally a lot less painful than updating to a later version of the .NET Framework), but the general idea is to put some effort into making sure you’ve got a strong base to work from.

So that’s exactly what I did when I resurrected the command-line application used for metrics generation. I updated the build process, renamed the repository/namespaces (to be more appropriate), did a pass over the readme and updated the NuGet packages. No .NET version changes though, because that stuff can get hairy and it was already at 4.7, so it wasn’t too bad.

Everything compiled perfectly fine in Visual Studio and the self-contained build process continued to work on my local machine, so I pushed ahead and implemented the necessary changes.

Then I pushed my code and the automated build process on our build server failed consistently with a bunch of compilation errors like the following:

Framework\IntegrationTestKernel.cs(64,13): error CS0012: The type 'ValueType' is defined in an assembly that is not referenced. You must add a reference to assembly 'netstandard, Version=2.0.0.0, Culture=neutral, PublicKeyToken=cc7b13ffcd2ddd51'.

The most confusing part?

I had taken no dependency on netstandard as far as I knew.

More importantly, my understanding of netstandard was that it is basically a set of common interfaces to allow for a interoperability between the .NET Framework and .NET Core. I had no idea why my code would fail to compile citing a dependency I didn’t even ask for.

Also, it worked perfectly on my machine, so clearly something was awry.

The Standard Response

The obvious first response is to add a reference to netstandard.

This is apparently possible via NETStandard.Library NuGet package, so I added that, verified that it compiled locally and pushed again.

Same compilation errors.

My next hypothesis was that maybe something had gone weird with .NET Framework 4.7. There are a number of articles on the internet about similar looking topics and some of them read like later versions of .NET 4.7 (which are in-place upgrades for god only knows what reason) have changes relating to netstandard and .NET Framework integrations and compatibility. It was a shaky hypothesis though, because this application had always specifically targeted .NET 4.7.

Anyway, I flipped the projects to all target an earlier version of the .NET Framework (4.6.2) and then reinstalled all the NuGet packages (thank god for the Update-Package –reinstall command).

Still no luck.

The last thing I tried was removing all references to the C# 7 Value Tuple feature (super helpful when creating methods with complex return types), but that didn’t help either.

I Compromised; In That I Did Exactly What It Wanted

In the end I accepted defeat and made the Visual Studio Build Tools 2017 available on our build server by installing them on our current build agent AMI, taking a new snapshot and then updating TeamCity to use that snapshot instead. In order to get everything to compile cleanly, I had to specifically install the .NET Core Build Tools, which made me sad, because .NET Core was actually pretty clean from a build standpoint. Now if someone puts together a .NET Core repository incorrectly, it will probably still continue to compile just fine on the build server, leaving a tripwire for the next time someone cleanly checks out the repo.

Ah well, can’t win them all.

Conclusion

I suspect that the root cause of the issue was updating to some of the NuGet packages, specifically the packages that are only installed in the test projects (like the Ninject.MockingKernel and its NSubstitute implementation) as the test projects were the only ones that were failing to compile.

I’m not entirely sure whya package update would cause compilation errors though, which is pretty frustrating. I’ve never experienced anything similar before, so perhaps those libraries were compiled to target a specific framework (netstandard 2.0) and those dependencies flowed through into the main projects they were installed into?

Anyway, our build agents are slightly less clean now as a result, which makes me sad, but I can live with it for now.

I really do hate system installed components though.