There are many challenges inherent in creating an API. From the organization of the resources, to interaction models (CQRS and eventual consistency? Purely synchronous?) all the way through to the specific mechanisms for dealing with large collections of data (paging? filtering?) there is a lot of complexity in putting a service together.

One such area of complexity is binary data, and the handling thereof. Maybe it’s images associated with some entity, or arbitrary files, or maybe something else entirely, but the common thread is that the content is delivered differently to a normal payload (like JSON or XML).

I mean, you’re unlikely to serve an image to a user as a base64 encoded chunk in the middle of a JSON document.

Or you might, I don’t know your business. Maybe it seemed like a really good idea at the time.

Regardless, if you need to deal with binary data (upload or download, doesn’t really matter), it’s worth thinking about the ramifications with regard to API throughput at the very least.

Use All The Resources

It really all comes down to resource utilization and limits.

Ideally, in an API, you want to minimise the time that every request takes, mostly so that you can maintain throughput. Every single request consumes resources of some sort, and if a particular type of request is consuming more than its fair share, the whole system can suffer.

Serving or receiving binary data is definitely one of those needy types of requests, as the payloads tend to be larger (files, images and so on), so whatever resources are involved in dealing with the request are consumed for longer.

You can alleviate this sort of thing by making use of good asynchronous practices, ceding control over the appropriate resources so that they can be used to deal with other incoming requests. Ideally you never want to waste time waiting for slow things, like HTTP requests and IO, but there is always going to be an upper limit on the amount you can optimize by using this approach.
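To make the idea of ceding control a little more concrete, here is a minimal sketch using Python's asyncio. The handler names and delays are made up for illustration, and the sleep merely stands in for slow IO; the point is only that a request waiting on something slow does not hold up a quick one that arrived after it.

```python
import asyncio


async def handle_request(name, io_delay, finished):
    # Awaiting hands control back to the event loop while the "IO" is pending,
    # so other requests can make progress in the meantime.
    await asyncio.sleep(io_delay)
    finished.append(name)


async def serve():
    finished = []
    # The slow upload is started first, the quick lookup second.
    await asyncio.gather(
        handle_request("slow-upload", 0.05, finished),
        handle_request("quick-lookup", 0.0, finished),
    )
    return finished


completion_order = asyncio.run(serve())
print(completion_order)
```

The quick request finishes first even though it started second, but the slow one still takes as long as it takes, which is the upper limit mentioned above.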

A different issue can arise when you consider the details of how you’re dealing with the binary content itself. Ideally, you want to deal with it as a stream, regardless of whether it’s coming in or going out. This sort of thing can be quite difficult to actually accomplish though, unless you really know what you’re doing and completely understand the entire pipeline of whatever platform you’re using.

It’s far more common to accidentally read the entire body of the binary data into memory all at once, consuming far more resources than would be strictly necessary if you were pushing bytes through in a continuous stream. I’ve had this happen silently and unexpectedly when hosting web applications in IIS, and it can be quite a challenging situation to engineer around.
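As a rough illustration of the difference, the sketch below copies a payload in fixed-size chunks rather than reading it all at once. The function name and chunk size are my own choices, and the in-memory BytesIO objects are only stand-ins for a real request body and a real destination (a file, a socket); in a real service the platform's own request and response streams would take their place.

```python
import io

CHUNK_SIZE = 64 * 1024  # assumed chunk size; tune for your platform


def copy_stream(source, destination, chunk_size=CHUNK_SIZE):
    """Copy source to destination chunk by chunk.

    Returns (total bytes copied, size of the largest chunk held at once).
    """
    total = 0
    largest = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        destination.write(chunk)
        total += len(chunk)
        largest = max(largest, len(chunk))
    return total, largest


payload = bytes(1_000_000)      # pretend this is an uploaded photo
incoming = io.BytesIO(payload)  # stands in for the request body
outgoing = io.BytesIO()         # stands in for a file or upstream connection
total, largest = copy_stream(incoming, outgoing)
```

The copy loop never holds more than one chunk of the payload itself, no matter how large the upload is. Python's standard library offers the same behaviour via shutil.copyfileobj; the loop is written out here only to show what is going on.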

Of course if you’re only dealing with small amounts of binary data, or small numbers of requests, you’re probably going to be fine.

Assuming every relevant stakeholder understands the limitations, that is, and no one sells the capability to easily upload and download thousands and thousands of photos taken from a modern camera phone.

But it’s not like that’s happened to me before at all.

Separation Of Concerns

If you can, a better model to invoke is one in which you don’t deal with the binary data directly at all, and instead make use of services that are specifically built for that purpose.

Why re-invent the wheel, only to do it worse and with more limitations?

A good example of this is to still have the API handle the entities representing the binary data, but when it comes to dealing with the hard bit (the payload), provide instructions on how to do that.

That is, instead of hitting GET /thing/{id}/images/{id}, and getting the binary content in the response payload, have that endpoint return a normal payload (JSON or similar) that contains a link for where to download the binary content from.
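A sketch of what such a response might look like follows. The field names, the content type and the files.example.com host are all hypothetical; the only thing that matters is that the API returns a small description of the image plus a link, not the bytes themselves.

```python
import json


def describe_image(thing_id, image_id, download_url):
    # Build the "normal" payload: metadata about the image and where to get it.
    return {
        "id": image_id,
        "thingId": thing_id,
        "contentType": "image/jpeg",  # assumed for the example
        "links": {"download": download_url},
    }


body = json.dumps(
    describe_image("42", "7", "https://files.example.com/things/42/images/7")
)
print(body)
```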

In more concrete terms, this sort of thing is relatively easily accomplished using Amazon Simple Storage Service (S3). You can have all of your binary content sitting securely in an S3 bucket somewhere, and create pre-signed links for both uploading and downloading with very little effort. No binary data will ever have to pass directly through your API, so assuming you can deal with the relatively simple requests to provide directions, you’re free and clear.
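The general idea behind a pre-signed link can be sketched with nothing but the standard library. To be clear, this is not the signing scheme S3 actually uses, and the secret, host and parameter names below are invented for the example; with S3 itself you would let the SDK generate the link (boto3, for instance, has generate_presigned_url). The sketch only shows why the API can hand out a time-limited link without ever touching the bytes.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlsplit

SECRET = b"demo-secret"                     # hypothetical key known to API and storage
STORAGE_HOST = "https://files.example.com"  # hypothetical storage host


def _signature(path, expires):
    # Sign the path together with the expiry so neither can be altered.
    message = f"{path}\n{expires}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def presign(path, ttl_seconds, now=None):
    """Return a link to path that is valid for ttl_seconds."""
    issued = int(now if now is not None else time.time())
    expires = issued + ttl_seconds
    query = urlencode({"expires": expires, "signature": _signature(path, expires)})
    return f"{STORAGE_HOST}{path}?{query}"


def is_valid(url, now=None):
    """What the storage side would check before serving the content."""
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    try:
        expires = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False
    current = now if now is not None else time.time()
    if current > expires:
        return False
    expected = _signature(parts.path, expires)
    return hmac.compare_digest(signature.encode(), expected.encode())


link = presign("/things/42/images/7", ttl_seconds=300, now=1_000)
```

The link works until it expires and stops working if anyone changes the path, so the API only ever deals with the small request that produces it.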

It’s not like S3 is going to have difficulties dealing with your data.

It’s somewhat beyond my area of expertise, but I assume you can accomplish the same sort of thing with equivalent services, like Azure Blobs.

Hell, you can probably even get it working with a simple Nginx box serving and storing content using its local disk, but honestly, that seems like a pretty terrible idea from a reliability point of view.


As with anything in software, it all comes down to the specific situation that you find yourself in.

Maybe you know for certain that the binary data you deal with is small and will remain that way. Fine, just serve it from the API. Why complicate a system if you don’t have to?

But if the system grows beyond your initial assumptions (and if it’s popular, it will), then you’ll almost certainly have problems scaling if you’re constantly handling binary data yourself, and you’ll have to pay a price to resolve the situation in the long run.

Easily the hardest part of being a software engineer is trying to predict what direction the wind will blow, and adjusting accordingly.

Really, it’s something of a fool’s errand, and the best idea is to be able to react quickly in the face of changing circumstances, but that’s even harder.