_____                   _                  _____            _____       _ 
  |     |___ _____ ___ _ _| |_ ___ ___ ___   |  _  |___ ___   | __  |___ _| |
  |   --| . |     | . | | |  _| -_|  _|_ -|  |     |  _| -_|  | __ -| .'| . |
  |_____|___|_|_|_|  _|___|_| |___|_| |___|  |__|__|_| |___|  |_____|__,|___|
  a newsletter by |_| j. b. crawford

>>> 2020-05-29 message too large

I have previously mentioned that everything now runs over HTTP. This is a sufficiently universal phenomenon that when I tell people that the OSI model is bad and they need to purge it from their brains, I suggest that they replace it with the 'HTTP model', in which the layers are Physical, IP, TCP, HTTP, and finally the WeChat layer. One wonders if, much like Ethernet addressing was functionally obsoleted by IP addressing[1], IP addressing will one day be obsoleted by HTTP addressing, and future students in computer fields will listen to a professor attempt to justify that cookies are the OSI Session layer. No, that is not a joke, I have heard instructors of computer science say this already.

There are numerous reasons for the Unreasonable Effectiveness of HTTP. First and foremost is that HTTP is a simple, one-channel, session-oriented, request-response protocol which makes no assumption about bidirectional reachability of the hosts involved. For this reason, HTTP works well behind NAT and in complicated routing scenarios. Additionally, and perhaps even more importantly, HTTP is so widely used that even the most slipshod network intercept appliances have relatively well-tested and complete HTTP implementations, so it will probably still work behind a web proxy, web filter, 'Smart Firewall,' and all the other things that industry has introduced to prevent Spotify functioning in the office. Finally, HTTP is implemented by Browsers, and as far as most of the consumer-facing software industry is concerned there is nothing besides the Browser, by which is meant Google Chrome.

The latter two, by the way, are the major motivations of the monstrosities that are HLS (an Apple standard that Google uses) and MPEG-DASH (a Google standard that no one uses) for video streaming. YouTube videos need to work natively in the browser and you need to be able to watch them at work, and the state of the internet is that HTTP is really the only way to achieve those goals. There is no such thing as "streaming media," there is only downloading files repeatedly and rapidly. But I already spent too much time talking about that.

Let's talk about some of the rough edges that HTTP has. HTTP was intended for an ostensibly simple purpose: a user-agent (which we call a browser), let's say Netscape Navigator for old times' sake, requests a file, and the server, let's say Microsoft Internet Information Services, sends it. It's a very simple transactional protocol and follows the model of the UA sending a request and the server sending the response. In the simplest case, once the UA received the response it would close the connection. However, it's a standard performance optimization for the UA to keep the connection open for a while in case it needs to request more things, in which case it can reuse the connection and save a TCP handshake.
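To make that transactional model concrete, here is a minimal sketch using Python's standard library http.client (the host and paths are just placeholders): two GETs go out over the same kept-alive connection, so the second one doesn't pay for another TCP handshake.

    import http.client

    # One connection, two request-response transactions. http.client speaks
    # HTTP/1.1, so the connection stays open between requests unless the
    # server says Connection: close.
    conn = http.client.HTTPConnection("www.example.com", 80, timeout=10)

    conn.request("GET", "/")              # the UA sends a request...
    resp = conn.getresponse()
    body = resp.read()                    # ...and the server sends the response
    print(resp.status, len(body), "bytes")

    conn.request("GET", "/favicon.ico")   # reuses the same TCP connection
    resp = conn.getresponse()
    resp.read()
    print(resp.status)

    conn.close()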

That picture is actually a bit of an oversimplification: HTTP provides a variety of "verbs" or types of request, of which we only really use GET and POST, and we basically use both of those wrong. There are also a lot of headers that allow HTTP to behave in very odd ways that are more or less application-specific, and I will mention a couple of those later.

One of the things that HTTP does not handle well is the need for a server to send requests to the client. This is quite simply not what HTTP was designed for, obviously that would require opening a connection the other direction, but we can't do that because of the way it is. So, various hacks and additions have been introduced for this purpose. One of the most charming is Reverse HTTP or PTTH, which was coined by Linden Lab to allow the Second Life server to retrieve the most up-to-date custom fursuits from the game's client. PTTH is very simple: essentially, the client makes a request which offers to upgrade to PTTH (Upgrade is a mechanism bolted on to HTTP to allow you to do things that God did not intend, like switching to other protocols on the same connection). The server agrees to the upgrade and then the two switch roles: the server sends a request and the client sends a response.
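Here is a toy sketch of what that exchange looks like from the client end, over a raw socket in Python. The host and path are placeholders, the upgrade token follows the Reverse HTTP draft, and the naive recv() calls stand in for real HTTP parsing.

    import socket

    sock = socket.create_connection(("server.example", 80))

    # Ordinary client role: offer to upgrade this connection to PTTH.
    sock.sendall(
        b"POST /ptth HTTP/1.1\r\n"
        b"Host: server.example\r\n"
        b"Upgrade: PTTH/1.0\r\n"
        b"Connection: Upgrade\r\n"
        b"Content-Length: 0\r\n"
        b"\r\n"
    )
    print(sock.recv(4096).decode())  # hope for "101 Switching Protocols"

    # Roles are now swapped: the server sends us a request on this same
    # connection, and we answer it like a very small web server would.
    print(sock.recv(4096).decode())
    sock.sendall(
        b"HTTP/1.1 200 OK\r\n"
        b"Content-Length: 2\r\n"
        b"\r\n"
        b"ok"
    )
    sock.close()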

PTTH is a neat toy example but does not really address the fundamental problem faced by most modern applications, which is that an HTTP client cannot receive anything from the server in real-time. If it is waiting for something to happen on the server end (say, a message to be received in an IM solution), it has to ask over and over again if the server has anything for it, which is terrible for performance. The first solution introduced was "long-polling" in which the client made a request and then the server simply took a very long time to reply---in fact it just waited and didn't reply until a message arrived. Long polling was a working hack but had downsides, and so it was more or less directly replaced by WebSockets, a protocol in which software that is only capable of HTTP pretends to be capable of arbitrary protocols by using the HTTP upgrade process to turn the connection into a pretend arbitrary byte channel, which is exactly what TCP gives you in the first place, but remember that software only targets Google Chrome and for various reasons (some of them real, none of them good), Chrome only speaks HTTP. The discomfort of that long compound sentence reflects my feelings on websockets. As a "DevOps Engineer" fully half of my job is figuring out why WebSockets don't work.
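For flavor, a long-polling client is about this simple (a sketch; the /poll endpoint, the cursor parameter, and the JSON shape are all made up): the server just sits on the request until it has something to say, and the client immediately asks again.

    import json
    import socket
    import urllib.error
    import urllib.request

    def long_poll(base_url, cursor=0):
        while True:
            url = f"{base_url}/poll?cursor={cursor}"
            try:
                # The client timeout has to comfortably exceed however long
                # the server is willing to hold the request open.
                with urllib.request.urlopen(url, timeout=90) as resp:
                    payload = json.loads(resp.read())
            except (socket.timeout, urllib.error.URLError):
                continue  # nothing arrived in time; ask again
            for message in payload.get("messages", []):
                print("received:", message)
            cursor = payload.get("cursor", cursor)

    # long_poll("http://chat.example")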

These two solutions address the majority of happy-path, user-facing needs for the server to send an instant, out-of-cycle response to the client (that is, to send a response not in reply to a request). We often call this a "push" although that terminology is confusing because it becomes conflated with a very different kind of "push" implemented by e.g. Google Play Services on Android and APNs on iOS, which are basically the device vendor providing TCP-as-a-service in order to solve problems related to mobile networks but mostly ensure vendor lock-in. That's why your friend who only uses AOSP and F-Droid has to bum a charger every twenty minutes. However, all that aside, they fail to address one of the stupidest problems with HTTP implementations to persist into the year 2020, and that, a page of text in, is what this message is about.

Let's imagine that you operate a website which allows people to upload files. There's a problem you might rapidly run into: if someone decides to upload a file which is nineteen yottabytes in size, your web server will eventually exhaust all of its memory and disk space in the attempt to buffer that file, likely before even telling your application about it. In fact, the problem is much less theoretical than this. Web servers handle multiple connections concurrently, but for performance reasons they usually only reserve a very small (a few KiB) buffer for receiving requests. So, they will almost always go to the disk for non-trivial file uploads, which has real performance implications that get worse the longer the connection is open, since you can end up with your whole worker pool busy waiting on the disks.

The point of this is that you want to limit the maximum size of file that a user can upload. You actually want to implement the limit at a slightly lower level than this. There's no need for a client to send a 19YiB file to implement that DoS, it could upload no file at all and instead just send a plain GET request with somehow 19YiB of headers. For this reason, web servers usually offer a limit on request headers and a limit on the request body (likely a form or file but could be other things). In Nginx, for example, these are the large_client_header_buffers and client_max_body_size directives, respectively.

So what happens when a client submits a request with a body that exceeds client_max_body_size? This isn't too far-fetched, as that configuration defaults to 1MiB, so anyone using Nginx for development purposes has probably hit this problem before and had to turn that directive up. What happens is that Nginx responds with HTTP 413 Payload Too Large, the browser shows an error, and everyone knows what happened.
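From the client's side, that well-behaved story looks something like this (a sketch in Python; the host is a placeholder and it assumes nginx's default 1MiB limit): a body just over the limit gets a prompt 413 and everyone knows what happened.

    import http.client

    # POST a body slightly over the default client_max_body_size and read
    # the 413 back. nginx decides based on the Content-Length header, so
    # the rejection arrives before it has read much of the body at all.
    body = b"x" * (1024 * 1024 + 1)

    conn = http.client.HTTPConnection("nginx.example", 80, timeout=10)
    conn.request("POST", "/upload", body=body,
                 headers={"Content-Type": "application/octet-stream"})
    resp = conn.getresponse()
    print(resp.status, resp.reason)  # expect HTTP 413
    conn.close()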

Except that's not actually true. Or rather, it is true sometimes, but in a lot of real situations, the user ends up waiting for a while longer and then seeing an error that either the connection was reset or that the server sent no reply.

The reason for this is, well, the reason for this is that most web browsers are absolute disasters of legacy code and hack upon hack, and when something involving the lower levels of the browser behaves wrong it is very hard to fix. But the better-sounding reason is that the HTTP 413 error is one of the few errors (along with the more exotic 414 URI Too Long) that violate the fundamental request-reply nature of HTTP. An HTTP 413 error is generated by the server before the client completes sending its request. Most HTTP clients simply aren't prepared to handle this.

Without diving into Firefox source code from which I fear I will never return, I suspect that the problem is not even entirely the client's fault. Part of the problem is TCP itself. When the browser sends a large file, it ends up backing up queues in between your computer and the server, and it may take long enough to flush those queues that the server ends up sending an RST before the browser processes the 413 and stops sending. This is analogous to when you accidentally cat a 40GiB file over SSH again and it's hard to stop. The TCP URG flag was intended to fix this but did not because no one implemented it.

For a fun learning experience, try out our inaugural Computers Are Bad Learn-At-Home Exercise. Install Nginx on some junk computer and/or sketchy VPS and configure it with a small client_max_body_size, say 50[2]. Just for convenience set up a little HTML form that allows the user to submit a file, making sure that you set the encoding to multipart/form-data because you have to for reasons. Don't forget a submit button! Load up your shiny new form and upload a small file, say a few kilobytes. You will very likely correctly receive a 413 error. Now, upload a large file, say a gigabyte. You will very likely now get the behavior where the browser tries to upload far more than 50 bytes before displaying a connection reset error. The occurrence of the problem seems to be directly related to the size of the body, and probably ultimately depends on whether or not the browser finishes the entire upload and reads the reply before a packet it sends after nginx closes the connection comes back as an RST. Think of it as a fun race. But don't call it that, "request racing" is a real thing that web browsers do which is different[3].
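If you would rather watch the race from a script than from a browser, the exercise translates to Python roughly like this (the host is a placeholder, and it assumes the nginx above with "client_max_body_size 50;"): the small overrun usually reads back a clean 413, the huge one tends to die mid-send.

    import http.client

    def try_upload(size):
        conn = http.client.HTTPConnection("nginx.example", 80, timeout=30)
        try:
            conn.request("POST", "/upload", body=b"x" * size,
                         headers={"Content-Type": "application/octet-stream"})
            resp = conn.getresponse()
            print(f"{size:>12} bytes -> HTTP {resp.status}")
        except OSError as exc:
            # ConnectionResetError, BrokenPipeError, timeouts: the kernel was
            # still pushing bytes at a socket nginx had already closed.
            print(f"{size:>12} bytes -> {type(exc).__name__}")
        finally:
            conn.close()

    try_upload(10_000)         # a few KB over the limit: usually a clean 413
    try_upload(500_000_000)    # hundreds of MB: usually a reset instead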

Fortunately, the internet being what it is, some developers have found ways to work around these problems by using HTTP in relatively complex ways, for example taking advantage of the HTTP 100 Continue behavior and interesting use of range headers and/or the PATCH verb to upload the file across multiple requests, also allowing for better resumption of failed/interrupted uploads. But none of this is really well standardized, so it requires shipping at least a good chunk of frontend JavaScript and careful configuration of the web server, if not an entirely custom client.
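The general shape of those schemes is something like the following sketch (the endpoint, the Upload-Offset header, and the chunk size are all assumptions, roughly in the spirit of protocols like tus): each request stays small enough to keep the server happy, and a failed upload resumes from the last acknowledged offset instead of starting over.

    import urllib.request

    CHUNK = 5 * 1024 * 1024  # 5 MiB per request; keep each one under the body limit

    def upload_in_chunks(upload_url, path, offset=0):
        with open(path, "rb") as f:
            f.seek(offset)
            while True:
                chunk = f.read(CHUNK)
                if not chunk:
                    break
                req = urllib.request.Request(
                    upload_url,
                    data=chunk,
                    method="PATCH",
                    headers={
                        "Content-Type": "application/offset+octet-stream",
                        "Upload-Offset": str(offset),
                    },
                )
                with urllib.request.urlopen(req) as resp:
                    # Hypothetical: the server replies with the new offset it
                    # has durably stored, which is where we resume from.
                    offset = int(resp.headers.get("Upload-Offset",
                                                  offset + len(chunk)))

    # upload_in_chunks("http://files.example/uploads/abc123", "big.iso")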

All of this is really just a long-winded explanation of the nginx manual's cryptic comment: "Please be aware that browsers cannot correctly display this error." Perhaps a better revision would be "Please be aware that browsers cannot correctly do anything." Patch file incoming.

[1] This is somehow near heresy to many people but it's actually pretty trivially true. Ethernet's addressing is an artifact of Ethernet originally being designed for an architecture in which multiple hosts occupied the same link (often called a "collision domain" when discussing Ethernet). This would be the case if you used a shared-medium physical layer such as thicknet/thinnet, or if you used point-to-point cabling with hubs. However, for many years, Ethernet has been practically exclusively used with point-to-point links, using switches. This is no longer only a practical matter either, as the gigabit ethernet specification explicitly prohibits multiple hosts in a collision domain.

The point is that an Ethernet device never receives a frame on an interface unless that frame is actually intended for it (excluding specialty circumstances like taps in which case really all frames are still intended for the host), and so MAC addresses are entirely unnecessary. Modern LANs could operate exclusively on IP for addressing purposes. It's possible to do this if you buy the right hardware, but "layer 3 switching" in which the "Ethernet" switches perform IP routing is considered an Enterprise Feature and does not come cheap. There's not necessarily a technical reason for this, offloaded IP routing is perfectly possible with only a modest price increase.

And yes, going to "all IP" routing does prevent the use of Ethernet with protocols that the Ethernet switches do not understand and thus breaks the fundamental concept of the layer model, but that's been a shipwreck on the bottom of the ocean for decades anyway. The people that really need NetBEUI or whatever can just keep doing it the old-fashioned way, they're clearly collectors of historical artifacts.

[2] Somehow the nginx documentation is very bad at explaining how the value of this option is parsed. We need to either find the "Configuration file measurement units" section of the manual or, perhaps easier, check the source code to learn that it understands a bare number (interpreted as bytes) or the suffixes k/K and m/M, but not g/G for this value. Look forward to my upcoming patch submission that adds support for y/Y.

[3] Sometimes reading from the storage device used for cache is slower than just downloading a file again, so many web browsers have a behavior where they will sometimes request files from the server simultaneously with reading them from the cache and then use whichever version arrives first. This is called racing and you will probably see it very often in the case of development with localhost, because it seems like web servers are often better at keeping files in memory than web browsers.

Probably because your web browser is using all 6GiB of memory it has allocated to store Facebook cookies and auto-playing TikTok embeds.

SPECIAL SHILL:

The Navajo Nation has been severely impacted by COVID-19 and urgently needs resources to provide health and income support beyond those which the long-defunded federal agencies are able to provide. You can donate online via the Navajo Nation Department of Health at https://www.nndoh.org/donate.html, or by mailing a check made out to the Navajo Nation, with memo "COVID-19 Response," to the Navajo Nation Office of the Controller, PO Box 3150, Window Rock, AZ 86515. Charitable donations to government entities are tax deductible.