_____                   _                  _____            _____       _ 
  |     |___ _____ ___ _ _| |_ ___ ___ ___   |  _  |___ ___   | __  |___ _| |
  |   --| . |     | . | | |  _| -_|  _|_ -|  |     |  _| -_|  | __ -| .'| . |
  |_____|___|_|_|_|  _|___|_| |___|_| |___|  |__|__|_| |___|  |_____|__,|___|
  a newsletter by |_| j. b. crawford

>>> 2023-04-20 different kinds of differential

On the front page of HN today was an [article with a confusing headline], "Farmers 'crippled' by satellite failure as GPS-guided tractors grind to a halt." Of course the headline doesn't actually say this, but it does seem to imply that there has been some kind of failure of GPS.

This is not quite as outlandish as it might sound. The GPS system, maintained by the US Space Force, has occasionally suffered from serious funding shortages and bureaucratic paralysis. In 2009, the GAO issued a concerning report. They found that the program to launch a new generation of GPS satellites was so far behind schedule that the system was in danger of falling below the required 24 satellites. This would result in partial outages of GPS in different parts of the world, and could potentially take years to resolve. In response the Air Force issued a set of strongly worded statements insisting that they had kept GPS working for fifteen years and would continue to do so, and moreover made some changes to accelerate the GPS-III program.

Fortunately, we are now largely out of the woods on this issue, as not just GPS-III but subsequent designs are in service and the GPS constellation has been restored to its full planned size---including on-orbit spares for use in case of an unexpected failure. This is not to say that the administration of GPS is all good news; the GAO continues to issue more or less annual reports on how military programs to acquire more advanced GPS receivers (supporting the newer "M-code" signal) are badly mismanaged. But at least the civilian aspect of GPS should be quite reliable for years to come.

In any case, any significant failure of GPS would become major international news. Another perennial topic of GAO reports is the failure of the US government and infrastructure operators to develop any meaningful backup or alternative for GPS. A GPS failure would cause huge swaths of transportation and communications infrastructure to malfunction. While I remain a huge advocate for the construction of the terrestrial PNT (positioning, navigation, and timing) technology called eLORAN, it would cost money, and not in the F-35 kind of way, so it's unlikely to happen.

What happened in Australia, the article tells us about halfway through, was actually a problem with Inmarsat-4 F1. Inmarsat is a bit of a historical oddity, in the same category as ARINC (Aeronautical Radio, Inc.). It was founded as the International Maritime Satellite Organization, a non-profit entity sponsored by the UN's International Maritime Organization to develop a satellite network for emergency communications at sea. In 1998, though, Inmarsat was privatized, becoming a British company. Despite its historic legacy, Inmarsat is today just one of several major commercial satellite communications networks. It offers two-way telephony and data service, and carries both first and third-party broadcast services.

One of these services is SBAS, the Satellite-Based Augmentation System. "Augmentation System" is a common, if not very specific, term for GPS correction systems. The topic of GPS correction systems is sort of complicated, and we're going to celebrate 04/20 by going into them in some depth. Speaking of celebrating 04/20, this might be more of a ramble than a well-formed narrative, but there are a lot of interesting ideas related to PNT correction.

First, let's start with the concept of GPS augmentation systems. GPS fixes are subject to a number of sources of error. A high-quality GPS receiver with a sufficiently long observation time can eliminate most of the receiver-based positioning error, but there remain some pernicious errors which are hard to eliminate. Two of the most prominent are orbital ephemera and atmospheric effects.

First, orbital ephemera: to produce a GPS fix, receivers need to know the locations of the GPS satellites in orbit. In order to provide this information, ground stations observe the locations of GPS satellites and produce orbital ephemera, sets of parameters that describe the satellites' elliptical paths in relation to the earth. As the term "ephemera" suggests, these are "point in time" measurements that describe the current path of the satellites. Details of how space works, like gravitational perturbations, mean that satellite orbits are prone to changes, and satellites sometimes maneuver to correct their orbits. This is why observations are used to determine the ephemera, and these observations are carried out on a continuous basis.

The GPS operators at the Second and Nineteenth Space Operations Squadrons regularly upload updated ephemera to the GPS satellites, which transmit them for receivers to use. For practical reasons the ephemera are transmitted at a very low bitrate and can take some time to receive, which can contribute to "cold start" times on traditional GPS receivers of upwards of ten minutes. Most modern GPS receivers use some form of "assisted GPS" to reduce this time, with the most common example being the widespread practice of smartphone GPS receivers obtaining current ephemera from a web service instead of "waiting for it to come around" on the low-rate GPS data feed.
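For a rough sense of why waiting on the satellite feed is slow, here is a back-of-the-envelope sketch. None of these figures come from the text above: they are the legacy L1 C/A navigation message parameters as commonly described in the public GPS interface specification (50 bit/s, 300-bit subframes, five subframes per frame, 25 frames for the complete message), so treat them as assumptions.

```python
# Assumed legacy L1 C/A navigation message parameters (not stated in the text above).
BIT_RATE_BPS = 50          # navigation data rate, bits per second
BITS_PER_SUBFRAME = 300
SUBFRAMES_PER_FRAME = 5
FRAMES_PER_MESSAGE = 25    # the complete message is spread across 25 frames


def seconds_to_receive(bits, bit_rate=BIT_RATE_BPS):
    """Time to receive `bits` at `bit_rate`, ignoring sync time and decode errors."""
    return bits / bit_rate


frame_bits = BITS_PER_SUBFRAME * SUBFRAMES_PER_FRAME
frame_seconds = seconds_to_receive(frame_bits)
full_message_seconds = frame_seconds * FRAMES_PER_MESSAGE

print(f"one frame: {frame_seconds:.0f} s")
print(f"complete message: {full_message_seconds / 60:.1f} min")
```

Under those assumptions a receiver that misses the start of a frame, or drops a few bits on a weak signal, simply has to wait for the data to come around again, which is how a cold start stretches into minutes.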

The problem is that new ephemera are only produced hourly, and the orbits of satellites vary on a minute level too rapidly for hourly observations to keep up with. Additional latency in the upload process means that the ephemera transmitted by GPS satellites can be several hours old. The observations are somewhat limited in precision anyway, considering the incredible precision modern GPS receivers are capable of. So, variation in the actual orbit of satellites from the ephemera leads to ephemeris error in GPS fixes.

Ephemeris error isn't actually that big of a source of error, but it's an interesting one so I wanted to talk about it anyway. Besides, it has introduced us to the term "assisted GPS" or "AGPS," which is unrelated to but often confused with augmentation systems. Assisted GPS just refers to the use of an alternate data path, such as IP, to obtain the current GPS network information that receivers need to make a first fix.

Second are atmospheric effects. GPS positioning relies on extremely accurate time-of-flight measurements. Unfortunately, the atmosphere is a weirdly complex thing, and numerous effects (such as "weather") result in radio frequency radiation traveling by indirect, non-linear paths. This problem is especially acute when dealing with satellites since, well, they're up in space, and so the received signals have to travel through a whole lot of atmosphere to get down here. This includes the ionosphere, which interacts with RF in particularly strange ways and can add significant travel time.
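To put "extremely accurate" in perspective, here is a small sketch converting extra signal delay into apparent extra distance. The delay values are purely illustrative and are not measurements of any real atmospheric condition; the only physics assumed is distance = speed of light x time.

```python
SPEED_OF_LIGHT_M_S = 299_792_458  # exact, by definition of the metre


def range_error_m(extra_delay_s):
    """Apparent extra distance caused by an unmodelled signal delay."""
    return SPEED_OF_LIGHT_M_S * extra_delay_s


# Illustrative delays only, chosen to show the scale of the problem.
for delay_ns in (1, 10, 50):
    error = range_error_m(delay_ns * 1e-9)
    print(f"{delay_ns:>3} ns of delay -> {error:.2f} m of range error")
```

Every nanosecond of unexplained delay is worth about 30cm of range, which is why a receiver cannot just shrug off the atmosphere.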

There are a number of approaches to reducing these errors, but one of the most common is differential GPS or DGPS. DGPS relies on a convenient property of most sources of GPS error: satellite position, satellite clock drift, and atmospheric conditions all tend to contribute a pretty similar error across a region. This means that if you can measure the error of GPS fixes in one place, the error at other nearby places is probably pretty similar. This is exactly what DGPS does: a reference station with a precisely surveyed location uses a high-quality GPS receiver to obtain very accurate fixes. The difference between the GPS fix and known location is then distributed as a correction signal that can be applied by other GPS receivers in the same area, canceling much of the inaccuracy. DGPS is one of the most common types of augmentation system, and is widely used by higher-quality GPS receivers.
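The arithmetic behind that description is simple enough to sketch. This is a deliberately simplified, position-domain version that follows the paragraph above: coordinates are local east/north offsets in metres and all of the numbers are hypothetical. As I understand it, real systems usually distribute per-satellite range corrections rather than a single position offset, so take this as an illustration of the idea and not of any particular DGPS service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fix:
    """A position as local east/north offsets in metres (a simplification)."""
    east_m: float
    north_m: float


def correction(surveyed: Fix, measured: Fix) -> Fix:
    """Offset that moves the reference station's GPS fix onto its surveyed location."""
    return Fix(surveyed.east_m - measured.east_m,
               surveyed.north_m - measured.north_m)


def apply_correction(raw: Fix, corr: Fix) -> Fix:
    """Apply the reference station's offset to a nearby receiver's raw fix."""
    return Fix(raw.east_m + corr.east_m, raw.north_m + corr.north_m)


# Hypothetical numbers: the reference station knows where it really is...
reference_surveyed = Fix(1000.0, 2000.0)
# ...but GPS currently places it 2.5 m too far east and 2.0 m too far south.
reference_measured = Fix(1002.5, 1998.0)
# A nearby receiver is assumed to see roughly the same error.
rover_raw = Fix(5402.5, 7298.0)

corr = correction(reference_surveyed, reference_measured)
rover_corrected = apply_correction(rover_raw, corr)
print(corr)
print(rover_corrected)
```

The whole scheme rests on the assumption in the last comment: the further the receiver is from the reference station, the less the two errors have in common.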

Numerous augmentation systems exist, which makes them more complicated to talk about. The big issue is how the correction signal is distributed. Historically, one of the earliest major DGPS systems was the NDGPS or Nationwide DGPS. NDGPS is operated by the US Coast Guard (having originally been developed for maritime navigation) and uses a network of sites across the US that transmit correction signals at around 300kHz. NDGPS has a great historical detail: when it was expanded from coastal stations only to a nationwide system, a great many of the new inland correction stations were installed on recently retired sites of the Ground Wave Emergency Network (GWEN). GWEN was a short-lived survivable radio system operated by the military for continuity of government and command and control purposes, and much of the modern conspiracy theory around 5G can be directly traced to historic conspiracy theories surrounding GWEN as a government mind control system.

NDGPS is now being retired, though, as WAAS is viewed as a replacement. WAAS, the Wide Area Augmentation System, is the FAA's version of the same concept, intended for aviation use. Most WAAS reference stations are installed on the roofs of air route traffic control centers, with some others added as needed to fill gaps. WAAS differs from NDGPS in an important way: WAAS correction signals are actually distributed by three different commercial communications satellites. WAAS comes from space, just like GPS itself.

This is a good time to point out an odd bit of terminology: DGPS and augmentation systems are essentially the same thing, but for largely historic reasons "DGPS" usually refers to correction signals distributed by terrestrial radio while "augmentation system" usually refers to correction signals distributed by satellite.

You can probably see where this goes with Inmarsat. WAAS provides complete coverage of North America, but not of elsewhere. Inmarsat operates equivalent space-based augmentation signals as a contractor to several national governments, including Australia, New Zealand, and likely soon the UK. And that's what broke: some kind of problem with the satellite caused a disruption of the SBAS feed for Australia.

While we're talking about GPS augmentation, we should also talk briefly about RTK, Real-Time Kinematic correction. RTK, DGPS, and augmentation are sometimes used almost synonymously, but once again there is a conventional difference in the meanings. RTK most literally refers to the observation of the phase of the GPS signals. The phase can be used as additional timing information to estimate the distance between the receiver and the satellite. Microwave GPS signals have a short wavelength (about 19cm), and electronics for observing phase difference can be made very precise, so this method can produce extremely accurate fixes. The problem is that RTK only tells you where the receiver is within a wavelength, or in other words it tells you where you are in a 19cm window, but not which 19cm window you're in.
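Both the 19cm figure and the "which window" problem can be sketched in a few lines. The carrier frequency here is an assumption on my part (the GPS L1 carrier at 1575.42 MHz, which the text does not name), and the phase reading and cycle counts are made up for illustration.

```python
SPEED_OF_LIGHT_M_S = 299_792_458
L1_FREQUENCY_HZ = 1575.42e6   # assumed: the GPS L1 carrier

wavelength_m = SPEED_OF_LIGHT_M_S / L1_FREQUENCY_HZ


def candidate_ranges_m(phase_fraction, cycle_counts):
    """Ranges consistent with one phase reading, one per possible whole-cycle count.

    `phase_fraction` is the measured position within a cycle (0.0 to 1.0).
    The receiver cannot tell the whole-cycle counts apart from phase alone.
    """
    return [(n + phase_fraction) * wavelength_m for n in cycle_counts]


print(f"L1 wavelength: {wavelength_m * 100:.2f} cm")

# Hypothetical phase reading of a quarter cycle and three neighbouring cycle counts.
for r in candidate_ranges_m(0.25, range(105_000_000, 105_000_003)):
    print(f"possible range: {r:.3f} m")
```

Each candidate is equally consistent with the phase measurement, and neighbouring candidates differ by exactly one wavelength. Picking the right one is the part that needs outside help.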

To resolve this mystery, RTK is almost always used with a reference station so that the phase can be compared between a fixed point (with a location determined through long, 48-hour+ observation) and the moving receiver used for surveying in the field. Since RTK is mostly used by surveyors who are striving for huge accuracy and aren't in a hurry, it's most often done with a portable reference station installed nearby for the duration of the surveying project. If you've ever run into a skywards antenna connected to a chained-down Jobox with a solar panel, it's very likely an RTK reference station installed for some highway construction project. While RTK tends to imply DGPS, it's a separate technique, and produces even higher precision than DGPS or augmentation. It's also possible to use RTK without a reference station at all, by analyzing fixes from multiple locations later... a method referred to as virtual reference station.

So how much does this stuff actually impact our lives? Well, for the most part we tend to use smartphones for PNT in our daily lives, and smartphones have a somewhat different bag of tricks that relies on the cellular network to assist positioning. The thing is, smartphone GPS receivers really aren't very accurate at all. They rely on network assistance not for precision but in order to obtain any fix at all. Smartphones are constantly used in situations like urban environments and even indoors where GPS reception is poor and subject to huge multipath error. Cellular receivers bootstrap based on hints from the network (using the known locations of cellular towers) to accelerate GPS fix, and to produce a fix at all when they can't detect signals from a sufficient number of GPS satellites.

And that's it on PNT for today, but maybe I'll come back to shill more for eLORAN in the future.

--------------------------------------------------------------------------------

>>> 2023-04-10 solving problems with chatgpt

One of the foundational goals of computer technology, at least as understood by popular culture, is to automate away our jobs. When your job is (put simply) to get computers to work correctly, there's sort of an irony to this view of things. Still, just within the span of my career there have been technological advancements that have both automated away parts of my job (Puppet might be an example here, the prior art of CFEngine notwithstanding) and given me plenty more to bill time for (Kubernetes).

With the rise of large language models (LLMs), there is a large camp of people who think these ML applications are going to automate away larger portions of more jobs. I find myself curious: should I fear for my employment?

The thing is, a large portion of my profession could be accurately described as "Googling it," although lately I'm racking up a Kagi bill instead. Obviously this is an important strategy for things like error messages where it's often faster to see if someone has solved the same problem before than to figure it out from first principles. But it's sort of true for routine tasks as well... maybe my memory is just unusually poor, but there's a huge set of tasks where I will repeatedly look up the same Digital Ocean guide because I just want to copy-paste their config file snippets instead of having to look in docs to remember the names of the right directives.

Of course, what makes me a "professional" is that I have opinions about the correct ways certain things should be done, so I often ignore parts of these guides or make changes to suit my preferences on important issues like Unix domain sockets or localhost network sockets for communication with application servers. In a way it's sort of silly to follow some guide on best-linux-tutorials.biz and then just ignore the parts I disagree with but, well, that's what I tend to do. So I wonder: would a better approach be an LLM?

In today's episode, I'm going to ask ChatGPT for guides for some increasingly complex Linux sysadmin and DevOps tasks and then see whether or not I agree with its output. I don't really intend this as that serious of a critique of the use of ML in operations but hey, maybe we'll learn something useful!

The LEMP Stack

Back when I was a child first learning to hate software, the dominant stack for websites was "LAMP." This stood for Linux, Apache, MySQL, PHP. Nowadays it seems that at least a few people are saying "LEMP" for Linux, eNginx, Mymariadb, PHP? I don't like this, aesthetically. In any case, there's a number of things in PHP that I tend to deploy a lot, Dokuwiki being a prime example. This means that I am repeatedly wanting to set up Nginx to proxy requests to PHP-FPM, which is not difficult but does involve remembering a couple of config directives and where php-fpm puts its socket by default. Most of the time I don't actually want an RDBMS (personally I usually just use sqlite for everything) so for a long time I've googled for some guide and copied their snippets while ignoring the parts about MySQL/MariaDB. Let's see if ChatGPT has something more to the point for this very common and simple task.

Prompt: "how do I set up nginx and php?"

To set up Nginx and PHP on your server, follow these steps:

  1. Install Nginx on your server:

For Ubuntu/Debian:

 sudo apt update
 sudo apt install nginx

For CentOS/RHEL:

 sudo yum update
 sudo yum install nginx

The output starts off strong by offering snippets for both "Ubuntu/Debian" and "CentOS/RHEL." These two cover the good majority of the Linux server landscape, and while I could quibble with the label "CentOS/RHEL" rather than something that doesn't invoke the mostly-dead CentOS project like "RHEL/Fedora," ChatGPT is following the same convention most people do.

The instructions it provides are pretty reasonable, but there's something interesting here that those of you familiar with the two dominant Linux package manager families might detect. For Debian, it tells us to 'update' and then 'install.' For RHEL, it tells us to 'update' and then 'install.' These are neatly parallel except that the 'update' subcommands of apt and yum do fairly different things! On Debian it will retrieve the repo metadata and then install nginx (upgrading software is a different subcommand), on RHEL it will upgrade all of the managed software on the system and then install nginx (repo metadata is retrieved automatically by yum when it exceeds a configured 'stale' timeout).

I'm not sure how much we can make of this. It seems likely to me that we are seeing ChatGPT's lack of understanding of the underlying material: it is extremely common for people to 'update' and then 'install' on both platforms, so each in isolation is pretty reasonable, but it's odd for it to put them in parallel without noting that they will do different things. If the 'yum update' were an intentional choice we should see 'apt upgrade,' but we don't.
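To spell out the mismatch, here is a small sketch that pairs up subcommands by what they actually do, not by what they're called. It only builds argument lists and runs nothing. The choice of 'yum makecache' as the metadata-only counterpart to 'apt update' is my own assumption about the closest equivalent, not something from ChatGPT's output.

```python
# Subcommands grouped by effect. "yum makecache" is assumed here to be the
# closest metadata-only counterpart to "apt update".
COMMANDS = {
    "apt": {
        "refresh_metadata": ["apt", "update"],
        "upgrade_everything": ["apt", "upgrade"],
    },
    "yum": {
        "refresh_metadata": ["yum", "makecache"],
        "upgrade_everything": ["yum", "update"],
    },
}


def install_steps(manager, package, upgrade_first=False):
    """Argument lists to refresh metadata, optionally upgrade, then install a package."""
    steps = [COMMANDS[manager]["refresh_metadata"]]
    if upgrade_first:
        steps.append(COMMANDS[manager]["upgrade_everything"])
    steps.append([manager, "install", package])
    return steps


for manager in ("apt", "yum"):
    print(manager, install_steps(manager, "nginx"))
```

Written this way, an honestly parallel guide either refreshes metadata on both platforms or upgrades everything on both, and says which one it is doing.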

  1. Install PHP and the required PHP extensions:

For Ubuntu/Debian:

 sudo apt install php-fpm php-mysql php-mbstring

For CentOS/RHEL:

 sudo yum install php-fpm php-mysql php-mbstring

This is a little odder. It says "the required PHP extensions" which are apparently mysql and mbstring. It's not clear where or why these are required, which makes me think ChatGPT got this command from some guide on deploying specific software. WordPress might be to blame here as it requires mbstring, and the sheer number of LAMP/LEMP guides is probably to blame for the assumption that we will need the mysql extension.

  1. Configure Nginx to use PHP:

Create a new server block configuration file in /etc/nginx/sites-available directory with the following content:

 server {
   listen 80;
   server_name example.com;
   root /var/www/html;
   index index.php index.html index.htm;

   location / {
       try_files $uri $uri/ /index.php?$query_string;
   }

   location ~ \.php$ {
       include snippets/fastcgi-php.conf;
       fastcgi_pass unix:/var/run/php/php7.4-fpm.sock;
   }
 }

Replace example.com with your domain name and php7.4 with your installed PHP version.

This is pretty much what you would find in any guide. One interesting thing is the 'try_files' directive. There are about a million ways to set this up in nginx and some people prefer different approaches, but what we see here is pretty typical of php setup guides and should work fine. The catch is explicitly passing the $query_string which sort of gets eaten otherwise when nginx internally redirects to use index.php.

  1. Test your Nginx configuration:

    sudo nginx -t

  2. Reload Nginx to apply the new configuration:

    sudo systemctl reload nginx

Hey, nice work! A surprising number of people don't know about '-t' to validate config files.

  1. Create a PHP file to test if everything is working:

    sudo nano /var/www/html/info.php

Add the following content:

 <?php
   phpinfo();
 ?>

  1. Open your web browser and navigate to http://your-server-ip/info.php. You should see the PHP information page.

That's it! You now have Nginx and PHP installed and configured on your server.

So yeah, this isn't exactly part of the install process, but still very normal for guides.

Overall I'd say this is about an 8/10. The only things I disliked were:

  1. Telling the user to upgrade packages on RHEL... not an unusual thing in guides to update first "for good measure" but I personally feel that it's not a great practice to tell the user to 'yum update' without explaining that it will upgrade everything.
  2. Telling the user to install mysql and mbstring, which aren't really going to do any harm, but if you're going to include a "common" set of PHP extensions in the instructions it should be bigger than just these two. It's just odd.

'du' and 'df' disagree

This is actually an interview question I like to use for Linux-intensive positions. It might seem a little in the weeds but to be honest I think I run into this situation in real life probably at least once a month. The problem looks like this: tools like 'df' tell you that a partition is 99% or 100% full, and you might be having problems because of the full partition. But when you use 'du' to investigate, it shows a total usage that is appreciably less than 'df' does. Why do these two-letter commands not get along? Is dd somehow the answer?

Prompt: On Linux, a partition is full but du doesn't show as much data as df does.

If you are seeing a discrepancy between the output of du and df on a Linux system, where df reports that a partition is full but du does not show as much data, it's possible that there are files that are being held open by processes and therefore are not being deleted even though they have been unlinked (deleted).

This actually isn't the answer that I expect but it's a reasonable answer. This could very well be the cause, I just find that this situation is less common than having a mount that "shadows" data on the volume at the next higher mount point. For example, say you have a volume mounted at / and a volume mounted at /home. If you have some kind of problem that results in /home failing to mount, or changed the filesystem setup at some point, or even have weird boot-order problems, you can sometimes end up with data in /home/ on the volume mounted at /. This data counts against the usage of the volume at / but won't show up in tools like 'du' since it's "shadowed" by /home/ now being a mountpoint to another volume. Easy problem to have if you are rearranging the volumes/mounts on a system that's already been in use.
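Here is a rough sketch of the comparison itself: one function asks the filesystem how much is in use (what 'df' does) and the other walks the directory tree adding up what it can reach without crossing onto another volume (roughly 'du -x'). The function names are my own, and it assumes a Unix-like system where stat results include st_blocks.

```python
import os
import shutil


def df_used_bytes(path):
    """Bytes in use on the filesystem holding `path`, roughly what df reports."""
    return shutil.disk_usage(path).used


def du_used_bytes(path):
    """Allocated bytes reachable from `path` without crossing onto other filesystems."""
    root = os.lstat(path)
    seen = {(root.st_dev, root.st_ino)}
    total = root.st_blocks * 512          # st_blocks is counted in 512-byte units

    def account(full_path):
        """Add one entry to the total; return True if it is on the same filesystem."""
        nonlocal total
        try:
            st = os.lstat(full_path)
        except OSError:
            return False                  # vanished or unreadable: skip it
        if st.st_dev != root.st_dev:
            return False                  # a mount point for another volume
        if (st.st_dev, st.st_ino) not in seen:   # count hard links only once
            seen.add((st.st_dev, st.st_ino))
            total += st.st_blocks * 512
        return True

    for dirpath, dirnames, filenames in os.walk(path):
        # Prune directories on other filesystems so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if account(os.path.join(dirpath, d))]
        for name in filenames:
            account(os.path.join(dirpath, name))
    return total
```

Run as root, df_used_bytes("/") - du_used_bytes("/") gives the space the walk cannot account for. Some gap is normal (filesystem metadata, for one), but a gap measured in gigabytes points at either shadowed data or deleted-but-open files. Note that the walk skips /home when it is a mount point, which is exactly why anything sitting underneath it on the root volume goes uncounted.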

I'll omit the rest of the output, but it makes sense. It tells us to use 'lsof' to figure out what process is keeping the files open and then kill that process. The guide does include the requisite paragraph that you need to investigate to determine how this happened in the first place.
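The deleted-but-open case is easy to reproduce on purpose, which makes for a nice demonstration. This sketch assumes a Unix-like system; the helper name is my own.

```python
import os
import tempfile


def unlinked_but_open_bytes(fd):
    """Return the size still held by `fd` if its file has no remaining names, else 0."""
    st = os.fstat(fd)
    return st.st_size if st.st_nlink == 0 else 0


# Create a file, keep it open, then delete its only name.
fd, path = tempfile.mkstemp()
os.write(fd, b"x" * 4096)
os.unlink(path)

still_exists_by_name = os.path.exists(path)
held_bytes = unlinked_but_open_bytes(fd)
print(f"visible by name: {still_exists_by_name}, bytes still held open: {held_bytes}")

os.close(fd)   # only now can the filesystem release the space
```

Between the unlink and the close, 'du' has no path to find the file by, while 'df' still counts its blocks as used. Scale that up to a multi-gigabyte log file held open by a long-running daemon and you get the situation ChatGPT described.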

Overall I'm giving this a 5/10, because it gave a reasonable answer but omitted the possibility of mount shadowing. Which of these two causes is more likely depends on the system and workload, but I think a thorough answer should include both possibilities.

Per-process network policy

One sort of common "advanced" Linux networking scenario is when you are using a full-tunnel VPN and want to route all traffic through it, but you have to get the VPN itself to connect to its endpoint without trying to go through itself. There are a few ways to solve this problem, but one of the less common and (in my opinion) more elegant approaches is to get the VPN service to use its own special routing table. This same concept also makes it easy to set up firewall policy around the VPN to prevent leakage. Now there are a lot of caveats to this and I'm really just talking about userspace VPNs here, but that probably makes it a good challenge for ChatGPT.

Prompt: How do I configure per-process route tables on Linux

To configure per-process route tables on Linux, you can use the ip command and the setsockopt() system call.

Here are the steps to follow:

  1. Create a new routing table:

    sudo echo "200 mytable" >> /etc/iproute2/rt_tables

This will create a new routing table with the name mytable and the ID 200.

It maybe seems a little weird that this is the way to create a routing table but it's fine. The iproute2 stuff that facilitates multiple routing tables without multiple headaches is pretty new and there isn't a lot of porcelain around it... and I'm not sure there ever will be, this isn't a very common task and while editing the file seems a little old-school compared to most of the contemporary network tooling it works just fine. There's very little need to expose this over a bus or anything.

  1. Configure the routes for the new table:

    sudo ip route add default via <gateway> dev <interface> table mytable

This command will add a default route to the new table with the specified gateway and interface.

I will take this moment to make a couple of funny observations about the mechanics of ChatGPT's output. First, every step in this guide is numbered 1. Some things here are probably copy-paste errors on my part (I'm reformatting the output to look better in plaintext), but that isn't one of them: this output has four step ones. Another is the code blocks. I'm just using plaintext but ChatGPT uses nice embedded code blocks with syntax highlighting. Syntax highlighting means that it needs to know (or guess) the language it's highlighting and the codeblocks show what language they're set to. It is... seemingly random? The above shell command is 'csharp', but 'sql' seems to be a more common guess for shell commands than even 'bash'. This just means that most of the time the syntax highlighting doesn't really make sense.

I'm not sure where either of these are going wrong, but I get the feeling (without doing any actual research) that ChatGPT is outputting markdown. Markdown numbered lists can be fiddly and there are a few ways to end up with all step 1 like this. When it comes to the code blocks, I'm not sure if ChatGPT is specifying the language or if the code block service is guessing it based on contents. I tend to guess the former because otherwise the guesses are very weird. I can't see any reason it would think one of these commands is csharp and the other is sql, they both have the same set of characters in them.

And yeah, this instruction makes perfect sense. You need to add some routes to the new route table.

  1. Configure the process to use the new table:

    sudo ip rule add from <ip_address> table mytable

This command will add a new rule to use the mytable routing table for traffic originating from the specified IP address.

:thinking_emoji:

There's some clear semantic confusion here. I think ChatGPT is just going too readily to the many guides it's read that create a route table and then apply that route table based on ip rules. This is a very common thing to do, I would wager the most common way to use custom route tables, but it doesn't help in our case. The intro of this step is also just wrong (this has nothing to do with the process), but the sentence after correctly explains what it does.

  1. Finally, start the process with the setsockopt() system call to bind it to the new table:

    int table_id = 200;
    setsockopt(socket_fd, SOL_SOCKET, SO_BINDTODEVICE, "eth0", strlen("eth0")+1);
    setsockopt(socket_fd, SOL_SOCKET, SO_MARK, &table_id, sizeof(table_id));

This code will bind the process to the eth0 interface and mark its packets with the mytable routing table ID.

This is, well, it's not exactly wrong but it's also not right.

If I fiddled with the prompt I'm sure I could get it to output something that makes more sense, but I also feel like this is an unreasonable way to interpret the prompt I did provide. To be clear, I expected that it would use cgroups and iptables to apply the new custom route table to some existing application. This seems like a way more common situation than to... write or modify an existing application in C. On the upside, the code block correctly has the language as 'c'!

All that said, if we start from the assumption that we are writing a new application this still doesn't work exactly. The option we see here, SO_MARK, applies a 'mark' in the sense of netfilter to traffic on the socket. Marks are a super useful feature that allows us to set an arbitrary tag on each packet that we can then refer to in our other network policy. A very common use for marks is to match the traffic up with a policy routing rule that selects the routing table for that traffic... but ChatGPT didn't tell us to set that rule, it had us set the rule based on source address instead. I think maybe ChatGPT was trying to cover this in step 1 number 3 but it didn't quite choose the right rule.
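To make the missing piece concrete, here is a sketch of what a consistent version of the "new application" approach might look like: the routing rule is keyed on the mark and the application sets that same mark on its socket. The gateway address and interface name are placeholders of my own, the first function only builds the iproute2 commands as argument lists without running them, and setting SO_MARK normally requires CAP_NET_ADMIN, so expect the second function to report failure for an unprivileged user.

```python
import socket


def policy_routing_commands(mark, table, gateway, device):
    """iproute2 commands (as argument lists) that send marked traffic to its own table."""
    return [
        ["ip", "route", "add", "default", "via", gateway, "dev", device, "table", table],
        ["ip", "rule", "add", "fwmark", str(mark), "table", table],
    ]


def mark_socket(sock, mark):
    """Try to set SO_MARK on `sock`; return True on success, False otherwise."""
    so_mark = getattr(socket, "SO_MARK", None)   # only defined on Linux
    if so_mark is None:
        return False
    try:
        sock.setsockopt(socket.SOL_SOCKET, so_mark, mark)
    except OSError:                              # typically EPERM without CAP_NET_ADMIN
        return False
    return True


# Placeholder gateway (a documentation address) and interface name.
for argv in policy_routing_commands(200, "mytable", "192.0.2.1", "eth0"):
    print(" ".join(argv))

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    print("socket marked:", mark_socket(s, 200))
```

The point is just that the number passed to SO_MARK has to be the number some rule matches on. Marking packets with 200 and then writing a rule about source addresses leaves the mark doing nothing.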

Repeat steps 3 and 4 for each process that needs to use the new routing table.

There are no steps 3 and 4! There is only step 1!

Overall I'd give this a 3/10. Honestly I think ChatGPT did better than I expected when considering that this is not a very common use-case, but ChatGPT's output is worse than what you get if you google the same question.

Takeaways

This isn't intended to be a super carefully thought out research piece, but like others, I've seen reporting that suggests that ChatGPT is pretty good at generating source code but has a tendency to leave subtle errors. I wondered if the same thing might be true in the old-school system administration space, but the feeling I walked away with is a little different: ChatGPT does make a lot of errors, but they aren't particularly subtle. I suspect this has to do with the different nature of the training material for these kinds of questions, which is more in the form of narrative guides and documentation that keep steps simple but leave a lot of assumptions.

I pretty much covered every question I thought to ask in a short time spent playing around. There was one question that I was just unable to get ChatGPT to generate a good answer for... an AWS network architecture question involving load balancing for both HTTP(S) and arbitrary TCP services on the same elastic IP when you want layer 7 behavior. I could mark this as a 0/10 for ChatGPT but it might be a case of more "prompt engineering" being required... it kept producing output that was reasonable but failed to address one or more of my requirements, so I kept making the requirements more explicit, and then it would just miss a different set of requirements. It may very well be possible to get ChatGPT to produce a correct solution but it was definitely getting to be less useful than a search engine, even with how difficult the AWS documentation can be to use for architecture questions.

On the whole, I felt that ChatGPT was performing more poorly than Google for similar queries. It's possible to get ChatGPT to refine its output by adding more detail (at least if you don't get too deep into AWS networking capabilities), which is a big plus over a conventional search engine, but honestly it still didn't feel to me like this was a savings of effort over reading a few different articles and synthesizing.

One of the reasons this was on my mind is because I'm working with a client right now who has an interesting habit of copying and pasting all the error messages they get into ChatGPT, while still screen sharing. From this sort of eavesdropping on AI I have not been very impressed with its output, which has often been high-level to the point of uselessness. It sort of has the feel of AWS documentation, actually... he would put a very specific error from, let's say, Terraform into ChatGPT and it would answer with a few paragraphs about how Terraform works at a high level. I'm sure this can be improved with a more specific training corpus but I'm not sure what that corpus would be, exactly, which continues to stymie my plans to just forward all the emails I get from clients to ChatGPT and give them the response.

I mean, I think it would keep them going back and forth for a while, but they might feel like it's a good value for money.

--------------------------------------------------------------------------------

>>> 2023-03-24 docker

Lately I tend to stick to topics that are historic by at least twenty years, and that does have a lot of advantages. But I am supposedly a DevOps professional, and so I will occasionally indulge in giving DevOps advice... or at least opinions, which are sort of like advice but with less of a warranty.

There's been a lot of discussion lately about Docker, mostly about their boneheaded reversal following their boneheaded apology for their boneheaded decision to eliminate free teams. I don't really care much about this event in terms of how it impacts my professional work. I long ago wrote off Docker, Inc. as a positive part of the DevOps ecosystem. But what's very interesting to me is how we got here: The story of Docker, Docker Inc., Docker Hub, and their relation to the broader world of containerization is endlessly fascinating to me.

How is it that Docker Inc., creator of one of the most important and ubiquitous tools in the modern software industry, has become such a backwater of rent-seeking and foot-shooting? Silicon Valley continually produces some astounding failures, but Docker stands out to me. Docker as a software product is an incredible success; Docker as a company is a joke; and the work of computing professionals is complicated by the oddly distant and yet oddly close connection between the two.

Docker, from a technical perspective, is more evolutionary than revolutionary. It mostly glued together existing Linux kernel features, following a road that had at least been graded, if not paved and striped, by projects like LXC. Docker as a concept, though, had a revolutionary impact on the DevOps field. Docker quickly became one of the most common ways of distributing server-side software, and whole development workflows rearranged themselves around it. Orchestration tools like the ones we use today are hard to picture without Docker, and for many professionals Docker is on par with their text editor as a primary tool of the trade.

But underlying all of this there has always been sort of a question: what is Docker, exactly? I don't necessarily mean the software, but the concept. I have always felt that the software is not really all that great. Many aspects of Docker's user interface and API seem idiosyncratic; some of the abstraction it introduces is more confusing than useful. In particular, the union file system (UFS) image format is a choice that seems more academically aspirational than practical. Sure, it has tidy properties in theory, but my experience has been that developers spend a lot more time working around it than working with it.

All this is to say that I don't think that Docker, the tool, is really all that important. In a different world, LXC might have gained all this market share. Had Docker not come about, something like containerd would likely have emerged anyway. Or perhaps we would all be using lightweight VMs instead; academic and commercial research tends to show that the advantages containers have over more conventional paravirtualization are far smaller than most believe.

I would argue that the Docker that matters is not software, but a concept. A workflow, you might say, although I don't think it's even that concrete. The Docker that swept DevOps like a savior come to spare us from Enterprise JavaBeans isn't really about the runtime at all. It's about the images, and more about the ease of programmatically creating images. Much of this benefit comes from composition: perhaps the most important single feature of Docker is the FROM keyword.

So Docker is an open-source software product, one that is basically free (as in beer and as in freedom) although hindered by a history of messy licensing situations. Docker is also a company, and companies are expected to produce revenue. And that's where other facets of the greater identity we call "Docker" come to light: Docker Desktop and Docker Hub.

Docker Desktop isn't really that interesting to me. Docker is closely coupled to Linux in a way that makes it difficult to run on the predominant platform used by developers [1]. Docker Inc. developed Docker Desktop, a tool that runs Docker in a VM using fewer clicks than it would take to set that up yourself (which is still not that many clicks). Docker Inc. then needed to make money, so they slapped a licensing fee on Docker Desktop. I responded by switching to Podman, but I get that some people are willing to pay the monthly fee for the simplicity of Docker Desktop, even if I feel that the particular implementation of Docker Desktop often makes things harder rather than easier.

Also I find the Docker Desktop "GUI" to be incredibly, intensely annoying, especially since Docker Inc. seems to pressure you to use it in a desperate attempt to dig what Silicon Valley types call a moat. But I fully acknowledge that I am a weird computer curmudgeon who uses Thunderbird and pines for the better performance of, well, pine.

Still, the point of this tangent about Docker Desktop is that Docker's decision to monetize via Desktop---and in a pretty irritating way that caused a great deal of heartburn to many software companies---was probably the first tangible sign that Docker Inc. is not the benevolent force that it had long seemed to be. Suddenly Docker, the open-source tool that made our work so much easier, had an ugly clash with capitalism. Docker became a FOSS engine behind a commercial tool that Docker Inc. badly wanted us to pay for.

Docker Desktop also illustrates a recurring problem with Docker: the borders between free and paid within the scope of their commercial products. Docker Desktop became free for certain use-cases including personal use and use in small businesses, but requires a paid subscription for use in larger companies. This kind of arrangement might seem like a charitable compromise but is also sort of a worst-of-both-worlds: Docker Desktop is free enough to be ubiquitous but commercial enough to pose an alarming liability to large companies. Some companies exceeding Docker's definition of a small company have gone as far as using their device management tools to forcibly remove Docker Desktop, in order to mitigate the risk of a lawsuit for violating its license.

There is a fundamental problem with "free for some, paid for others": it requires that users determine whether or not they are permitted to use the tool for free. Even well-intentioned users will screw this up when the rules require knowledge of their employer's financials and, moreover, are in small print at the very bottom of a pricing page that says "free" at the top. Personally, I think that Docker Inc.'s pricing page borders on outright deception by making the licensing restrictions on Docker Desktop so unobvious.

Docker Hub, though: Docker Hub is really something.

That most compelling feature of Docker, the ability to easily pull images from somewhere else and even build on top of them, depends on there being a place to pull images from. It's easy to see how, at first, Docker Inc. figured that the most important thing was to have a ubiquitous, open Docker registry that made it easy for people to get started. In this way, we might view Docker Hub as having been a sort of scaffolding for the Docker movement. The fact that you could just run 'docker pull ubuntu' and have it work was probably actually quite important to the early adoption of Docker, and many continue to depend on it today.

Docker Hub, though, may yet be Docker's undoing. I can only assume that Docker did not realize the situation they were getting into. Docker images are relatively large, and Docker Hub became so central to the use of Docker that it became common for DevOps toolchains to pull images to production nodes straight from Docker Hub. Bandwidth is relatively expensive even before cloud provider margins; the cost of operating Docker Hub must have become huge. Docker Inc.'s scaffolding for the Docker community suddenly became core infrastructure for endless cloud environments, and effectively a subsidy to Docker's many users.

It's hard to blame Docker Inc. too much for flailing. Docker Hub's operating costs were probably unsustainable, and there aren't a lot of options to fix this other than making Docker Hub expensive, or making Docker Hub worse, or both. Docker Inc. seems to have opted for both. Docker Hub is not especially fast, in fact it's pretty slow compared to almost any other option. Docker Hub now imposes per-IP quotas, which probably would have been totally reasonable at the start but was a total disaster when it was introduced post-hoc and suddenly caused thousands, if not millions, of DevOps pipelines to intermittently fail.

Docker Inc.'s goal was presumably that users would start using paid Docker plans to raise the quotas but, well, that's only attractive for users that either don't know about caching proxies or judge the overhead of using one to be more costly than Docker Hub... and I have a hard time picturing an organization where that would be true.

That's the strange thing about Docker Hub. It is both totally replaceable and totally unreplaceable.

Docker Hub is totally replaceable in that the Docker registry API is really pretty simple and easy to implement in other products. There are tons of options for Docker registries other than Docker Hub, and frankly most of them are much better options. I'm not just saying that because GitLab [2] has a built-in Docker registry, but that sort of illustrates the point. Of course GitLab has a built-in Docker registry, it's no big deal. It's not even that GitLab introduced it as a competitor to Docker Hub; that's sort of absurd, Docker Hub doesn't even really figure. GitLab introduced it as a competitor to Sonatype Nexus and JFrog Artifactory, to say nothing of the Docker registries offered by just about every cloud provider. For someone choosing a Docker registry to deploy or subscribe to, Docker Hub has no clear advantage, and probably ranks pretty low among the options.

And yet Docker Hub is the Docker registry, and the whole teetering tower of DevOps is deeply dependent on it! What an odd contradiction, and yet it's completely obvious why:

First, Docker Hub is free. Implausibly free, and as it turns out, probably unsustainably free. There's an old maxim that if you're not paying, you're the product. But Docker Hub reminds us that in the VC-driven (and not particularly results-driven) world of Silicon Valley there is a potent second possibility: if you're not paying, there may be no product at all. At least not once your vendor gets to the end of the runway [3].

Second, Docker Hub is the default. Being the default can be a big deal, and this is painfully true for Docker. The dominance of short, convenient "user/image" or even just "image" references is so strong that Docker image references that actually specify a registry almost feel like an off-label hack, a workaround for how Docker is really supposed to be used. What's more, Docker Hub's original quotas (or rather lack thereof) left no need for authentication in many situations, so having to authenticate to a registry also feels like an extra hassle. Many tools built around Docker don't make the use of a non-Docker Hub registry, or any authentication to a registry, as convenient as it probably should be. Tutorials and guides for Docker often omit setup of any registry other than Docker Hub, since Docker Hub is already configured and has everything available in it. You only find out the mistake you've made when your pipelines stop working until the quota period resets, or worse, pulls in production start failing and you have to hope you're lucky enough to check the Kubernetes events before digging around a dozen other places.
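To make the "default" concrete, here is a rough Python sketch of how a short image reference gets expanded into a fully qualified one. This is a simplification under stated assumptions, not Docker's actual parser: the function name normalize_image_ref is made up for illustration, digests ("@sha256:...") are not handled, and the "does the first component look like a hostname" test is reduced to checking for a dot, a colon, or "localhost".

```python
DEFAULT_REGISTRY = "docker.io"   # Docker Hub
DEFAULT_NAMESPACE = "library"    # where official images like "ubuntu" live
DEFAULT_TAG = "latest"


def normalize_image_ref(ref: str) -> str:
    """Expand a short image reference to registry/repository:tag form.

    Hypothetical helper for illustration; digests are not handled.
    """
    first, slash, rest = ref.partition("/")
    # The first path component only counts as a registry if it looks
    # like a host: it contains a dot or a port, or is "localhost".
    if slash and ("." in first or ":" in first or first == "localhost"):
        registry, path = first, rest
    else:
        registry, path = DEFAULT_REGISTRY, ref

    # Bare names on Docker Hub are implicitly in the "library" namespace.
    if registry == DEFAULT_REGISTRY and "/" not in path:
        path = f"{DEFAULT_NAMESPACE}/{path}"

    # A colon in the final path component separates the tag.
    last_component = path.rsplit("/", 1)[-1]
    if ":" not in last_component:
        path = f"{path}:{DEFAULT_TAG}"

    return f"{registry}/{path}"


if __name__ == "__main__":
    for ref in ("ubuntu", "user/image", "registry.gitlab.com/group/project:1.0"):
        print(f"{ref:45} -> {normalize_image_ref(ref)}")
```

The point of the sketch is that every reference without an explicit host quietly lands on Docker Hub, which is why moving a project's official builds elsewhere means changing what people actually type.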

So the solution to the Docker Hub problem is obvious: stop using Docker Hub. It was probably a bad idea all along. But the reality of the situation is much harder. Moving off of Docker Hub is a pain, and one that has a way of staying pretty far down priority lists. Docker Hub references, or rather references with no registry at all that default to Docker Hub, are so ubiquitous that any project moving their official builds off of Docker Hub will probably break a tremendous number of downstream users.

Docker Inc.'s behavior with Docker Desktop and especially Docker Hub feels like rent-seeking at best, and potentially extortionate. It's not exactly fair to blame all of this on Docker Inc.; both commercial users and the open-source community should have foreseen the retrospectively obvious risk of Docker actually thinking about the economics. Nonetheless, a cynical and not entirely unreasonable take on this story is that Docker hoodwinked us. Perhaps Docker has simply stumbled upon the "Embrace, Extend, Extinguish" of our age: employ FOSS software defaults and lazy developer practices (that were inculcated by Docker's documentation) to make everyone dependent on Docker Inc.'s free registry, then tighten the quota screws until they have no choice but to pay up. This is a very cynical take indeed! I don't really believe it, mostly because it involves far more strategic vision than I would credit Docker Inc. with.

I decided to write about this because I think there are lessons to be learned. Important lessons. No doubt some of this problem is directly attributable to the economic conditions that dominated Silicon Valley for the last decade. Docker Inc. probably wouldn't have gotten so far, burning so much money, had there not been an incredible amount of money to burn. Still, it seems inevitable that this kind of relationship between open-source software and corporate strategy, and between free and paid services, will happen again.

I propose these takeaways, as discussion topics if nothing else:

  1. Be skeptical of free services, especially ones that are required for any part of your business (or open source venture, or hobby project, etc). Free services should never become a deeply embedded dependency unless there is very good reason to believe they will remain free. Perhaps the backing of a large foundation or corporate sponsor with a good history with open source would count, but even that is no promise. Consider the example of Red Hat, its acquisition by IBM, and the impact of that business event on projects previously considered extremely reliable like CentOS.

  2. Free tools that rely on third-party services are only free for the time being. Sure, this might be obvious, but it's probably a deeper problem than you realize. Docker never relied on Docker Hub in that it has always been possible to use other registries. But Docker and the community strongly encouraged the use of Docker Hub through technical, economic, and social means. This had the result of making Docker Hub a de facto hard requirement for many projects and systems.

  3. When writing documentation, guides, blog posts, advice to coworkers, etc., think about long-term sustainability even when it is less convenient. I suspect that the ongoing slow-motion meltdown over Docker Hub would have been greatly mitigated if the use of multiple Docker registries, or at least the easy ability to specify a third-party registry and authenticate, were considered best practices and more common in the Docker community.

[1] I mean macOS, but you can assume I mean Windows and it still works.

[2] My employer whose opinions these are not.

[3] I am here resisting the urge to write a convoluted aviation metaphor. Something about being passengers on a whale-shaped plane that is hitting the last thousand feet and still short of V_r, so the captain says we only get 100 builds per 6 hours per IP and the rest are going out the window.

p.s. I took so long to write this so late at night that now the date in the title is wrong, haha whoops not fixing it

--------------------------------------------------------------------------------

>>> 2023-03-13 the door close button

This will probably be a short one, and I know I haven't written for a while, but it has always been the case that you get what you pay for and Computers Are Bad is nothing if not affordable. Still, this is a topic on which I am moderately passionate and so I can probably stretch it to an implausible length.

Elevator control panels have long featured two buttons labeled "door open" and "door close." One of these buttons does pretty much what it says on the label (although I understand that European elevators sometimes have a separate "door hold" button for the most common use of "door open"). The other usually doesn't seem to, and that has led to a minor internet phenomenon. Here's the problem: the internet is wrong, and I am here to set it right. This works every time!

A huge number of articles confidently state that "80% of door close buttons do nothing." The origin of this 80% number seems to be a 2014 episode of Radiolab titled "Buttons Not Buttons," which I just listened through while doing laundry. Radiolab gets the statistic from the curator of an elevator history museum, who says that most of them "aren't even hooked up." This is reason to doubt our curator's accuracy. I don't think there is anything malicious going on here, but I do think there is an element of someone who has been out of the industry for a while who is at least misstating the details of the issue.

The problem is not unique to Radiolab, though. An Oct. 27, 2016 New York Times article, "Pushing That Crosswalk Button May Make You Feel Better, but...," covers the exact same material as the Radiolab episode a couple of years later. And the article was widely repeated in other publications, not by syndication but by "According to the New York Times..." paraphrasing. This means that often the repetitions are more problematic than the original, but even the original says:

But some buttons we regularly rely on to get results are mere artifices - placebos that promote an illusion of control but that in reality do not work.

Many versions of the article lean on this line even harder, asserting that door close buttons in elevators are installed entirely or at least primarily as placebos. But the NYTimes article provides brief mention of the deeper, and less conspiratorial, reality:

Karen W. Penafiel, executive director of National Elevator Industry Inc., a trade group, said the close-door feature faded into obsolescence a few years after the enactment of the Americans With Disabilities Act in 1990.

...

The buttons can be operated by firefighters and maintenance workers who have the proper keys or codes.

There are a few things to cover:

First, anyone who says that the "door close" buttons in elevators are routinely "not even hooked up" shouldn't be trusted. The world is full of many elevators and I'm sure some can be found with mechanically non-functional door close buttons, but the issue should be infrequent. The "door close" button is required to operate the elevator in fire service mode, which disables automatic closing of the doors entirely so that the elevator does not leave a firefighter stranded. Fire service mode must be tested as part of the regular inspection of the elevator (ASME A17.1-2019, but implemented through various state and local codes). Therefore, elevators with a "door close" button that isn't "hooked up" will fail their annual inspections. While no doubt some slip through the cracks (particularly in states with laxer inspection standards), something that wouldn't meet inspection standards can hardly be called normal practice and the affected elevators must be far fewer than 80%.

But perhaps I am being too pedantic. Elevator control systems are complex and highly configurable. Whether or not the door close button is "hooked up" is mostly irrelevant if the controller is configured to ignore the button, and it's possible that some of these articles are actually referring to a configuration issue. So what can we find about the way elevators are configured?

I did some desperate research in the hopes of finding openly available documentation on elevator controller programming, but elevator manufacturers hold their control systems very close to their chests. I was not lucky enough to find any reasonably modern programming documentation that I could access. Some years ago I did shoulder-surf an elevator technician for a while as he attempted to troubleshoot a reasonably new two-story ThyssenKrupp hydraulic that was repeatedly shutting off due to a trouble code. In the modern world this kind of troubleshooting consists mostly of sitting on the floor of the elevator with a laptop looking at various status reports available in the configuration software. The software, as I recall, came from the school of industrial software design where a major component of the interface was a large tree view of every option and discoverability came in the form of some items being in ALL CAPS.

The NYTimes article, though, puts us onto the important issue here: the ADA. Multiple articles repeat that door close buttons have been non-functional since 1990, although I think most of them (if not all) are just paraphrasings of this same NYTimes piece. The ADA is easy to find and section 4.10 addresses elevators. Specifically, 4.10.7 and 4.10.8 have been mentioned by some elevator technicians as the source of the "door close" trouble. With some less relevant material omitted:

4.10.7* Door and Signal Timing for Hall Calls

The minimum acceptable time from notification that a car is answering a call until the doors of that car start to close shall be calculated from the following equation:

T = D/(1.5 ft/s) or T = D/(445 mm/s)

where T total time in seconds and D distance (in feet or millimeters) from a point in the lobby or corridor 60 in (1525 mm) directly in front of the farthest call button controlling that car to the centerline of its hoistway door (see Fig. 21).

4.10.8 Door Delay for Car Calls

The minimum time for elevator doors to remain fully open in response to a car call shall be 3 seconds.

Based on posts from various elevator technicians, it's clear that these ADA requirements have at least been widely interpreted as stating hard minimums regardless of any user interaction. In other words, the ADA timing constitutes the minimum door hold time which cannot be shortened. Based on the 4.10.7 rule, we can see that that time will be as long as ten seconds in fairly normal elevator lobbies (16 feet, or about two elevators, from door centerpoint to furthest button). We can read the same in a compliance FAQ from Corada, an ADA compliance consulting firm:

User activation of door close (or automatic operation) cannot reduce the initial opening time of doors (3 seconds minimum) or the minimum door signal timing (based on 1.5/ ft/s travel speed for the distance from the hall call button to car door centerline).

One point here can be kind of confusing. The minimum time for the door to be fully open is 3 seconds, but the door signal timing is based on the time from the indication of which elevator has arrived (usually a chime and illuminated lamp) to the time that the doors start closing. This will be at least a couple of seconds longer than the minimum door time due to the open and close time of the door, but since it starts at 5 seconds and goes up from there it will usually be the longer of the two requirements and thus set the actual minimum door time. Where this is likely to not be the case are single-elevator setups where the 5 second minimum timing will apply and the time from chime to door open eats up the first two seconds... in that case, the 3 second fully open time will become the limiting (or really maximizing) factor.
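The interaction between the two requirements is easier to see with numbers. Below is a minimal Python sketch of the arithmetic, using the 1.5 ft/s figure from 4.10.7, the 5 second floor, and the 3 second fully-open time from 4.10.8. The function names and the chime_to_open_s parameter (how long the doors take to get fully open after the chime) are my own inventions for illustration, and measuring everything from the chime is a simplifying assumption.

```python
WALK_SPEED_FT_PER_S = 1.5   # travel speed in the 4.10.7 formula
MIN_SIGNAL_TIME_S = 5.0     # floor on the hall call signal timing
MIN_FULLY_OPEN_S = 3.0      # 4.10.8 minimum fully-open time


def hall_call_signal_time(distance_ft: float) -> float:
    """Minimum seconds from the arrival chime until doors may start closing."""
    return max(MIN_SIGNAL_TIME_S, distance_ft / WALK_SPEED_FT_PER_S)


def earliest_close_after_chime(distance_ft: float, chime_to_open_s: float) -> float:
    """Earliest close time, measured from the chime, honoring both rules.

    chime_to_open_s is a hypothetical parameter: the delay between the
    chime and the doors reaching fully open.
    """
    signal_limit = hall_call_signal_time(distance_ft)
    fully_open_limit = chime_to_open_s + MIN_FULLY_OPEN_S
    return max(signal_limit, fully_open_limit)


if __name__ == "__main__":
    # A lobby with the farthest call button 16 ft from the door centerline.
    print(round(hall_call_signal_time(16), 1))
    # A single elevator with a nearby button and slow-opening doors.
    print(earliest_close_after_chime(3, chime_to_open_s=2.5))
```

In the 16 foot lobby the signal timing dominates; in the single-elevator case with a button a few feet away, the 5 second floor and the 3 second fully-open time end up within a fraction of a second of each other, and whichever is later wins.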

From some elevator manuals such as one for the Motion Control Engineering VFMC-1000, we can gather that the "minimum door hold time" and "door hold time" are separately configurable. I have seen several mentions online that in most elevators the "door close" button functions totally normally during the difference between the minimum door hold time and the door hold time. In other words, there may be some period during which pushing the door close button causes the door to close, but it will be after the end of the ADA-required minimum door time.

Here is the obvious catch: since reducing the door hold time will make the elevator more responsive (less time on the way to a call spent waiting with the doors open), elevator installers are usually motivated to make the door hold time as short as possible. Since the ADA requirements impose a minimum, it's likely very common for the minimum door hold time and the "normal" door hold time to be the same... meaning that the window to use the "door close" button is zero seconds in duration.
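As a toy model of that catch, here is a short Python sketch. It is not taken from any real controller; the function names are hypothetical, and it assumes that a press before the minimum hold time is simply ignored rather than remembered, which is a guess on my part.

```python
def button_window_s(min_hold_s: float, hold_s: float) -> float:
    """Seconds during which a "door close" press can shorten the dwell."""
    return max(0.0, hold_s - min_hold_s)


def press_closes_door(elapsed_s: float, min_hold_s: float, hold_s: float) -> bool:
    """True if a press at elapsed_s makes the doors close early.

    Assumption: presses before the minimum hold time are ignored, and at
    hold_s the doors are closing on their own anyway.
    """
    return min_hold_s <= elapsed_s < hold_s


if __name__ == "__main__":
    # Hold time set equal to the minimum: the button never does anything.
    print(button_window_s(min_hold_s=10, hold_s=10))
    print(press_closes_door(4, min_hold_s=10, hold_s=10))
    # Hold time set well above the minimum: a late press works.
    print(button_window_s(min_hold_s=10, hold_s=20))
    print(press_closes_door(12, min_hold_s=10, hold_s=20))
```

With the two times equal the window is zero and every press looks like a dud, even though the button is wired up and the controller is listening.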

We can confirm this behavior by finding an elevator with a very long configured door hold time. That seems pretty easy to do: visit a hospital. Most hospitals set the door hold time fairly high to accommodate people pushing hospital beds around, so the normal door open time is longer than the ADA requirement (the ADA rules are of course written assuming a person can cover 1.5 ft/s which isn't very fast but still seems hard to achieve when accelerating a heavy hospital bed in a tight space). Call an elevator, step inside, wait for around ten seconds from the chime for the minimum door hold to elapse, and then push the "door close" button. What happens? Well, in my experience the door promptly closes, although I admit that I've only tested this on two hospitals so far. Perhaps your experience will vary: I can see the possibility of a hospital setting the minimum door hold time high, but of course that would get pretty annoying and probably produce pushback from the staff. In the hospitals where I've studiously observed the elevators the normal door hold time was close to 20 seconds, which feels like an eternity when you're waiting to get up one floor.

Another way we can inspect this issue is via door reopening rules. While older elevators used a rubber bumper on the door called a sensitive edge, most elevators you'll see today use a "light curtain" instead. This device, installed between the hall and car doors, monitors for the interruption of infrared light beams to tell if the door is clear. When the door is obstructed, ADA 4.10.6 requires the door to remain open for at least 20 seconds. After that point ADA just refers to the ASME A17.1 standard, which allows for a behavior called "nudging" in which the elevator controller encourages people to clear the door by closing it anyway (at slow speed). The light curtain can also be used to detect whether or not a person has entered the elevator, which can be used as an input to hold time. Some articles online say that you can "hack" an elevator waiting at an empty floor (because someone called the elevator and walked away, for example) by momentarily interrupting the light curtain so that the controller will believe that someone has entered.

Indeed this seems to work well on some elevators, but the ADA requirements do not allow an exception to minimum hold times based on light curtain detection. This means that the light curtain trick is basically equivalent to the door close button: we can expect it to, at most, shorten the door hold time to the ADA minimum. Nothing is allowed to decrease the time below the ADA minimum, except when the elevator is in a special mode such as fire or perhaps independent service.

So it seems that the reality of elevator "door close" buttons is rather less dramatic than Radiolab and the NYTimes imply: the "door close" button is perfectly functional, but details of the 1990 ADA mean that, most of the time people are pressing it, the elevator controller isn't yet permitted to close the door. As far as I can tell, outside of the ADA minimum door time, door close buttons work just fine.

And yet tons of articles online still tell us that the button is installed as a placebo... something that is demonstrably untrue considering its significance in fire (and maintenance, independent, etc) modes, and shows a general lack of understanding of elevator codes and the ADA. Moreover, it seems like something you would find out is untrue with about five minutes of research. So why is it such "common knowledge" that it makes the rounds of major subreddits and minor local news websites to this day?

No doubt a large portion of the problem is laziness. The "placebo" theory has a lot of sizzle to it. Even though the NYTimes is somewhat noncommittal and only implies that it is the true purpose of the button, most of the online pieces about door close buttons I can find appear to be based solely on the 2016 NYTimes article and actually repeat the claim about the placebo effect more strongly than the NYTimes originally makes it. In other words, the "fact" that the door close button is a placebo seems to mostly just be a product of lazy journalists rewriting an NYTimes piece enough to not feel like plagiarists.

There is also a matter of aesthetic appeal: the placebo theory sounds great. It has the universal appeal of mundane reality but also hints at some kind of conspiracy to deceive in the elevator industry. And, of course, it makes everyone feel better about the high failure rate of mashing the "door close" button without the complexity of an accurate explanation of the 1990 ADA rules. The NYTimes piece basically makes it sound like the ADA banned door close buttons, and it's easy to read the ADA and see that that's not true... but it takes some real attention and thought to figure out how the ADA really did change elevator controls.

This type of phenomenon, a sort of "internet urban pseudo-legend," is not at all unique to elevator buttons. In fact the very same 2016 NYTimes article that started that year's round of elevator button "fun facts" is also to blame for another widespread belief in placebo buttons: crosswalk request buttons. The NYTimes article says that most crosswalk buttons do nothing, explaining that the buttons were made non-functional after an upgrade to computer light controls. What the article does say, but many readers seem to miss, is that this is a fact about crosswalk buttons in New York City.

Many traffic lights operate in "actuated mode," where they base their cycling on knowledge of who is waiting where. Older traffic lights mostly used buried inductive loops under the lanes to detect lane occupancy (that a vehicle is present), but a lot of newer traffic lights use either video cameras or compact radar sets. Since they don't require cutting into the pavement and then resealing it, these are cheaper and faster to install. Newer video and radar systems are also better at detecting cyclists than pavement loops---although earlier video systems performed very poorly on this issue and gave video lane presence detection a bad reputation in some cities.

New York City, though, was a very early adopter of large-area computer control of traffic lights. One of the main advantages of central computer control of traffic lights is the ability to set up complex schedules and dynamically adjust timing. Not only can centrally-controlled traffic lights operate in sequence timing matched to the speed limit of the street, they can also have the durations in different directions and sequence speed adjusted based on real-time traffic conditions.

The problem is that combining central timing control with actuated operation is, well, tricky. In practice, most traffic lights that operate under sequence timing or remote timing control don't operate in actuated mode, or at least not at the same time. What some traffic lights do today is switch: sequence timing during rush hour, and actuated mode during lower traffic. Even with today's developments combining scheduled timing with actuation inputs is tricky, and New York City adopted centralized control in the '70s!

So New York's adoption of central control was also, for the most part, an abandonment of actuated operation. The crosswalk buttons are actuation inputs, so they became non-functional as part of this shift. The 2016 NYTimes article explained that the city had estimated the cost of removing the now non-functional buttons at over a million dollars and so decided to skip the effort... but they are removing the buttons as other work is performed.

For the second time, this runs directly counter to the "mechanical placebo" argument the article is based on. The buttons weren't originally installed as placebo at all; when they were put in they were fully functional. A different decision, to switch to centralized timing control, resulted in their current state, and even then, they are being removed over time.

Moreover, the same does not apply to other cities. The NYTimes makes a very lazy effort at addressing this by referring to a now-unavailable 2010 ABC News piece reporting that they "...found only one functioning crosswalk button in a survey of signals in Austin, Tex.; Gainesville, Fla.; and Syracuse." It is unclear what the extent of that survey is, and I lack the familiarity with traffic signaling in those cities to comment on it. But in a great many cities, most of them in my experience, actuated traffic signals remain the norm outside of very high-traffic areas, and so the crosswalk buttons serve a real purpose. Depending on the light configuration, you may never get a "walk" signal if you don't press the button, or the duration of the "walk" signal (prior to the flashing red hand clearing time) may be shorter.

Actually one might wonder why those crosswalk buttons have so much staying power, given the technical progress in lane presence detection. Video and radar options for waiting pedestrian detection do exist. I have occasionally even seen PIR sensors installed for this purpose in suburban areas. The problem, I think, is that detecting a pedestrian waiting to cross involves more nuance than a vehicle. Sidewalks don't have lane lines to clearly delineate different queues for each movement. A video or radar-based system can detect a pedestrian waiting on the corner, but not whether that person is waiting to cross one direction, or the other, or for an Uber, or just chose that spot to catch up on TikTok. Video-based waiting pedestrian detection may be too prone to false positives, and in any case the button is a robust and low-cost option that can also be used to meet ADA objectives through audible and tactile announcements.

So there's a story about buttons: the conspiracy about them being placebos is itself a conspiracy to get you to read articles in publications like "Science Alert." Or maybe that's just an old tale, and the reality of content-farmed news websites falls out of some implications of the ADA. It's a strange world out there.

--------------------------------------------------------------------------------

>>> 2023-02-17 something up there pt II

As we discussed previously, the search for UAP is often contextualized in terms of the events of 2017: the public revelation of the AATIP and alien-hunting efforts by Robert Bigelow and Tom DeLonge. While widely publicized, these programs seem to have led to very little. I believe the termination of the AATIP (which led to the creation of TTSA) to be a result of the AATIP's failure to address the DoD's actual concern: that UAP represented a threat to airspace sovereignty.

I just used a lot of four- and five-letter acronyms without explaining them. These topics were all discussed in the previous post and if you are not familiar with them I would encourage you to read it. Still, I will try to knock it off. Besides, now there is a new set of four- and five-letter acronyms. The end of the AATIP was not the end of the DoD's efforts to investigate UAP. Instead, military UAP research was reorganized, first into Naval intelligence as the UAP Task Force, and later in the cross-branch military intelligence All-Domain Anomaly Resolution Office, or AARO.

It is unclear exactly what the AARO has accomplished. Because it is a military intelligence organization, the DoD will not comment on it. Most of what we know comes from legislators briefed on the program, like Sen. Gillibrand and Sen. Rubio. In various interviews and statements, they have said that AARO's work is underway but hampered by underfunding---underfunding that is, embarrassingly, a result of some kind of technical error in defense appropriation.

Administratively confused as they may be, the DoD's UAP efforts have led to the creation of a series of reports. Issued by the Director of National Intelligence (DNI) at the behest of Congress, the June 2021 unclassified report appeared to be mostly a review of the same data analyzed by AATIP. The report was short---9 pages---but contained enough information to produce a lot of reporting. One of the most important takeaways is that, up to around 2020, the military had no standardized way of collecting reports of UAP. Later reporting would show that even after 2020 efforts to collect UAP reports were uneven and often ineffective.

Much of the reason for this is essentially stigma: advocates of UAP research have often complained that through the late 20th century the military developed a widespread attitude of suppressing UAP incidents to avoid embarrassment. As a result, it's likely that there are many more UAP encounters than known. This is particularly important since analysis (including that in the 2021 report) repeatedly finds that the majority of UAP reports are probably explainable, while a few are more likely to result from some type of unknown object such as an adversarial aircraft. In other words, the signal to noise ratio in UAP reports is low. Taken one way this might discourage reporting and analysis, since any individual report is unlikely to amount to anything. The opposite is true as well, though: if most UAP encounters are not reported and analyzed, it's likely that the genuinely troubling incidents will never be discovered. The 2021 report broadly suggests that this is exactly what was happening for many years: so few UAP incidents were seriously considered that no one noticed that some of them posed real danger.

The 2021 report briefly mentions that some UAP incidents were particularly compelling. For example, in 18 incidents the UAP demonstrated maneuvering. This doesn't mean "shot into the sky as if by antigravity," but rather that the objects appeared to be navigating towards targets, turning with intention, or stationkeeping against the wind. In other words, they are incidents in which the UAP appears to have been a powered craft under some type of control. Even more importantly, the report notes that in a few cases there were indications of RF activity. The military will never go into much detail on this topic because it quickly becomes classified, but many military aircraft are equipped with "electronic warfare" systems that use SDR and other radio technology to detect and classify RF signals. Historically the main purpose of these systems was to detect and locate anti-aircraft radar systems, but they have also been extended to general ELINT use.

ELINT is an intelligence community term for "electronic intelligence." Readers are more likely to be familiar with the term SIGINT, for signals intelligence, and the difference between the two can be initially confusing. The key is that the "electronic" in ELINT is the same as in "electronic warfare." SIGINT is about receiving signals in order to analyze their payloads, for example by cryptologic means. ELINT is about receiving signals for the sake of the signals themselves. For example, to recognize the chirp patterns used by specific adversarial radar systems, or to identify digital transmission modes used by different types of communications systems, thus indicating the presence of that communications system and its user. A simple and classic example of ELINT would be to determine that an adversarial force uses a certain type of encrypted digital radio system, and then monitor for transmissions matching that system to locate adversarial forces in the field. The contents don't matter and for an encrypted system may not be feasible to recover anyway. The mere presence of the signal provides useful intelligence.

The concept of ELINT becomes important in several different ways when discussing UAP. First, the 2021 DNI report's mention that several UAP were associated with RF emissions almost certainly refers to ELINT information collected by intelligence or electronic warfare equipment. These RF emissions likely indicate some combination of remote control and real-time data reporting, although a less likely possibility (in my opinion) is that it reflects electronic warfare equipment on the UAP engaged in some type of active countermeasure.

It's meaningful to contrast this view of the matter with the one widespread in the media in 2017. A UAP that maneuvers and communicates by radio is not exactly X-Files material, and almost by definition can be assumed to be an sUAS---small unmanned aerial system, commonly referred to as a drone. Far from the outlandish claims made by characters like Tom DeLonge, such a craft is hardly paranormal in that we know such devices exist and are in use. What is a startling discovery is that sUAS are being spotted operating near defense installations and military maneuvers and cannot be identified. This poses a very serious threat not only to airspace sovereignty as a general principle but also to the operational security of the military.

Perhaps the component of the report that generated the most media interest is its analysis of the nature of the reported UAP. In the vast majority of cases, in fact all but one, the DNI report states that it was not possible to definitively determine the nature of the UAP. This was almost always because of the limited information available, often just one or two eyewitness accounts and perhaps a poor photo and radar tracks. Most of these incidents presumably do have explanations within the realm of the known that simply could not be determined without additional evidence. On the other hand, the report does state that there are some cases which "may require additional scientific knowledge" to identify.

It is not entirely clear how dramatically this statement should be taken. It's possible, even likely, that the phrase mostly refers to the possibility that new methods of evidence collection will need to be developed, such as the new generation of radar systems currently emerging to collect more accurate information on sUAS with very low radar cross section due to their small size. It's also possible that the phrase reflects the fact that some reported UAP incidents involve the UAP behaving in ways that no known aerial system is capable of, such as high speeds and maneuvers requiring extreme performance. Once again, there is a temptation to take this possibility and run in the direction of extraterrestrial technology. Occam's razor at the very least suggests that it's more likely that some adversarial nation has made appreciable advancements in aviation technology and kept them secret. While perhaps unlikely this is not, in my mind, beyond reason. We know, for example, that both Russia and China have now made more progress towards fielding a practical hypersonic weapons system than the United States. This reinforces the possibility that their extensive research efforts have yielded some interesting results.

Following the 2021 UAP report, Congress ordered the DNI to produce annual updates on the state of UAP research. The first such update, the 2022 report, was released a few months ago. The unclassified version is quite short, but it is accompanied by a significantly longer and more detailed classified version which has been presented to some members of Congress. The unclassified document states that the number of known UAP incidents has increased appreciably, largely due to the substantial effort the military has made to encourage reporting. To provide a sense of the scale, 247 new reports were received in the roughly 1.5 years between the preliminary and 2022 reports. A number of additional incidents occurring prior to the 2021 report also came to the attention of military intelligence during the same period, and these were analyzed as well.

Perhaps the most important part of the 2022 report is its statement that, of the newly analyzed incidents, more than half were determined to be "unremarkable." In most cases, it was judged that the incident was probably caused by a balloon. While these are still of possible interest, they are less interesting than the remainder which are more difficult to explain. Intriguingly, the report states that some UAP "demonstrated unusual flight characteristics or performance capabilities." This supports the more dramatic interpretation of the 2021 report, that it is possible that some incidents cannot be explained without the assumption that some adversary possesses a previously unknown advanced technology.

While it already attracted a great deal of media attention, this entire matter of DNI reports was only the opening act to the spy balloon. The airspace sovereignty aspect of the UAP reports is not something that attracted much discussion in the media, but it has become much more front of mind as a UAP of the first kind drifted across the United States. This UAP was not unidentified for long, with the military publicly attributing it to China---an attribution that China has both formally and informally acknowledged.

Balloons are not new in warfare. Indeed, as the oldest form of aviation, the balloon is also the oldest form of military aviation. The first practical flying machine was the hot air balloon. While the technology originated in France, the first regular or large-scale example of military aviation is usually placed at the US Civil War. Hot air balloons were routinely used for reconnaissance during the Civil War, and the slow movement and long dwell times of balloons still make them attractive as reconnaissance platforms.

Military ballooning in the United States is not limited to the far past. During World War II, the Japanese launched nearly 10,000 balloons equipped with incendiaries. The hope was that these balloons would drift into the United States and start fires---which some of them did, although a concerted press censorship program largely prevented not only the Japanese but also Americans learning of the campaign. Ultimately the impact of the balloon bombs was very limited, but they are still often considered the first intercontinental weapon system. They might also be viewed as the first profound challenge to US air sovereignty, as the balloons required no nearby support (as aircraft of the era did) and the technology of the time provided no effective means of protection. Indeed, this was the calculus behind the press censorship: since there was no good way to stop the balloon bombs, the hope was that if the US carefully avoided any word of them being published, the Japanese might assume they were all being lost at sea and stop sending them.

While the Cold War presented Soviet bombers and then missiles as top concerns, it could be said that balloons have always been one of the greatest practical threats to airspace sovereignty. Despite their slow travel and poor maneuverability, balloons are hard to stop.

Balloons remain surprisingly relevant today. First, modern balloons can operate at extremely high altitudes, similar to those achieved by the U-2 spy planes. This provides an advantage both in terms of observation range and secrecy. Second, balloons are notoriously difficult to detect. While the envelope is large, the material is largely transparent to RF, resulting in a very low radar cross section. Careful design of the suspended payload can give it a very low radar cross section as well... often easier than it sounds, since the payload is kept very lightweight. The sum result of these two factors is that even large balloons are difficult to detect. They are most obvious visually, but the United States and Canada have never had that substantial of a ground observer program and the idea has not been on the public mind for many decades. Many people might see a balloon before any word reached air defense.

On January 28th, a large balloon operated by China entered US airspace over Alaska. During the following week, it drifted across the country until leaving the east coast near South Carolina, where it was shot down with a Sidewinder missile. Circumstances suggest that both the Chinese and US administrations may have intended to downplay the situation to avoid ratcheting tensions, as the US government did not announce the balloon to the public until about a day after it had initially been detected entering US airspace. Publicly, China claimed it to be a weather balloon which had unintentionally drifted off course. The New York Times reports that, privately, Chinese officials told US counterparts that they had not intended for the balloon to become such a public incident and would remove it from US airspace as quickly as possible.

Modern balloons of this type are capable of a limited but surprisingly flexible form of navigation by adjusting their buoyancy, and thus altitude, to drift in different winds. Perhaps the balloon spent a week crossing the US by intention, perhaps an unfortunate coincidence of weather created a situation where they were not able to navigate it out more quickly, or perhaps some equipment failure had rendered the balloon unable to change its altitude. I tend to suspect one of the latter two since it is hard to think of China's motivation to leave the balloon so publicly over the United States. In any case, that's what happened.

We now know more about the balloon, not so much because of analysis of the wreckage (although that is occurring) but more because the military and administration have begun to share more information collected by means including a U-2 spy plane (one of few aircraft capable of meeting the balloon's altitude) and other military reconnaissance equipment. The balloon had large solar arrays to power its equipment, it reportedly had small propellers (almost certainly to control orientation of the payload frame rather than for navigation), and it bristled with antennas.

This is an important point. One of the popular reactions to the balloon was mystery at why China would employ balloons when they have a substantial satellite capability. At least for anyone with a background in remote sensing the reason is quite obvious: balloons are just a lot closer to the ground than satellites, and that means that just about every form of sensing can be performed with much lower gain and thus better sensitivity. This is true of optical systems where balloons are capable of much better spatial resolution than satellites, but also true of RF where atmospheric attenuation and distortion both become very difficult problems when observing from orbit. Further, balloons are faster and cheaper to build and launch than satellites, allowing for much more frequent reconfigurations and earlier fielding of new observation equipment. The cost and timeline on satellites is such that newly developed intelligence technology takes years to make it from the lab to the sky... Chinese intelligence balloons, on the other hand, can likely be fabricated pretty quickly.
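
To put a rough number on "a lot closer," here is a back-of-the-envelope sketch using free-space path loss, which grows with the square of distance. The altitudes are my own illustrative assumptions, not figures from any report, and the calculation considers only an emitter directly below the sensor, ignoring the atmosphere, antenna sizes, and everything else that matters in practice.

```python
import math


def path_loss_advantage_db(near_km, far_km):
    """Difference in free-space path loss between two distances, in dB.

    Free-space loss grows with the square of distance, so the ratio in
    decibels is 20*log10(far/near). Frequency cancels out of the ratio.
    """
    return 20 * math.log10(far_km / near_km)


# Assumed, illustrative altitudes (not from the article or any report).
BALLOON_KM = 20.0     # a stratospheric balloon, roughly U-2 altitude
SATELLITE_KM = 500.0  # a satellite in low earth orbit

advantage = path_loss_advantage_db(BALLOON_KM, SATELLITE_KM)
print(f"signal is about {advantage:.0f} dB stronger at the balloon")
```

Under these assumptions the same ground emitter is tens of decibels stronger at the balloon, which is gain the satellite would have to make up with a much larger antenna.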

It's useful here to return to the topic of ELINT. First, it's very likely that ELINT was a major mission of this balloon. Sensing RF emissions from military equipment at close range is invaluable in creating ELINT signatures for equipment like radar and encrypted communications systems, which directly translates into a better capability to mount an offensive from the air. SIGINT was likely also a mission. One of the advantages of ELINT collection is that the data acquired for ELINT purposes can typically be processed to glean SIGINT information, and even provides valuable material for cryptologists attempting to break codes.

ELINT is also relevant in the detection of the balloon. While the spy balloon in the recent incident was detected by conventional means, the DoD has reported that they are now able to assert that this is at least the fifth such balloon to enter US airspace. For those not familiar with ELINT methods this might be surprising, but it makes a great deal of sense. The fact that this balloon was tracked by the military for days provided ample opportunities to collect good quality ELINT signatures of the communications equipment used by the balloon. The military possesses a number of aircraft dedicated to the purpose of ELINT and SIGINT collection, such as the RC-135---a modified C-135 Stratolifter equipped with specialized antennas and hundreds of pounds of electronic equipment. These types of aircraft could orbit the balloon for hours and collect extensive recordings of raw RF emissions.

ELINT information is also collected by ground-based and orbital (satellite) assets, including a family of satellites that deploy large parabolic reflectors to collect RF signals with extremely high gain. The data collected by these platforms is likely retained in raw form, allowing for retrospective analysis. Information collected by similar means has been publicly used in the past. And this is most likely how the first four balloons were discovered: by searching historic data collected by various platforms for matching ELINT signatures. The presence of the same digital data modem as in the recent spy balloon, in US airspace, almost certainly indicates a similar Chinese asset operating in the past.

It's important to understand that the RF environment is extremely busy, with a great deal of noise originating from the many radio devices we use every day. It's simply not feasible for someone in some military facility to carefully review waterfall displays of the RF data collected by numerous ELINT assets. What is much more feasible is to develop signatures and then use automation to search for instances of similar traffic. It's the practical reality of intelligence at scale.
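
As a toy illustration of searching recorded data for a known signature, here is a sketch that slides a template across a noisy recording and flags the offsets where the normalized correlation is high. To be clear about what is assumed: the "signature" is a made-up sequence of +1/-1 samples, the recording is synthetic noise, and the 0.8 threshold is arbitrary. Actual ELINT processing is not public and is surely far more sophisticated.

```python
import random


def correlate_at(recording, signature, offset):
    """Normalized correlation between the signature and one window."""
    window = recording[offset:offset + len(signature)]
    dot = sum(w * s for w, s in zip(window, signature))
    norm_w = sum(w * w for w in window) ** 0.5
    norm_s = sum(s * s for s in signature) ** 0.5
    if norm_w == 0 or norm_s == 0:
        return 0.0
    return dot / (norm_w * norm_s)


def find_signature(recording, signature, threshold=0.8):
    """Return every offset where the window matches above the threshold."""
    last = len(recording) - len(signature)
    return [i for i in range(last + 1)
            if correlate_at(recording, signature, i) >= threshold]


# Synthetic demo data: a made-up signature buried in random noise.
rng = random.Random(42)
signature = [rng.choice((-1.0, 1.0)) for _ in range(64)]
recording = [rng.gauss(0.0, 0.3) for _ in range(2000)]
for k, s in enumerate(signature):
    recording[700 + k] += s  # hide one transmission at offset 700

hits = find_signature(recording, signature)
print(hits)
```

The point is only that once a signature exists, the search is mechanical: nobody has to look at the recording, and the same scan can be run against years of archived data.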

The discovery of the recent spy balloon has had an incredible effect on air defense. I am of the general opinion, and have occasionally argued in the past, that the US government has significantly under-invested in air defense since the end of the Cold War. While we do need to move on from the hysteria of the 1970s, the lack of investment in air surveillance and defense over the last fifty years or so has led to an embarrassing situation: our ability to detect intrusion on our airspace is fairly poor, and when we do it can take well over an hour to get a fighter in the air to investigate it. The balloon brought this problem to the attention of not only the government but the public, and so some action had to be taken.

Primary radar [1] is quite complex. Even decades into radar technology it remains a fairly difficult problem to pick objects of interest, such as aircraft, out of "clutter"---the many objects, ranging from the ground to wind-blown dust, that can produce primary radar returns. One of the simplest approaches is to ignore objects that are not large and moving fast. This type of filtering is usually adequate for detection of aircraft, but fails entirely for some objects like balloons and sUAS that may be small and slow moving.
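
The size-and-speed filter can be sketched in a few lines. All of the numbers below (cross sections, speeds, thresholds) are invented for illustration and are not real radar parameters; the `label` field exists only so the demo output is readable, since a primary radar has no idea what it is looking at.

```python
from dataclasses import dataclass


@dataclass
class Track:
    label: str        # for the demo only; the radar doesn't know this
    rcs_m2: float     # radar cross section, square meters (made up)
    speed_kts: float  # ground speed, knots (made up)


def passes_filter(track, min_rcs_m2, min_speed_kts):
    """Keep only returns that are both large enough and fast enough."""
    return track.rcs_m2 >= min_rcs_m2 and track.speed_kts >= min_speed_kts


def surviving(tracks, min_rcs_m2, min_speed_kts):
    """Labels of the tracks that make it through the clutter filter."""
    return [t.label for t in tracks
            if passes_filter(t, min_rcs_m2, min_speed_kts)]


tracks = [
    Track("airliner", 40.0, 450.0),
    Track("light aircraft", 2.0, 110.0),
    Track("balloon", 0.5, 30.0),
    Track("small drone", 0.01, 25.0),
]

# Thresholds tuned for aircraft drop the balloon and the drone entirely.
strict = surviving(tracks, min_rcs_m2=1.0, min_speed_kts=60.0)
# Loosening them brings the balloon back, along with a lot more clutter
# in any real system.
relaxed = surviving(tracks, min_rcs_m2=0.1, min_speed_kts=10.0)
print(strict, relaxed)
```

The trade-off is visible even in the toy: whatever threshold admits the balloon will also admit everything else that is small and slow.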

Further, the US and Canada are very large. Integrating data from the many radar surveillance sites and presenting it in a way that allows an air defense controller to identify suspicious objects in the sea of normal air traffic is a difficult problem, and a problem that the US has not seriously invested in for decades. The information systems used by both the FAA and NORAD for processing of radar data are almost notoriously poor. In the wake of the spy balloon, officials have admitted to the press that the military is struggling to process the data from radar systems and identify notable objects.

Air defense is one of the oldest problems in computing as an industry. One of the first (perhaps the first, depending on who you ask) networked computer systems was SAGE: an air defense radar processing system. These problems are still difficult today, but we are no longer mounting cutting-edge research and development projects to face them. Instead, we are trapped in a morass of defense contractors and acquisition projects that take decades to deliver nothing.

In response to the discovery of the spy balloon, NORAD has changed the parameters used to process radar data to exclude fewer objects. They have also made a policy change to take action on more unknown objects than they had before. This led directly to NORAD action to intercept several balloons over the past two weeks. There are now indications that at least some of these balloons may have been ordinary amateur radio balloons, not presenting a threat to air sovereignty at all. Some will view this as an embarrassment or indictment of NORAD's now more aggressive approach, but it's an untenable problem. If China or some other adversary is sending small balloons into our airspace, we need to make an effort to identify such balloons. But currently, no organized system or method exists to identify balloons and other miscellaneous aerial equipment.

One could argue (indeed, here I am) that up to about two weeks ago NORAD was still looking for Soviet bombers, with a minor side project of light aircraft smuggling drugs. Air defense largely ignored anything that wasn't large and actively crossing a border (or more to the point an ADIZ). And that's how about four large intelligence platforms apparently wandered in unnoticed... with UAP reports suggesting that there may be many more.

My suspicion is that the coming year will involve many changes and challenges in the way that we surveil our airspace. I think that we will likely become more restrictive in airspace management, requiring more aircraft than before to have filed flight plans. Otherwise it is very difficult to differentiate a normal but untracked object from an adversarial intelligence asset.

And indications are that adversarial intelligence assets are a very real problem. China's spy balloon program is apparently both long-running and widespread, with similar balloons observed for years in other countries as well. This shouldn't be surprising---after all, reconnaissance balloons are the oldest form of military aviation. The US and allies made enormous use of reconnaissance balloons during the Cold War, sending many thousands into the USSR. It's likely the case that we only really slowed down because our modern reconnaissance balloon projects have all become notorious defense contracting failures. We're still trying, but projects like TARS have run far over budget and still perform poorly in operational contexts.

It might feel like this situation is new, and in terms of press reporting it is. But we should have seen it coming. In an interview following a classified briefing, Senator John Kennedy said that "These objects have been flying over us for years, many years. We've known about those objects for many years."

Robert Bigelow got into UAP research because he was searching for aliens. Maybe aliens are out there, maybe they aren't, but there is one thing we know for sure: our adversaries are out there, and they possess aviation technology at least as advanced as ours. For decades we ignored UFOs as folly, and for decades we ignored potential aviation advancements by our adversaries along with them. Now those advancements are floating across the northern United States and perhaps worse---the DNI is hoping they'll find out, if they can just get people to report what they see.

[1] Radar that operates by detecting reflections or attenuation of an RF field by an object. This is as opposed to secondary radar, more common in air traffic control, that works by "interrogating" a cooperative transponder installed on the aircraft.

--------------------------------------------------------------------------------