_____                   _                  _____            _____       _ 
  |     |___ _____ ___ _ _| |_ ___ ___ ___   |  _  |___ ___   | __  |___ _| |
  |   --| . |     | . | | |  _| -_|  _|_ -|  |     |  _| -_|  | __ -| .'| . |
  |_____|___|_|_|_|  _|___|_| |___|_| |___|  |__|__|_| |___|  |_____|__,|___|
  a newsletter by |_| j. b. crawford               home archive subscribe rss

>>> 2022-01-16 peer to peer but mostly the main peer

I have admitted on HN to sometimes writing computer.rip posts which are extensions of my HN comments, and I will make that admission here as well. A discussion recently came up that relates to a topic I am extremely interested in: the fundamental loss of peer-to-peer capability on the internet and various efforts to implement peer-to-peer distributed systems on top of the internet.

Of course, as is usual, someone questioned my contention that there really is no such thing as a distributed system on the internet with the example of Bitcoin. Having once made the mistake of writing a graduate thesis on Bitcoin implementation details, I consider it one of the things I am most qualified to complain about, and I do so frequently. Rest assured that Bitcoin's technical implementation is 1) a heaping pile of trash which ought to immediately dispel stories about Nakamoto being some kind of engineering savant, and 2) no exception to the fundamental truth that the internet does not permit distributed systems.

But before we get there we need to start with history... which is fortunate, since history is the part I really care to write about.

Some time ago I mentioned that I had a half-written blog post that I would eventually finish. I still have it, although it's now more like 3/4 written. The topic of that post tangentially involves early computer networks such as ARPANET, BITNET, ALOHAnet, etc. that grew up in academic institutions and incubated the basic concepts around which computer networks are built today. One of my claims there is that ARPANET is overrated and had a smaller impact on the internet of today than everyone thinks (PLATO and BITNET were probably both more significant), but it is no exception to a general quality that most of these early networks shared, one that has been, well, central to the internet story: they were fundamentally peer to peer.

This differentiation isn't a minor one. Many early commercial computer networks were extensions of timeshare multiple access systems. That is, they had a strictly client-server (or we could actually call it terminal-computer) architecture in which service points and service clients were completely separated. Two clients were never expected to communicate with each other directly, and the architecture of the network often did not facilitate this kind of use.

Another major type of network which predated computer networks and had influence on them were telegraph networks. Telegraph systems had long employed some type of "routing," and you can make a strong argument that the very concept of packet switching originated in telegraph systems. We tend to think of telegraph systems as being manual, morse-code based networks where routing decisions and the actual message transfer were conducted by men wearing green visors. By the 1960s, though, telegraph networks were gaining full automation. Messages were sent in baudot (a 5-bit alphabetical encoding) with standardized headers and trailers that allowed electromechanical equipment and, later, computers to route them from link to link automatically. The resemblance to packet switched computer networks is very strong, and by most reasonable definitions you could say that the first wide-scale packet network in the US was that of Western Union [1].

Still, these telegraph networks continued to have a major structural difference from what we now consider computer networks. Architecturally they were "inverted" from how we think of hardware distribution on computer networks: they relied on simple, low-cost, low-capability hardware at the customer site, and large, complex routing equipment within the network. In other words, the "brains of the operation" were located in the routing points, not in the users' equipment. This was useful for operators like Western Union in that it reduced the cost of onboarding users and let them perform most of the maintenance, configuration, upgrades, etc. at their central locations. It was not so great for computer systems, where it was desirable for the computers to have a high degree of control over network behavior and to minimize the cost of interconnecting computers given that the computers were typically already there.

So there are two significant ways in which early computer networks proper were differentiated from time-sharing and telegraph networks, and I put both of them under the label of "network of equals," a business term that is loosely equivalent to "peer to peer" but more to our point. First, a computer network allows any node to communicate with any other node. There is no strict definition of a "client" or "server" and the operation of the network does not make any such assumptions as to the role of a given node. Second, a computer network places complexity at the edge. Each node is expected to have the ability to make its own decisions about routing, prioritization, etc. In exchange, the "interior" equipment of the network is relatively simple and does not restrict or dictate the behavior of nodes.

A major manifestation of this latter idea is distributed routing. Most earlier networks had their routes managed centrally. In the phone and telegraph networks, the maintenance of routing tables was considered part of "traffic engineering," an active process performed by humans in a network operations center. In order to increase flexibility, computer networks often found it more desirable to generate routing tables automatically based on exchange of information between peers. This helped to solidify the "network of equals" concept by eliminating the need for a central NOC with significant control of network operations, instead deferring routing issues to the prudence of the individual system operators.

Both of these qualities make a great deal of sense in the context of computer networking having been pioneered by the military during the Cold War. Throughout the early days of both AUTODIN (the military automated telegraph network) and then ARPANET which was in some ways directly based on it, there was a general atmosphere that survivability was an important characteristic of networks. This could be presented specifically as survivability in nuclear war (which we know was a key goal of basically all military communications projects at the time), but it has had enduring value outside of the Cold War context as we now view a distributed, self-healing architecture as being one of the great innovations of packet-switched computer networks. The fact that this is also precisely a military goal for survival of nuclear C2 may have more or less directly influenced ARPANET depending on who you ask, but I think it's clear that it was at least some of the background that informed many ARPANET design decisions.

It might help illustrate these ideas to briefly consider the technical architecture of ARPANET, which, while a somewhat direct precursor to the modern internet, is both very similar to it and very different from it. Computers did not connect "directly" to ARPANET because at the time it was unclear what such a "direct" connection would even look like. Instead, ARPANET participant computers were connected via serial line to a dedicated computer called an interface message processor, or IMP. IMPs are somewhat tricky to map directly to modern concepts, but you could say that they were network interface controllers, modems, and routers all in one. Each IMP performed the line coding to actually transmit messages over leased telephone lines, but also conducted a simple distributed routing algorithm to determine which leased telephone line a message should be sent over. The routing functionality is hard to describe because it evolved rapidly over the lifetime of ARPANET, but it was, by intention, both simple and somewhat hidden. Any computer could send a message to any other computer, using its numeric address, by transferring that message to an IMP. The IMPs performed some internal work to route the message, but this was of little interest to the computers. The IMPs only performed enough work to route the message and ensure reliable delivery; they did not do anything further, and certainly nothing related to application-level logic.

Later, ARPANET equipment contractor BBN, along with the greater ARPANET project, would begin to openly specify "internal" protocols, such as those for routing, in order to allow the use of non-BBN IMPs. Consequently, much of the functionality of the IMP would be moved into the host computers themselves, while the routing functionality would be moved into dedicated routing appliances. This remains more or less the general idea of the internet today.

Let's take these observations about ARPANET (and many, but not all, other early computer networks) and apply them to the internet today.

Peer-to-Peer Connectivity

The basic assumption that any computer could connect to any other computer at will proved problematic by, let's say, 1971, when an employee of BBN created something we might now call a computer worm as a proof of concept. The problem grew far greater as minicomputers and falling connectivity costs significantly increased the number of hosts on the internet, and by the end of the '90s network-based computer worms had become routine. Most software and the protocols it used were simply not designed to operate in an adversarial climate. But, with the internet now so readily accessible, that's what it became. Computers frequently exposed functionality to the network that was hazardous when made readily available to anonymous others, with Windows SMB being a frequent concern [2].

The scourge of internet-replicating computer worms was stopped mostly not by enhancements in host security but instead as an incidental effect of the architecture of home internet service. Residential ISPs had operated on the assumption that they provided connectivity to a single device, typically using PPP over some type of telephone connection. As it became more common for home networks to incorporate multiple devices (especially after the introduction of WiFi) residential ISPs did not keep pace. While it was easier to get a subnet allocation from a residential ISP back then than it is now, it was very rare, and so pretty much all home networks employed NAT as a method to make the entire home network look, to the internet, like a single device. This remains the norm today.

NAT, as a side effect of its operation, prohibits all inbound connections not associated with an existing outbound one.
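This stateful behavior can be sketched as a toy model. Everything here is illustrative: the addresses and ports are made up, and a real NAT also keys its mappings on the destination address and port, not just the local endpoint.

```python
# Toy model of the stateful filtering a NAT performs: inbound packets
# are only delivered when they match a mapping created by earlier
# outbound traffic. Real NATs also track the remote endpoint; this
# sketch keys only on the allocated public port for simplicity.

class ToyNAT:
    def __init__(self, public_ip):
        self.public_ip = public_ip
        self.table = {}          # public_port -> (private_ip, private_port)
        self.next_port = 40000

    def outbound(self, private_ip, private_port, dst):
        """An inside host sends a packet out; allocate a public mapping."""
        public_port = self.next_port
        self.next_port += 1
        self.table[public_port] = (private_ip, private_port)
        return (self.public_ip, public_port)   # what the far end sees

    def inbound(self, public_port):
        """A packet arrives from the internet; deliver only if mapped."""
        return self.table.get(public_port)     # None -> silently dropped

nat = ToyNAT("203.0.113.5")
seen_by_server = nat.outbound("192.168.1.10", 5000, ("198.51.100.7", 80))
print(nat.inbound(seen_by_server[1]))   # delivered to ('192.168.1.10', 5000)
print(nat.inbound(12345))               # unsolicited inbound: None (dropped)
```

An unsolicited connection attempt simply finds no table entry, which is the whole problem for peer-to-peer software.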

NAT was not the first introduction of this concept. Already by the time residential internet was converging on the "WiFi router" concept, firewalls had become common in institutional networks. These early firewalls were generally stateless and so relatively limited, and a common configuration paradigm (to this day) was to block all traffic that did not match a restricted set of patterns for expected use.

Somewhere along this series of incremental steps, a major change emerged... not by intention so much as by the simple accretion of additional network controls.

The internet was no longer peer to peer.

Today, the assumption that a given internet host can connect to another internet host fails more often than it holds. The majority of "end user" hosts are behind NAT and thus cannot accept inbound connections. Even most servers are behind some form of network policy that prevents them accepting connections that do not match an externally defined list. All of this has benefits, there is an upside, but there is also a very real downside: the internet has effectively degraded to a traditional client-server architecture.

One of the issues that clearly illustrated this to many of my generation was multiplayer video games. Most network multiplayer games of the '90s to the early '00s were built on the assumption that one player would "host" a game and other players would connect to them. Of course as WiFi routers and broadband internet became common, this stopped working without extra manual configuration of the NAT appliance. Similar problems were encountered by youths in peer-to-peer systems like BitTorrent and, perhaps this will date me more than anything else, eD2k.

But the issue was not limited to such frivolous applications. Many early internet protocols were designed with the peer-to-peer architecture of the internet baked into the design. FTP is a major example we encounter today, which was originally designed under the assumption that the server could open a connection to the client. RTP and its entire family of protocols, including SIP, which remains quite relevant today, suffer from the same basic problem.

For any of these use-cases to work today, we have had to clumsily re-invent the ability for arbitrary hosts to connect to each other. WebRTC, for example, is based on RTP and addresses these problems for the web by relying on STUN and TURN, two separate but related approaches to NAT traversal. Essentially every two-way real-time media application must take a similar approach, which is particularly unfortunate since TURN introduces appreciable overhead as well as privacy and security implications.

Complexity at the Edge

This is a far looser argument, but I think one that is nonetheless easy to make confidently: the concept of complexity at the edge has largely been abandoned. This happens more at the application layer than at lower layers, since the lower layers ossified decades ago. But almost all software has converged on the web platform, and the web platform is inherently client-server. Trends such as SPAs have somewhat moderated the situation, as even in web browsers some behavior can happen on the client side, but a look at the larger ecosystem of commercial software will show you that there is approximately zero interest in doing anything substantial on an end-user device. The modern approach to software architecture is to place all state and business logic "in the cloud."

Like the shift away from P2P, this has benefits but also has decided disadvantages. Moreover, it has been normalized to the extent that traditional desktop development methods that were amenable to significant complexity at the client appear to be atrophying on major platforms.

As the web platform evolves we may regain some of the performance, flexibility, and robustness associated with complexity at the edge, but I'm far from optimistic.

Peer-to-Peer and Distributed Applications Today

My contention that the internet is not P2P might be surprising to many as there is certainly a bevy of P2P applications and protocols. Indeed, one of the wonders of software is that with sufficient effort it is possible to build a real-time media application on top of a best-effort packet-switched network... the collective ingenuity of the software industry is great at overcoming the limitations of the underlying system.

And yet, the fundamentally client-server nature of the modern internet cannot be fully overcome. P2P systems rely on finding some mechanism to create and maintain open connections between participants.

Early P2P systems such as BitTorrent (and most other file sharing systems) relied mostly on partial centralization and user effort. In the case of BitTorrent, for example, trackers are centralized services which maintain a database of available peers. BitTorrent should thus be viewed as a partially centralized system, or perhaps better as a distributed system with centralized metadata (this is an extremely common design in practice, and in fact the entire internet could be described this way if you felt like it). Further, BitTorrent assumes that the inbound connection problem will be somehow solved by the user, e.g. by configuring appropriate port forwarding or using a local network that supports automated mechanisms such as UPnP.

Many P2P systems in use today have some sort of centralized directory or metadata service that is used for peer discovery, as well as configuration requirements for the user. But more recent advances in distributed methods have somewhat alleviated the need for this type of centralization. Methods such as distributed hole punching (both parties initiating connections to each other at the same time, so that each NAT device ends up with conntrack entries that admit the other side's traffic) allow two arbitrary hosts to connect, but require that there be some type of existing communications channel to facilitate that connection.
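One side of the UDP variant of this trick can be sketched as follows. This is a minimal sketch, not a production implementation: it assumes some rendezvous channel has already told each peer the other's public address, and PEER_ADDR is a placeholder rather than a real endpoint.

```python
# One peer's side of UDP hole punching. Both peers run the same loop at
# roughly the same time against each other's public endpoint, which
# each learned out of band. PEER_ADDR below is purely illustrative.
import socket

PEER_ADDR = ("198.51.100.23", 40001)   # hypothetical peer, learned out of band

def punch(local_port, peer_addr, attempts=5):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", local_port))
    sock.settimeout(1.0)
    for _ in range(attempts):
        # Our outbound packet creates state at our own NAT, so the
        # peer's simultaneous packets toward us will now be admitted.
        sock.sendto(b"punch", peer_addr)
        try:
            data, addr = sock.recvfrom(1500)
            if addr[0] == peer_addr[0]:
                return sock            # the hole is open; keep this socket
        except socket.timeout:
            continue                   # peer's packet hasn't arrived yet
    sock.close()
    return None
```

Whichever side's packet arrives second finds the NAT state already in place, and from then on the two sockets can exchange traffic directly.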

Distributed hash tables such as the Kademlia DHT are a well understood method for a distributed system to share peer information, and indeed BitTorrent has had one bolted on as an enhancement while many newer P2P systems rely on a DHT as their primary mechanism of peer discovery. But all of these face a bootstrapping problem: once connected to other DHT peers, you can use the DHT to obtain additional peers, but this assumes that you are already aware of at least one DHT peer. How do we get to that point?
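The core mechanism Kademlia-style DHTs use, once bootstrapped, is worth a quick sketch. The node IDs here are tiny for illustration (real Kademlia uses 160-bit identifiers), but the metric is the real one.

```python
# The XOR metric at the heart of the Kademlia DHT: the "distance"
# between two node IDs is their bitwise XOR interpreted as an integer.
# A lookup proceeds by repeatedly asking known peers for peers even
# closer to the target under this metric.

def xor_distance(a: int, b: int) -> int:
    return a ^ b

def closest_peers(known_ids, target, k=3):
    """The k known node IDs nearest to target under the XOR metric."""
    return sorted(known_ids, key=lambda n: xor_distance(n, target))[:k]

known = [0b0001, 0b0100, 0b0111, 0b1010, 0b1100]
print(closest_peers(known, 0b0101))   # -> [4, 7, 1]
```

Each round of queries returns peers closer to the target, so lookups converge in O(log n) steps; but notice that every step presumes `known` is already non-empty, which is exactly the bootstrapping problem.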

You could conceivably scan the entire internet address space, but given its size that's infeasible. The next approach you might reach for is some kind of broadcast or multicast, but for long-standing abuse, security, and scalability reasons broadcast and multicast cannot be routed across the internet. Anycast offers similar potential and is feasible on the internet, but it requires the cooperation of an AS owner, which would be both centralized and require a larger up-front investment than most P2P projects are interested in.

Instead, real P2P systems address this issue by relying on a centralized service for initial peer discovery. There are various terms for this that are inconsistent between P2P protocols and include introducer, seed, bootstrap node, peer helper, etc. For consistency, I will use the term introducer, because I think it effectively describes the concept: an introducer is a centralized (or at least semi-centralized) service that introduces new peers to enough other peers that they can proceed from that point via distributed methods.

As a useful case study, let's examine Bitcoin... and we have come full circle back to the introduction, finally. Bitcoin prefers the term "seeding" to refer to this initial introduction process. Dating back to the initial Nakamoto Bitcoin codebase, the Bitcoin introduction mechanism was via IRC. New Bitcoin nodes connected to a hardcoded IRC server and joined a hardcoded channel, where they announced their presence and listened for other announcements. Because IRC is a very simple and widely implemented message bus, this use of IRC was once very common. Ironically, it was its popularity that led to its downfall: IRC-based C2 was so common in early botnets that it became standard practice for corporate networks to block or at least alert on IRC connections, as the majority of IRC connections out of many corporate networks are actually malware checking for instructions.

As a result, and for better scalability, the Bitcoin project changed the introduction mechanism to DNS, which is more common for modern P2P protocols. The Bitcoin Core codebase includes a hardcoded list of around a dozen DNS names. When a node starts up, it queries a few of these names and receives a list of A records that represent known-good, known-accessible Bitcoin nodes. The method by which these lists are curated is up to the operators of the DNS seeds, and it seems that some are automated while some are hand-curated. The details don't really matter that much, as long as it's a list that contains a few contactable peers so that peer discovery can continue from there using the actual Bitcoin protocol.
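The lookup side of this scheme is simple enough to sketch. The seed hostname below is a placeholder, not one of the real hardcoded names, and the error handling reflects the design intent described above: if one seed is down, a node just tries another.

```python
# Sketch of the DNS seed lookup a new node performs on first start.
# "seed.example.org" is a placeholder; real projects hardcode several
# such names, each operated by a different trusted individual.
import socket

def discover_peers(seed_host, port):
    """Resolve a seed name; each A record is a candidate peer address."""
    try:
        infos = socket.getaddrinfo(seed_host, port, socket.AF_INET,
                                   socket.SOCK_STREAM)
    except socket.gaierror:
        return []              # this seed is unreachable; try another
    return sorted({info[4][0] for info in infos})

# Example usage (8333 is Bitcoin's well-known port):
# peers = discover_peers("seed.example.org", 8333)
```

From the returned addresses the node opens protocol-level connections and continues peer discovery through the P2P protocol itself, never needing the seeds again.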

Other projects use similar methods. One of the more interesting and sophisticated distributed protocols right now is Hypercore, which is the basis of the Beaker Browser, in my opinion the most promising distributed web project around... at least in that it presents a vision of a distributed web that is so far not conjoined at the hip with Ethereum-driven hypercapitalism. Let's take a look at how Hypercore and its underlying P2P communications protocol Hyperswarm address the problem.

Well, it's basically the exact same way as Bitcoin with one less step of indirection. When new Hypercore nodes start up, they connect to bootstrap1.hyperdht.org through bootstrap3.hyperdht.org, each of which represents one well-established Hypercore node that can be used to get a toehold into the broader DHT.

This pattern is quite general. The majority of modern P2P systems bootstrap by using DNS to look up either a centrally maintained node, or a centrally maintained list of nodes. Depending on the project these introducer DNS entries may be fully centralized and run by the project, or may be "lightly decentralized" in that there is a list of several operated by independent people (as in the case of Bitcoin). While this is slightly less centralized it is only slightly so, and does not constitute any kind of real distributed system.

Part of the motivation for Bitcoin to have multiple independently operated DNS seeds is that they are somewhat integrity sensitive. Normally the Bitcoin network cannot enter a "split-brain" state (e.g. two independent and equally valid blockchains) because there are a large number of nodes which are strongly interconnected, preventing any substantial number of Bitcoin nodes being unaware of blocks that other nodes are aware of. In actuality Bitcoin enters a "split-brain" state on a regular basis (it's guaranteed to happen by the stochastic proof of work mechanism), but as long as nodes are aware of all "valid" blockchain heads they have an agreed upon convention to select a single head as valid. This method can sometimes take multiple rounds to converge, which is why Bitcoin transactions (and broadly speaking other blockchain entries) are not considered valid until multiple "confirmations"---this simply provides an artificial delay to minimize the probability of a transaction being taken as valid when the Bitcoin blockchain selection algorithm has not yet converged across the network.
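The head-selection convention can be sketched in grossly simplified form. Real Bitcoin validates every chain and compares cumulative proof of work derived from block difficulty; the field names and numbers here are made up for illustration.

```python
# Grossly simplified sketch of split-brain resolution: among the chain
# heads a node knows about, choose the one representing the most
# cumulative proof of work. Because all honest nodes apply the same
# deterministic rule to the same set of heads, they converge on one.

def select_head(heads):
    """heads: list of dicts like {"id": ..., "work": ...} (illustrative)."""
    return max(heads, key=lambda h: h["work"])

heads = [
    {"id": "A", "work": 12045},   # one side of a temporary fork
    {"id": "B", "work": 12046},   # the other side, slightly heavier
]
print(select_head(heads)["id"])   # -> B
```

The convergence failure described next arises precisely because this rule only works if a node actually learns about all the competing heads.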

But this is only true of nodes which are already participating. When a new Bitcoin node starts for the first time, it has no way to discover any other nodes besides the DNS seeds. In theory, if the DNS seeds were malicious, they could provide a list of nodes complicit in an attack, nodes which intentionally forward no information about certain blocks or about peers aware of those blocks. In other words, the cost of a Sybil attack is effectively reduced to the number of nodes directly advertised by the DNS seeds, but only against new users and only if the DNS seeds are complicit. The former is a massive limitation on the attack; to mitigate the latter, the Bitcoin project allows only trusted individuals, but multiple independent such individuals, to operate DNS seeds. In practice the risk is quite low, mostly due to the limited impact of the attack rather than its difficulty (very few people are confirming Bitcoin transactions using a node which was just recently started for the first time).


One of the painful points here is that multicast and IGMP make this problem relatively easy on local networks, and indeed mDNS/Avahi/Bonjour/etc solve this problem on a daily basis, in a reasonably elegant and reliable way, to enable things like automatic discovery of printers. Unfortunately we cannot use these techniques across the internet because, among other reasons, IGMP does not manageably scale to internet levels.

P2P systems can use them across local networks, though, and there are P2P systems (and even non-P2P systems) which use multicast methods to opportunistically discover peers on the same local network. When this works, it can potentially eliminate the need for any centralized introducer. It's, well, not that likely to work... that would require at least one, preferably more than one, fully established peer on the same local network. Still, it's worth a shot, and Hypercore for example does implement opportunistic peer discovery via mDNS zeroconf.
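A bare-bones version of such opportunistic local discovery can be sketched with plain UDP multicast. The group address and port are arbitrary choices for illustration; real implementations use mDNS (group 224.0.0.251, port 5353) with proper DNS-SD service records.

```python
# Minimal sketch of multicast peer announcement on a local network.
# GROUP and PORT are made-up values; real systems use mDNS/DNS-SD.
import socket
import struct

GROUP, PORT = "239.255.42.42", 42424   # hypothetical group and port

def announce(node_id: str):
    """Send one presence announcement to the local multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
    sock.sendto(node_id.encode(), (GROUP, PORT))
    sock.close()

def listen(timeout=2.0):
    """Collect announcements from peers on the same network segment."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", PORT))
    mreq = struct.pack("4s4s", socket.inet_aton(GROUP),
                       socket.inet_aton("0.0.0.0"))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    sock.settimeout(timeout)
    peers = set()
    try:
        while True:
            data, addr = sock.recvfrom(1500)
            peers.add((addr[0], data.decode()))
    except socket.timeout:
        pass
    finally:
        sock.close()
    return peers
```

Because the TTL is 1 and routers do not forward the group, this only ever finds peers on the same segment, which is exactly why it complements rather than replaces a centralized introducer.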

Multicast presents an interesting possibility: much like the PGP key parties of yore, a P2P system can be bootstrapped without dependence on any central service if its users join the same local network at some point. For sufficiently widely-used P2P systems, going to a coffee shop with a stable, working node once in order to collect initial peer information will likely be sufficient to remain a member of the system into the future (as long as there are enough long-running peers with stable addresses that you can still find some and use them to discover new peers weeks and months into the future).

Of course by that point we could just as well say that an alternative method to bootstrapping is to call your friends on the phone and ask them for lists of good IP addresses. Still, I like my idea for its cypherpunk aesthetics, and when I inevitably leave my career to open a dive bar I'll be sure to integrate it.

Hope for the future

We have seen that all P2P systems that operate over the internet have, somewhere deep down inside, a little bit of centralized original sin. It's not a consequence of the architecture of the internet so much as it's a consequence of the fifty years of halting change that has brought the internet to its contemporary shape... changes that were focused around the client-server use cases that drive commercial computing for various reasons, and so had the network shaped in their image to the exclusion of true P2P approaches.

Being who I am it is extremely tempting to blame the whole affair on capitalism, but of course that's not quite fair. There are other reasons as well, namely that the security, abuse, stability, scalability, etc. properties of truly distributed systems are universally more complex than centralized ones. Supporting fully distributed internet use cases is more difficult, so it's received lower priority. The profusion of relatively new P2P/distributed systems around today shows that there is some motivation to change this state of affairs, but that motivation has mostly gone toward working around internet limitations rather than fixing them.

Fixing those limitations is difficult and expensive, and despite the number of P2P systems the paucity of people who actually use them in a P2P fashion would seem to suggest that the level of effort and cost is not justifiable to the internet industry. The story of P2P networking ends where so many stories about computing end: we got here mostly by accident, but it's hard to change now so we're going to stick with it.

I've got a list of good IP addresses for you though if you need it.

[1] WU's digital automatic routing network was developed in partnership with RCA and made significant use of microwave links as it expanded. Many have discussed the way that the US landscape is littered with AT&T microwave relay stations; fewer know that many of the '60s-'80s era microwave relays not built by AT&T actually belonged to Western Union for what we would now call data networking. The WU network was nowhere near as extensive as AT&T's but was particularly interesting due to the wide variety of use-cases it served, which ranged from competitive long distance phone circuits to a very modern looking digital computer interconnect service.

[2] We should not get the impression that any of these problems are in any way specific to Windows. Many of the earliest computer worms targeted UNIX systems which were, at the time, more common. UNIX systems were in some ways more vulnerable due to their relatively larger inventory of network services available, basically all of which were designed with no thought towards security. Malware developers tended to follow the market.


>>> 2022-01-01 secret military telephone buttons

It's the first of the new year, which means we ought to do something momentous to mark the occasion, like a short piece about telephones. Why so much on telephones lately? I think I'm just a little burned out on software at the moment and I need a vacation before I'm excited to write about failed Microsoft ventures again, but the time will surely come. Actually I just thought of a good one I haven't mentioned before, so maybe that'll be next time.

Anyway, let's talk a little bit about phones, but not quite about long distance carriers this time. Something you may or may not have noticed about the carriers we've discussed, perhaps depending on how interesting you find data communications, is that we have covered only the physical layer. So far, there has been no consideration of how switches communicated in order to set up and tear down connections across multiple switches (i.e. long distance calls). Don't worry, we will definitely get to this topic eventually and there's plenty to be said about it. For the moment, though, I want to take a look at just one little corner of the topic, and that's multifrequency tone systems.

Most of us are at least peripherally familiar with the term "dual-tone multifrequency" or "DTMF." AT&T intended to promote Touch-Tone as the consumer-friendly name for this technology, but for various reasons (mainly AT&T's trademark) most independent manufacturers and service providers have stuck to the term DTMF. DTMF is the most easily recognizable signaling method in the telephone system: it is used to communicate digital data over phone lines, but generally only for "meta" purposes such as connection setup (i.e. dialed digits). An interesting thing about DTMF that makes it rather recognizable is that it is in-band, meaning that the signals are sent over the same audio link as the phone call itself... and if your telephone does not mute during DTMF (some do but most do not), you can just hear those tones.

Or, really, I should say: you can hear those tones unless your phone just makes the beep boop noises for fun pretend purposes, like cellphones, which often emit DTMF tones during dialing even though the cellular network uses entirely on-hook dialing and DTMF is not actually used as part of call setup. But that's a topic for another day.

DTMF is not the first multi-frequency signaling scheme. It is directly based on an earlier system called, confusingly, multifrequency or MF. While DTMF and MF have very similar names, they are not compatible, and were designed for separate purposes.

MF signaling was designed for call setup between switches, mostly for long-distance calling. Whenever a call requires a tandem switch, say when you call another city, your telephone switch needs to connect you to a trunk on a tandem switch but also inform the tandem switch of where you intend to call. Historically this was achieved by operators just talking to each other over the trunk before connecting it to your local loop, but in the era of direct dialing an automated method was needed. Several different techniques were developed, but MF was the most common for long-distance calling in the early direct dial era.

An interesting thing about MF, though, is that it was put into place in a time period in which some cities had direct long distance dialing but others did not. As a result, someone might be talking to an operator in order to set up a call to a city with direct dial. This problem actually wasn't a new one; the very earliest direct dialing implementations routinely ran into this issue, and so it became common for operators' switchboards to include a telephone dial mounted at each operator position. The telephone dial allowed the operator to dial for a customer, and was especially important when connecting someone into a direct dial service area.

MF took the same approach, and so one could say that there were two distinct modes for MF: in machine-to-machine operation, a telephone switch automatically sent a series of MF tones after opening a trunk, mainly to forward the dialed number to the next switch in the series. At the same time, many operators had MF keypads at their positions that allowed them to "dial" to a remote switch by hand. The circuitry that implemented these keypads turned more or less directly into the DTMF keypads we see on phones today.

Like DTMF, MF worked by sending a pair of two frequencies [1]. The frequencies were selected from the pool of 700, 900, 1100, 1300, 1500, and 1700Hz. That's six frequencies, and exactly two of them are always used, so the number of possible symbols is 6C2, or 15. Of course we have the ten digits, 0-9, but what about the other five? The additional five possibilities were used for control symbols. For reasons that are obscure to me, the names selected for the control symbols were Key Pulse or KP and Start or ST. Confusingly, KP and ST each had multiple versions and were labeled differently by different equipment. The closest thing to a universal rule would be to say that MF could express the symbols 0-9, KP1-KP2, and ST1-ST3.
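
The combinatorics are easy to verify; here is a quick sketch in Python, using the six frequencies listed above:

```python
from itertools import combinations

# The six MF frequencies, in Hz. Every symbol is a pair of two
# distinct frequencies, so there are C(6, 2) = 15 possible symbols:
# the ten digits plus five control symbols.
MF_FREQS = [700, 900, 1100, 1300, 1500, 1700]

pairs = list(combinations(MF_FREQS, 2))
print(len(pairs))  # 15
```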

Part of the reason that the labeling of the symbols was inconsistent is that their usage was somewhat inconsistent from switch to switch. Generally speaking, an operator would connect to a trunk and then press KP1, the number to be called, and then ST1. KP1 indicated to the far-side switch that it should set up for an incoming connection (e.g. by assigning a sending unit or other actions depending on the type of switch), while ST1 indicated that dialing was complete. Most of the time telephone switches used other means (digit-matching based on "dial plans") to determine when dialing was complete, but since tandem switches handled international calls, MF was designed to gracefully handle arbitrary-length phone numbers (due to both variance between countries and the bizarre choice of some countries to use variable-length phone numbers).

The additional KP and ST symbols had different applications but were most often used to send "additional information" to the far side switch, in which case the use of one of the control symbols differentiated the extra digits (e.g. an account number) from the phone number.

MF keypads were conventionally three columns, two columns of digits (vertically arranged) and one of control symbols on the right.

This is a good time to interject a quick note: the history of MF signaling turns out to be surprisingly obscure. I had been generally aware of it for years (I'm not sure why), but when I went to read the details I was surprised by... how few details there are. Different sources online conflict about basic facts (for example, Wikipedia lists 6 frequencies, which is consistent with the keypad I have seen and the set of symbols, but a 1960 BSTJ overview article says there were only five...). So far as I can tell, MF was never formally described in BSTJ or any other technical journal, and I can't find any BSPs describing the components. I suspect that MF was an unusually loose standard for the telephone system, and that the MF implementation on different switches sometimes varied significantly. This is not entirely surprising since the use of MF spanned from manual exchanges to modern digital exchanges (it is said to still be in use in some areas today, although I am not aware of any examples), covering around 80 years of telephone history.

I didn't really intend to go into so much detail on MF here, but it's useful to understand my main topic: DTMF. MF signaling went into use by the late 1940s (date unclear for the reasons I just discussed), and by 1960 was considered a main contender for AT&T's goal of introducing digital signaling not just between switches but also from the subscriber to the switch [2]. A few years later, AT&T introduced Touch-Tone or DTMF dialing. Unsurprisingly, DTMF is really just MF with some problems solved.

MF posed a few challenges for use with subscriber equipment. The biggest was the simple placement of the frequencies. The consistent 200 Hz separation meant that certain tones were subject to harmonics and other intermodulation products from other tones, requiring high signal quality for reliable decoding. That wasn't much of a problem on toll circuits which were already maintained to a high standard, but local loops were routinely expected to work despite very poor quality, and there was a huge variety of different equipment in use on local loops, some of which was very old and easily picked up spurious noise.

Worse, the MF frequencies were placed in a range that was fairly prominent in human speech. This resulted in a high risk that a person talking would be recognized by an MF decoder as a symbol, which could create all kinds of headaches. This wasn't really a problem for MF because MF keypads were designed to disconnect the subscriber when digits were pressed. DTMF, though, was intended to be simpler to implement and convenient to use while in a call, which made it challenging to figure out how to disconnect or "mute" both parties during DTMF signaling.

To address these issues, a whole new frequency plan was devised for DTMF. The numbers and combinations all seem a bit odd, but were chosen to avoid any kind of potential intermodulation artifacts that would be within the sensitivity range of the decoder. DTMF consisted of eight frequencies, organized differently: into a four by four grid. The grid layout means there is one set of "low" frequencies and one set of "high" frequencies, and every symbol pairs a low frequency with a high one (never low with low or high with high), which allowed much tighter planning of the harmonics that would result from mixing the frequencies.

So, we can describe DTMF this way: there are four rows and four columns. The four rows are assigned 697, 770, 852, and 941 Hz, while the four columns are 1209, 1336, 1477, and 1633 Hz. Each digit consists of one row frequency and one column frequency, and they're laid out the same way as the keypad.

Wait a minute... four rows, four columns?

DTMF obviously needed to include the digits 0-9. Some effort was put into selecting the other available symbols, and for various reasons * and # were chosen as complements to the digits (likely owing to their common use in typewritten accounting and other business documents at the time). That makes up 12 symbols, the first three columns. The fourth column, intended mostly for machine-to-machine applications [3], was labeled A, B, C, and D.
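
Putting the whole grid together, the key-to-frequency-pair mapping can be sketched in a few lines of Python, using the standard frequencies described above:

```python
# DTMF keypad: each key sounds one row ("low") frequency and one
# column ("high") frequency simultaneously. All values in Hz.
ROWS = [697, 770, 852, 941]
COLS = [1209, 1336, 1477, 1633]
KEYS = ["123A",
        "456B",
        "789C",
        "*0#D"]

# Map each key to its (row, column) frequency pair.
DTMF = {key: (ROWS[r], COLS[c])
        for r, row in enumerate(KEYS)
        for c, key in enumerate(row)}

print(DTMF["5"])  # (770, 1336)
print(DTMF["D"])  # (941, 1633)
```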

Ever since, DTMF has featured the mysterious symbols A-D, and they have seen virtually no use. It is fairly clear to me that they were only included originally because DTMF was based directly on MF and so tried to preserve the larger set of control symbols and in general a similar symbol count. The engineers likely envisioned DTMF taking over as a direct replacement for MF signaling in switch-to-switch signaling, which did happen occasionally but was not widespread as newer signaling methods were starting to dominate by the time DTMF was the norm. Instead, they're essentially vestigial.

One group of people who are generally aware of the existence of A-D is amateur radio operators, as the DTMF encoders in radios almost always provide a full 4x4 keypad and it is somewhat common for A-D to be used for controlling telephone patches: once the telephone patch is connected, 0-9, *, and # are relayed directly to the phone network, while A-D provide four symbols reserved for the patch itself to respond to.

Another group of people to whom this would be familiar is those in the military from roughly the '70s to the '90s, during the period of widespread use of AUTOVON. While AUTOVON was mostly the same as the normal telephone network, just reserved for military use, it introduced one major feature that the public telephone system lacked: a precedence, or priority, system.

Normally dialed AUTOVON calls were placed at "routine" priority, but "priority," "immediate," "flash," and "flash override" were successively higher precedence levels reserved for successively more important levels of military command and control. While it is not exactly true, it is almost true, and certainly very fun to say, that AUTOVON telephones feature a button that only the President of the United States is allowed to press. The Flash Override or FO button was mostly reserved for use by the national command authority in order to invoke a nuclear attack, and, as you would imagine, its use would result in AUTOVON switches abruptly terminating any other call as necessary to make trunks available.

AUTOVON needed some way for telephones to indicate to the switch what the priority of the call was, and so it was obvious to relabel the A, B, C, and D DTMF buttons as FO, F, I, and P respectively. AUTOVON phones thus feature a full 4x4 keypad, with the rightmost column typically in red and used to prefix dialed calls with a precedence level. Every once in a while I have thought about buying one of these phones to use with my home PABX but they tend to be remarkably expensive... I think maybe restorers of military equipment are holding up prices.

And that's what I wanted to tell you: the military has four extra telephone buttons that they don't tell us about. Kinda makes you take the X-files a little more seriously, huh?

In all seriousness, though, they both do and don't today. Newer military telephone systems such as DSN and the various secure VoIP systems usually preserve a precedence feature but offer it using less interesting methods. Sometimes it's by prefixing dialing with a numeric code, sometimes via feature line keys, but not by secret DTMF symbols.

[1] This was technically referred to as a "spurt" of MF, a term which I am refusing to use because of my delicate sensibilities.

[2] One could argue that pulse dialing was "digital," but because it relied on the telephone interrupting the loop current it was not really "in-band" in the modern sense and so could not readily be relayed across trunks. Much of the desire for a digital subscriber signaling system was for automated phone systems, which could never receive pulses since they were "confined" to the local loop. Nonetheless DTMF was also useful for the telephone system itself and enabled much more flexible network architectures, especially related to remote switches and line concentrators, since DTMF dialing could be decoded by equipment "upstream" from wherever the phone line terminated without any extra signaling equipment needed to "forward" the pulses.

[3] This might be a little odd from the modern perspective but by the '60s machine-to-machine telephony using very simple encodings was becoming very popular... at least in the eyes of the telephone company, although not always the public. AT&T was very supportive of the concept of telephones which read punched cards and emitted the card contents as DTMF. In practice this ended up being mostly used as a whimsical speed-dial, but it was widely advertised for uses like semi-automated delivery of mail orders (keypunch them in the field, say going door to door, and then call an electromechanical order taking system and feed them all through your phone) and did see those types of use for some time.


>>> 2021-12-26 diy mail

I have another post about half-written that I will finish up and publish soon, but in the mean time I have been thinking today about something that perennially comes up in certain orange-tinted online communities: running your own mail server.

I have some experience that might allow me to offer a somewhat nuanced opinion on the matter. Some years ago I was a primary administrator of a mailserver for a small university (~3k users), and today I operate two small mailservers, one for automated use and one that has a small number of active human users. On the other hand, while I operated a mailserver for my own personal email for years I have now been using Fastmail since around 2015 and have had no regrets.

My requirements are perhaps a little unusual, and that no doubt colors my opinion on options for email: I virtually never use webmail, instead relying on SMTP/IMAP clients (mostly Thunderbird on desktop and Bluemail on Android, although I would be hesitant to endorse either very strongly due to long-running stability and usability problems). A strong filtering capability is important to me, but I am relatively lax about spam filtering as I normally review my junk folder manually anyway. I am very intolerant of any deliverability problems as they have caused things like missed job opportunities in the past.

The software stack that I normally use for email is a rather pedestrian one that forms the core of many institutional and commercial email services: postfix and dovecot, working off of a mix of Unix users and virtual ones. Typically I have automated management through tools that output postfix map text files rather than by moving postfix maps to e.g. an SQL server, although I have worked with one or two mailservers configured that way. I have moved from Squirrelmail to Roundcube to Rainloop for webmail, although as mentioned I do not really use webmail personally.

My preference is to use Dovecot via LMTP and move as much functionality as possible (auth, filtering, etc) into Dovecot rather than Postfix. I have always used SpamAssassin and fail2ban to control log noise and brute force attempts.

All of this said, one of the great frustrations to email servers, especially to the novice, is that even for popular combinations like Postfix/Dovecot there are multiple ways to architect the mail delivery, storage, and management process. For example, there are at least 4-5 distinct architectural options for configuring Postfix to make mail available to Dovecot. Different distributions may package these services pre-configured for one approach or the other, or with nothing pre-configured at all. In fact, mail servers are an area where your choice of distribution can matter a great deal: under some Linux distributions, like RHEL, simply installing a few packages will result in a mostly working mailserver configured for the most common architecture. Under other distributions the packages will leave you with an empty or nonexistent configuration and you will have a lot of reading to do before you get to the most basic working configuration.

The need to support some of the newer anti-spam/anti-abuse technologies introduces some further complication, as you'll need to figure out a DKIM signing service and get it inserted into the mail flow at the right point. Because of the historic underlying architecture of most MTAs/MDAs, this can actually be surprisingly confusing as it's often difficult to "hook" into mail processing at a point where you can clearly differentiate email that is logically "inbound" and "outbound" [1].
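
As one concrete (and hedged) illustration of what "inserting at the right point" looks like: with Postfix, DKIM signing is commonly handled by running OpenDKIM as a milter. A minimal sketch of the main.cf side, assuming OpenDKIM is listening on TCP port 8891 on localhost (the socket address is a local choice, and the sign-versus-verify decision lives in OpenDKIM's own configuration):

```
# main.cf: pass mail received via SMTP and via local
# submission through the OpenDKIM milter
smtpd_milters = inet:127.0.0.1:8891
non_smtpd_milters = inet:127.0.0.1:8891
# if the milter is down, accept mail rather than deferring it
milter_default_action = accept
```

Note that inbound and outbound mail both flow through the same milter here, which is exactly where the "which mail is logically outbound?" question gets pushed down into OpenDKIM's InternalHosts and signing table configuration.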

Finally, as a general rule mail-related software tends to be "over-architected" (e.g. follows "the Unix philosophy" in all the worst ways and few of the good ones) and fairly old [2]. This makes a basic working body of configuration surprisingly large and complex, and the learning curve can be formidable. It took me years to feel generally conversant in the actual care and feeding of Postfix, for example, which like many of these older network services has a lot of fun details like its own service management and worker pooling system.

All of this goes to explain that configuring a mailserver has one of the steeper learning curves of common network services. Fortunately, a number of projects have appeared which aim to "auto-configure" mailservers for typical use-cases. Mail-in-a-box, iRedMail, etc. promise ten minutes to a fully working mailserver.

These projects, along with a general desire by many in the tech industry to reduce their reliance on ad-supported services of major tech companies, have resulted in a lot of ongoing discussion about the merits of running your own mail. Almost inevitably these threads turn into surprisingly argumentative back-and-forths about the merits of self-hosted mail, the maintenance load, deliverability, and so on.

Years ago, before I had any sort of coherent place to put my writing, I wrote an essay about email motivated by Ray Tomlinson's death: Obituary, for Ray Tomlinson and Email. I will not merely repeat the essay here, in large part because it's mostly philosophical in nature and I intend to stay very practical in this particular message. The gist of it, though, is that email as we now know it was one of the first real federated systems and is also, in my opinion, one of the last. The tremendous success of the SMTP ecosystem has also revealed the fundamental shortcomings of federated/distributed systems, in that the loosely federated nature of email leads to daily real-world frustrations that show close to zero sign of ever improving.

There are practical implications to these more theoretical problems, and they're the same ones that repeatedly tank efforts at decentralized communications and social media. In the case of email, they are particularly severe, as the problems emerged after email became widely used. Instead of killing the concept or causing a redesign to eliminate the defects of the federated design, in the case of email we just have workarounds and mitigations. These are where most of the real complexity of email lies.


The most obvious, and usually largest, problem that any decentralized communications system will run into is spam. There are various theoretical approaches to mitigating this issue, such as proof of work, but in practice real-world communications products pretty consistently mitigate spam by requiring a proof of identity that is difficult to produce en masse.

The most common by far is a telephone number. Complaints about Telegram and Signal requiring that accounts be associated with a phone number are widespread (I am one of the people complaining!), but they often miss that this simple (if irritating to some) step is incredibly effective in reducing spam. This tends to turn into a real "I found the exception and therefore you are wrong" kind of conversation, so let me acknowledge that there are plenty of ways to come up with phone numbers that will pass the typical checks used by these services. But that doesn't in any way invalidate the concept: all of these methods of obtaining phone numbers are relatively expensive and volume limited, so they don't undermine the basic goal of using SMS validation of a phone number to make registering multiple accounts a high-effort proposition. The very low volume of outright spam on both Telegram and Signal is an indication of the success of this basic strategy.

Of course requiring a validated telephone number as part of identity is a substantial compromise on privacy and effectively eliminates identity compartmentalization (the mind boggles at the popularity of Telegram with furries in consideration of this issue, as compared to common furry use patterns on services like Twitter that do facilitate compartmentalization). But there's a more significant problem: it is predicated on centralization. Sure, it's theoretically possible to implement this in a distributed fashion, but there are a few reasons no one is going to. For properly federated services it's a non-starter: unless you significantly compromise on the basic idea of federation, you are reliant on all members of the federation to perform their own validation of users against a scarce proof of identity... but the federation members themselves are frequently crooked.

This approach has, in some ways, already been applied to email: popular free email hosts like Google and Microsoft increasingly push telephone validation as a requirement on accounts. But that only protects their own abuse handling resources. Because email is federated, you need to accept mail from other servers, and you don't know what their validation policies are.

This is a fundamental problem. Federated systems impose significant limits on any kind of identity or intent validation for users, and so spam mitigation needs to be done at the level of nodes or instances instead. This tends to require an ad-hoc "re-centralization" in the form of community consensus policy, blocklists of instances, etc. Modern federated systems still handle this issue fairly poorly, but email, due to its age, lacks even the beginning of a coordinated solution.

Instead, more reactive measures have had to be taken to protect the usability of email, and those are the elephant in the room in all discussions of self-hosted email. They have significant implications for self-hosted email operators.

Most spam filtering solutions rely on some degree of machine learning or dynamic tuning. Smaller email operators have an inherently harder time performing effective spam blocking because of the smaller set of email available for ongoing training. In practice the issue exists but isn't severe: SpamAssassin mostly performs okay without significant additional training.

Because mail servers come and go, and malicious/spam email often comes from new mail servers, major email operators depend heavily on IP reputation and tend to automatically distrust any new mail server. This leads to a few problems. First, cheap or easy-to-access hosting services (from AWS to Uncle Ed's Discount VPS) almost always have ongoing problems with fly-by-night customers using their resources to send spam, which means that they almost always have chunks of their IP space on various blocklists. This is true from the sketchiest to the most reputable, although the problem tends to be less severe as you get towards the Oracle(R) Enterprise Cloud(TM) version of the spectrum.

These issues can make DIY email a non-starter, as if you rely on a commodity provider there's a fair chance you'll just get a "bad IP" and have a bit of an ongoing struggle to get other providers to trust anything you send. That said, it's also very possible to get recycled IPs that have no issues at all... it tends to be a gamble. Less commodity, more bespoke service providers can usually offer some better options here. In the best case, you may be able to obtain IP space that hasn't been used in a long time and so is very unlikely to be on any blocklists or reputation lists. This is ideal but doesn't happen so often in this era of IPv4 exhaustion. As the next best thing, many providers that have a higher-touch sales process (i.e. not a "cloud" provider) maintain a sense of "good quality" IPs that have a history of use only by trusted clients. If you spend enough money with them you can probably get some of these.

On the other hand, most cheap VPS and cloud providers are getting their IP space at auction, which has a substantial risk of resulting in IP space with a history of use by a large-scale organized email spam operation. If you spend much time looking at websites like LowEndBox you'll see this happening a lot.

Even if you get an IP with no reputational problems, you will still run into the worst part of this IP reputation issue: IPs with no history are themselves suspicious. Most providers have logic in place that is substantially more likely to reject or flag as spam any email coming from an IP address without a history of originating reliable email. Large-scale email operations contend with this by "warming up" IPs, using them to send progressively more traffic over time in order to build up positive reputation. As an individual with a single IP you are not going to be able to do this in such a pre-planned way, but it does mean that things will get better over time.

A frustrating element of email deliverability is the inconsistency in the way that email providers handle it. It used to be that it was often possible to get feedback from email providers on your deliverability, but that information was of course extremely useful to spammers, so major providers have mostly stopped giving it out. Instead, email providers typically reject some portion of mail they don't like entirely, giving an SMTP error that almost universally gives a link to a support page or knowledgebase article that is not helpful. While these SMTP rejections are frustrating, the good news is that you actually know that delivery failed... although in some cases it will succeed on retry. The mail servers I run have been around long enough that outright SMTP rejections are unusual, but I still consistently get a seemingly random sample of emails hard rejected by Apple Mail.

What's a little more concerning is, of course, a provider's decision of whether or not to put a message into the junk folder. In a way this is worse than an outright rejection, because the recipient will probably never see the message but you don't know that. Unfortunately there aren't a lot of ways to get metrics on this.

If you self-host email, you will run into an elevated number of delivery problems. That is a guarantee. Fully implementing trust and authentication measures will help, but it will not eliminate the problem because providers weight their IP reputation information more than your ability to configure DKIM correctly. Whether or not it becomes a noticeable problem for you depends on a few factors, and it's hard to say in advance without just trying it.

Federated systems like email tend to rely on a high degree of informal social infrastructure. Unfortunately, as email has become centralized into a small number of major providers, that infrastructure has mostly decayed. It was not that long ago that you could often resolve a deliverability problem with a politely worded note to postmaster @ the problematic destination server. Today, many email providers have some kind of method of contacting them, but I have never once received a response or even evidence of action due to one of these messages... both for complaints of abuse on their end and deliverability problems [3].

Ongoing maintenance

While it is fully possible to set up a mailserver and leave it for years without much of any intervention beyond automatic updates, I wouldn't recommend it. Whether you have one user or a thousand, mail service tends to benefit appreciably from active attention. Manual tuning of spam detection parameters in response to current spam trends can have a huge positive impact on your spam filtering quality. I also manually maintain allow and blocklists of domains on mailservers I run, which can also greatly improve spam results.

More importantly, though, because of the extremely high level of ambient email abuse, mailservers are uniquely subject to attack. Any mailserver will receive a big, ongoing flow of probes that range from simple open relay checks to apparently Metasploit-driven checks for recently published vulnerabilities. A mailserver which is vulnerable to compromise will start sending out solicitations related to discount pharmaceuticals almost instantly. While I am hesitant to discourage anyone trying to grow their own vegetables, I also feel that it's a bit irresponsible to be too cavalier about mailservers. Any mailserver should have at least a basic level of active maintenance to ensure vulnerabilities are patched and to monitor for successful exploitation. I would not recommend that a person operate a mailserver without at least a basic competence in Linux administration and security.

Look at it this way: the security and abuse landscape of email is such that the line between being one of the good guys, and being part of the problem, is sometimes very thin. It's easy to cross by accident if you do not learn best practices in mail administration and keep up with them, because they do change.

Privacy and Security

The major motivation for self-hosting email usually has to do with privacy. There's a clear benefit here, as most ad-driven email providers are known to do various types of analysis on email content and that feels very sketchy. There may also be a benefit to privacy and security in that you have a greater ability to control and audit the way that your email is handled and protected.

There can be some more subtle advantages to running your own mailserver, as well, since you can customize configuration to meet your usage. For example, you can simply not implement interfaces and protocols that you do not use (e.g. POP) to reduce attack surface. You can set up more complex authentication with fewer bypass options. And you can restrict and monitor access much more narrowly in general. For example, if you run your own mailserver it may be practical to put strict firewall restrictions on the submission interface.

One benefit that I rather like is mandatory TLS. SMTP for inter-MTA transfer provides only opportunistic TLS, meaning that it is susceptible to an SSL stripping attack if there is an on-path attacker. A countermeasure already often implemented by email providers is a mandatory TLS list, which is basically just a list of email domains that are known to support TLS. The MTA is configured to refuse delivery to any of these domains if they do not accept an upgrade to TLS and provide a valid certificate. This is basically the email equivalent of HTTPS Everywhere, and if you run your own mailserver you can configure it much more aggressively than a major provider can. This is a substantial improvement in security since it nearly ensures in-transit encryption, which generally cannot be assumed with email [4].
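
In Postfix, for example, this can be expressed with a per-destination TLS policy map. A minimal sketch, with illustrative domain names (the "secure" level applies stricter certificate name matching than "verify"):

```
# main.cf
# opportunistic TLS by default...
smtp_tls_security_level = may
# ...but consult a policy table for specific destinations
smtp_tls_policy_maps = hash:/etc/postfix/tls_policy

# /etc/postfix/tls_policy: require TLS with certificate
# verification when delivering to these domains
example.com     verify
example.net     secure
```

With this in place, delivery to the listed domains fails outright rather than silently downgrading to cleartext if TLS cannot be negotiated and verified.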

We must remember, though, that in general the privacy and security situation with email is an unmitigated disaster. Even when running your own mailserver you should never make assumptions. A very large portion of the email you send and receive will pass through the same major provider networks on the other end anyway. There is always a real risk that you have actually compromised the security of your email contents, as commercial providers generally have a substantial professional security program and you do not. The chances are much higher that your own mailserver will be compromised for an extended period of time without your knowledge.

It is also more likely that your own mail server will compromise your privacy by oversharing. Distribution packages fortunately often include a default configuration with reasonable precautions, but mail services can include a lot of privacy and security footguns and it can be hard to tell if you have disarmed them all. For example, when using SMTP submission many MTAs will put the IP address of your personal computer, and the identity and version of your MUA, in the outbound email headers. This is a breach of your privacy that can be particularly problematic when your email address is related to an endeavor with a DoS problem like competitive gaming. Commercially operated services virtually always remove or randomize this information, and you can too, but you have to know what issues to check for, and that's hard to do without putting appreciable time into it.
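
With Postfix, for instance, a common mitigation is a header check applied to outbound SMTP traffic that drops the revealing headers. A minimal sketch, with the caveat that the exact Received: pattern is illustrative and depends on what your MTA actually inserts for authenticated submissions:

```
# main.cf
smtp_header_checks = regexp:/etc/postfix/outbound_header_checks

# /etc/postfix/outbound_header_checks: strip the submitting
# client's IP address and mail client identification
/^Received: .*Authenticated sender/    IGNORE
/^X-Mailer:/                           IGNORE
/^User-Agent:/                         IGNORE
```

Since smtp_header_checks applies only to mail leaving via the Postfix SMTP client, locally delivered mail keeps its full headers for debugging.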


Do I think running your own mail server is a good idea? I do not have an especially strong opinion on the matter, as you might have guessed from the fact that I both run mailservers and pay Fastmail to do it for me. I think it depends a great deal on your needs and desires, and your level of skill and resources.

I will offer a few things that I think are good rules:

And my advice for running a mailserver:

Surely I have said at least one thing here which is controversial or just outright wrong, so you can let me know by sending an email to the MTA that I pay some Australians to run.

[1] Basically, MTAs especially are designed around a fundamentally distributed architecture and try to view mail handling as a completely general-purpose routing exercise. That means that there's no strong differentiation between "inbound" and "authenticated outbound" messages at many points in the system. Obviously there are canonical ways to handle this with DKIM, but nonetheless it's surprisingly easy to do something extremely silly like set up an open relay where, as a bonus, OpenDKIM signs all of the relayed messages.

[2] Some of this over-architecture is simply a result of the natural worst tendencies of academic software development (to make the system more general until it is so general that the simplest use-cases are very complex), but some of it is a result of a legitimate need for mail software to be unusually flexible. I think we can attribute this to three basic underlying issues: first, historically, there has been a strong temptation to "merge" mail with other similar protocols. This has resulted in a lot of things like MDAs that also handle NNTP, leading to more complex configuration. Second, there is a surprising amount of variation in the actual network architecture of email in institutional networks. You've experienced this if you've ever had to carefully reread what a "smart host" vs. a "peer relay" is. Third, MTAs and MDAs have traditionally been viewed as very separate concerns while other related functions like LDA and filtering have been combined into MTAs and MDAs at various points. So, most mail software tries to support many different combinations of different mail services and inter-service protocols.

[3] I have a story about once submitting more than one abuse complaint to gmail per week for over a year, during which they never stopped the clearly malicious behavior by one of their users. The punchline is that the gmail user was sending almost daily emails with so many addresses across the To: and Cc: (over 700 in total) that the extreme length of the headers broke the validation on the gmail abuse complaint form, requiring me to truncate them. I detailed this problem in my abuse reports as well; it was also never fixed.

[4] This is increasingly getting written into compliance standards for various industries, so more and more paid commercial email services also allow you to configure your own mandatory TLS list.

[5] This issue will mostly not apply to "personal" mail servers, but at an institutional scale it's one of the biggest problems you'll deal with. I went through a period where I was having to flush the postfix queue of "3 inches longer" emails more than once a week due to graduate students falling for phishing. Yes, I'm kind of being judgmental, but it was somehow always graduate students. The faculty created their own kinds of problems. Obviously 2FA will help a lot with this, and it might also give you a bit of sympathy for your employer's annoying phishing training program. They're annoyed too, by all the phishing people are falling for.


>>> 2021-12-13 coaxial goes to war

Last time we perambulated on telephones, we discussed open-wire long-distance telephone carriers and touched on carriers intended for cables. Recall that, in the telephone industry, cable refers to an assembly of multiple copper twisted pairs (often hundreds) in a single jacket. There is a surprising amount of complexity to the detailed design of cables, but that's the general idea. Cables were not especially popular for long-distance service because the close proximity of the pairs always led to crosstalk problems. Open-wire was much better in that regard but it was costly to install and maintain, and the large physical size of open-wire arrays limited the channel capacity of long-distance leads.

The system of carriers, multiplexing multiple lines onto a single set of wires, allowed for a significant improvement in the capacity of both cables and open-wire. However, even the highest quality open-wire circuits could offer only a very limited bandwidth for multiplexing. In practice, anything near 100kHz became hopelessly noisy, as the balanced transmission system used on these circuits was simply ineffective at high frequencies. Because each phone conversation required around 4kHz of bandwidth (using single-sideband modulation, already the norm in carrier systems, to conserve spectrum) this imposed a hard limit... which helps to explain why open-wire carriers basically topped out at 17 total channels.

Fortunately, in the early 1930s AT&T engineers [1] began to experiment with a then obscure type of cable assembly called a coaxial line (I will stick to this terminology to avoid confusion with the more specific industry meaning of "cable"). Coaxial lines were first proposed in the 19th century and are in widespread use today for all manner of RF applications, although people tend to associate them most with cable television. At the time, though, there were no commercial applications for long, high-frequency coaxial lines, and so AT&T's efforts covered considerable new ground.

The basic concept of a coaxial line is this: a center conductor is surrounded first by a dielectric (typically air in earlier lines with separation achieved by means of wrapping a non-conductive fiber cord around the center conductor) and then by a cylindrical metal outer conductor. Unlike open-wire lines, the outside conductor is connected to ground. This system has two particularly useful properties: first, because the signal is carried through the internal space between the conductors which is effectively a capacitor, the system acts much like the old loaded telephone lines (but more effective) and can carry very high frequencies. Second, the skin effect causes the outer conductor to serve as a very effective shield: undesired RF energy follows the outside of the outer conductor to ground, and is thus kept well isolated from the RF energy following the inside of the outer conductor. Coaxial lines can support a high bandwidth with low noise, and for this reason they are still the norm today for most RF applications.

The high-bandwidth property of coaxial lines has an interesting implication that the 1934 BSTJ article introducing the concept had to lay out explicitly, since the technology was not yet familiar. Because a coaxial line can carry a wide range of frequencies simultaneously, it can be used by a radio system much like the air. We have previously discussed "wired radio," but coaxial lines provide a much more literal "wired radio." Modern CATV systems, for example, are basically an entire broadcast RF spectrum, up to about 1GHz, but contained inside of a coaxial line instead of traveling through free air. The implication is that coaxial lines can carry a lot of payload, using conventional radio encoding methods, but are isolated such that two adjacent lines can use the same frequencies for different purposes with no (or in practice minimal) interference with each other.

We can presumably imagine how telephone over coaxial line works: much like the open-wire carriers, telephone calls are modulated to higher frequencies, and thus many telephone calls at different frequencies can be multiplexed onto one line. The principle is simple, although in the 1930s the electronics necessary to perform the modulation, amplification, and demodulation were rather complex.

Adding further complexity, the early coaxial lines which could be manufactured at the time had rather high attenuation compared to modern cables, requiring frequent amplification of the signal (you will be surprised, when we get to it, by just how frequent). Further, the RF properties of the cables (in terms of frequency response and attenuation) turned out to be significantly related to the temperature of the cable, likely mostly because of expansion and contraction of the outer conductor which was physically relatively large (1/2" common in early telephony experiments) and secured loosely compared to modern designs.

Another important trend occurring at the same time was the creation of national television networks. Radio networks already often used leased telephone lines to distribute their audio programming to various member stations, and this had by the 1930s already become a profitable service line for AT&T. Television networks were now looking to do the same, but the far higher bandwidth required for a television signal posed a challenge to AT&T which had few options technically capable of carrying them. This was a huge incentive to develop significantly higher bandwidth carriers.

AT&T first created a 2600' test cable at a facility in Phoenixville, Pennsylvania. Tests conducted on this length of copper pipe in 1929 validated the concept and led to the 1930s project to fully realize the new carrier scheme. In 1936, AT&T committed to the technology, building a real coaxial long-distance lead between New York and Philadelphia that supported 240 voice channels or a single television channel. The design bandwidth of this line was what we now call 1MHz, but in AT&T documents it is widely referred to as the "million-cycle system" or, less picturesquely, the 1,000 kc system. Because of the rather high attenuation in the line, repeaters were required every 10.5 miles, and the design of suitably wide-band repeaters was one of the greater challenges in the development of this experimental toll lead.

Powering these repeaters proved a challenge. Previous carrier systems had usually had a local utility install three-phase power to each repeater station; it was undesirable to run power along with the signal wires because AC hum in telephone calls had been an ongoing frustration with telephone lines run along with power. With repeaters as frequently as every 10 miles, though, the cost of adding so many new power lines would have been excessive. Instead, the decision was made to bundle the coaxial signal cable along with wires used for high-voltage DC. "Primary power supply stations," later called "main stations," had grid connections (and typically backup batteries and generators) along with rectification equipment to inject HVDC onto the cable. Repeaters between main stations ran off of this DC power. Much the same technique is used today for transoceanic cables.

Following experiments performed on this early coaxial route, the frequency division carrier on coaxial cable was productionized as the L-carrier, or specifically L1. The first proper L1 route was installed from Stevens Point, Wisconsin to Minneapolis in 1941. L1 combined voice channels into groups of 12, called banks. Five banks were then modulated to different carriers to form a group. Finally, eight groups were modulated onto carriers to form a "supergroup" of 480 channels, which was transmitted on the cable [2]. The end result spanned 68 kHz to 2044 kHz on the line, and some additional carriers at higher frequencies were used as "pilots" for monitoring cable attenuation to adjust amplifiers.
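The hierarchy arithmetic is easy to check against the figures above. As a caveat, the per-channel spacing here is inferred from the stated 68-2044 kHz span, not taken from any AT&T document:

```python
# L1 multiplexing hierarchy, per the figures above
channels_per_bank = 12
banks_per_group = 5
groups_per_supergroup = 8

supergroup = channels_per_bank * banks_per_group * groups_per_supergroup
print(supergroup)  # 480 voice channels on the line

# The supergroup spanned 68 kHz to 2044 kHz, implying roughly
# 4 kHz of line bandwidth per single-sideband voice channel.
span_khz = 2044 - 68
print(round(span_khz / supergroup, 1))  # ~4.1 kHz per channel
```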

As L1 equipment and installation methods improved, additional supergroups were added to reach 600 total channels, and it became the norm to combine multiple coaxial lines into a single assembly (along with power wires and often several basic twisted pairs, which served as order wires). Late L1 installations used four coaxial lines, for a total of 2400 channels.

AT&T briefly experimented with an L2 carrier, a variation that was intended to be simpler and lower cost and thus suitable for shorter toll leads (e.g. within metro areas). The effort quickly proved to be uncompetitive with conventional cables and was canceled, which is simply to explain why most accountings of L-carrier history totally skip L2.

In 1952, a major advancement to the technology came in the form of the L-3 carrier, initially installed between New York and Philadelphia for testing. L-3 carried three "mastergroups" spanning approximately 200kHz to 8.3 MHz. Each mastergroup contained two submastergroups, each of which contained six supergroups, which matched L1 in containing eight groups of five banks of twelve channels. All of this, combined onto 8 coaxial lines in a typical assembly, yielded a total of 14,880 voice channels per route, although both more and fewer coaxial lines were used for some routes. As an additional feature, L-3 could optionally replace two mastergroups with a TV channel, allowing one TV channel and 600 voice channels on a cable.

One of the larger improvements to L-3, besides its increased capacity, was a significant expansion of the supporting infrastructure considered part of the carrier installation. This included repeaters at 4 mile intervals, some of which were fitted with additional signal conditioning equipment (namely for equalization of the balanced pairs). Main stations were required to inject power at roughly 100-mile intervals.

For an additional quality improvement, L-3 used a technique called "frogging" in which the supergroups were periodically "rearranged" or swapped between frequency slots. This prevented excessive accumulation of intermodulation products in any one supergroup frequency range, and was done at some main stations, typically about every 800 miles.
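Frogging can be sketched as a rotation of the supergroup-to-slot assignment. This is a toy model, not AT&T's actual frequency plan: the point is simply that after a full cycle of frogging points, every supergroup has spent equal time in every slot, so noise that favors particular frequency ranges averages out across all of them.

```python
def frog(assignment):
    """Rotate the supergroup -> frequency-slot assignment by one slot.
    A toy stand-in for the rearrangement done at frogging stations."""
    return assignment[-1:] + assignment[:-1]

assignment = list(range(8))  # supergroup i starts in frequency slot i

# Track how many spans each supergroup spends in each slot over a
# full cycle of frogging points (one roughly every 800 miles).
time_in_slot = [[0] * 8 for _ in range(8)]
for hop in range(8):
    for slot, sg in enumerate(assignment):
        time_in_slot[sg][slot] += 1
    assignment = frog(assignment)

# Every supergroup has occupied every slot exactly once, so
# frequency-dependent intermodulation noise accumulates evenly.
print(all(all(t == 1 for t in row) for row in time_in_slot))  # True
```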

A more interesting feature, though, was L-3's substantial automatic protection capabilities. Equipment at each main station (where power was injected) monitored a set of pilot signals on each coaxial line, and each route was provided with a spare coaxial line. A failure of any coaxial line triggered an automatic switching of its modulation and demodulation equipment onto the spare line, restoring service in about 15ms. L-3 also contained several other automatic recovery features, and an extensive alarm system that detected and reported faults in the cable or at repeaters.

Here, at L-3, we will spend more time discussing what is perhaps the most interesting part of the L-carrier story: the physical infrastructure. L-3 was the state of the art in long-distance toll leads during an especially portentous period in US telecom history: the 1960s or, more to the point, the apex of the cold war.

Beginning in 1963, the military began construction of the Automatic Voice Network, or AUTOVON. AUTOVON, sometimes called "the DoD's phone company," was a switched telephone network very much like the regular civilian one but installed and operated for the military. But, of course, the military did not really build telephone infrastructure... they contracted it to AT&T. So, in practice, AUTOVON was a "shadow" or "mirror" telephone system that was substantially similar to the PSTN, and operated by AT&T on largely the same equipment, but only terminated at various military and government installations.

A specific goal for AUTOVON was to serve as a hardened, redundant system that would survive an initial nuclear attack to enable coordination of a reprisal. In other words, AUTOVON was a critical survivable command and control system. To that end, it needed a survivable long-distance carrier, and L-carrier was the state of the art.

Previous L-carrier routes, including the initial L-3 routes, had enclosed their repeaters and power feeds in existing telephone exchange buildings and brick huts alongside highways. For AUTOVON, though, AT&T refined the design to L-3I, or L-3 Improved. The "Improved," in this case, was a euphemism for nuclear hardening, and L-3I routes consisted entirely of infrastructure designed to survive nuclear war (within certain parameters, such as a maximum 2 PSI blast overpressure for many facilities). Built in the context of the cold war, nearly all subsequent L-carrier installations were nuclear hardened.

The first major L-3I project was the creation of a hardened transcontinental route for AUTOVON. Like the 1915 open-wire route before it, the first L-3I connected the east coast to the west coast via New York and Mojave---or rather, connected the military installations of the east coast to the military installations of the west coast.

L-3I routes consisted of repeater stations every 4 miles, which consisted of a buried concrete vault containing the repeater electronics and a sheet-metal hut on the surface, directly over the vault manhole, containing test equipment and accessories to aid maintenance technicians. Because repeaters were powered by the line itself, no utility power was required, although many repeater huts had it added at a later time to allow use of the lights and ventilation blower without the need to run a generator.

Every 100 miles, a main station injected power onto the cable and contained automatic protection equipment. Some main stations contained additional equipment, up to a 4ESS tandem switch. Main stations also served as interconnect points, and often had microwave antennas, cables, and sometimes "stub" coaxial routes to connect the L-3I to nearby military and civilian telephone exchanges (L-3I routes installed for AUTOVON were also used for civilian traffic on an as-available basis). A few particularly large main stations had even more equipment, as they were capable of serving as emergency network control facilities for the AUTOVON system.

A typical main station consisted of a five to twenty thousand square foot structure buried underground, with all sensitive equipment mounted on springs to provide protection from shocks transmitted through the ground. Vent shafts from the underground facility terminated at ground-level vents with blast deflectors. A gamma radiation detector on the surface (and, in later installations, a "bhangmeter" type optical detector) triggered automatic closure of blast valves on all vents when a nearby nuclear detonation was detected. Several diesel generators, either piston or turbine depending on the facility, were backed by buried diesel tanks to provide a two-week isolated runtime. Water wells (with head pit underground), water tanks, and supplies of sealed food supported the station's staff for the same two-week duration. This was critical, as main stations required a 24/7 staff for monitoring and maintenance of the equipment.

At those facilities with interconnections to microwave routes, even the microwave antennas were often a variant of the KS-15676 modified for hardening against blast overpressures by the addition of metal armor. L-3I main stations, being hardened connection points to AUTOVON with a maintenance staff on-hand, were often used as ground stations for the ECHO FOX and NORTH STAR contingency communications networks that supported Air Force One and E-4B and E-6 "Doomsday planes."

This first transcontinental L-3I ran through central New Mexico and had a main station at Socorro, where I used to live. In fact, the Socorro main station [3] housed a 4ESS tandem switch, a master synchronization time source for the later digital upgrade of the system, and served as the contingency network control center for the western half of AUTOVON, making it one of the larger main stations. You would have no idea from the surface, as the surface structures are limited to a short microwave tower (for interconnection to the Rio Grande corridor microwave route) [4], a small surface entry building, and a garage for vehicle storage. The only indication of the facility's nature as cold war nuclear C2 infrastructure are the signs on the perimeter fence which bear a distinctive warning about the criminality of tampering with communications infrastructure used for military purposes. And the gamma radiation detector, if you know what they look like.

As an aside, a poorly maintained page that includes photos of some of these locations can be found on my personal website: https://jbcrawford.us/history/bellsystem/socorro

Hopefully you can see why I have always found this fascinating. Rumors about secret underground military facilities abound, and yet few really exist... but somewhat secret underground telephone facilities are actually remarkably common, as not only L-3I but following L-4 and L-5 main stations were all hardened, buried facilities that were, at least initially, discreet. The fact that such an important part of the network infrastructure was located in such a rural area might be a surprise to you, for example (at least if you are familiar with New Mexico geography), but this was an explicit goal: L-3I main stations were required to be located at least 20 miles from any major population centers, since they were designed based on a nuclear detonation at 5 miles distance. So not only are these sorts of underground facilities found throughout the nation, they're almost always found in odd places... off the side of rural US highways between modest towns.

Given the lengths I have already reached, I will spend less time on L-4 and L-5. This isn't much of a loss to you, because L-4 and L-5 were mostly straightforward incremental improvements on L-3I. L-4 reached up to 72,000 channels per route while L-5E ("L5 Enhanced," which if you read the relevant BSTJ articles appears to be merely the original L5 scheme with a limitation in the multiplexing equipment resolved) reached up to 108,000 channels, using 66 MHz of bandwidth on each coaxial line.

Somewhat counter-intuitively, AT&T achieved these increases in capacity at the cost of increased attenuation, so repeaters were required more and more frequently as the L-carrier technology evolved. L-4 required a repeater every 2 miles and a power feed every 150, while L-5 required a repeater every 1 mile and a power feed every 75. Some L-3I routes, such as the segment between Socorro and Mojave, were upgraded to L-4, resulting in an L-4 repeater added between each pair of original L-3I repeaters. Most L-4 routes were upgraded to L-5, resulting in "alternating" main stations as smaller L-5 power-feed-only main stations were added between the older L-4 main stations with more extensive protection equipment.

Both L-4 and L-5 made use of entirely underground repeaters (i.e. no hut at the surface), although L-4 repeaters sometimes had huts above them... usually at mid-span equalizing repeaters (every 50 miles) and occasionally, seemingly at random, at others. The L-4 huts are said to have been almost entirely empty, serving only to give technicians a workspace out of the wind and rain.

These L-carrier systems were entirely analogue, as you have likely gathered, and started out as tube equipment that transitioned to transistorized in the L-3I era. But the analogue limitation was not insurmountable, and Philips designed a set of systems for digital data on L-carrier referred to as P-140 (140 Mbps on L-4) and P-400 (400 Mbps on L-5). Most L-carrier routes still in service in the '80s were upgraded to digital.

What came of all of this infrastructure? Not long after the development of L-5E in 1975, fiber optics began to reach maturity. By the 1980s fiber optics had become the obvious future direction in long-distance telecoms, and L-carrier began to look obsolete. L-carrier routes generally went out of service in the '90s, although some survived into the '00s. Many, but not all, L-carrier rights of way were reused to trench fiber optic cable, and some L-carrier repeater vaults were reused as land for fiber add/drop and regeneration huts (typically every 24 miles on telco fiber lines).

More interestingly, what about the main stations? Large underground facilities have long proven difficult to repurpose. At least twice a month someone declares that it would be a good idea to purchase an underground bunker and operate it as a "high security data center," and sometimes they even follow through on it, despite the fact that these ventures have essentially never been successful (and the exceptions seem to be the type that prove the rule, since they are barely surviving and/or criminal enterprises). The nation is studded with AT&T main stations and Atlas and Titan missile silos that suffer from extremely haphazard remodeling started, but not finished, by a "data center" operator before going bankrupt. There are two examples just in the sparsely populated state of New Mexico (both surrounding Roswell in the former Walker AFB missile silo complex).

In practice, the cost of restoring and upgrading a Cold War underground facility for modern use usually exceeds the cost of building a new underground facility. The rub of it is that no one actually wants to put their equipment in an underground data center anyway. These Cold War facilities cannot practically be upgraded to modern standards for redundant HVAC, power, and connectivity, and are never operated by ventures with enough money to hire security, add vehicle protections, and obtain certifications. Ironically, they are less secure and reliable than your normal above-ground type. Most of them are highly prone to flooding [5].

Many main stations, L-4 and L-5 in particular, have been sold into private ownership. Some owners have tried to make use of the underground facility, but most have abandoned it and only use the surface land (for example because it is adjacent to their farm). A few are being restored but these restoration efforts quickly become very expensive and usually fail due to lack of funds, meaning these often come up on the market with odd quirks like new kitchen appliances but a foot of water on the lower level.

On the other hand, because L-carrier main stations sat on high-capacity long-distance lines and had a staff and space for equipment, they naturally became junction points for other types of long-distance technology. Many L-carrier main stations are still in use today as switching centers for fiber routes, but in most cases the underground has been placed in mothballs and new surface buildings contain the actual equipment (the cost of modernizing the electrical infrastructure and adding new cable penetrations to the underground areas is very high). Mojave is a major example, as the old Mojave L-3I main station remains one of Southern California's key long-distance telephone switching centers.

Still others exist somewhere in-between. I have heard from a former employee that Socorro, for example, is no longer in use for any long-distance application and is largely empty. But CenturyLink, the present owner, still performs basic caretaking on the structure at least in part because they know that details of the lease agreement (most western L-carrier facilities are on land leased from the BLM or Forest Service) will require them to perform site remediation that is expected to be very costly. As happens all too often with old infrastructure, it's cheaper to keep the lights on and little else than to actually close the facility.

I am not aware of any former main stations that are not fairly well secured. Repeaters are a different story. L-4 and L-5 seldom lead to interesting repeater sites since they were underground vaults that were often filled in and in any case are very dangerous to enter (hydrogen sulfide, among other hazards). L-3I, on the other hand... nearly all visible signs of the L-3I transcontinental in California were removed due to the environmental sensitivity of the Mojave Desert region. Many other L-3I routes, though, in their more rural sections, feature completely abandoned repeater huts with the doors left to flap in the wind.

Even in the absence of visible structures, L-carrier has a very visible legacy. In the desert southwest, where the land is slow to heal, the routes are clearly visible in aerial images to this day. A 100' wide swath cleared of brush with a furrow in the center is perhaps more likely to be a petroleum pipeline, but some of them are old L-carrier routes. You can identify AT&T's routes in some cases by their remarkable dedication to working in completely straight lines, more so than the petroleum pipelines... perhaps an accommodation to the limited surveying methods of the 1960s that created a strong desire for the routes to be easy to pace out on the ground.

Since the genesis of the L-carrier system AT&T has maintained a practice of marking underground lines to discourage backhoe interference. There are many types of these markers, but a connoisseur learns certain tricks. Much like the eccentric wording "Buried Light Guide" indicates one of the earliest fiber optic routes, signs reading "Buried Transcontinental Telephone Line" usually indicate L-carrier. Moreover, L-carrier routes all the way to L-5 used AT&T's older style of ROW markers. These were round wooden posts, about 4" in diameter and 4-8' tall, with one to three metal bands painted orange wrapped around the top. Lower on the post, a rectangular sign gave a "Buried telephone line, call before digging" warning and a small metal plate affixed to the post near the sign gave a surveyor's description of the route each way from the post (in terms of headings and distances). Maintenance crews would locate the trench if needed by sighting off of two posts to find the vector between them, so they are usually tall enough and close enough together that you can see more than one at a time.

It's hard to find or put together maps of the entire system, as routes came and went over time and AT&T often held maps close to their chest. Some are available though, and a 1972 map [6] depicts L-3I and major microwave routes. L-4 and L-5 routes were more common but fewer maps depict them; on all maps of long-distance routes in the 1980s time period there is a high chance that any given non-microwave route is L-4 or L-5.

At the peak of the L-carrier network, there were over 100 hardened underground main stations, 1000 underground repeater vaults, and at least 5,000 miles of right of way. For a period of around two decades, L-carrier was the dominant long-haul carrier technology in the United States, and the whole thing was designed for war.

There is so much more to say about both L-carrier and AT&T's role in Cold War defense planning, and I have already said a lot here. My next post will probably be on a different topic, but I will return to long-distance systems in order to discuss microwave. Microwave relay systems were extensively built and covered many more route-miles than L-carrier. The lower cost of installation made microwave better for lower-capacity routes, and also allowed competitive long-distance carriers like MCI and Sprint's predecessors to build almost entirely microwave networks. Later, we will get to fiber, although I have previously written about SONET.

I will also return to AT&T and nuclear war, one of my greatest interests. The practicalities of the Cold War---that it consisted primarily of an enormous planning exercise in preparation for nuclear attack---meant that AT&T functioned effectively as another branch of the military. Nearly every nuclear scenario involved AT&T's infrastructure, and AT&T and its subsidiaries, partners, and successors were knee deep in secret planning for the end of the world. They still are today.

P.S.: I have a fairly extensive collection of information on L-carrier routes, and particularly those built for AUTOVON. There is a surprisingly large community of people interested in this topic, which means that many resources are available. Nonetheless it has always been my intent to put together one of the most comprehensive sources of information on the topic. For various reasons I had put this project down for years, but I am picking it back up now and hope to produce something more in the format of a book over the next year. I will perhaps share updates on this from time to time.

[1] You have hopefully, by now, realized that I am using "AT&T" to refer to the entirety of the Bell System apparatus, which at various time periods consisted of different corporate entities with varying relationships. Much of the work I attribute to AT&T was actually performed by AT&T Long Lines, Bell Laboratories, and the Western Electric Company, but some of it was also performed by various Bell Operating Companies (BOCs, although they became the somewhat more specifically defined RBOCs post-breakup). All of these entities have been through multiple rounds of restructuring and, often, M&A and divestitures, with the result that you sort of need to settle on using one name for all of them to avoid spending a lot of time explaining the relationships. The same organizations usually exist today in the forms of AT&T, Alcatel-Lucent, Nokia, Avaya, CenturyLink, etc., but often not recognizably.

[2] This hierarchy of multiple levels of multiplexing was used both to make the multiplexing electronics more practical and to allow L-carrier's 12-channel banks to "appear" the same as the J and K-carrier banks. The concept had a lot of staying power, and virtually all later telephone-industry multiplexing schemes used similar hierarchies, e.g. DS0, DS1, etc.

[3] If you are following along at home it is technically the Luis Lopez or "Socorro #2" main station, located just south of Socorro, as the Socorro name was already used within AT&T Long Lines for an en-route microwave relay located somewhat north of Socorro.

[4] If you're one of the few who has seen my few YouTube videos, you might find it interesting that documentation refers to a direct microwave connection between the Socorro #2 main station and the Manzano Base nuclear weapons repository (now disused and part of Kirtland AFB). It's unclear if this was a dedicated system or merely reserved capacity on the Rio Grande route, although the latter seems more likely since multiple relays would be required and there's no evidence of any.

[5] Like most underground facilities, these main stations are often below the water table and have always required sump pumps for water extraction. As they get older the rate of water ingress tends to increase, and so if the pumps are out of operation for any period of time they can quickly turn into very-in-ground swimming pools.

[6] https://jbcrawford.us/_media/history/bellsystem/1972_l-carrier_map.jpg


>>> 2021-12-05 singing wires

I have probably described before the concept of the telephone network forming a single, continuous pair of wires from your telephone to the telephone of the person you are calling. This is the origin of "circuit switching" and the source of the term: the notion that a circuit-switched system literally forms an electrical circuit between two endpoints.

Of course, we presumably understand that modern systems don't actually do this. For one, most long-distance data transmission today is by means of optics. And more importantly, most modern systems that we call circuit-switching are really, in implementation, packet switching systems that use fixed allocation schemes to provide deterministic behavior that is "as good as a circuit."

Consider the case of MPLS, or Multi-Protocol Label Switching, a network protocol which was formerly extremely popular in telecom and ISP backhaul networks and is still common today, although improvements in IP switching have reduced its popularity [1]. MPLS is a "circuit-switched" system in that it establishes "virtual circuits," i.e. connection setup is a separate operation from using the connection. But, in implementation, MPLS is a packet switching system and inherits the standard limitations thereof (resulting in a need for QoS and traffic engineering mechanisms to provide deterministic performance). We can say that MPLS implements circuit switching on top of packet switching. One of the fun things about networking is that we can do things like this.
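As a minimal sketch of the idea (the interface names and label values below are invented for illustration, not taken from any real MPLS deployment), label switching reduces to a per-hop table lookup that is installed at connection-setup time:

```python
# Hypothetical label-forwarding sketch; interfaces and labels are invented.
# Setting up a label-switched path (LSP) means installing an
# (in_interface, in_label) -> (out_interface, out_label) entry at each hop.
# Forwarding a packet is then a single lookup and label swap, with no
# per-packet route computation: the "virtual circuit."
lfib = {
    ("eth0", 100): ("eth1", 200),  # hop A swaps label 100 for 200
    ("eth1", 200): ("eth2", 300),  # hop B swaps label 200 for 300
}

def forward(in_interface, label):
    """One hop of label switching: look up, swap, and emit."""
    return lfib[(in_interface, label)]

# Follow a packet along the path set up above.
hop = forward("eth0", 100)   # arrives at hop B as ("eth1", 200)
hop = forward(*hop)          # leaves hop B as ("eth2", 300)
```

The separation between installing `lfib` entries (connection setup) and calling `forward` (data transfer) is what makes this circuit-like, even though each packet is still handled individually.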

Why, though? Circuit switching is, conceptually, very simple. So why do we bother with things like MPLS that make it considerably more complicated, simple as MPLS itself is?

There are two major reasons, one fundamental and one practical. First, the conventional naive explanation of circuit switching implies that, when I call someone in India, the telephone network allocates a set of copper wires all the way from me to them. This is a distance of many thousands of miles, which includes oceans, and it does not seem especially likely that the telecom industry has sunk the thousands of tons of copper into the ocean that would be required to accommodate the telephone traffic between the US and the Asian region. It is obvious on any consideration that, somehow, my telephone call is being combined with other telephone calls onto a shared medium.

Second, there is the issue of range. The microwatt signal produced by your telephone will not endure thousands of miles of wire, even if the gauge were made unreasonably large. For this simple practical reason, signals moved over long distances need to be encoded in some other form that can survive the journey.

Both of these things are quite unsurprising to us today, because we are fortunate enough to live in a world in which these problems were solved long ago. Today, I'm going to talk about how these problems were solved, in the first case where they were encountered at large scale: the telephone network.

Your landline telephone is connected by means of a single pair, two wires. This pair forms a loop, called the local loop, from the exchange to your phone and back. Signals are conveyed by varying the voltage (and, by the same token, current as the resistance is fixed) on the circuit, or in other words by amplitude modulation. The fact that this works in full duplex on a single loop is surprisingly clever from the modern perspective of digital protocols which almost universally are either half-duplex or need separate paths for each direction, but the electrical trick that enables this was invented at about the same time as the telephone. It's reasonably intuitive, although not quite technically accurate, to say that each end of the telephone line knows what signal it is sending and can thus subtract it from the line potential.
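The subtraction trick can be illustrated numerically (a toy model with made-up sample values, not how the hybrid transformer actually works: the real device achieves the cancellation with balanced windings in the analog domain):

```python
# Toy model of full duplex on one shared pair: the line carries the
# superposition of both directions, and each end subtracts the signal
# it knows it is sending to recover the far end.
near_end = [0.3, -0.1, 0.5, 0.0]   # what we transmit (illustrative samples)
far_end  = [0.2, 0.4, -0.3, 0.1]   # what the other party transmits

line = [n + f for n, f in zip(near_end, far_end)]   # both share the pair
recovered = [s - n for s, n in zip(line, near_end)]  # cancel our own signal

# recovered now matches far_end; a real hybrid's cancellation is imperfect,
# which is where sidetone and echo come from.
```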

The possible length of the local loop is limited. It varies by the gauge of wire used, which telephone companies selected based on loop length to minimize their costs. In general, beyond about ten miles the practicality starts to drop as more and more things need to be done to the line to adjust for resistance and inductance attenuating the signal.

The end result is that the local loop, the part of the telephone system we are used to seeing, is actually sort of the odd one out. Virtually every other part of the telephone system uses significantly different signaling methods to convey calls, and that's not just a result of digitization: it's pretty much always been that way, since the advent of long-distance telephony.

Before we get too much further into this, though, a brief recap of the logical architecture of the telephone system. Let's say you make a long distance call.

In simplified form (in the modern world there are often more steps for optimization reasons), your phone is directly connected by the local loop to a class 5 or local switch in your exchange office. The local switch consults routing information and determines that the call cannot be completed locally, so it connects your call to a trunk. A trunk is a phone line that does not connect a switch to a phone... instead, it connects a switch to another switch. Trunk lines thus make up the backbone of the telephone network.

In this case, the trunk line will go to a tandem, class 4, or toll switch. These are all mostly interchangeable terms used at different periods. Tandem switches, like trunk lines, are not connected to any subscriber phones. Their purpose is to route calls from switch to switch, primarily to enable long-distance calling between two local switches. In our example, the tandem switch may either select a trunk to the local switch of the called party, or if there is no one-hop route available it will select a trunk to another tandem switch which is closer to [2] the called party. Eventually, the last tandem switch in the chain will select a trunk line to the called party's local switch, which will select the local loop to their phone.
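The chain of trunk selections can be sketched as a search over trunk groups. The switch names and topology below are invented, and real route selection also involved preference ordering and hierarchy rules that this toy search ignores:

```python
from collections import deque

# Hypothetical trunk topology: which switches have trunk groups between them.
trunks = {
    "local_A":  ["tandem_1"],
    "tandem_1": ["local_A", "tandem_2"],
    "tandem_2": ["tandem_1", "local_B"],
    "local_B":  ["tandem_2"],
}

def route_call(src, dst):
    """Find a chain of trunks from the caller's local switch to the
    called party's local switch (breadth-first, so fewest hops)."""
    seen = {src}
    queue = deque([[src]])
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for neighbor in trunks[path[-1]]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None  # no route available: the call fails
```

Here there is no direct trunk between the two local switches, so the call is carried tandem to tandem, exactly the multi-hop case described above.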

What we are most interested in, here, are the trunks.

Trunk lines may be very long, reaching thousands of miles for trans-continental calls. They are also expected to serve a high capacity. Almost regardless of the technology [3], laying new trunk lines is a considerable expense to this day, so it's desirable to concentrate a very large amount of traffic onto a small number of major lines. As a result, common routing in the telephone network tends to resemble a hub and spoke architecture, with calls between increasingly larger regions being concentrated onto just a few main trunks between those regions. The modern more mesh-like architecture of the internet, the more flexible routing technology it required, and the convergence of telephony on IP is lessening this effect, but it's still fairly prominent and was completely true of the early long-distance network.

Consider, for example, calls from New York City to Los Angeles. These two major cities are separated by a vast distance, yet many calls are placed between them. For cost reasons, just a small number of transcontinental lines, each of them a feat of engineering, must take the traffic. Onto those same lines is aggregated basically the entire call volume between the east coast and the west coast, easily surpassing one hundred thousand simultaneous connections.

Now, imagine you tried to do this by stringing one telephone line for each call.

Well, in the earliest days of long-distance telephony, that's exactly what was done. A very long two-wire telephone circuit was strung between cities just like between exchange offices and homes. To manage the length, inductance coils were added at frequent intervals to adjust frequency response, and at less frequent intervals the line was converted to four-wire (one pair each direction) so that an amplifier could be inserted in each pair to "boost" the signal against line loss. These lines were expensive and the quality of the connection was poor, irritating callers with low volume levels and excessive noise.

Very quickly, long-distance trunks were converted to a method we now refer to as open wire. On these open wire trunks, sets of four wires (one pair for each direction) were strung alongside each other across poles with multiple cross-arms. Because a set of four wires was required for every simultaneous phone call the trunk could support (called a channel), it was common to have an absolute maze of wires: say, four cross-arms on each pole, each supporting two four-wire circuits. This large, costly assembly supported only eight channels.

Four-wire circuits were used instead of two-wire circuits for several reasons, including lower loss and greater noise immunity. Moreover, continuous use of a four-wire circuit made it easier to install amplifiers without having to convert back and forth (which somewhat degraded quality every time). Loading coils to adjust inductance were still installed at regular intervals (every mile was typical).

The size and cost of these trunks was huge. Nonetheless, in 1914 AT&T completed the first transcontinental telephone trunk, connecting for the first time the eastern network (through Denver) to the previously isolated west coast network. The trunk used three amplifiers and uncountable loading coils. Amusingly, for basically marketing reasons, it would not go into regular service until 1915.

The high cost of this and subsequent long-distance connections was a major contributor to the extraordinary cost of long-distance calls, but demand was high and so long-distance open-wire trunks were extensively built, especially in the more densely populated northeast where they formed the primary connections between smaller cities for decades to come.

Years later, the development of durable, low-cost plastics considerably reduced the cost of these types of trunks by enabling cheap "sheathed" cables. These cables combined a great number of wire pairs into a single, thick cable that was far cheaper and faster to install over long distances. Nonetheless, the fundamental problem of needing two pairs for each channel and extensive line conditioning remained much the same. The only real difference in call quality was that sheathed cables avoided the problem of partially shorting due to rain or snow, which used to make open-wire routes very poor during storms.

It was clear to Bell System engineers that they needed some form of what we now call multiplexing: the ability to place multiple phone calls onto a single set of wires. The first, limited method of doing so was basically a way of intentionally harnessing crosstalk, the tendency of signals on one pair to "leak" onto the pair run next to it. By use of a clever transformer arrangement, two pairs could each carry one direction of one call... and the two pairs together, each used as one wire of a so-called phantom circuit, could carry one direction of a third call. This represented a 50% increase in capacity, and the method was widely used on inter-city trunks. Unfortunately, combining phantom circuits into additional super-phantom circuits proved impractical, and so the modest 1.5x improvement remained and the technique was far from addressing the problem.

Vacuum tube technology, originally employed for amplifiers on open-wire circuits, soon offered an interesting new potential: carriers. Prior to carrier methods, all telephone calls were carried on trunks in the audible frequency range, just like on local loops. Carrier systems entailed using the audio frequency signal to modulate a higher frequency carrier, much like radio. At the other end, the carrier frequency could be isolated and the original audio frequency demodulated. By mixing multiple carriers together, multiple channels could be placed on the same open-wire pair with what we now call frequency-division muxing.
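A crude numerical illustration of the concept (the sample rate, carrier frequencies, and moving-average filter here are all arbitrary choices for the demo, and real carrier systems used single-sideband rather than the simpler double-sideband modulation shown):

```python
import math

RATE = 48_000            # samples per second (arbitrary for this demo)
N = RATE // 10           # 0.1 second of signal

def tone(freq):
    """Stand-in for a voice signal: a single sine tone."""
    return [math.sin(2 * math.pi * freq * i / RATE) for i in range(N)]

voice_a, voice_b = tone(400), tone(700)   # two "calls" in the voice band

# Modulate each call onto its own carrier and sum them onto one "pair".
CARRIER_A, CARRIER_B = 8_000, 12_000
line = [
    a * math.cos(2 * math.pi * CARRIER_A * i / RATE)
    + b * math.cos(2 * math.pi * CARRIER_B * i / RATE)
    for i, (a, b) in enumerate(zip(voice_a, voice_b))
]

# Demodulate call A: mix back down with the same carrier, then low-pass
# filter to reject the other channel and the high-frequency products.
mixed = [2 * s * math.cos(2 * math.pi * CARRIER_A * i / RATE)
         for i, s in enumerate(line)]

def lowpass(signal, half_window=24):
    """Crude low-pass filter: a centered moving average."""
    out = []
    for i in range(len(signal)):
        lo = max(0, i - half_window)
        hi = min(len(signal), i + half_window + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out

recovered = lowpass(mixed)   # closely tracks voice_a, somewhat attenuated
```

Both calls share the one `line` signal, yet channel A comes back out cleanly, because the channels occupy disjoint frequency bands. Stacking more carriers is just more terms in the sum, which is exactly how the later carriers packed 12 and more channels onto a pair.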

The first such multiplexed trunk went into service in 1918 using what AT&T labeled the "A" carrier. A-carrier was capable of carrying four channels on a pair, using a single-sideband signal with suppressed carrier frequency, much like the radio systems of the time. These carrier systems operated above audible frequency (voice frequency or VF) and so were not considered to include the VF signal, with the result that an open-wire line with A carrier could convey five channels: four A-carrier channels and one VF channel.

Subsequent carriers were designed to use FDM on both open-wire and sheathed cables, using improved electronics to fit more channels. Further, separate carrier frequencies could be used to isolate the two directions instead of separate pairs, once again allowing full-duplex operation on a single wire pair while still keeping amplifiers practical.

This line of development culminated in the J-carrier, which placed 12 channels on a single open-wire trunk. J-carrier operated above the frequencies used by older carriers such as C-carrier and VF, and so these carriers could be "stacked" to a degree enabling a total of 17 bidirectional channels on a four-wire trunk [4], using frequencies up to 140 kHz. This 17x improvement came at the cost of relatively complex electronics and more frequent amplifiers, but still yielded a substantial cost reduction on a per-channel basis. J-carrier was widely installed starting in the late 1930s as an upgrade to existing open-wire trunks.

Sheathed cables yielded somewhat different requirements, as crosstalk was a greater issue. A few methods of mitigating the problem led to the development of the K-carrier, which multiplexed 12 channels onto each pair in a sheathed cable. Typically, one sheathed cable was used for signals in each direction to reduce crosstalk. Sheathed cables could contain a large number of pairs (hundreds was typical), making the capacity highly scalable. Further, K-carrier was explicitly designed to operate without loading coils, further lessening cost of the cable itself. In fact, loading coils improved frequency response only to a point and worsened it much beyond VF, so later technologies like K-carrier and even DSL required that any loading coils on the line be removed.

As a downside, K-carrier required frequent repeaters: every 17 miles. Each repeater consisted of two amplifiers, one for each direction, per pair in use. Clever techniques which I will not describe in depth were used to automatically adjust amplifiers to maintain consistent signal levels throughout the line. Because these repeaters were fairly large and power intensive, they were installed in fairly substantial brick buildings that resembled small houses but for their unusual locations. Three-phase power had to be delivered to each building, usually adding additional poles and wires.
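The level budget behind that spacing can be sketched with arithmetic. The 17-mile spacing is from the text, but the per-mile loss figure below is purely illustrative:

```python
# Illustrative repeater level budget; the loss figure is an assumption,
# not a real K-carrier specification.
LOSS_DB_PER_MILE = 2.0   # assumed cable loss at carrier frequencies
SPACING_MILES = 17       # K-carrier repeater spacing

SPAN_LOSS_DB = LOSS_DB_PER_MILE * SPACING_MILES  # gain each repeater supplies

def level_dbm(miles_from_origin, transmit_dbm=0.0):
    """Signal level along the route, assuming every repeater exactly
    restores the transmit level (repeater gain equals span loss)."""
    miles_since_repeater = miles_from_origin % SPACING_MILES
    return transmit_dbm - LOSS_DB_PER_MILE * miles_since_repeater
```

Twenty miles out, three miles past the first repeater, the level is down 6 dB under these assumed numbers. Keeping each repeater's gain matched to the actual, temperature-varying span loss is exactly what the automatic level adjustment mentioned above had to accomplish.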

The size of the buildings is really quite surprising, but we must remember that this was still prior to the invention of the transistor and so the work was being done by relatively large, low-efficiency tubes, with sensitive environmental requirements. The latter was a particularly tricky aspect of analog carriers. Repeater buildings for most open-wire and cable carriers used extremely thick brick walls, which was not yet for blast hardening but instead a method of passive temperature stabilization as the thermal mass of the brick greatly smoothed the diurnal temperature cycle. A notable K-carrier trunk ran between Denver and El Paso, and the red brick repeater buildings can still be seen in some more rural places from I-25.

This post has already reached a much greater length than I expected, and I have yet to reach the topics that I intended to spend most of it on (coaxial and microwave carriers). So, let's call this Part I, and look forward to Part II in which the telephone network will fight the Cold War.

[1] MPLS used to have massive dominance because it was practical to implement MPLS switching in hardware, and IP switching required software. Of course, the hardware improved and IP switching can now be done in silicon, which reduces the performance advantage of MPLS. That said, MPLS continues to have benefits and new MPLS systems are still being installed.

[2] Closer in the sense of network topology, not physical locality. The topology of the telephone network often reflects history and convenience, and so can have unexpected results. For much of the late 20th century, virtually all calls in and out of the state of New Mexico passed through Phoenix, even if they were to or from Texas. This was simply because the largest capacity trunk out of the state was a fiber line from the Albuquerque Main tandem to the Phoenix Main tandem, along the side of I-40. Phoenix, being by then a more populous city, was better connected to other major cities.

[3] Basically the only exception is satellite, for which the lack of wires means that cost tends to scale more with capacity than with distance. But geosynchronous satellites introduce around a half second of latency, which telephone callers absolutely despised. AT&T's experiments with using satellites to connect domestic calls were quickly abandoned due to customer complaints. Satellites are avoided even for international calls, with undersea cables much preferred in terms of customer experience. Overall the involvement of satellites in the telephone network has always been surprisingly minimal, with their role always basically limited to connecting small or remote countries to which there was not yet sufficient cable capacity.

[4] Because the VF or non-carrier channel could be used by any plain old telephone connected to the line (via a hybrid transformer to two-wire), it was used as an order wire. The order wire was essentially a "bonus" channel that was primarily used by linemen to communicate with exchange office staff during field work, i.e. to obtain their orders. While radio technology somewhat obsoleted this use of the order wire, it remained useful for testing and for connecting automated maintenance alarms. Telephone carriers to this day usually have some kind of dedicated order wire feature.
