   _____                   _                  _____            _____       _ 
  |     |___ _____ ___ _ _| |_ ___ ___ ___   |  _  |___ ___   | __  |___ _| |
  |   --| . |     | . | | |  _| -_|  _|_ -|  |     |  _| -_|  | __ -| .'| . |
  |_____|___|_|_|_|  _|___|_| |___|_| |___|  |__|__|_| |___|  |_____|__,|___|
  a newsletter by |_| j. b. crawford

>>> 2022-03-12 an 800 scam in short

A non-diegetic aside

This is an experiment in format for me: I would like to have something like Twitter for thoughts that are interesting but don't necessarily make a whole post. The problem is that I'm loath to use Twitter and I somehow find most of the federated solutions to be worse, although I'm feeling sort of good about Pixelfed. But of course it's not amenable to text.

I would just make these blog posts, but blog posts get emailed out to a decent number of subscribers now. I know that I, personally, react with swift anger to any newsletter that dare darken my inbox more than once a week. I don't want to burden you with another subject line to scroll past unless it's really worth it, you know? So here's my compromise: I will post short items on the blog, but not email them out. When I write the next proper post, I'll include any short items from the meantime in the email with that post. Seem like a fair compromise? I hope so. Beats my other plan, at least, which was to start a syndicated newspaper column.

Also, having now written Computers Are Bad for nearly two years, I went back and read some of my old posts. I feel like my tone has gotten more formal over time, something I didn't intend.

I would hate for anyone to accuse me of being "professional." In an effort to change this trend, the tone of these will be decidedly informal and my typing might be even worse than usual.

The good part

So tonight I was lying on my couch watching Arrested Development yet again while not entirely sober, and I experienced something of a horror film scenario: I noticed that the "message" light on my desk phone was flashing. I remembered I'd missed several calls today, so I retrieved my voice mail. That is, I pulled out my smartphone and scrolled through my inbox to find the notification emails with a PCM file attached. Even I don't actually make a phone call for that.

The voicemail, from a seemingly random phone number in California, was 1 minute and 8 seconds long. I realized that this was a trend: over the last few days I had received multiple 1 minute, 8 second voice messages but the first one I had listened to seemed to be silent. I had since been ignoring them, assuming it was a telephone spammer that hung up a little bit too late (an amusing defect of answering machines and voicemail is that it has always been surprisingly hard for a machine to determine whether a person answered or voicemail, although there are a few heuristics). Just for the heck of it, though, realizing that I had eight such messages, I listened to one again.

The contents: a sort of digital noise. It sounded like perhaps very far away music, or more analytically it seemed like mostly white noise with a little bit of detail that a very low-bitrate speech codec had struggled to handle. It was quiet, and I could never quite make anything out, although it always seemed like I was just on the edge of distinguishing a human voice.

Here's the best part: after finding about fifteen seconds of this to be extremely creepy, I went back to my email. The sound kept on playing. I checked the notifications. Still going. It wouldn't stop. I went to the task switcher and dismissed the audio player. Still going. Increasingly agitated, and on the latest version of Android which is somehow yet harder to use, I held power and remembered it doesn't do that any more. I held power and volume down, vaguely remembering they had made it something like that. No, screenshot. Holding power and volume up finally got me to the six-item power menu, which somehow includes an easy-access "911" button even though you have to remember some physical button escape sequence to get it. Rebooting the phone finally stopped the noise.

Thoroughly spooked, I considered how I came to this point.

Because I am a dweeb and because IP voice termination is very cheap if you look in the right places, I hold multiple toll-free phone numbers, several of which go through directly to the extension of my desk phone. This had been the case for some time, a couple of years at least, and while I don't put it to a lot of productive use I like to think I'm kind of running my own little cottage PrimeTel. Of course basically the only calls these numbers ever get are spam calls, including a surprising number of car warranty expiration reminders considering the toll-free number.

But now I remember that there is another type of nuisance call that afflicts some toll free numbers. You see, toll free numbers exhibit a behavior called "reverse charging" or "reverse tolling" where the callee pays for the call instead of the caller. Whether you get your TFN on a fixed contract basis or pay a per-minute rate, your telephone company generally pays just a little bit of money each minute to the upstream telephone providers to compensate them for carrying the call that their customer wasn't going to pay for.

This means that, if you have a somewhat loose ethical model of the phone system, you can make a bit of profit by making toll-free calls. If you either are a telco or get a telco to give you a cut of the toll they receive, every toll-free call you make now nets you a per-minute rate. There is obviously a great temptation to exploit this. Find a slightly crooked telco, make thousands of calls to toll-free numbers, get some of them to stay on the phone for a while, and you are now participating in capitalism.

The problem, of course, is that most telcos (even those that offer a kickback for toll-free calls, which is not entirely unusual) will find out about the thousands of calls you are making. They'll promptly, usually VERY promptly due to automated precautions, give you the boot. Still, there are ways, especially overseas or by fraud, to make a profit this way.

And so there is a fun type of nuisance call specific to the recipients of toll-free calls: random phone calls that are designed to keep you on the phone as long as possible. This is usually done by playing some sort of audio that is just odd enough that you will probably stay on the phone to listen for a bit even after you realize it's just some kind of abuse. Something that sounds almost, but not quite, like someone talking is a classic example.

Presumably one of the many operations making these calls is happy to talk to voicemail for a bit (voicemail systems typically "supe," meaning that the call is charged as if it connected). Why one minute and eight seconds, I'm not sure; that's not the limit on my voicemail system. Perhaps if you include the greeting recording it's 2 minutes after the call connects or something.

I've known about this for some time, it's a relatively common form of toll fraud. I likely first heard of it via an episode of "Reply All" back when that was a going concern. Until now, I'd never actually experienced it. I don't know why that's just changed, presumably some operation's crawler just now noticed one of my TFNs on some website. Or they might have even wardialed it the old fashioned way and now know that it answers.

Oh, and the thing where it kept on playing after I tried to stop it, as if it were the distorted voice of some supernatural entity? No idea, as I said, I use Android. God only knows what part of the weird app I use and the operating system support for media players went wrong. Given the complexity and generally poor reliability of the overall computing ecosystem, I can easily dismiss basically any spooky behavior emanating from a smartphone. I'm not going to worry about evil portents until it keeps going after a .45 to the chipset... Maybe a silver one, just in the interest of caution.


>>> 2022-03-05 high definition radio

One of the great joys of the '00s was the tendency of marketers to apply the acronym "HD" to anything they possibly could. The funniest examples of this phenomenon are those where HD doesn't even stand for "High Definition," but instead for something a bit contrived like "Hybrid Digital." This is the case with HD Radio.

For those readers outside of these United States and Canada (actually Mexico as well), HD Radio might be a bit unfamiliar. In Europe, for example, a standard called DAB for Digital Audio Broadcasting is dominant and, relative to HD radio, highly successful. Another relatively widely used digital broadcast standard is Digital Radio Mondiale, confusingly abbreviated DRM, which is more widely used in the short and medium wave bands than in VHF where we find most commercial broadcasting today... but that's not a limitation, DRM can be used in the AM and FM broadcast bands.

HD radio differs from these standards in two important ways: first, it is intended to completely coexist with analog broadcasting due to the lack of North American appetite to eliminate analog. Second, no one uses it.

HD Radio broadcasts have been on the air in the US since the mid '00s. HD broadcasts are reasonably common now, with 9 HD radio carriers carrying 16 stations here in Albuquerque. Less common are HD radio receivers. Many, but not all, modern car stereos have HD Radio support. HD receivers outside of the car center console are vanishingly rare. Stereo receivers virtually never have HD decoding, and due to the small size of the market standalone receivers run surprisingly expensive. I am fairly comfortable calling HD Radio a failed technology in terms of its low adoption, but since it falls into the broader market of broadcast radio, standards are low. We can expect HD Radio stations to remain available well into the future and continue to offer some odd programming.

Santa Fe's 104.1 KTEG ("The Edge"), for example, a run of the mill iHeartMedia alt rock station, features as its HD2 "subcarrier" a station called Dance Nation '90s. The clearly automated programming includes Haddaway's "What Is Love" seemingly every 30 minutes and no advertising whatsoever, because it clearly doesn't have enough listeners for any advertisers to be willing to pay for it. And yet it keeps on broadcasting, presumably an effort by iHeartMedia to meet programming diversity requirements while still holding multiple top-40 licenses in the Albuquerque-Santa Fe market region [1].

So what is all this HD radio stuff? What is a subcarrier? And just what makes HD radio "Hybrid Digital?" HD Radio has gotten some press lately because of the curious failure mode of some Mazda head units, and that's more attention than it's gotten for years, so let's look a bit at the details.

First, HD Radio is primarily, in the US, used in a format called In-Band On-Channel, or IBOC. The basic idea is that a conventional analog radio station continues to broadcast while an HD Radio station is superimposed on the same frequency. The HD Radio signal is found "outside" of the analog signal, as two prominent sideband signals outside of the bandwidth of analog FM stereo.

While the IBOC arrangement strongly resembles a single signal with both analog and digital components, in practice it's very common for the HD signal to be broadcast by a separate transmitter and antenna placed near the analog transmitter (in order to minimize destructive interference issues). This isn't quite considered the "correct" implementation but is often cheaper since it avoids the need to make significant changes to the existing FM broadcast equipment... which is often surprisingly old.

It's completely possible for a radio station to transmit only an HD signal, but because of the rarity of HD receivers this has not become popular. The FCC does not normally permit it, and has declined to extend the few experimental licenses that were issued for digital-only operation. As a result, we see HD Radio basically purely in the form of IBOC. More commonly, HD Radio supports both a full hybrid mode with conventional FM audio clarity and also an "extended" mode in which the digital sidebands intrude on the conventional FM bandwidth. This results in mono-only, reduced-quality FM audio, but allows for a greater digital data rate.

HD Radio was developed and continues to be maintained by a company called iBiquity, which was acquired by DTS, which was acquired by Xperi. iBiquity maintains a patent pool and performs (minimal) continuing development on the standard. iBiquity makes their revenue from a substantial up-front license fee for radio stations to use HD Radio, and from royalties on revenue from subcarriers. To encourage adoption, no royalties are charged on each radio station's primary audio feed. Further encouraging adoption (although not particularly successfully), no royalty or license fees are required to manufacture HD Radio receivers.

The adoption of HD Radio in North America stems from an evaluation process conducted by the FCC in which several commercial options were considered. The other major competitor was FMeXtra, a generally similar design that was not selected by the FCC and so languished. Because US band planning for broadcast radio is significantly different from the European approach, DAB was not a serious contender (it has significant limitations due to the very narrow RF bandwidth available in Europe, a non-issue in the US where each FM radio station was effectively allocated 200kHz).

Layer 1

The actual HD Radio protocol is known more properly as NRSC-5, for its standards number issued by the National Radio Systems Council. The actual NRSC-5 protocol differs somewhat depending on whether the station is AM or FM (the widely different bandwidth characteristics of the two bands require different digital encoding approaches). In the more common case of FM, NRSC-5 consists of a set of separate OFDM data carriers, each conveying part of several logical channels which we will discuss later. A total of 18 OFDM subcarriers are typically present, plus several "reference" subcarriers which are used by receivers to detect and cancel certain types of interference.

If you are not familiar with OFDM or Orthogonal Frequency Division Multiplexing, it is an increasingly common encoding technique that essentially uses multiple parallel digital signals (as we see with the 18 subcarriers in the case of HD Radio) to allow each individual signal to operate at a lower symbol rate. This has a number of advantages, but perhaps the most important is that it is typically used to enable the addition of a "guard interval" between each symbol. This intentional quiet period avoids subsequent symbols "blurring together" in the form of inter-symbol interference, a common problem with broadcast radio systems where multipath effects result in the same signal arriving multiple times at slight time offsets.
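The symbol-rate argument is easier to see with a bit of back-of-the-envelope arithmetic. This is only a sketch: the total symbol rate and the echo delay below are numbers I made up to keep the math tidy, not NRSC-5 parameters. Only the count of 18 subcarriers comes from the text above.

```python
def symbol_duration(total_symbol_rate, carriers):
    """Seconds each symbol lasts when the stream is split evenly across carriers."""
    return carriers / total_symbol_rate


def echo_overlap_fraction(echo_delay, total_symbol_rate, carriers):
    """How much of one symbol period a delayed copy of the signal smears into."""
    return echo_delay / symbol_duration(total_symbol_rate, carriers)


# Hypothetical figures, chosen only to make the arithmetic easy to follow.
TOTAL_RATE = 180_000      # symbols per second across the whole signal
ECHO_DELAY = 20e-6        # a multipath echo arriving 20 microseconds late

single = echo_overlap_fraction(ECHO_DELAY, TOTAL_RATE, carriers=1)
parallel = echo_overlap_fraction(ECHO_DELAY, TOTAL_RATE, carriers=18)

print(f"1 carrier:   echo covers {single:.1%} of each symbol")
print(f"18 carriers: echo covers {parallel:.1%} of each symbol")
```

With these invented numbers, a single fast carrier sees the echo land several symbols late, while each of the 18 slow carriers sees it touch only the first fifth of a symbol. A guard interval of about that length would soak it up.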

A variety of methods are used to encode the logical channels onto the OFDM subcarriers, things like scrambling and convolutional coding that improve the ability of receivers to recover the signal due to mathematics that I am far from an expert on. The end result is that an NRSC-5 standard IBOC signal in the FM band can convey somewhere from 50kbps to 150kbps depending on the operator's desired tradeoffs of bitrate to power and range.

The logical channels are the interface from layer 1 of NRSC-5 to layer 2. The number and type of logical channels depends on the band (FM or AM), the waveform (hybrid analog and digital, analog with reduced bandwidth and digital, or digital only), and finally the service mode, which is basically a configuration option that allows operators to select how digital capacity is allocated.

In the case of FM, five logical channels are supported... but not all at once. A typical full hybrid station broadcasts only primary channel P1 and the PIDS channel, a low-bitrate channel for station identification. P1 operates at approximately 98kbps. For stations using an "extended" waveform with mono FM, the operator can select from configurations that provide 2-3 logical channels with a total bitrate of 110kbps to 148kbps. Finally, all-digital stations can operate in any extended service mode or at lower bitrates with different primary channels present. Perhaps most importantly, all-digital stations can include various combinations of secondary logical channels which can carry yet more data.

The curious system of primary channels is one that was designed basically to ease hardware implementation and is not very intuitive... we must remember that when NRSC-5 was designed, embedded computing was significantly more limited. Demodulation and decoding would have to be implemented in ASICs, and so many aspects of the protocol were designed to ease that process. At this point it is only important to understand that HD Radio's layer 1 can carry some combination of 4 primary channels along with the PIDS channel, which is very low bitrate but considered part of the primary channel feature set.

Layer 1, in summary, takes some combination of primary channels, the low-bitrate PIDS channel, and possibly several secondary channels (only in the case of all-digital stations) and encodes them across a set of OFDM subcarriers arranged just outside of the FM audio bandwidth. The design of the OFDM encoding and other features of layer 1 aid receivers in detecting and decoding this data.

Layer 2

Layer 2 operates on protocol data units or PDUs, effectively the packet of NRSC-5. Specifically, it receives PDUs from services and then distributes them to the layer 1 logical channels.

The services supported by NRSC-5 are the Main Program Service or MPS, which can carry both audio (MPSA) and data (MPSD); the similar Supplemental Program Service or SPS, which also conveys audio and data; the Advanced Application Service (AAS); and the Station Information Service (SIS).

MPS and SPS are where most of HD Radio happens. Each carries a program audio stream along with program data that is related to the audio stream---things like metadata of the currently playing track. These streams can go onto any logical channel at layer 1, depending on the bitrate required and available. An MPS stream is mandatory for an HD radio station, while an SPS is optional.

AAS is an optional feature that can be used for a variety of different purposes, mostly various types of datacasting, which we'll examine later. And finally, the SIS is the simplest of these services, as it has a dedicated channel at layer 1 (the PIDS previously mentioned). As a result, layer 2 just takes SIS PDUs and puts them directly on the layer 1 channel dedicated to them.

The most interesting part of layer 2 is the way that it muxes content together. Rather than sending PDUs for each stream, NRSC-5 will combine multiple streams within PDUs. This means that a PDU may contain only MPS or SPS audio, or it might contain some combination of MPS or SPS with other types of data. While this seems complicated, it has some convenient simplifying properties: PDUs can be emitted for each program stream at a fixed rate based on the audio codec rate. Any unused space in each PDU can then be used to send other types of data, such as for AAS, on an as-available basis. The situation is somewhat simplified for the receiver since it knows exactly when to expect PDUs containing program audio, and that program audio is always the start of a PDU.
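Here is a rough sketch of that fill-the-slack idea, as I understand it. The PDU size, the payload bytes, and the function name are all invented for illustration, and the delimiters and headers of a real PDU are left out entirely.

```python
from collections import deque

PDU_SIZE = 64  # bytes per PDU; a made-up figure for illustration


def build_pdu(audio_block, opportunistic):
    """Put program audio at the start of a PDU, then fill the slack from a queue."""
    if len(audio_block) > PDU_SIZE:
        raise ValueError("audio block does not fit in one PDU")
    free = PDU_SIZE - len(audio_block)
    filler = bytearray()
    while opportunistic and len(filler) < free:
        filler.append(opportunistic.popleft())
    padding = bytes(free - len(filler))  # zero padding if the queue ran dry
    return bytes(audio_block) + bytes(filler) + padding


# Encoded audio blocks arrive at a fixed rate but vary in size.
audio_blocks = [b"A" * 60, b"A" * 40, b"A" * 64]
queue = deque(b"album-art-bytes-waiting-for-room")

pdus = [build_pdu(block, queue) for block in audio_blocks]
for pdu in pdus:
    print(len(pdu), "byte PDU,", pdu.count(b"A"), "audio bytes")
print(len(queue), "opportunistic bytes still waiting")
```

Every PDU comes out the same size and always starts with audio, which is the property that keeps the receiver's job simple. The opportunistic data just rides along whenever the encoder happens to produce a small block.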

The MPS and one or more SPS streams, if present, are not combined together but instead remain separate PDUs and are allocated to the logical channels in one of several fixed schemes depending on the number of SPS present and the broadcast configuration used by the station. In the most common configuration, that of one logical channel on a full hybrid radio station, the MPS and up to two SPS are multiplexed onto the single logical channel. In more complex scenarios such as all-digital stations, the MPS and three SPS may be multiplexed across three logical channels. Conceptually, up to seven distinct SPS identified by a header field can be supported, although I'm not aware of anyone actually implementing this.

It is worth discussing here some of the practical considerations around the MPS and SPS. NRSC-5 requires that an MPS always be present, and the MPS must convey a "program 0" which cannot be stopped and started. This is the main audio channel on an HD radio station. The SPS, though, are used to convey "subcarrier" [2] stations. This is the capability behind the "HD2" second audio channel present on some HD radio stations, and it's possible, although not at all common, to have an HD3 or even HD4.

Interestingly, the PDU "header" is not placed at the beginning of the PDU. Instead, its 24-bit sequence (chosen off a list based on what data types are present in the PDU) is interleaved throughout the body of the PDU. This is intended to improve robustness by allowing the receiver to correctly determine the PDU type even when only part of the PDU is received. PDUs always contain mixed data in a fixed order (program audio, opportunistic data, fixed data), with a "data delimiter" sequence after the program audio and a fixed data length value placed at the end. This assists receivers in interpreting any partial PDUs, since they can "backtrack" from the length suffix to identify the full fixed data section and then search further back for the "data delimiter" to identify the full opportunistic data section.
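The backtracking trick can be sketched in a few lines. To be clear about what is assumed: the delimiter bytes, the two-byte big-endian length suffix, and the function names are all hypothetical stand-ins, and the interleaved header is ignored. Only the ordering of the sections comes from the description above.

```python
import struct

DELIMITER = b"\xde\x11"   # hypothetical marker; not the real delimiter sequence
LEN_FMT = ">H"            # hypothetical 2-byte big-endian fixed-data length suffix
LEN_SIZE = struct.calcsize(LEN_FMT)


def build_pdu(audio, opportunistic, fixed):
    """Lay out a PDU as: audio | delimiter | opportunistic | fixed | length."""
    return audio + DELIMITER + opportunistic + fixed + struct.pack(LEN_FMT, len(fixed))


def parse_tail(pdu):
    """Recover the data sections by working backwards from the end of the PDU."""
    (fixed_len,) = struct.unpack(LEN_FMT, pdu[-LEN_SIZE:])
    fixed_start = len(pdu) - LEN_SIZE - fixed_len
    if fixed_start < 0:
        raise ValueError("length suffix points before the start of the data")
    fixed = pdu[fixed_start:len(pdu) - LEN_SIZE]
    # Search backwards from the fixed data for the last delimiter before it.
    delim_at = pdu.rfind(DELIMITER, 0, fixed_start)
    if delim_at < 0:
        raise ValueError("no data delimiter found")
    opportunistic = pdu[delim_at + len(DELIMITER):fixed_start]
    return opportunistic, fixed


pdu = build_pdu(b"AUDIO" * 10, b"traffic", b"station-logo")
damaged = pdu[30:]  # pretend the first 30 bytes never arrived
print(parse_tail(damaged))
```

Even with the front of the PDU missing, both data sections come back intact, because nothing about finding them depends on the start. This toy version would be fooled by opportunistic data that happens to contain the delimiter bytes, so a real design has to deal with that case somehow.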

And that's layer 2: audio, opportunistic data, and fixed data are collected for the MPS and any SPS and/or AAS, gathered into PDUs, and then sent to layer 1 for transmission. SIS is forwarded directly to layer 1 unmodified.


Applications

NRSC-5's application layer runs on top of layer 2. Applications consist most obviously of the MPS and SPS streams, which are used mainly to convey audio... you know, the thing that a radio station does. This can be called the Audio Transport application and it runs the same way whether producing MPS (remember, this is the main audio feed) or SPS (secondary audio feeds or subcarriers).

Audio transport starts with an audio encoder, which is a proprietary design called HDC or High-Definition Coding. HDC is a DCT-based lossy compression algorithm which is similar to AAC but adjusted to have some useful properties for radio. Among them, HDC receives audio data (as PCM) at a fixed rate and then emits encoded blocks at a fixed rate---but variable size. This variable size but fixed rate is convenient to receivers but also makes "opportunistic data," as discussed earlier, possible, because many PDUs will have spare room at the end.

Another useful feature of HDC is its multi-stream output. HDC can be configured to produce two different bit streams, a "core" bit stream which is lower bitrate but sufficient to reproduce the audio at reduced quality, and an "enhanced" data stream that allows the reproduction of higher fidelity audio. The core bit stream can be placed on a different layer 1 channel than the enhanced data stream, allowing receivers to decode only one channel and still produce useful audio when the second channel is not available due to poor reception quality. This is not typically used by hybrid stations, instead it's a feature intended for extended and digital-only stations.

The variable size of the audio data and variable size of PDUs creates some complexity for receivers, so the audio transport includes some extra data about sample rate and size to assist receivers in selecting an appropriate amount of buffering to ensure that the program audio does not underrun despite bursts of large audio samples and fixed data. This results in a fixed latency from encoding to decoding, which is fairly short but still a bit behind analog radio. This latency is sometimes apparent on receivers that attempt to automatically select between analog and digital signals, even though stations should delay their analog audio to match the NRSC-5 encoder.

Finally, the audio transport section of each PDU (that is, the MPS or SPS part at the beginning) contains regular CRC checksums that are used by the receiver to ensure that any bad audio data is discarded rather than decoded.
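The discard-rather-than-decode behavior looks something like this. Note the big assumption: I am using CRC-32 from Python's zlib purely as a stand-in, since the actual checksum used by the audio transport is not described here, and the block layout is invented too.

```python
import zlib


def protect(audio):
    """Append a 4-byte CRC-32 to an audio block (stand-in for the real checksum)."""
    return audio + zlib.crc32(audio).to_bytes(4, "big")


def usable_audio(blocks):
    """Return only the blocks whose checksum still matches; drop the rest."""
    good = []
    for block in blocks:
        audio, received = block[:-4], int.from_bytes(block[-4:], "big")
        if zlib.crc32(audio) == received:
            good.append(audio)
    return good


sent = [protect(b"frame-one"), protect(b"frame-two"), protect(b"frame-three")]

# Flip one bit in the middle block to simulate a reception error.
corrupted = bytearray(sent[1])
corrupted[0] ^= 0x01
received = [sent[0], bytes(corrupted), sent[2]]

print(usable_audio(received))  # the damaged middle block is dropped
```

Dropping a block leaves a gap the decoder has to conceal, but that generally beats feeding garbage into a lossy codec and playing whatever comes out.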

MPS and SPS audio is supplemented by Program Service Data (PSD), which can be either associated with the MPS (MPSD) or an SPS (SPSD). The PSD protocol generates PDUs which are provided to the audio transport to be incorporated into audio PDUs at the very beginning of the MPS or SPS data. The PSD is rather low bitrate as it receives only a small number of bytes in each PDU. This is quite sufficient, as the PSD only serves to move small, textual metadata about the audio. Most commonly this is the title, artist, and album, although a few other fields are included as well such as structured metadata for advertisements, including a field for price of the advertised deal. This feature is rarely, if ever, used.

The PSD data is transmitted continuously in a loop, so that a receiver that has just tuned to a station can quickly decode the PSD and display information about whatever is being broadcast. The looping PSD data changes whenever required, typically based on an outside system (such as a radio automation system) sending new PSD data to the NRSC-5 encoder over a network connection. PSD data is limited to 1024 bytes total and, as a minimum, the NRSC-5 specification requires that the title and artist fields be populated. Oddly, it makes a half-exception for cases where no information on the audio program is available: the artist field can be left empty, but the title field must be populated with some fixed string. Some radio stations have added an NRSC-5 broadcast but not upgraded their radio automation to provide PSD data to the encoder; in this case it's common to transmit the station call sign or name as the track title, much as is the case with FM Radio Data Service.

Interestingly, the PSD data is viewed as a set of ID3 tags and, even though very few ID3 fields are supported, it is expected that those fields be in the correct ID3 format including version prefixes.
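For a sense of what such a payload might look like, here is a minimal sketch that builds a tag with just a title and artist. I am assuming ID3v2.3 with ISO-8859-1 text, which the text above does not specify; the station name is made up, and the function names are mine.

```python
PSD_LIMIT = 1024  # total PSD size limit mentioned above


def text_frame(frame_id, text):
    """Build one ID3v2.3 text frame: id, 4-byte size, 2 flag bytes, then the text."""
    body = b"\x00" + text.encode("latin-1")  # 0x00 marks ISO-8859-1 encoding
    return frame_id.encode("ascii") + len(body).to_bytes(4, "big") + b"\x00\x00" + body


def synchsafe(n):
    """Encode a tag size as four 7-bit bytes, as the ID3v2 header requires."""
    return bytes((n >> shift) & 0x7F for shift in (21, 14, 7, 0))


def build_psd(title, artist=""):
    """Assemble a minimal tag with title (TIT2) and artist (TPE1) frames."""
    if not title:
        raise ValueError("the title must always be populated")
    frames = text_frame("TIT2", title) + text_frame("TPE1", artist)
    # "ID3", version 2.3.0, no flags, then the synchsafe size of the frames.
    tag = b"ID3" + b"\x03\x00" + b"\x00" + synchsafe(len(frames)) + frames
    if len(tag) > PSD_LIMIT:
        raise ValueError("PSD exceeds 1024 bytes")
    return tag


# A station with no automation feed: fixed title, empty artist.
tag = build_psd("KXYZ 99.9")
print(len(tag), tag[:10])
```

The empty-artist case mirrors the half-exception described earlier: the artist frame is still present but carries no text, while an empty title is refused outright.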

Perhaps the most sophisticated feature of NRSC-5 is the Advanced Application Service transport or AAS. AAS is a flexible system intended to send just about any data alongside the audio programs. Along with PDUs, the audio transport generates a metadata stream indicating how much of each PDU's length is used. The AAS can use that value to determine how many bytes are free, and then fill them with opportunistic data of whatever type it likes. As a result, the AAS basically takes advantage of any "slack" in the radio broadcast's capacity, as well as reserving a portion for fixed data if desired by the station operator.

AAS data is encoded into AAS packets, an organizational unit independent of PDUs (and included within PDUs generated by the audio transport) and loosely based on computer networking conventions. Interestingly, AAS packets may be fragmented or combined to fit into available space in PDUs. To account for this variable structure, AAS specifies a transport layer below AAS packets which is based on HDLC (ISO high-level data link control) or PPP (point-to-point protocol, which is closely related to HDLC and very similar). So, in a way, AAS consists of a loosely computer-network-like protocol over a protocol roughly based on PPP over audio transport PDUs over OFDM.
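To show why an HDLC-ish layer helps with fragmenting and combining, here is a sketch of PPP-style byte stuffing. The flag (0x7E) and escape (0x7D) values are the ones PPP uses in its HDLC-like framing; whether AAS uses exactly these values is my assumption, and the frame check sequence a real link would carry is left out.

```python
FLAG = 0x7E    # frame boundary marker used by HDLC and PPP
ESCAPE = 0x7D  # PPP control-escape byte


def frame(payload):
    """Wrap a payload in flag bytes, escaping any flag/escape bytes inside it."""
    out = bytearray([FLAG])
    for byte in payload:
        if byte in (FLAG, ESCAPE):
            out += bytes([ESCAPE, byte ^ 0x20])
        else:
            out.append(byte)
    out.append(FLAG)
    return bytes(out)


def deframe(stream):
    """Split a byte stream on flags and undo the escaping; returns the payloads."""
    payloads, current, escaped = [], bytearray(), False
    for byte in stream:
        if byte == FLAG:
            if current:
                payloads.append(bytes(current))
            current, escaped = bytearray(), False
        elif byte == ESCAPE:
            escaped = True
        elif escaped:
            current.append(byte ^ 0x20)
            escaped = False
        else:
            current.append(byte)
    return payloads


# Two packets packed back to back, as if they shared the slack in one PDU.
stream = frame(b"\x7ehello") + frame(b"wor\x7dld")
print(deframe(stream))
```

Because packet boundaries are marked in the byte stream itself, it doesn't matter where the PDU boundaries fall. A packet can straddle two PDUs or share one with its neighbors, and the receiver just keeps scanning for flags.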

Each AAS packet header specifies a sequence number for reconstruction of large payloads and a port number, which indicates to the receiver how it should handle the packet (or perhaps instead ignore the packet). A few ranges of port numbers are defined, but the vast majority are left to user applications.
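A receiver's handling of ports and sequence numbers might look roughly like this. Port numbers are two bytes, but the two-byte sequence field, the header layout, and the port value used below are hypothetical and chosen only for illustration.

```python
import struct
from collections import defaultdict

HEADER = struct.Struct(">HH")  # port (2 bytes), then a hypothetical 2-byte sequence


def make_packet(port, seq, payload):
    """Prefix a payload with the toy port/sequence header."""
    return HEADER.pack(port, seq) + payload


class Receiver:
    """Collects payloads for ports it cares about and ignores everything else."""

    def __init__(self, wanted_ports):
        self.wanted = set(wanted_ports)
        self.parts = defaultdict(dict)  # port -> {sequence number: payload}

    def handle(self, packet):
        port, seq = HEADER.unpack_from(packet)
        if port not in self.wanted:
            return  # unknown application: skip the packet
        self.parts[port][seq] = packet[HEADER.size:]

    def reassemble(self, port):
        """Join the payloads received on a port in sequence-number order."""
        chunks = self.parts[port]
        return b"".join(chunks[seq] for seq in sorted(chunks))


ART_PORT = 0x1234    # made-up port for an image payload
OTHER_PORT = 0x4321  # made-up port this receiver does not understand

rx = Receiver([ART_PORT])
for packet in (
    make_packet(ART_PORT, 1, b"DATA"),       # arrives out of order
    make_packet(OTHER_PORT, 0, b"ignored"),
    make_packet(ART_PORT, 0, b"JPEG"),
):
    rx.handle(packet)

print(rx.reassemble(ART_PORT))
```

The nice property is that a cheap receiver can ignore every port it doesn't recognize without understanding anything about the payload, which is what leaves room for user applications.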

Port numbers are two bytes, and so there's a large number of applications possible. Very few are defined by specification, limited basically to port numbers for supplemental PSD. This might be a bit confusing since PSD has its own reserved spot at the beginning of the audio transport. The PSD protocol itself is limited to only small amounts of text, and so when desired AAS can be used to send larger PSD-type payloads. The most common application of this "extra PSD" is album art, which can be sent as a JPG or PNG file in the AAS stream. In fact, multiple ports are reserved for each of MPSD (main PSD) and SPSD, allowing different types of extra data to be sent via AAS.

Ultimately, the AAS specification is rather thin... because AAS is a highly flexible feature that can be used in a number of ways. For example, AAS forms the basis of the Artist Experience service which allows for delivery of more complete metadata on musical tracks including album art. AAS can be used as the basis of almost any datacasting application, and is applied to everything from live traffic data to distribution of educational material to rural areas.

Finally, in our tour of applications, we should consider the station information service or SIS. SIS is a very basic feature of NRSC-5 that allows a station to broadcast its identification (call sign and name) along with some basic services like a textual message related to the station and emergency alert system messages. SIS has come up somewhat repeatedly here because it receives special treatment; SIS is a very simple transport at a low bitrate and has its own dedicated logical channel for easy decoding. As a result, SIS PDUs are typically the first thing a receiver attempts to decode, and are very short and simple in structure.

To sum up the structure of HD radio, it is perhaps useful to look at it as a flow process: SIS data is generated by the encoder and sent to layer 2 which passes it directly to layer 1, where it is transmitted on its own logical channel. PSD data is provided to the audio transport which embeds it at the beginning of audio PDUs. The audio transport informs the AAS encoder of the amount of available free space in a PDU, and the AAS encoder provides an appropriate amount of data to the audio transport to be added at the end of the PDU. This PDU is then passed to layer 2 which encapsulates it in a complete NRSC-5 PDU and arranges it into logical channels which are passed to layer 1. Layer 1 encodes the data into multiple OFDM carriers using a somewhat complex scheme that produces a digital signal that is easy for receivers to recover.

Non-Audio Applications of NRSC-5

While the NRSC-5 specification is clearly built mostly around transporting the main and secondary program audio, the flexibility of its data components like PSD and AAS allows its use for purposes other than audio. As a very simple example, SIS packets include a value called the "absolute local frame number" or ALFN that is effectively a timestamp, useful for receivers to establish the currency of emergency alert messages and for various data applications. Because the current time can be easily calculated from the ALFN, it can be used to set the clocks on HD radio receivers such as car head units. To support this, standard SIS fields include information on local time zone, daylight saving time, and even upcoming leap seconds.

SIS packets include a one-bit flag that indicates whether or not the ALFN is being generated based on a GPS-locked time source, or based on the NRSC-5 encoder's internal clock only. To avoid automatically adjusting radio clocks to an incorrect time (something that had plagued the earlier CEA protocol for automatic setting of VCR clocks via PBS member stations), NRSC-5 dictates that receivers must not set their display time based on a radio station's ALFN unless the flag indicating GPS lock is set. Unfortunately, it seems that it's rather uncommon for radio stations to equip their encoder with a GPS time source, and so in the Albuquerque market at least HD Radio-based automatic time setting does not work.
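The receiver-side logic can be sketched as follows. Two things here are assumptions on my part rather than anything stated above: that the ALFN counts frames from the GPS epoch (January 6, 1980), and that each frame lasts 65536/44100 seconds. Treat both constants, and the function names, as illustrative.

```python
from datetime import datetime, timedelta, timezone

# Assumptions, not taken from the text above: frames are counted from the GPS
# epoch and each frame lasts 65536/44100 seconds (about 1.486 s).
GPS_EPOCH = datetime(1980, 1, 6, tzinfo=timezone.utc)
FRAME_SECONDS = 65536 / 44100


def alfn_to_utc(alfn, leap_seconds=0):
    """Convert a frame count to UTC, subtracting the GPS-UTC leap second offset."""
    return GPS_EPOCH + timedelta(seconds=alfn * FRAME_SECONDS - leap_seconds)


def maybe_set_clock(alfn, gps_locked, leap_seconds=0):
    """Return a time to display, or None when the station's clock is not trusted."""
    if not gps_locked:
        return None
    return alfn_to_utc(alfn, leap_seconds)


# A station without a GPS time source: the receiver leaves its clock alone.
print(maybe_set_clock(900_000_000, gps_locked=False))

# The same frame number from a GPS-locked station is accepted.
print(maybe_set_clock(900_000_000, gps_locked=True, leap_seconds=18))
```

The GPS flag check is the whole safety mechanism: the arithmetic is identical either way, and the receiver simply refuses to act on the result unless the station vouches for its time source.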

Other supplemental applications were included in the basic SIS as well, notably emergency alert messages. HD Radio stations can transmit emergency alert messages in text format with start and end times. In practice this seems to be appreciably less successful than the more flexible capability of SiriusXM, and ironically despite its cost to the consumer SiriusXM might have better market penetration than HD Radio.

NRSC-5's data capabilities can be used to deliver an enhanced metadata experience around the audio programming. The most significant implementation of this concept is the "artist experience" service, a non-NRSC-5 standard promulgated by the HD Radio alliance that uses the AAS to distribute more extensive metadata including album art in image format. This is an appreciably more complex process and so is basically expected to be implemented in software on a general-purpose embedded operating system, rather than the hardware-driven decoding of audio programming and basic metadata. Of course this greater complexity led more or less directly to the recent incident with Mazda HD radio receivers in Seattle, triggered by a station inadvertently transmitting invalid Artist Experience data in a way that seems to have caused the Mazda infotainment system to crash during parsing. Unfortunately infotainment-type HD radio receivers typically store HD Radio metadata in nonvolatile memory to improve startup time when tuning to a station, so these Mazda receivers apparently repeatedly crashed every time they were powered on to such a degree that it was not possible to change stations (and avoid parsing the cached invalid file). Neat.

Since Artist Experience just sends JPG or PNG files of album art, we know that AAS can be used to transmit files in general (and looking at the AAS protocol you can probably easily come up with a scheme to do so). This opens the door to "datacasting," or the use of broadcast technology to distribute computer data. I have written on this topic before.

To cover the elements specific to our topic, New Mexico's KANW and some other public radio stations are experimenting with transmitting educational materials from local school districts as part of the AAS data stream on their HD2 subcarrier. Inexpensive dedicated receivers collect these files over time and store them on an SD card. These receiver devices also act as WiFi APs and offer the stored contents via an embedded web server. This allows the substantial population of individuals with phones, tablets, or laptops but no home internet or cellular service to retrieve their distance education materials at home, without having to drive into town for cellular service (the existing practice in many parts of the Navajo Nation, for example) [3].

There is potential to use HD Radio to broadcast traffic information services, weather information, and other types of data useful to car navigation systems. While there's a long history of datacasting this kind of information via radio, it was never especially successful and the need has mostly been obsoleted by ubiquitous LTE connectivity. In any case, the enduring market for this type of service (over-the-road truckers for example) has a very high level of SiriusXM penetration and so already receives this type of data.

Fall of the House of Hybrid Digital

In fact, the satellite angle is too big to ignore in an overall discussion of HD Radio. Satellite radio was introduced to the US at much the same time as HD Radio, although XM proper was on the market slightly earlier. Satellite has the significant downside of a monthly subscription fee. However, time seems to have shown that the meaningful market for enhanced broadcast radio consists mostly of people who are perfectly willing to pay a $20/mo subscription for a meaningfully better service. Moreover, it consists heavily of people involved in the transportation industry (Americans listen to the radio basically only in vehicles, so it makes sense that the most dedicated radio listeners are those who spend many hours in motion). Since many of these people regularly travel across state lines, a nationwide service is considerably more useful than one where they have to hunt for a new good station to listen to as they pass through each urban area.

All in all, HD radio is not really competitive for today's serious radio listeners because it fails to address their biggest complaint, that radio is too local. Moreover, SiriusXM's ongoing subscription revenue seems to provide a much stronger incentive to quality than iHeartMedia's declining advertising relationships [4]. The result is that, for the most part, the quality of SiriusXM programming is noticeably better than most commercial radio stations, giving it a further edge over HD Radio.

Perhaps HD Radio is simply a case of poor product-market fit, SiriusXM having solved essentially the same problems but much better. Perhaps the decline of broadcast media never really gave it a chance. The technology is quite interesting, but adoption is essentially limited to car stereos, and not even that many of them. I suppose that's the problem with broadcast radio in general, though.

[1] The details here are complex and deserve their own post, but as a general idea the FCC attempts to maintain a diversity of radio programming in each market by refusing licenses to stations proposing a format that is already used by other stations. Unfortunately there are relatively few radio formats that are profitable to operate, so the broadcasting conglomerates tend to end up playing games with operating stations in minor formats, at little profit, in order to argue to the FCC that enough programming diversity is available to justify another top 40 or "urban" station.

[2] The term "subcarrier" is used this way basically for historical reasons and doesn't really make any technical sense. It's better to think of "HD2" as being a subchannel or secondary channel, but because of the long history of radio stations using actual subcarrier methods to convey an alternate audio stream the subcarrier term has stuck.

[3] It seems inevitable that, as has frequently happened in the history of datacasting, improving internet access technology will eventually obsolete this concept. I would strongly caution you against thinking this has already happened, though: even ignoring the issue of the long and somewhat undefined wait, Starlink is considerably more expensive than the typical rates for rural internet service in New Mexico. It is to some extent a false dichotomy to say that Starlink is cost uncompetitive with DSL considering that it can service a much greater area. However, I think a lot of "city folk" are used to the over-$100-per-month pricing typical of urban gigabit service and so view Starlink as inexpensive. They do not realize that, for all the downsides of rural DSL, it is very cheap. This reflects the tight budget of its consumers. For those who have access, CenturyLink DSL in New Mexico ranch country is typically $45/mo no-contract with no install fee and many customers use a $10/mo subsidized rate for low income households. Starlink's $99/mo and $500 initial is simply unaffordable in this market, especially since those outside of the CenturyLink service area have, on average, an even lower disposable income than those clustered near towns and highways.

[4] It is hard for me not to feel like iHeartMedia brought this upon themselves. They gained essentially complete control of the radio industry (with only even sadder Cumulus as a major competitor) and then squeezed it for revenue until US commercial radio programming had become, essentially, a joke. Modern commercial radio stations run on exceptionally tight budgets that have mostly eliminated any type of advantage they might have had due to their locality. This is most painfully apparent when you hear an iHeartMedia station give a rare traffic update (they seem to view this today as a mostly pro forma activity and do it as little as possible) in which the announcer pronounces "Montaño" and, more puzzlingly, "Coors" wrong in the span of a single sentence. I have heard a rumor that all of the iHeartMedia traffic announcements are done centrally from perhaps Salt Lake City but I do not know if this is true.


>>> 2022-02-19 PCM

I started writing a post about media container formats, and then I got severely sidetracked by explaining how MPEG elementary streams aren't in a container but still have most of the features of containers and had a hard time getting back to topic until I made the decision that I ought to start down the media rabbit hole with something more basic. So let's talk about an ostensibly basic audio format, PCM.

PCM stands for Pulse Code Modulation and, fundamentally, it is a basic technique for digitization of analog data. PCM is so obvious that explaining it is almost a bit silly, but here goes: given an analog signal, at regular intervals the amplitude of the signal is measured and quantized to the nearest representable number (in other words, rounded). The resulting "PCM signal" is this sequence of numbers. If you remember your Nyquist and Shannon from college data communications, you might realize that the most important consideration in this process is that the sampling frequency must be at least twice the highest frequency component in the signal to be digitized.
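The sample-and-round process is short enough to sketch in a few lines. This is a minimal illustration rather than any particular codec: the function names are made up, the "analog" signal is just a Python function of time with values between -1 and 1, and the signed-integer scaling is one of several reasonable choices.

```python
import math


def pcm_encode(signal, duration_s, sample_rate_hz, bit_depth):
    """Sample signal(t) at regular intervals and round to signed integers."""
    max_code = 2 ** (bit_depth - 1) - 1      # e.g. 127 for 8 bits
    n_samples = round(duration_s * sample_rate_hz)
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz               # time of the n-th sample
        amplitude = max(-1.0, min(1.0, signal(t)))
        samples.append(round(amplitude * max_code))
    return samples


def tone(freq_hz):
    """A stand-in for an analog input: a unit-amplitude sine wave."""
    return lambda t: math.sin(2 * math.pi * freq_hz * t)


# One millisecond of a 1 kHz tone at 8 kHz, 8 bits: eight samples.
print(pcm_encode(tone(1000), 0.001, 8000, 8))

# A 5 kHz tone is above half the 8 kHz sampling rate. Its samples are
# indistinguishable from those of an inverted 3 kHz tone (aliasing).
print(pcm_encode(tone(5000), 0.001, 8000, 8))
print(pcm_encode(tone(3000), 0.001, 8000, 8))
```

The last two lines print sequences that differ only in sign, which is the Nyquist limit in action: once sampled, nothing distinguishes the 5 kHz tone from a 3 kHz one.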

In the telephone network, for example, PCM encoding is performed at 8kHz. This might seem surprisingly low, but speech frequencies trail off above 3kHz and so the up-to-4kHz represented by 8kHz PCM is perfectly sufficient for intelligible speech. It is not particularly friendly to music, though, which is part of why hold music is the way it is. For this reason, in music and general digital audio a sampling rate of 44.1kHz is conventional due to having been selected for CDs. Audible frequencies are often defined as being "up to 20kHz" although few people can actually hear anything that high (my own hearing trails off at 14kHz, attributable to a combination of age and adolescent exposure to nu metal). This implies a sampling rate of 40kHz; the reason that CDs use 44.1kHz is essentially that they wanted to go higher for comfort and 44.1kHz was the highest they could easily go on the equipment they had at the time. In other words, there's no particular reason, but it's an enduring standard.

Another important consideration in PCM encoding is the number of discrete values that samples can possibly take. This is commonly expressed as the number of bits available to represent each sample and called "bit depth." For example, a bit depth of eight allows each sample to have one of 256 values that we might label -127 through 128. The bit depth is important because it limits the dynamic range of the signal. Dynamic range, put simply, is the greatest possible variation in amplitude, or the greatest possible variation between quiet and loud. Handling large dynamic ranges can be surprisingly difficult in both analog and digital systems, since both electronics and algorithms struggle to handle values that span multiple orders of magnitude.

In PCM encoding, bit depth has a huge impact on the resulting bitrate. 16-bit audio, as used on CDs, is capable of a significantly higher dynamic range than 8-bit audio at the cost of doubling the bitrate. Dynamic range is important in music, but is also surprisingly important in speech, and a bit depth of 8 is actually insufficient to reproduce speech that will be easy to understand.
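The arithmetic behind both claims is simple. The bitrate of uncompressed PCM is the sampling rate times the bit depth times the number of channels, and the theoretical dynamic range of linear PCM grows by roughly 6 dB per bit. A small sketch (the function names are illustrative, and the dynamic range figure is the usual idealized one that ignores noise and dither):

```python
import math


def pcm_bitrate_bps(sample_rate_hz, bit_depth, channels=1):
    """Bits per second for uncompressed PCM."""
    return sample_rate_hz * bit_depth * channels


def dynamic_range_db(bit_depth):
    """Idealized ratio of the largest to the smallest step, in decibels."""
    return 20 * math.log10(2 ** bit_depth)


print(pcm_bitrate_bps(8000, 8))        # telephone channel: 64000
print(pcm_bitrate_bps(44100, 16, 2))   # CD stereo: 1411200
print(round(dynamic_range_db(8), 1))   # about 48 dB
print(round(dynamic_range_db(16), 1))  # about 96 dB
```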

And yet, due to technical constraints, 8kHz and 8-bit samples were selected for telephone calls. So how is speech acceptably carried over 8-bit PCM?

We need to talk a bit about the topics of compression and companding. There can be some confusion here because "compression" is commonly used in computing to refer to methods that reduce the bitrate of data. In audio engineering, though, compression refers to techniques that reduce the dynamic range of audio, by making quieter sounds louder and louder sounds quieter until they tend to converge at a fixed volume. Like some other writers, I will use "dynamic compression" when referring to the audio technique to avoid confusion. For both practical and aesthetic reasons (not to mention, arguably, stupid reasons), some degree of dynamic compression is applied to most types of audio that we listen to.

Companding, a portmanteau of compressing and expanding, is a method used to pack a wide dynamic range signal into a channel with a smaller dynamic range. As the name suggests, companding basically consists of compressing the signal, transmitting it, and then expanding it. How can the signal be expanded, though, given that dynamic range was lost when it was compressed? The trick is that both sides of a compander are non-linear, compressing loud sounds more than quiet sounds. This works well, because in practice many types of audio show a non-linear distribution of amplitudes. In the case of speech, for example, significantly more detail is found at low volume levels, and yet occasional peaks must be preserved for good intelligibility.

In practice, companding is so commonly used with PCM that the compander is often considered part of the PCM coding. When I have described PCM thus far, I have been describing linear PCM or LPCM. LPCM matches each sample against a set of evenly distributed discrete values. Many actual PCM systems use some form of non-linear PCM in which the possible sample values are distributed logarithmically. This makes companding part of PCM itself, as the encoder effectively compresses and decoder effectively expands. One way to illustrate this is to consider what would happen if you digitized audio using a non-linear PCM encoder and then played it back using a linear PCM decoder: It would sound compressed, with the quieter components moved into a higher-valued, or louder, range.
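To make the trade concrete, here is a sketch that compares plain 8-bit linear quantization with compress, quantize, expand. It uses the continuous µ-law curve with µ = 255 as the non-linearity. That is an assumption for illustration: real telephone codecs use a segmented, bit-exact approximation of the curve, and the helper names here are made up.

```python
import math

MU = 255  # parameter commonly used for 8-bit mu-law


def compress(x, mu=MU):
    """Map x in [-1, 1] through a logarithmic curve (encoder side)."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def expand(y, mu=MU):
    """Inverse of compress (decoder side)."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def quantize(x, bit_depth=8):
    """Round to the nearest of the evenly spaced levels for this depth."""
    levels = 2 ** (bit_depth - 1) - 1
    return round(x * levels) / levels


def linear_error(x):
    return abs(quantize(x) - x)


def companded_error(x):
    return abs(expand(quantize(compress(x))) - x)


for x in (0.001, 0.01, 0.1, 0.9):
    print(f"{x:>6}: linear {linear_error(x):.6f}  companded {companded_error(x):.6f}")
```

Quiet inputs come back far more accurately through the companded path (the linear quantizer rounds 0.001 to zero outright), while loud inputs come back somewhat less accurately. That is the loss of fidelity being traded for bit depth.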

Companding does result in a loss of fidelity, but it's one that is not very noticeable for speech (or even for music in many cases) and it results in a significant savings in bit depth. Companding is ubiquitous in speech coding.

One of the weird things you'll run into with PCM is the difference between µ-law PCM and A-law PCM. In the world of telephony, a telephone call is usually encoded as uncompressed 8kHz, 8-bit PCM, resulting in the 64kbps bitrate that has become the basic unit of bandwidth in telecom systems. Given the simplicity of uncompressed PCM, it can be surprising that many telephony systems like VoIP software will expect you to choose from two different "versions" of PCM. The secret of telephony PCM is that companding is viewed as part of the PCM codec, and for largely historic reasons there are two common algorithms in use. The actual difference is the function or curve used for companding, or in other words, the exact nature of the non-linearity. In the US and Japan (owing to post-WWII history Japan's phone system is very similar to that of the US), the curve called µ-law is in common use. In Europe and most other parts of the world, a somewhat different curve is used, called A-law. In practice the difference between the two is not particularly significant, and it's difficult to call one better than the other since both just make slightly different trade offs of dynamic range for quantization error (µ-law is the option with greater dynamic range and greater possible distortion).
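For the curious, the two curves can be compared directly. The sketch below uses the textbook continuous forms with the usual parameters (µ = 255 and A = 87.6); those parameter values and the function names are not from the text above, and deployed codecs use piecewise-linear approximations of these curves rather than the smooth functions.

```python
import math


def mu_law(x, mu=255.0):
    """Continuous mu-law compression curve for x in [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def a_law(x, a=87.6):
    """Continuous A-law compression curve for x in [-1, 1]."""
    ax = a * abs(x)
    denom = 1 + math.log(a)
    # Linear near zero, logarithmic above the 1/A breakpoint.
    y = ax / denom if ax < 1 else (1 + math.log(ax)) / denom
    return math.copysign(y, x)


for level in (0.001, 0.01, 0.1, 1.0):
    print(f"{level:>6}: mu-law {mu_law(level):.3f}  A-law {a_law(level):.3f}")
```

Both curves map full scale to full scale and differ mainly in how they treat the quietest inputs, which is where the two standards make their slightly different trade.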

Companding is rarely applied in music and general multimedia applications. One way to look at this is to understand the specializations of different audio codecs: µ-law PCM and A-law PCM are both simple examples of what are called speech codecs, Speex and Opus being more complex examples that use lossy compression techniques for further bitrate reduction (or better fidelity at 64kbps). Speech codecs are specialized for the purpose of speech and so make assumptions that are true of speech including a narrow frequency range and certain temporal characteristics. Music fed through speech codecs tends to become absolutely unlistenable, particularly for lossy speech codecs, which hold music on GSM cellphones painfully illustrates.

In multimedia audio systems, we instead have to use general-purpose audio codecs, most of which were designed around music. Companding is effectively a speech coding technique and is left out of these audio systems. PCM is still widely used, but in general audio PCM is assumed to imply linear PCM.

As previously mentioned, the most common convention for PCM audio is 44.1kHz at 16 bits. This was the format used by CDs, which effectively introduced digital audio to the consumer market. In the professional market, where digital audio has a longer history, 48kHz is also in common use... however, you might be able to tell just by mathematical smell that conversion from 48kHz to 44.1kHz is prone to distortion problems due to the inconveniently large common multiple of the two sample rates. An increasingly commonly used sample rate in consumer audio is 96kHz, and "high resolution audio" usually refers to 96kHz and 24 bit depth.
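The "mathematical smell" is easy to check. Reducing the ratio of the two rates shows how awkward the conversion is, and the least common multiple is the intermediate rate a naive upsample-then-downsample converter would have to pass through. A quick sketch (variable names are arbitrary):

```python
from fractions import Fraction
from math import gcd

pro_rate = 48_000
cd_rate = 44_100

ratio = Fraction(cd_rate, pro_rate)
common_multiple = pro_rate * cd_rate // gcd(pro_rate, cd_rate)

print(ratio)                      # 147/160
print(common_multiple)            # 7056000
print(Fraction(96_000, pro_rate)) # 2
```

Every 160 samples at 48kHz have to become 147 samples at 44.1kHz, whereas 96kHz to 48kHz is a clean factor of two.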

There is some debate over whether or not 96kHz sampling is actually a good idea. Remembering our Nyquist-Shannon, note that all of the extra fidelity we get from the switch from 44.1kHz to 96kHz sampling is outside of the range detectable by even the best human ears. In practice the bigger advantage of 96kHz is probably that it is an even multiple of the 48kHz often used by professional equipment and thus eliminates effects from sample rate conversion. On the other hand, there is some reason to believe that the practicalities of real audio reproduction systems (namely the physical characteristics of speakers, which are designed for reproduction of audible frequencies) cause the high frequency components preserved by 96kHz sampling to turn into distortion at lower, audible frequencies... with the counterintuitive result that 96kHz sampling may actually reduce subjective audio quality, when reproduced through real amplifiers and speakers. In any case, the change to 24-bit samples is certainly useful as it provides greater dynamic range. Unfortunately, much like "HDR" video (which is the same concept, a greater sample depth for greater dynamic range), most real audio is 16-bit and so playback through a 24-bit audio chain requires scaling that doesn't typically produce distortion but can reveal irritating bugs in software and equipment. Fortunately the issue of subjective gamma, which makes scaling of non-HDR video to HDR display devices surprisingly complex, is far less significant in the case of audio.
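In the simplest case, the 16-bit to 24-bit scaling is just a multiplication by 256, or equivalently a shift left by eight bits. The sketch below assumes signed integer samples and uses made-up function names; a real playback chain may do something more elaborate (dithering on the way back down, for instance).

```python
def widen_16_to_24(sample):
    """Scale a signed 16-bit sample into the signed 24-bit range."""
    if not -32768 <= sample <= 32767:
        raise ValueError("not a signed 16-bit sample")
    return sample * 256  # equivalent to shifting left by 8 bits


def narrow_24_to_16(sample):
    """Drop the low 8 bits again (no dithering in this sketch)."""
    return sample >> 8


print(widen_16_to_24(-32768))  # most negative 24-bit value
print(widen_16_to_24(32767))   # slightly below the 24-bit maximum
```

Because the scaling only appends zero bits, nothing is lost or distorted. Note that positive full scale lands at 8388352 rather than the 24-bit maximum of 8388607.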

PCM audio, at whatever sample rate and bit depth, is not so often seen in the form of files because of its size. That said, the "WAV" file format is a simple linear PCM encoding stored in a somewhat more complicated container. PCM is far more often used as a transport between devices or logical components of a system. For example, if you use a USB audio device, the computer is sending a PCM stream to the device. Unfortunately Bluetooth does not afford sufficient bandwidth for multimedia-quality PCM, so our now ubiquitous Bluetooth audio devices must use some form of compression. A now less common but clearer example of PCM transport is found in the form of S/PDIF, a common consumer digital audio transport that can carry two 44.1 or 48kHz 16-bit PCM channels over a coaxial or fiber-optic cable.
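Python's standard library happens to include a `wave` module, which makes it easy to see how thin the WAV wrapper around linear PCM is. The sketch below writes a hundredth of a second of a 440Hz tone as mono 16-bit PCM into an in-memory buffer; the helper name and the choice of tone are arbitrary.

```python
import io
import math
import struct
import wave


def tone_as_wav(freq_hz=440, seconds=0.01, rate=44100):
    """Return the bytes of a mono 16-bit linear PCM WAV file."""
    n = round(seconds * rate)
    # Each sample is a little-endian signed 16-bit integer.
    frames = b"".join(
        struct.pack("<h", round(32767 * math.sin(2 * math.pi * freq_hz * i / rate)))
        for i in range(n)
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)      # bytes per sample
        w.setframerate(rate)
        w.writeframes(frames)
    return buf.getvalue()


data = tone_as_wav()
print(data[:4], data[8:12])    # container magic: b'RIFF' b'WAVE'
print(len(data) - 2 * 441)     # bytes of header around the raw samples
```

Everything after the short header is the raw sample data, byte for byte, which is why an uncompressed WAV file is so large compared to a compressed format.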

You might wonder how this relates to the most common consumer digital audio transport today, HDMI. HDMI is one of a confusing flurry of new video standards that were developed as a replacement for the analog VGA, but HDMI originated more from the consumer A/V part of the market (the usual Japanese suspects, mostly) and so is more associated with televisions than the (computer industry backed) DisplayPort standard. A full treatment of HDMI's many features and misfeatures would be a post of its own, but it's worth mentioning the forward audio channel.

HDMI carries the forward (main, not return) audio channel by interleaving it with the digital video signal during the "vertical blanking interval," a concept that comes from the mechanical operation of CRT displays but has remained a useful way to take advantage of excess bandwidth in a video channel. The term vertical blanking is now somewhat archaic but the basic idea is that transmitting a frame takes less time than the frame is displayed for, and so the unoccupied time between transmitting each frame can be used to transmit other data. The HDMI spec allows for up to 8 channels of 24-bit PCM, at up to 192kHz sampling rate---although devices are only required to support 2 channels for stereo.

Despite the capability, 8-channel (usually actually "7.1" channel in the A/V parlance) audio is not commonly seen on HDMI connections. Films and television shows more often distribute multi-channel audio in the form of a compressed format designed for use on S/PDIF, most often Dolby Digital and DTS (Xperi). In practice the HDMI audio channel can move basically any format so long as the devices on the ends support it. This can lead to some complexity in practice, for example when playing a blu-ray disc with 7.1 channel DTS audio from a general-purpose operating system that usually outputs PCM stereo. High-end HDMI devices such as stereo receivers have to support automatic detection of a range of audio formats, while media devices have to be able to output various formats and often switch between them during operation.

On HDMI, the practicalities of inserting audio in the vertical blanking interval require that the audio data be packetized, or split up into chunks so that it can be divided into the VBI and then reassembled into a continuous stream on the receiving device. This concept of packetized audio and/or video data is actually extremely common in the world of media formats, as packetization is an easy way to achieve flexible muxing of multiple independent streams. And that promise, that we are going to talk about packets, seems like a good place to leave off for now. Packets are my favorite things!
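As a toy illustration of the general idea (not of HDMI's actual packet format, which this sketch makes no attempt to follow), packetization amounts to cutting each stream into labeled, numbered chunks so that several streams can share one channel and be sorted out again on the far end. All names and the tuple layout here are invented for the example.

```python
from itertools import zip_longest


def packetize(stream_id, payload, chunk_size):
    """Split payload into (stream_id, sequence_number, chunk) packets."""
    return [
        (stream_id, seq, payload[offset:offset + chunk_size])
        for seq, offset in enumerate(range(0, len(payload), chunk_size))
    ]


def mux(*packet_lists):
    """Interleave packets from several streams into one sequence."""
    return [
        packet
        for group in zip_longest(*packet_lists)
        for packet in group
        if packet is not None
    ]


def demux(packets):
    """Reassemble each stream from its packets, in sequence order."""
    streams = {}
    for stream_id, _seq, chunk in sorted(packets, key=lambda p: (p[0], p[1])):
        streams[stream_id] = streams.get(stream_id, b"") + chunk
    return streams


audio = packetize("audio", b"0123456789", 4)
video = packetize("video", b"ABCDEFGH", 3)
channel = mux(audio, video)
print(channel)
print(demux(channel))
```

The receiver only needs the stream label and sequence number on each packet to rebuild continuous streams, regardless of how the sender chose to interleave them.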

Later on computer.rip: MPEG. Not much about the compression, but a lot about the physical representations of MPEG media, such as elementary streams, transport streams, and containers. These are increasingly important topics as streaming media becomes a really common software application... plus it's all pretty interesting and helps to explain the real behavior of terrible Hulu TV apps.

A brief P.S.: If you were wondering, there is no good reason that PCM is called PCM. The explanation seems to just be that it was developed alongside PWM and PPM, so the name PCM provided a pleasing symmetry. It's hard to actually make the term make a lot of sense, though, beyond that "code" was often used in the telephone industry to refer to numeric digital channels.


>>> 2022-02-14 long lines in the Mojave

I have sometimes struggled to justify my love for barren deserts. Why is it that my favorite travel destinations consist of hundreds of miles of sandy expanse? Today, I'm going to show you one reason: rural deserts have a habit of accumulating history. What happens in the desert stays there---in corporeal form. Slow growth of vegetation, little erosion, and extraordinarily low property values turn vast, empty deserts into time capsules... if you spend enough time looking for the artifacts.

Also, we're going to talk about telephones! But first...

While the Mojave Desert was long open to homesteading claims, the extremely arid climate along with the distance to urban centers has always made the area challenging to put to use, and so has kept its population low. Places like Death Valley and Joshua Tree National Park are the best known Mojave destinations for their distinct geography and flora, but a different swath of the Mojave Desert began to attract a less formal sort of attention long ago. The development of the US Highway System, specifically highways 91 and 66, created a pocket of desert that was remote and yet still readily accessible from both Los Angeles and Las Vegas. Post-WWII, power sports (specifically dirt biking) led to significant use of and impact on open land along highways. Combined, these created a bit of a contradiction: the empty desert was getting a little too crowded.

Through a series of political and administrative moves, the area of the Mojave desert roughly defined by US-91 to the north (now I-15) and US-66 to the south (now I-40 although the alignments vary) became first the East Mojave National Scenic Area of the Bureau of Land Management (the first such National Scenic Area established) and then, in 1994, the Mojave National Preserve of the National Park Service [1]. It is the third largest unit in the national park system, and due to its vast size, history, character, and perhaps most of all, minuscule budget, it remains surprisingly undeveloped and untamed.

Roughly in the center of the Preserve is a tiny town called Kelso. Kelso was established by the Los Angeles and Salt Lake Railroad (later part of Union Pacific) as a railroad depot and base for "helper" locomotives added to trains to help them make it up a steep grade eastwards towards the next tiny settlement, Cima. During its distinguished life as a railroad town, from the 1910s to the 1980s, it also supported a few surrounding mines. Elsewhere in what is now the National Preserve, many small mines, a few large ones, and a few major ranches made up the entire tiny population of the region. The nearest present-day town, Baker, has become little more than a row of gas stations and a surprisingly compelling Greek diner. In an earlier era, the multi-hour trip to Baker on horseback was the only connection to the outside world available to miners and ranchers out in the desert. Back then it was like a different world, and today it largely still is.

Situated roughly halfway between LA and Vegas, in San Bernardino County, California, the Mojave National Preserve's more than one and a half million acres display two wars worth of military collaboration by the Bell System, two generations of long-distance telephone infrastructure, and four distinct types of long-distance telephone lines. There is perhaps nowhere else that one can gain as powerful of an appreciation for the feat that is the long-distance telephone call. A call to Los Angeles requires of you only dialing and waiting. First, though, it required teams of workers digging thousands of huge post holes by hand. The Mojave has been described as "a nowhere between two somewheres" [1]. This is true not only on the ground but also in the wires, as a large portion of telephone calls in and out of one of America's largest cities had to pass through one hundred miles of blowing sand. They still do today.

It's hard to say when, but we can safely say how the first telephones arrived in the Mojave: by rail. The railroads made extensive use of telegraphy and, later, telephony. By the 1920s the railroad depot at Kelso, and later some of the homes of railroad employees, were equipped with telephones on the Los Angeles and Salt Lake Railroad's (LA&SR) private system [2]. While railroad telephones operated on separate, wholly railroad-owned infrastructure that was not interconnected with the Bell System, railroad telephone departments enjoyed a close relationship with the Bell System and largely used the same techniques with equipment from the same manufacturers.

The LA&SR would have installed a series of multi-armed utility poles, likely as part of the original construction of the railroad. While these poles would have initially carried only telegraph circuits, they later gained telephone circuits, signal logic circuits, and even "code" circuits which used an early form of digital signaling to communicate with trackside equipment. Many of these circuits would have looked substantially similar to open-wire telephone leads, because they were: railroads employed the same open-wire design that AT&T used.

Railroad telephones went through generally the same technological progression as public telephones. The first equipment installed would have been magneto phones. To make a call, you would turn a crank on the phone which generated a high voltage "ringing" signal. Once an operator noticed and connected to the line, you asked the operator to connect your call. Individual phones were expensive to install. As a result, the Kelso depot started with only a single phone in the dispatcher's office, along with the telegraph. At some point, this telephone was placed in a specially built rotating cabinet that allowed the station agent in the dispatch office to spin it around, presenting it through the other side of the wall for someone in the lobby to take a call [2]. The clever pass-through phone was probably designed by a local worker as a practical solution to the problem that dispatchers often called the phone wanting to speak to visiting train crews, but railroad security policy forbade anyone other than a qualified agent in the dispatch office. The station agent must have quickly tired of relaying conversations sentence-by-sentence through the window.

Later, as the technology progressed and more resources became available, the railroad connected additional phones to other buildings. These extra extensions most commonly appeared in the homes of senior staff such as the station agent and track gang foreman; they would be the only way to reach someone at the depot (or, for that matter, in the entire town of Kelso) during an after-hours emergency. In this era an "extension" was a literal extension of the existing wiring; all of these phones would have rung together. Kelso Depot also featured another clever solution to the difficulties of reaching a remote employee before the widespread availability of radio: after the installation of electrical (CTC) signaling on the rail line, the dispatch office's semaphore display that once electromagnetically dropped flags to alert the agent that a train had passed a signal point approaching the station was rewired to drop a flag whenever the depot phone rang. This way, if the agent missed a call while attending to something outside of the office, they would at least know to call dispatch back when they returned [2].

Elsewhere, in 1915, the first transcontinental telephone line went into service. This line, generally from New York to San Francisco, passed through Northern California far from our area of interest. It ignited a fire that quickly spread, though, and the 1920s saw extensive construction of new long distance telephone lines in the West. In parts of Southern California, Pacific Telephone and Telegraph (PT&T) competed with the Home Telephone Company, until PT&T acquired it. Much of PT&T's advantage over Home was its status as a member of the Bell System: Home had no long distance service. PT&T did.

The history of this early era is difficult to construct, as very few records remain and most of the artifacts have been removed. Around the Mojave National Preserve, though, two PT&T long-distance open-wire toll leads survive. One roughly paralleled US-91 (now I-15) from the area of Los Angeles (probably San Bernardino) towards, ultimately, Salt Lake City. This was one of the most important connections between the West Coast and the rest of the country.

Another, further south, roughly paralleled US-66 (now I-40). This was the fourth transcontinental line, constructed around 1938. Its Western terminus was likely Whitewater, which was less a town and more a particularly important telephone office in the early AT&T system. Whitewater is notable for existing in the territory of General Telephone & Electronics or GTE, a particularly long-lived non-Bell competitive telephone carrier. When GTE was later acquired by a Bell operating company, the two adopted the name Verizon. Back in the pre-WWII era, GTE's dominance in far southern California meant that AT&T had few actual customers but a great need to move long-distance traffic. Whitewater was thus a bit of an oddity: a major telephone office, in the middle of nowhere, with no customers but a surplus of traffic. At the other end, this southern open wire lead probably connected into Bullhead City or Laughlin, in the area of Needles.

Sections of both open wire routes remain and can be seen today. The northern one includes a particularly spectacular crossing of the freeway, employing a technique seldom seen today in which two steel "messenger" ropes suspended between stout poles hold up a series of wooden frames that substitute for poles and crossarms. These "floating poles" supported the actual telephone wires for the long canyon crossing. Today, a single multi-pair cable hangs sadly from the five-arm frames, apparently placed to provide telephone service to a nearby mine after the removal of the open-wire route. On the southern route, remaining poles are clearly visible from US-66 during its departure from the present-day I-40 alignment near Cadiz (itself an oddity, a town owned by a natural resources company with long-failed plans to pump and sell the groundwater to LA).

At the dawn of WWII, these two leads represented some of the only long toll leads on the West Coast. Following the attack on Pearl Harbor, Japanese invasion, or at least opportunistic sabotage, was a major concern for military planners. Concerningly, the handful of telephone connections between major western cities had poor redundancy and were mostly near the coast. They would be easy for a small team of Japanese soldiers, delivered by boat under cover of night, to find and destroy. This easy move would effectively decapitate military leadership on the West Coast, impairing the ability of the United States to mount a defense. Here, years before the nuclear bomb or mutually assured destruction, survivable C2 and the telecom infrastructure to support it became a national priority [3].

In early 1942, and in collaboration with the War Department, PT&T embarked on a project that would rival the first transcontinental: a new wartime long distance route called the Defense Backbone Route, or DBR. Over a span of under a year, with a monumental level of effort ranging from logistical innovations to the sheer manpower of hundreds of laborers working from roving camps, the DBR's open-wire spanned nearly 900 miles from the Mojave desert to Yakima. Most importantly, its entire route was well inland (including a large stretch in Nevada), making it difficult for any military force arriving via the coast to reach. The DBR presaged later developments like the L-3I by providing a survivable toll lead dedicated to military use, particularly for the purposes of defense and reprisal.

The southern terminus of the DBR is sometimes described as Los Angeles, but that seems to have been only by interconnection to the existing open-wire lead along US-66, south of the Preserve. The DBR itself ends at a location optimistically described as Danby, California, although Danby is not much of a town and the end of the DBR is not that near it. The "Danby" station, consisting of one small building which presumably originally contained carrier multiplexing equipment, is still there today, and seems to still be in use with an added microwave antenna. As we will see, telephone infrastructure in rural areas is often reused for cost efficiency. It appears that there are still active customers served by a modern multipair telephone cable installed using the open-wire route's right of way, and the Danby station remains with a microwave link to provide local loop connections to these customers.

From Danby, the DBR continued northwest and then north, almost right through the middle of the present-day Preserve. It wanders through valleys, passing within a few miles of Kelso, before reaching the northern open-wire lead at US-91, which it joins to head towards Las Vegas. The poles and wire of the DBR remain largely intact within the Preserve and can be seen from Kelbaker Road (shortened from Kelso Baker Road), in Kelso and on Kelso Cima Road, and later where it crosses the Mojave Road and numerous dirt mine and ranch roads. It can even be seen from I-15 if you look carefully, although the span near I-40 has been removed.

Not so long after the DBR was constructed, and shortly after it was converted to civilian use, the owners of the Cima mine (a small cinder mine in what is now the eastern portion of the Preserve) contacted PT&T or AT&T to request a telephone. Under a California law, PT&T was required to furnish telephones for use in very rural areas where no other phones existed. At the time, it was a multi-hour trip from the Cima mine to reach Kelso or Baker where, in an emergency, help could be summoned. To improve safety, but moreover to comply with California law, AT&T came to the desert with a "toll station." Toll stations, an artifact of the early phone system no longer seen today, can be thought of as telephones that are a long distance call from anywhere. Toll stations were used to connect very rural areas, and were often located on party lines and required operator assistance to call. The reason for these oddities is simple: toll stations were connected directly to long-distance wires, not via local exchanges, and so they represented an odd exception in the general architecture of the telephone system. On the upside, they were far easier to install in very rural areas than conventional local telephone service.

In 1948, the Cima mine's requested phone was unceremoniously placed dead center in the middle of the desert. Located at the intersection of the DBR and a dirt access road, the new phone was still some way from the Cima mine but much closer than any town. It was a magneto phone connected directly to the DBR (likely using the "voice frequency" or non-carrier "channel" of one of the sets of pairs). Lifting the handset and turning the crank prompted a long-distance operator in San Bernardino to pick up and ask where the user wanted to call. If one wanted to call the phone, they would have to ask their operator to be connected to the San Bernardino long distance operator, and then ask for the phone by name. The long distance operator would ring the phone, and someone would have to be waiting nearby. It is said that some local user left a chair by the phone, as a convenience to those waiting for incoming calls. Telephone users would sometimes adopt a regular schedule of visiting the phone during set hours in case someone wanted to reach them. Locals driving by the phone on the dirt road would roll down their window, just in case it rang, so that they could take a message that would almost certainly be for someone they knew. While far from convenient, it was the only phone service for both mines and ranchers in the area [2,4].

This lonely phone would prove to have remarkable staying power. The owners of the Cima mine seem to have continued to use it as their main telephone into the '90s. Over time the phone was modernized to a more conventional payphone and given a more conventional switching arrangement and phone number. In 1997, after a series of chance discoveries, it came to be widely known in counter-cultural circles as the Mojave Phone Booth. It was difficult to comprehend: an aluminum and glass telephone booth, in a way a symbol of modernity and urbanism, sitting in a lonely desert impossibly far from the civilization it connected to [2,4].

The phone booth's sudden fame, and significant increase in users, led directly to its demise. Some combination of Park Service concern about environmental impact by visitors to the phone booth and upset by a local rancher who didn't appreciate the raucous visitors led to its removal in 2000. Today nothing remains of the Mojave Phone Booth except for the DBR itself. Its segment in the northern Preserve, apparently maintained to keep the single connection to the phone booth, is still in good shape today (albeit with only one crossarm remaining) [2]. Unfortunately, while the Mojave Phone Booth is widely described in media ranging from Geocities-esque phone phreaking websites to an episode of 99% Invisible, few people know that the cross-desert phone line its wires once hung off of was itself an oddity, an artifact of WWII which had been hidden from the Japanese in the desert. The Mojave Phone Booth was a contradiction in an even deeper way than it might first seem: a phone placed for convenient access along a phone line placed specifically to avoid convenient access. That is how you get a phone booth in the middle of nowhere.

Elsewhere along the DBR, World War II had ended but the Cold War was just beginning. Early in the Cold War the greatest fear was the delivery of nuclear weapons by bombers. Air defense, before mature radar technology, was a difficult problem. In the style successfully employed by the UK during the Battle of Britain, the Air Force established a Ground Observer Corps (actually the second, after one which operated during WWII). The Corps consisted of volunteers who, when activated, would search the sky by eye and ear for enemy aircraft and report them to a "filter center" where such reports were correlated and forwarded to the Air Defense Command.

Being the only thing for a very long ways, the Kelso railroad depot was an obvious choice for a Corps observation post. While not recorded, the volunteers were probably railroad employees who lived in railroad-provided housing in Kelso. There was one problem: the observers would need a telephone to report their sightings to the filter center, and the filter center was not on the railroad telephone network. As a result, the first public network telephone was installed in the lobby of the Kelso railroad depot in 1956 [1,2]. It is unclear today how exactly this phone was connected. I find it likely, although I cannot prove, that it was connected via railroad open-wire leased by AT&T and tied into an AT&T exchange in a larger town. It is also possible that it was a toll station attached to the DBR much like the Mojave Phone Booth, although inspection of the cabling which now exists from the DBR to Kelso suggests that it is a much newer connection than 1956.

In 1974, the Kelso depot phone was apparently still in service although it was likely connected differently (as we will later discuss). A railroad employee, responsible for the operation of the Depot, requested that PT&T move the phone outside under a covered walkway so that it would be accessible 24/7 after they introduced the practice of locking the depot doors at night (this on the advice of a UP police Special Agent who feared a midnight robbery of the depot's small restaurant, formerly 24/7 but by then closed nightly). There was, reportedly, also a phone at Kelso's small general store, across the street from the depot, which likely shared the line [2].

The DBR thus served two local telephones within the Preserve in addition to long-distance traffic. There is some reason to believe that the payphone, at least, was on a party-line circuit shared with phones installed in the homes of some local residents (probably ranch houses relatively near the DBR). The Kelso phone may have been as well, or may have been party-lined with other phones in railroad facilities if it was indeed on leased railroad open-wire. The general pattern of an open-wire toll lead with a single party line used to connect a few toll stations for local users was common in very rural areas at the time. Not just ranch houses and mines but also rural gas stations and businesses relied on this type of service, as toll leads often followed the same highways that rural businesses clustered near.

The Cold War would have a much bigger impact on the Mojave than a phone in the Kelso depot, although the introduction of telephone service to such remote sites as the Mojave Phone Booth and even the Kelso Depot (which did not even have an electrical connection, relying instead on a small on-site power plant operated by the railroad until 1960) was no small feat. The need for long-distance capacity between Los Angeles and the east had grown exponentially. More troubling, the genesis of nuclear weapons and the doctrine of mutually assured destruction created an exponentially greater need for fast, reliable, and survivable telecommunications. The "Flash Override" button on the President's telephone, intended foremost for use when ordering a nuclear strike, would be useless without a telephone network that could connect the President to their military branches... even after nuclear attack, and even through the remote Mojave.

Southern California, outside of its major cities, was rich with military installations (a result in part of the extensive use of the Mojave for WWII training) but poor in infrastructure. This created a particular challenge: in Southern California AT&T suddenly had a bevy of customers for AUTOVON (the Automatic Voice Network, a survivable military telephone system operated by AT&T on contract to the Department of Defense), but very few customers for civilian service and thus very little capability to connect new phones. The Mojave needed strong telephone infrastructure, and it needed it very fast.

I feel fairly confident in stating that the greatest single achievement in the construction of AUTOVON, if not of the entire history of survivable military communications, was the L-3I transcontinental telephone line. This coaxial cable, analog, high capacity toll lead, installed mostly in 1964, could carry thousands of calls from coast to coast. Moreover, it was completely underground and hardened against nearby nuclear detonations. Manned L-3I support facilities, which were found every 100 miles, were underground bunkers staffed 24/7 and equipped with supplies for staff to survive two weeks in isolation. Because it was impractical to harden such facilities against direct nuclear attack, their survivability relied in part on remoteness. The L-3I was intentionally routed through rural areas, well clear of likely targets for nuclear attack. At the same time, it needed reliable connectivity to military installations that were almost certainly targets. This required a hybrid approach of both the hardened L-3I and multiple, redundant connections of other types.

The L-3I transcon adopted a southern route in the western United States and dipped into the western edge of Texas before crossing through the Southwest: New Mexico, Arizona, and then Southern California. There was an underground L-3I "main station" at Kingman, Arizona, after which the cable crossed under the Colorado River at Riviera and proceeded almost directly west into the Preserve. The L-3I cable passed about eight miles north of Kelso, roughly parallel to and not far south of the Mojave Road. After a northern jog it turned back west south of Baker until it met US-91, now I-15, and followed the highway to Beacon Station (which appears to be a former CAA intermediary airfield) where the next L-3I main station is found just north of the freeway.

L-3I cables were buried underground (actually pushed directly into the ground by means of a special plow), but their presence was clearly indicated on the surface by repeater stations and right of way (ROW) markers. Repeater stations were installed in buried concrete vaults every four miles, but a metal hut was installed on top of each vault to house test equipment and provide technicians a clean and dry workspace. ROW markers consisted of frequently placed tall wooden posts with orange metal bands wrapped around the tops, intended both to help repair crews locate the cable in the ground and to warn farmers and construction workers to dig with care.

In the Mojave, though, none of this can be seen today. The L-3I cable saw use through the Cold War, carrying both military and civilian traffic. In the '80s, it was upgraded to carry a digital protocol called P140. The 140Mbps capacity of P140 was limited, though, and the L-3I cable required significantly more maintenance and support than the fiber optic technology increasingly used by 1990. In 1997, AT&T disclosed its intentions to fully abandon the L-3I cable west of Socorro (although portions of the ROW would be reused for fiber). In response, the NPS and BLM performed an environmental analysis on abandonment of the cable in federal land. The analysis revealed several potential long-term environmental impacts from not only the cable and repeaters but also the ROW markers themselves. The marker posts provided an ideal perch for birds of prey, in a desert environment that offered very few other tall objects. The effect was increased predation of small animals, a particular problem for several endangered species in the region.

To mitigate the problem, the NPS required an effort that was rather unusual for L-carrier routes: complete removal. Repeater huts, vaults, ROW marker posts, and the cable itself were all demolished and hauled away throughout most of Arizona and California. Today, all that remains of the L-3I in the Preserve is the still visible scar of the trenching and excavation, marked on many maps as "Utility Road."

The western destination of the L-3I cable was the L-3I Main Station at Mojave, around one hundred miles west of the Preserve, which despite the retirement of the L-3I has grown into a large complex that remains an important long-distance switching center today. The AUTOVON switch at Mojave connected via microwave and buried cable to a long list of cities and military installations in Southern California.

Microwave radio systems can move large numbers of telephone calls over point-to-point radio links. Because they avoid the need to run cable through many miles of land, microwave can be a much more affordable option for long-distance telephone infrastructure. As a downside, the microwave technology of the time (called TD-2) could carry fewer calls and required a line-of-sight of generally under 50 miles between stations. To extend this range, simple stations called relays received a signal on one side and transmitted it again out the other. The microwave antennas used at the time were large, heavy, and required fairly regular maintenance, all of which made very tall towers impractical. Instead, to find a line of sight, microwave facilities were built on peaks and in high mountain passes.
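As a rough illustration of what that range limit means for route planning, here is a back-of-the-envelope sketch. It assumes a perfectly straight route and treats 50 miles as a hard maximum hop length; the function name and the example distances are mine, and real routes were dictated by terrain and available peaks rather than simple division.

```python
import math


def relays_needed(route_miles, max_hop_miles=50):
    """Minimum intermediate relay stations for a straight-line route.

    Assumes every hop can be stretched to max_hop_miles, which real
    terrain rarely allows.
    """
    hops = math.ceil(route_miles / max_hop_miles)
    # A route of N hops has N - 1 stations between its two ends.
    return max(hops - 1, 0)


print(relays_needed(40))   # 0: a single hop, no relay required
print(relays_needed(120))  # 2: three hops, two relays in between
```

In practice the count would be higher, since stations had to go wherever a line of sight could actually be found.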

Much of the microwave telephone infrastructure in the area of the Preserve was built at the same time as the L-3I cable, as both were part of the same larger AUTOVON project. The L-3I was a high-capacity, high-durability, but high-cost backbone, while microwave links were a more affordable technology that used redundancy rather than physical strength to ensure reliability. The lower cost and greater flexibility of microwave also made it ideal for shorter connections between the telephone network and defense facilities, encouraging more local microwave stations. This is why the mid-1960s AUTOVON effort led to the creation of not one, but three independent east-west long-distance routes through the area of the Preserve: the L-3I cable, a northern microwave route, and a southern microwave route.

Although they were relatively inexpensive, microwave stations were not left undefended. AUTOVON microwave facilities were above ground but used hardened building techniques including thick concrete walls, blast shielded vents, and reinforced towers and antennas to survive nuclear strikes at a moderate distance. Most microwave stations were simple relays that operated unattended except for periodic visits by maintenance technicians, but larger stations with switching equipment were staffed 24/7 and supplied for two weeks of isolation, much like L-3I main stations.

The center of the microwave network in the Mojave, if not all of Southern California, was a remote mountaintop site called Turquoise. Located just north of the Preserve, about five miles north of I-15 at Halloran Springs, Turquoise was a staffed facility with an AUTOVON switch. Its squat square tower bristled with horn antennas. Every day several shifts of exchange technicians made their way up Halloran Springs Road to the site and supervised the switching of military and civilian calls to destinations throughout southern California, as well as onto the L-3I and other cables for cross-country transit. Turquoise had direct or indirect connections to four different major long-distance microwave routes. As a secondary function, Turquoise included Ground Entry Point antennas for a system called ECHO FOX that provided radiotelephone service to Air Force One... directly to AUTOVON, for use in a nuclear emergency if necessary [6].

One pair of antennas on Turquoise's crowded tower pointed south towards a station called Kelso Peak. Kelso Peak, located in the Preserve a little under ten miles northwest of Kelso, served as a relay station on both a north-south route (north to Turquoise) and an east-west route (west to Hector, a relay not very close to anything but perhaps closest to Ludlow).

To the east, Kelso Peak connected to Cima, another AT&T relay in the Preserve. Cima station sits on a hill five miles due east of the town of Cima, and relays traffic northeast to a station in Nevada, almost to Las Vegas, charmingly called Beer Bottle.

To the south, Kelso Peak connects to the Granite Pass station. Granite Pass is directly next to Kelbaker Road 14 miles south of Kelso. Across Kelbaker Road from the Granite Pass microwave station is a much smaller tower installed by the National Park Service to relay internet and phone service to Kelso today. South from Granite Pass, the next station is Sheep Hole, south of the Preserve and northeast of Twentynine Palms.

Putting these stations together, the northern east-west telephone route through the Mojave runs north of the Preserve mostly parallel to I-15. The southern one (actually perhaps better called the "middle" route as there is yet another further south and nearer to Joshua Tree National Park) runs through the center of the Preserve, and would be visible on the horizon just north of Kelso if you could see microwaves. The north-south route runs directly through the western side of the Preserve.

Construction of these stations in 1964 was a formidable project. Crews from Southern California Edison installed many miles of new power lines, starting from Kelso and running outwards, to bring power to the Granite Pass and Kelso Peak stations (Kelso itself had only been connected to the grid a few years earlier). The microwave stations, like the L-3I cable, were built by PT&T crews for AT&T Long Lines. Crews from both SoCal Edison and PT&T occupied employee hotel rooms at the Kelso Depot while performing work, often for months-long stays that irritated the station agent and stretched the depot's capacity to house and feed [2].

Each of the AT&T stations in the Preserve, like others in the Mojave, included an unusually large gravel apron around the facility. This leveled gravel lot served as a helipad; due to the remoteness of the facilities, maintenance technicians were delivered by helicopter. The sandstorms which sometimes occur in the Preserve posed a maintenance and reliability challenge, and maintenance crews were kept busy.

Along with the L-3I and TD-2 came the end of open-wire, but in such a remote area it's hard to really tear out old infrastructure. Instead, when the decision was made to decommission much of the open wire by 1989, alternative arrangements were made for the few local customers once served by the DBR. South of Granite Pass and north of Kelso portions of the DBR were removed, but the Granite Pass microwave station was connected to the DBR open-wire as it passed by. East of Kelso where the DBR crossed the railroad, a section of new wire was used to connect pairs on the DBR to pairs on the railroad open-wire, which was removed except between the DBR and Kelso. The result was a direct local loop from the Kelso phones to the Granite Pass microwave station, a very unconventional setup to avoid the need for a new phone line to Kelso. A 1924 rail depot connected to a 1964 microwave station by reuse of a 1942 open-wire toll lead: the kind of thing you run into in the desert.

The phone booth seems to have found an alternate arrangement. The DBR was removed for a span north of Kelso, disconnecting the phone booth from the span to Granite Pass. Instead, the phone booth seems to have been reconnected along old DBR wire pairs to somewhere further north, likely Baker, where the phone booth had been assigned a regular local number.

Because of these two legacy arrangements, large spans of the DBR have remained intact in the Preserve to this day. On Kelso Cima Road just east of Kelso, the intersection of the DBR and the railroad can be seen, including the somewhat awkward interconnection of the DBR to the railroad open wire. Just north of this point the DBR abruptly ends, the remaining wires tied around the base of the last pole. The DBR is only absent for about two miles, though: follow its route north and the poles will start again just as abruptly as they ended. 16 miles further, the ghost of the phone booth sits under the poles of the former DBR. Look carefully and you can see many details of this old infrastructure. I have posted a few photos I took at https://pixelfed.social/jbcrawford, although I intend to get better and more thorough ones on a future trip.

Today, the original TD-2 microwave equipment is long removed, and some of the large KS-15676 horn antennas have been removed as well (although they remain at some sites including the highly visible Granite Pass). Even so, radio sites, once built, have a tendency to live on. Most of these microwave sites are still in use, either by a telco or under ownership of a leasing company such as American Tower. The remoteness of the Mojave means that radio remains an important technology and many of these microwave sites still carry telephone calls, using more modern equipment, either as the primary route to some rural telephone exchanges or as a backup in case of damage to buried fiber optic lines. The late life of these facilities can sometimes be confusing. At Granite Pass, a much newer tower and small utility enclosure on the west side of Kelbaker Road, next to the small NPS relay, are used by AT&T for telephone service. The original AT&T tower on the east side of the road is no longer used by AT&T but lives again as a relay site for the Verizon Wireless microwave backhaul network, which provides cell towers their connection to the rest of the phone system. Many microwave sites have also been reused as cellular towers.

Various newer telecommunications infrastructure can be found within the Preserve as well. At the town of Cima, a large radio tower erected by Union Pacific provides radio communications with trains and trackside equipment. Smaller towers found up and down the railroad link trackside equipment together as a replacement for the old open-wire lines. Just south of I-40 near the Essex exit, a modern small microwave relay site provides backhaul communications for several cellular carriers on solar power alone. At Goffs Butte, a conspicuous cinder cone south of Goffs, a busy radio site includes cellular and telephone microwave relays alongside broadcast radio stations. Cellular towers at Baker, Mountain Pass, Goffs, Ludlow, and others now provide coverage to some, but far from all, of the Preserve.

There is a very real sense, though, in which modern telecommunications technology has still failed to tame the desert. Satellite networks such as Globalstar and Iridium can be reached throughout the Preserve, but slowly and at significant cost. Cellphones are unreliable to unusable in many parts of the Preserve, and there are few landline phones to be found. Despite all of this infrastructure, the Mojave is still far from civilization. That's another great thing about the open desert, besides the memories it keeps: it's hard to get to, and even harder for anyone to bother you once you're there.

[1] "From Neglected Space to Protected Place: An Administrative History of the Mojave National Preserve," Eric Charles Nystrom for the National Park Service, 2003.

[2] "An Oasis for Railroaders in the Mojave: The History and Architecture of the Los Angeles and Salt Lake Railroad Depot, Restaurant, and Employees Hotel at Kelso, California, on the Union Pacific System," Gordon Chappel et al. for the National Park Service, 1998.

[3] "DBR: 'Damned Big Rush' or the Building of the Defense Backbone Route," The Electric Orphanage. https://the-electric-orphanage.com/the-damned-big-rush-or-building-of-the-defense-backbone-route/

[4] Correspondence, Telephone Collectors International mailing list.

[5] "Draft Environmental Impact Statement: Mojave National Preserve, P140 Coaxial Cable Removal Project," National Park Service, 1997.

[6] "Turquoise, California AT&T Long Lines Site," Path Preservation. http://www.drgibson.com/towers/turquoise.html


>>> 2022-01-24 the smart modem

I think I've mentioned occasionally that various devices, mostly cellular modems, just use the Hayes or AT command set. Recently I obtained a GPS tracking device (made by Queclink) that is, interestingly, fully configured via the Hayes command set. It's an example of a somewhat newer trend of converging the functionality of IoT devices into the modem baseband. But what is this Hayes command set anyway?

Some of you are no doubt familiar with the "acoustic coupler," a device that has two rubber cups intended to neatly mate with the speaker and microphone of a telephone handset. The acoustic coupler allowed a computer modem to be connected to the telephone system via audio instead of electrically, which was particularly important because, pre-Carterfone, nothing could be connected to the telephone system that was not leased from the telco. Acoustic couplers were also just convenient: back in the days when all equipment was leased from the telco, phones were fairly expensive and most houses did not yet have a panoply of telephone jacks, so it was generally handy to be able to use a normal phone with a computer modem without having to swap around cabling.

Unfortunately, this scheme had a major limitation: the computer interacted with the telephone just like you would, via audio. The computer had no way, though, of taking the phone on or off hook or dialing. That was all up to the user. So, you'd pick up the phone, dial a number, and then set the phone down on the acoustic coupler. When you were done, you would take the phone off of the coupler and hang it back up. Besides being a bit of a hassle and sometimes prone to mistakes, this effectively ruled out any kind of automatic or scheduled modem usage.

Through the '70s, modems capable of automatic dialing and on/off hook were available but were expensive, large machines intended for commercial-scale use. For example, they were somewhat widely used by retail point of sale systems of the era to send regular reports back to corporate headquarters for accounting. For the home computer enthusiast, there were essentially no options, and among other implications this ruled out the BBS ecosystem that would emerge later since there was no way for a computer to automatically pick up the line.

Everything changed in 1981. Actually, the first fully computer-controlled modem came somewhat earlier, but because it was designed specifically for S-100 computers (like the Altair) and later Apple II, its popularity was limited to those platforms. Hayes, the same company that developed this early internal modem, released the Hayes Smartmodem in '81---which truly started the PC modem revolution. The basic change from their earlier internal modems was that the Smartmodem interfaced with the host computer via serial. RS-232-esque-ish serial ports were by this time ubiquitous on microcomputers, so the Smartmodem could be used with a huge variety of hardware.

It might be surprising that a modem that allowed programmatic control of the hook and dialing took so long to come around. It might be more obvious why if we think about the details of the modem interface to the host PC. The task of a modem is, of course, to send and receive data. In order to do so, modems have traditionally acted like transparent serial channels. In other words, modems have behaved as if they were simply very long serial cables between two computers. Whatever data was sent to the modem it transmitted, and whatever data it received it returned on the serial interface.

We could thus refer to the serial connection to the modem as being the data plane. How is the modem commanded, then? Well, originally, it wasn't... the user had to handle all aspects of call control manually. To bring about automatic call control, Hayes had to come up with a command set for the modem and a way to send those commands. Hayes's solution is one that vi users will appreciate: they created two modes. A Hayes Smartmodem, in data mode, acted like a normal modem by simply sending and receiving data. A special escape sequence, though, which defaulted to "+++", caused the modem to change to command mode. Once in command mode, the computer could send various commands to the modem and the modem could reply with status information. The modem would switch back to data mode either after an explicit mode switch command or implicitly after certain connection setup commands.
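The two-mode idea can be sketched as a tiny state machine. This is a toy model and not Hayes firmware: the class and method names are made up for illustration, each write from the host is treated as one whole chunk, and it ignores timing, which a real modem would have to consider in order to tell an escape sequence from ordinary data that happens to contain the same characters.

```python
class TwoModeModem:
    """Toy model of the data/command split; not real Hayes firmware."""

    ESCAPE = "+++"  # the default escape sequence

    def __init__(self):
        self.mode = "data"
        self.sent_to_line = []  # text that would go out over the phone line
        self.commands = []      # text the modem would interpret itself

    def write(self, text):
        """Handle one write from the host computer."""
        if self.mode == "data":
            if text == self.ESCAPE:
                self.mode = "command"  # escape seen: stop passing data through
            else:
                self.sent_to_line.append(text)
        else:
            self.commands.append(text)

    def resume_data(self):
        """Stand-in for whatever command or call setup returns to data mode."""
        self.mode = "data"


modem = TwoModeModem()
modem.write("hello")         # passed through to the far end
modem.write("+++")           # switches mode, never transmitted
modem.write("some command")  # interpreted by the modem, not transmitted
modem.resume_data()
modem.write("world")
print(modem.sent_to_line)    # ['hello', 'world']
print(modem.commands)        # ['some command']
```

The point of the sketch is that the same serial channel carries both kinds of traffic, and only the modem's current mode decides which is which.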

All commands to a Hayes modem began with the letters "AT". There are a few reasons for this. Perhaps most obviously (certainly to any vim users), the use of two distinct modes creates a huge opportunity for "mode errors" in which the modem is somehow not in the mode that the software controlling it thinks it is. Prefixing all command strings with "AT" serves as an additional check that a line of text is intended to be a command and is not actually data errantly sent during command mode, which might cause the modem to take all kinds of strange actions. Second, AT was used for automatic baud detection and clock recovery in the modem, since it was a known bit sequence that would be sent to the modem after the modem first powered on and before it was used to make a call.

It's because of this "AT" prefix, which in principle stands for "attention," that the Hayes command set is commonly referred to as the AT commands. If either Hayes or AT rings a bell, it will be because the influence of the Hayes Smartmodem on the computer industry has been incredibly long lasting: essentially all telephone network modems, whether landline or cellular, continue to use the exact same Hayes interface. In most cases, the operating system on your smartphone is, as we speak, using the Hayes command set to interact with the cellular baseband. If you buy an LTE module for something like an IoT application, you will need to send it Hayes commands for setup (under Linux the ModemManager daemon is responsible for this background work). If you use a USRobotics USB telephone modem, well, you will once again be using the Hayes command set, but then that's less surprising.

Let's take a quick look at the Hayes commands. The format of them is somewhat unconventional and painful by modern standards, but keep in mind that it was intended for easy implementation in '80s hardware, serial interfaces were slow back then, and in general it is designed more for economy than user-friendliness. On top of that, it's been hugely extended over the years since, meaning that the awkward "extension" commands are now very common.

A Hayes command string starts with "AT" and ends with a carriage return (line feed may or may not be used depending on configuration). In between, a command string can contain an arbitrary number of commands concatenated together---there is no whitespace or other separation, rather the command format ensures that it is possible to determine where a command ends. You can imagine this has become a bit awkward with extension commands and so, in practice, it's common to only put one command per line except for rather routine multi-step actions.

The basic commands consist of a single capital letter which is optionally followed by a single digit. Most of the time, the letter indicates an action while the digit indicates some kind of parameter, but there are exceptions. Some commands take an arbitrary-length parameter following the command, and some commands accept a letter instead of the one trailing digit. Actually even the original Hayes command set is so inconsistent that it's hard to succinctly describe the actual syntax, and now it's been added on to so many times that exceptions outweigh the rules. It might be easier to just look at a few examples.
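As a rough illustration of the "one letter, optional digit" rule, here is a small Python sketch that splits a command line into basic commands. The function name is mine, and it deliberately handles only the regular case: anything that doesn't fit the pattern (dial strings, registers, extension commands) is rejected rather than guessed at.

```python
import re

# One capital letter optionally followed by one digit, e.g. "A" or "M1".
BASIC_COMMAND = re.compile(r"[A-Z][0-9]?")


def split_basic_commands(line):
    """Split e.g. 'ATM1A' into ['M1', 'A']; reject anything irregular."""
    text = line.strip().upper()
    if not text.startswith("AT"):
        raise ValueError("command lines must start with AT")
    body = text[2:]
    commands = BASIC_COMMAND.findall(body)
    # If the matches do not cover the whole body, something in the line
    # was not a plain letter-plus-digit command.
    if "".join(commands) != body:
        raise ValueError("line contains more than basic commands: %r" % line)
    return commands
```

Note that a dial string like "ATDT5055551234" already fails this check, which is a decent demonstration of how quickly the exceptions show up.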

To do perhaps the most obvious thing, instruct the modem to go off hook and dial a telephone number, you send "ATDT" (D=Dial, T=Touch-Tone) followed by a string which specifies the phone number... and can also contain dialing instructions such as pausing and waiting for ringback. For example when dialing into a PABX that uses extensions, you might use "ATDT5055551234@206;". This tells the modem to dial (D), touch-tone (T), 505-555-1234, wait until ringback (@), dial 206, then stay in command mode (;). Without the semicolon, the D command usually causes an implicit switch to data mode.
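The dial string is easier to follow when broken into steps. The following Python sketch does that for the example above. It is a simplification under my own assumptions: the function and table names are made up, it only knows a handful of modifiers, and the "P" (pulse) and "," (pause) entries are additions that the example above doesn't use.

```python
# Dial modifiers this sketch understands. "T", "@" and ";" come from the
# example above; "P" and "," are extra entries added for illustration.
MODIFIERS = {
    "T": "use touch-tone dialing",
    "P": "use pulse dialing",
    ",": "pause",
    "@": "wait for ringback",
    ";": "stay in command mode",
}


def describe_dial_string(command):
    """Break a simple ATD command string into human-readable steps."""
    body = command.strip().upper()
    if not body.startswith("ATD"):
        raise ValueError("not a dial command: %r" % command)
    steps = []
    digits = ""
    for ch in body[3:]:
        if ch.isdigit() or ch in "*#":
            digits += ch  # accumulate a run of dialable characters
            continue
        if digits:
            steps.append("dial " + digits)
            digits = ""
        if ch not in MODIFIERS:
            raise ValueError("unsupported dial modifier: %r" % ch)
        steps.append(MODIFIERS[ch])
    if digits:
        steps.append("dial " + digits)
    return steps
```

Calling describe_dial_string("ATDT5055551234@206;") returns the five steps in order: touch-tone, dial 5055551234, wait for ringback, dial 206, stay in command mode.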

Answering a call is simpler, since there are fewer parameters. The "A" command is answer. The command string would sort of technically be "ATA0" since A ostensibly conforms to the "one letter one digit" convention, but when the digit is 0 it can be omitted.

But wait... how would the computer know that the modem is "ringing" in order to answer? Well, for that you'll have to jump back to the post on RS-232, and study up on why the Hayes Smartmodem used a 25-pin connector. There's just a dedicated wire to indicate ringing, as well as a dedicated wire to indicate when the modem is ready to move data (i.e. when a data carrier is present). The serial interface in the computer was expected to expose the state of these pins to software as needed.

Some of you may remember that, in the days of dial-up, it was common to hear the modem dial and negotiate the data connection aloud. This too dates back to the Hayes Smartmodem, and it's somewhat related to the reason that fax machines usually provide a handset. If you misdial or there is a problem with the destination phone number or one of a number of other things, you may get an intercept message or someone answering or some other non-modem audio upon the call connecting. The Smartmodem featured a speaker to allow the user to hear any such problems, but of course few users wanted to listen to the whole data session. The Hayes "M" command allowed the host computer to set the behavior of the speaker, and "ATM1" was commonly sent which caused the modem to enable the built-in speaker until a data carrier was established, at which point it was muted.

The Hayes Smartmodem also included a number of registers in which configuration could be stored in order to affect the behavior of later commands. For example, the duration of a standard dialing pause could be adjusted by changing the value in a register. The "S" command allowed for selecting a register (e.g. ATS8 to select register 8), and the "?" and "=" commands could be used to query and set the value of a register. "=" of course took an argument, and so "ATS8=8" could be used to set the pause duration to 8 seconds. This might look like one long command but it's not, we could just as well send "ATS8" followed by "AT=8". The = is a command, not an operator.
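The point that "=" is its own command, acting on whichever register was last selected, is easy to show with a toy register file in Python. Everything here is an illustrative assumption (the class name, the default value of register 8, returning replies as a list); it only understands the S, = and ? commands.

```python
class SRegisters:
    """Toy S-register file: S selects, = sets, ? queries."""

    def __init__(self):
        self.values = {8: 2}  # assumed default pause of 2 seconds in register 8
        self.selected = 0     # register the next = or ? will act on

    def execute(self, line):
        """Run one AT line and return a list of queried values."""
        text = line.strip().upper()
        if not text.startswith("AT"):
            raise ValueError("command lines must start with AT")
        replies = []
        i = 2
        while i < len(text):
            command = text[i]
            i += 1
            # Collect the decimal argument (if any) that follows the command.
            j = i
            while j < len(text) and text[j].isdigit():
                j += 1
            number = int(text[i:j]) if j > i else 0
            i = j
            if command == "S":
                self.selected = number
            elif command == "=":
                self.values[self.selected] = number
            elif command == "?":
                replies.append(self.values.get(self.selected, 0))
            else:
                raise ValueError("unsupported command: %r" % command)
        return replies
```

Sending "ATS8=8" as one line, or "ATS8" and then "AT=8" as two, leaves the register in the same state, because the selection persists between lines.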

As modems became faster and more capable and gained features, the Hayes command set gained many additions and variants. While the core commands remain very consistently supported, the prefixes "&", "%", "\", and "+" are all used to indicate various extended commands. Some of these are defined by open standards, while others will be proprietary for the modem manufacturer. For example, the GSM standard specifies extended Hayes commands useful for interacting with cellular modems: "AT+CSQ" can be used to ask a cellular modem for the current signal strength (RSSI). The "+" prefix is, in general, used for ITU-standardized additional commands, and "+C" used for commands related to cellular modems. You'll see these prefixes very frequently today, as the Hayes command set is more and more seen in the context of cellular modems rather than telephone modems.
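For a sense of what comes back, a modem typically answers "AT+CSQ" with a line like "+CSQ: 18,99", where the first number is a signal strength index and the second is a bit error rate index. Here is a short Python sketch that turns that reply into dBm. The function name is mine, and the mapping (index 0 to 31 meaning -113 dBm to -51 dBm in 2 dB steps, 99 meaning unknown) follows the usual 3GPP definition; treat it as an approximation, since the end values really mean "at or below" and "at or above".

```python
import re


def parse_csq(response):
    """Return approximate RSSI in dBm from a +CSQ reply, or None if unknown."""
    match = re.search(r"\+CSQ:\s*(\d+),(\d+)", response)
    if match is None:
        raise ValueError("no +CSQ reply found in %r" % response)
    index = int(match.group(1))
    if index == 99:
        return None  # the modem does not know or cannot measure the signal
    if not 0 <= index <= 31:
        raise ValueError("signal index out of range: %d" % index)
    return -113 + 2 * index
```

With the example reply above, parse_csq("+CSQ: 18,99") works out to -113 + 36, or -77 dBm.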

Of course, "+CSQ" being a command seems to violate the syntax I explained earlier for Hayes commands, and vendor proprietary commands frequently take this much further by introducing multi-parameter commands with parameter separators and all types of lengthier command names. For example, for a personal project I wrote software around a Telit LTE module that made use of the command string "AT#CSURV" (note non-standard prefix "#"). This command causes the modem to search for nearby cells and return a listing of cells with various parameters, which is useful for performing site surveys for cellular network reliability.

Many modern cellular modems have GPS receivers built-in, and it's possible to use the GPS receiver via Hayes commands. On the Telit module, a command string of "AT$GPSACP" causes the modem to return the current position, while the command string "AT$HTTPGETSTSEED=1,2199" (note two parameters) can be used to command the embedded GNSS module to load AGPS data from an HTTP source (the details of AGPS will perhaps be a future topic on this blog).

Brief tangent: some of you may be aware (perhaps I have mentioned it before?) that dialing emergency calls on GSM and LTE cellphones is, well, a little weird. Much of that is because the GSM specifications have built-in support for emergency calling, independent of phone numbers, that is intended to allow cellular phones to present a consistent emergency calling method regardless of the dialing conventions of a country/area the user might be roaming in. The exact commands are unstandardized, but on the Telit module "AT#EMRGD" initiates an emergency call (note that no phone number is specified) while "AT#EMRGD?" (it is a common convention in extended AT commands for a trailing "?" to change the command to a status check) causes the modem to report which phone numbers the GSM network has indicated should be used for different types of emergency calls---chiefly for display to the user. This is why dialing common international emergency numbers like 999 and 110 on a US cellular phone still results in a connection to 911---in actuality no dialing happens at all. When the dialer app determines that the number entered appears to be an emergency call, it instead issues an AT command with no phone number at all. Part of the reason for this is enhanced GSM features for position reporting, which relate to what is called "e911" in the US and provide essentially a basic, slow data channel between a cellular phone and a PSAP that can be used by the phone operating system to report a GPS position to the PSAP. There are, of course, a half dozen AT commands on most cellular modems to facilitate this [1].

Now keep in mind that all of these commands happen over a channel that is also intended to send data. So, after dialing a call or by issuing the command string "ATO" (go online) further data sent over the serial connection will instead go "through" the modem to the other end. In practice, though, mode switching introduces a set of practical problems (not least of which is having to make sure the escape sequence "+++" does not appear in data) and so most modern modems actually don't do it any more. Instead, the Hayes protocol serial connection is usually used purely for modem commanding and a separate data channel is used for payload.

This is clearest if we look at the most common modern incantation of Hayes commands, a cellular modem connected to a host running Linux. Traditionally, ModemManager would issue a set of commands to the modem to set up the connection after which it would place the modem into data mode and then trigger pppd to establish a ppp connection with the modem serial device. In practice, most cellular modems today are "composite devices" in some sense (i.e. present multiple independent data channels, whether physically or as a virtual product of their driver) and appear as both a serial device and a network interface. The serial device is for Hayes commands, the network interface is, well, a plain old network interface, which makes network setup rather easier than having to use PPP. There are various ways that this happens mechanically; in the case of USB modems it is usually by presenting a composite USB device that includes some type of network interface profile like CDC Ethernet [2].

In fact, a lot of modems don't just present a serial interface and a network interface... it's not unusual for modems to present several. One will be for Hayes commands, but there's often a second to be used as a dedicated channel for PPP over serial (in case a different method of network connection isn't used) and then a third dedicated to GPS use. Since applications often want regular (unsolicited) updates from the GPS module and it's a bit silly to have to constantly poll via Hayes command or switch modes around, it's common for LTE modems to allow the host to issue a Hayes command that enables unsolicited GPS updates, after which they continuously generate GPS fix messages on a dedicated channel. These are usually in NMEA format, a widespread standard for GNSS information over simple serial channels that was originally developed to allow a single GNSS receiver on a boat to disseminate position information to multiple navigation devices. Yes, specifically a boat---NMEA is the National Marine Electronics Association, but they came up with a solid standard first and everyone else has copied it.
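If you end up reading that NMEA stream yourself, the first thing you'll want is the checksum. An NMEA sentence starts with "$", ends with "*" and two hex digits, and the hex digits are the XOR of every character in between. Here is a minimal Python sketch of that check; the function names are mine, and the sentence payload in the usage note is a made-up example rather than output from any particular modem.

```python
def nmea_checksum(payload):
    """XOR every character between '$' and '*', as two uppercase hex digits."""
    value = 0
    for byte in payload.encode("ascii"):
        value ^= byte
    return "%02X" % value


def frame_sentence(payload):
    """Wrap a payload such as 'GPGLL,...' into a complete NMEA sentence."""
    return "$%s*%s" % (payload, nmea_checksum(payload))


def is_valid_sentence(sentence):
    """Check that a received sentence carries the checksum it claims."""
    sentence = sentence.strip()
    if not sentence.startswith("$") or "*" not in sentence:
        return False
    payload, _, claimed = sentence[1:].rpartition("*")
    return nmea_checksum(payload) == claimed.upper()
```

For instance, is_valid_sentence(frame_sentence("GPGLL,3505.0,N,10639.0,W")) is true, and changing any character of the framed sentence makes it false.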

Despite the partial shift away, Hayes commands have a lot of staying power due to their simplicity. Some devices are going in the direction of using more Hayes commands, since it can actually eliminate the need for any "data channel proper" in some cases. Many LTE modems oriented towards IoT or industrial use provide extension Hayes commands that perform high level actions like "make a POST request". The modem implements HTTP internally so that developers of embedded devices don't have to. The Telit modules even support HTTPS, although setting up the TLS trust store is a bit of a pain.

The latest hotness in cellular modules is the ability to load new functionality at runtime. IoT LTE modems made by Digi, for example, include extra non-volatile and volatile storage and a MicroPython runtime so that business logic can run directly in the modem. You can bet that there are Hayes commands involved.

So 40 years later, a huge variety of modern electronics using cutting-edge cellular data networks are still, at least for initial setup, pretending to be a 300-baud Hayes Smartmodem. Maybe you can still find a case out there where coercing another computer to attempt to send "+++" followed by, after a sufficient pause, "ATH" will cause it to drop off the network.

A final tangent on Hayes commands, and what brought them to my mind: through a combination of good luck and force of will I have managed to get a dealership to take my money for a new car (this proved astoundingly difficult). Since Albuquerque is quite competitive in its effort to regain the recently lost title of Car Theft Capital of the USA, I have fit it with a tracking device. This device, made by a small Chinese outfit, runs the entirety of its logic within a modem with extended firmware. This is sometimes called a "microcontroller-less" design, although obviously the modem is essentially functioning as a microcontroller in this case. For configuration, the tracker exposes the modem's Hayes serial interface on an external connector, and the vendor provides a software tool that generates very long Hayes command strings to configure the tracker behavior (endpoint, report frequency, immobilize logic, etc). It's possible to use AT commands on this interface to send and receive SMS, for example, which makes the tracker a more flexible device than it advertises.

Actually, I lied, one more tangent: Wikipedia notes that the Smartmodem used a novel design of an extruded aluminum section that the PCB slid into, and a plastic cap on each end. This was an extremely common case design for '90s computer accessories. Cheaper plastic injection molding seems to have mostly killed it off, but it was super convenient to take these types of cases apart and I rather miss them now.

[1] In fact a new and somewhat upcoming GSM feature called "eCall" enables "data-only" emergency calls, mostly intended for use by in-vehicle assistive technologies that may connect an emergency call and then send a position and status report under the assumption that the occupants may be incapacitated and unable to speak.

[2] Note that newer modems and operating systems are starting to use MBIM more often, a newer USB profile that includes a newer command channel. If you have an LTE modem and do not see the expected Hayes serial device, MBIM may be the reason... but on most modems a Hayes command must be issued to switch the modem to MBIM mode, so the Hayes command set is still in use even if only briefly.
