   _____                   _                  _____            _____       _ 
  |     |___ _____ ___ _ _| |_ ___ ___ ___   |  _  |___ ___   | __  |___ _| |
  |   --| . |     | . | | |  _| -_|  _|_ -|  |     |  _| -_|  | __ -| .'| . |
  |_____|___|_|_|_|  _|___|_| |___|_| |___|  |__|__|_| |___|  |_____|__,|___|
  a newsletter by |_| j. b. crawford               home archive subscribe rss

>>> 2021-07-26 rip those bits to shreds

Programming note: you may have noticed that computer.rip has been up and down lately. My sincere apologies; one of the downsides of having a neo-luddite aversion to the same cloud services you work with professionally all day is that sometimes your "platform as a physical object" (PaaPO) starts exhibiting hardware problems that are tricky to diagnose, and you are not paid to do this, so you are averse to spending a lot of your weekend on it. Some messing around and remote hands tickets later, the situation seems to have stabilized, and this irritation has given me the impetus to get started on my plans to move this infrastructure back to Albuquerque.

Let's talk a bit about something practical. Since my academic background is in computer security, it's ego-inflating to act like some kind of expert from time to time. Although I have always focused primarily on networking, I also have a strong interest in the security and forensic concerns surrounding file systems and storage devices. Today, we're going to look at storage devices.

It's well known among computing professionals that hard disk drives pose a substantial risk of accidental data exposure. A common scenario is that a workstation or laptop is used by a person to process sensitive information and then discarded as surplus. Later, someone buys it at auction, intercepts it at a recycler, or similar, and searches the drive for social security numbers. This kind of thing happens surprisingly frequently, perhaps mostly because the risk is not as widely understood as you would think. I have a side hustle, hobby, and/or addiction of purchasing and refurbishing IT equipment at auction. I routinely purchase items that turn out to have intact storage, including from government agencies.

So, to give some obvious advice: pay attention to old devices. If your organization does not have a policy around device sanitization, it should. Unfortunately the issue is not always simple, and even organizations which require sanitization of all storage devices routinely screw it up. A prominent example is photocopiers: for years, organizations with otherwise good practices were sending photocopiers back to leasing companies or to auction without realizing that most photocopiers these days have nonvolatile storage to which they cache documents. So having a policy isn't really good enough on its own: you need to back it up with someone doing actual research on the devices in question. I have heard of a situation in which a server was "sanitized" and then surplussed with multiple disk drives intact because the person sanitizing it didn't realize that the manufacturer had made the eccentric decision to put additional drive bays on the rear of the chassis!

But that's all sort of beside the point. We all agree that storage devices need to be sanitized before they leave your control... but how?

Opinions on data sanitization tend to fall into two camps. Roughly, those are "an overwrite is good enough" and "the only way to be sure is to nuke it from orbit." Neither of these positions is quite correct, and I will present an unusually academic review here of the current state of storage sanitization, along with my opinionated advice.

The black-marker overwrite

The most obvious way to sanitize a storage device, perhaps after burying it in a hole, is to overwrite the data with something else. It could be ones, it could be zeroes, it could be random data or some kind of systematic pattern. The general concept of overwriting data to destroy it presumably dates back to the genesis of magnetic storage, but for a long time it's been common knowledge that merely overwriting data is not sufficient to prevent recovery.

A useful early illustration of the topic is Venugopal V. Veeravali's 1987 master's thesis, "Detection of Digital Information from Erased Magnetic Disks." It's exactly what it says on the tin. The paper is mostly formulae by mass, but the key takeaway is that Veeravali connected a spectrum analyzer to a magnetic read head. They showed that the data from the spectrum analyzer, once subjected to a great deal of math, could be used to reconstruct the original contents of an erased disk to a certain degree of confidence.

This is pretty much exactly the thing everyone was worried about, and various demonstrations of this potential led to Peter Gutmann's influential 1996 paper "Secure Deletion of Data from Magnetic and Solid-State Memory." Gutmann looks at a lot of practical issues in the way storage devices work and, based on the specific patterns that could remain under different physical arrangements for data storage, proposes the perfect method of data erasure. The Gutmann Method, as it's sometimes called, is a 35-pass scheme of overwriting with both random data and fixed patterns.

The reason for the large number of passes is partially Just To Be Sure, but the fixed pattern overwrites are targeted at specific forms of encoding. The process is longer than strictly needed just because Gutmann believes that a general approach to the problem requires use of multiple erasure methods, one of which ought to be appropriate for the specific device in question. This is to say that Gutmann never really thought 35 passes were necessary. Rather, to put it pithily, he figured eight random passes would do and then multiplied all the encoding schemes together to get 27 passes that ought to even out the encoding-related patterns on the drives of the time.

Another way to make my point is this: Gutmann's paper is actually rather specific to the storage technology of the time, and the time was 1996. So there's no reason to work off of his conclusions today. Fortunately few people do, because a Gutmann wipe takes truly forever.

Another influential "standard" for overwriting for erasure is the "DoD wipe," which refers to 5220.22-M, also known as the National Industrial Security Program Operating Manual, also known as the NISPOM. I can say with a good degree of confidence that every single person who has ever invoked this standard has misunderstood it. It is not a standard, it is not applicable to you, and since 2006 it no longer makes any mention of a 3-pass wipe.

Practical data remanence

The concept of multi-pass overwrites for data sanitization is largely an obsolete one. This is true for several different reasons. Most prominently, the nature of storage devices has changed appreciably. The physical density of data recording has increased significantly. Drive heads are now positioned by voice coil actuators and track dynamically rather than being stepped to absolute positions, reducing tracking error. And there are of course today many solid-state drives, which repeatedly overwrite data as a matter of normal operating procedure (but at the same time may leave a great deal of data available).

You don't need to take my word on this! Also in 2006, for example, the NIST issued new recommendations on sanitization stating that a single overwrite was sufficient. This may have been closely related to the 2006 NISPOM change. Gutmann himself published a note in 2011 that he no longer believes his famous method to be relevant and assumes a single overwrite to be sufficient.

Much of the discussion of recovery of overwritten data from magnetic media has long concentrated around various types of magnetic microscopes. Much like your elementary school friend whose uncle works for Nintendo, the matter is frequently discussed but seldom demonstrated. Without wanting to go too deep into review of literature and argumentative blog posts, I think it is a fairly safe assertion that recovery of data by means of electron microscopy, force microscopy, magnetic probe microscopy, etc. is infeasible for any meaningful quantity of data without enormous resources.

The academic work that has demonstrated recovery of once-overwritten data by these techniques has generally consisted of extensive effort to recover a single bit at a low level of confidence. The error rate makes recovery of even a byte impractical. A useful discussion of this is in the ICISS 2008 conference paper "Overwriting Hard Drive Data: The Great Wiping Controversy," amusingly written in part by a man who would go on to claim (almost certainly falsely) to have invented Bitcoin. It's a strange world out there.

As far as summing up the issue goes, I enjoy the conclusion of a document written by litigation consultant Fred Cohen:

To date I have found no example of any instance in which digital data recorded on a hard disk drive and subsequently overwritten was recovered from such a drive since 1985... Indeed, there appears to be nobody in the [forensics and security litigation] community that disputes this result with any actual basis and no example of recovery of data from overwritten areas of modern disk drives. The only claims that there might be such a capability are based on notions surrounding possible capabilities in classified environments to which the individuals asserting such claims do not assert they have actual access and about which they claim no actual knowledge.

Recovery of overwritten data by microscopy is, in practice, a scary story to tell in the dark.

The takeaway here is that, for practical purposes, a single overwrite of data on a magnetic platter seems to be quite sufficient to prevent recovery.
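To make that concrete, a single host-side overwrite pass is about the least sophisticated piece of storage tooling imaginable. Here's a minimal sketch in Python of what "overwrite the drive once" amounts to, assuming a Linux block device node (the /dev/sdX path is a placeholder) and root privileges. Keep in mind, as we're about to discuss, that this only reaches the sectors the drive chooses to expose to the host.

    import os

    def overwrite_device(path, chunk=4 * 1024 * 1024):
        """One zero-fill pass over a block device; touches host-visible sectors only."""
        fd = os.open(path, os.O_WRONLY)
        try:
            size = os.lseek(fd, 0, os.SEEK_END)  # block devices report their size here
            os.lseek(fd, 0, os.SEEK_SET)
            zeros = bytes(chunk)
            written = 0
            while written < size:
                written += os.write(fd, zeros[:min(chunk, size - written)])
            os.fsync(fd)  # make sure the data is actually pushed out to the device
            return written
        finally:
            os.close(fd)

    # hypothetical usage: overwrite_device("/dev/sdX")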

It's not all platters

Here's the problem: in practice, remanence on magnetic media is no longer the thing to worry about.

The obvious reason is the extensive use of SSDs and other forms of flash memory in modern workstations and portable devices. The forensic qualities of SSDs are, to put it briefly, tremendously more complicated and more poorly understood than those of HDDs. To even skim the surface of this topic would require its own post (perhaps it will get it one day), but the important thing to know is that SSDs throw out all of the concerns around HDDs and introduce a whole set of new concerns.

The second reason, though, and perhaps a more pervasive one, is that the forensic properties of the magnetic platters themselves are well understood, but those of the rest of the HDD are not.

The fundamental problem in the case of both HDDs and SSDs is that modern storage devices are increasingly complex and rely on significant onboard software in order to manage the physical storage of data. The behavior of that onboard software is not disclosed by the manufacturer and is not well understood by the forensics community. In short, when you send data to an HDD or SSD, we know that it puts that data somewhere but in most cases we really don't know where it puts it. Even in HDDs there can be significant flash caching involved (especially on "fancier" drives). Extensive internal remapping in both HDDs and SSDs means that not all portions of the drive surface (or flash matrix, etc) are even exposed to the host system. In the case of SSDs, especially, large portions of the storage are not.

So that's where we end up in the modern world: storage devices have become so complex that the recovery methods of the 1980s no longer apply. By the same token, storage devices have become so complex that we can no longer confidently make any assertions about their actual behavior with regards to erasure or overwriting. A one-pass overwrite is both good enough at the platter level and clearly not good enough at the device level, because caches, remapping, wear leveling, etc all mean that there is no guarantee that a full overwrite actually overwrites anything important.

Recommended sanitization methods

Various authorities for technical security recommendations exist in the US, but the major two are the NIST and the NSA.

NIST 800-88, summarized briefly, recommends that sanitization be performed by degaussing, overwriting, physical destruction of the device, or encryption (we will return to this point later). The NIST organizes these methods into three levels, which are to be selected based on risk analysis, and physical destruction is the recommended method for high risk material or material where no method of reliable overwriting or degaussing is known.

NSA PM 9-12 requires sanitization by degaussing, disintegration, or incineration for "hard drives." Hard drives, in this context, are limited to devices with no non-volatile solid state memory. For any device with non-volatile solid state memory, disintegration or incineration is required. Disintegration is performed to a 2mm particle size, and incineration at 670 Celsius or better.

Degaussing, in practice, is surprisingly difficult. Effective degaussing of hard drives tends to require disassembly in order to individually degauss the platters, and so is difficult to perform at scale. Further, degaussing methods tend to be pretty sensitive to the exact way the degaussing is performed, making them hard to verify. The issue is big enough that the NSA requires that degaussing be followed by physical destruction of the drive, but to a lower standard than for disintegration (simple crushing is acceptable). For that reason, disintegration and incineration tend to be more common in government contexts.

It's sort of funny that I tell you all about how multiple overwrite passes are unnecessary but then tell you that accepted standards require that you blend the drive until it resembles a coarse glitter. "Data sanitization is easy," I say, chucking drives into a specialized machine with a 5-figure price tag.

The core of the issue is that the focus on magnetic remanence is missing the point. While research indicates that magnetic remanence is nowhere near the problem it is widely thought to be, in practice remanence is not the way that data is sneaking out. The problem is not the physics of the platters, it's the complexity of the devices and the lack of reliable host access to the entire storage capacity.

ATA secure erase and self-encryption and who knows what else

The ATA command set, or rather some version of it, provides a low-level secure erase command that, in theory, causes the drive's own firmware to initiate an overwrite of the entire storage surface. This is far preferable to overwriting from the host system, because the drive firmware is aware of the actual physical storage topology and can overwrite parts of the storage that are not normally accessible to the host.

The problem is that drive manufacturers have been found to implement ATA secure erase sloppily, or not at all. There is basically no external means of auditing that a secure erase was performed effectively. For that simple reason, ATA secure erase should not be relied upon.
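For reference, the usual way to issue this on Linux is through the standard hdparm tool, roughly as in the sketch below (the device path and password are placeholders). Note that this is exactly the sequence the previous paragraph warns against trusting blindly: the drive reports success, and you have little means of checking its work.

    import subprocess

    DEVICE = "/dev/sdX"     # placeholder device path
    PASSWORD = "erase-me"   # temporary ATA security password, cleared by the erase itself

    def run(*args):
        print("+", " ".join(args))          # echo each command for auditability
        subprocess.run(args, check=True)

    # Check the drive's security state first; the erase will fail if the drive
    # is "frozen," which many BIOSes do at boot.
    run("hdparm", "-I", DEVICE)

    # The ATA spec requires a user password to be set before erasing.
    run("hdparm", "--user-master", "u", "--security-set-pass", PASSWORD, DEVICE)

    # Issue SECURITY ERASE UNIT. The firmware performs the overwrite internally,
    # which can take hours and cannot really be verified from the outside.
    run("hdparm", "--user-master", "u", "--security-erase", PASSWORD, DEVICE)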

Another approach is the self-encrypting drive or SED, which transparently encrypts data as it is written. These devices are convenient since simply commanding the drive to throw away the key is sufficient to render the data unrecoverable. SED features tend to be better implemented than ATA secure erase, if only because they are only implemented at all on high-end drives that are priced for the extra feature. That said, the external auditing problem still very much exists.

Another option is to encrypt at the host level, and then throw away the key at the host level. This is basically the same as the SED method but since the encryption is performed externally to the drive, the whole thing can be audited externally for assurance. In all reality this is a fine approach to data sanitization and should be implemented whenever possible. If you have ever been on the fence about whether or not to encrypt storage, consider this: if you are effective about encrypting your storage, you won't need to sanitize it later! The mere absence of the key is effective sanitization, as recognized by the NIST.
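The principle behind this kind of cryptographic erasure is easy to demonstrate; here's a minimal sketch using AES-GCM from the Python cryptography package. In a real system the key lives in a TPM, a key file, or a passphrase-derived volume header (as in LUKS), and "throwing it away" means destroying those copies, but the idea is the same.

    import os
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM  # pip install cryptography

    # Encrypt the data under a random key before it ever touches the disk.
    key = AESGCM.generate_key(bit_length=256)
    nonce = os.urandom(12)
    ciphertext = AESGCM(key).encrypt(nonce, b"a column of social security numbers", None)

    # The nonce and ciphertext are what get stored; they are all an attacker
    # would recover from a discarded drive.
    stored = nonce + ciphertext

    # "Sanitization" is now just destroying every copy of the key. Without it,
    # decryption fails and the stored bytes are indistinguishable from noise.
    del key
    # AESGCM(os.urandom(32)).decrypt(stored[:12], stored[12:], None)  # raises InvalidTag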

The problem is that disk encryption features in real devices are inconsistent. Drive encryption may not be available at all, or it may only be partial. This makes encryption difficult to rely on in most practical scenarios.

The bottom line

When you dispose of old electronics, you should perform due diligence to identify all non-volatile storage devices. These storage devices should be physically destroyed prior to disposal.

DIY methods like drilling through platters and hitting things with hammers are not ideal, but should be perfectly sufficient for real scenarios. Recovering data from partially damaged hard drives and SSDs is possible but not easy, and the number of facilities that perform that type of recovery is small. There are lots of ways to achieve this type of significant damage, from low-cost hand-cranked crushing devices to the New Mexican tradition of taking things out to the desert and shooting at them. Await my academic work on the merits of FMJ vs hollow-point for data sanitization. My assumption is that FMJ will be more effective due to penetration in multi-platter drives, but I might be overestimating the hardness of the media, or underestimating the number of rounds I will feel like putting into it.

Ideally, storage devices should be disintegrated, shredded, or incinerated. Unless you are looking forward to making a large quantity of thermite, these methods are difficult without expensive specialized equipment. However, there are plenty of vendors that offer certified storage destruction as a service. Ask your local shredding truck company about their rates for storage devices.

Most conveniently, do what I do: chuck all your old storage devices in a drawer, tell yourself you'll use them for something later, and forget about them. We'll call it long-term retention, or the geologic repository junk drawer.

--------------------------------------------------------------------------------

>>> 2021-07-21 the desqtop

I believe I have mentioned before that the history of early GUI environments for PCs is sufficiently complex and obscure that it's very common to run into incorrect information. This is markedly true of the Wikipedia article on DESQview, which "incorrects" a misconception by stating another incorrect fact. Since it's Wikipedia, the free encyclopedia that anyone can edit, I assume that if I correct it the change will be reverted by bot within seconds.

False claims about TopView aside, the Wikipedia article on DESQview makes most of the salient points about its history. That said, I would like to talk about it a bit because DESQview is a neat example of an argument I've made, and it happens to dovetail into another corner of GUI history that I'll bring up here and there.

DESQview was a multitasking GUI built by a company called Quarterdeck. It was released for DOS in 1985, so a couple of years after Visi On, and right in the thicket of DOS GUIs. DESQview is a GUI, though, only in the sense of the logical paradigm of user interactions. It actually runs in textmode, using the DOS extended ASCII box-drawing characters to create windows and menus, and using letters and symbols as buttons. It's similar in this regard to the relatively modern Twin (Terminal Windows), and could be viewed as a souped up terminal multiplexer like tmux.

Despite running in textmode, DESQview has basically all of the WIMP (Windows, Icons, Menus, Pointer) behavior that we consider typical of a GUI. To be fair, by virtue of running in textmode it fundamentally lacks icons, but so did a number of other early GUIs that ran in graphics mode. Any one of us could sit down in front of a machine running DESQview and figure out the basic interactions without much trouble, something that can't be said of most terminal multiplexers. Here is an example of the philosophical divide between TUI and GUI, or more specifically between unguided and guided: terminal multiplexers like screen and tmux are unguided interfaces that expect the user to read the manual. More typical of the GUI, DESQview attempts to make most functionality fairly discoverable to the user.

So in that light, consider this sentence from the Wikipedia article: "DESQview is not a GUI (Graphical User Interface) operating system. Rather, it is a non-graphical, windowed shell that runs in real mode on top of DOS, although it can run on any Intel 8086- or Intel 80286-based PC."

It's not a GUI, it's a non-graphical windowed shell. It runs in real mode on top of DOS, which is true of basically all '80s GUIs including Windows. It has windows, and it's a shell, but it's not a GUI because it's non-graphical. To me, at least, this whole thing is a bit farcical. The desire here to exclude DESQview from the category of GUIs only serves to reinforce that the interaction concept that we refer to as the "GUI" is actually quite divorced from the difference between text and raster displays. You can always employ ASCII art to pretend you have a graphical display, after all.

Another interesting component of DESQview to discuss is its support for DOS applications. We saw with Visi On that there is sort of a basic conflict involved in developing a DOS GUI: if it runs on top of DOS, users will want to be able to run their existing DOS software. But DOS software assumes full control of the machine and does not play well with multitasking. Visi On went the route of throwing DOS out the window and requiring that software be written specifically for Visi On [1]. DESQview went the opposite, more consumer-friendly route, of bending over backwards to work with the existing DOS stable.

DESQview had a significant leg up on this venture because its developer, Quarterdeck, had previously sold a DOS task-switcher called Desq. Task switchers are not really a familiar part of the modern computing landscape because of the ubiquity of multitasking operating systems. Back in the '80s, though, most microcomputer operating systems were single-task and so the ability to run multiple programs at the same time could only be simulated. A task switcher created something like multitasking by doing exactly what it sounds like: switching out the tasks.

Specifically, Desq acted as a DOS TSR, or Terminate and Stay Resident. When launched, Desq installed an interrupt handler and then terminated. The interrupt handler fired when keyboard keys were pressed (remember at this point the keyboard on PCs was connected via the 8042 keyboard controller, which generated interrupts on each keypress). The interrupt handler could basically inspect each keyboard event and decide whether to act on it. In effect, a TSR could implement a "global hotkey."

In the case of Desq, the hotkey resulted in Desq seizing control of the machine and stashing the contents of memory. It then presented a utility that allowed the user to select another task, which would be copied into memory and then jumped to. The effect was somewhat like switching windows, but you could only have one program visible at a time.

You might be wondering where that memory was stashed to. This gets into the peculiarities of x86 memory. By the time these task switcher utilities hit the scene, "extended memory" beyond the 1 MB real mode limit was fairly common on PCs. But real-mode applications were unable to access this extended memory without putting in extra effort [2]. In practice, most DOS applications only ever used the real-mode-addressable conventional memory, so task switchers could somewhat safely swap that first megabyte out to extended memory without the next application messing with it. Of course there was no guarantee: some applications did implement extended memory support, and this generally made a program "incompatible" with task switching.

For Quarterdeck, DESQview was basically an extension of Desq, so it was natural to continue to support switching between conventional DOS applications. DESQview did much the same thing, loading and unloading DOS applications, but also using driver tricks to cause applications to "draw" text to their own windows. Like Desq, DESQview could "multitask" only in the sense that it could react to interrupts, so the user was effectively "locked in" to the active window until they triggered DESQview to seize control by use of a keyboard shortcut.

DESQview is an important example of a GUI system that is very much transitional between text and raster, and between TUI and GUI. Other similar examples include TopView, DOS Shell, and Norton Commander, the latter two of which were ostensibly file managers but grew to include a number of GUI features. Interestingly, though, DESQview appeared on the scene after the first text mode competitors. While raster mode has obvious advantages today for GUI software, there were huge additional challenges involved in using raster mode at this point in time. For one, it made compatibility with existing software extremely difficult.

Perhaps more importantly, though, the entire business computing world was on text-based machines, and text was mostly viewed as being perfectly sufficient. There just wasn't a lot of pressure to provide raster operating systems, because people hadn't really seen raster mode put to good use yet.

There are a couple of places to go from here, and you know that I will go to both of them: first, we will eventually need to get to the topic of Windows. I will probably discuss early Windows and TopView somewhat in parallel, because the comparison is interesting and because the competition of Windows and TopView represents yet another twist in the tumultuous partnership between Microsoft and IBM. In more of a fork, though, I will also start into a topic closely related to GUI history: network delivery of GUIs.

I said that DESQview dovetailed into another interesting topic, and it's network GUIs. DESQview was followed by DESQview/X... an X server. While this partially enabled the porting of X applications to DOS, it more importantly contributed to the first wave of thin client GUI systems.

[1] This isn't quite true, it actually is possible to run DOS applications under Visi On but with significant limitations that mostly prevented actually using the feature.

[2] If this sounds a bit amusing, keep in mind that we had basically the exact same problem years later with the 3-ish GB 32-bit limit. Memory beyond the first 3-ish gigabytes on a 32-bit machine could be used only if the application put in extra effort to support it (in that case by implementing PAE rather than XMS, the DOS extended memory API).

--------------------------------------------------------------------------------

>>> 2021-07-07 dial 1 800 flowers dot com

A note: Apologies for the long time without content, I have been on a road trip across the southwest and have suffered from some combination of no internet and no motivation. Rest assured, I am back now.

A second note: apologies that computer.rip was down for a half day or so, there was a power interruption at the datacenter and as a Professional DevOps Engineer I am naturally incredibly haphazard about how I run my personal projects. There was a problem with the fstab and the webserver didn't nfs mount the Enterprise Content Management System directory of text files when it rebooted.

You've gathered by now that I'm interested in telephone numbers, and we have to date discussed the basic structure of phone numbers in the NANP, premium rate ("900 numbers"), and some special-purpose exchanges and NPAs (555, 700, etc). As promised, it's time to come back around to talk about the best known special-purpose NPAs: toll-free numbers.

Toll-free numbers are commonly referred to as 1-800 numbers, although this is a bit anachronistic as toll-free telephone numbers in NANP now span 800, 888, 877, 866, 855, 844, 833, and they'll get to 822 before you know it. Originally, though, they were all in the 800 NPA, and it's said that there is still a degree of prestige conferred upon actual 800 numbers. There's not a lot of actual reason for this, as while 800 numbers are in relatively short supply there are still many fly-by-night operations that hold them. In the end, though, toll-free numbers today serve almost purely as prestige devices because the majority of consumers are using cellular phones with unlimited long-distance calling, and so the number called barely even matters.

Let's teleport ourselves back in time, though, to the wild past of the early '60s. Direct dialing was becoming the norm, even for long distance calls. The majority of telephone owners paid for calls in two basic tiers: calls within the local calling area were effectively free (included in the normal monthly rate for the line), while calls outside of the local calling area were charged per minute at a stupidly high rate.

This whole issue of "local calling area" is a surprisingly complex one, and perhaps the simplest answer to "what is a local calling area" is "whatever your phone company tells you when you ask, and maybe specified in the front of the phone book." The local calling area in cities sometimes coincided with the NPA (e.g. all calls within the same area code were local), but this was not at all guaranteed and there were many, many exceptions.

The local calling area is better defined in terms of rate centers. A rate center is a geographical area that serves as the smallest organizational unit for telephone tolling purposes. A call to another person within the same rate center will be a local call. A call to another person in a different rate center could be either local or long-distance (toll), depending on the carrier's definition of the local calling area for your rate center. This typically depended on the geography. Further complicating things, local calling was not transitive: two rate centers could each be a local call for a third without being local calls for each other.

Let's work an example: You live in Hillsboro, Oregon, so you are in the Beaverton, OR rate center (RC). Beaverton RC has local calling to the Portland, OR rate center. I live in Oregon City, OR, which is in the Clackamas, OR rate center. Clackamas RC has local calling to Portland. We can both call our friend in Portland and it will be a local call. Our friend in Portland can similarly call both of us, as the Portland RC has both Beaverton and Clackamas in its local calling area.

However... Beaverton does not have Clackamas in its local calling area, and neither does Clackamas have Beaverton. To call each other directly would be a long-distance call [1]. This makes some intuitive sense as the distance between the suburbs and the city is smaller than the distance between two suburbs on different sides, and of course residents of the suburbs call residents of the city frequently. However, it has some odd results.

A phone number in the Portland RC is a better phone number than one in Beaverton or Clackamas, because it has a better local calling area: all of the suburbs, rather than just the city and the suburbs to one side.

This is a common situation. Rate centers which are major cities or in general more populous areas are more desirable, because they are local calls for more prospective customers. The problem is that back in the '60s you didn't really get to shop around for a rate center, it was just determined based on wherever your point of service was. This placed businesses based in suburbs at an inherent disadvantage: for people on the other side of town, they would be a long distance call.

The first major way of improving this situation was simply moving one's point of service into the city. One common method was the use of an answering bureau. A business in Beaverton could hire an answering bureau in Portland and list it as their contact number. It would be a local call for all prospective customers, and the business could return calls to customers at their expense. This came with the obvious downside that customers would always have to leave a message when they called, which was irritating---although answering bureaus were very common at the time, especially since prior to mobile phones many small businesses that worked "in the field" (tradespeople for example) would not have anyone in the office to answer calls most of the time.

A later and more complex solution was the use of a foreign exchange service, also called FXS. Under the FXS arrangement, a business in Beaverton would pay the telephone company to essentially run a miles-long jumper from their local loop in a Beaverton exchange to an exchange in Portland. This effectively "moved" their phone service to the Portland office and the Portland rate center. Early FXS were literally this simple, with the telco using a spare pair on a long distance line to splice the customer's line to a line at the other exchange. This service was expensive and has fallen out of use, although the terminology FXS and FXO (which originated as a description for the two ends of an FXS line) have remained stubbornly common in the world of VoIP-analog bridges despite being archaic and confusing [2].

You can see that both of these approaches are unsatisfactory, and there seems to be an obvious solution: businesses should be able to pay more to just expand the local calling area of their phone, without needing awkward hacks like an FXS.

In fact, a solution along basically these lines had existed earlier. So-called "Zenith" numbers were special telephone numbers that did not correspond to a normal physical exchange [3]. Instead, when an operator was asked for a Zenith number they understood it to be a special instruction to look up the actual number and connect the call, but if the call was long-distance they would bill it to the callee instead of the caller. This was toll-free dialing just like we have today, but it required manual effort by the operator who, at the time, would fill out billing tickets for calls by hand. The trouble was that this didn't work at all with direct dialing: the only way to call a Zenith number was to dial zero for the operator and read them the number. Customers found this annoying and the telephone companies found it expensive, so there was mutual motivation to find an automated solution.

Although surprisingly janky, a sort of solution was developed quickly for outbound calls: WATS, or Wide Area Telephone Service. WATS was introduced in the early '60s as a simple scheme where a business could pay a flat monthly rate to add additional rate centers to their local calling area, for the purpose of outbound calling only. This could save a lot of money for businesses with clients or offices in other towns. It seemed obvious that the problem of calling areas and Zenith numbers could best be approached by taking WATS and setting it to suck instead of blow. And that's exactly what they did.

In 1967, AT&T introduced inward WATS or InWATS. Much like outbound WATS, InWATS allowed a customer to pay a (large) monthly fee to have their number constitute a local call for customers in other rate centers, even nationwide. It was important that consumers understood that these calls would not incur a toll, and for technical reasons it was desirable to be able to route them differently. For this reason, InWATS numbers were assigned to a new NPA: 800.

While InWATS was similar to our modern toll-free system, it had substantial limitations. First, the rates for InWATS numbers were still based on geographical distance to callers, and InWATS customers could choose (in terms of "bands" or "zones", much like in some transit systems) what distance to pay for. This amusingly maintained the situation where it was worthwhile to strategically place telephone numbers, as an InWATS number in the middle of the country could receive calls from nearly the entire country at a lower rate than an InWATS number located on one of the coasts.

More significantly, though, the technical reality of the phone switching system meant that InWATS was implemented by effectively overlaying the geographical NANP routing system on top of the 800 NPA. For most telephone calls, NPAs identify the physical region of the country to which the call should be routed. For calls to the 800 NPA, the NXX (exchange code) identified the physical area of the country, standing in for the NPA since the NPA was already used to indicate InWATS.

The idea that 800 numbers are "non-geographical" is largely a modern one (and they are not technically "non-geographical" numbers in the sense of 700 and 500). With InWATS, toll-free telephone numbers were still just as geographical as before, just using a second-level "sub-numbering" scheme.

Even more maddeningly, much like WATS before it InWATS handled intrastate and interstate calls completely differently (this was quite simply easier from a perspective of toll regulation). So InWATS numbers subscribed for interstate use actually did not work from within the same state as the subscriber, creating an incentive to put InWATS services in states with small populations in order to minimize the number of people who needed to use a special local number [4]. Although I do not have direct evidence, I will speculate that the confluence of these factors is a major reason that several major national enterprises have located their customer service centers in Albuquerque.

InWATS was replaced in the '80s by a new AT&T service which took advantage of digital switching to eliminate many of the oddities of InWATS service. The major innovation of "Advanced 800," rolled out in 1982, was the use of a "mapping database" that allowed 800 numbers to effectively be "redirected" to any local number. Because tolling was handled digitally using much more flexible configuration, calls to these 800 numbers could be toll-free for all callers but still redirect to any local number. This completely divorced 800 numbers from geography, but for the most part is surprisingly uninteresting because it was really only a technical evolution on the previous state.

A more fundamental change in the 800 number situation happened later in the '80s, as the breakup of the bell system and related events substantially eroded AT&T's monopoly on telephone service. Competitive long distance carriers like MCI had to be allowed to enter the toll-free service market, which meant that a system had to be developed to allocate toll-free numbers between carriers and allow mapping of toll-free numbers to corresponding local (or actual routing) numbers across carrier boundaries.

Two things happened at once: the simple technical reality of needing to manage toll-free numbers across carriers required a more sophisticated approach, and competitive pressures encouraged AT&T to invest in more features for their toll-free service offering. These changes added up to flexible routing of toll-free calls based on various criteria. Further, while 800 numbers were initially distributed between inter-exchange carriers (IXCs, like AT&T, MCI, Sprint, etc) based on number allocation ranges, the inherent "stickiness" of toll-free numbers posed a challenge. Toll-free numbers are often widely published and used by repeat customers, so businesses do not want to change them. This made it difficult for a competitive carrier to win their business away, and created a desire for number portability much like what would later be achieved for local numbers.

This issue broke for toll-free numbers basically the same way it did for local numbers. The FCC issued an order in 1993 stating that it must be possible to "port" toll-free numbers between inter-exchange carriers. Unlike local numbers, though, there was no inherent or obvious method of allocating toll-free numbers (the former geographical and carrier mappings were not widely known to users). This encouraged a completely "open" approach to toll-free number allocation, with all users pulling out of a shared pool.

If this sounds a touch like the situation with DNS, you will be unsurprised by what happened next. A new class of entity was created which would be responsible for allocating toll-free numbers to customers out of the shared namespace, much like DNS registrars. These were called Responsible Organizations, widely shortened to RespOrgs.

The post-1993 system works basically like this: a business or other entity wanting a toll-free number first requests one from a RespOrg. The RespOrg charges them a fee and "assigns" the telephone number to them by means of reserving it in a shared database called SMS/800 (the SMS here is Service Management System, unrelated to the other SMS) [5]. The RespOrg updates SMS/800 to indicate which inter-exchange carrier the toll-free number should be connected to. Whenever a customer calls the toll-free number, their carrier consults SMS/800 to determine where to connect the call. The inter-exchange carrier is responsible for routing it from that point on.

In practice, this looks much simpler for many users as it's common (particularly for smaller customers) for the RespOrg to be the same company as the inter-exchange carrier. Alternately, it might be the same company or a partner of a VoIP or other telephone service provider. Many people might just use a cheap online service to buy a toll-free number that points at their local (mobile or office perhaps) number. They don't need to know that behind the scenes this involves a RespOrg, an inter-exchange carrier, and routing within the inter-exchange carrier and service provider to terminate the call.

The situation of DNS registrars has been subject to some degree of abuse or at least suspicious behavior, and the same is true of RespOrgs. It is relatively easy to become a RespOrg, and so there's a pretty long list of them. Many RespOrgs are providers of various types of phone services (carriers, VoIP, virtual PBX, etc.) who have opted to become a RespOrg to optimize their ability to assign toll-free numbers for their customers. Others, though, are a bit harder to explain.

Perhaps the most infamous RespOrg is a small company called PrimeTel. War-dialers and other telephone enthusiasts have long noted that, if you dial a selection of random toll-free numbers, you are likely to run into a surprising number of identical recordings. Often these are phone sex line solicitations, but sometimes it's other content, uninteresting except for the fact that it appears over and over again on large lists of telephone numbers. These phone numbers all belong to PrimeTel.

Many words have been devoted to the topic of PrimeTel, most notably an episode of the podcast Reply All. I feel much of the mystique of the issue is undeserved, though, as I believe that one fact makes PrimeTel's behavior completely intuitive and understandable: 47 CFR § 52.107 forbids the hoarding of toll-free numbers.

That is, toll-free numbers are a semi-limited resource with inherent value due to scarcity, particularly those in the 800 NPA as it is viewed as the most prestigious (unsurprisingly, PrimeTel numbers are more common in 800 than in other NPAs). This strongly suggests that it should be possible to make money by speculatively registering toll-free numbers in order to resell them, as is common for domain names. However, the FCC explicitly prohibits this behavior, largely by stating that toll-free numbers cannot be held by a RespOrg if there is not an actual customer for which the number is held.

So PrimeTel does something that is pretty obvious: in order to speculatively hold toll-free numbers, it acts as the customer for all of those numbers.

Since it's hard to come up with a "use" for millions of phone numbers, PrimeTel settles for simple applications like sex lines and other conversation lines. It helps that PrimeTel's owners seem to have a historic relationship to these kinds of operations, so it is a known business to them. Oddly, many of the PrimeTel "services" don't seem to actually work, but that's unsurprising in light of the fact that PrimeTel is only interested in the numbers themselves, not in making any profit from the services they connect to. From this perspective, it's often better if the services don't work, because it reduces PrimeTel's expenses in terms of duration that callers stay on the line.

The case of PrimeTel is often discussed as an egregious example of speculating on (often called warehousing) toll-free numbers, although they are not the only RespOrg accused of doing so. The surprising thing is that the FCC has never taken action against PrimeTel, but, well, the FCC has a reputation for never taking action on things.

Ultimately the impact is probably not that large. It's easy to obtain toll-free numbers in the "less popular" toll-free NPAs such as 844. I have observed that some telecom vendors have zero availability in 800, but that seems to come down to a limitation of the RespOrg relationships they have as the VoIP trunk vendor I use (which is itself a RespOrg) consistently shows tens of 800 numbers available. I tend to like 888s, though. 800 wouldn't get you anything on a slot machine.

In a future post, I will dig a little more into the issue of number portability as it's a major driver of some of the complexity in the phone system. Another topic adjacent to this that bears further discussion is the competitive inter-exchange carriers, which are a major part of the broader story of telephone and technology history.

[1] I had originally tried to construct this example in New Mexico, but this state is so sparsely populated that there are actually very few situations of this type. The Albuquerque RC spans nearly the entire central region of the state, and essentially all calls between RCs are long-distance calls in NM. NM still illustrates oddities of the distance tolling scheme, though, as there are rate centers that clearly reflect history rather than the present. Los Alamos and White Rock are different rate centers despite White Rock being effectively an annexed neighborhood of Los Alamos. They each have each other in their local calling areas.

[2] A related concept to an FXS line was the DISA, or Direct Inward System Access. A DISA was a system, typically a feature of a key system or PBX, that allowed someone calling into a phone system to be connected to an outside line on that same phone system. This made it so that an employee of a company in Portland, at home in Beaverton, could call the Portland office and then access an outside line to make a call... from the Portland rate center. A number of businesses installed these because they could save money on calling between offices (by "bouncing" calls through a city office to avoid long-distance tolls), but as you can imagine they were highly subject to abuse. I used to run a DISA on a telephone number in the Socorro rate center so that I could use "courtesy" local-only phones on the college campus to make long distance calls (at my expense still, but that expense was minuscule and it was useful when my phone was dead).

[3] Why Zenith? The answer is fairly simple. The letter Z was sufficiently rare as the start of a word that it was not included on most telephone dial labels. So, in the time when direct-dialing of calls was done by using the first letters of the exchange name, a customer seeing a "ZEnith" number would quickly realize that "ZE" was not something they could dial, which would direct them to call the operator. By the same token, of course, there are not many words to use as exchange names that satisfy this requirement, so Zenith became pretty standard.

[4] This situation somewhat persists today in an odd way. Toll free numbers cannot be the recipients of collect calls, but there is no international toll free scheme. Take a look at the back of your credit card, most major banks will list a toll-free number for use within the US, but a local number for international use, because they will accept collect calls on that number. International toll-free calling remains an unsolved problem except that the internet is increasingly eliminating the need.

[5] SMS/800 is actually operated by a company called Somos, under contract for the FCC. Somos is also currently the NANP Administrator (NANPA), meaning it is responsible for managing the allocation of NPAs and other elements of administering NANP. There's a whole little world of the "telephone-industrial complex." For example, the role of NANPA formerly belonged to a company called Neustar, formerly a division of Lockheed Martin, which still manages cross-carrier systems such as the STIR/SHAKEN certification authority. Neustar has hired executives away from SAIC/Leidos which has had critical roles in both telephone and internet administration at various points. The whole world of grift on the DoD is tightly interconnected and extends well to grift on other federal agencies.

--------------------------------------------------------------------------------

>>> 2021-06-19 The Visi On Vision

First, after lengthy research and development I have finally followed through on my original vision of making Computers Are Bad available via Gopher. Check it out at gopher://waffle.tech/computer.rip.

Let's talk a bit more about GUIs. I would like to begin by noting that I am intentionally keeping a somewhat narrow focus for this series of posts. While there were many interesting GUI projects across a range of early microcomputer platforms, I am focusing almost exclusively on those GUIs offered for CP/M and DOS. I am keeping this focus for two reasons: First, these are the microcomputer platforms I am personally most interested in. Second, I think the landscape of early CP/M and DOS GUIs is an important part of the history of Windows, because these are the GUIs with which Windows directly competed. A real portion of the failure of Windows 1 and 2 can be attributed to Microsoft's lackluster effort compared to independent software vendors---something quite surprising from the modern perspective of very close coupling between the OS and the GUI [1].

Let's talk, then, about my personal favorite GUI system, and one of the most significant examples of stretching the boundary between operating system and application by implementing basic system features on top of an OS that lacks them... but first, we need to take a step back to perhaps the vintage software I mention most often.

VisiCalc is, for most intents and purposes, the first spreadsheet. There were "spreadsheet-like" applications available well before VisiCalc, but they were generally non-interactive, using something like a compiled language for formulas and then updating data files offline. VisiCalc was the first on the market to display tabular data and allow the definition of formulas within cells, which were then automatically evaluated as the data they depended on changed. It was the first time that you could change one number in a spreadsheet and then watch all the others change in response.

This is, of course, generally regarded as the most powerful feature of a computer spreadsheet... because it allows for the use of a spreadsheet not just as a means of recording and calculation but as a means of simulation. You can punch in different numbers just to see what happens. For the most part, VisiCalc was the first time that computers allowed a user to "play with numbers" in a quick and easy way, and nearly overnight it became a standard practice in many fields of business and engineering.

Released in 1979, VisiCalc was one of the greatest innovations in the history of the computer. VisiCalc is widely discussed as being the "killer app" for PCs, responsible for the introduction of microcomputers to the business world which had formerly eschewed them. I would go one further, by saying that VisiCalc was a killer app for the GUI as a concept. VisiCalc was one of the first programs to truly display the power of direct manipulation and object-oriented interface design, and it wasn't even graphical. It ran in text mode.

We have already identified VisiCalc's creator Dan Bricklin and its publisher VisiCorp [2] as pioneers of the GUI. It is no surprise, then, that this investment in the GUI went beyond just the spreadsheet... and yet it would surprise many to hear that VisiCorp was also the creator of one of the first complete GUIs for DOS, one that was in many ways superior to GUIs developed well after.

By 1983, VisiCorp had expanded from spreadsheets to the broader world of what we would now refer to as productivity software. Alongside VisiCalc were VisiTrend/VisiPlot for regression and plotting [3], word processor VisiWord, spell checker VisiSpell, and proto-desktop database VisiFile. The problem was this: each of these software packages was fully independent, with any interoperation (such as spell checking a document or plotting data) requiring saving, launching a new program, and opening the file again.

Of course this was a hassle on a non-multitasking operating system, although multitasking within the scope of a user was sufficiently uncommon at the time that it was not necessarily an extreme limitation. Nonetheless, the tides were turning in the direction of integrated software suites that allowed simultaneous interoperation of programs. In order to do this effectively, a new paradigm for computer interface would be required.

In fact this idea of interoperation of productivity software is an important through-line in GUI software, with most productivity suite developers struggling with the same problem. It tended to lead to highly object-oriented, document-based, componentized software. Major examples of these efforts are the Apple Lisa (and the descendant OpenDoc framework) and Microsoft's OLE, as employed in Office. On the whole, none of these have been very successful, and this remains an unsolved problem in modern software. There is still a great deal of saving the output of one program to open in another. I will probably have a whole message on just this topic in the future.

In any case, VisiCorp realized that seamless interoperation of Visi applications would require the ability to run multiple Visi applications easily, preferably simultaneously. This required a GUI, and fortunately for VisiCorp, the GUI market was just beginning to truly take off.

In order to build a WIMP GUI there are certain fundamental complexities you must address. First, GUI environments are more or less synonymous with multitasking, and so there must be some type of process scheduling arrangement, which had been quite absent from DOS. Second, both multitasking and interprocess communication (which is nearly a requirement for a multitasking GUI) all but require virtual memory. Multitasking and virtual memory management are today considered core features of operating systems, but at this point in time they were unavailable on many operating systems and so anyone aiming for a windowed environment was responsible for implementing these themselves.

Released in late 1983, VisiCorp's Visi On GUI environment featured both of these. Multitasking was not at all new, and as far as I can tell Visi On multitasking was cooperative (it is very possible I am wrong on this point; it is hard to find a straight answer to this question), so the multitasking capability was not especially cutting edge. What was quite impressive was Visi On's implementation of virtual memory complete with page swapping, which made it practical to have multiple applications running even if they were heavy applications like VisiCorp productivity tools.

Beyond its implementation of multitasking and virtual memory, Visi On was a graphics mode application (i.e. raster display) and supported a mouse. The mouse was used to operate a fundamentally WIMP UI with windows in frames, drop-down menus at the top of windows, and a cursor... fundamentally similar to both pioneering GUIs such as the Alto and the environments that we use today. Visi On allowed multiple windows to overlap, which sounds simple but was not to be taken for granted at the time.

Perhaps the most intriguing feature of Visi On is that it was intended to make software portable. Visi On applications, written in a language called Visi C, targeted a virtual machine called the Visi Machine. The Visi Machine could in theory be ported to other architectures and operating systems, making Visi On development a safer bet for software vendors and adoption of Visi On software a safer bet for users. This feature was itself quite innovative, reminiscent of what Java aimed for much later.

For the many things that Visi On was, there were several things that it was not. For one, Visi On did not embrace the raster display as much as even other contemporary GUIs. There was virtually no use of icons in Visi On. Although it ran in graphics mode it was, visually, very similar to VisiCorp's legacy of text-mode software with GUI-like features.

One of the most significant limitations of Visi On is reflective of the basic problem with GUI environments running on existing operating systems. Visi On was not capable of running DOS software.

This sounds sort of bizarre considering that Visi On itself was a DOS application. Technically, it makes sense, though. DOS was a non-multitasking operating system with direct memory addressing and no hardware abstraction. As a result, all DOS programs were essentially free to assume that they had complete control of the system. DOS applications would freely write to memory anywhere they pleased, and never yielded control back to the system [4]. In short, they were terrible neighbors.

While some GUI systems found ways to coexist with at least some DOS applications (notably, Windows), Visi On did not even make the attempt. Visi On was only capable of running applications specifically built for it, and all other applications required that the user exit Visi On back to plain old DOS. If you wonder why you have never heard of such a revolutionary software package as Visi On, this is one major reason: Visi On's incompatibility with the existing stable of DOS applications made it unappealing to most users, who did not want to live a life of only VisiCorp products.

The other big problem with Visi On was the price. Visi On was expensive to begin with, retailing at $495, and it had particularly high system requirements on top of that. Notably, the use of virtual memory and swapping required something to swap to... Visi On required a hard drive, which was not yet common on PCs. All in all, a system capable of running Visi On would be a huge expense compared to typical PCs and even other GUI systems that emerged not long after.

Visi On had a number of other intriguing limitations to boot. Because it was released for DOS 2, which used FAT12, it could only be run on a FAT12 volume even as DOS 3 made the jump to FAT16... among the many things Visi On had to implement to enable multitasking was direct interaction with the storage hardware. Visi On required a Mouse Systems mouse, which was the de facto standard at release but was soon after obsoleted (for most purposes) by the Microsoft mouse standard, so even obtaining a mouse that worked with Visi On could be a hassle.

In the end, Visi On's problems were at least as great as its innovations... cost of a working system most of all. Visi On was the first proper GUI environment to market for the IBM PC, but many others followed very quickly after, including Microsoft's own Windows (which was, debatably, directly inspired by Visi On). More significantly at the time, the Macintosh was released shortly after Visi On. The Macintosh was a lemon in many ways, but did gain appreciable market share by fixing the price issues with the Lisa (admittedly partially through reduced functionality and a less ambitious interface).

The combination of Visi On's high price, limitations, and new competition was too much for VisiCorp to bear. Perhaps VisiCorp could have built on its early release to remain a technical leader in the space, but there were substantial internal issues within VisiCorp that prevented Visi On from receiving care and attention after its release. It became obsolete very quickly, and this coincided with VisiCalc encountering the same trouble: ironically, Lotus 1-2-3 was far more successful in taking advantage of the raster display (by being available for common hardware configurations, unlike Visi On), which led to VisiCalc itself becoming obsolete.

Shortly after release, in 1984, VisiCorp sold Visi On to CDC. CDC didn't really have much interest in the software, and neither enhanced it nor marketed it. Visi On died an ignominious death, not even a year after its release... and that was the end of the first GUI for the IBM PC. Of course, there would be many more.

[1] Of course you may be aware that non-NT Windows releases (up to Millennium Edition) similarly consisted basically of Windows running as an application on DOS, although the coupling became tighter and tighter with each release. This is widely viewed as one of the real downfalls of these operating systems because they necessarily inherited parts of DOS's non-multitasking nature, including an if-in-doubt-bail-out approach to error handling in the "kernel." Imagine how much worse that was in these very early GUIs!

[2] The Corporate Entity Behind VisiCalc went through various names over its history, including some acquisitions and partnerships. I am always referring to the whole organization behind VisiCalc as VisiCorp, for simplicity and because it's the best name out of all of them.

[3] This view of regression and plotting as coupled features separate from the actual spreadsheet is still seen today in spreadsheets such as Excel, where regression and projection are most clearly exposed through the plotting tool. This could be said to be the main differentiation between spreadsheets and statistical tooling such as Minitab: spreadsheets do not view operations on vectors as a core feature. Nonetheless, Excel's decades-long inability to produce a simple histogram without a plugin was rather surprising.

[4] There were DOS applications that produced a vestige of multitasking, called TSRs for Terminate and Stay Resident. These were not multitasking in any meaningful way, though, as the TSR had to set an interrupt handler and hope the running application did not change it. The TSR could only gain control via an interrupt. When the interrupt occurred, the TSR became the sole running task. Of course, these limitations made the "multitasking-like" TSRs that existed all the more interesting.

--------------------------------------------------------------------------------

>>> 2021-06-12 ieee 1394 for grilling

To begin with, a reader emailed me an objection to my claim that Smalltalk has never been used for anything. They worked at an investment bank you have heard of where Smalltalk was used for trading and back office systems, apparently at some scale. This stirred a memory in me---in general the financial industry was (and to some extent is) surprisingly interested in "cutting edge" computer science, and I think a lot of the technologies that came out of first-wave artificial intelligence work really did find use in trading especially. I'd be curious to hear more about this from anyone who worked in those environments, as I know little about finance industry technology (despite my interest in their weird phones). Also, I am avoiding naming this reader out of respect for their privacy and because I neglected to ask them if it's okay to do so before publishing this. So if you email me interesting facts, maybe do me a favor and mention whether or not you mind if I publish them. I'm bad at asking.

And now for something completely different.

Years ago, at a now-shuttered Smith's grocery store in my old home of Socorro, New Mexico, I did a dramatic double-take at a clearance rack full of Firewire. This Firewire was basically a steel cable used like a skewer but, well, floppy. The name got a chuckle out of me and this incident somehow still pops into my mind every time I think about one of my "favorite" interconnects: IEEE 1394.

IEEE 1394 was developed as a fast serial bus suitable for use with both storage devices and multimedia devices. It was heavily promoted by Apple (its original creator) and present on most Apple products from around 2000 until the switch to Thunderbolt, although its popularity had decidedly waned by the time Thunderbolt repeated its mistakes. FireWire was never as successful as USB for general-purpose use. There are various reasons for this, but perhaps the biggest is that FireWire was just plain weird.

What's it called?

IEEE 1394 was developed by several groups in collaboration, but it was conceived and championed by Apple. Apple refers to it by the name FireWire, and so do most humans, but Apple held a trademark on that name. While Apple made arrangements to license the trademark to a trade association for use on other implementations in 2002, long after that most PC manufacturers continued to use the term IEEE 1394 instead. I am not clear on whether or not this was simple aversion to using a name which was strongly associated with a competitor or if these implementations were somehow not blessed by the 1394 Trade Association.

In any case, you will probably find the terms FireWire and IEEE 1394 used with roughly equal frequency. For further confusion, Sony uses the term i.LINK to refer to IEEE 1394 on their older products including cameras and laptops. Wikipedia says that TI also refers to it as Lynx, but I haven't seen that name personally and cursory internet research doesn't turn up a whole lot either.

The lack of a single, consistent brand identity for FireWire might be seen as its first major mistake. My recollection from FireWire's heyday is that there were indeed people who did not realize that FireWire devices could be used with non-Apple computers, even though "IEEE 1394" interfaces were ubiquitous on PCs at the time. I think this must have negatively impacted sales of FireWire peripherals, because by the time I was dealing with this stuff the only storage peripherals being sold with FireWire were being marketed exclusively to Apple users by historically Apple-associated brands like Macally and LaCie.

What does it look like?

Further contributing to compatibility anxiety was the variety of physical connectors in use. The major FireWire connectors were most commonly called Alpha, Sony, and Beta. The difference between Alpha and Beta was one of speed: Alpha was designed for FireWire 400 (400Mbps) and Beta for FireWire 800 (800Mbps). Even this change, though, required the use of so-called "Bilingual" cables with Alpha on one end and Beta on the other.

The Sony standard, which worked only with FireWire 400, was smaller and so popular on mobile or otherwise low-profile devices. A number of laptops also used this smaller connector for reasons I'm not completely clear on (the Alpha connector is not significantly larger than USB).

The result was that practical use of FireWire frequently required adapters or asymmetric cables, even more so than with USB (where only the device-end connector varied), since with FireWire both ends had a degree of inconsistency. The hassle was minor but surely didn't help.

Just to make things more fun, FireWire could be transported over twisted pair (UTP) and efforts were made towards FireWire over single mode fiber. I'm not aware of any significant use of these, but the idea of running FireWire over UTP will become significant later on.

Is it cooler than USB?

Unlike USB and other contemporary peripheral interconnects, FireWire had complex support for management and configuration of the bus. Where USB was strictly host-to-device, FireWire supported arbitrary groups of up to 63 devices in a tree. While there is a "root node" with some centralized responsibility in the operation of the bus, any device can send data directly to any other device without a copy operation at the root node.

This meant that FireWire was almost more a network protocol than a mere peripheral interconnect. In fact, it was possible to transport Ethernet frames over FireWire and thus use it as an IP network technology, although this wasn't especially common. Further supporting network usage, FireWire supported basic traffic engineering in the form of dedicated bandwidth for certain data streams. This was referred to as isochronous mode, and its ability to guarantee a portion of the bus to real-time applications reflects one of FireWire's major strengths (suitability for multimedia) and reminds me of just how uncommon this kind of guarantee is in everyday computer systems, which makes me sad.
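
For a sense of how the guarantee works, here is a back-of-the-envelope sketch rather than a spec-accurate model: the bus runs on 125 microsecond cycles, and my understanding is that up to roughly 80% of each cycle can be reserved for isochronous channels, with the remainder left for ordinary asynchronous traffic. A stream that reserves bandwidth gets its slot in every cycle no matter what else is on the bus. Treat the figures below as approximate.

    # Back-of-the-envelope sketch of isochronous bandwidth reservation,
    # assuming the commonly cited figures: 125 us cycles and ~80% of the
    # bus reservable for isochronous traffic. Numbers are approximate.
    CYCLE_S = 125e-6        # one isochronous cycle, 125 microseconds
    BUS_BPS = 400e6         # nominal FireWire 400 signaling rate
    ISO_FRACTION = 0.8      # portion of the bus reservable for isochronous use

    def bytes_per_cycle(stream_bps):
        """Payload a stream must reserve in each 125 us cycle."""
        return stream_bps * CYCLE_S / 8

    stream_bps = 100e6      # an example 100 Mbit/s real-time stream
    print(f"Stream needs about {bytes_per_cycle(stream_bps):.0f} bytes per cycle")
    print(f"Reservable per cycle: {bytes_per_cycle(BUS_BPS * ISO_FRACTION):.0f} bytes")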

Despite the common perception in the computing industry that opportunistic traffic management is better^wmore fun^w^weasier to implement, FireWire's allocated bandwidth capability turned out to be one of its most important features, as it fit a particular but important niche: camcorders.

The handheld camcorders of the early 2000s mostly used DV (digital video), which recorded a digital stream onto a magnetic tape (inexpensive random-access storage was not sufficiently durable or compact at the time). In order to transfer a video to a computer, the tape was played back and its contents streamed to the computer in real time for capture. USB proved unequal to the task.

It's not quite as simple as USB being too slow; USB 2.0 could meet the data rate requirements. The problem is that USB (until USB 3.0) was polling-based, so reliable transfer of digital video from a tape depended on the computer polling sufficiently often. If it didn't---say, because the user was running another program during the transfer---the video would be corrupted. It turns out that, for moving digital media at original quality, allocated bandwidth matters.
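
Some rough numbers make the point (approximate figures, not taken from any particular spec table): DV runs at about 25 Mbit/s of video, call it 29 Mbit/s with audio and overhead, which is a small fraction of USB 2.0's nominal 480 Mbit/s. Raw throughput was never the issue; the issue is that the tape keeps playing whether or not the host gets around to asking for the next chunk.

    # Rough arithmetic on why DV transfer is a scheduling problem, not a
    # throughput problem. All figures approximate.
    DV_BPS = 28.8e6        # DV video + audio + overhead, ~28.8 Mbit/s
    USB2_BPS = 480e6       # USB 2.0 nominal signaling rate
    FW400_BPS = 400e6      # FireWire 400 nominal signaling rate

    print(f"DV uses {DV_BPS / USB2_BPS:.1%} of USB 2.0's nominal rate")
    print(f"DV uses {DV_BPS / FW400_BPS:.1%} of FireWire 400's nominal rate")

    # The catch: the camcorder produces data continuously. If the host
    # stalls and the device can only buffer so much, data is lost no matter
    # how fast the bus is on average.
    stall_s = 0.050
    print(f"A {stall_s * 1000:.0f} ms stall means buffering "
          f"{DV_BPS * stall_s / 8 / 1024:.0f} KiB or dropping frames")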

Note that FireWire is effectively acting as a packetized video transport in this scenario, just with some extra support for a control channel. This is very similar to later video technologies such as HDMI.

Did the interesting features become a security problem?

The more complicated something is, the more likely it is that someone will use it to steal your credit card information. FireWire is no exception. Part of FireWire's performance advantage was its support for DMA, in which a FireWire device can read from or write to a computer's memory directly. This was a useful performance optimization, especially for high-speed data transfer, because it avoided the need for extra copies out of a buffer.

The problem is that memory is full of all kinds of things that probably shouldn't be shared with every peripheral. FireWire was introduced before DMA was widely seen as a security concern, and well before IOMMUs (memory management units that can restrict what a peripheral's DMA can reach) were commonplace. On many real systems, a FireWire device's access to physical memory was completely unrestricted. Every FireWire device was (potentially) a memory collection device.

What happened to FireWire?

Consumer adoption was always poor outside of certain niche areas such as the DV video transfer use case. I suspect that a good portion of the issue was the higher cost of FireWire controllers (due to their higher complexity), which discouraged FireWire in low-cost peripherals and cemented USB as a more, eh, universal solution. Consumer perceptions of FireWire as being more complex than USB and somewhat Apple specific were likely an additional factor.

That said, the final nail in FireWire's coffin was probably a dispute between Apple and other vendors related to licensing costs. FireWire is protected by a substantial patent portfolio, and in 1999 Apple announced a $1-per-port licensing fee for use of the technology. Although the fee was later reduced, it was a fiasco that took much of the wind out of FireWire's sails, particularly since some major partners on FireWire technology (including Intel) saw it as a betrayal of previous agreements and ended their active promotion of FireWire.

In summation, FireWire seems to have fallen victim to excessive complexity, costly implementation, and licensing issues. Sound familiar? That's right, there's more commonality between FireWire and Thunderbolt than just the name.

While Apple stopped supporting FireWire some years ago, it continues to see a few applications. IEEE 1394 was extended into embedded and industrial buses and is used in the aerospace industry. It also continues to have some use in industrial automation and robotics, where it's used as a combined transport for video and control with machine vision cameras. That said, development of the FireWire technology has basically stopped, and it's likely these uses will fade away in the coming years.

Last of all, I have to mention that US cable boxes used to be required to provide FireWire ports. The reason relates to the conflict of cable providers and cable regulators in the United States, which will be its own post one day.

--------------------------------------------------------------------------------