_____                   _                  _____            _____       _ 
  |     |___ _____ ___ _ _| |_ ___ ___ ___   |  _  |___ ___   | __  |___ _| |
  |   --| . |     | . | | |  _| -_|  _|_ -|  |     |  _| -_|  | __ -| .'| . |
  |_____|___|_|_|_|  _|___|_| |___|_| |___|  |__|__|_| |___|  |_____|__,|___|
  a newsletter by |_| j. b. crawford

>>> 2021-04-03 use computers to store data

The New York Times once described a software package as "among the first of an emerging generation of software making extensive use of artificial intelligence techniques." What is this deep learning, data science, artificial intelligence they speak of? Ansa Paradox, of 1985 [1].

Obviously the contours of "artificial intelligence" have shifted a great deal over the years. At the same time, though, basic expectations of computers have shifted---often backwards.

One of the most obvious applications of computer technology is the storage of data.

Well, that's both obvious and general, but what I mean specifically here is data which had previously or conventionally been stored on hardcopy. Business records, basically: customer accounts, project management reports, accounts payable, etc etc. The examples are numerous, practically infinite.

I intend to show that, counter-intuitively, computers have in many ways gotten worse at these functions over time. The reasons are partially technical, but for the most part economic. In short, capitalism ruins computing once again.

To get there, though, we need to start a ways back, with the genesis of business computing.

Early computers were generally not applied to "data storage" tasks. A simple explanation is that storage technology developed somewhat behind computing technologies; early computers, over a reasonable period of time, could process more data than they could store. This is where much of the concept of a "computer operator" comes from: the need to more or less continuously feed new data to the computer, retrieved from paper files or prepared (e.g. on punched cards) on demand.

As the state of storage changed, devices included simple, low-capacity types of solid state memory such as core memory, and higher capacity media such as paper or magnetic tape. Core memory was random access, but very expensive. Tape was relatively inexpensive on a capacity basis, but it was extremely inefficient to access it in a nonlinear (e.g. random) fashion. This is essentially the origin of mainframe computers being heavily based around batch processing: for efficiency purposes, data needed to be processed in large volumes, in fixed order, simply to facilitate the loading of tapes.

The ability to efficiently use a "database" as we think of them today effectively required a random-access storage device of fairly high capacity, say, a multi-MB hard drive (or more eccentrically, and more briefly, something like a film or tape magazine).

Reasonably large capacity hard disk drives were available by the '60s, but were enormously expensive and just, well, enormous. Still, these storage devices basically created the modern concept of a "database:" a set of data which could be retrieved not just in linear order but also arbitrarily based on various selection criteria.

As a direct result of these devices, IBM researcher E. F. Codd published a paper in 1970 describing a formalized approach to the storage and retrieval of complex-structured data on large "shared data banks." Codd called his system "relational," and described the primary features seen in most modern databases. Although it was somewhat poorly received at the time (likely primarily due to the difficulty of implementing it on existing hardware), by the '90s the concept of a relational database had become so popular that it was essentially assumed that any "database" was relational in nature, and could be queried by SQL or something similar to it.
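To make "retrieved arbitrarily based on various selection criteria" concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table, columns, and records are invented for illustration and have nothing to do with Codd's paper or any particular product; the point is only that the query names what it wants rather than where it sits in the file.

```python
import sqlite3

def overdue_customers(rows, min_balance):
    """Return names of customers owing at least min_balance, sorted by name."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.execute("CREATE TABLE accounts (name TEXT, city TEXT, balance REAL)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?, ?)", rows)
    cur = conn.execute(
        "SELECT name FROM accounts WHERE balance >= ? ORDER BY name",
        (min_balance,),
    )
    names = [name for (name,) in cur]
    conn.close()
    return names

# Hypothetical records, stored in no particular order
ROWS = [
    ("Acme Supply", "Albuquerque", 120.0),
    ("Zia Freight", "Santa Fe", 0.0),
    ("Basin Tool", "Farmington", 75.5),
]

print(overdue_customers(ROWS, 50))  # ['Acme Supply', 'Basin Tool']
```

On tape, answering the same question would mean reading every record in order; here the selection criteria are the whole interface.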

A major factor in the rising popularity of databases was the decreasing cost of storage, which encouraged uses of computers that required this kind of flexible, structured data storage. By the end of the 1980s, hard disk drives became a common option on PCs, introducing the ingredients for a database system to the consumer market.

This represented, to a degree which I do not wish to understate, a democratization of the database. Nearly as soon as the computers and storage became available, it was a widespread assumption that computer users of all types would have a use for databases, from the home to the large enterprise. Because most computer users did not have the desire to learn a programming language and environment in depth, this created a market for a software genre almost forgotten today: the desktop database.

I hesitate to make any claims of "first," but an early and very prominent desktop database solution was dBase II (they called the first version II, a particularly strong form of the Xbox 360 move) from Ashton-Tate. dBase was released around 1980, and within a few years the field proliferated. FoxPro (actually a variant of dBase) and Paradox were other major entrants from the same time period that may be familiar to older readers.

dBase was initially offered on CP/M, which was a popular platform at the time (and one that was influential on the design of DOS), but was ported to DOS (of both Microsoft and IBM variants) and Apple II, the other significant platforms of the era.

Let's consider the features of dBase, which was typical of these early desktop database products. dBase was textmode software, and while it provided tabular (or loosely "graphical") views, it was primarily what we would now call a REPL for the dBase programming language. The dBase language was fairly flexible but also intended to be simple enough for end-users to learn, so that they could write and modify their own dBase programs---this was the entire point of the software, to make custom databases accessible to non-engineers.

The dBase language was similar to SQL but added additional interactive prompting and operations for ease of use. Ted Leath provides a reasonably complex example dBase program on his website.

It wasn't necessarily expected, though, that the dBase language and shell would be used on an ongoing basis. Instead, dBase shipped with tools called ASSIST and APPGEN. The purpose of these tools was to offer a more user-friendly interface to a dBase database. ASSIST was a sort of general-purpose client to the database for querying and data management, while APPGEN allowed for the creation of forms, queries, and reports linked by a menu system---basically the creation of a CRUD app.

In a way, the combination of dBase and APPGEN is thus a way to create common CRUD applications without the need for "programming" in its form at the time. This capability is referred to as Rapid Application Development (RAD), and RAD and desktop databases are two peas in a pod. The line between the two has become significantly blurred, and all desktop databases offer at least basic RAD capabilities. More sophisticated options were capable of generating multiple client applications which could operate over the network.

As I mentioned, there are many of these. A brief listing that I assembled, based mostly on Wikipedia with some other sources, includes: DataEase, Paradox, dBase, FoxPro, Kexi, Approach, Access, R:Base, OMNIS, StarOffice/OpenOffice/LibreOffice/NeoOffice Base (delightfully also called Starbase), PowerBuilder, FileMaker, and I'm sure at least a dozen more. These include some entrants from major brands recognizable today, such as Access developed by Microsoft, FileMaker acquired by Apple, and Approach acquired by IBM.

These products were highly successful in their time. dBase propelled Ashton-Tate to the top of the software industry, alongside IBM, in the 1980s. FileMaker has been hugely influential in Apple business circles. Access was the core of many small businesses for over a decade. It's easy to see why: desktop databases, and their companion of RAD, truly made the (record-keeping) power of computers available to the masses by empowering users to develop their own applications.

You didn't buy an inventory, invoicing, customer management, or other solution and then conform your business practices to it. Instead, you developed your own custom application that fit your needs exactly. The development of these database applications required some skill, but it was easier to acquire than general-purpose programming, especially in the '90s and '00s as desktop databases made the transition to GUI programs with extensive user assistance. The expectation, and in many cases the reality, was that a business clerk could implement a desktop database solution to their record-keeping use case with only a fairly brief study of the desktop database's manual... no coding bootcamp required.

Nonetheless, a professional industry flowered around these products with many third-party consultants, integrators, user groups, and conferences. Many of these products became so deeply integrated into their use-cases that they survive today, now the critical core of a legacy system. Paradox, for example, has become part of the WordPerfect suite and remains in heavy use in WordPerfect holdout industries such as law and legislation.

And yet... desktop databases are all but gone today. Many of these products are still maintained, particularly the more recent entrants such as Kexi, and there is a small set of modern RAD solutions such as Zoho Creator. All in all, though, the desktop database industry has entirely collapsed since the early '00s. Desktop databases are typically viewed today as legacy artifacts, a sign of poor engineering and extensive technical debt. Far from democratizing, they are seen as constraining.

What changed?

I posit that the decline of desktop databases reflects a larger shift in the software industry: broadly speaking, an increase in profit motive, and a decrease in ambition.

In the early days of computing, and extending well into the '90s in the correct niches, there was a view that computers would solve problems in the most general case. From Rear Admiral Hopper's era of "automatic programming" to "no-code" solutions in the '00s, there was a strong ambition that the field of software engineering existed only as a stopgap measure until "artificial intelligence" was developed to such a degree that users were fully empowered to create their own solutions to their own problems. Computers were infinitely flexible, and with a level of skill decreasing every day they could be made to perform any function.

Today, computers are not general-purpose problem-solving machines ready for the whims of any user. They are merely a platform to deliver "apps," "SAAS," and in general special-purpose solutions delivered on a subscription model.

The first shift is economic: the reality of desktop databases is that they were difficult to monetize to modern standards. After a one-time purchase of the software, users could develop an unlimited number of solutions without any added cost. In a way, the marketers of desktop databases sealed their own fate by selling, for a fixed fee, the ability to not be dependent on the software industry going forward. While that independence was never fully achieved, it was at least the ideal.

The second shift is cultural: the mid-century to the '90s was a heady time in computer science when the goal was flexibility and generality. To be somewhat cynical (not that that is new), the goal of the '10s and '20s is monetization and engagement. Successful software today must be prescriptive, rather than general, in order to direct users to the behaviors which are most readily converted into a commercial advantage for the developer.

Perhaps more deeply though, software engineers have given up.

The reality is that generality is hard. I am, hopefully obviously, presenting a very rosy view of the desktop database. In practice, while these solutions were powerful and flexible, they were perhaps too flexible and often led to messy applications which were unreliable and difficult to maintain. Part of this was due to limitations in the applications, part of it was due to the inherent challenge of untrained users who were effectively developing software without a practical or academic knowledge of computer applications (although one could argue that this sentence describes many software engineers today...).

One might think that this is one of the most important challenges that a computer scientist, software engineer, coder, etc. could take on. What needs to be done, what needs to be changed to make computers truly the tools of their owners? Truly a flexible, general device able to take on any challenge, as IBM marketing promised in the '50s?

But, alas, these problems are hard, and they are hard in a way that is not especially profitable. We are, after all, talking about engineering software vendors entirely out of the problem.

The result is that the few RAD solutions that are under active development today are subscription-based and usage-priced, effectively cloud platforms. Even despite this, they are generally unsuccessful. Yet, the desire for a generalized desktop database remains an especially strong one among business computer users. Virtually everyone who has worked in IT or software in an established business environment has seen the "Excel monstrosity," a tabular data file prepared in spreadsheet software which is trying so very hard to be a generalized RDBMS in a tool not originally intended for it.

As professionals, we often mock these fallen creations of a sadistic mind as evidence of users run amok, of the evils of an untrained person empowered by a keyboard. We've all done it, certainly I have; making fun of a person who has created a twenty-sheet, instruction-laden Excel workbook to solve a problem that clearly should have been solved with software, developed by someone with a computer science degree or at least a certificate from a four-week fly-by-night NodeJS bootcamp.

And yet, when we do this, we are mocking users for employing computers as they were once intended: general-purpose.

I hesitate to sound like RMS, particularly considering what I wrote a few messages ago. But, as I said, he is worthy of respect in some regards. Despite his inconsistency, perhaps we can learn something from his view of software as user-empowering versus user-subjugating. Desktop databases empowered users. Do applications today empower users?

The software industry, I contend, has fallen from grace. It is hard to place when this change occurred, because it happened slowly and by degrees, but it seems to me like sometime during the late '90s to early '00s the software industry fundamentally gave up. Interest in solving problems was abandoned and replaced by a drive to engage users, a vague term that is nearly always interpreted in a way that raises fundamental ethical concerns. Computing is no longer a lofty field engaged in the salvation of mankind; it is a field of mechanical labor engaged in the conversion of people into money.

In short, capitalism ruins computing once again.


If I have a manifesto at the moment, this is it. I don't mean to entirely degrade the modern software industry, I mean, I work in it. Certainly there are many people today working on software that solves generalized problems for any user. But if you really think about it, on the whole, do you feel that the modern software industry is oriented towards the enablement of all computer users, or towards the exploitation of those users?

There are many ways in which this change has occurred, and here I have focused on just one minute corner of the shift in the software industry. But we can see the same trend in many other places: from a distributed to centralized internet, from open to closed platforms, from up-front to subscription, from general-purpose to "app store." And yet, after it all, there is still "dBase 2019... for optimized productivity!"

[1] I found this amazing quote courtesy of some Wikipedia editor, but just searching a newspaper archive for "artificial intelligence" in the 1970-1990 timeframe is a ton of fun and will probably lead to a post one day.


>>> 2021-03-27 the actual osi model

I have said before that I believe that teaching modern students the OSI model as an approach to networking is a fundamental mistake that makes the concepts less clear rather than more. The major reason for this is simple: the OSI model was prescriptive of a specific network stack designed alongside it, and that network stack is not the one we use today. In fact, the TCP/IP stack we use today was intentionally designed differently from the OSI model for practical reasons.

Teaching students about TCP/IP using the OSI model is like teaching students about small engine repair using a chart of the Wankel cycle. It's nonsensical to the point of farce. The OSI model is not some "ideal" model of networking, it is not a "gold standard" or even a "useful reference." It's the architecture of a specific network stack that failed to gain significant real-world adoption.

Well, "failed to gain real-world adoption" is one of my favorite things, so today we're going to talk about the OSI model and the OSI network stack.

The story of the OSI model basically starts in the late '70s with a project between various standards committees (prominently ISO) to create a standardized network stack which could be used to interconnect various systems. An Open Systems Interconnection model, if you will.

This time period was the infancy of computer networking, and most computer networks operated on vendor-specific protocols that were basically overgrown versions of protocols designed to connect terminals to mainframes. The IBM Systems Network Architecture was perhaps the most prominent of these, but there were more of them than you could easily list.

Standardized network protocols that could be implemented across different computer architectures were relatively immature. X.25 was the most popular, and continues to be used as a teaching example today because it is simple and easy to understand. However, X.25 had significant limitations, and was married to the telephone network in uncomfortable ways (both in that it relied on leased lines and in that X.25 was in many ways designed as a direct analog to the telephone network). X.25 was not good enough, and just as soon as it gained market share people realized they needed something that was more powerful, but also not tied to a vendor.

The OSI network stack was designed in a very theory-first way. That is, the OSI conceptual model of seven layers was mostly designed before the actual protocols that implemented those layers. This puts the OSI model in an unusual position of having always, from the very start, been divorced from actual working computer networks. And while this is a matter of opinion, I believe the OSI model to have been severely over-engineered from that beginning.

Unlike most practical computer networks which aim to provide a simple channel with few bells and whistles, the OSI model attempted to encode just about every aspect of what we now consider the "application" into the actual protocols. This results in the OSI model's top four layers, which today are all essentially "Application" spelled in various strange ways. Through a critical eye, this could be viewed as a somewhat severe example of design over function. History had, even by this time, shown that what was needed from computer networks was usually ease of implementation and ease of use, not power.

Unfortunately, the OSI model, as designed, was powerful to a fault.

From the modern perspective, this might not be entirely obvious, but only because most CS students have been trained to simply ignore a large portion of the model. Remember, the OSI model is:

  1. Please (Physical)
  2. Do (Data Link)
  3. Not (Network)
  4. Throw (Transport)
  5. Sausage (Session)
  6. Pizza (Presentation)
  7. Away (Application)

Before we get too much into the details of these layers, let's remember what a layer is. The fundamental concept that the OSI model is often used to introduce is the concept that I call network abstraction: each layer interacts only with the layer below it, and by doing so provides a service to the layer above it.

Each layer has a constrained area of concern, and the protocol definitions create a contract which defines the behavior of each layer. Through this sort of rigid, enforced abstraction, we gain flexibility: the layers become "mix and match." As long as layers implement the correct interface for above and expect the correct interface from below, we can use any implementation of a given layer that we want.
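As a rough sketch of "mix and match," consider the toy stack below. Every class name here is made up for illustration and none of it corresponds to a real protocol; the only thing it demonstrates is that the top layer neither knows nor cares which framing layer sits beneath it, because both expose the same send() contract.

```python
class Wire:
    """Lowest layer: just records the bytes that would be transmitted."""
    def __init__(self):
        self.sent = []

    def send(self, data: bytes) -> None:
        self.sent.append(data)

class LengthFramed:
    """Framing layer, variant A: prefix each message with a 2-byte length."""
    def __init__(self, lower):
        self.lower = lower

    def send(self, data: bytes) -> None:
        self.lower.send(len(data).to_bytes(2, "big") + data)

class NewlineFramed:
    """Framing layer, variant B: terminate each message with a newline."""
    def __init__(self, lower):
        self.lower = lower

    def send(self, data: bytes) -> None:
        self.lower.send(data + b"\n")

class TextMessages:
    """Top layer: turns text into bytes and talks only to the layer below."""
    def __init__(self, lower):
        self.lower = lower

    def send(self, text: str) -> None:
        self.lower.send(text.encode("utf-8"))

def transmit(framing_cls, text):
    """Build a three-layer stack with the chosen framing and send one message."""
    wire = Wire()
    TextMessages(framing_cls(wire)).send(text)
    return wire.sent

print(transmit(LengthFramed, "hi"))   # [b'\x00\x02hi']
print(transmit(NewlineFramed, "hi"))  # [b'hi\n']
```

Swapping the middle layer changes what lands on the wire but requires no change to the layer above it, which is the entire promise of the layered approach.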

This matters in practice. Consider the situation of TCP and UDP: TCP and UDP can both be dropped on top of IP because they both expect the same capabilities from the layer under them. Moreover, to a surprising extent TCP and UDP are interchangeable. While they provide different guarantees, the interface for the two is largely the same, and so switching which of the two software uses is trivial (in the simple case where we do not require the guarantees which TCP provides) [1].
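A small sketch of how similar the two interfaces are, using the standard Berkeley sockets API as exposed by Python. The send_message and demo helpers are my own names, and the whole exchange stays on the loopback address; this is the simple case only, with none of the framing or retry logic that real software would need.

```python
import socket

def send_message(kind, address, payload):
    """Send payload to address; kind is SOCK_STREAM (TCP) or SOCK_DGRAM (UDP)."""
    with socket.socket(socket.AF_INET, kind) as sock:
        sock.connect(address)  # for UDP this merely sets the default peer
        sock.sendall(payload)

def demo():
    """Deliver the same bytes over UDP and then TCP on the loopback interface."""
    results = {}

    # UDP receiver: bind to any free port, then send to ourselves.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as udp_rx:
        udp_rx.bind(("127.0.0.1", 0))
        udp_rx.settimeout(2)
        send_message(socket.SOCK_DGRAM, udp_rx.getsockname(), b"hello")
        results["udp"] = udp_rx.recv(1024)

    # TCP receiver: the client's connection waits in the listen backlog
    # until we accept it, so no threads are needed for this tiny exchange.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as tcp_rx:
        tcp_rx.bind(("127.0.0.1", 0))
        tcp_rx.listen(1)
        tcp_rx.settimeout(2)
        send_message(socket.SOCK_STREAM, tcp_rx.getsockname(), b"hello")
        conn, _ = tcp_rx.accept()
        with conn:
            conn.settimeout(2)
            results["tcp"] = conn.recv(1024)

    return results

print(demo())
```

The sending side differs by a single constant. The receiving side differs more (TCP has to listen and accept), which is a fair reminder that "largely the same" is not "identical."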

So, having hopefully grasped this central concept of networking, let's apply it to the OSI model, with which it was likely originally taught to us. The presentation layer depends on the session layer, and provides services to the application layer. That's, uhh, cool. Wikipedia suggests that serializing data structures is an example of something which might occur at this layer. But this sort of presupposes that the session layer does not require any high-level data structures, since it functions without the use of the presentation layer. It also seems to suggest that presentation is somehow dependent on session, which makes little sense in the context of serialization.

In fact, it's hard to see how this "fundamental concept" of the presentation layer applies to computing systems because it does not. Session and presentation are both "vestigial layers" which were not implemented in the IP stack, and so they have no real modern equivalent. Most teaching of the session and presentation layers consists of instructors grasping for examples---I have heard of things like CHAP as the session layer---which undermine the point they are making by violating the actual fundamental concept of layered networking.

Now that we all agree that the OSI model is garbage which does not represent the real world, let's look at the world it does represent: the OSI protocols, which were in fact designed explicitly as an implementation of the OSI model.

Layer 1

No one really defines layer 1, the physical layer, because it is generally a constraint on the design of the protocols rather than something that anyone gets to design intentionally. The physical layer, in the context of the OSI stack, could generally be assumed to be a simple serial channel like a leased telephone line, using some type of line coding and other details which are not really of interest to the network programmer.

Layer 2

Layer 2, the data link layer, provides the most fundamental networking features. Today we often talk of layer 2 as being further subdivided into the MAC (media access control) and LLC (logical link control) sublayers, but to a large extent this is simply a result of trying to retcon the OSI model onto modern network stacks, and the differentiation between MAC and LLC is not something which was contemplated by the actual designers of the OSI model.

The data link layer is implemented primarily in the form of X.212. In a major change from what you might expect if you were taught the IP stack via the OSI model, the OSI data link layer and thus X.212 provides reliability features including checksumming and resending. Optionally, it provides guaranteed order of delivery. X.212 further provides a quality of service capability.

Specifically related to order of delivery, X.212 provides a connection-oriented mode and a connectionless mode. This is very similar to (but not quite the same as) the difference between TCP and UDP, but we are still only talking about layer 2! Keep in mind here that layer 2 is essentially defined within the context of a specific network link, and so these features are in place to contend with unreliable links or links that are themselves implemented on other high-level protocols (e.g. tunnels), and not to handle routed networks.
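For a feel of what "checksumming and resending" over a single link involves, here is a toy stop-and-wait sketch. To be clear, this is not X.212 or any real data link protocol: the frame layout, the use of CRC-32, and the flaky_link simulator are all assumptions of mine, chosen only to show the general idea of detecting a damaged frame and sending it again.

```python
import zlib

def make_frame(payload: bytes) -> bytes:
    """Append a CRC-32 so the receiver can detect corruption."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def check_frame(frame: bytes):
    """Return the payload if the checksum matches, otherwise None."""
    payload, received = frame[:-4], frame[-4:]
    if zlib.crc32(payload).to_bytes(4, "big") == received:
        return payload
    return None

def send_reliably(payload, link, max_attempts=5):
    """Stop-and-wait: resend the frame until the far end accepts it.

    `link` is a hypothetical callable simulating the medium: it takes a
    frame and returns whatever bytes arrive at the other side.
    Returns (delivered payload, number of attempts used).
    """
    frame = make_frame(payload)
    for attempt in range(1, max_attempts + 1):
        delivered = check_frame(link(frame))
        if delivered is not None:  # a real receiver would acknowledge here
            return delivered, attempt
    raise TimeoutError("link too unreliable")

def flaky_link(bad_attempts):
    """Build a link that flips one bit on the first `bad_attempts` frames."""
    state = {"count": 0}

    def link(frame):
        state["count"] += 1
        if state["count"] <= bad_attempts:
            return bytes([frame[0] ^ 0x01]) + frame[1:]
        return frame

    return link

print(send_reliably(b"invoice 42", flaky_link(2)))  # (b'invoice 42', 3)
```

Note that all of this happens between two ends of one link. Nothing here knows about routing, which is exactly why it can live at layer 2.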

X.212 addressing is basically unspecified, because the expectation is that addresses used at layer 2 will be ephemeral and specific to the media in use. Because layer 2 traffic cannot directly span network segments, there is no need for any sort of standardized addressing.

As with most layers, there are alternative implementations available for the data link layer, including implementations that transport it over other protocols.

Layer 3

OSI layer 3, the network layer, provides a more sophisticated service which is capable of moving bytes between hosts with basically the same semantics we expect in the IP world. Layer 3 is available in connection oriented and connectionless modes, much like layer 2, but now provides these services across a routed network.

The two typical layer 3 protocols are the Connectionless Network Protocol (CLNP) and the Connection-Oriented Network Protocol (CONP), which are basically exactly what they sound like.

OSI addressing at these layers is based on Network Service Access Point addresses, or NSAPs. Or, well, it's better to say that NSAPs are the current standard for addressing. In fact, the protocols are somewhat flexible and historically other schemes were used but have been largely replaced by NSAP. NSAP addresses are 20 bytes in length and have no particular structure, although there are various norms for allocation of NSAPs that include embedding of IP addresses. NSAPs do not include routing information as is the case with IP addresses, and so the process of routing traffic to a given NSAP includes the "translation" of NSAPs into more detailed addressing types which may be dependent on the layer 2 in use. All in all, OSI addressing is confusing and in modern use depends very much on the details of the specific application.

Layer 4

Layer 4, the transport layer, adds additional features over layer 3 including multiplexing of multiple streams, error recovery, flow control, and connection management (e.g. retries and reconnects). There are a variety of defined layer 4 protocol classes called TP0 through TP4, which vary in the features that they offer in ways that do not entirely make sense from the modern perspective.

Because layer 4 offers general messaging features, it is perhaps the closest equivalent to the TCP and UDP protocols in the IP stack, but of course this is a confusing claim since there are many elements of UDP and TCP found at lower levels as well.

The selection of one of the five transport layer classes depends basically on application requirements and can range from very high reliability (TP4) to low latency given unreliable network conditions, with relaxed guarantees (TP0 or TP1).

Layer 5

The session layer adds management of associations between two hosts and the status of the connection between them. This is a bit confusing because the IP model does not have an equivalent, but it might help to know that, in the OSI model, connections are "closed" at the session layer (which causes actions which cascade down to the lower layers). The OSI session layer, defined by X.215, thus serves some of the roles we associate with link setup.

More interestingly, though, the session layer is responsible for very high-level handling of significant network errors by gracefully restarting a network dialog. This is not a capability that the IP stack offers unless it is explicitly included in an application.

The session layer manages conversations through a token mechanism, which is somewhat similar to that of token-ring networking or the general "talking stick" concept. Multiple tokens may be in use, allowing for half-duplex or duplex interactions between hosts.
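The "talking stick" idea is easy to sketch. The Dialog class below is purely hypothetical and does not model the actual X.215 service primitives; it only shows how a single token yields a half-duplex conversation in which the right to send must be handed over explicitly.

```python
class Dialog:
    """Half-duplex dialog between two parties: only the token holder may send."""

    def __init__(self, first_holder, other):
        self.holder = first_holder
        self.other = other
        self.log = []

    def send(self, sender, message):
        """Record a message, refusing anyone who does not hold the token."""
        if sender != self.holder:
            raise PermissionError(f"{sender} does not hold the token")
        self.log.append((sender, message))

    def pass_token(self):
        """Hand the token (and with it the right to send) to the other party."""
        self.holder, self.other = self.other, self.holder

dialog = Dialog("host-a", "host-b")
dialog.send("host-a", "request")
dialog.pass_token()
dialog.send("host-b", "response")
print(dialog.log)  # [('host-a', 'request'), ('host-b', 'response')]
```

Giving each direction its own token would permit both sides to send at once, which is roughly how multiple tokens get you from half-duplex to duplex.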

Like basically every layer below it, Layer 5 comes in connection-oriented and connectionless flavors. The connectionless flavor is particularly important since it provides powerful features for session management without the requirement for an underlying prepared circuit---something which is likewise often implemented at the application layer over UDP.

Layer 6

Layer 6, the presentation layer, is another which does not exist in the IP stack. The session layer is a bit hard to understand from the view of the IP stack, but the presentation layer is even stranger.

The basic concept is this: applications should interact using abstract representations rather than actual wire-encoded values. These abstract values can then be translated to actual wire values based on the capabilities of the underlying network.

Why is this even something we want? Well, it's important to remember that this network stack was developed in a time period when text encoding was even more poorly standardized than now, and when numeric representation was not especially well standardized either (with various types of BCD in common use).

So, for two systems to be able to reliably communicate, they must establish an acceptable way to represent data values... and it is likely that a degree of translation will be required. The OSI presentation layer, defined by X.216, nominally adjusts for these issues by the use of an abstract representation transformed to and from a network representation. There are actually a number of modern technologies that are similar in concept, but they are seldom viewed as network layers [2].
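Here is a minimal sketch of the concept, assuming just two made-up "transfer syntaxes" for a small unsigned integer: plain big-endian binary and packed BCD. The function names and the ENCODERS table are mine, not anything from X.216; the application hands over the abstract value 1985 and never sees which bytes actually go out.

```python
def encode_binary(value: int) -> bytes:
    """Two-byte big-endian unsigned integer."""
    return value.to_bytes(2, "big")

def encode_packed_bcd(value: int) -> bytes:
    """Packed BCD: one decimal digit per 4-bit nibble, four digits in total."""
    if not 0 <= value <= 9999:
        raise ValueError("value must fit in four decimal digits")
    digits = f"{value:04d}"
    return bytes(int(digits[i]) << 4 | int(digits[i + 1]) for i in (0, 2))

def decode_packed_bcd(data: bytes) -> int:
    """Inverse of encode_packed_bcd."""
    return int("".join(f"{b >> 4}{b & 0x0F}" for b in data))

# Hypothetical table of the transfer syntaxes both peers might agree on
ENCODERS = {"binary": encode_binary, "bcd": encode_packed_bcd}

def present(value: int, syntax: str) -> bytes:
    """Translate an abstract value into the negotiated wire representation."""
    return ENCODERS[syntax](value)

print(present(1985, "binary").hex())  # 07c1
print(present(1985, "bcd").hex())     # 1985
```

The same abstract value produces two entirely different byte sequences, and a receiver that guesses the wrong one gets nonsense, which is the problem the presentation layer was meant to make go away.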

Layer 7

Finally, the application layer is actually where, you know, things are done. While the application layer is necessarily flexible and not strongly defined, the OSI stack nonetheless comes with a generous number of defined application layer protocols. While it's not particularly interesting to dig into these all, it is useful to note a couple that remain important today.

X.500, the directory service application protocol, can be considered the grandparent of LDAP. If you think, like all sane people, that LDAP is frustratingly complicated, boy you will love X.500. It was basically too complex to live, but too important to die, and so it was pared down to the "lightweight" LDAP.

Although X.500 failed to gain widespread adoption, one component of X.500 lives on today, nearly intact: X.509, which describes the cryptographic certificate feature of the X.500 ecosystem. The X.509 certificate format and concepts are directly used today by TLS and other cryptographic implementations, including its string representations (length-prefixed) which were a decent choice at the time but are now quite strange considering the complete victory of null-terminated representations.
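To show what "length-prefixed" means in practice, here is a small sketch of how a string is laid out under DER, the ASN.1 encoding used for certificates: a tag byte, a length, then the raw bytes. The helper names are my own and this handles only a single UTF8String, not a certificate; the c_string function is included just for contrast.

```python
def der_length(n: int) -> bytes:
    """DER definite length: one byte below 128, otherwise a count-prefixed form."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body

def der_utf8_string(text: str) -> bytes:
    """Encode text as an ASN.1 UTF8String (universal tag 0x0C) in DER."""
    data = text.encode("utf-8")
    return b"\x0c" + der_length(len(data)) + data

def c_string(text: str) -> bytes:
    """The null-terminated equivalent, for comparison."""
    return text.encode("utf-8") + b"\x00"

print(der_utf8_string("Example CA").hex())
print(c_string("Example CA").hex())
```

One practical difference: the length-prefixed form can carry a zero byte in the middle of the string without complaint, while code that treats the same bytes as null-terminated will stop reading early.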

X.400, the messaging service protocol, is basically the OSI version of email. As you would expect, it is significantly more powerful and complicated than email as we know it today. For a long time, Microsoft Exchange was better described as an X.400 implementation than an email implementation, which is part of why it is a frightening monstrosity. The other part is everything about modern email.

And that is a tour of the OSI network protocols. I could go into quite a bit more depth, but I have both a limited budget to buy ISO standards and a limited attention span to read the ones I could get copies of. If you are interested, though, the OSI stack protocols are all well defined by ITU standards available in the US from ISO or from our Estonian friends for much cheaper. For a fun academic project, implement them: you will be perhaps the only human alive who truly understands the OSI model ramble your data communications professor indulged in.

[1] Contrast SCTP, which provides an interface which is significantly different from the UDP and TCP bytestream, due to features such as multiple streams. Not unrelatedly, SCTP has never been successful on the internet.

[2] I think that this is actually a clue to the significant limitations of the OSI model for teaching. The OSI model tends to create a perception that there is one "fixed" set of layers with specified functions, when in actual modern practice it is very common to have multiple effective layers of what we would call application protocols.


>>> 2021-03-24 RMS

And now, a brief diversion on current events, which will feature a great deal of opinion. I will limit my remarks somewhat because I do not want to be too harsh on RMS and because I do not want to put myself out there too much. However, I have had a personal experience that was very much formative of my opinion on the issue and I felt was worth sharing.

I once spent three days with Richard M. Stallman. I was the person his assistant would frequently call and ask to speak with him, since he objects to carrying a phone. I advertised the opportunity to meet him throughout the state. I talked him up as a philosophical leader in intellectual property and authorship to people outside of the computing field. I had FSF stickers all over my laptops. And a thermostat.

I have complex feelings about the man. On the one hand, I do not feel him to be an acceptable leader of the FSF or the movement. On the other hand, I respect him for having gotten us to where we are today. I believe that the human mind is large enough to hold both of these ideas at once, and that the good of the world we live in and the people we live with require us to do so.

I wholeheartedly supported RMS's original removal from the Free Software Foundation, and I am concerned about and opposed to the FSF's decision to once again give him a seat on the board. My reasons for this do indeed relate to RMS's track record of alarming opinions on sexual conduct and alleged history of sexual harassment, but are not limited to these.

Just as much, I am concerned about his competence. It is my opinion, and the opinion of a great many other people who I have discussed the matter with, that RMS has been a net negative for the Free Software Foundation and the larger Free Software movement for some time. He has repeatedly made questionable decisions in the leadership of the Gnu project and the FSF. In personal remarks to myself and others, he has shown a startling lack of awareness of or interest in contemporary issues in technology and privacy. He has persistently created the appearance, if not the reality, that he holds problematic and serious ethical views on issues of women and children. He is, not to put too fine of a point on it, constantly an asshole to everyone.

When hosting RMS, I occasionally saw "moments of lucidity," where RMS would make a surprisingly persuasive argument, ask an insightful question, or just tell a good story about his past accomplishments. But these moments were lost in the frustration, difficulty, and alarm of having become responsible for the leader of the free software movement who had, by most appearances, absolutely no ability to steward that movement in any positive direction. I grimaced as he accused an erstwhile supporter, in front of an audience, of being a traitor and undermining the cause by having contributed to FreeBSD. I did my best to steer him away when he started down a surprisingly racist path in the presence of many members of the race involved.

I have a great deal of respect for RMS. His past accomplishments, technically and philosophically, cannot be overstated and will have a lasting influence on the landscape of technology. However, we cannot allow his history to excuse the present. While RMS deserves our appreciation, I do not believe that he deserves a leadership role. He has demonstrated over a decade that he is more of a liability than an asset to the Free Software Foundation.

The Free Software Foundation itself has slid into complete irrelevancy, and in many circles RMS has become not a thought leader but a joke. Much of this is the result of conscious decisions made by RMS to exclude rather than include, to lash out in anger rather than make common ground, and to ignore the last twenty years of development not only in technology but also in the problems that technology creates. Free software, as a movement, is more relevant now than ever, but RMS has chosen to remain an artifact of a past era.

RMS did great things, and we should remember that. But we do not owe him a seat on the board or the title of Chief Gnuisance. RMS is not a god; he is not a prophet. He is a leader, and must be held to the same standards as every other. Indeed, the lofty ideology of the Free Software movement would seem to require that he be held to even higher standards than most. His position of leadership in the FSF, the Gnu project, and elsewhere, is contingent on his ongoing ability to give those projects the direction and inspiration they require to succeed. Stallman has remained in power over these projects for many years now only due to his obstinacy and status as an icon. Neither of these are justifications for those positions.

In the end, the old sometimes needs to make way for the new. RMS has done a great deal, but now, if not ten or more years ago, is the time for him to step aside. I hope that he continues to answer all of our emails and that his opinion continues to merit serious consideration, but he should not be viewed as the leader of any modern project or movement. He has already made the decision not to adapt to that role.

While I am not necessarily at the point of signing any open letters, I further agree that the decision of the FSF to accept RMS back into a position of leadership creates serious questions about the competency of the FSF board. Regardless of your opinions on the man, RMS's acceptance back into the fold was obviously going to create tremendous controversy and create the appearance that the FSF does not care about those who have made allegations against him. This is on top of the FSF's already poor track record of much noise and few results, which already suggested that a change in leadership may be needed.

There has been, and is, an ongoing problem with free and open-source software projects tolerating and even lauding leaders who are abusive, offensive, and a frequent source of allegations of misconduct. If any deep conspiracy is undermining the legitimacy of the Free Software movement, it is its ongoing association with problematic, aggressive leadership at multiple levels. If the Free Software Foundation and the larger movement are to establish themselves as beacons of good, they must show the ability to learn and improve. This includes correcting intolerant, offensive, and mean behavior, and evolving to meet the challenges of the modern world. The FSF has done neither. Instead, it has buried its collective head in the sand and then run another round of fundraising.

Finally, I fully appreciate the concern over neurodiversity. While RMS's diagnosis as autistic is, as far as I can tell, completely an assumption by his supporters and thus already somewhat problematic, I do not believe that any such diagnosis would be a full excuse for his behavior. Assuming that RMS struggles to relate to others due to an underlying condition, the FSF and the communities he stewards need to step up to support him and correct his behavior. They have shown little to no interest in doing so, but rather either ignored or excused the problems he creates. Put another way, perhaps by no fault of RMS himself, the fact that the FSF continues to consider him a leader demonstrates that he should not be in that position. The FSF is not able to adequately support him in that role, instead sending him out to the world to offend a few more people and turn at least a dozen more against him. This is not kind to RMS or anyone else. It is a disservice to RMS, as a person, to continue to support him and enable him in digging his own hole.

All of this is, of course, merely my opinion. You are free to disagree, but if you do, please be polite. I'm ostensibly on vacation. Later this week, I plan to write about something more interesting and less personally uncomfortable.


>>> 2021-03-16 can I get your number domain

For a little while I've been off my core topic of computers and the problems with them, mostly because I started talking about telephones and once that happens I am unstoppable. But I will fight through the withdrawal symptoms to talk about something other than telephone numbers, which is DNS.

And also telephones.

Starting in the late 1990s and continuing into the 2000s, there was a lot of hype around "unified communications." This unification was the idea that telephones and computers would merge into devices which could serve the same applications. ISDN could be viewed as an early part of this vision, but much of "unified communications" focused on being able to send and receive instant messages and telephone calls between any two devices---computers, phones, etc.

This required bringing telephony technology into the realm of computer technology, thus spanning the substantial divide in technical architectures. Ironically this divide is getting narrower and narrower today due to all-IP and converged network trends, but not by any effort of the unified communications community which, in practice, has been largely limited to large enterprise deployments that work poorly. Consider Exhibit A, Microsoft Lynq, excuse me, Skype for Bidness.

Let's take ourselves back in time, though, to the halcyon days of the early 2000s when telephones and computers seemed ripe for unification.

One of the big challenges that unification of telephones and telephone-like computer applications (voice chat, video chat, etc) has long faced and still faces today is addressing. There is no unified addressing scheme between phones and computer software. Now, today, we've all resolved to just avoid this problem by accepting that it is ludicrous that you could use a service to contact someone who does not use the same service. Sure Facebook Messenger and the Instagram chat are operated by the same vendor and serve apparently identical purposes but, duh, of course you can't send messages between them, they are different apps.

Well, before this particular sort of defeatism completely set in, there was a desire to be able to create a single standard addressing system for telephones and computer applications.

The story starts more or less with E.164. E.164 is an ITU standardization of global telephone numbers. The standard itself is largely uninteresting, but for practical purposes it is pretty much sufficient to say that when you write a phone number that is prefixed by a +, followed by a country code, followed by a national phone number (called the nationally significant number or NSN), and the whole thing does not exceed 15 digits, you are writing the phone number in E.164 format. Although you should also leave out whatever extra formatting characters are conventional in your nation. So if you wanted to call Comcast, for example, the E.164 number would be +18002662278 [1], accepting also that the + has a tenuous connection to the E.164 standard but is still very useful, especially in countries without a fixed number length.
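The working definition above (a +, then at most 15 digits, no formatting characters) is easy to sketch in Python. This is only an illustration: the function name to_e164 is hypothetical, it assumes the input already carries its country code, and it makes no attempt to check that the country code is one that actually exists.

```python
import re

# A "+", a first digit that is not zero, then up to 14 more digits
# (15 digits total at most).
E164_PATTERN = re.compile(r"\+[1-9][0-9]{1,14}")


def to_e164(number: str) -> str:
    """Strip conventional formatting characters and check the result."""
    stripped = re.sub(r"[\s().-]", "", number)
    if not E164_PATTERN.fullmatch(stripped):
        raise ValueError(f"not an E.164 number: {number!r}")
    return stripped


print(to_e164("+1 (800) 266-2278"))  # prints +18002662278
```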

So from a technical perspective, we could say that telephones use E.164 addressing.

What kind of addressing do computers use, for messaging purposes? Well, there are warring standards for computer communications addressing ranging from the ICQ number to the email address to the tripcode. But none of that is really "unified," is it? What we, or at least the engineers of the early 2000s, view as a unified addressing scheme is the same one we use at the lower level of networking: DNS.

Perhaps you can see where this is going. The internet may not be a truck that you can just dump things in, but DNS is, so we'll just take E.164 addresses (phone numbers) and shove them into DNS!

This is one of the various and sundry uses of the internet's most mysterious TLD, .arpa. The .arpa TLD is officially called the Address and Routing Parameter Area, but more usefully I call it the DNS Junk Drawer. .arpa is most widely used for "reverse DNS," where IP addresses are shoved back into the name end of DNS, and so we can do phone numbers basically the same way.

Remember Comcast? +18002662278? We could also write that as 8.7.2.2.6.6.2.0.0.8.1.e164.arpa, if we were sadists. Every digit is its own level of the hierarchy because global phone numbers are not constrained to any reasonable hierarchy. The digits are reversed for the same reason they are reversed for reverse-DNS PTRs: DNS obviously orders the hierarchy the wrong way (we could say it is little-endian), when everything else does it the right way, so most forms of hierarchical addressing have to be reversed to fit into DNS [2].
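The digit reversal is mechanical enough to sketch in a few lines of Python. The enum_domain function is a made-up name for illustration; e164.arpa is the zone that ENUM uses for this purpose. For comparison, the standard library's ipaddress module does the equivalent reversal for reverse DNS.

```python
import ipaddress


def enum_domain(e164_number: str, suffix: str = "e164.arpa") -> str:
    """Turn an E.164 number into its ENUM name: one label per digit, reversed."""
    digits = [ch for ch in e164_number if ch.isdigit()]  # drops the "+"
    return ".".join(reversed(digits)) + "." + suffix


print(enum_domain("+18002662278"))
# prints 8.7.2.2.6.6.2.0.0.8.1.e164.arpa

# The same trick as applied to IP addresses for PTR lookups:
print(ipaddress.ip_address("192.0.2.1").reverse_pointer)
# prints 1.2.0.192.in-addr.arpa
```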

These telephone DNS records are typically expected to contain a record with the type NAPTR. NAPTR, generally speaking, is intended to map addresses (more properly URNs) to other types of addresses. For example, E.164 to SIP. So the NAPTR record, typically, would provide an address type of E2U+sip which indicates ENUM (E.164 number mapping) to the SIP protocol.

Fascinatingly, the actual payload of an NAPTR record is... a regular expression. The regular expression specifies a transformation (with capture groups and the whole nine yards) from the original address (the E.164 number) to the new address type. In theory, this allows optimization of NAPTR records at higher levels of the hierarchy if components of the original address are also used in the new address. This is one of many areas of DNS that are trying perhaps too hard to be clever.
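To give a feel for what that payload looks like, here is a rough Python sketch of applying a NAPTR regexp field to a phone number. The field is a delimiter character, a pattern, the delimiter, a replacement, the delimiter, and optional flags. Several things here are assumptions: the example record and example.com are invented, the function name is hypothetical, and Python's re module stands in for the POSIX extended regular expressions the standard actually calls for.

```python
import re


def apply_naptr_regexp(field: str, original: str):
    """Apply a NAPTR regexp field such as '!pattern!replacement!flags'.

    Returns the rewritten address, or None if the pattern does not match.
    """
    delimiter = field[0]  # the first character picks the delimiter
    parts = field[1:].split(delimiter)
    if len(parts) != 3:
        raise ValueError(f"malformed NAPTR regexp field: {field!r}")
    pattern, replacement, flags = parts
    re_flags = re.IGNORECASE if "i" in flags else 0
    match = re.search(pattern, original, re_flags)
    if match is None:
        return None
    # Fill backreferences like \1 in the replacement from the capture groups.
    return match.expand(replacement)


# Hypothetical record covering a whole block of numbers with one rule.
field = r"!^\+1800(.*)$!sip:\1@example.com!"
print(apply_naptr_regexp(field, "+18002662278"))
# prints sip:2662278@example.com
```

The capture group is what allows one record higher in the hierarchy to serve many numbers, which is the optimization described above.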

The freshly obtained SIP address has a domain component which can then be fed to the NAPTR system once again to discover where to find a SRV record which indicates where to actually connect.

As you might suspect, this system was never meaningfully used. A few countries reportedly actually operate something in their part of the E.164 ENUM architecture, but not on any meaningful basis.

Since that proposal for unifying telephone addressing with DNS was such a success, the industry is naturally doing the same thing again. An enterprising UK company called Telnames but doing business as Telnic or possibly the other way around, that point is oddly confusing, operates the "award winning" gTLD .tel [3].

The .tel TLD is so minor and poorly marketed that it's sort of hard to tell exactly What Their Deal Is, but "tel is the only top level domain (TLD) that offers a free and optional Telhosting service that allows individuals and businesses to create and manage their very own digital profile."

While .tel is described as some sort of vague telecom unification thing of some unspecified nature, the actual implementation is fairly uninteresting. .tel domains basically come with free hosting of a very minimal profile site which lists various contact information. So basically Linktree but less successful. This already seems to be a weakened form of the original, award-winning .tel vision, and Telnic has abandoned the concept almost entirely by announcing a policy change such that .tel domains can be used for any purpose, with .tel essentially becoming a plain-old land grab gTLD.

The DNS system has accumulated quite a bit of cruft like this. Because it is fairly flexible and yet widely supported, it's very tempting to use DNS as an ad hoc database for all kinds of purposes. Further, computer science being the discipline concerned primarily with assigning numbers to things, there is an unending series of proposals for various name-translation schemes that end up in DNS and then are never used because they're too complicated, don't solve a real problem, or both, as was the case with ENUM.

That is, the lack of a unified addressing scheme was never the thing standing in the way of unified communications. The thing standing in the way of unified communications is, put succinctly, capitalism, as in the absence of a large single purchase driving the requirement (e.g. enterprise sales) there is very little financial incentive to make any two systems interoperable with each other, and there is often a clear financial incentive not to. Shoving regular expressions into DNS can do a lot of things, but fundamentally restructuring the Silicon Valley app mill is not one of them.

Of course, contrary to all reason, phone numbers have made the jump to computer addressing in one way that absolutely delights me: telephone number domain names. Some mail order businesses branded themselves entirely around an excellent 1-800 number, and their awkward transition to the internet leads to things like 1800contacts.com. Another example is 1800wxbrief.com, which is the result of a potent combination of a vanity phone number and government contracting.

See? DNS does handle phone numbers!

[1] In practice it's more complicated because E.164 tries to account for dialing prefixes and etc, but none of that is really important, now or ever.

[2] I rarely say that Java and/or Sun Microsystems have gotten anything right, but Java provides a prominent example of how DNS names should go the other way around.

[3] The award, as far as I can tell, is the Infocommerce Group Models of Excellence Award presented at Data Content '09. Infocommerce Group is some "boutique" consulting firm which is presumably in the business of handing out awards; Data Content is presumably some minor conference, but I can't find anything about it online. So basically the Nobel Prize for DNS.


>>> 2021-03-13 CONELRAD

Here in the year 2021, an extended, concerted effort across multiple levels of the government and the telecommunications industry has made it possible for the government to send short text messages to cell phones. Most of the time, it even works. This sophisticated, expensive capability is widely used to send out mistyped descriptions of vehicles potentially containing abducted children, and nearly nothing else.

Before we lived in the modern era of complicated technology that barely works, though, the Civil Defense administration developed an emergency notification solution that was simple and barely worked: CONELRAD. CONELRAD is, of course, short for Control of Electromagnetic Radiation, which in a way is what all radio transmitters do. In the case of CONELRAD, though, the control aspect has special significance.

CONELRAD, introduced in the early stages of the cold war in the '50s, was intended to provide timely information to the public about an incoming Soviet bombing mission. Because bombers, presumably delivering nuclear weapons, were relatively slow and could be detected relatively quickly, early public warning of attack could save many lives. This was especially true due to the generally lower-yield nuclear weapons in consideration at the time. The problem, though, was finding a way to get warning and instructions out in a matter of minutes.

This is actually a bit of a misrepresentation of the history of CONELRAD, but it's a very common one since the emergency alerting feature of CONELRAD was the most widely advertised and the most successfully implemented. In actuality, though, CONELRAD was designed as an active defense system in addition to an emergency alerting system. This part of CONELRAD is not so well known.

Understanding this requires a trip back to World War II, and specifically the air campaigns occurring over Britain (by the Germans) and Germany (by the Allies). During WWII, air navigation was in its infancy. Navigation for fast-evolving situations like bombing runs was based on sighting landmarks ("pilotage") and dead reckoning, which is already difficult at night and especially difficult when the targets are using active denial techniques (blackouts) to make landmarks difficult to see. Bombing runs, though, were far more effective at night when air defense personnel suffered the same challenges---of it being difficult to see things when it's dark.

The result was a huge drive for radio-navigation technology, which would work just as well at night as during the day. Although radio-navigation would later involve all kinds of interesting encoding techniques [1], the simplest and earliest radio technology for air navigation was a simple directional receiver. A small loop antenna outside the fuselage could be rotated around to identify the angle at which a signal is nulled, giving the direction to the signal. This allowed airplanes to fly towards radio transmitters whether or not they could see anything, which as you can imagine was tremendously useful to bombers.

The commercial radio stations in major cities quickly went from a valuable communication asset during a blackout to an unintended navigation aid for the enemy. In Britain and Germany, where this technique was seeing active use in targeting cities, countermeasures had to be developed.

One option is obvious: when incoming bombers are detected, just shut off radio stations. This deprives the bombers of guidance but also deprives the community of information, which would be especially critical following a nuclear attack.

CONELRAD offered a smarter solution: keep (some) radio stations on the air to deliver information, but have them operate in such a way that they would be confusing and useless to aircraft.

The history of air defense in the US is a somewhat strange one, in large part because the US has never actually had a need for it. That is, there has never been a significant bombing campaign on the mainland US. As a result, much of the US air defense infrastructure has always been hypothetical in many ways. The CONELRAD proposal fell into a perfect time window when a Soviet nuclear attack had become a public concern but was still expected to be delivered by aircraft. Air defense, briefly, was a focus of the Cold War.

So, from this perspective of CONELRAD functioning primarily as an active denial system for air defense, let's take a look at how it worked.

When an Air Defense Control Center (ADCC) detected an incoming bombing mission, a set of leased telephone lines between the ADCC and major radio stations would be used to activate CONELRAD. The activated radio stations would first alternate their transmitters off and on, five seconds each, twice. Then, a 1kHz tone would be transmitted for 15 seconds.

AM radio stations designated as emergency stations would then switch their transmitters (or more likely switch to an appropriately configured standby transmitter) to either 640 or 1240 kHz. These radio stations would then broadcast emergency information, but with a twist: following a pre-arranged schedule, the stations on 640 and 1240 would alternately shut down and start up their transmitters, every few minutes, on a cycle of several stations. This would frustrate any aircraft trying to navigate by these stations as their "target" would keep changing positions.

In the meantime, all other radio stations would simply shut down, making the cycling AM stations the only option.
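As a rough illustration of the sequence just described, here is a small Python sketch that lays out the activation timeline and a round-robin transmit schedule. The timings of the activation sequence come from the description above; the station names, the three-minute slot length, and the function names are all hypothetical, since "every few minutes" and "several stations" are as specific as the record gets.

```python
from itertools import cycle, islice


def activation_sequence():
    """Return (start_s, end_s, state) steps for the alert sequence."""
    # Off and on, five seconds each, twice, then a 15 second 1 kHz tone.
    steps = [("carrier off", 5), ("carrier on", 5)] * 2 + [("1 kHz tone", 15)]
    timeline = []
    start = 0
    for state, duration in steps:
        timeline.append((start, start + duration, state))
        start += duration
    return timeline


def rotation_schedule(stations, slot_minutes, slots):
    """Return (start_minute, station) pairs, one station on the air per slot."""
    rotation = islice(cycle(stations), slots)
    return [(i * slot_minutes, name) for i, name in enumerate(rotation)]


for start, end, state in activation_sequence():
    print(f"{start:>2}s to {end:>2}s: {state}")

# Hypothetical stations sharing 640 kHz, assumed three-minute slots.
for minute, station in rotation_schedule(["Station A", "Station B", "Station C"], 3, 5):
    print(f"minute {minute:>2}: {station} transmitting")
```

From the point of view of a direction-finding receiver tuned to 640 kHz, the signal source appears to jump from one site to another at every slot boundary.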

There are a few things to unpack about this. First, the five-second off/on cycling of activated radio stations and 1kHz tone are both measures to allow automated receivers to take action when an alert is issued. This feature of emergency radio broadcasts persists today in the form of a dual-frequency attention tone and SAME preamble repeated thrice. Various automated receivers were offered for CONELRAD and some radio broadcasters used automatic receivers that disabled their transmitter in response to either the monitored station's carrier dropping or the 1kHz tone.

Second, some radio stations would be expected to change their transmit frequency and power. It is not clear that there was a specific need for CONELRAD transmitting stations to reduce power; I suspect it may be an artifact of the frequency change process. Some stations might achieve the frequency change by switching to an already-prepared standby transmitter, which was often of lower power due to infrequent use, and other stations actually changed the carrier frequency of their primary transmitter... but did not have time to adjust the other stages of the transmitter (and antenna), resulting in poor tuning and low efficiency.

As a result, following the CONELRAD activation sequence there would often be a long and uncomfortable silence as CONELRAD transmitting stations reconfigured. In practice, they didn't always come back at all, because the frequency change-out process was complex and presented a substantial risk of things going wrong. Just the five-second off-on cycle came with great risk; large transmitters used large tubes that operated at high temperature and often did not respond well to rapid changes in power.

As a result, CONELRAD implementation was costly and complex for participating radio stations. This introduced a lot of friction to CONELRAD adoption, and while it's extremely difficult to find detailed information, it seems that CONELRAD's deployment was always severely limited. I have read before that very, very few CONELRAD transmitting stations ever fully implemented the ability to cycle their transmitters on and off in an ongoing cycle; it was viewed as too difficult and risky. There were relatively few tests of full CONELRAD capabilities, which left a lot of questions around the actual performance of the system.

Perhaps because of the technical complexity of the active denial component, later public discussion of CONELRAD generally identifies it only as an emergency communications system. The air defense mission was largely forgotten.

CONELRAD faced challenges beyond its own complexity. The development of ICBMs made the air defense component obsolete, as ICBMs used inertial (dead-reckoning) navigation that could not be confused by radio station trickery. In 1963, CONELRAD was replaced by the Emergency Broadcast System. The EBS entirely eliminated the air defense component, allowing stations to continue to operate on their normal frequencies for the purpose of delivering messages.

EBS would later be replaced by the Emergency Alert System, EAS, which for the most part is still what we use today---but it has been augmented by a baffling number of accessory systems which handle various types of alerts over various media. This includes Wireless Emergency Alerts (WEA), the system which has a modest success rate in delivering text to smartphones.

Despite CONELRAD's relatively short lifespan and limited success, its design has been highly influential on emergency alerting systems since. The basic pattern of a set of key radio stations broadcasting an attention sequence which causes other radio stations to switch to an emergency mode remains in use today. Modern radio stations use a device called an ENDEC which, depending on the radio station, typically monitors some other radio station further up the tree for the transmission of an attention sequence. In this way emergency alerts propagate downwards from key (entry point) radio stations to all other radio stations.

The modern structure of EAS is just so much more complicated than you would ever reasonably expect, which means it's my kind of thing. Maybe I'll write about it some time.

[1] In fact, some more advanced radio-navigation technologies were in use prior to 1950 including for WWII bombing operations. This is an interesting topic that I hope to write about in the future.
