_____                   _                  _____            _____       _ 
  |     |___ _____ ___ _ _| |_ ___ ___ ___   |  _  |___ ___   | __  |___ _| |
  |   --| . |     | . | | |  _| -_|  _|_ -|  |     |  _| -_|  | __ -| .'| . |
  |_____|___|_|_|_|  _|___|_| |___|_| |___|  |__|__|_| |___|  |_____|__,|___|
  a newsletter by |_| j. b. crawford

>>> 2022-03-24 VoWiFi

I haven't written for a bit, in part because I am currently on vacation in Mexico. Well, here's a short piece about some interesting behavior I've noticed here.

I use a cellular carrier with very good international roaming support, so for the most part I just drive into Mexico and my phone continues to work as if nothing has changed. I do get a notification shortly after crossing the border warning that data might not work for a few minutes; I believe (but am not certain) that this is because Google Fi uses eUICC.

eUICC, or Embedded Universal Integrated Circuit Card, essentially refers to a special SIM card that can be field reprogrammed for different carrier configurations. eUICC is attractive for embedded applications since it allows for devices to be "personalized" to different cellular carriers without physical changes, but it's also useful for typical smartphone applications where it allows the SIM to be "swapped out" as a purely software process.

Note well: although the "embedded" seems to suggest it, eUICC is not the same as an "embedded SIM" (e.g. one soldered to the board). eUICC is instead a set of capabilities of the SIM card and can be implemented either in an embedded SIM or in a traditional SIM card. Several vendors, particularly in the IoT area, offer eUICC-capable SIMs in the traditional full/mini/micro SIM form factors to allow an IoT operator to move devices between cellular networks and configurations post-sale.

Anyway, my suspicion is that Google Fi cuts down on their international service costs by actually re-provisioning devices to connect to a local carrier in the country where they are operating. I can't find any information supporting this theory though, other than clarification that Fi does use embedded (eSIM) eUICC capability in Pixel devices. Of course the eUICC capabilities can be delivered in traditional SIM form factor as well, so carrier switching by this mechanism would not be limited to devices with eSIM. The history of Google Fi as requiring a custom kernel supports the theory that they rely on eUICC capabilities, since until relatively recently eUICC was poorly standardized and Android would likely not normally ship with device drivers capable of re-provisioning eUICC.

In any case, that wasn't even what I meant to talk about. I was going to say a bit about cellular voice-over-IP capabilities including VoWiFi and VoLTE, and the slightly odd way that they can behave in the situation where you are using a phone in a country other than the one in which it's provisioned. To get there, we should first cover a bit about how VoIP or "over-the-top telephony" interacts with modern cellular devices.

Historically, high-speed data modes did not always combine gracefully with cellular voice connections. Many older cellular air interface standards only supported being either "in a call" or on a "data bearer channel," with the result that a device could not participate in a voice call and a data connection at the same time. This makes sense when you consider that the data standards were developed with a goal of simple backwards-compatibility with existing cellular infrastructure. The result was that basic cellular capabilities like voice calls and management traffic (SMS, etc) were achieved by the cellular baseband essentially regressing to an earlier version of the protocol, disabling high-speed data protocols such as the high-speed-in-name-only HSDPA. Most early LTE devices carried on this basic architecture, and so when you dialed a call on many circa-2010 smartphones the baseband basically went back in time to the 3G days and behaved as a basic GSM device. No LTE data could be exchanged in the meantime, and some users noticed that they could not, for example, load a web page while on a phone call.

This is a good time to insert a disclaimer: I am not an expert on cellular technologies. I have done a fair amount of reading about them, but the full architecture of modern cellular networks, then combined with all of the legacy technologies still in use, is bafflingly complicated. I can virtually guarantee that I will get at least one thing embarrassingly wrong in the length of this post, especially since some of this is basically speculative. If you know better I would appreciate if you emailed me, and I will make an edit to avoid spreading rumors. There are a surprising number of untrue rumors about these systems!

This issue of not being able to use data while in a phone call became increasingly irritating as more people started using Bluetooth headsets or speakerphone and expected to be able to do things like make a restaurant reservation while on a call with a friend. It clearly needed some kind of resolution. Further, the many layers of legacy in the cellular network made things a lot more complicated for carriers than they seemed like they ought to be. Along with other trends like thinner base stations, carriers saw an obvious way out... one shared with basically the entirety of the telecom industry: over-the-top.

If you are not familiar, over-the-top or OTT delivery is an architecture mostly discussed in fixed telecoms (e.g. cable and wireline telephone) but also more generally useful as a way of understanding telecom technologies. The basic idea of OTT is IP convergence at the last mile. If you make every feature of your telecom product run on top of IP, you simplify your whole outside plant to broadband IP transport. The technology for IP is very mature, and there's a wide spectrum of vendors and protocols available. In general, IP is less expensive and more flexible than most other telecom transports. An ISP is a good thing to be, and if cellular carriers can get phones to operate on IP alone, they are essentially just ISPs with some supported applications.

Modern LTE networks are steering towards exactly this: an all-IP air segment with a variety of services, including the traditional core of voice calls, delivered over IP. The system for achieving this is broadly called the IP Multimedia Subsystem or IMS. It is one of an alarming number of blocks in a typical high-level diagram of the LTE architecture, and it does a lot of work. Fundamentally, IMS is a layer of the LTE network that allows LTE devices to connect to media services (mostly voice although video, for example, is also possible) using traditional internet methods.

Under the hood this is not very interesting, because IMS tries to use standard internet protocols to the greatest extent possible. Voice calls, for example, are set up using SIP, just as in most VoIP environments. Some infrastructure is required to get SIP to interact nicely with the traditional phone system, and this is facilitated using SIP proxies, DNS records, etc so that both IMS terminals (phones) and cellular phone switches can locate the "edges" of the IMS segment... or in other words the endpoints that they need to connect to in order to establish a call. While there are a lot of details, the most important part of this bookkeeping is the Home Subscriber Server or HSS.

The HSS is responsible for tracking the association between end subscribers and IMS endpoints. This works like a SIP version of the broader cellular network: your phone establishes a SIP registration with a SIP proxy, which communicates with the HSS to register your phone (state that it is able to set up a voice connection to your phone) and obtain a copy of your subscriber information for use in call processing decisions.
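As a way of picturing that bookkeeping, here is a toy in-memory model in Python. It is only a sketch of the relationship described above, not real SIP or real HSS signaling: the class names, method names, the identity "sip:alice@example.invalid", and the contact address are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class HSS:
    """Toy subscriber database (hypothetical; not a real HSS interface)."""
    profiles: dict = field(default_factory=dict)       # identity -> subscriber profile
    serving_proxy: dict = field(default_factory=dict)  # identity -> name of proxy holding the registration

    def register(self, identity, proxy_name):
        # Record which proxy can currently reach this subscriber,
        # and hand that proxy a copy of the subscriber's profile.
        if identity not in self.profiles:
            raise KeyError(f"unknown subscriber: {identity}")
        self.serving_proxy[identity] = proxy_name
        return dict(self.profiles[identity])


@dataclass
class SipProxy:
    """Toy SIP proxy that accepts registrations from phones."""
    name: str
    hss: HSS
    contacts: dict = field(default_factory=dict)  # identity -> where to reach the phone
    profiles: dict = field(default_factory=dict)  # identity -> cached subscriber profile

    def handle_register(self, identity, contact):
        # The phone registers with the proxy; the proxy registers with the HSS.
        profile = self.hss.register(identity, self.name)
        self.contacts[identity] = contact
        self.profiles[identity] = profile
        return profile

    def locate(self, identity):
        # Used when setting up a call toward the subscriber.
        return self.contacts.get(identity)


hss = HSS(profiles={"sip:alice@example.invalid": {"voice": True}})
proxy = SipProxy(name="proxy-1", hss=hss)
proxy.handle_register("sip:alice@example.invalid", "192.0.2.10:5060")
```

After the registration, the HSS knows that "proxy-1" can set up a call to the subscriber, and the proxy holds both the phone's contact address and a copy of the subscriber profile for call processing decisions.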

This all makes quite a bit of sense and is probably the arrangement that you would come up with if asked to design an over-the-top cellular voice system. Where things get a bit odd is, well, the same place things always get odd: the edge cases. One of these is when phones travel internationally.

An interesting situation I discovered: when returning to our rented apartment, I sometimes need to call my husband to let me in the front gate. If my phone has connected to the apartment WiFi network by this point, the call goes through normally, but with an odd ringing pattern: the typical "warble" ringback plays only briefly, before being replaced by a fixed sine tone. If, on the other hand, my phone has not connected to the WiFi (or the WiFi is not working, the internet here is rather unreliable), the call fails with an error message that I have misdialed ("El número marcado no es correcto," an unusually curt intercept recording from Telcel).

Instead, calls via LTE must be dialed as if international: that is, dialed 00-1-NXX-XXX-XXXX. This works fine, and with normal ringback to boot.

So what's going on here?

This answer is partially speculative, but I think the general contours are correct. First, Google Fi appears to use Telcel as their Mexican carrier partner. I would suspect this works similarly to Fi's network switching to Sprint and US Cellular, with a "ghost number" being temporarily assigned (at least historically, all Google Fi numbers are "homed" with T-Mobile). When not connected to WiFi, the phone is either using "traditional" GSM voice or is connecting to Telcel IMS services located using LTE management facilities. As a result, my phone is, for all intents and purposes, a Mexican cellphone. Calls to US numbers must be dialed as international because they are international.

However, when connected to WiFi, the phone likely connects to a Google-operated IMS segment which handles the phone normally, as if it were in the US. Calls to US numbers are domestic again.
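The two cases can be summed up in a few lines of Python. This is a sketch of my guess at the behavior, not anything Fi or Telcel documents: the function name and the "home_ims" / "mx_roaming" labels are invented for illustration, and the phone numbers are fictional.

```python
def dial_string(us_number: str, serving_network: str) -> str:
    """Return the digits to dial to reach a 10-digit US number.

    serving_network is a hypothetical label:
      "home_ims"   - call handled by the home (US) IMS, e.g. over WiFi
      "mx_roaming" - call handled by the Mexican partner network
    """
    digits = "".join(ch for ch in us_number if ch.isdigit())
    # Accept numbers saved with a leading country code 1.
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    if len(digits) != 10:
        raise ValueError("expected a 10-digit NANP number")

    if serving_network == "home_ims":
        # Treated as a domestic US call.
        return digits
    if serving_network == "mx_roaming":
        # Treated as an international call from Mexico:
        # 00 (international prefix) + 1 (country code) + number.
        return "00" + "1" + digits
    raise ValueError(f"unknown serving network: {serving_network}")


on_wifi = dial_string("505-555-0100", "home_ims")
on_lte = dial_string("505-555-0100", "mx_roaming")
```

Here on_wifi comes out as the bare ten digits, while on_lte picks up the 00-1 prefix, matching the 00-1-NXX-XXX-XXXX pattern above.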

It's sort of surprising that the user experience here is so awkward. This is pretty confusing behavior, especially to those unfamiliar with WiFi calling. It's not so surprising though when you consider the generally poor quality of Android's handling of international travel. Currently many text messages and calls I receive are failing to match up with contacts, apparently because the calling number is coming across with an '00' international dialing prefix and so not matching the saved phone number. Of course, if the call arrives via WiFi or the message by RCS, it works correctly. One would think that Android core applications would correctly handle the scenario of having to remove the international dialing prefix, but admittedly it would probably be difficult to come up with an algorithmic rule for this that would work globally.
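To make the contact-matching problem concrete, here is a rough Python sketch of the kind of normalization that would be needed. It is not how Android does it. The function names are made up, and the defaults bake in two assumptions that only hold for my situation: the local international dialing prefix is "00", and bare 10-digit numbers in the contact list belong to country code 1. The difficulty of a global rule shows up right away, since both of those have to be parameters.

```python
def normalize(number: str, local_idd: str = "00", home_cc: str = "1") -> str:
    """Reduce a phone number to '+<country code><number>' for comparison.

    local_idd: international dialing prefix of the network the call
               arrived on (assumed "00" here; it varies by country).
    home_cc:   country code assumed for bare 10-digit saved numbers.
    """
    stripped = number.strip()
    digits = "".join(ch for ch in stripped if ch.isdigit())
    if stripped.startswith("+"):
        # Already in international form.
        return "+" + digits
    if digits.startswith(local_idd):
        # Caller ID delivered with the local international prefix.
        return "+" + digits[len(local_idd):]
    if len(digits) == 10:
        # Assumption: a bare 10-digit number is a home-country number.
        return "+" + home_cc + digits
    if len(digits) == 11 and digits.startswith(home_cc):
        return "+" + digits
    # Anything else is left alone rather than guessed at.
    return digits


def same_contact(incoming: str, saved: str) -> bool:
    """True if the incoming caller ID and a saved number normalize alike."""
    return normalize(incoming) == normalize(saved)


matched = same_contact("0015055550100", "(505) 555-0100")
```

With these assumptions the '00'-prefixed caller ID and the saved number both reduce to the same string, so matched is true. On a network with a different international prefix the same call would need a different local_idd, which is exactly the part that is hard to get right everywhere.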

Another interesting observation, also with some preamble: I believe I have mentioned before that Mexico has a complex relationship with NANP, the unified numbering scheme for North American countries that makes up the "+1" country code. While Mexico originally intended to participate in NANP, a series of events related to the generally complex history of the Mexican telecom industry prevented that from materializing and Mexico was instead assigned country code +52. The result is that Mexico is "NANP-ish" but uses a distinct numbering scheme, and the NANP area codes originally assigned to Mexico have since mostly been recycled as overlays in the US.

A full history of telephone number planning in Mexico could occupy an entire post (perhaps I'll write it next time I'm here). It includes some distinct oddities. Most notably, area codes can be either 2 or 3 digits, with 2 digit area codes being used for major cities. While Mexico had formerly used type of service prefixes (specific dialing prefixes for mobile phones), these were retired fairly recently and are no longer required or even permitted.

In principle, telephone numbers for 2-digit area codes can be written XX-XXXX-XXXX, while three-digit area codes can be written XXX-XXX-XXXX. Note the lack of Ns to specify digits constrained to 2-9 as in NANP. This is not entirely intentional; I just don't know if this restriction exists in Mexico today. Putting together the current Mexican dialing plan from original sources is a bit tricky as IFT has published changes rather than compiled versions of the numbering plan. My Spanish is pretty bad so reading all of these is going to take a while, and it's getting to be pretty late... I'll take this on later, so you can look forward to a future post where I answer the big questions.

An extremely common convention in Mexico is to write phone numbers as XX-XX-XX-XX-XX. I'm not really sure where this came from as I don't see e.g. IFT using it in their documents, but I see it everywhere from handwritten signs to the customer service number on a Coca-Cola can. Further complicating things, I have seen the less obvious XXX-XXXX-XXX in use, particularly for toll free numbers. This seems like perhaps the result of a misunderstanding of the digit grouping convention for 2 digit area codes.
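The groupings mentioned so far are easy to compare side by side with a small Python sketch. The function names are invented, the digits are placeholders rather than real numbers, and the caller has to say how long the area code is, since working that out would need the actual numbering plan.

```python
from typing import Sequence


def group_digits(number: str, sizes: Sequence[int]) -> str:
    """Split the digits of number into hyphen-separated groups of the given sizes."""
    digits = "".join(ch for ch in number if ch.isdigit())
    if sum(sizes) != len(digits):
        raise ValueError("group sizes do not add up to the number of digits")
    groups = []
    start = 0
    for size in sizes:
        groups.append(digits[start:start + size])
        start += size
    return "-".join(groups)


def by_area_code(number: str, area_code_len: int) -> str:
    """Group a 10-digit number around a 2- or 3-digit area code."""
    if area_code_len == 2:
        return group_digits(number, (2, 4, 4))   # XX-XXXX-XXXX
    if area_code_len == 3:
        return group_digits(number, (3, 3, 4))   # XXX-XXX-XXXX
    raise ValueError("area code must be 2 or 3 digits")


def in_pairs(number: str) -> str:
    """The common handwritten convention: XX-XX-XX-XX-XX."""
    return group_digits(number, (2, 2, 2, 2, 2))


two_digit = by_area_code("1234567890", 2)
three_digit = by_area_code("1234567890", 3)
pairs = in_pairs("1234567890")
odd = group_digits("1234567890", (3, 4, 3))      # XXX-XXXX-XXX
```

The same ten digits come out four different ways, and only the first two tell you anything about where the area code ends.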

It seems to be a general trend that countries with variable-length area codes lack well agreed upon phone number formatting conventions. In the UK, for example, there is also variability (albeit much less of it). This speaks to one of the disadvantages of variable-length area codes: they make digit grouping more difficult, as there's a logical desire to group around the "area code" but it's not obvious what part of the number that is.

Anyway, there are some more telephone oddities for you. Something useful to think about when you're trying to figure out why your calls won't connect.

Update: reader Gabriel writes in with some additional info on Mexican telephone number conventions. Apparently in the era of manual exchanges, it was conventional to write 4-digit telephone numbers as XX-XX. The "many groups of two" format is sort of a habitual extension of this. They also note that in common parlance Mexico City has a 1-digit area code '5' as all '5X' codes are allocated to it.