Monday, February 04, 2008

Using Google 'My Location' to Troubleshoot AT&T's Broken Network

The AT&T Wireless coverage in the San Francisco area is poor. It's not that the "more bars" commercials or the beautiful coverage map aren't technically accurate -- pushing enough buttons on a handset sooner or later results in bars on the display. Having calls go through and not get dropped, or having the data service behave, is another story entirely.

But this post isn't a rant about AT&T. None of the wireless carriers are perfect, and making changes in an asset-and-operations-intensive business is hard to do. (Think about all the legacy airlines biting the dust -- there are a lot of parallels in terms of the economics.)

This post is about using the "My Location" feature in Google Maps for mobile, which shows the location of your current cell tower, to take a few guesses at what might be going on.

The first thing I noticed when playing with "My Location" is that, on freeways, handoff between towers seems consistent and stable. Not every spot on the freeway can see a tower, which might explain the fact that within 5 miles of SF there are major freeways with dark spots, where calls predictably drop. But in the covered areas, all is orderly and the service (both voice and HSDPA) is good.

Then the trouble begins: in some suburban and downtown (SF financial district) areas, the phone seems to flit between towers willy-nilly, even when I'm standing still. The results are predictable: service is mediocre, and calls can drop even downtown. The data service becomes intermittent deep inside the "3G coverage area" as the phone becomes confused about whether 3G is available, or whether to fall back to EDGE or even GPRS. By the time the phone has finished the appropriate renegotiation, it gets new information and changes its mind.
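For anyone who wants to put a number on the "flitting," here is a rough sketch of how I'd count it. It assumes you have jotted down a log of (seconds, tower) pairs while standing still and watching the map; the tower names, the log itself, and the one-change-per-minute threshold are all made up for illustration, not anything AT&T or Google publishes.

```python
def count_tower_changes(observations):
    """Count how often the serving tower changes in a time-ordered log.

    observations: list of (timestamp_seconds, tower_id) tuples.
    """
    changes = 0
    previous = None
    for _timestamp, tower in observations:
        if previous is not None and tower != previous:
            changes += 1
        previous = tower
    return changes


def looks_like_flapping(observations, max_changes_per_minute=1.0):
    """Guess whether a stationary phone is bouncing between towers.

    The threshold is an arbitrary assumption, not a carrier parameter.
    """
    if len(observations) < 2:
        return False
    duration_s = observations[-1][0] - observations[0][0]
    if duration_s <= 0:
        return False
    rate = count_tower_changes(observations) / (duration_s / 60.0)
    return rate > max_changes_per_minute


# Hypothetical two-minute log taken while standing still.
log = [(0, "T1"), (20, "T2"), (45, "T1"), (70, "T3"), (100, "T1"), (120, "T2")]
print(count_tower_changes(log), looks_like_flapping(log))
```

Five tower changes in two minutes without moving is the kind of pattern I'm describing; a healthy spot would show the same tower for the whole log.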

Cell phone networks are famous for sporting sophisticated hysteresis mechanisms to manage tower handoffs. Too much handoff, and the switching overhead and side effects dominate... too little handoff and, well, you drop calls. When a handset isn't moving, there is little reason (aside from site congestion) to do a handoff. But how do you know the handset isn't moving? It's a Catch-22, because without external input, the network can only guess by measuring changes in reception from nearby towers. If reception fluctuates, the client and network need to figure out if the phone is really moving, which way, and how fast.
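To make the hysteresis idea concrete, here is a toy sketch, not how any real network implements it. It assumes the phone simply sees a signal strength (in dBm) for each nearby tower and only switches when another tower beats the current one by some margin. The function names, the readings, and the 4 dB margin are all invented for the example.

```python
def pick_tower(current, readings, margin_db=0.0):
    """Choose a serving tower from signal readings (dBm, higher is better).

    Stay on the current tower unless another one is stronger by more
    than margin_db. A margin of 0 means "always chase the strongest".
    """
    best = max(readings, key=readings.get)
    if current is None or current not in readings:
        return best
    if readings[best] - readings[current] > margin_db:
        return best
    return current


def count_handoffs(samples, margin_db):
    """Replay a series of readings and count tower switches."""
    current = None
    handoffs = 0
    for readings in samples:
        chosen = pick_tower(current, readings, margin_db)
        if current is not None and chosen != current:
            handoffs += 1
        current = chosen
    return handoffs


# Made-up readings for a phone that is not moving: two towers whose
# signals wobble by a few dB around each other.
samples = [
    {"A": -85, "B": -87},
    {"A": -88, "B": -86},
    {"A": -84, "B": -87},
    {"A": -87, "B": -85},
    {"A": -86, "B": -88},
]

naive_handoffs = count_handoffs(samples, margin_db=0)
damped_handoffs = count_handoffs(samples, margin_db=4)
print(naive_handoffs, damped_handoffs)
```

With no margin the phone switches on every sample; with a 4 dB margin it stays put. The tradeoff is the one described above: make the margin too large and a phone that really is moving hangs on to a fading tower until the call drops.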

The result is so bad in some cases that it's almost worth having a button marked "I'm in my office" or "I'm at home" so that the user can explicitly tell the network what's happening. At my house, I'm hopping between three towers (one on the other side of a small mountain, with enough S/N to grab my phone, but not enough to even initiate a call) with the result that the phone is more or less a brick. There's no consistent data or voice service (20 minutes from downtown SF!), so the only possible change is "better."

Here's the interesting part: a lot of the AT&T network problems don't seem to occur on Sprint or Verizon. (I've been on all the networks, and was an "early adopter" of digital in the U.S. back when Sprint didn't have any coverage at all in large places like, ah, Chicago.) I do not believe the difference is because Sprint or Verizon have more towers, or superior engineers. I have a feeling this is due to (1) differences between GSM and CDMA (CDMA seems to be technically superior, but business factors make GSM dominant) and (2) the fact that AT&T is a historical rollup of lots of other networks (e.g., a former TDMA or AMPS tower site may or may not be the ideal place for a GSM tower, and I doubt they were all relocated) and supports too many disparate protocols (GPRS, EDGE, HSDPA, UMTS).

Every time the phone changes its mind between HSDPA, EDGE and GPRS, it seems to have to renegotiate its presence on the network. Perhaps there's a software fault in there too, because once the phone starts changing its mind, your network connectivity is shot, sometimes until you reboot the phone. Whereas with Sprint and Verizon, it was EV-DO all the way. Some mechanism ensured the phone never even tried to fall back, and the result was an overall smoother ride.

I actually use the Google maps data sometimes to figure out whether I'll have coverage, or whether I need to reboot my phone. Probably not what was intended, but a lot more useful than the little bars.
