About a week ago, I wrote an entry asking how reliable VoIP should be and mentioned some research conducted by Professor Henning Schulzrinne and one of his students, Wenyu Jiang, at Columbia University. The research on VoIP service availability in the Internet was conducted a while ago (the paper was presented at the PAM (Passive and Active Measurement) Workshop in La Jolla, CA, back in April 2003). That study reported that when the Internet is used as the transport network, VoIP service availability is approximately 98 percent. The key factors in arriving at this figure were the call failure probability (on average 0.47 percent) and the outage-induced call abortion probability (i.e., the likelihood of the caller hanging up after a service interruption, found to be on average 1.53 percent). Combined, the two account for roughly 2 percent of calls, which yields a net service availability of roughly 98 percent.
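
For anyone who wants to check the arithmetic, here is a minimal sketch of how the two figures combine. It reads the 0.47 percent figure as the chance that a call attempt fails, and it shows two ways of combining the numbers (simple addition, and multiplication under an assumed independence of the two effects). The variable names are mine, and the paper's exact method may differ.

```python
# Figures quoted above from the Columbia study, in percent.
call_failure_pct = 0.47    # call attempts that fail (my reading of the figure)
call_abortion_pct = 1.53   # calls abandoned by the user after an outage

# Simple additive estimate: subtract both loss figures from 100 percent.
additive_pct = 100.0 - (call_failure_pct + call_abortion_pct)

# Multiplicative estimate: a call must both succeed and not be aborted,
# assuming (my assumption) the two events are independent.
multiplicative_pct = (
    100.0 * (1 - call_failure_pct / 100) * (1 - call_abortion_pct / 100)
)

print(f"additive:       {additive_pct:.2f}%")        # additive:       98.00%
print(f"multiplicative: {multiplicative_pct:.2f}%")  # multiplicative: 98.01%
```

Either way, the result lands at roughly 98 percent, so the choice of method does not change the conclusion.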

Then, I mentioned the Ofcom figure for service availability on wireless networks in the UK (97 to 98 percent, according to a 2002 survey). My intent was just to talk about the social aspects of VoIP, and to speculate about how some of the technology's early adopters were willing to tolerate something in between the availability of wireless service and the "assumed" 99.999% POTS figure (I say "assumed" because of stories such as that of Greg Galitzine, whose Verizon service outage of more than two days took his POTS availability to less than 99.5% for the year).
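
To make the "less than 99.5%" remark concrete, here is a small sketch that turns a single outage into a yearly availability figure. It assumes a 365-day year and treats the outage as exactly 48 hours, which is a lower bound since the actual outage lasted more than two days. The function name is just for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # assumes a non-leap year


def availability_pct(downtime_hours, period_hours=HOURS_PER_YEAR):
    """Percentage of the period during which service was up."""
    return 100.0 * (1 - downtime_hours / period_hours)


# A two-day outage, with no other downtime during the year.
two_day_outage_pct = availability_pct(48)
print(f"{two_day_outage_pct:.2f}%")  # 99.45%
```

Since the real outage was longer than 48 hours, the actual figure for that year would be somewhat lower still.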

Well, Om Malik jumped all over my post, believing that I was really prescribing VoIP as a POTS replacement technology at a lower availability than the status quo. Of course, that was not the case at all. I was just putting myself in the shoes of a new entrant to the marketplace, and trying to figure out what minimum initial threshold is needed until more mindshare and market share are captured, before making a higher CAPEX commitment to raise the bar of the service.

The fact of the matter is that, on a controlled network, one can greatly improve availability by adding redundancy (e.g., using RAID hard drives and "hot standby" servers, or implementing a highly reliable LAN/WAN design with multiple switches and routers at Layer 2 and Layer 3 for redundant connections and call paths). These more highly available systems require more equipment and more supporting communications links, thereby increasing the overall system cost. I was just trying to ask how much each 9 is worth, and what the tradeoffs are between the investment required and what customers are willing to accept.
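
As a rough illustration of what each 9 buys and how redundancy helps, the sketch below computes the yearly downtime allowed at each availability level, and the combined availability of several redundant units. The redundancy formula assumes the units fail independently and that failover is instantaneous and perfect, which real networks only approximate. Both function names are my own.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a non-leap year


def downtime_minutes_per_year(nines):
    """Yearly downtime allowed at an availability of `nines` 9s (2 -> 99%)."""
    unavailability = 10 ** -nines
    return MINUTES_PER_YEAR * unavailability


def parallel_availability(single, copies):
    """Availability of `copies` redundant units, each available `single`
    of the time, assuming independent failures and perfect failover."""
    return 1 - (1 - single) ** copies


for nines in range(2, 6):
    minutes = downtime_minutes_per_year(nines)
    print(f"{nines} nines: about {minutes:.1f} minutes of downtime per year")

# Two redundant paths, each at the 98 percent figure discussed earlier.
print(f"two paths at 98%: {parallel_availability(0.98, 2):.4%}")
```

Under these idealized assumptions, five 9s leaves a little over five minutes of downtime a year, and doubling up a 98 percent path takes it to about 99.96 percent. That is the kind of tradeoff against equipment cost I had in mind.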

Of course, five 9s is highly desirable because of the possible consequences of a service outage (however temporary it might be). Here, the notion of expected value takes hold - i.e., even for a low-probability event, once the bad consequences associated with it are factored in, the expected cost could be high (e.g., someone who is relying on VoIP service as a primary line and all of a sudden cannot call an ambulance due to a service interruption). That is why I advocate adherence to this high availability target. But Rome was not built in one day - and neither were fully redundant, highly available and controlled VoIP networks, so this will be a gradual process.

One final comment - five nines as a goal should be pursued only when VoIP is to be relied on as the only means of communication. With the advent of data apps (IM, web collaboration, etc.) and the proliferation of mobile networks (3G, WiFi, and eventually WiMax), other means of communication are becoming widely popular, which means that POTS is no longer as exclusive as it once was. It is important to keep in mind that when the POTS network was built, the telephone was the only game in town, and that was a key consideration that had to be factored into the design of that network.