When most people think about VoIP protocols, they typically only think of the Session Initiation Protocol (SIP). SIP is one of the preeminent standards in the world of VoIP, but it is by no means the whole story. This post will cover the wide (and interesting) world of VoIP protocols.
By the end, you’ll be a protocol pro.
Proto Who?
The Internet is made up of a wide array of protocols that are responsible for moving information between users, the most familiar of which is the HyperText Transfer Protocol (HTTP) – those funny letters at the leftmost side of the web address in your browser. Working with the lower-level protocols beneath it, HTTP gets information across a network (like the Internet) as “packets” (small chunks of information), which are reassembled at their destination and displayed to a user (typically in a web browser).
The job of VoIP protocols is similar – they are used to break up and “packetize” media streams (typically voice and DTMF), send them across a network and reassemble them at their destination (a VoIP client or telephone).
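To make the idea of “packetizing” concrete, here is a minimal Python sketch. It assumes 8 kHz, 8-bit mono audio (the format used by the common G.711 codec) cut into 20 ms frames; the function names and the frame size are illustrative choices, not part of any particular VoIP library.

```python
# Assumption: 8 kHz, 8-bit mono audio split into 20 ms frames.
SAMPLE_RATE_HZ = 8000
BYTES_PER_SAMPLE = 1
FRAME_MS = 20
FRAME_BYTES = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * FRAME_MS // 1000  # 160 bytes


def packetize(audio: bytes, frame_bytes: int = FRAME_BYTES) -> list:
    """Split a byte stream into (sequence_number, payload) packets."""
    return [
        (seq, audio[offset:offset + frame_bytes])
        for seq, offset in enumerate(range(0, len(audio), frame_bytes))
    ]


def reassemble(packets) -> bytes:
    """Rebuild the stream, ordering packets by sequence number."""
    ordered = sorted(packets, key=lambda packet: packet[0])
    return b"".join(payload for _, payload in ordered)


if __name__ == "__main__":
    audio = bytes(range(256)) * 2          # 512 bytes of stand-in "audio"
    packets = packetize(audio)
    # Packets may arrive out of order; sequence numbers let us restore them.
    print(reassemble(reversed(packets)) == audio)
```

A real client would also have to cope with lost and late packets, which this sketch ignores.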
Like HTTP, VoIP protocols are referred to as application layer protocols – these protocols work on top of the more foundational protocols of the networking world, like UDP and IP (hence the name Voice over IP). Also like HTTP, some VoIP protocols (like SIP) work in a request/response manner with one client making a request (usually to start a phone call) and the other client responding.
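The request/response pattern shows up in the very first line of every SIP message: a request starts with a method and a target address, while a response starts with the protocol version and a status code, much like HTTP. The sketch below tells the two apart; the helper name `parse_start_line` and the address `sip:bob@example.com` are made up for illustration.

```python
def parse_start_line(line: str) -> dict:
    """Classify the first line of a SIP message as a request or a response."""
    first, rest = line.strip().split(" ", 1)
    if first.startswith("SIP/"):
        # Response line, e.g. "SIP/2.0 200 OK"
        code, _, reason = rest.partition(" ")
        return {"kind": "response", "status": int(code), "reason": reason}
    # Request line, e.g. "INVITE sip:bob@example.com SIP/2.0"
    uri, _, version = rest.rpartition(" ")
    return {"kind": "request", "method": first, "uri": uri, "version": version}


if __name__ == "__main__":
    # One client asks to start a call...
    print(parse_start_line("INVITE sip:bob@example.com SIP/2.0"))
    # ...and the other side answers.
    print(parse_start_line("SIP/2.0 200 OK"))
```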
Types of VoIP Protocols
SIP (Session Initiation Protocol) is one of the most widely known VoIP protocols. Despite its wide adoption and its status as the de facto standard of VoIP, it's only used for part of a phone call. SIP is used to signal the inception of a VoIP call: SIP clients use the protocol to agree on a variety of details before the actual media stream representing voice or DTMF audio is exchanged. In a way, SIP can be thought of as a tool to work out the details of a phone call before the actual call begins. One detail that is usually covered in SIP message exchanges prior to the beginning of a phone call is the port number to use for streaming audio using another important VoIP protocol – RTP.
RTP (Real-time Transport Protocol) is the protocol used to stream the audio that makes up a phone conversation. RTP is not limited to audio traffic, however, and can also be used as the basis for streaming video. RTP is also used in conjunction with some other VoIP signaling protocols discussed below. It's important to note that SIP and RTP work on different ports – SIP typically on UDP port 5060, and RTP on a configurable range of UDP ports, commonly 10000 – 20000. Because of this, and because the RTP ports used can vary, using SIP/RTP behind a firewall or NAT device may present unique challenges and require NAT traversal techniques to work properly.
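In practice, the port number usually travels in a Session Description Protocol (SDP) body attached to the SIP message, on a line that begins with `m=audio`. The sketch below pulls the port out of such a body; the function name `audio_port_from_sdp`, the sample addresses and the port 12000 are all invented for illustration.

```python
from typing import Optional


def audio_port_from_sdp(sdp: str) -> Optional[int]:
    """Return the port announced on the first audio media line, if any."""
    for line in sdp.splitlines():
        if line.startswith("m=audio "):
            # Layout: m=<media> <port> <transport> <format list>
            port_field = line.split()[1]
            # The port may be written as "<port>/<count>"; keep only the port.
            return int(port_field.split("/")[0])
    return None


# A made-up SDP body of the kind carried inside a SIP INVITE.
EXAMPLE_SDP = "\r\n".join([
    "v=0",
    "o=alice 2890844526 2890844526 IN IP4 192.0.2.10",
    "s=-",
    "c=IN IP4 192.0.2.10",
    "t=0 0",
    "m=audio 12000 RTP/AVP 0",
    "a=rtpmap:0 PCMU/8000",
])

if __name__ == "__main__":
    print(audio_port_from_sdp(EXAMPLE_SDP))
```

Once both sides have exchanged descriptions like this, each knows where to send its audio.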
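Every RTP packet begins with a 12-byte fixed header carrying, among other things, a sequence number and a timestamp that let the receiver put audio back in order. The following sketch builds and parses that header with Python's standard `struct` module. The helper names are hypothetical, and header extensions and CSRC lists are deliberately left out.

```python
import struct

RTP_HEADER = struct.Struct("!BBHII")  # network byte order, 12 bytes in total


def build_rtp_header(seq, timestamp, ssrc, payload_type=0, marker=False) -> bytes:
    """Pack a minimal RTP header (payload type 0 is PCMU, i.e. G.711 mu-law)."""
    byte0 = 2 << 6  # version 2, no padding, no extension, zero CSRC entries
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return RTP_HEADER.pack(byte0, byte1, seq & 0xFFFF,
                           timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


def parse_rtp_header(packet: bytes) -> dict:
    """Unpack the fixed header found at the start of an RTP packet."""
    if len(packet) < RTP_HEADER.size:
        raise ValueError("packet is shorter than the 12-byte RTP header")
    byte0, byte1, seq, timestamp, ssrc = RTP_HEADER.unpack(packet[:RTP_HEADER.size])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "sequence": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


if __name__ == "__main__":
    # 160 stand-in audio bytes (20 ms at 8 kHz) behind a header.
    packet = build_rtp_header(seq=1, timestamp=160, ssrc=0x1234ABCD) + bytes(160)
    print(parse_rtp_header(packet))
```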
H.323 is an older VoIP protocol that has largely been displaced by SIP. It does, however, still have a large installed base, so knowing what it is can be important for protocol pros.
IAX (Inter-Asterisk eXchange protocol) is a VoIP protocol developed by Digium, the company behind the open source telephony platform Asterisk. IAX was designed to be more “firewall friendly” – both signaling and media transport take place on the same port (UDP port 4569). The use of IAX “trunking” can also generate some bandwidth consumption efficiencies when used to connect calls between multiple servers running Asterisk.
Jingle is an extension to the Extensible Messaging and Presence Protocol (XMPP) – typically used for instant messaging – that was developed by Google. It is the foundation for the Google Talk service (not to be confused with the Google Voice service, which uses SIP). Jingle is a signaling protocol, like SIP, and it also uses RTP for media transport.
The Skype protocol is a proprietary VoIP protocol that is used to support the popular Skype calling service. With the exception of this protocol, all others discussed here are open in the sense that anyone can find the specification for the protocol and build a client to use it. Having said that, it should be noted that Skype is working with companies like Digium to provide mechanisms to work with SIP-based systems. A number of other “unofficial” projects have also been launched to bridge SIP-based networks with Skype clients.
One final note of interest is that the last two protocols mentioned (Jingle and Skype) operate in a peer-to-peer (P2P) fashion. Unlike SIP-based networks (which require registration with a centralized server or proxy), P2P networks use a distributed architecture in which participants in the network make a portion of their resources (such as processing power, disk storage or network bandwidth) directly available to other network participants. One characteristic of P2P networks is that they can scale very easily and do not have the same build-out costs associated with centralized networks. To give a sense of this effect, the Skype service reported at the end of 2008 (when it was still owned by eBay) that it had an astounding 400+ million users!
Now that you have a broader understanding of the world of VoIP protocols, you can proudly call yourself a protocol pro.




