From SIP to RTP (Part 1) – Overview

This is the first post in a series covering the major parts of the SIP protocol. What follows are some practical notes on SIP and on how SDP and RTP, the protocols that handle voice and video transport, work.

Note: these are simple practical notes. For an in-depth study of the topic, please see the documentation listed in the links section.

SIP Overview
SIP (Session Initiation Protocol) is a signaling protocol that is used to control multimedia communication sessions, such as voice and video calls, over Internet Protocol (IP).

SIP is analogous to HTTP for voice: it is essentially the glue that ties communication systems together, much as HTTP ties clients (browsers) and web servers together for worldwide communication.

If Phone A wants to place a call to Phone B, SIP is used to exchange information about the call setup (caller ID, callee number, etc.) and about how the two voice (or video) streams will be transmitted: Phone A -> Phone B (the caller’s voice on Phone A, heard by the callee on Phone B) and Phone B -> Phone A. Once Phone A and Phone B have agreed on these details (using SDP, carried inside the SIP messages), the two actual data streams are transmitted using RTP. In other words, SIP handles only the signalling for the call.

The rest of this post focuses on call establishment; how the voice stream (the streaming audio) is transmitted will be covered in the next posts.
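To make the idea of SDP carried inside SIP more concrete, here is a minimal Python sketch that assembles an INVITE whose body is an SDP offer for one audio stream. Every value in it (addresses, tags, branch, Call-ID, port numbers) is made up for illustration, and the helper names build_sdp and build_invite are hypothetical; a real phone adds more headers and negotiates more codecs.

```python
# Illustrative only: addresses, tags, branch and Call-ID values are invented.

def build_sdp(ip: str, rtp_port: int) -> str:
    """Return a minimal SDP offer: one audio stream, PCMU (payload type 0)."""
    lines = [
        "v=0",
        f"o=alice 2890844526 2890844526 IN IP4 {ip}",
        "s=-",
        f"c=IN IP4 {ip}",               # where the RTP stream should be sent
        "t=0 0",
        f"m=audio {rtp_port} RTP/AVP 0",  # media type, RTP port, payload type
        "a=rtpmap:0 PCMU/8000",
    ]
    return "\r\n".join(lines) + "\r\n"


def build_invite(caller: str, callee: str, ip: str, rtp_port: int) -> str:
    """Return a SIP INVITE (signalling) that carries the SDP offer as its body."""
    body = build_sdp(ip, rtp_port)
    user = caller.split("@")[0]
    headers = [
        f"INVITE sip:{callee} SIP/2.0",
        f"Via: SIP/2.0/UDP {ip}:5060;branch=z9hG4bK776asdhds",
        "Max-Forwards: 70",
        f"From: <sip:{caller}>;tag=1928301774",
        f"To: <sip:{callee}>",
        "Call-ID: a84b4c76e66710@example-host",
        "CSeq: 1 INVITE",
        f"Contact: <sip:{user}@{ip}>",
        "Content-Type: application/sdp",
        f"Content-Length: {len(body.encode('utf-8'))}",
    ]
    # A blank line separates the SIP headers from the SDP body.
    return "\r\n".join(headers) + "\r\n\r\n" + body


invite = build_invite("alice@example.com", "bob@example.com", "192.0.2.10", 49170)
print(invite)
```

The SIP part says who is calling whom; the SDP part says where and in which format the caller expects to receive audio. The RTP packets themselves never appear in this exchange.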

SIP Components
In short, establishing a call in SIP (Alice wants to call Bob) can be described as follows.

SIP: Alice wants to call Bob

When Alice wants to initiate a call with Bob, she sends a SIP INVITE message (a call request) to Bob, either directly or through an intermediate server. Bob’s phone responds with a Trying message, followed by one or more Ringing messages to indicate that Bob’s phone is ringing. When Bob answers the call, his device sends back an OK message; to confirm the call establishment, Alice then sends an ACK message to Bob.

Note: the messages can flow directly between the phones or through an intermediate server. As we shall see, in the real world things can be a bit more complex: this is only meant to illustrate the flow of SIP messages.
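The basic exchange between Alice and Bob can be written down as an ordered list of messages. The sketch below is only a model of the sequence, not a SIP implementation; the function name basic_call_flow is hypothetical, and the numeric codes (100, 180, 200) are the standard SIP status codes for Trying, Ringing and OK.

```python
def basic_call_flow(caller="Alice", callee="Bob"):
    """Return the simplified direct call setup as (sender, receiver, message)."""
    return [
        (caller, callee, "INVITE"),       # call request
        (callee, caller, "100 Trying"),   # request received, being processed
        (callee, caller, "180 Ringing"),  # the callee's phone is ringing
        (callee, caller, "200 OK"),       # the callee answered
        (caller, callee, "ACK"),          # the caller confirms the setup
    ]


for sender, receiver, message in basic_call_flow():
    print(f"{sender:>5} -> {receiver:<5} {message}")
```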

Entities interacting in a SIP protocol environment are known generally as User Agents (UA), and there are two types of UAs: clients (UAC) and servers (UAS).

User Agent Client (UAC)
The UAC generates “methods” (requests) and sends them to servers (e.g., it sends an INVITE request to initiate a call).

User Agent Server (UAS)
The UAS receives the methods, processes them, and generates “responses” (e.g., it sends a 200 OK response to indicate a successful session). UAS is the generic name for the device that receives the methods.

Note: in other words, a SIP UA can take the role of a User Agent Client (UAC), which sends SIP requests, or of a User Agent Server (UAS), which receives the requests and returns SIP responses; these roles last only for the duration of a SIP transaction. In fact, most SIP devices (e.g. an IP phone) implement both a UAC and a UAS (they are simply different pieces of software running on the device): the phone behaves as a UAC when it initiates a call, and the party that receives the call acts as a UAS. The UAS can be another phone or an intermediate server that retransmits the request to another server or to the destination IP phone (which is the final UAS). When the same phone later receives a call, it acts as a UAS.

The UAC is often associated with the end user, since the applications running on these systems are used by people. The UAC can be any end-user device, such as an IP phone, a softphone (software that emulates an IP phone), and others.

Each resource in a SIP network, such as a User Agent, is identified by a Uniform Resource Identifier (URI), whose syntax follows the same general standard used for email addresses. A typical SIP URI has the form sip:username@domain.

Besides the telephone being called, SIP defines several network elements that act as servers (UAS). Two SIP UAs can communicate directly without any intervening SIP infrastructure, which is why the protocol is described as peer-to-peer, but this approach is often impractical for a public service.

In the real world, the requests generated by the UAC are usually sent to a server (typically a proxy server) and not directly to the other SIP IP phone. Several types of servers help the UAC and UAS connect to each other.
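As a small illustration of the URI form above, the following sketch splits a SIP URI into user, host and optional port. It deliberately handles only the simple sip:user@host[:port] shape; URI parameters, passwords and IPv6 hosts are left out, and the names SipUri and parse_sip_uri are hypothetical.

```python
from typing import NamedTuple, Optional


class SipUri(NamedTuple):
    user: str
    host: str
    port: Optional[int]


def parse_sip_uri(uri: str) -> SipUri:
    """Parse the simplified form sip:user@host[:port]."""
    scheme, sep, rest = uri.partition(":")
    if not sep or scheme.lower() != "sip":
        raise ValueError(f"not a sip: URI: {uri!r}")
    user, at, hostport = rest.partition("@")
    if not at or not user or not hostport:
        raise ValueError(f"expected user@host in {uri!r}")
    host, colon, port = hostport.partition(":")
    return SipUri(user, host, int(port) if colon else None)


print(parse_sip_uri("sip:bob@example.com"))
print(parse_sip_uri("sip:alice@192.0.2.10:5060"))
```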

>> Proxy Server
SIP messages can be, and usually are, routed through one or more SIP proxy servers before reaching their destination. This is very similar to how email is transmitted: multiple email servers are usually involved in the delivery process, each forwarding the message in its original form.

Proxy servers help track down the IP addresses of recipients whose exact addresses are not known in advance. If a proxy server cannot find the address of the recipient, it forwards the request to other proxy servers. In other words, when Alice wants to call Bob, she knows only Bob’s URI (bob@domain): the SIP proxy maps the SIP URI to an IP address. SIP proxy servers rely on the registration data kept by the Registrar Server to track users, which means users can be located regardless of their physical location (current IP address). Proxy servers are the most common type of server in a SIP environment.

>> Registrar Server
A SIP registrar server is responsible for registering devices (typically IP phones). It does this by authenticating the device with a user name and password and by keeping a table of IP addresses and extensions/phone numbers. Device registration plays an important role in the process: SIP devices that do not register cannot be called, and SIP devices that do not authenticate successfully cannot make outbound calls.

Note: the proxy and registrar servers commonly run on the same device: the PBX! They are simply different pieces of software on the PBX device.

A more complete description of call establishment using SIP follows.

Both Alice and Bob register with a registrar server so that they can be located: the registrar server knows the IP addresses of both Alice’s and Bob’s IP phones.
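The division of labour between registrar and proxy can be sketched as two small classes sharing one table. This is a toy model under loud assumptions: the class names Registrar and Proxy and their methods are hypothetical, the password check is a plain string comparison (real SIP registration uses a challenge-response scheme rather than sending the password), and the addresses are example values.

```python
from typing import Dict, Optional, Tuple

Address = Tuple[str, int]  # (ip, port)


class Registrar:
    """Toy registrar: authenticates devices and remembers where they are."""

    def __init__(self, credentials: Dict[str, str]) -> None:
        self._credentials = dict(credentials)     # user -> password
        self._bindings: Dict[str, Address] = {}   # user -> current address

    def register(self, user: str, password: str, ip: str, port: int = 5060) -> bool:
        expected = self._credentials.get(user)
        if expected is None or expected != password:
            return False                          # authentication failed
        self._bindings[user] = (ip, port)
        return True

    def lookup(self, user: str) -> Optional[Address]:
        return self._bindings.get(user)


class Proxy:
    """Toy proxy: maps a SIP URI to an address using the registrar's table."""

    def __init__(self, registrar: Registrar) -> None:
        self._registrar = registrar

    def resolve(self, sip_uri: str) -> Optional[Address]:
        target = sip_uri[4:] if sip_uri.startswith("sip:") else sip_uri
        user = target.split("@", 1)[0]
        return self._registrar.lookup(user)


registrar = Registrar({"alice": "a-secret", "bob": "b-secret"})
registrar.register("bob", "b-secret", "192.0.2.20")
proxy = Proxy(registrar)
print(proxy.resolve("sip:bob@example.com"))    # Bob registered, so he can be called
print(proxy.resolve("sip:alice@example.com"))  # Alice has not registered yet
```

A device that never registers has no entry in the table, so the proxy has nowhere to send a call for it.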

If Alice wants to initiate a call with Bob, she sends an INVITE message (a call request to bob@domain) to her proxy server. This proxy server acts on Alice’s behalf and searches for Bob’s proxy server. It then sends the INVITE message to Bob’s proxy. Bob’s proxy server looks up Bob’s current device (using the registrar server) and sends the INVITE message to Bob.

When Bob accepts the INVITE (answers the call), he sends back an OK message, which propagates back to Alice through the proxies. Alice then sends an ACK directly to Bob, and a direct media session (to transport the voice) takes place after that. To end the session, Alice or Bob sends a BYE message and the other replies with an OK message.
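The proxied call described above can be traced hop by hop with a few lines of Python. This is only a model of who sends what to whom: the helper names hops and proxied_call_trace are hypothetical, provisional responses (Trying, Ringing) are left out for brevity, and showing the BYE as sent directly by Alice is an assumption, since either party may end the call.

```python
def hops(path, message):
    """Expand a path of nodes into one (sender, receiver, message) per hop."""
    return [(path[i], path[i + 1], message) for i in range(len(path) - 1)]


def proxied_call_trace():
    signalling_path = ["Alice", "Alice's proxy", "Bob's proxy", "Bob"]
    trace = []
    trace += hops(signalling_path, "INVITE")        # request travels via both proxies
    trace += hops(signalling_path[::-1], "200 OK")  # answer travels back the same way
    trace += hops(["Alice", "Bob"], "ACK")          # ACK goes directly to Bob
    trace += hops(["Alice", "Bob"], "BYE")          # assumption: Alice hangs up
    trace += hops(["Bob", "Alice"], "200 OK")       # the other side confirms
    return trace


for sender, receiver, message in proxied_call_trace():
    print(f"{sender:>13} -> {receiver:<13} {message}")
```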

NEXT POST: From Sip to RTP (Part 2) – This is straight talking !