From Sip to RTP (Part 4) – Invite & Register friendship

INVITE
Session Call Establishment

Sip: Alice wants to call Bob

Call Flow

  1. The UAC (Alice) sends an INVITE message to Bob (UAS).
  2. The UAS receives the request and responds using 100 Trying.
  3. The UAS sends a 180 Ringing response to the UAC when the phone begins ringing.
  4. Once the call is picked up, the UAS sends a 200 OK response to the UAC.
  5. The UAC sends an ACK request to confirm that the 200 OK response was received.

Note: The ACK method completes what is known as the three-way handshake: the confirmation that a session has been successfully established. In SIP, INVITE is the only method where this occurs, because a long time can pass between the INVITE itself and the 200 OK response (when a user can’t find the phone, is running to the phone, etc.). So the ACK is important: it tells the called party that the caller hasn’t hung up and that the 200 OK response has been received.
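The call flow above can be sketched from the caller's point of view. This is only an illustrative sketch: the function name and the returned strings are made up for this example, and the rule that failed INVITEs (3xx to 6xx) are also acknowledged comes from the SIP specification rather than from the flow shown here.

```python
def uac_next_step(status_code: int) -> str:
    """What the caller (UAC) does after receiving a response to its INVITE."""
    if 100 <= status_code <= 199:
        # Provisional responses such as 100 Trying or 180 Ringing:
        # the call is still being set up, so the caller keeps waiting.
        return "wait"
    if 200 <= status_code <= 299:
        # The callee picked up: the ACK completes the three-way handshake.
        return "send ACK, session established"
    if 300 <= status_code <= 699:
        # The INVITE failed or was redirected; the final response is still acknowledged.
        return "send ACK, no session"
    raise ValueError(f"not a SIP status code: {status_code}")


# The responses of the call flow above, in order of arrival.
steps = [uac_next_step(code) for code in (100, 180, 200)]
print(steps)
```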

Real Example
Alice -> Bob

INVITE sip:41@pbx.company-alice-and-bob.com;user=phone SIP/2.0
Via: SIP/2.0/TLS 10.10.10.32:2061;branch=z9hG4bK-9gg3wzak;rport
From: "Alice" <sip:40@pbx.company-alice-and-bob.com>;tag=g5ua0i7fz6
To: <sip:41@pbx.company-alice-and-bob.com;user=phone>
Call-ID: 3c2812339279-zvojwzvof6we
CSeq: 1 INVITE
Max-Forwards: 70
Contact: <sip:40@10.10.10.32:2061;transport=tls;line=i339wesg>;reg-id=1
P-Key-Flags: resolution="31x13", keys="4"
User-Agent: snom360/7.3.14
Allow: INVITE, ACK, CANCEL, BYE, REFER, OPTIONS, NOTIFY, SUBSCRIBE, PRACK, MESSAGE, INFO
Content-Length: 452

Bob -> Alice

SIP/2.0 100 Trying
Via: SIP/2.0/TLS 10.10.10.32:2061;branch=z9hG4bK-9gohvng3wzak;rport=2061
From: "Alice" <sip:40@pbx.company-alice-and-bob.com>;tag=g5ua0i7fz6
To: <sip:41@pbx.company-alice-and-bob.com;user=phone>;tag=eed7a3b4e0
Call-ID: 3c2812339279-zvojwzvof6we
CSeq: 1 INVITE
Content-Length: 0

Bob -> Alice

SIP/2.0 180 Ringing
Via: SIP/2.0/TLS 10.10.10.34:5060;branch=z9hG4bKe299f160c512cb066a3a536253aa4d44;rport=5061
From: "Bob" <sip:41@pbx.company-alice-and-bob.com>;tag=ozac09qwnh
To: "Alice" <sip:40@pbx.company-alice-and-bob.com>;tag=1521860827
Call-ID: f6f17567@pbx
CSeq: 28592 INVITE
Contact: <sip:41@10.10.10.31:1037;transport=tls;line=zs4m8lei>;reg-id=1
Require: 100rel
RSeq: 1
Allow: INVITE, ACK, CANCEL, BYE, REFER, OPTIONS, NOTIFY, SUBSCRIBE,PRACK, MESSAGE, INFO
Allow-Events: talk, hold, refer, call-info
Content-Length: 0

Alice -> Bob

SIP/2.0 200 Ok
Via: SIP/2.0/TLS 10.10.10.34:5060;branch=z9hG4bK-08d0f56ed1e8d82deb32779c5a2cc55b;rport=5061
From: "Alice" <sip:40@pbx.company-alice-and-bob.com>;tag=1521860827
To: "Bob" <sip:41@pbx.company-alice-and-bob.com>;tag=ozac09qwnh
Call-ID: f6f17567@pbx
CSeq: 28593 PRACK
Contact: <sip:41@10.10.10.31:1037;transport=tls;line=zs4m8lei>;reg-id=1
Content-Length: 0

Bob -> Alice

ACK sip:41@10.10.10.31:1037;transport=tls;line=zs4m8lei SIP/2.0
Via: SIP/2.0/TLS 10.10.10.34:5060;branch=z9hG4bK-6dcf1018159b8e96b7b6d62a758d77fd;rport
From: "Bob" <sip:41@pbx.company-alice-and-bob.com>;tag=ozac09qwnh
To: "Alice" <sip:40@pbx.company-alice-and-bob.com>;tag=1521860827
Call-ID: f6f17567@pbx
CSeq: 28592 ACK
Max-Forwards: 70
Contact: <sip:41@10.10.10.34:5060;transport=tls>
Content-Length: 0

Att.: The “user=phone” parameter indicates that the user portion of the URI (the part to the left of the @ sign) should be interpreted as a telephone number, as in a tel URI: so 40 is the number assigned to Alice’s phone, and 41 to Bob’s phone. Generally, when you dial a telephone number on a keypad, it is converted into a SIP URI of the form sip:nnnnn@domain;user=phone (in this case Alice dialed 41 on the keypad of her phone to call Bob).
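To make the structure of such a URI concrete, here is a minimal parsing sketch. It is not a complete SIP URI parser: the function name is invented for this example, and it assumes a plain sip:user@host[:port];param=value form with no passwords, URI headers or IPv6 addresses.

```python
def parse_sip_uri(uri: str) -> dict:
    """Split a simple SIP URI into user, host, port and parameters."""
    scheme, _, rest = uri.partition(":")
    if scheme not in ("sip", "sips"):
        raise ValueError(f"not a SIP URI: {uri}")
    # Everything after the first ';' is a list of URI parameters.
    address, *raw_params = rest.split(";")
    user, at_sign, hostport = address.rpartition("@")
    host, _, port = hostport.partition(":")
    params = {}
    for raw in raw_params:
        name, _, value = raw.partition("=")
        params[name] = value
    return {
        "scheme": scheme,
        "user": user if at_sign else None,
        "host": host,
        "port": int(port) if port else None,
        "params": params,
    }


request_uri = parse_sip_uri("sip:41@pbx.company-alice-and-bob.com;user=phone")
# user=phone means that "41" is to be read as a dialed telephone number.
dialed_from_keypad = request_uri["params"].get("user") == "phone"
```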

REGISTER
In a typical SIP session the proxy servers do the job of finding out the exact location of the recipient, so they must know the IP address of the recipient UA. What actually happens is that every user registers its current location with a REGISTRAR server. The application sends a message called REGISTER informing the server of its present location. The Registrar stores this binding (between the user and its present address) in a location server, which is used by other proxies to locate the user.

The registration process is very important because it creates the binding UA ↔ IP address, that is, the address where the UA itself answers INVITE and other requests. The registration process can use authentication (realm, username, password), but this is not mandatory.

Real Example

Alice -> Registrar Server

REGISTER sip:10.10.10.110:5060 SIP/2.0
Via: SIP/2.0/UDP 10.10.10.81:5060;branch=z9hG4bK97f7825d2
Max-Forwards: 70
Content-Length: 0
To: 40 <sip:40@10.10.10.110:5060>
From: 40 <sip:40@10.10.10.110:5060>;tag=60b2a3eb3e07b6e
Call-ID: b72007f19f018538b3ac254fa026dbb8@10.10.10.81
CSeq: 1507731226 REGISTER
Contact: 40 <sip:40@10.10.10.81:5060;transport=udp>;expires=120
Allow-Events: talk,hold,conference
Allow:NOTIFY,REFER,OPTIONS,INVITE,ACK,CANCEL,BYE,INFO
Authorization:Digest response="368b75e6f3d7244fbbf01da27b271feb",username="40",realm="asterisk",nonce="3b2d808b",algorithm=MD5,uri="sip:10.10.10.110:5060"
User-Agent: Aastra 9112i/1.4.3.1001 Brcm Callctrl/1.5.1.0 MxSF/v3.2.8.45

Registrar Server -> Alice

SIP/2.0 401 Unauthorized
Via: SIP/2.0/UDP 10.10.10.81:5060;branch=z9hG4bK97f7825d2;received=10.10.10.81
From: 40 <sip:40@10.10.10.110:5060>;tag=60b2a3eb3e07b6e
To: 40 <sip:40@10.10.10.110:5060>;tag=as2efd2925
Call-ID: b72007f19f018538b3ac254fa026dbb8@10.10.10.81
CSeq: 1507731226 REGISTER
User-Agent: FPBX-2.8.1(1.4.40)
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, SUBSCRIBE, NOTIFY, INFO
Supported: replaces
WWW-Authenticate: Digest algorithm=MD5, realm="asterisk", nonce="7e318af4"
Content-Length: 0

Alice -> Registrar Server

REGISTER sip:10.10.10.110:5060 SIP/2.0
Via: SIP/2.0/UDP 10.10.10.81:5060;branch=z9hG4bK824b13bb5
Max-Forwards: 70
Content-Length: 0
To: 40 <sip:40@10.10.10.110:5060>
From: 40 <sip:40@10.10.10.110:5060>;tag=60b2a3eb3e07b6e
Call-ID: b72007f19f018538b3ac254fa026dbb8@10.10.10.81
CSeq: 1507731227 REGISTER
Contact: 40 <sip:40@10.10.10.81:5060;transport=udp>;expires=120
Allow-Events: talk,hold,conference
Allow:NOTIFY,REFER,OPTIONS,INVITE,ACK,CANCEL,BYE,INFO
Authorization:Digest response="50d379545b2fd91e1a132eb42b120cf0",username="40",realm="asterisk",nonce="7e318af4",algorithm=MD5,uri="sip:10.10.10.110:5060"
User-Agent: Aastra 9112i/1.4.3.1001 Brcm Callctrl/1.5.1.0 MxSF/v3.2.8.45

Registrar Server -> Alice

SIP/2.0 200 OK
Via: SIP/2.0/UDP 10.10.10.81:5060;branch=z9hG4bK824b13bb5;received=10.10.10.81
From: 40 <sip:40@10.10.10.110:5060>;tag=60b2a3eb3e07b6e
To: 40 <sip:40@10.10.10.110:5060>;tag=as2efd2925
Call-ID: b72007f19f018538b3ac254fa026dbb8@10.10.10.81
CSeq: 1507731227 REGISTER
User-Agent: FPBX-2.8.1(1.4.40)
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, SUBSCRIBE, NOTIFY, INFO
Supported: replaces
Expires: 120
Contact: <sip:40@10.10.10.81:5060;transport=udp>;expires=120
Date: Tue, 29 Nov 2011 00:29:34 GMT
Content-Length: 0

The authentication process checks that both communicating parties know a shared password. With the 401 Unauthorized response the server rejects the client’s registration and sends back a digest challenge composed of an algorithm type, a realm and a nonce. The client then sends a new registration request, this time with a digest response composed of the “username”, “realm”, “nonce”, “uri”, “response” and algorithm fields: the response value is computed with the given algorithm from the username, the realm, the password, the request method, the uri and the nonce. The server then checks the response using the password (the shared secret) for that user, and if everything is correct it sends a 200 OK message.
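The digest computation can be sketched as follows. This is a minimal sketch of HTTP Digest authentication with MD5 and without the optional qop parameter, which matches the fields visible in the trace above. The password used here is hypothetical (the real one cannot be recovered from the trace), so the result will not match the response value shown in the capture.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Hex-encoded MD5 digest of a text string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri, nonce):
    """Digest response without qop: MD5(HA1:nonce:HA2)."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")  # who is asking (includes the secret)
    ha2 = md5_hex(f"{method}:{uri}")                 # what is being asked
    return md5_hex(f"{ha1}:{nonce}:{ha2}")


# Values taken from the second REGISTER above; the password is an assumption.
client_answer = digest_response(
    username="40",
    realm="asterisk",
    password="hypothetical-secret",
    method="REGISTER",
    uri="sip:10.10.10.110:5060",
    nonce="7e318af4",
)

# The server repeats the same computation with the password it has stored
# and accepts the registration only if both values match.
server_expected = digest_response(
    "40", "asterisk", "hypothetical-secret", "REGISTER", "sip:10.10.10.110:5060", "7e318af4"
)
registration_accepted = client_answer == server_expected
```

Because the nonce changes with every challenge, a captured response cannot simply be replayed against a later challenge.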

As you have already seen, in a typical SIP session the caller doesn’t initially know the address of the callee. The proxy servers do the job of finding out the exact location of the recipient (IP address). What actually happens is that every user registers its current location with a REGISTRAR server. The application sends a message called REGISTER informing the server of its present location. The Registrar stores this binding (between the user and its present IP address) in a location server, which is used by other proxies to locate the user.

Att.: The Contact field in INVITE and other SIP messages is related to the same field used in the REGISTER method.

Att.: The “Expires” field gives the duration, in seconds, for which this registration (the binding UA ↔ IP address) is valid. So the UA has to refresh its registration from time to time.
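The binding and its expiry can be pictured with a small in-memory table. This is only a sketch of the idea, not how Asterisk or any real registrar stores its data: the class and method names are invented, only one contact per user is kept, and time is passed in explicitly to keep the example deterministic.

```python
class LocationTable:
    """Toy location service: user address -> (contact URI, expiry time)."""

    def __init__(self):
        self._bindings = {}

    def register(self, user, contact, expires, now):
        """Store or refresh a binding; expires=0 removes it."""
        if expires == 0:
            self._bindings.pop(user, None)
        else:
            self._bindings[user] = (contact, now + expires)

    def lookup(self, user, now):
        """Return the contact URI, or None if unknown or expired."""
        entry = self._bindings.get(user)
        if entry is None:
            return None
        contact, expiry = entry
        if now >= expiry:
            # The UA did not refresh in time: forget the binding.
            del self._bindings[user]
            return None
        return contact


table = LocationTable()
# Values from the REGISTER above: user 40 is reachable at 10.10.10.81 for 120 seconds.
table.register("sip:40@10.10.10.110", "sip:40@10.10.10.81:5060", expires=120, now=1000)
```

A lookup at time 1100 still finds the phone, while a lookup at time 1120 or later returns nothing unless the phone has refreshed its registration in the meantime.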

OPTIONS
The SIP method OPTIONS allows a UA to query another UA or a proxy server about its capabilities. This allows a client to discover information about the supported SIP methods, codecs, etc. without calling the other party.

For example, before a client inserts a field into an INVITE listing an option that it is not certain the destination UAS supports, the client can query the destination UAS with an OPTIONS request to see if this option is returned in a Supported header field. All UAs MUST support the OPTIONS method!

Another use is to check the availability of a UA. For example, it is possible to bind a UA statically to an IP address: in this case the UA will not register itself. Using the OPTIONS message it is possible to verify that the UA is online.

Att.: UAs that register themselves will receive the OPTIONS message too!

Real Example

The PBX sends an OPTIONS message to a UA

OPTIONS sip:202@10.10.10.81:5060;transport=udp SIP/2.0
Via: SIP/2.0/UDP 10.10.10.110:5060;branch=z9hG4bK026408ff;rport
From: "Unknown" <sip:Unknown@10.10.10.110>;tag=as264d051f
To: <sip:202@10.10.10.81:5060;transport=udp>
Contact: <sip:Unknown@10.10.10.110>
Call-ID: 787c4d983b006a5a3011bd356140ed15@10.10.10.110
CSeq: 102 OPTIONS
User-Agent: FPBX-2.8.1(1.4.40)
Max-Forwards: 70
Date: Tue, 29 Nov 2011 00:29:34 GMT
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, SUBSCRIBE, NOTIFY, INFO
Supported: replaces
Content-Length: 0

Response from UA

SIP/2.0 200 OK
Call-ID: 787c4d983b006a5a3011bd356140ed15@10.10.10.110
CSeq: 102 OPTIONS
From: "Unknown" <sip:Unknown@10.10.10.110>;tag=as264d051f
To: <sip:202@10.10.10.81:5060>;tag=4fe041f88ba9bed
Via: SIP/2.0/UDP 10.10.10.110:5060;branch=z9hG4bK026408ff;rport
Content-Length: 0
Allow:NOTIFY,REFER,OPTIONS,INVITE,ACK,CANCEL,BYE,INFO
Contact: <sip:202@10.10.10.81:5060;transport=udp>
Supported: replaces
User-Agent: Aastra 9112i/1.4.3.1001 Brcm Callctrl/1.5.1.0 MxSF/v3.2.8.45

In this case the UA reports that it exists and that it supports the SIP methods listed in the Allow header of its response.

Allow:NOTIFY,REFER,OPTIONS,INVITE,ACK,CANCEL,BYE,INFO
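A monitoring script could turn the Allow header of such a response into a set of method names and check what the device supports. A minimal sketch, with an invented function name, using the Allow header returned by the phone in the 200 OK above:

```python
def parse_allow(header_line: str) -> set:
    """Return the set of SIP methods listed in an Allow header line."""
    name, _, value = header_line.partition(":")
    if name.strip().lower() != "allow":
        raise ValueError(f"not an Allow header: {header_line}")
    # Methods are comma separated; spaces around them are optional.
    return {method.strip() for method in value.split(",") if method.strip()}


# Allow header taken from the phone's 200 OK response to the OPTIONS request.
phone_methods = parse_allow("Allow:NOTIFY,REFER,OPTIONS,INVITE,ACK,CANCEL,BYE,INFO")
can_be_called = "INVITE" in phone_methods
accepts_subscribe = "SUBSCRIBE" in phone_methods
```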

Att.: Generally the PBX sends OPTIONS regularly to all UAs, whether they are registered or not (IP address statically bound to the UA). In Asterisk this behavior can be changed with the qualify parameter. If you turn on qualify in the configuration of a SIP device in sip.conf, Asterisk will send a SIP OPTIONS request regularly to check that the device is still online. If the device does not answer within the configured period (in ms), Asterisk considers the device offline for future calls. This status can be checked in Asterisk with the command sip show peers: it provides status information for peers that have qualify=yes (the status information includes a column showing the delay in the response to the OPTIONS message, which is a measure of the connection latency between the device and the PBX).

Att.: OPTIONS can also help with NAT problems, by keeping open the connection from Asterisk to a peer behind NAT. I will write about a SIP PBX protected by a firewall/NAT in future posts.


From Sip to RTP (Part 3) – B2BUA… What ?!

Def. of Back-to-Back User Agent (B2BUA).
A B2BUA essentially bolts two user agents together in a back-to-back fashion, similar to two people standing back to back. A B2BUA establishes a two-legged call, keeping the SIP server in the middle of the call to orchestrate the details.

One side of the session acts as the SIP UA server that receives the calls; the other side acts as the SIP UA client that establishes the other leg of the call. This “middle position” of the SIP server allows the system to execute difficult call scenarios, like recording a call, stepping out of the voicemail system (by pressing 0), barging into a call, and many other call scenarios that are very hard to do without this center position.

Att.: Typically Asterisk (like most PBXs) acts as a B2BUA, although you can configure it to behave differently.

In short, a SIP call using a B2BUA can be described as follows: both Alice and Bob register with the B2BUA PBX.

If Alice wants to initiate a call with Bob, she sends an INVITE message (a call request) to the B2BUA PBX, which then sends an INVITE message to Bob’s IP phone (the B2BUA PBX knows the IP address of Bob’s IP phone: we will look at this step in detail).

When Bob’s IP phone accepts the INVITE (Bob answers the call), it sends back an OK message to the B2BUA PBX, which propagates it back to Alice’s IP phone. Alice then sends an ACK to the B2BUA PBX, which propagates it to Bob’s IP phone. A media session then carries the voice: Alice’s IP phone → B2BUA PBX → Bob’s IP phone for Alice’s voice, and Bob’s IP phone → B2BUA PBX → Alice’s IP phone for Bob’s voice. The B2BUA PBX is in the middle of every call!

Att.: It is important to note that the PBX only proxies RTP media traffic when it has to, and when configured to do so: proxying RTP traffic is CPU/RAM intensive.

SIP Language
SIP shares some common characteristics with HTTP and SMTP. Like the latter two, SIP is an ASCII text-based protocol, which makes it easy to read and troubleshoot. The text below is a SIP trace that shows a user inviting another user to a session.


Users are identified by a SIP address, known as a Uniform Resource Identifier (URI). A SIP URI is similar to an email address and is typically built around the user’s phone number or host name (e.g., sip:[your_number]@companyA.3tsistemi.it). This allows users to be redirected to another phone as easily as they would be redirected to another web page.

SIP communication consists of two types of SIP messages: methods and responses. Methods are sent from the client to the server and indicate the purpose of the request. The following methods are the most important and common: there are others, but these are the ones you must know when troubleshooting a SIP PBX environment.

INVITE
Establishes a session

ACK
Confirms that the final response to an INVITE request was received

BYE
Ends a session

CANCEL
Cancels a session that is still being established

REGISTER
Registers an IP phone with a registrar server (which is normally built into the PBX: the PBX needs to know the IP address of the phone, that is, where to send the SIP messages)

OPTIONS
Asks for information about the capabilities of the other side

Responses are sent from the server to the client and indicate the status of the transaction. Responses carry an integer status code (from 100 to 699) and are categorized as follows.
1xx Informational responses
2xx Success responses
3xx Redirection responses
4xx Request failures
5xx Server errors
6xx Global failures
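When reading traces it helps to map a status code to its class automatically. A minimal sketch of the table above (the function and dictionary names are invented for this example):

```python
RESPONSE_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Request failure",
    5: "Server error",
    6: "Global failure",
}


def classify_response(status_code: int) -> str:
    """Return the response class for a SIP status code (100 to 699)."""
    if not 100 <= status_code <= 699:
        raise ValueError(f"not a SIP status code: {status_code}")
    # The first digit of the code selects the class.
    return RESPONSE_CLASSES[status_code // 100]


ringing = classify_response(180)
unauthorized = classify_response(401)
```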

SIP messages consist of the following three parts:

First line
The first line indicates either the purpose of the request or the response given by the called party.

In a request, the first line also contains the SIP URI of the destination, which is typically built around the user’s phone number.

Message body
SIP requests and responses can both contain message bodies.

The content of the message body is usually a session description and contains syntax as shown in the message below.

 

Att.: Some SIP messages normally have no message body, only headers (e.g. CANCEL, ACK).

Headers
SIP header fields provide additional information about the message. Common headers are shown below.

Via
Path taken by the request so far

Call-ID
A unique identifier for the call

CSeq
A sequence number followed by the method name, used to order requests and to match each response to its request.

Contact
A SIP URI at which the user agent can be reached directly for later requests.

Message Body
Normally contains the SDP messages.
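The three parts of a message can be separated with a few lines of code. This is a simplified sketch with an invented function name: it assumes CRLF line endings and one header per line, and it ignores folded header lines and compact header names.

```python
def parse_sip_message(raw: str):
    """Split a raw SIP message into first line, headers and body."""
    # An empty line separates the headers from the message body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    first_line = lines[0]
    headers = []
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers.append((name.strip(), value.strip()))
    return first_line, headers, body


def is_response(first_line: str) -> bool:
    """Responses start with the protocol version, requests with a method."""
    return first_line.startswith("SIP/2.0 ")


raw_message = (
    "OPTIONS sip:202@10.10.10.81:5060;transport=udp SIP/2.0\r\n"
    "Via: SIP/2.0/UDP 10.10.10.110:5060;branch=z9hG4bK026408ff;rport\r\n"
    "CSeq: 102 OPTIONS\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)
first_line, headers, body = parse_sip_message(raw_message)
```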


From Sip to RTP (Part 2) – This is straight talking !

In this post I will discuss the interaction between the SIP and SDP/RTP protocols, with a bottom-up approach.

First, an important note: the Session Initiation Protocol is used ONLY to initiate a session between two endpoints. SIP does not carry any voice or video data (stream) itself; it only allows two or more endpoints to set up a connection to transfer that traffic (voice or video) between each other via another protocol, the Real-time Transport Protocol (RTP).

Streaming Audio: the Real-time Transport Protocol (RTP)
The Real-time Transport Protocol (RTP) is an application-level protocol that delivers real-time data between two end systems. This is done in such a way that the receiving end system is able to reconstruct the original data stream sent by the other end system, even if the packets are delayed or arrive out of order.

If packets are lost on the way, the protocol is able to detect this, but it does not support requests for retransmission of any data: every RTP packet contains a sequence number used to detect lost and out-of-order packets.
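How a receiver can use the sequence numbers is sketched below. This is an illustration, not the algorithm of any particular RTP stack: the function name is invented, and it assumes that packets are never reordered by more than half of the 16-bit sequence space.

```python
def missing_packets(received):
    """Return the RTP sequence numbers that never arrived.

    `received` lists the 16-bit sequence numbers in arrival order.
    """
    if not received:
        return []
    extended = [received[0]]
    for seq in received[1:]:
        # Signed distance from the previous packet, so that both the
        # wrap-around 65535 -> 0 and slightly reordered packets work.
        delta = (seq - extended[-1]) % 65536
        if delta >= 32768:
            delta -= 65536
        extended.append(extended[-1] + delta)
    seen = set(extended)
    return [n % 65536 for n in range(min(extended), max(extended) + 1) if n not in seen]


# Packets 65535 and 4 were lost; packets 3 and 2 arrived in the wrong order.
lost = missing_packets([65533, 65534, 0, 1, 3, 2, 5])
```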

The reason for not supporting retransmission in the protocol is that it would most likely take too long to request that the source resend the lost RTP packet and for this copy to arrive. A better solution, for the case of audio at least, is to extrapolate sound from previous audio samples to make up for the lost ones, or just ignore the lost data and go on as if nothing has happened (the duration of the audio in one packet is relatively short and the loss of sound for that short period of time will not have a major influence on the quality).

Retransmission is a major reason for not using TCP (TCP, a reliable connection-oriented protocol, uses retransmissions to guarantee the delivery of the data handed to the TCP layer by the application layer).

Therefore RTP normally uses UDP as its transport protocol, because UDP does not provide any reliability features. UDP in turn uses IP, with best-effort delivery, to encapsulate its data.

Att.: Best-effort delivery describes a network protocol in which the network does not provide any guarantee that data is delivered.

The following steps summarize the processing and encapsulation of the audio for an IP telephony session before it is sent from a host using a network connection.
1) The sound from the microphone is sampled at regular intervals. The application bundles a number of samples together, compresses them and encapsulates them into an RTP packet. Typically the data for 20 ms of sound is encapsulated into one RTP packet (to summarize this step: the voice is transformed into a stream of bytes).
2) Every RTP packet is encapsulated into a UDP datagram and transmitted to the destination.
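Step 1 can be sketched for the common case of an 8 kHz codec that produces one byte per sample (such as PCMA): 20 ms of sound is then 160 bytes, preceded by the 12-byte fixed RTP header. The function names are invented, the audio is assumed to be already encoded, and the sequence number and timestamp start at 0 only to keep the example readable (real senders start from random values).

```python
import struct

SAMPLE_RATE_HZ = 8000
PACKET_MS = 20
SAMPLES_PER_PACKET = SAMPLE_RATE_HZ * PACKET_MS // 1000  # 160 samples per packet


def build_rtp_packet(payload: bytes, seq: int, timestamp: int, ssrc: int,
                     payload_type: int = 8) -> bytes:
    """Prepend the 12-byte fixed RTP header to an encoded audio chunk."""
    first_byte = 2 << 6                 # version 2, no padding, no extension, no CSRC
    second_byte = payload_type & 0x7F   # marker bit 0, 7-bit payload type
    header = struct.pack("!BBHII", first_byte, second_byte,
                         seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc)
    return header + payload


def packetize(encoded_audio: bytes, ssrc: int):
    """Cut encoded audio (1 byte per sample) into 20 ms RTP packets."""
    packets = []
    for index, offset in enumerate(range(0, len(encoded_audio), SAMPLES_PER_PACKET)):
        chunk = encoded_audio[offset:offset + SAMPLES_PER_PACKET]
        # The timestamp advances by the number of samples in each packet.
        packets.append(build_rtp_packet(chunk, seq=index,
                                        timestamp=index * SAMPLES_PER_PACKET, ssrc=ssrc))
    return packets


# One second of (silent, placeholder) audio becomes 50 packets of 172 bytes each.
packets = packetize(bytes(8000), ssrc=0x1234)
```

Step 2 would then hand each packet to a UDP socket addressed to the IP and port announced by the other side.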

Att.: There are several methods to sample the sound from the microphone and compress the resulting stream of bytes: each method is a different codec.

The Session Description Protocol (SDP)
The Session Description Protocol (SDP) has three main objectives that need to be achieved before an IP telephony session between a caller and a callee can begin.

First, you need to tell the other party what kind of media you want to receive: audio, video, or both. Second, you need to say how you want the media to be encoded so that you can understand what is being sent (which codec is in use). Third, you need to inform the other party of the address and UDP port you want the media to be delivered to.

For this to work, the device on the other side also has to send you a session description with its own information, or else you will not be able to send any media data to it. A typical session description looks like the one below. SDP is entirely textual!

v=0
o=gptucci 955720785595 955720785595 IN IP4 135.138.242.8
s=Basic Session
c=IN IP4 135.138.242.8
t=955720785595 0
m=audio 2328 RTP/AVP 8 0 96 98 99 97
a=rtpmap:96 SC6/6000
a=rtpmap:98 SC6/3000
a=rtpmap:99 RT24/2400
a=rtpmap:97 VR15/1500

We will look at the SDP session in detail later, but for now we can pick out the most important fields.

The origin field

o=<username> <session id> <version> <network type> <address type> <address>

The parameters of the origin field will together form a unique identifier for the current SDP session.

The connection field

c=<network type> <address type> <connection address>

The purpose of the connection field is to give the port number in the media field (see below) an address to be associated with.

The media field

m=<media> <port> <transport> <fmt list>

The purpose of the media field is to let the other party in the session know what kind of media (audio or video) the recipient of the SDP should deliver, to which port on the associated connection address (see above) the media should be delivered, and how the media should be encoded. The SDP example above uses two standard codecs, denoted 8 and 0 in the media field (PCMA and PCMU respectively). The same media field also declares four non-standard codecs, denoted 96, 97, 98 and 99. The non-standard codecs are defined in the attribute fields that follow, one for each codec number.
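Since SDP is plain text, the fields described above can be extracted with simple string handling. The sketch below parses the session description shown earlier; the function name and the layout of the returned dictionary are choices made for this example, and forms such as a port written as port/count are not handled.

```python
EXAMPLE_SDP = """\
v=0
o=gptucci 955720785595 955720785595 IN IP4 135.138.242.8
s=Basic Session
c=IN IP4 135.138.242.8
t=955720785595 0
m=audio 2328 RTP/AVP 8 0 96 98 99 97
a=rtpmap:96 SC6/6000
a=rtpmap:98 SC6/3000
a=rtpmap:99 RT24/2400
a=rtpmap:97 VR15/1500
"""


def parse_sdp(sdp_text: str) -> dict:
    """Extract origin, connection address, media lines and rtpmap attributes."""
    session = {"origin": None, "connection": None, "media": [], "rtpmap": {}}
    for line in sdp_text.strip().splitlines():
        kind, _, value = line.strip().partition("=")
        if kind == "o":
            session["origin"] = value.split()
        elif kind == "c":
            # c=<network type> <address type> <connection address>
            session["connection"] = value.split()[2]
        elif kind == "m":
            # m=<media> <port> <transport> <fmt list>
            media, port, transport, *formats = value.split()
            session["media"].append({
                "media": media,
                "port": int(port),
                "transport": transport,
                "formats": [int(fmt) for fmt in formats],
            })
        elif kind == "a" and value.startswith("rtpmap:"):
            number, encoding = value[len("rtpmap:"):].split(None, 1)
            session["rtpmap"][int(number)] = encoding
    return session


session = parse_sdp(EXAMPLE_SDP)
audio = session["media"][0]
# Where the sender of this SDP wants to receive its audio.
rtp_destination = (session["connection"], audio["port"])
```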

SIP
The Session Initiation Protocol (SIP) is a signaling protocol for setting up sessions between clients over a network, e.g. the Internet.

Att.: These sessions do not necessarily have to be Internet telephony sessions: SIP could just as well be used for setting up gaming sessions or for distance learning where a lecture is streamed out to the participants.

The SIP sessions are set up by using a three-way handshake procedure (much like TCP).

Sip: Alice wants to call Bob

When client A (Alice) wants to set up an IP telephony call session with client B (Bob), A sends an INVITE request to B. The INVITE message contains a payload (the data inside the INVITE request) with a description of the session A wants to set up with B. If A wants to set up an IP voice telephony session, the session description in the payload contains information about the audio encodings A “can understand”, and it also specifies the ports on which A wants to receive the RTP audio data. The protocol used to convey session descriptions is the Session Description Protocol (SDP). All SDP messages are transmitted inside the SIP message payload (this will become clearer later)!

When B accepts the call, his user agent sends a message with a response code of 200. Any 2xx response means that the message was successfully received, understood, and accepted. In the response, client B adds his codec capabilities and the port numbers where he wants A to send her RTP data (using an SDP payload). The final part of the three-way handshake occurs when A sends an acknowledgement to B. By sending an ACK the caller confirms that it has received the response from the callee. After the setup procedure is completed, the conversation can begin, using RTP.

SDP in SIP
I am repeating myself, but this is very important!

The SIP protocol is used to initiate a session between two endpoints: it does not carry any voice or video data (stream) itself; it only allows two endpoints to set up a connection (using SDP encapsulated in SIP messages) to transfer that traffic (voice or video) between each other via another protocol, the Real-time Transport Protocol (RTP).

Here is a real example of an INVITE message, which shows the structure of the most important SIP message (Alice is calling her friend Bob).

Att.: In Asterisk it is possible to debug all the SIP messages with the following commands from console.

set verbose 0
set debug 0
sip set debug

 

1 = This is the SIP request line, which tells us what kind of SIP message this is. This particular packet is a SIP INVITE request for the extension below.

532453@79.14.212.52 (calling request)SIP/2.0

Att.: 79.14.212.52 is the IP address of the SIP proxy, most commonly the IP address of the SIP PBX; 532453 is Bob’s number.

2 = The Via header contains a list of all SIP proxy servers that this packet has passed through, including the initiating client.

We have seen that SIP messages can be, and usually are, routed through one or more SIP proxy servers before reaching their destination: this is very similar to how email is transmitted, in that multiple email servers are usually involved in the delivery process, each forwarding the message in its original form. Each email server adds a Received header to the message, to track the route the message has taken. SIP uses a Via header to track the SIP proxies that the message has passed through to get to its destination.

Att.: The Via field indicates the path taken by the request so far. This prevents request looping and ensures replies take the same path as the requests, which assists in firewall traversal and other unusual routing situations.
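The role of the Via header can be modelled as a stack: every hop puts its own address on top when it sends or forwards the request, and the response walks back through the same addresses from the top down. This is a simplified model with invented function names and hypothetical host names; real proxies detect loops with the branch parameter rather than by comparing plain addresses.

```python
def add_via(via_stack, own_address):
    """A hop adds its own Via on top before sending the request onwards."""
    if own_address in via_stack:
        # The request has already passed through this hop: it is looping.
        raise ValueError(f"loop detected at {own_address}")
    return [own_address] + via_stack


def route_response(via_stack):
    """Return the hops a response visits on its way back to the caller."""
    hops = []
    stack = list(via_stack)
    while stack:
        hops.append(stack[0])  # the response is sent to the topmost Via
        stack = stack[1:]      # that hop removes its own Via before forwarding
    return hops


# Request path: Alice's phone -> proxy A -> proxy B -> Bob (names are hypothetical).
via = add_via([], "alice-phone.example")
via = add_via(via, "proxy-a.example")
via = add_via(via, "proxy-b.example")

# Bob copies the Via headers into his response, which then travels back.
return_path = route_response(via)
```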

3 = The “To” header specifies the SIP packet’s destination

4 = The “From” header specifies who sent the SIP packet

5 = This particular packet is a SDP packet, meaning it contains a Session Description Protocol message that contains information the remote client needs to open an RTP session for this call.

6 = The IP address of the SIP client that created this packet

7 = The IP address the destination SIP client should contact to open an RTP session.

8 = The key pieces of information in this header are audio, 35302 and RTP/AVP. The audio component signifies that this is an audio call, 35302 specifies the port where the client wants to receive the RTP stream (the IP address is the one specified in 7), and RTP/AVP specifies that the Real-time Transport Protocol will be used for the session. The numbers at the end of this header represent the different codecs that this client supports: the SIP client at the other end must support at least one of these codecs in order to make a successful connection.

In more depth: the key pieces of information in this header describe how the audio will flow from the UAS (which receives the INVITE message, and is the called party) to the UAC (which transmits this INVITE message, and is the caller).

In the INVITE message we can see the following.

c=IN IP4 193.227.104.23
t=0 0
m=audio 35302 RTP/AVP 18 3 97 8 0 101

This means that the voice stream (transmitted by RTP) must be sent to IP 193.227.104.23, port 35302.

This is the response to this INVITE message.

The OK message contains the information about the other voice stream, the one flowing from the caller to the called party.

c=IN IP4 79.14.212.52
t=0 0
m=audio 19340 RTP/AVP 8 101

This means that the other voice stream must be sent to IP 79.14.212.52, port 19340.

Att.: Usually the stream is transmitted from the same port where the other stream is received.

Alice’s voice is sent from ip 193.227.104.23 port 35302 to 79.14.212.52 port 19340 (Bob’s loudspeaker), and Bob’s voice is sent from ip 79.14.212.52 port 19340 to 193.227.104.23 port 35302 (Alice’s loudspeaker).

Att.: The voice is “transmitted” as bits produced by a codec: the other party must use the same codec to receive the stream and transform the bit flow back into voice. There are different kinds of codecs: the numbers at the end of the header illustrated above (m=audio 19340 RTP/AVP 8 101) are the RTP payload types that the client accepts. Here 8 identifies the only audio codec offered (usually we find more values), and 101 is a dynamic payload type that is typically used for telephony events such as DTMF digits. The SIP client at the other end must support at least one of the listed codecs in order to make a successful connection. To simplify:

m=<media> <port>/<number of ports> <proto> <fmt>

where proto is the transport protocol (here RTP/AVP) and fmt is the list of media formats, i.e. the payload type numbers. Here 8 = PCMA (alaw) and 101 defines a payload type for telephony events. The static payload type numbers are defined in the IETF RFC for the RTP audio/video profile (RFC 3551).
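Comparing the two media lines shows which payload types both sides have in common. A minimal sketch, with invented function names and a lookup table that covers only the static payload types appearing in this example (numbers from 96 to 127 are dynamic and need an a=rtpmap line to be named):

```python
# Static payload types used in the example; not a complete list.
STATIC_PAYLOAD_TYPES = {0: "PCMU", 3: "GSM", 8: "PCMA", 18: "G729"}


def payload_types(media_line: str):
    """Return the payload type numbers at the end of an m= line."""
    fields = media_line.split("=", 1)[1].split()
    # fields: <media> <port> <proto> <fmt> <fmt> ...
    return [int(fmt) for fmt in fields[3:]]


def common_payload_types(offer_line: str, answer_line: str):
    """Payload types present in both lines, in the order of the offer."""
    answered = set(payload_types(answer_line))
    return [pt for pt in payload_types(offer_line) if pt in answered]


def describe(payload_type: int) -> str:
    if payload_type in STATIC_PAYLOAD_TYPES:
        return STATIC_PAYLOAD_TYPES[payload_type]
    if 96 <= payload_type <= 127:
        return "dynamic (see a=rtpmap)"
    return "unknown to this sketch"


offer = "m=audio 35302 RTP/AVP 18 3 97 8 0 101"   # from the INVITE
answer = "m=audio 19340 RTP/AVP 8 101"            # from the OK
agreed = common_payload_types(offer, answer)
names = [describe(pt) for pt in agreed]
```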

The stream is transmitted using the RTP protocol, but the messages that specify which IP address and port to use are SDP.

Att.: Unlike SIP, which listens on port 5060 (usually UDP, as in an Asterisk environment, but it can be TCP), RTP uses a dynamic port range (and is only ever UDP): in Asterisk the default range is 10000-20000 and can be changed in the file rtp.conf.



From SIP to RTP (Part 1) – Overview

This is the first in a series of posts covering several major parts of the SIP protocol. The following are some practical notes on the protocol and on how the SDP and RTP protocols, which handle voice or video transport, work.

Attention: these are simple practical notes: I invite you to consult the documentation in the link list for an in-depth study of this topic.

SIP Overview
SIP (Session Initiation Protocol) is a signaling protocol that is used to control multimedia communication sessions, such as voice and video calls, over Internet Protocol (IP).

The SIP protocol is analogous to HTTP for voice and is essentially the glue that ties communications systems together, much like HTTP ties clients (browsers) and web servers together for worldwide communication.

If Phone A wants to place a call to Phone B, the SIP protocol is used to exchange information about the call establishment (caller ID, callee number, etc.) and about how the two streams of voice (or video), Phone A -> Phone B (the caller’s voice at Phone A heard by the callee at Phone B) and Phone B -> Phone A, will be transmitted. Once Phone A and Phone B have agreed on these details (using the SDP protocol, enclosed in the SIP protocol), the two actual data streams are transmitted using the RTP protocol: in other words, the SIP protocol handles the signalling for the call.

The rest of this post focuses on call establishment: how the voice stream is transmitted (the streaming audio) will be covered in the next posts.

SIP Components
In short, call establishment in the SIP protocol (Alice wants to call Bob) can be described as follows.

Sip: Alice wants to call Bob

When Alice wants to initiate a call with Bob, she sends a SIP INVITE message (a call request) to Bob directly (or through an intermediate server): Bob’s phone responds with a Trying message, followed by several Ringing messages to indicate that Bob’s phone is ringing. When Bob answers the call, his device sends back an OK message: to confirm the call establishment, Alice then sends an ACK message to Bob.

Note: The communication flows directly or through an intermediate server. As we shall see, in the real world things can be a bit more complex: this is only meant to illustrate the flow of SIP messages.

Entities interacting in a SIP protocol environment are known generally as User Agents (UA), and there are two types of UAs: clients (UAC) and servers (UAS).

User Agent Client (UAC)
The UAC generates “methods” and sends them to servers (e.g., it sends an INVITE request to initiate a call).

User Agent Server (UAS)
The UAS receives the methods, processes them, and generates “responses” (e.g., it sends a 200 Ok response to indicate a successful session). UAS is the generic name of the device that receives the methods.

Att.: In other words, a SIP UA can perform the role of a User Agent Client (UAC), which sends SIP requests, and of a User Agent Server (UAS), which receives the requests and returns a SIP response: these UAC and UAS roles only last for the duration of a SIP transaction. In fact, most of the time a SIP device (e.g. an IP phone) implements both a UAC and a UAS (they are simply different pieces of software running on the device): the phone behaves as a UAC if it initiates a call, and the other party that receives the call is a UAS. The UAS can be another phone or an intermediate server that retransmits the request to another server or to the destination IP phone (that is, the final UAS). The next time, if the same phone receives a call, it will be a UAS.

The UAC is often associated with the end user, since applications running on systems are used by people. The UAC can be any end-user device, such as an IP phone, a softphone (software that emulates an IP phone) and others.

Each resource of a SIP network, such as a User Agent, is identified by a Uniform Resource Identifier (URI), based on the general standard syntax also used in email. A typical SIP URI has the form sip:username@domain.

UAS
SIP defines several server network elements that act as a UAS (in addition to the telephone that has been called). Two SIP UAs can communicate directly without any intervening SIP infrastructure, which is why the protocol is described as peer-to-peer, but this approach is often impractical for a public service.

In the real world the requests generated by the UAC are usually sent to a server (typically a proxy server) and not directly to the other SIP IP phone. There are several types of servers that help the UAC and the UAS connect to each other.

>> Proxy Server
SIP messages can be, and usually are, routed through one or more SIP proxy servers before reaching their destination: this is very similar to how email is transmitted, in that multiple email servers are usually involved in the delivery process, each forwarding the message in its original form.

Proxy servers help track down the IP addresses of recipients whose exact addresses are not known in advance. If the proxy server cannot find the address of the recipient, it sends the request to other proxy servers. In other words, when Alice wants to call Bob, she knows only Bob’s URI (bob@domain): the SIP proxy converts the SIP URI into an IP address. SIP proxy servers use presence services (the Registrar Server) to track users, which means users can be located regardless of their physical location (current IP address). Proxy servers are the most common servers in the SIP environment.

>> Registrar Server
A SIP registrar server is responsible for registering devices (typically IP phones). It does this by authenticating the device with a user name and password and keeping a table of IP addresses and extensions/phone numbers. Registration of devices plays an important role in the process, since SIP devices that do not register themselves cannot be called and SIP devices that do not successfully authenticate cannot make outbound calls.

Note: Commonly the proxy and the registrar server are on the same device: the PBX! They are simply different pieces of software on the PBX device.

A more complete description of call establishment using the SIP protocol follows.

Both Alice and Bob register with a registrar server for location identification purposes: the registrar server knows the IP addresses of both Bob’s and Alice’s IP phones.

If Alice wants to initiate a call with Bob, she will send an INVITE message (a call request to bob@domain) to her proxy server. This proxy server will act on Alice’s behalf and search for Bob’s proxy server. It will then send the INVITE message to Bob’s proxy. Bob’s proxy server will then look up Bob’s current device (using registrar server) and send an INVITE message to Bob.

When Bob accepts the INVITE (answers the call), he sends back an OK message, which propagates back to Alice through the proxies. Alice then sends an ACK directly to Bob, and a direct media session (to transport the voice) takes place after that. To disconnect the session, Alice or Bob sends a BYE message and the other replies with an OK message.
