The Session Description Protocol (SDP)
SIP is used to establish sessions between users, which requires that the users agree on the type and coding of the information to be shared. For example, the two users must agree on the voice-coding scheme to be used, which requires that they exchange session descriptions. These session descriptions are coded according to the Session Description Protocol (SDP). SDP is simply a language for describing sessions. It contains information regarding the parties involved in the session, the date and time when the session is to take place, the types of media streams to be shared, and the addresses and port numbers to be used. A session description can refer to multiple media streams, as in a video conference where one media stream carries coded voice and another carries coded video. Consequently, SDP is structured so that it can describe information related to the session as a whole (e.g. the name of the session), plus information associated with each individual stream (e.g. the media format and the applicable port number).
Some of the information included in an SDP session description will also be included in the SIP message that carries the SDP description. This overlap exists because SDP is designed to be used by a range of other protocols, not just SIP.
Perhaps the best way to describe the combined usage of SIP and SDP is by example. Consider Figure 8-10, which is a more detailed version of the call establishment scenario presented in Figure 8-9. In this case, we see a call from User1@work.com, who is logged in at station1.work.com, to User2@work.com, who is logged in at station2.work.com. As with any SIP session establishment, the call begins with an INVITE, which is indicated in the first line of the request. The first line also indicates the address of the entity to which the message is being sent, known as the request Uniform Resource Identifier (URI).
In this case, the message is being sent directly to User2. If, however, there happens to be a proxy server between User1 and User2, then the request would first go to the proxy, in which case the request URI would indicate the proxy. The Via header field is inserted by each entity in the chain from the source of a message to the destination. This ensures that the response follows the same path back through the network that the original request took. The From and To header fields indicate the initiator of the request and the recipient of the request, respectively. The Call-ID is a globally unique identifier. To ensure uniqueness, the Call-ID should take the form indicated in the figure. The CSeq field refers to the command sequence. The CSeq contains an integer and an indication of the type of request. The purpose of the CSeq header is to enable the initiator of a request to correlate a response with the request that generated it.
Finally, we have two headers that provide information about the message body. The first, Content-Length, indicates the length of the message body. The second, Content-Type, indicates the type of message body. Strictly speaking, the message body could be of any Multipurpose Internet Mail Extensions (MIME) type, such as text. In our example, the message body contains a session description coded according to SDP.
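The header layout described above can be illustrated with a short Python sketch. It is not a reproduction of Figure 8-10: the request URI, the Via value, the Call-ID value, and the stand-in body are assumptions chosen to fit the User1/User2 scenario, and the function name `parse_sip_message` is hypothetical.

```python
def parse_sip_message(raw):
    """Split a SIP message into its start line, header fields, and body."""
    # A blank line (CRLF CRLF) separates the header fields from the body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Via may appear once per hop, so every header name maps to a list.
        headers.setdefault(name.strip().lower(), []).append(value.strip())
    return start_line, headers, body


# Hypothetical INVITE: all values are illustrative, not copied from the figure.
body = "v=0\r\n"  # stand-in for the SDP session description
invite = (
    "INVITE sip:User2@station2.work.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP station1.work.com:5060\r\n"
    "From: User1 <sip:User1@work.com>\r\n"
    "To: User2 <sip:User2@work.com>\r\n"
    "Call-ID: 123456@station1.work.com\r\n"
    "CSeq: 1 INVITE\r\n"
    "Content-Type: application/sdp\r\n"
    f"Content-Length: {len(body.encode('utf-8'))}\r\n"
    "\r\n" + body
)

start_line, headers, parsed_body = parse_sip_message(invite)
method, request_uri, version = start_line.split(" ")
print(method, request_uri)
print(headers["cseq"], headers["content-type"])
```

The sketch ignores header folding and compact header names, which a real SIP parser would need to handle.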
The message body is separated from the SIP headers by a blank line. The SDP description starts with a version identifier, which is version 0. Next, we find the Origin (o) field, which indicates the user name (User1), a session ID (123456 in our case) that does not have to match the SIP Call-ID, a version for the session (001 in our example), the type of network (IN indicates Internet), the type of addressing used (IP4 indicates IP version 4), and an address for the machine that initiated the session (station1.work.com in our example). After the Origin field, we find the Session Name (s) field and the Connection (c) field. The Connection field provides information regarding where the user would like the media to be sent. In our case, it indicates that the type of network is Internet (IN), that the addressing uses IP version 4 (IP4), and the address to which media should be sent. This address could be different from the address of the machine that created the session. After the Connection field, we find the Time (t) field, which indicates the start and stop times for the session. In our example, these are both set to 0, which means that the session does not have any set start or stop time.
Next, we find the Media (m) field, which provides information about the media to be used and the port to which the media should be sent. In our example, the type of media is audio, and it should be received at port number 4444 (i.e. the far end should send the media to port number 4444). The Media field also indicates that the RTP audio/video profile (AVP) is to be used and gives the payload type, which is 98 in our example. In RTP, certain types of media stream codings are assigned specific payload type values and are known as static payload types. For example, payload type 0 indicates G.711 (mu-law) coded voice. Thus, if the Media field of the SDP description indicated RTP/AVP payload type 0, then the far end would know that G.711 coded voice is required.
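As a rough illustration of the fields just described, the following Python sketch parses a session description built from the values mentioned in the text (User1, 123456, 001, port 4444, payload type 98). The session name, the connection address, and the helper names `parse_sdp` and `codec_for` are assumptions made for this example, and the sketch handles only the field types discussed here.

```python
# Static payload types from the RTP audio/video profile (only two shown).
STATIC_PAYLOAD_TYPES = {0: "PCMU/8000", 8: "PCMA/8000"}

# Hypothetical description; the s= and c= values are assumed, not from the figure.
SDP_TEXT = (
    "v=0\r\n"
    "o=User1 123456 001 IN IP4 station1.work.com\r\n"
    "s=Example session\r\n"
    "c=IN IP4 station1.work.com\r\n"
    "t=0 0\r\n"
    "m=audio 4444 RTP/AVP 98\r\n"
    "a=rtpmap:98 AMR/8000\r\n"
)


def parse_sdp(text):
    """Parse session-level fields and per-stream (m=) sections."""
    session = {"streams": []}
    current = session  # fields apply to the session until the first m= line
    for line in text.splitlines():
        key, _, value = line.partition("=")
        if key == "v":
            session["version"] = int(value)
        elif key == "o":
            user, sess_id, sess_ver, net, addr_type, addr = value.split()
            session["origin"] = {"user": user, "session_id": sess_id,
                                 "session_version": sess_ver, "net_type": net,
                                 "addr_type": addr_type, "address": addr}
        elif key == "s":
            session["name"] = value
        elif key == "c":
            net, addr_type, addr = value.split()
            current["connection"] = {"net_type": net, "addr_type": addr_type,
                                     "address": addr}
        elif key == "t":
            start, stop = value.split()
            session["time"] = (int(start), int(stop))
        elif key == "m":
            media, port, proto, *formats = value.split()
            current = {"type": media, "port": int(port), "proto": proto,
                       "formats": [int(f) for f in formats], "rtpmap": {}}
            session["streams"].append(current)
        elif key == "a" and value.startswith("rtpmap:"):
            payload_type, encoding = value[len("rtpmap:"):].split(" ", 1)
            current.setdefault("rtpmap", {})[int(payload_type)] = encoding
    return session


def codec_for(stream, payload_type):
    """Prefer an rtpmap entry from the description, else the static table."""
    return stream["rtpmap"].get(payload_type) or STATIC_PAYLOAD_TYPES.get(payload_type)


session = parse_sdp(SDP_TEXT)
stream = session["streams"][0]
print(stream["type"], stream["port"], codec_for(stream, stream["formats"][0]))
```

Keeping session-level and stream-level fields in separate dictionaries mirrors the way SDP distinguishes information about the session as a whole from information about each media stream.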
RTP also includes the concept of a dynamic payload type, where the payload type value is significant only within one session. Therefore, it is necessary to indicate additional attributes in order for the far end to understand the meaning of the payload type chosen. In our example, the attribute “a=rtpmap:98 AMR/8000” indicates that payload type 98 is Adaptive Multi-Rate (AMR) coding sampled at 8000 Hz.
Figure 8-10 shows that the first response includes status code 180, indicating that the user is being alerted. It contains the same Via, To, From, CSeq, and Call-ID header fields as the original request, which enable the sender of the request to match the response with the request. This response does not contain any session description. When the called user answers, a 200 (OK) response is generated. The SIP header fields are the same as those of the original INVITE request, with the exception of the Content-Length and the message body itself. This is because the called device has included a session description of its own. This is quite similar to the session description in the INVITE request, but indicates a different address and port number. This makes sense, as the address and port number indicate where User2 expects to receive the media stream.
Once User1 has received the 200 (OK) response, it sends an ACK message. The header fields in this message are identical to those of the original INVITE, with the exception of the CSeq field, which now indicates the ACK request. At this point, media can flow between the two parties and a conversation can take place.
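The way responses are matched to the request can be sketched in Python as follows. This is a simplified illustration that compares only the Call-ID and CSeq fields mentioned in the text; real SIP stacks also take the Via branch parameter into account. The header values and the names `transaction_key` and `matches` are assumptions for this example.

```python
def transaction_key(headers):
    """Return (Call-ID, CSeq number, CSeq method) for a set of headers."""
    number, method = headers["CSeq"].split()
    return headers["Call-ID"], int(number), method


def matches(request_headers, response_headers):
    """A response belongs to a request if Call-ID and CSeq are identical."""
    return transaction_key(request_headers) == transaction_key(response_headers)


# Hypothetical header values for the INVITE sent by User1.
invite_headers = {"Call-ID": "123456@station1.work.com", "CSeq": "1 INVITE"}

# Incoming responses as (status code, headers); the last one is unrelated.
responses = [
    (180, {"Call-ID": "123456@station1.work.com", "CSeq": "1 INVITE"}),
    (200, {"Call-ID": "123456@station1.work.com", "CSeq": "1 INVITE"}),
    (200, {"Call-ID": "other-call@elsewhere.example", "CSeq": "1 INVITE"}),
]

matched = [status for status, hdrs in responses if matches(invite_headers, hdrs)]
print(matched)

# The ACK reuses the INVITE's headers; only the method in CSeq changes.
ack_headers = dict(invite_headers, CSeq="1 ACK")
print(transaction_key(ack_headers))
```

Because the ACK keeps the same Call-ID and sequence number, the far end can associate it with the INVITE it acknowledges even though the method in the CSeq field differs.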