In internet-based telephony solutions, ‘signaling’ refers to the protocols and methods used for one terminal (a device or app) to request or accept a call with another terminal. The transmission of the ‘media’ (audio and video packets) is handled using a protocol different from that used for signaling.
Signaling and media present different challenges. Media packets must be delivered in near real-time or the human ear will detect audio latency. Signaling packets can tolerate slightly more latency. One might expect the real-time media component to be the hardest part of multimedia calling, but as the number of devices in the network grows, signaling becomes a significant scaling problem of its own. The introduction of mobile “apps” brings additional concerns.
An industry-standard signaling protocol, called SIP, offers reliability and interoperability with the Public Switched Telephone Network. The use of SIP in mobile apps is common today, but it introduces scaling problems on the server side. Custom signaling protocols have been developed for mobile apps, but these do not have the benefits of interoperability.
This article describes some of the basic concepts of the SIP protocol and its implementations. It then goes on to describe how some of these very concepts do not work well in the mobile environment. A call for a new type of implementation is made in the conclusion.
Session Initiation Protocol (SIP) is the industry-standard signaling method today. In a SIP network, a central SIP server maintains a database of registered terminals that maps a name for each device to its IP address. A device registers with the SIP server by sending a REGISTER command, which associates the name of the device with its current IP address. Registration is what lets the SIP server know where to send messages for that device.
A SIP server may be configured to communicate with a terminal over TCP or UDP. If TCP is chosen, an active connection remains open between the SIP server and each device, and each open connection consumes CPU and memory on the SIP server. With UDP, only the address of the device must be retained, so fewer resources are used. The choice between TCP and UDP therefore affects the scalability of the SIP server.
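The registration database described above can be pictured as a simple lookup table. The following is a minimal sketch in Python, not the data model of any particular SIP server; the `Registrar` class, its method names, and the example addresses are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Registrar:
    """Toy location table: maps a device name to its current contact address."""

    bindings: Dict[str, str] = field(default_factory=dict)

    def register(self, name: str, contact: str) -> None:
        # A new registration replaces whatever address was stored for this name.
        self.bindings[name] = contact

    def unregister(self, name: str) -> None:
        # Removing an unknown name is treated as a no-op.
        self.bindings.pop(name, None)

    def lookup(self, name: str) -> Optional[str]:
        # Used when relaying a message: where does this name live right now?
        return self.bindings.get(name)


registrar = Registrar()
registrar.register("alice", "192.0.2.10:5060")
registrar.register("alice", "198.51.100.7:5060")  # alice re-registers from a new address
```

After the second call, a lookup for `alice` returns only the newer address, which is the behavior the server relies on when it relays a message to a named device.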
A Typical SIP Call Sequence
In a typical SIP call, a series of messages is exchanged between the terminals and the SIP server. An INVITE command from the calling party to the called party begins the sequence. The SIP server is involved in relaying each message. The called device sends back messages indicating the state of the call on its end. For instance, a RINGING message (status code 180 in SIP) indicates that the called device is sounding its ringer, and an OK (status code 200) indicates that a person answered the call.
Since the SIP server is a centralized resource used by many devices for many calls, its performance and scalability are important. If a SIP server becomes overwhelmed, every call in the network can be affected.
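The sequence above can be sketched as a small state machine from the calling party's point of view. This is an illustrative simplification, not a SIP implementation: the `CallState` names, the `TRANSITIONS` table, and the `run_call` helper are assumptions made for this example, and only the message labels (INVITE, 180 Ringing, 200 OK) come from SIP itself.

```python
from enum import Enum
from typing import Dict, Iterable, Tuple


class CallState(Enum):
    IDLE = "idle"
    INVITING = "inviting"
    RINGING = "ringing"
    ANSWERED = "answered"


# Allowed transitions as seen by the caller: (current state, message) -> next state.
TRANSITIONS: Dict[Tuple[CallState, str], CallState] = {
    (CallState.IDLE, "INVITE"): CallState.INVITING,
    (CallState.INVITING, "180 Ringing"): CallState.RINGING,
    (CallState.INVITING, "200 OK"): CallState.ANSWERED,  # answered with no ringing indication
    (CallState.RINGING, "200 OK"): CallState.ANSWERED,
}


def run_call(messages: Iterable[str]) -> CallState:
    """Feed a sequence of messages through the table and return the final state."""
    state = CallState.IDLE
    for message in messages:
        # Messages that make no sense in the current state are ignored in this sketch.
        state = TRANSITIONS.get((state, message), state)
    return state


final_state = run_call(["INVITE", "180 Ringing", "200 OK"])
```

A real call involves further messages (for example, acknowledging the answer and ending the call), which are left out here to keep the ring/answer model visible.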
Benefits of SIP
The main benefit of using SIP is that it is an industry standard that provides interoperability with a large number of service providers. Using SIP, it is possible to route calls to the Public Switched Telephone Network (PSTN) using SIP trunking. Because SIP is a mature standard, it provides capabilities for advanced features like “3-way Call Join.”
Assumptions of SIP
SIP evolved in a time when most devices were continuously connected to the network and were permanently powered on. These assumptions are not necessarily true for a mobile device or mobile app.
- Network Hop
A mobile device can frequently move from one network to another. With each move (“hop”) the device receives a new IP address.
Mobile apps can be developed that communicate this information directly to the SIP server. When the app notices that its IP address has changed, it could unregister the old address and then REGISTER the new one. However, for a large number of mobile apps, the number of messages transmitted for this function could overwhelm a SIP server.
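The unregister-then-REGISTER behavior can be sketched from the app's side as follows. This is a hedged illustration, not code from any SDK: the `MobileClient` class and its `sent` log are hypothetical, and `"UNREGISTER"` is only a label used here (in SIP itself, a binding is removed by sending a REGISTER with an expiry of zero).

```python
from typing import List, Optional, Tuple


class MobileClient:
    """Hypothetical app-side logic that re-registers whenever the IP address changes."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.registered_ip: Optional[str] = None
        # Log of (message, ip) pairs standing in for traffic sent to the SIP server.
        self.sent: List[Tuple[str, str]] = []

    def on_network_change(self, current_ip: str) -> None:
        if current_ip == self.registered_ip:
            return  # same address as before, nothing to tell the server
        if self.registered_ip is not None:
            # Drop the stale binding before announcing the new one.
            self.sent.append(("UNREGISTER", self.registered_ip))
        self.sent.append(("REGISTER", current_ip))
        self.registered_ip = current_ip


client = MobileClient("alice")
client.on_network_change("192.0.2.10")    # first registration
client.on_network_change("192.0.2.10")    # no change, no traffic
client.on_network_change("198.51.100.7")  # hop to a new network
```

Each hop costs two messages in this sketch, which is why the traffic grows quickly when many devices roam at once.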
- Power States
Mobile devices move through many different power states in order to preserve battery life. An app may be in the foreground when it is the direct focus of user interaction. It may be put in the background as the user moves to a different task. Or the app (and device) may enter a powered-down state when the user is not using the phone. When an app is in the background or the mobile device is powered down, the app may be unable to receive messages from the SIP server.
A mobile app may be a direct client of a SIP server, and many such “VOIP” (Voice over IP) apps exist in the iTunes App Store today.
Apple anticipates VOIP apps by letting the developer declare a special VOIP background mode for the app. A VOIP app receives special background handling and can be woken up by a remote command. However, this option is only available via a TCP connection to a server.
If a SIP server is configured to use TCP connections, then the relaying of an INVITE message from the SIP server to a sleeping iPhone can wake a VOIP app. This solution suffers from the fact that TCP connections are expensive. The SIP server is a precious resource, and oftentimes a bottleneck in the system. It is undesirable to configure the SIP server to keep TCP connections open to each device registered with it.
Using a SIP Server for Mobile
In many systems today, mobile apps are deployed that communicate via SIP directly to a SIP server on the internet.
As stated earlier, this arrangement has the following problems with respect to mobile devices:
- In order to support “wake” functionality, the server must maintain a TCP connection to each connected terminal.
- As devices roam, their IP address changes, and this happens often. The volume of “registration” messages can overwhelm a SIP server and prevent it from doing useful work.
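To get a feel for the registration volume, a back-of-the-envelope estimate helps. The function below is a rough sketch: the device count, the hop rate, and the assumption of two messages per hop (one to remove the old address, one to register the new one) are illustrative inputs, not measurements from any deployment.

```python
def register_messages_per_second(
    devices: int,
    hops_per_device_per_hour: float,
    messages_per_hop: int = 2,
) -> float:
    """Average registration-related messages per second arriving at the server."""
    messages_per_hour = devices * hops_per_device_per_hour * messages_per_hop
    return messages_per_hour / 3600.0


# Hypothetical example: one million devices, each changing networks six times an hour.
rate = register_messages_per_second(devices=1_000_000, hops_per_device_per_hour=6)
```

Under these assumed inputs the server would absorb several thousand registration messages every second before relaying a single call, and the load scales linearly with both the number of devices and how often they roam.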
SIP is a well-established protocol that offers many benefits: a solid “telephone-call” ring/answer model, tremendous interoperability, and many server implementations to choose from. However, many of the design decisions behind these implementations were not made with mobile apps in mind. Mobility itself means that these devices move through many power states and require apps to transition smoothly between them. A new model for the implementation of the SIP protocol is required for the mobile environment.
SightCall implements a proprietary signaling solution for mobile devices that incorporates the best features of SIP but also addresses the challenges of mobile devices. SightCall’s SDKs for Android and iOS seamlessly hop networks and react appropriately to power-state changes. If WebRTC on mobile devices is important to you, consider learning more about our unique native mobile SDKs.
Will WebRTC Replace SIP? [NoJitter: Nov 2014] – http://www.nojitter.com/post/240169322/will-webrtc-replace-sip
Notes from SightCall Labs: Ringing Multiple Extensions – http://sightcall.wpengine.com/notes-sightcall-labs-ringing-multiple-extensions/