VoIP (Voice over Internet Protocol), also known as IP telephony, transmits voice and multimedia sessions over IP networks. It is a cost-effective, open-architecture solution that provides an excellent communication experience in scenarios such as matchmaking, interviews, and consultations. VoIP Push is a very important part of implementing a VoIP application.

When audio and video services expand overseas, PushKit on iOS works together with the CallKit framework to provide the same incoming-call experience as the system phone: the call is displayed directly on the lock screen and is not interrupted by other applications while it is in progress. (Because of Apple's review policy, CallKit is not available in mainland China.)

In the Rongyun RTC Advanced Practical Master Class on June 23, Rongyun audio and video R&D engineers introduced the application of VoIP Push in overseas projects, covering the concept of VoIP and the implementation of VoIP Push on different platforms, and shared the key points of its practice on iOS.


What is VoIP?

Traditional voice communication uses the PSTN, which is based on circuit switching. Its defining characteristic is a continuous physical connection between the two parties. Once the circuit is connected, it looks like a dedicated line to the end user, and the switch does not inspect any of the transmitted data, so users get a completely transparent communication path.

(Traditional communication uses circuit switching technology, source: Nextiva)

Circuit switching establishes a connection for each call. Once established, the connection is occupied by that pair of users for the whole call and cannot be shared with other users, whether or not the pair is actually communicating. When little is being said, the effective transmission efficiency drops, and because the circuit is monopolized by the users for the duration of the call, the communication cost is higher.

VoIP (Voice over Internet Protocol, also known as IP telephony) is a technology that transmits voice communications and multimedia sessions over Internet Protocol (IP) networks.

VoIP technology uses packet switching technology as a communication platform.

(VoIP uses packet switching technology, source: Nextiva)

Packet switching adopts the "store-and-forward" approach of message switching, but instead of switching whole messages as a unit, it cuts each message into shorter packets of a uniform format for switching and transmission. After a packet enters a switch, the switch selects a transmission path based on the address information in the packet and forwards the packet along that path to the next switch or to the user terminal.

Compared with circuit switching, packet switching uses the communication path much more efficiently because no link is dedicated to a single call. In effect, every line in the IP backbone network provides transport for all users.
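
The packet idea described above can be sketched in a few lines of Python. This is a minimal illustration only, not how any real switch or VoIP stack is implemented; the Packet fields, the 8-byte packet size, and the destination address are assumptions chosen for the example.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    dest: str       # address information that switches would route on
    seq: int        # position of this chunk within the original message
    total: int      # how many packets the message was cut into
    payload: bytes  # the chunk of data itself


def packetize(message: bytes, dest: str, size: int) -> list:
    """Cut a message into packets of at most `size` bytes."""
    chunks = [message[i:i + size] for i in range(0, len(message), size)] or [b""]
    return [Packet(dest, seq, len(chunks), chunk) for seq, chunk in enumerate(chunks)]


def reassemble(packets: list) -> bytes:
    """Restore the original message regardless of arrival order."""
    ordered = sorted(packets, key=lambda p: p.seq)
    if len(ordered) != ordered[0].total:
        raise ValueError("missing packets")
    return b"".join(p.payload for p in ordered)


message = b"hello from a VoIP caller"
packets = packetize(message, dest="10.0.0.7", size=8)
random.shuffle(packets)  # packets may take different paths and arrive out of order
restored = reassemble(packets)
```

Because each packet carries its own address and sequence number, no path has to be reserved for the whole conversation, which is where the higher link utilization comes from.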

Operators invest heavily in hardware when building out the PSTN: communication cables, switching equipment, equipment at each node, construction, labor, maintenance, and so on. VoIP, by contrast, runs entirely over the Internet, which makes full use of the user's existing broadband and is more cost-effective. Combined with the powerful voice processing of dedicated terminal access equipment and the corresponding platform technology, VoIP call quality can match that of traditional telephones.

To sum up, the advantages of VoIP over traditional telephony are:
① It uses network resources more effectively.
② It is cost-effective.
③ The IP telephone network inherits the intelligence of computer networks, so value-added services can be developed flexibly.
④ It has an open architecture.


Implementation of VoIP Push on Different Platforms

VoIP Push is a very important part of implementing a VoIP application, and its implementation differs from platform to platform.

PC side

The PC side currently differs from mobile in that the PC app stays online in real time, so it can receive and display calls through Rongyun IMLib.

(PC-side call flow)

Unlike on the PC side, users on mobile can "kill" the app, so another channel is needed to deliver the incoming-call message.

Android side

Here you need to use the FCM service.
FCM (Firebase Cloud Messaging) is a cross-platform service built on Google Play Services that handles the sending, routing, and queuing of messages between server applications and mobile client applications.

As shown in the figure below, FCM acts as an intermediary between the message sender and the client. The client is an FCM-enabled app running on the device.

(FCM service process)

Specific to the Rongyun CallLib call solution, the process is shown in the following figure:

(Rongyun CallLib call process)

When the FCM service delivers the push message for an incoming call, the system launches the app. Once launched, the app connects to the IM service, and after the connection succeeds the client receives the incoming-call signaling.

Rongyun CallLib processes the incoming-call signaling and then calls the system's Telecom framework to create an incoming call. The app needs to register a ConnectionService to receive the callback from the system.

ConnectionService then creates a Connection object to handle the new incoming call. In the Connection, we can bring up the incoming-call screen provided by the business module.
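
On the server side, the push that starts this flow is an FCM message addressed to the callee's device. The following is a hedged sketch of how such a message body might be built for the FCM HTTP v1 API. The keys inside "data" (type, call_id, caller_name) are hypothetical and are not Rongyun's actual payload; authentication and the HTTP call itself are left out.

```python
import json

# FCM HTTP v1 send endpoint; {project_id} is the Firebase project ID.
FCM_SEND_URL = "https://fcm.googleapis.com/v1/projects/{project_id}/messages:send"


def build_incoming_call_message(device_token: str, call_id: str, caller_name: str) -> dict:
    """Build a data-only, high-priority FCM message for an incoming call."""
    return {
        "message": {
            "token": device_token,
            # Data values must be strings. These keys are illustrative only.
            "data": {
                "type": "incoming_call",
                "call_id": call_id,
                "caller_name": caller_name,
            },
            # High priority asks FCM to deliver immediately; a short TTL avoids
            # ringing for a call that ended long ago.
            "android": {"priority": "HIGH", "ttl": "30s"},
        }
    }


fcm_message = build_incoming_call_message("device-token-123", "call-42", "Alice")
fcm_body = json.dumps(fcm_message)
```

A data-only message is used here so that the app itself decides how to present the call after it has connected to the IM service.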

iOS side

What is unique about iOS is that Apple introduced PushKit, which supports VoIP push, and it must be used together with Apple's CallKit framework. Otherwise, iOS will "kill" the app process after the VoIP push is received.

CallKit provides a unified voice call UI and an API for interacting with that UI, so that app calls can show the same full-screen incoming-call and answer interface as native iOS phone calls. VoIP apps get the same call priority as system calls, and the address book, call history, Siri, Do Not Disturb mode, and so on are all well supported.

The following comparison shows the difference in experience more clearly. On the left is an incoming-call reminder received through an ordinary push. To answer the call, the user has to tap the notification and unlock the screen to enter the app.

On the right is an incoming call presented through CallKit. Users can answer or reject the call directly, whether or not the screen is locked, and get the same experience as with the system phone.


Both VoIP Push and normal push are based on APNs.

In this process, the app server decides which notifications to send to the user and when.

When a notification needs to be sent, the app server generates a request containing the notification data and the unique identifier of the user's device. The request is then forwarded to APNs, which is responsible for delivering the notification to the user's device. Once the notification is received, the operating system on the device handles any user interaction and passes the notification to the application.

(VoIP Push process)
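
As a hedged sketch of the request described above, the following Python builds the URL, headers, and body of a VoIP push for the APNs HTTP/2 provider API without sending it. The payload keys and the bundle ID are placeholders, and provider authentication (certificate or token) is omitted.

```python
import json

APNS_HOST = "https://api.push.apple.com"
VOIP_PAYLOAD_LIMIT = 5120  # bytes; regular remote notifications are limited to 4096


def build_voip_push_request(device_token: str, bundle_id: str, payload: dict) -> dict:
    """Describe the HTTP/2 request a provider would send to APNs for a VoIP push."""
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    if len(body) > VOIP_PAYLOAD_LIMIT:
        raise ValueError("VoIP payload exceeds the 5 KB APNs limit")
    return {
        "method": "POST",
        "url": f"{APNS_HOST}/3/device/{device_token}",
        "headers": {
            "apns-push-type": "voip",            # marks this as a PushKit VoIP push
            "apns-topic": f"{bundle_id}.voip",   # VoIP topic is the bundle ID plus ".voip"
            "apns-priority": "10",               # deliver immediately
            "apns-expiration": "0",              # do not store and retry a stale call
        },
        "body": body,
    }


# "call_id" and "caller_name" are illustrative keys, not a required schema.
request = build_voip_push_request(
    device_token="abc123",
    bundle_id="com.example.app",
    payload={"call_id": "call-42", "caller_name": "Alice"},
)
```

The device token in the URL is the "unique identifier of the user's device" mentioned above; for VoIP pushes it is the token the app obtains from PushKit rather than the ordinary remote-notification token.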

In the past, VoIP applications had to maintain a persistent network connection to the server to receive incoming calls and other data.

This meant writing complex code to send periodic messages back and forth between the application and the server to keep the connection alive, even when the application was not in use. This technique wakes the device frequently and wastes resources. It also means that if the user quits the VoIP application, calls from the server can no longer be received.

The PushKit framework replaces this approach: it allows an application to receive pushes from a remote server, and the app is woken up whenever a push arrives.

PushKit has the following features:
① The device will wake up only when a VoIP push occurs, saving energy.
② Different from standard push notifications, VoIP pushes go directly into the app for processing.
③ VoIP pushes are considered high-priority notifications and will be sent immediately.
④ VoIP push can contain more data content than standard push for processing business logic.
⑤ If the app is not running when a VoIP push arrives, the system launches it automatically. Note that PushKit must be used properly; otherwise the system will judge the wake-ups to be illegitimate and PushKit will become unusable.
⑥ Even if the app is running in the background, the app has a certain amount of time to process the push.

The CallKit used with it has the following features:
① Lets users place calls directly from the iPhone's Phone app through a link to the VoIP app.
② Integrates with system functions such as Do Not Disturb mode and silent or vibrate mode.
③ Supports call blocking: the caller's in-app number is first matched against the recipient's iPhone block list, and the app can also supply its own custom list of blocked numbers.
④ Identifies the caller and displays their name from the list of saved contacts. This is especially useful in social networks, where nicknames can differ from the names in the address book.


Implementation of PushKit+CallKit

First, create a VoIP certificate in the Apple Developer Center, and then enable the VoIP capability in Xcode.

In the code, we first need to register and implement the PushKit-related logic: initializing through PKPushRegistry, registering for the VoIP push token and uploading it to the server, and responding to incoming calls.

(Architecture diagram of CallKit)


As shown in the architecture diagram of CallKit above, the VoIP app and other system services, such as Bluetooth, Siri, and the Phone app, all connect to CallKit and can respond to one another by communicating through the system.

An application that implements CallKit needs two core CallKit classes: CXProvider and CXCallController.

CXProvider: the application uses it to let the system know about any incoming or outgoing calls.

CXCallController: the application uses it to make the system aware of any local user actions.

CXProviderConfiguration exposes what we can customize, including the display name, video support, the app icon, and the ringtone.

Call answering process


① When the user answers the call, the system will send a CXAnswerCallAction instance to the CXProvider.
② The application responds to the user's answer action by implementing the corresponding CXProviderDelegate method.

Incoming call process


① To report an incoming call, the application constructs a CXCallUpdate and sends it to iOS through CXProvider.
② The iOS system takes this as an incoming call and synchronizes it to all other system services.

Audio and video projects usually need to configure AVAudioSession. When should it be configured?

As shown in the figure above, when the user answers the call, the system calls the answer method provided by CXProviderDelegate. In the implementation of this method, configure the AVAudioSession and, once that is done, call fulfill() on the action object.

Call ending process


① First, the user taps the end-call button in the UI, and the reference to the call is obtained from CallKit.
② Since the call is about to end, stop processing the call's audio.
③ Call the end method to change the call state, which allows other system services to react to the new state.
④ Finally, call fulfill() to mark the action as completed, and the system ends the call.
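
The order of operations in the incoming, answering, and ending flows above can be modeled in plain Python. This is only a model of the sequencing: CallKit is an iOS framework, and the class and method names below are hypothetical stand-ins, not CallKit API.

```python
from enum import Enum


class CallState(Enum):
    IDLE = "idle"
    RINGING = "ringing"
    ACTIVE = "active"
    ENDED = "ended"


class Action:
    """Stand-in for a system action that the app must mark as fulfilled."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.fulfilled = False

    def fulfill(self) -> None:
        self.fulfilled = True


class CallModel:
    def __init__(self) -> None:
        self.state = CallState.IDLE
        self.audio_running = False

    def report_incoming_call(self) -> None:
        # Incoming flow: the app reports the call, the system shows the call UI.
        if self.state is not CallState.IDLE:
            raise RuntimeError("a call is already in progress")
        self.state = CallState.RINGING

    def handle_answer(self, action: Action) -> None:
        # Answer flow: configure audio first, then fulfill the action.
        if self.state is not CallState.RINGING:
            raise RuntimeError("no ringing call to answer")
        self.audio_running = True
        self.state = CallState.ACTIVE
        action.fulfill()

    def handle_end(self, action: Action) -> None:
        # End flow: stop audio, change state, then fulfill the action.
        if self.state not in (CallState.RINGING, CallState.ACTIVE):
            raise RuntimeError("no call to end")
        self.audio_running = False
        self.state = CallState.ENDED
        action.fulfill()


call = CallModel()
call.report_incoming_call()
answer = Action("answer")
call.handle_answer(answer)
end = Action("end")
call.handle_end(end)
```

The point of the model is the ordering: audio is set up before the answer action is fulfilled, and audio is stopped before the end action is fulfilled.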


融云RongCloud