This article was written by Zhangye Dong's team, winner of the RTE 2021 Innovative Programming Challenge. In real-time audio and video, content often needs copyright protection, and blind watermarking is one way to provide it. The team built a user-side blind watermarking plug-in for real-time video on top of the Shengwang (Agora) SDK. Other developers who use the Shengwang SDK can also integrate the plug-in into their own applications. The project's source code is available through the "Read the original" link.

Project Introduction

Video blind watermarking embeds identifying information directly in the frequency domain of a video's RGB or YUV data. It has almost no effect on the viewing quality of the original video and is hard to notice. The information hidden in the carrier can be used to confirm who created or used the content, or to check whether the video has been tampered with. The technology is usually offered by professional copyright protection providers for broadcast and television copyright protection, and it is strongly commercial in nature.

This project develops a user-side plug-in for real-time video blind watermarking based on the Shengwang SDK, together with watermark recognition software that runs on a personal PC and is used to verify the watermark. It lowers the professional barrier to using blind watermarking and gives individual users a convenient way to protect their privacy and guard their works against piracy.

Implementation principle

Blind watermarking works by superimposing information in the frequency domain. Usable transforms include the discrete Fourier transform and the wavelet transform. With the Fourier transform, for example, the text or image is superimposed on the real and imaginary parts of the spectrum, and the inverse transform then produces the video frame that is displayed.

[Image]
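The embedding step described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the project's code: the 5x5 glyph bitmaps, the mask position, the `strength` value, and the synthetic test frame are all hypothetical, and a single grayscale plane stands in for a video frame.

```python
import numpy as np

# Hypothetical 5x5 bitmaps standing in for rendered watermark text ("w", "m").
GLYPH_W = np.array([[1, 0, 0, 0, 1],
                    [1, 0, 0, 0, 1],
                    [1, 0, 1, 0, 1],
                    [1, 1, 0, 1, 1],
                    [1, 0, 0, 0, 1]], dtype=np.float64)
GLYPH_M = GLYPH_W[::-1]  # the "w" bitmap flipped vertically reads as "m"


def make_mark(shape, top=30, left=30):
    """Place the two glyphs side by side in an otherwise empty spectrum mask."""
    mark = np.zeros(shape, dtype=np.float64)
    mark[top:top + 5, left:left + 5] = GLYPH_W
    mark[top:top + 5, left + 7:left + 12] = GLYPH_M
    return mark


def embed_watermark(frame, mark, strength=4000.0):
    """Add the mark to the real and imaginary parts of the frame's spectrum."""
    spectrum = np.fft.fft2(frame.astype(np.float64))
    spectrum += strength * mark * (1 + 1j)
    # Only the real part can be displayed. Keeping it splits the mark into
    # two copies, mirrored through the spectrum's origin, at half strength.
    marked = np.fft.ifft2(spectrum).real
    return np.clip(np.rint(marked), 0, 255).astype(np.uint8)


def demo_frame(size=128, seed=0):
    """Synthetic grayscale frame: a horizontal gradient plus mild noise."""
    rng = np.random.default_rng(seed)
    ramp = np.linspace(64.0, 192.0, size)[np.newaxis, :]
    noisy = ramp + rng.normal(0.0, 2.0, (size, size))
    return np.clip(np.rint(noisy), 0, 255).astype(np.uint8)


frame = demo_frame()
marked = embed_watermark(frame, make_mark(frame.shape))
change = np.abs(marked.astype(np.int16) - frame.astype(np.int16))
print("largest per-pixel change:", int(change.max()))
```

The strength value trades visibility against detectability: a larger value makes the mark easier to recover but changes the pixels more.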

To extract the watermark, take a screenshot of a video frame and apply the Fourier transform to it again to obtain the frequency-domain data. Displaying the frequency-domain amplitude (that is, the energy) gives an amplitude map in which the previously superimposed text is visible.

[Image]
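The extraction step can be sketched the same way. The example below is an assumption-laden illustration rather than the project's recognition software: it builds its own synthetic watermarked frame (a solid bar of marked coefficients stands in for text, and `strength` is a made-up value), then checks that the marked coefficients stand out in the amplitude spectrum.

```python
import numpy as np


def amplitude(frame):
    """Frequency-domain amplitude of a grayscale frame."""
    return np.abs(np.fft.fft2(frame.astype(np.float64)))


def to_display(amp):
    """Log-scale the amplitude and stretch it to 0..255 for viewing."""
    logged = np.log1p(amp)
    scaled = 255.0 * (logged - logged.min()) / (logged.max() - logged.min())
    return scaled.astype(np.uint8)


def demo_marked_frame(size=128, seed=0, strength=4000.0):
    """Synthetic frame with a hypothetical bar-shaped mark in its spectrum."""
    rng = np.random.default_rng(seed)
    ramp = np.linspace(64.0, 192.0, size)[np.newaxis, :]
    frame = np.rint(ramp + rng.normal(0.0, 2.0, (size, size)))
    mask = np.zeros((size, size), dtype=bool)
    mask[30:35, 30:42] = True  # stands in for the rendered text
    spectrum = np.fft.fft2(frame) + strength * mask * (1 + 1j)
    marked = np.clip(np.rint(np.fft.ifft2(spectrum).real), 0, 255)
    return marked.astype(np.uint8), mask


marked, mask = demo_marked_frame()
amp = amplitude(marked)
ratio = amp[mask].min() / np.median(amp)
print("weakest marked coefficient vs. median amplitude:", round(float(ratio), 1))
picture = to_display(amp)  # this array is what a viewer would show
```

Because the displayed frame is real-valued, the amplitude map is symmetric about the origin, so the mark appears twice: once where it was written and once point-reflected.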

The fast Fourier transform has a complexity of O(n log n), so in principle blind watermarks can be superimposed in real time within the video processing pipeline.
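To give the O(n log n) figure a sense of scale, the sketch below estimates the work for a hypothetical 1280x720 stream at 30 frames per second, with one forward and one inverse transform per frame. The resolution and frame rate are assumptions chosen for illustration, and the count ignores constant factors, so it is an order-of-magnitude guide rather than a benchmark.

```python
import math


def fft2_cost(width, height):
    """Rough n*log2(n) operation count for a 2D FFT over n = width*height samples."""
    n = width * height
    return n * math.log2(n)


def per_second_cost(width, height, fps, transforms_per_frame=2):
    """Cost per second, assuming a forward and an inverse transform per frame."""
    return fps * transforms_per_frame * fft2_cost(width, height)


per_frame = fft2_cost(1280, 720)
per_second = per_second_cost(1280, 720, fps=30)
print(f"per transform: {per_frame:.2e}, per second: {per_second:.2e}")
```

On these assumptions the estimate is on the order of 10^9 basic operations per second, which is why real-time processing is plausible in principle.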

Design and implementation

The program design has two parts: integration with the Shengwang SDK and blind watermark development. Blind watermark development is in turn divided into watermark embedding on Android and watermark extraction on Windows. In the diagram they are shown in gray, yellow, and orange. Because this is a demo, blind watermark processing is applied only to the local video preview; it can be extended to the video display in the future.

[Image]

The design of the solution focuses on two areas: SDK integration and third-party compatibility. The main considerations are minimizing copies of YUV data, serializing video processing, staying compatible with third-party libraries, and generalizing across scenarios.
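One way to read "minimizing copies of YUV data" is to process the luma plane through a view of the frame buffer instead of converting or duplicating the whole frame. The sketch below assumes an I420 layout (a full-resolution Y plane followed by quarter-size U and V planes) held in a `bytearray`; the function names are hypothetical and are not part of the Shengwang SDK.

```python
import numpy as np


def y_plane_view(i420, width, height):
    """Writable view of the luma (Y) plane of an I420 buffer; copies nothing."""
    flat = np.frombuffer(i420, dtype=np.uint8)
    return flat[: width * height].reshape(height, width)


def process_luma_in_place(i420, width, height, transform):
    """Run `transform` on the Y plane and write the result back into the buffer."""
    y = y_plane_view(i420, width, height)
    y[...] = transform(y)


width, height = 4, 2
frame = bytearray(range(width * height * 3 // 2))  # 8 Y bytes, then 2 U, 2 V
process_luma_in_place(frame, width, height, lambda y: 255 - y)
print(list(frame))
```

The transform itself may still allocate temporaries (an FFT does), but the chroma planes are never touched and the frame buffer is never duplicated.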

Core code

The main flow for superimposing the watermark:

[Image]

The OpenCV calls:

[Image]

There are two main functions: the Fourier transform and the text overlay. The Shengwang SDK and the open-source OpenCV library work well together.

Results

[Image]

The first picture is the original video. After watermark text such as "wm" is entered, the second picture shows the video with the blind watermark superimposed; the video quality is essentially unaffected. The last picture shows the result after the second picture is uploaded to a PC and the user extracts the watermark: the text "wm" is clearly visible in the image. This completes the verification.

Future outlook

The next steps are to improve the robustness of the watermark, extend its application scenarios, and enrich the data dimensions it carries. For robustness, the plan is to partition the frame into a grid in the spatial domain and superimpose the frequency-domain watermark separately on each region; to try other transforms, such as the DWT, to find the best result; and to encode the watermark itself redundantly, which improves its recognizability and concealment. For application scenarios, superimposing the watermark at the real-time video display terminal would make it possible to trace the source of covert recordings. For data dimensions, the approach can be extended to voiceprint watermarking in audio processing and to feature encoding that draws on video content features.
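As an illustration of what redundant coding of the watermark could look like, the sketch below uses the simplest possible scheme: each bit of the watermark text is repeated several times and recovered by majority vote. This is an assumption for illustration only; the article does not say which code the team plans to use, and a production system would likely choose a stronger error-correcting code.

```python
def text_to_bits(text):
    """UTF-8 bytes of the text as a list of bits, most significant bit first."""
    return [(byte >> shift) & 1
            for byte in text.encode("utf-8")
            for shift in range(7, -1, -1)]


def bits_to_text(bits):
    """Inverse of text_to_bits."""
    data = bytes(
        sum(bit << (7 - i) for i, bit in enumerate(bits[start:start + 8]))
        for start in range(0, len(bits), 8)
    )
    return data.decode("utf-8")


def encode_repetition(bits, repeat=5):
    """Repeat every bit `repeat` times."""
    return [bit for bit in bits for _ in range(repeat)]


def decode_repetition(coded, repeat=5):
    """Recover each bit by majority vote over its group of repeats."""
    groups = (coded[i:i + repeat] for i in range(0, len(coded), repeat))
    return [1 if 2 * sum(group) > len(group) else 0 for group in groups]


coded = encode_repetition(text_to_bits("wm"))
for position in (0, 1, 17, 40):  # simulate four bit errors in a noisy extraction
    coded[position] ^= 1
recovered = bits_to_text(decode_repetition(coded))
print(recovered)  # prints wm
```

With five repeats per bit, up to two errors in any group are corrected, at the cost of a watermark payload five times larger.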

RTE Developer Community