

【Handy: what the heart intends, the hand carries out】

The hand is the most dexterous organ of the human body and takes part in every aspect of our lives. Besides the eyes, it is the organ that can directly perceive the three-dimensional world and the objects in it. As a silent interaction tool, it has come to occupy an extremely important position in computing.

In human-computer interfaces, hand interaction is very important, which is why "touch interaction" has become more and more common. As technology develops and aims to free the body, more and more devices are introducing "gesture interaction" technology.

"Gesture interaction" means that when people use electronic devices, they no longer need to touch a screen, mouse, keyboard, or the like, and can do away with the operating medium altogether.

Alibaba Cloud Video Cloud's "Gestures in the Air" is one such "future" interaction technology.

Going to work without a mouse: staging an operation "in the air"

https://www.youku.com/video/XNTg3NDQzMjkwNA==
This video of a programmer's "air gestures" shows in-the-air interaction in an office setting. It uses the "smart gesture interaction engine" that Alibaba Cloud Video Cloud developed on top of gesture recognition technology.

As the short video shows, whether browsing pages, logging in to a system, or making fine-grained video edits, the programmer no longer works step by step with a mouse and keyboard. Instead, various static and dynamic gestures provide smooth, real-time, and accurate remote control. This level of fine control breaks through an existing bottleneck in remote gesture interaction technology.

"Touch interaction" requires contact between the user and the device, and "voice interaction" requires listening, speaking, and high-precision recognition. "Gesture interaction", by contrast, has the natural advantage of matching human habits, which makes it another strong option in scenarios where touch and voice interaction are inconvenient.

When it comes to gesture interaction, the basis is "gesture recognition" technology.

Start with gesture recognition

In computer science, gesture recognition is the problem of identifying human gestures with mathematical algorithms, so that users can control or interact with devices through gestures and computers can understand human behavior.

The key technologies of gesture recognition are gesture segmentation, gesture analysis, and static and dynamic gesture recognition. For both static and dynamic gestures, recognition proceeds in the same order: first detect the hand in the captured image and segment the gesture; then, through gesture analysis, obtain the gesture's shape features or motion trajectory; finally, use the important features from the analysis to complete static or dynamic gesture recognition.
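The three stages described above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the brightness threshold, the shape features, and the labels are toy stand-ins chosen for this example, and all function names are hypothetical. It is not how the engine discussed in this article is implemented.

```python
import numpy as np

def segment_hand(image, threshold=0.5):
    """Stage 1 (toy segmentation): treat pixels brighter than `threshold` as hand."""
    return image > threshold

def analyze_gesture(mask):
    """Stage 2 (toy analysis): simple shape features of the segmented region."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None  # nothing was segmented
    height = ys.max() - ys.min() + 1
    width = xs.max() - xs.min() + 1
    return {
        "area": int(ys.size),
        "centroid": (float(ys.mean()), float(xs.mean())),
        "aspect": float(height / width),
    }

def recognize_static(features):
    """Stage 3 (toy recognition): map features to a label with a hand-written rule."""
    if features is None:
        return "no_hand"
    return "vertical_shape" if features["aspect"] > 1.5 else "compact_shape"

def recognize(image):
    """Run segmentation, analysis, and recognition in order."""
    return recognize_static(analyze_gesture(segment_hand(image)))

# A synthetic 8x8 frame with a tall bright blob (6 rows by 2 columns).
frame = np.zeros((8, 8))
frame[1:7, 3:5] = 1.0
label = recognize(frame)
empty_label = recognize(np.zeros((8, 8)))
```

A real system would replace each stage with a learned detector or classifier, but the order of the stages stays the same.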

Research on gesture recognition shapes how natural and flexible human-computer interaction can be. At present, most researchers in the industry focus on the final recognition step, usually simplifying the background and using algorithms to segment and analyze gestures against a single, uniform background.

In real applications, however, the hand is usually in a complex environment. Accurate gesture recognition has to account for factors such as lighting that is too bright or too dark and varying distances between the hand and the capture device.

How does the "smart gesture interaction engine" of Alibaba Cloud Video Cloud make "air gestures" more intelligent and interactive?

High-performance intelligent gesture interaction engine

Gesture keypoint tracking is challenging due to the complex finger-palm structure and high flexibility during movement. The intelligent gesture interaction engine developed by the Alibaba Cloud Video Cloud team supports the recognition of 25 basic static gestures by accurately identifying and tracking 21 key points of the hand.

Based on these 25 basic gestures, combined with palm pose information and the scene, more than 100 gestures can be derived. Take the extended-thumb gesture as an example: depending on the direction in which the thumb points, it can be recognized precisely as a like (thumb up), a dislike (thumb down), left (thumb left), right (thumb right), and so on.
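To make the thumb example concrete, here is a minimal sketch of how one base gesture could be split into four by direction. The article does not describe the engine's keypoint ordering or its rules, so everything here is an assumption: the indices 2 (thumb base joint) and 4 (thumb tip), the image coordinate convention with y growing downward, and the function name `thumb_direction` are all hypothetical.

```python
import math

# Assumed keypoint layout (not taken from the article):
# index 2 = thumb base joint, index 4 = thumb tip.
THUMB_BASE, THUMB_TIP = 2, 4

def thumb_direction(keypoints):
    """Classify where the thumb points.

    keypoints: list of 21 (x, y) pairs in image coordinates,
    where x grows to the right and y grows downward.
    """
    bx, by = keypoints[THUMB_BASE]
    tx, ty = keypoints[THUMB_TIP]
    dx, dy = tx - bx, ty - by
    if math.hypot(dx, dy) < 1e-6:
        return "unknown"  # base and tip coincide, no direction
    if abs(dy) >= abs(dx):
        return "up" if dy < 0 else "down"
    return "right" if dx > 0 else "left"

# Example: all keypoints at the image center, thumb tip moved upward.
hand = [(0.5, 0.5)] * 21
hand[THUMB_TIP] = (0.5, 0.2)
print(thumb_direction(hand))  # up
```

Note that "left" and "right" here are relative to the image; a mirrored front-camera preview would swap them.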

https://www.youku.com/video/XNTg3MzQ0ODU0OA==
In the video, Left_Prob represents the confidence of the left hand, and Gesture_ID represents the recognized gesture ID.

In addition to static gestures, various dynamic gestures can also be accurately recognized and tracked, such as swiping up, down, left, and right, turning pages left and right, zooming out and in, and waving goodbye. This is what makes the video editing shown in the earlier video possible.
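A dynamic gesture differs from a static one in that it is read from a trajectory over several frames rather than from a single frame. As a rough sketch of that idea, the snippet below classifies a swipe from the net displacement of a tracked palm center. The normalized coordinates, the `min_distance` threshold, and the function name `classify_swipe` are assumptions made for this example and do not reflect the engine's actual method.

```python
def classify_swipe(track, min_distance=0.2):
    """Classify a swipe from a sequence of palm-center positions.

    track: list of (x, y) positions over consecutive frames, normalized
    to [0, 1], with x growing to the right and y growing downward.
    min_distance: assumed minimum net movement that counts as a swipe.
    """
    if len(track) < 2:
        return "none"
    dx = track[-1][0] - track[0][0]
    dy = track[-1][1] - track[0][1]
    if max(abs(dx), abs(dy)) < min_distance:
        return "none"  # the hand barely moved
    if abs(dx) >= abs(dy):
        return "swipe_right" if dx > 0 else "swipe_left"
    return "swipe_down" if dy > 0 else "swipe_up"

# Example: the palm moves from the left side of the frame to the right.
print(classify_swipe([(0.2, 0.5), (0.4, 0.5), (0.7, 0.52)]))  # swipe_right
```

Using only the first and last positions keeps the sketch short; a practical recognizer would also look at speed and at the shape of the path between them.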

It is worth mentioning that the algorithm behind Alibaba Cloud Video Cloud's "smart gesture interaction engine" not only delivers "high precision" and "high stability", but is also "super lightweight".

"High precision" means it can accurately identify various hand gestures and locate the key points of the hand, even in challenging scenes such as low light and backlight.

"High stability" means that, after extensive polishing of the algorithm, it detects the key points of the hand and outputs stable key point positions, achieving ultra-low latency for gesture interaction.

"Super lightweight" shows in its single-threaded performance on ordinary devices: the average time per frame is only 6.5 milliseconds, processing throughput exceeds 150 fps, and the model is only 2.6 MB. It is compatible with all mainstream platforms and well suited to deployment on ordinary mobile devices such as phones.

Everything is Different With Gesture Interaction

New forms of interaction are becoming a trend, and more natural interaction that frees the body is the direction in which interaction is evolving. It is easy to imagine it bringing new forms and experiences to life, work, and study, and this "black technology" of interaction can gradually be applied to a variety of scenarios.

In interactive classrooms, students can keep their distance from the screen, for the sake of both eye health and a richer experience. With air gestures, they can complete interactive operations such as selecting courses, answering questions, turning pages, and raising their hands.

With epidemic control now routine and online classrooms commonplace, the intelligent gesture interaction engine helps industry users redefine how teaching content is interacted with online. Teaching in front of the screen is no longer one-way instillation of knowledge between teachers and students; instead, the interactivity and perceptiveness of the online classroom are greatly enriched, making education more intelligent.

https://www.youku.com/video/XNTg3NDQzMzEyNA==

In e-commerce and entertainment live streaming, it is very inconvenient for the host to operate the phone screen while broadcasting. With gesture interaction, the host can interact with the audience in real time through gesture-triggered special effects, and can also use gestures to control the flow of the stream and the picture. On the viewer side, stickers and special effects can be shown in real time in response to the user's gestures, such as a thumbs-up or a finger heart, which greatly improves the interactive experience.

In digital exhibition halls, digital visual displays keep innovating. With air gestures, visitors do not need to touch the screen: they can rotate and move exhibits with gestures and view them in a 360-degree panorama. Especially during the epidemic, this reduces the safety risks of close contact.

In intelligent driving, gesture recognition is applied to driver assistance systems. The driver can use gestures to control in-car functions and parameters, avoiding the safety hazard of looking away from the road.

In daily life, air gesture recognition can be deeply integrated with smart hardware such as smart home appliances and intelligent robots. Controlling home appliances with air gestures is more convenient and makes human-machine interaction a richer experience. And if you apply gesture recognition to offline activities, your imagination can produce even more interesting interactive experiences.

With the "smart gesture interaction engine", Alibaba Cloud Video Cloud has already built up relatively mature technology and applications in 2D gesture recognition. In the future, it will continue to explore more advanced gesture interaction technologies, especially 3D gesture interaction. With position information in three-dimensional space, hand movements can be identified more accurately, enabling more complex interactions such as driving a 3D virtual human or AR special effects such as holding virtual objects in the hand. Recognizing 3D hand gestures can bring a richer, more immersive, and more intelligent online interactive experience.

Zhuangzi's "The Way of Heaven" says: "Neither too slow nor too fast: it is grasped by the hand and answered by the heart. The mouth cannot put it into words, yet there is a knack in it."
In other words: "Neither slow nor fast, what the heart intends the hand carries out. It cannot be put into words, but there is skill and subtlety in it."

Gesture interaction technology may well be just such a subtle thing, one that can make you handy in any field.

"Video Cloud Technology" is a public account on audio and video technology well worth following. Every week it publishes hands-on technical articles from the front lines of Alibaba Cloud, where you can exchange ideas with first-class engineers in the audio and video field. Reply [Technology] in the official account's backend to join the Alibaba Cloud Video Cloud product technology exchange group, discuss audio and video technology with industry experts, and get the latest industry news.

CloudImagine