Lately, the hottest topic online has been the "Liu Genghong boys and girls".
Liu Genghong's fat-burning fitness livestreams have swept the Internet and set off a home fitness frenzy, and his shuttlecock-kicking routine set to Jay Chou's "Compendium of Materia Medica" has even triggered a nationwide check-in craze.
So how have programmers joined in on this craze?
An Alibaba Cloud programmer becomes a "Liu Genghong boy"
https://www.youku.com/video/XNTg3MTAyMTQ4NA==
A programmer's take on "Compendium of Materia Medica" lets everyone warm up and see the "human pose estimation algorithm" in action.
In the short video, the Alibaba Cloud Video Cloud engineer appears as a "stickman", which is a visualization of the "human pose estimation algorithm". Human pose recognition is an important task in computer vision, and an indispensable part of enabling computers to understand human actions and behavior.
Long before the Liu Genghong phenomenon, the Video Cloud technology team had already begun working on the human pose estimation algorithm. This time, the algorithm engineers turned into "Liu Genghong boys" to explore real-world applications of the algorithm.
Along with Liu Genghong's fitness boom, countless boys and girls have been injured. Although Liu Genghong reviews videos and corrects movements for some fans on social media, that does little to prevent fans from getting hurt while jumping along.
Judging movements by eye is error-prone, and correcting every fan's movements by hand is impossible. A smarter, more efficient recognition technology can help here: the "human pose estimation algorithm".
Human pose estimation algorithm?
Human posture is one of the important biological features of the human body, and pose estimation is an important technical foundation for digitizing the human body and making it machine-understandable. It has a wide range of application scenarios, including gait analysis, video surveillance, augmented reality, human-computer interaction, and sports science.
What we call human pose estimation (Pose Estimation) includes key technologies such as object detection, human skeleton key point detection, and segmentation. The skeleton key points are used to estimate the pose of the human body.
The engineer in the short video looks like a "stickman" because the algorithm is accurately identifying 18 key points of the human skeleton (head, shoulder joints, elbow joints, etc.) and connecting them.
18 key points of human skeleton
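The article does not list the 18 key points by name. As an illustration only, the sketch below assumes the 18-point layout commonly used by OpenPose-style models (an assumption, not confirmed as the layout Alibaba Cloud Video Cloud uses) and shows how a "stickman" can be derived from detected points. The names `KEYPOINTS`, `LIMBS`, and `stickman_segments` are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]

# Assumed layout: the 18-point convention common in OpenPose-style models.
KEYPOINTS: List[str] = [
    "nose", "neck",
    "r_shoulder", "r_elbow", "r_wrist",
    "l_shoulder", "l_elbow", "l_wrist",
    "r_hip", "r_knee", "r_ankle",
    "l_hip", "l_knee", "l_ankle",
    "r_eye", "l_eye", "r_ear", "l_ear",
]

# Pairs of key points joined by a line when drawing the stick figure.
LIMBS: List[Tuple[str, str]] = [
    ("neck", "nose"),
    ("neck", "r_shoulder"), ("r_shoulder", "r_elbow"), ("r_elbow", "r_wrist"),
    ("neck", "l_shoulder"), ("l_shoulder", "l_elbow"), ("l_elbow", "l_wrist"),
    ("neck", "r_hip"), ("r_hip", "r_knee"), ("r_knee", "r_ankle"),
    ("neck", "l_hip"), ("l_hip", "l_knee"), ("l_knee", "l_ankle"),
    ("nose", "r_eye"), ("r_eye", "r_ear"),
    ("nose", "l_eye"), ("l_eye", "l_ear"),
]


def stickman_segments(pose: Dict[str, Point]) -> List[Tuple[Point, Point]]:
    """Return the line segments to draw for one person.

    `pose` maps key point names to (x, y) image coordinates. Limbs with a
    missing endpoint (for example, an occluded wrist) are skipped.
    """
    return [(pose[a], pose[b]) for a, b in LIMBS if a in pose and b in pose]
```

Each returned segment can be passed to any 2D drawing routine; skipping limbs with a missing endpoint keeps the figure usable when some joints are not detected.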
The human pose estimation algorithm of Alibaba Cloud Video Cloud can quickly and accurately recognize actions in both static and dynamic scenes and, more importantly, can track and recognize multiple people in real time on mobile devices as they move.
Real-time recognition of 18 human body key points on a mobile device
Real-time tracking and recognition of multiple people on a mobile device is not easy to achieve.
Mobile devices are limited by their hardware computing power, especially low-end devices with poor performance, so a lightweight model structure and careful engineering are needed to run in real time. If each person is predicted individually, the time cost grows in proportion to the number of people, which makes real-time operation difficult.
To balance real-time performance and accuracy, Alibaba Cloud Video Cloud improved and optimized the Bottom-Up approach as a whole. A Bottom-Up approach first detects all the joints in the image and then determines which person each joint belongs to, so its two main steps are key point detection and key point matching. The model predicts two feature map branches: one predicts key points such as shoulder joints and elbow joints, and the other predicts the vector field between pairs of key points. The vector field is used to determine which person in the image each key point belongs to, and complete people are assembled using the "Hungarian algorithm".
In this way, real-time pose recognition of multiple people can run on mobile devices, lowering the barrier to entry and opening up a wider range of commercial applications.
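The matching step of the Bottom-Up approach can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Alibaba Cloud Video Cloud's implementation: `paf_score`, `match_limbs`, and the callable `field` (which returns the predicted vector at an image position) are hypothetical names, and a brute-force search over assignments stands in for the Hungarian algorithm, since it yields the same optimal one-to-one matching on tiny inputs.

```python
import math
from itertools import combinations, permutations


def paf_score(a, b, field, samples=10):
    """Average alignment between the vector field and the segment a -> b.

    `field(x, y)` returns the predicted (vx, vy) vector at that position.
    `samples` must be at least 2. A high score means the field "flows"
    from a to b, so the two points likely belong to the same limb.
    """
    dx, dy = b[0] - a[0], b[1] - a[1]
    length = math.hypot(dx, dy)
    if length == 0:
        return 0.0
    ux, uy = dx / length, dy / length
    total = 0.0
    for k in range(samples):
        t = k / (samples - 1)
        vx, vy = field(a[0] + t * dx, a[1] + t * dy)
        total += vx * ux + vy * uy
    return total / samples


def match_limbs(starts, ends, field):
    """Pair start joints with end joints so the total score is maximal.

    Brute force stands in for the Hungarian algorithm here; it is only
    practical for a handful of candidates.
    """
    scores = [[paf_score(a, b, field) for b in ends] for a in starts]
    k = min(len(starts), len(ends))
    best_total, best_pairs = float("-inf"), []
    for rows in combinations(range(len(starts)), k):
        for cols in permutations(range(len(ends)), k):
            total = sum(scores[r][c] for r, c in zip(rows, cols))
            if total > best_total:
                best_total, best_pairs = total, list(zip(rows, cols))
    return best_pairs


# Toy scene: two horizontal upper arms, one at y = 0 and one at y = 10.
def arm_field(x, y):
    near_arm = min(abs(y), abs(y - 10.0)) < 1.0
    return (1.0, 0.0) if near_arm else (0.0, 0.0)


shoulders = [(0.0, 0.0), (0.0, 10.0)]
elbows = [(5.0, 10.0), (5.0, 0.0)]  # detected in a different order
print(match_limbs(shoulders, elbows, arm_field))  # [(0, 1), (1, 0)]
```

For realistic numbers of candidates, a polynomial-time solver such as `scipy.optimize.linear_sum_assignment` (called with `maximize=True`) would replace the brute-force loop.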
The real value of algorithms
Technology is explored, ultimately, to serve people's lives.
One application of human pose estimation techniques is action recognition.
For example, in sports and fitness scenarios, the human pose estimation algorithm of Alibaba Cloud Video Cloud can not only recognize various actions and identify and warn of risky ones, but also give feedback such as how accurately an action was performed. By judging human movement more accurately and in real time, it can support more digital sports services, such as counting repetitions of movements like rope skipping, squats, and push-ups.
Action Recognition for Human Pose Estimation
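As a rough illustration of exercise counting on top of pose key points, the sketch below counts squats from the knee angle formed by the hip, knee, and ankle. It is one simple way such counting could work, not the method the article's team uses; `joint_angle`, `count_squats`, and the threshold values of 100 and 160 degrees are assumptions chosen for the example.

```python
import math


def joint_angle(a, b, c):
    """Angle at point b, in degrees, formed by the points a-b-c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1, n2 = math.hypot(*v1), math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        return 180.0  # degenerate input: treat the joint as straight
    cos = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))


def count_squats(knee_angles, down_below=100.0, up_above=160.0):
    """Count repetitions from a per-frame sequence of knee angles.

    A repetition is counted when the knee bends below `down_below` and
    then straightens above `up_above`. Using two thresholds (hysteresis)
    keeps jitter around a single threshold from being counted twice.
    """
    count = 0
    is_down = False
    for angle in knee_angles:
        if not is_down and angle < down_below:
            is_down = True
        elif is_down and angle > up_above:
            is_down = False
            count += 1
    return count


# Per-frame knee angles, e.g. joint_angle(hip, knee, ankle) for each frame.
angles = [170, 120, 90, 95, 130, 165, 170, 80, 170]
print(count_squats(angles))  # 2
```

The same pattern extends to push-ups (elbow angle) or rope skipping (vertical ankle position), and comparing angles against a reference range is one way to flag risky or inaccurate movements.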
Moving from the concept to concrete scenarios, the technology has important value in elderly care, medicine, competitive sports, sports training, and other settings.
In the elderly care industry, algorithms can accurately identify the potentially dangerous actions of the elderly and issue early warnings in real time;
In the medical field, technology can be used to observe bone recovery and monitor patient posture;
In the field of competitive sports, it can monitor athletes' posture, power an auxiliary training system, analyze athletes' movements at every moment, and help athletes find a better posture;
In daily sports, technology can be more intuitively applied to automatic teaching of various types of fitness, sports, dance, etc.
The technology also has valuable applications in more specific scenarios such as video surveillance, financial services, mobile payment, entertainment and social interaction, and game interaction.
Driving virtual humans more intelligently
Another interesting application of human pose estimation technology is to drive virtual humans by tracking changes in human poses.
Generally speaking, a human-driven virtual digital human is modeled on a real person, with the virtual avatar created through 3D modeling, motion capture, rendering, and other technologies.
At present, the main ways to capture body movement for virtual digital humans are optical motion capture, inertial motion capture, and computer vision-based motion capture.
Optical motion capture works by identifying and labeling each reflective marker on the target, deriving the target's basic skeleton, and then continuously tracking the markers. Inertial motion capture attaches inertial sensors to the main skeletal joints of the human body to collect data, which is then processed to build a three-dimensional motion model.
However, these two methods have many problems: demanding environmental requirements, demanding hardware and software requirements, low precision, and errors that accumulate easily during continuous use.
https://www.youku.com/video/XNTg3MTAyMzA0OA==
Real-time virtual human driving demonstration
As you can see, however, the engineer in the video can accurately drive the virtual human even in an ordinary outdoor environment and without any wearable sensors. This is computer vision-based motion capture, which restores the motion information of each joint from captured two-dimensional images and three-dimensional shape features.
It is worth mentioning that the Alibaba Cloud Video Cloud technical team has implemented facial expression simulation for virtual humans using Video Cloud's ultra-light face tracking and AAI inference framework, and can drive and render in real time on a PC CPU. Full simulation of palms and gestures is being added to make virtual humans more interactive and responsive.
In addition, many applications of Alibaba Cloud Video Cloud's human pose estimation algorithm have been integrated into the Alibaba Cloud Queen SDK. Building on human pose recognition, Queen itself offers a number of body special effects, such as precise body contouring and slimming. You are welcome to try the demo.
Human pose estimation has rich applications in everyday life and is attracting the attention of a growing number of industrial and academic researchers. As the related technologies continue to improve, its advantages will become more apparent and its fields of application broader.
At the same time, the digitization and intelligence of the human body is a bigger topic, and a breakthrough extension of technology for the virtual world, the health industry, and manufacturing. Alibaba Cloud Video Cloud will continue to explore advanced visual intelligence technologies to bring the digitization and intelligence of the human body into real use across all walks of life.
"Video Cloud Technology" is a public account on audio and video technology worth following. It publishes practical technical articles from the front line of Alibaba Cloud every week, and it is a place to talk with first-class engineers in the audio and video field. Reply [Technology] in the official account to join the Alibaba Cloud Video Cloud product technology exchange group, discuss audio and video technology with industry leaders, and get the latest industry news.