
Hello everyone, welcome to Crossin's programming classroom~

It's been a few days. What game has Crossin been off playing? Well, this time I'm not making a game but a console, and a motion-sensing one at that.

When it comes to motion-sensing games, the first thing that comes to mind these days is probably Ring Fit Adventure on the Switch.

But a few years back there was another very popular motion-sensing device: the Kinect for the Xbox. Unlike the Switch, which relies on a controller with built-in sensors to track the player, the Kinect uses a set of cameras and recognizes the player's movements from images.

The demo I built this time is a camera-based motion recognition system. In theory, it only needs an ordinary computer and an ordinary camera to run. However, I recently got my hands on something that greatly improved both my development efficiency and the runtime performance of this project.

This is it: the Jetson AGX Orin, NVIDIA's AI edge computing device. Edge computing, simply put, means processing data as close as possible to the application that generates it, as in robotics or autonomous driving. These scenarios demand real-time computation, so you can't afford to ship the data off to a data center, let the servers in a machine room crunch it, and wait for the result to come back to the device. An edge computing device therefore needs two things: strong computing power, and a small footprint, meaning not just a compact size but also low power consumption.

The AGX Orin is the newest member of the NVIDIA Jetson family. How new? There is no retail stock at the moment; it can only be pre-ordered, so my unit is something of a global limited edition. Compared with the previous-generation Jetson AGX Xavier, it delivers up to 8x the performance, reaching 275 TOPS (trillion operations per second). That puts it on par with a GPU-equipped server, yet it is small enough to fit in the palm of your hand.

Beyond the powerful hardware, there is also NVIDIA's AI software stack. For most common AI applications, such as face recognition, object detection, action recognition, natural language processing, and speech synthesis, it provides pre-trained models, which is extremely convenient.

After booting up and installing the JetPack SDK, there are plenty of test programs to try out, and the official documentation provides many examples to help developers get started.

Here are the results of the vision and conversational AI benchmarks I ran:

As you can see, it is a very significant improvement over the previous generation:

The official Hello AI World project also provides a number of demos.

Object detection, for example, takes only about ten milliseconds per frame, fast enough for real-time video surveillance or even a running game.
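For reference, the snippet below is a minimal sketch in the spirit of the Hello AI World detectnet example; the model name, camera URI, and display URI are assumptions you would adjust for your own setup.

```python
# Minimal sketch of real-time object detection with jetson-inference
# (assumed setup: CSI camera at csi://0, output to the attached display).
import jetson.inference
import jetson.utils

net = jetson.inference.detectNet("ssd-mobilenet-v2", threshold=0.5)  # pre-trained detector
camera = jetson.utils.videoSource("csi://0")        # or "/dev/video0" for a USB camera
display = jetson.utils.videoOutput("display://0")   # render results on screen

while display.IsStreaming():
    img = camera.Capture()            # grab one frame
    detections = net.Detect(img)      # run inference, a few ms per frame on Orin
    display.Render(img)               # show the frame with bounding boxes overlaid
    display.SetStatus("Object Detection | {:.0f} FPS".format(net.GetNetworkFPS()))
```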

There is also this demo of human pose estimation:

Well, that's half of my work done right there.

Once we have human pose data, we can collect samples of various actions and use them to train a classifier. The classifier then takes the pose captured by the camera in real time and determines which action it is. Finally, based on the recognized action, we send keyboard commands to the system. That gives us a simple interaction system driven by body movements.
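As one ingredient of that pipeline, here is a minimal sketch of how a frame of pose keypoints might be turned into a feature vector for the classifier; the keypoint format (a list of (x, y) pairs, one per joint) and the helper name are assumptions for illustration.

```python
# Turn one frame of 2D pose keypoints into a normalized feature vector.
import numpy as np

def keypoints_to_feature(keypoints):
    """Normalize joint coordinates so the feature is roughly independent
    of where the player stands and how large they appear in the frame."""
    pts = np.asarray(keypoints, dtype=np.float32)    # shape: (num_joints, 2)
    pts -= pts.mean(axis=0)                          # translation invariance
    scale = np.linalg.norm(pts, axis=1).max() or 1.0
    pts /= scale                                     # scale invariance
    return pts.flatten()                             # 1-D feature vector
```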

In the NVIDIA-AI-IOT GitHub organization, I found a similar project that uses hand gestures to navigate the web.

https://github.com/NVIDIA-AI-IOT/trt_pose_hand

It trains a gesture classifier with an SVM (support vector machine) using Python's scikit-learn module. The same approach works for the second half of our project, except that we use a full-body pose model instead.

To train the classifier, we first need some sample data.
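With some labeled samples collected, training the SVM with scikit-learn could look roughly like this; the file names and label encoding are assumptions for illustration.

```python
# Train an SVM gesture/action classifier with scikit-learn and save it for reuse.
import numpy as np
import joblib
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

X = np.load("samples.npy")   # one normalized pose feature vector per row
y = np.load("labels.npy")    # action label per row, e.g. 0=punch, 1=kick, 2=jump

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
clf = SVC(kernel="rbf", probability=True)   # RBF-kernel support vector machine
clf.fit(X_train, y_train)
print("validation accuracy:", clf.score(X_test, y_test))

joblib.dump(clf, "pose_svm.joblib")         # load this at runtime instead of retraining
```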

After that, keyboard commands are sent via the pynput module.
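Here is a minimal sketch of that step with pynput; the action names and key bindings are hypothetical and would be matched to whatever the game expects.

```python
# Map a recognized action to a simulated key press with pynput.
import time
from pynput.keyboard import Controller, Key

keyboard = Controller()

ACTION_KEYS = {           # hypothetical bindings for illustration
    "left":  Key.left,
    "right": Key.right,
    "punch": "j",
    "kick":  "k",
}

def send_action(action):
    key = ACTION_KEYS.get(action)
    if key is None:
        return
    keyboard.press(key)     # key goes down...
    time.sleep(0.05)
    keyboard.release(key)   # ...and comes back up
```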

Putting all of the above together gives us the functionality we want:

A system that lets you play games with your body movements.

Video demonstration: "It took 2 days: I made a motion-sensing game console" (Bilibili)

For the Orin, this project is admittedly a bit like swatting a fly with a cannon: pose estimation and action recognition both use pre-trained models, so the real-time compute load is not heavy. But its software environment and developer community resources greatly improved my development efficiency this time.

The only downside is that GitHub, apt, and pip are all so slow from my home network that setting up the environment took a long time. It would be nice if there were a set of domestic (Chinese) mirrors for these resources.

Finally, a little easter egg: did you notice the game I used in the demo, KOF97? Back in 2009, the year before the Kinect was officially released, my master's thesis project was, in fact, a human-computer interaction system based on a single camera.

Its action recognition part also used an SVM, and the demo game I used at my thesis defense was KOF97.

In the future work section at the end of the thesis, I wrote:

Unexpectedly, 13 years later, I have closed that loop myself. It reminds me of something Steve Jobs once said:

Trust that the dots we experience in life will somehow connect some day in the future. (Steve Jobs)

The code in this article is modified based on the official NVIDIA example:

https://github.com/NVIDIA-AI-IOT/trt_pose_hand

Operating environment:

NVIDIA Jetson AGX Orin

JetPack 5.0

Python 3.8.10

The code is open source: http://python666.cn/c/2

For more tutorials and examples,
search for and follow: Crossin's programming classroom.
Five minutes a day, learn programming with ease.
