A hands-on guide to face recognition in Go for static images and video streams

When it comes to face recognition, most people reach for Python first, because its machine learning frameworks and libraries are mature and well packaged. This article takes a different route and uses Go (Golang) to implement face recognition for both static images and video streams.

Static image face recognition

Let's start with face recognition on static images. The Go community is smaller than Python's, but it still offers some excellent libraries. Here we use go-face, which implements face recognition on top of dlib, a very popular machine learning toolkit and one of the most widely used packages for face recognition. dlib is used across industry and academia, in robotics, embedded devices, mobile devices, and more. According to the documentation on its official website, its model reaches 99.4% accuracy on the Labeled Faces in the Wild benchmark, which helps explain why it is so widely used.

Before writing any code, we need to install dlib. Installation on Windows is relatively troublesome; the official website has specific instructions. Here I cover two platforms.

Ubuntu 18.10+, Debian sid

Recent versions of Ubuntu and Debian both provide suitable dlib packages, so you only need to install them.

# Ubuntu
sudo apt-get install libdlib-dev libblas-dev liblapack-dev libjpeg-turbo8-dev
# Debian
sudo apt-get install libdlib-dev libblas-dev liblapack-dev libjpeg62-turbo-dev

macOS

Make sure that Homebrew is installed.

brew install dlib

Project creation and preparation

Create a project directory under the src directory of your GOPATH and add a main.go file to it. The commands are as follows.

mkdir go-face-test
# create main.go
touch go-face-test/main.go

Then enter the directory and generate the module file.

cd go-face-test
go mod init

A go.mod file should now exist in the go-face-test directory.

The library requires three model files: shape_predictor_5_face_landmarks.dat, mmod_human_face_detector.dat and dlib_face_recognition_resnet_model_v1.dat. Download the corresponding test data, which includes them, into the go-face-test directory.

git clone https://github.com/Kagami/go-face-testdata testdata

The final project structure should be as shown in the figure.
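Before running any recognition code, it can help to confirm that the model files are where the program expects them. The following is a minimal sketch using only the standard library. The helper name missingModels is hypothetical, and the file names are an assumption based on the go-face documentation; adjust them if your copy of the test data differs.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// requiredModels lists the model files go-face is expected to load.
// These names are an assumption taken from the go-face documentation.
var requiredModels = []string{
	"shape_predictor_5_face_landmarks.dat",
	"mmod_human_face_detector.dat",
	"dlib_face_recognition_resnet_model_v1.dat",
}

// missingModels returns the required model files that cannot be found in dir.
func missingModels(dir string) []string {
	var missing []string
	for _, name := range requiredModels {
		if _, err := os.Stat(filepath.Join(dir, name)); err != nil {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	dir := filepath.Join("testdata", "models")
	if missing := missingModels(dir); len(missing) > 0 {
		fmt.Println("missing model files in", dir+":", missing)
		return
	}
	fmt.Println("all model files found")
}
```

Running this from the go-face-test directory reports which files, if any, still need to be downloaded.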

Code

First, let's check that the environment works by initializing the recognizer and then releasing its resources.

package main

import (
    "fmt"

    "github.com/Kagami/go-face"
)

const dataDir = "testdata"

// the two subdirectories of testdata
const (
    modelDir  = dataDir + "/models"
    imagesDir = dataDir + "/images"
)

func main() {
    fmt.Println("Face Recognition...")

    // initialize the recognizer
    rec, err := face.NewRecognizer(modelDir)
    if err != nil {
        // stop here: rec is unusable when initialization fails
        fmt.Println("Cannot initialize recognizer:", err)
        return
    }
    defer rec.Close()

    fmt.Println("Recognizer Initialized")
}

Run the code (go run compiles it first).

go run main.go

You should get the following output.

Face Recognition...
Recognizer Initialized

At this point, we have successfully set up everything we need.

Detect the number of faces in the picture

First, prepare a photo of JJ Lin (Lin Junjie) and put it in any directory. For convenience, I put it in the same directory as main.go.

For now there is just this one picture. Next, we let the computer count the number of faces in it.

package main

import (
    "fmt"
    "log"

    "github.com/Kagami/go-face"
)

const dataDir = "testdata"

// the two subdirectories of testdata
const (
    modelDir  = dataDir + "/models"
    imagesDir = dataDir + "/images"
)

func main() {
    fmt.Println("Face Recognition...")

    // initialize the recognizer
    rec, err := face.NewRecognizer(modelDir)
    if err != nil {
        log.Fatalf("Cannot initialize recognizer: %v", err)
    }
    defer rec.Close()

    fmt.Println("Recognizer Initialized")

    // call RecognizeFile with the image path; it returns the detected faces and any error
    faces, err := rec.RecognizeFile("linjunjie.jpeg")
    if err != nil {
        log.Fatalf("Cannot recognize: %v", err)
    }
    // print the number of faces
    fmt.Println("Number of faces in the image:", len(faces))
}

The core is a single line: go-face wraps detection in RecognizeFile, which takes the path of an image file. Running the code gives the following result.

Face Recognition...
Recognizer Initialized
Number of faces in the image: 1

Now the computer can count faces. What if a photo contains several people? Let's try it with a group picture.

heyin.jpeg

Replace line 31 of the code above (the RecognizeFile call) with the following.

faces, err := rec.RecognizeFile("heyin.jpeg")

The program should now report 6 faces. With that in place, we can move on to actual face recognition.

Face recognition

First, we need a group photo; we reuse heyin.jpeg from above.

The whole process is roughly divided into the following steps.

1. Assign each face in the group photo a unique ID, so that each ID can then be associated with the corresponding person's name.

var samples []face.Descriptor
var peoples []int32
for i, f := range faces {
    samples = append(samples, f.Descriptor)
    // unique ID for each face
    peoples = append(peoples, int32(i))
}

// Pass samples to the recognizer.
rec.SetSamples(samples, peoples)

2. Next, wrap recognition in a function that takes the recognizer and a photo path and prints the matching person's ID and name.

func RecognizePeople(rec *face.Recognizer, file string) {
    people, err := rec.RecognizeSingleFile(file)
    if err != nil {
        log.Fatalf("Cannot recognize: %v", err)
    }
    if people == nil {
        log.Fatalf("The image does not contain exactly one face")
    }
    peopleID := rec.Classify(people.Descriptor)
    if peopleID < 0 {
        log.Fatalf("Cannot classify the face")
    }
    fmt.Println(peopleID)
    fmt.Println(labels[peopleID])
}

3. Finally, pass in the pictures to recognize. Three pictures are used here; feel free to try others.

jay.jpeg

linjunjie.jpeg

taozhe.jpeg

4. Call the function three times.

RecognizePeople(rec, "jay.jpeg")
RecognizePeople(rec, "linjunjie.jpeg")
RecognizePeople(rec, "taozhe.jpeg")

The complete code is as follows.

package main

import (
    "fmt"
    "log"

    "github.com/Kagami/go-face"
)

const dataDir = "testdata"

// the two subdirectories of testdata
const (
    modelDir  = dataDir + "/models"
    imagesDir = dataDir + "/images"
)

// names of the people in the group photo, indexed by face ID
var labels = []string{
    "萧敬腾",
    "周杰伦",
    "unknown",
    "王力宏",
    "陶喆",
    "林俊杰",
}

func main() {
    fmt.Println("Face Recognition...")

    // initialize the recognizer
    rec, err := face.NewRecognizer(modelDir)
    if err != nil {
        log.Fatalf("Cannot initialize recognizer: %v", err)
    }
    defer rec.Close()

    fmt.Println("Recognizer Initialized")

    // call RecognizeFile with the image path; it returns the detected faces and any error
    faces, err := rec.RecognizeFile("heyin.jpeg")
    if err != nil {
        log.Fatalf("Cannot recognize: %v", err)
    }
    // print the number of faces
    fmt.Println("Number of faces in the image:", len(faces))

    var samples []face.Descriptor
    var peoples []int32
    for i, f := range faces {
        samples = append(samples, f.Descriptor)
        // unique ID for each face
        peoples = append(peoples, int32(i))
    }

    // pass the samples to the recognizer
    rec.SetSamples(samples, peoples)

    RecognizePeople(rec, "jay.jpeg")
    RecognizePeople(rec, "linjunjie.jpeg")
    RecognizePeople(rec, "taozhe.jpeg")
}

func RecognizePeople(rec *face.Recognizer, file string) {
    people, err := rec.RecognizeSingleFile(file)
    if err != nil {
        log.Fatalf("Cannot recognize: %v", err)
    }
    if people == nil {
        log.Fatalf("The image does not contain exactly one face")
    }
    peopleID := rec.Classify(people.Descriptor)
    if peopleID < 0 {
        log.Fatalf("Cannot classify the face")
    }
    fmt.Println(peopleID)
    fmt.Println(labels[peopleID])
}

Operation result

Finally, we build and run the code.

go build main.go
./main

The result is as follows.

Face Recognition...
Recognizer Initialized
Number of faces in the image: 6
1
周杰伦
5
林俊杰
4
陶喆

Congratulations, you have successfully identified who is in each of the three pictures. This completes face recognition on static images.
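To make the Classify step less of a black box, here is a simplified sketch of the underlying idea: each face is a vector (a descriptor), and an unknown face is assigned the ID of the closest known sample. This is an illustration under stated assumptions, not go-face's actual implementation. The descriptor type stands in for face.Descriptor and is shortened to three dimensions, and the function names and the 0.6 distance threshold are illustrative choices.

```go
package main

import (
	"fmt"
	"math"
)

// descriptor stands in for face.Descriptor (a 128-dimensional float32
// vector); three dimensions keep the example readable.
type descriptor [3]float32

// squaredDistance returns the squared Euclidean distance between a and b.
func squaredDistance(a, b descriptor) float64 {
	var sum float64
	for i := range a {
		d := float64(a[i] - b[i])
		sum += d * d
	}
	return sum
}

// classify returns the ID of the sample closest to unknown, or -1 when
// there are no samples or the closest one is farther away than maxDist.
func classify(samples []descriptor, ids []int32, unknown descriptor, maxDist float64) int {
	best, bestDist := -1, math.Inf(1)
	for i, s := range samples {
		if d := squaredDistance(s, unknown); d < bestDist {
			best, bestDist = int(ids[i]), d
		}
	}
	if bestDist > maxDist*maxDist {
		return -1
	}
	return best
}

func main() {
	samples := []descriptor{{0, 0, 0}, {1, 1, 1}, {5, 5, 5}}
	ids := []int32{0, 1, 2}
	// A vector near the second sample is assigned that sample's ID.
	fmt.Println(classify(samples, ids, descriptor{0.9, 1.1, 1.0}, 0.6))
	// A vector far from every sample is rejected.
	fmt.Println(classify(samples, ids, descriptor{3, 3, 3}, 0.6))
}
```

The threshold is what lets a classifier answer "nobody I know" instead of always returning the nearest person, which matters once the input is no longer limited to the people in the group photo.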

Static face recognition summary

At this point, we have implemented static face recognition in Go. It can be used in real projects, but it has many limitations. The applicable scenarios are narrow: it only fits cases such as recognizing a face in a user-uploaded photo or recognizing a single face. The supported image formats are also limited; for example, PNG is not supported for now.
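One way to work around the missing PNG support is to convert the image to JPEG in memory before handing it to the recognizer. The sketch below uses only the standard library. The names pngToJPEG and samplePNG are hypothetical helpers, and the quality setting of 90 is an arbitrary choice.

```go
package main

import (
	"bytes"
	"fmt"
	"image"
	"image/color"
	"image/draw"
	"image/jpeg"
	"image/png"
)

// pngToJPEG decodes PNG data and re-encodes it as JPEG bytes.
func pngToJPEG(pngData []byte) ([]byte, error) {
	src, err := png.Decode(bytes.NewReader(pngData))
	if err != nil {
		return nil, fmt.Errorf("decode png: %w", err)
	}
	// JPEG has no alpha channel, so flatten onto a white background first.
	dst := image.NewRGBA(src.Bounds())
	draw.Draw(dst, dst.Bounds(), image.NewUniform(color.White), image.Point{}, draw.Src)
	draw.Draw(dst, dst.Bounds(), src, src.Bounds().Min, draw.Over)

	var out bytes.Buffer
	if err := jpeg.Encode(&out, dst, &jpeg.Options{Quality: 90}); err != nil {
		return nil, fmt.Errorf("encode jpeg: %w", err)
	}
	return out.Bytes(), nil
}

// samplePNG builds a tiny in-memory PNG so the example runs without files.
func samplePNG() []byte {
	img := image.NewRGBA(image.Rect(0, 0, 8, 8))
	draw.Draw(img, img.Bounds(), image.NewUniform(color.RGBA{200, 30, 30, 255}), image.Point{}, draw.Src)
	var buf bytes.Buffer
	if err := png.Encode(&buf, img); err != nil {
		panic(err)
	}
	return buf.Bytes()
}

func main() {
	jpg, err := pngToJPEG(samplePNG())
	if err != nil {
		fmt.Println("conversion failed:", err)
		return
	}
	fmt.Printf("converted to %d bytes of JPEG data\n", len(jpg))
}
```

With go-face available, the resulting bytes could then be passed to a byte-based method such as rec.RecognizeSingle instead of a file-based one.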

Video streaming face recognition

Background

The application scenarios of static face recognition are relatively limited. It cannot be relied on in more demanding environments such as finance, insurance, and security, where forgery is possible, and on its own it is of limited value. Dynamic video streams have much broader applications and are already widely used in intelligent security, gesture recognition, beautification, and other fields. In the 5G era, many businesses will center on video. How can the video business be decoupled from the core business? Agora's (声网) RTE components do this well. As a pioneer of RTE-PaaS, Agora has accumulated considerable technical experience, and delivering it in the form of RTE components brings many benefits.

RTE advantages

1. Application independence

RTE components can be shared among different projects, enabling reuse and avoiding repeated development work.

2. Platform independence

They can be used across operating systems, programming languages, and application domains.

3. Rich third-party modules

Many ready-made modules are available to developers, such as whiteboard teaching, video beautification, and pornography detection.

Code

Here we implement face recognition on a video stream; the static recognition above paves the way for it. A video is simply a continuous sequence of frames, so with static image recognition already in place, we only need to capture key frames from the stream, recognize the face in each one, and output the name associated with that person.

Preparation

Here we use gocv, which is built on OpenCV. We skip the installation process here; please install it according to the official documentation.
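Recognition is far more expensive than displaying a frame, so running it on every frame is rarely necessary. As a rough stand-in for key-frame extraction, the sketch below processes one frame out of every N. The frameSampler type and the interval of 5 are hypothetical; the code later in this article does not use them and simply processes every frame.

```go
package main

import "fmt"

// frameSampler decides which frames of a stream are worth running
// recognition on; the remaining frames would only be displayed.
type frameSampler struct {
	every int // run recognition on one frame out of every `every`
	count int // number of frames seen so far
}

// next reports whether the current frame should be processed.
func (s *frameSampler) next() bool {
	if s.every <= 1 {
		return true // process every frame
	}
	process := s.count%s.every == 0
	s.count++
	return process
}

func main() {
	s := &frameSampler{every: 5}
	for frame := 0; frame < 12; frame++ {
		if s.next() {
			fmt.Println("recognize frame", frame)
		}
	}
}
```

In a capture loop, the call to s.next() would guard the recognition step while the display step still runs for every frame.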

1. Set the video capture device; the default is generally 0.

// set to use a video capture device 0
deviceID := 0

// open webcam
webcam, err := gocv.OpenVideoCapture(deviceID)
if err != nil {
    fmt.Println(err)
    return
}
defer webcam.Close()

2. Open the display window.

// open display window
window := gocv.NewWindow("Face Detect")
defer window.Close()

3. Prepare the image matrix and the color of the rectangle drawn around each detected face.

// prepare image matrix
img := gocv.NewMat()
defer img.Close()

// color for the rect when faces detected
blue := color.RGBA{0, 0, 255, 0}

4. With the face detection classifier loaded (see the complete code below), read frames in an infinite loop and run our recognition function on each detected face.

for {
    if ok := webcam.Read(&img); !ok {
        fmt.Printf("cannot read device %v\n", deviceID)
        return
    }
    if img.Empty() {
        continue
    }

    // detect faces
    rects := classifier.DetectMultiScale(img)
    fmt.Printf("found %d faces\n", len(rects))

    // draw a rectangle around each face on the original image
    for _, r := range rects {
        gocv.Rectangle(&img, r, blue, 3)

        // crop the face region and encode it as JPEG in memory
        imgFace := img.Region(r)
        buff, err := gocv.IMEncode(".jpg", imgFace)
        imgFace.Close() // release the cropped region once it has been encoded
        if err != nil {
            fmt.Printf("encoding to jpg err: %v\n", err)
            break
        }

        RecognizePeopleFromMemory(rec, buff)
    }

    // show the image in the window, and wait 1 millisecond
    window.IMShow(img)
    window.WaitKey(1)
}

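A few points need attention here. gocv.IMEncode currently supports converting captured images into three formats: PNG, JPG, and GIF. We keep the encoded byte stream in memory and pass it to our face recognition function, which relies on go-face's RecognizeSingle, shown below. The code in this article assumes that IMEncode returns a []byte; if your gocv version returns a buffer object instead, extract its bytes before passing them on.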

// RecognizeSingle returns face if it's the only face on the image or
// nil otherwise. Only JPEG format is currently supported. Thread-safe.
func (rec *Recognizer) RecognizeSingle(imgData []byte) (face *Face, err error) {
    faces, err := rec.recognize(0, imgData, 1)
    if err != nil || len(faces) != 1 {
        return
    }
    face = &faces[0]
    return
}

Notes

Since go-face only supports the JPEG format, the frames we capture must be converted to JPEG.

Then we wrap recognition of a byte stream in a simple function. Note that log.Fatal has been replaced with log.Println and log.Printf: in a video stream, a frame may contain no recognizable face, and in that case the program should keep running rather than exit.

func RecognizePeopleFromMemory(rec *face.Recognizer, img []byte) {
    people, err := rec.RecognizeSingle(img)
    if err != nil {
        log.Printf("Cannot recognize: %v", err)
        return
    }
    if people == nil {
        log.Println("The image does not contain exactly one face")
        return
    }
    peopleID := rec.Classify(people.Descriptor)
    if peopleID < 0 {
        log.Println("Cannot classify the face")
        return
    }
    fmt.Println(peopleID)
    fmt.Println(labels[peopleID])
}
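Since the program must not stop while the stream is running, it is also worth guarding the lookup labels[peopleID], which panics if an ID has no entry in the list (for example when more faces are detected in the group photo than there are names). A minimal sketch of such a guard follows. The helper labelFor is hypothetical, and the label list mirrors the one used in this article.

```go
package main

import "fmt"

// labels mirrors the name list used in this article.
var labels = []string{"萧敬腾", "周杰伦", "unknown", "王力宏", "陶喆", "林俊杰"}

// labelFor returns the name for a class ID, and false when the ID is
// negative (no match) or has no entry in labels.
func labelFor(id int) (string, bool) {
	if id < 0 || id >= len(labels) {
		return "", false
	}
	return labels[id], true
}

func main() {
	for _, id := range []int{1, -1, 9} {
		if name, ok := labelFor(id); ok {
			fmt.Println(id, name)
		} else {
			fmt.Println(id, "no label, skipping this frame")
		}
	}
}
```

Returning a boolean leaves the decision to the caller, which in a capture loop would typically log the miss and move on to the next frame.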

The complete code is as follows.

package main

import (
    "fmt"
    "image/color"
    "log"

    "github.com/Kagami/go-face"
    "gocv.io/x/gocv"
)

const dataDir = "testdata"

// the two subdirectories of testdata
const (
    modelDir  = dataDir + "/models"
    imagesDir = dataDir + "/images"
)

// names of the people in the group photo, indexed by face ID
var labels = []string{
    "萧敬腾",
    "周杰伦",
    "unknown",
    "王力宏",
    "陶喆",
    "林俊杰",
}

func main() {
    // initialize the recognizer
    rec, err := face.NewRecognizer(modelDir)
    if err != nil {
        log.Fatalf("Cannot initialize recognizer: %v", err)
    }
    defer rec.Close()

    fmt.Println("Recognizer Initialized")

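    // call RecognizeFile with the image path; it returns the detected faces and any error
    faces, err := rec.RecognizeFile("heyin.jpeg")
    if err != nil {
        log.Fatalf("Cannot recognize: %v", err)
    }
    // print the number of faces
    fmt.Println("Number of faces in the image:", len(faces))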

    var samples []face.Descriptor
    var peoples []int32
    for i, f := range faces {
        samples = append(samples, f.Descriptor)
        // unique ID for each face
        peoples = append(peoples, int32(i))
    }

    // Pass samples to the recognizer.
    rec.SetSamples(samples, peoples)

    RecognizePeople(rec, "jay.jpeg")
    RecognizePeople(rec, "linjunjie.jpeg")
    RecognizePeople(rec, "taozhe.jpeg")

    // set to use a video capture device 0
    deviceID := 0

    // open webcam
    webcam, err := gocv.OpenVideoCapture(deviceID)
    if err != nil {
        fmt.Println(err)
        return
    }
    defer webcam.Close()

    // open display window
    window := gocv.NewWindow("Face Detect")
    defer window.Close()

    // prepare image matrix
    img := gocv.NewMat()
    defer img.Close()

    // color for the rect when faces detected
    blue := color.RGBA{0, 0, 255, 0}

    // load classifier to recognize faces
    classifier := gocv.NewCascadeClassifier()
    defer classifier.Close()

    if !classifier.Load("./haarcascade_frontalface_default.xml") {
        fmt.Println("Error reading cascade file: ./haarcascade_frontalface_default.xml")
        return
    }

    fmt.Printf("start reading camera device: %v\n", deviceID)
    for {
        if ok := webcam.Read(&img); !ok {
            fmt.Printf("cannot read device %v\n", deviceID)
            return
        }
        if img.Empty() {
            continue
        }

        // detect faces
        rects := classifier.DetectMultiScale(img)
        if len(rects) == 0 {
            continue
        }

        fmt.Printf("found %d faces\n", len(rects))

        // draw a rectangle around each face on the original image
        for _, r := range rects {
            gocv.Rectangle(&img, r, blue, 3)

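            // crop the face region and encode it as JPEG in memory
            imgFace := img.Region(r)
            buff, err := gocv.IMEncode(".jpg", imgFace)
            imgFace.Close() // release the cropped region once it has been encoded
            if err != nil {
                fmt.Printf("encoding to jpg err: %v\n", err)
                break
            }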

            RecognizePeopleFromMemory(rec, buff)
        }

        // show the image in the window, and wait 1 millisecond
        window.IMShow(img)
        window.WaitKey(1)
    }
}

func RecognizePeople(rec *face.Recognizer, file string) {
    people, err := rec.RecognizeSingleFile(file)
    if err != nil {
        log.Fatalf("Cannot recognize: %v", err)
    }
    if people == nil {
        log.Fatalf("The image does not contain exactly one face")
    }
    peopleID := rec.Classify(people.Descriptor)
    if peopleID < 0 {
        log.Fatalf("Cannot classify the face")
    }
    fmt.Println(peopleID)
    fmt.Println(labels[peopleID])
}

func RecognizePeopleFromMemory(rec *face.Recognizer, img []byte) {
    people, err := rec.RecognizeSingle(img)
    if err != nil {
        log.Printf("Cannot recognize: %v", err)
        return
    }
    if people == nil {
        log.Println("The image does not contain exactly one face")
        return
    }
    peopleID := rec.Classify(people.Descriptor)
    if peopleID < 0 {
        log.Println("Cannot classify the face")
        return
    }
    fmt.Println(peopleID)
    fmt.Println(labels[peopleID])
}

Next, when we run the code, the camera should start. Here I hold up a photo of JJ Lin for recognition, and the corresponding name is printed in the lower left corner.

Video stream face recognition summary

Congratulations, you can now perform face recognition on a video stream. Note, however, that to keep the implementation quick our sample set is small, so the recognition success rate is relatively low. Still, a simple dynamic face recognition pipeline is up and running.
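One way to enlarge the sample set is to register several photos per person under the same ID, instead of a single face from one group photo. The sketch below shows how per-person descriptors could be flattened into the two parallel slices that a SetSamples-style call takes. The descriptor type stands in for face.Descriptor, the helper flatten is hypothetical, and whether extra samples actually improve accuracy depends on the photos chosen.

```go
package main

import "fmt"

// descriptor stands in for face.Descriptor, a 128-dimensional float32 vector.
type descriptor [128]float32

// flatten turns per-person descriptor lists into two parallel slices:
// the descriptors themselves and the person ID each one belongs to.
// The index into byPerson is used as the person ID.
func flatten(byPerson [][]descriptor) ([]descriptor, []int32) {
	var samples []descriptor
	var ids []int32
	for id, descs := range byPerson {
		for _, d := range descs {
			samples = append(samples, d)
			ids = append(ids, int32(id))
		}
	}
	return samples, ids
}

func main() {
	// Two descriptors for person 0 and one for person 1 (zero-valued placeholders).
	byPerson := [][]descriptor{
		{{}, {}},
		{{}},
	}
	samples, ids := flatten(byPerson)
	fmt.Println(len(samples), ids)
	// With go-face available, these could be passed to rec.SetSamples(samples, ids)
	// after converting the descriptors to face.Descriptor.
}
```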

Summary

Although we have implemented dynamic face recognition, it is hard to meet the requirements of more complex application scenarios: the image formats are restricted, and modules for other functions such as face processing, beautification, and pornography detection are missing. With third-party SDKs from platforms such as Agora (声网), however, scenarios such as campus face recognition, video conferencing, and cloud classrooms can be set up quickly, with integration completed in a few lines of code, and face recognition features can be developed around components such as RTE. This saves a great deal of development time and cost, so that the development focus can shift to the core business.
