Getting Started with Apple's Vision Framework

  • Introduction: Apple introduced the Vision framework at WWDC 2017 as part of iOS 11. It marked a turning point for on-device machine vision and image analysis by providing native tools for these tasks. The initial release offered text detection, face and facial-landmark detection, rectangle detection, and barcode and QR code recognition. Apple has enhanced the framework continuously since then: by 2024, with iOS 18, it offers improved text recognition accuracy across many languages, face and feature detection, movement analysis, pose recognition, object tracking in video, better integration with Core ML, and deep integration with related frameworks.
  • VNRequest: Vision's requests are built on the abstract class VNRequest, which defines the common structure of an image-analysis request; concrete subclasses implement the specific analyses. The VNRequest initializer takes a completion handler, and VNRequestCompletionHandler is a typealias for a closure that receives the finished request (with its results populated) and an optional error. Text recognition uses the VNRecognizeTextRequest subclass: create the request, read its results as VNRecognizedTextObservation values in the completion handler, and run the request against the image (see the first sketch after this list).
  • VNDetectFaceRectanglesRequest: This request finds faces in an image and returns their bounding boxes in normalized coordinates. The pattern mirrors text recognition: create a VNDetectFaceRectanglesRequest, read its results as VNFaceObservation values, and run the request against the image (second sketch below). A typical use is KYC onboarding, e.g. confirming that a real person's face appears in a submitted photo.
  • VNDetectBarcodesRequest: This request detects and decodes barcodes and QR codes in an image. The pattern is the same: create a VNDetectBarcodesRequest, read its results as VNBarcodeObservation values, and run the request against the image (third sketch below). A typical use is a QR code scanner.
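
The sketch below shows the text-recognition flow. It is a minimal example, not the article's exact code: the function name recognizeText(in:) and the use of UIImage/UIKit are assumptions, as is the "en-US" language setting; the Vision types themselves (VNRecognizeTextRequest, VNRecognizedTextObservation, VNImageRequestHandler) are the framework's real API.

```swift
import UIKit
import Vision

// Minimal text-recognition sketch. recognizeText(in:) is an illustrative name.
func recognizeText(in image: UIImage) {
    guard let cgImage = image.cgImage else { return }

    // The completion handler receives the finished request plus an optional error.
    let request = VNRecognizeTextRequest { request, error in
        guard error == nil,
              let observations = request.results as? [VNRecognizedTextObservation] else { return }
        // Each observation holds ranked candidate strings; take the best one.
        for observation in observations {
            if let best = observation.topCandidates(1).first {
                print("Recognized: \(best.string) (confidence: \(best.confidence))")
            }
        }
    }
    request.recognitionLevel = .accurate      // trade speed for accuracy
    request.recognitionLanguages = ["en-US"]  // assumption: target language

    // Run the request against the image.
    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    do {
        try handler.perform([request])
    } catch {
        print("Text recognition failed: \(error)")
    }
}
```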
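
Face detection follows the same request/handler pattern; only the request and observation types change. Again, detectFaces(in:) and the UIImage input are illustrative assumptions rather than code from the article.

```swift
import UIKit
import Vision

// Minimal face-detection sketch. detectFaces(in:) is an illustrative name.
func detectFaces(in image: UIImage) {
    guard let cgImage = image.cgImage else { return }

    let request = VNDetectFaceRectanglesRequest { request, error in
        guard error == nil,
              let observations = request.results as? [VNFaceObservation] else { return }
        // boundingBox is in normalized coordinates (0...1, origin at bottom-left).
        for face in observations {
            print("Face at \(face.boundingBox)")
        }
    }

    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    do {
        try handler.perform([request])
    } catch {
        print("Face detection failed: \(error)")
    }
}
```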
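
Barcode and QR detection is once more the same pattern with a different request type. detectBarcodes(in:) is an illustrative name, and restricting the symbologies to QR is an optional assumption; by default Vision looks for all supported symbologies.

```swift
import UIKit
import Vision

// Minimal barcode/QR sketch. detectBarcodes(in:) is an illustrative name.
func detectBarcodes(in image: UIImage) {
    guard let cgImage = image.cgImage else { return }

    let request = VNDetectBarcodesRequest { request, error in
        guard error == nil,
              let observations = request.results as? [VNBarcodeObservation] else { return }
        for barcode in observations {
            // payloadStringValue holds the decoded content, e.g. a QR code's URL.
            print("\(barcode.symbology): \(barcode.payloadStringValue ?? "<binary payload>")")
        }
    }
    request.symbologies = [.qr]  // assumption: only scan QR codes

    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    do {
        try handler.perform([request])
    } catch {
        print("Barcode detection failed: \(error)")
    }
}
```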