Tesseract installation and configuration
Install dependencies
brew install automake autoconf libtool
brew install pkgconfig
brew install icu4c
brew install leptonica
# Packages required for training tools.
brew install pango
# Optional packages for extra features.
brew install libarchive
# Optional package for builds using g++.
brew install gcc
Download and extract Tesseract
Build and install
cd tesseract-4.1.1
./autogen.sh
mkdir build
cd build
# Optionally add CXX=g++-8 to the configure command if you really want to use a different compiler.
../configure PKG_CONFIG_PATH=/usr/local/opt/icu4c/lib/pkgconfig:/usr/local/opt/libarchive/lib/pkgconfig:/usr/local/opt/libffi/lib/pkgconfig
make -j
# Optionally install Tesseract.
sudo make install
# Optionally build and install training tools.
make training
sudo make training-install
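After installation, one quick way to confirm that the binary is reachable is to ask it for its version. The following is a minimal sketch, assuming the executable is named `tesseract` and is on PATH; the helper name `tesseract_version` is made up for illustration.

```python
import shutil
import subprocess


def tesseract_version(binary="tesseract"):
    """Return the first line of `<binary> --version`, or None if it is not on PATH."""
    path = shutil.which(binary)
    if path is None:
        return None
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    # Some builds print the version to stderr instead of stdout, so read both.
    output = (result.stdout + result.stderr).strip()
    return output.splitlines()[0] if output else None


if __name__ == "__main__":
    print(tesseract_version() or "tesseract not found on PATH")
```

If this prints "tesseract not found on PATH", the install step was probably skipped or the install prefix is not on PATH.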
Download eng.traineddata
eng.traineddata
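Downloading eng.traineddata alone is enough here. If you need other languages, download them as needed. There is no need to fetch everything: the full set is about 3 GB, which is fairly large.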
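Downloading only selected languages can be scripted. The sketch below assumes the files are served from the `main` branch of the tesseract-ocr/tessdata repository on GitHub; that URL and the helper names `traineddata_url` and `download_languages` are assumptions for illustration, not something this note specifies.

```python
import urllib.request
from pathlib import Path

# Assumption: the official tessdata repository on GitHub, `main` branch.
TESSDATA_BASE_URL = "https://github.com/tesseract-ocr/tessdata/raw/main"


def traineddata_url(lang):
    """Build the download URL for one language file, e.g. 'eng'."""
    return f"{TESSDATA_BASE_URL}/{lang}.traineddata"


def download_languages(langs, dest_dir):
    """Download the listed languages into dest_dir, skipping files already present."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for lang in langs:
        target = dest / f"{lang}.traineddata"
        if not target.exists():
            urllib.request.urlretrieve(traineddata_url(lang), str(target))
        saved.append(target)
    return saved
```

Calling `download_languages(["eng"], "tessdata")` would fetch only the English model; add further language codes to the list as needed.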
Test
$ tesseract 0384.jpg stdout
0 3 8 4
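If the command fails, check the path shown in the error message, copy the eng.traineddata file to that missing path, and run the test again.
Using pytesseract
Install the package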
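Before copying the file, it can help to check where a language file is already present. This is a minimal sketch, assuming Tesseract honors the `TESSDATA_PREFIX` environment variable and that a default source build looks in `/usr/local/share/tessdata`; the path in your own error message is the authoritative one. `find_traineddata` is a hypothetical helper name.

```python
import os
from pathlib import Path


def find_traineddata(lang="eng", extra_dirs=()):
    """Return the first existing <lang>.traineddata among the candidate directories."""
    candidates = []
    prefix = os.environ.get("TESSDATA_PREFIX")
    if prefix:
        candidates.append(Path(prefix))
    candidates.extend(Path(d) for d in extra_dirs)
    for directory in candidates:
        candidate = directory / f"{lang}.traineddata"
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    # Assumed default location for a source build with the default prefix.
    print(find_traineddata("eng", ["/usr/local/share/tessdata"]))
```

A result of `None` means the file is in none of the checked directories, which matches the situation where the test command reports a missing path.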
pip install pytesseract
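Import and use
import pytesseract as pt
from PIL import Image

# Open the same test image used in the command-line test above.
image = Image.open('0384.jpg')
# Run OCR on the image and return the recognized text as a string.
text = pt.image_to_string(image)
print(text)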
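The command-line test above returned the digits separated by spaces ("0 3 8 4"). For digit-only images such as this one, a sketch like the following may give cleaner results. It relies on the `lang` and `config` parameters of `pytesseract.image_to_string` and on the Tesseract options `--psm` and `tessedit_char_whitelist`; the whitelist may not be honored by every Tesseract version, and the helper names `read_digits` and `digits_only` are made up for illustration.

```python
def digits_only(text):
    """Keep only the digit characters of the OCR output."""
    return "".join(ch for ch in text if ch.isdigit())


def read_digits(image_path, lang="eng"):
    """OCR a single-line image and return only the digits it contains."""
    # Imported lazily so digits_only() can be used without the OCR stack installed.
    import pytesseract
    from PIL import Image

    # --psm 7 treats the image as a single text line;
    # the whitelist asks Tesseract to restrict its output to digits.
    config = "--psm 7 -c tessedit_char_whitelist=0123456789"
    with Image.open(image_path) as image:
        raw = pytesseract.image_to_string(image, lang=lang, config=config)
    return digits_only(raw)
```

Post-processing with `digits_only` is kept separate from the OCR call, so it still removes stray spaces and newlines if the whitelist has no effect.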