
I am using Python 3.x and the following code to convert an image to text:

from PIL import Image
from pytesseract import image_to_string

image = Image.open('image.png', mode='r')
print(image_to_string(image))

I get the following error:

Traceback (most recent call last):
  File "C:/Users/hp/Desktop/GII/Image_to_text.py", line 12, in <module>
    print(image_to_string(image))
  File "C:\Users\hp\Downloads\WinPython-64bit-3.5.1.2\python-3.5.1.amd64\lib\site-packages\pytesseract\pytesseract.py", line 161, in image_to_string
    config=config)
  File "C:\Users\hp\Downloads\WinPython-64bit-3.5.1.2\python-3.5.1.amd64\lib\site-packages\pytesseract\pytesseract.py", line 94, in run_tesseract
    stderr=subprocess.PIPE)
  File "C:\Users\hp\Downloads\WinPython-64bit-3.5.1.2\python-3.5.1.amd64\lib\subprocess.py", line 950, in __init__
    restore_signals, start_new_session)
  File "C:\Users\hp\Downloads\WinPython-64bit-3.5.1.2\python-3.5.1.amd64\lib\subprocess.py", line 1220, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified

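Note that I have placed the image in the same directory where my Python is located. Also, no error is raised on the line image = Image.open('image.png', mode='r'); the error is raised on the line print(image_to_string(image)).

Any idea what might be wrong here? Thanks.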

Originally posted by muazfaiz; translated under the CC BY-SA 4.0 license.

python python-3.x pytesser


2 Answers


Community wiki

Posted 2022-11-16

✓ Accepted

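You must have tesseract installed and accessible on your PATH.

According to the source, pytesseract is merely a wrapper around subprocess.Popen that runs the tesseract binary. It does not perform any OCR itself. If the binary cannot be found, Popen raises the FileNotFoundError shown in the traceback.

The relevant part of the source: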

def run_tesseract(input_filename, output_filename_base, lang=None, boxes=False, config=None):
    '''
    runs the command:
        `tesseract_cmd` `input_filename` `output_filename_base`

    returns the exit status of tesseract, as well as tesseract's stderr output
    '''
    command = [tesseract_cmd, input_filename, output_filename_base]

    if lang is not None:
        command += ['-l', lang]

    if boxes:
        command += ['batch.nochop', 'makebox']

    if config:
        command += shlex.split(config)

    proc = subprocess.Popen(command,
            stderr=subprocess.PIPE)
    return (proc.wait(), proc.stderr.read())
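
To make the failure mode concrete, here is a minimal sketch (not part of pytesseract) showing that subprocess.Popen raises FileNotFoundError when the executable cannot be found, independent of whether the image file exists. The helper name try_launch and the executable name no-such-tesseract-binary are hypothetical and chosen only for illustration.

```python
import subprocess
import sys


def try_launch(command):
    """Try to start `command`; return the FileNotFoundError if the executable is missing, else None."""
    try:
        proc = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    except FileNotFoundError as exc:
        return exc
    proc.communicate()  # drain the pipes and wait for the process to exit
    return None


# Hypothetical executable name, chosen so that it will not be found on PATH.
missing = try_launch(["no-such-tesseract-binary", "image.png", "out"])
# The running interpreter certainly exists, so this launch succeeds.
present = try_launch([sys.executable, "-c", "pass"])

print(type(missing).__name__, present)
```

The error is raised before the command ever looks at its arguments, which is why placing image.png next to the script does not help.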

Another part of the source:

# CHANGE THIS IF TESSERACT IS NOT IN YOUR PATH, OR IS NAMED DIFFERENTLY
tesseract_cmd = 'tesseract'

A quick way to change the tesseract path is:

import pytesseract
from PIL import Image

pytesseract.pytesseract.tesseract_cmd = "/absolute/path/to/tesseract"  # set once; note the submodule
img = Image.open('image.png')
pytesseract.image_to_string(img)
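
Before hard-coding a path, it may help to check whether the binary is already reachable. The following is a rough sketch using the standard library's shutil.which; the helper name find_tesseract and the Windows fallback path are assumptions for illustration, and the actual install location depends on how Tesseract was installed.

```python
import shutil


def find_tesseract(names=("tesseract",), fallback=None):
    """Return the first executable from `names` found on PATH, else `fallback`."""
    for name in names:
        path = shutil.which(name)
        if path:
            return path
    return fallback


# Assumed install location, shown only as an example; adjust to your system.
assumed_windows_path = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

tesseract_path = find_tesseract(fallback=assumed_windows_path)
print(tesseract_path)
```

The resulting value could then be assigned to pytesseract.pytesseract.tesseract_cmd once at startup. If it still points at a file that does not exist, the same FileNotFoundError will occur.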

Originally posted by Łukasz Rogalski; translated under the CC BY-SA 3.0 license.

Community wiki

Posted 2022-11-16

Install the following packages to extract text from PNG/JPEG images:

pip install pytesseract

pip install Pillow

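pytesseract is used for OCR (optical character recognition) in Python. OCR is the process of electronically extracting text from images.

PIL can be used for anything from simply reading and writing image files to scientific image processing, geographic information systems, and remote sensing.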

from PIL import Image
from pytesseract import image_to_string
print(image_to_string(Image.open('/home/ABCD/Downloads/imageABC.png'),lang='eng'))

Originally posted by thrinadhn; translated under the CC BY-SA 4.0 license.
