python - Tesseract OCR Command in ocrmypdf Fails with 'SubprocessOutputError' on Windows - Stack Overflow

ExitCodeException                                                                                         _common.py:271
Traceback (most recent call last):
  File "C:\<USER>\apps\python\current\Lib\site-packages\ocrmypdf\_exec\tesseract.py", line 313, in generate_hocr
    p = run(args_tesseract, stdout=PIPE, stderr=STDOUT, timeout=timeout, check=True)
  File "C:\<USER>\apps\python\current\Lib\site-packages\ocrmypdf\subprocess\__init__.py", line 62, in run
    proc = subprocess_run(args, env=env, check=check, **kwargs)
  File "C:\<USER>\apps\python\current\Lib\subprocess.py", line 579, in run
    raise CalledProcessError(retcode, process.args,
                             output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['C:\\<USER>\\shims\\tesseract.EXE', '-l', 'eng',
'C:\\<USER>\\AppData\\Local\\Temp\\ocrmypdf.io.<RANDOM>\\000045_ocr.png',
'C:\\<USER>\\AppData\\Local\\Temp\\ocrmypdf.io.<RANDOM>\\000045_ocr_hocr', 'hocr', 'txt']' returned
non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\<USER>\apps\python\current\Lib\site-packages\ocrmypdf\_pipelines\_common.py", line 261, in cli_exception_handler
    return fn(options, plugin_manager)
  File "C:\<USER>\apps\python\current\Lib\site-packages\ocrmypdf\_pipelines\ocr.py", line 181, in _run_pipeline
    optimize_messages = exec_concurrent(context, executor)
  File "C:\<USER>\apps\python\current\Lib\site-packages\ocrmypdf\_pipelines\ocr.py", line 117, in exec_concurrent
    executor(
    ~~~~~~~~^
        use_threads=options.use_threads,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<10 lines>...
        task_finished=update_page,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^ 
  File "C:\<USER>\apps\python\current\Lib\site-packages\ocrmypdf\_concurrent.py", line 78, in __call__
    self._execute(
    ~~~~~~~~~~~~~^
        use_threads=use_threads,
        ^^^^^^^^^^^^^^^^^^^^^^^^
    ...<5 lines>...
        task_finished=task_finished,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
  File "C:\<USER>\apps\python\current\Lib\site-packages\ocrmypdf\builtin_plugins\concurrency.py", line 144, in _execute
    result = future.result()
  File "C:\<USER>\apps\python\current\Lib\concurrent\futures\_base.py", line 449, in result
    return self.__get_result()
           ~~~~~~~~~~~~~~~~~^^
  File "C:\<USER>\apps\python\current\Lib\concurrent\futures\_base.py", line 401, in __get_result
    raise self._exception
  File "C:\<USER>\apps\python\current\Lib\concurrent\futures\thread.py", line 59, in run
    result = self.fn(*self.args, **self.kwargs)
  File "C:\<USER>\apps\python\current\Lib\site-packages\ocrmypdf\_pipelines\ocr.py", line 81, in _exec_page_sync
    ocr_out, text_out = _image_to_ocr_text(page_context, ocr_image_out)
                        ~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\<USER>\apps\python\current\Lib\site-packages\ocrmypdf\_pipelines\ocr.py", line 62, in _image_to_ocr_text
    hocr_out, text_out = ocr_engine_hocr(ocr_image_out, page_context)
                         ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\<USER>\apps\python\current\Lib\site-packages\ocrmypdf\_pipeline.py", line 678, in ocr_engine_hocr
    ocr_engine.generate_hocr(
    ~~~~~~~~~~~~~~~~~~~~~~~~^
        input_file=input_file,
        ^^^^^^^^^^^^^^^^^^^^^^
    ...<9 lines>...
        user_patterns=options.user_patterns,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
  File "C:\<USER>\apps\python\current\Lib\site-packages\ocrmypdf\builtin_plugins\tesseract_ocr.py", line 268, in generate_hocr
    tesseract.generate_hocr(
    ~~~~~~~~~~~~~~~~~~~~~~~^
        input_file=input_file,
        ^^^^^^^^^^^^^^^^^^^^^^
    ...<9 lines>...
        options=options,
        ^^^^^^^^^^^^^^^^
    )
  File "C:\<USER>\apps\python\current\Lib\site-packages\ocrmypdf\_exec\tesseract.py", line 327, in generate_hocr
    raise SubprocessOutputError() from e
ocrmypdf.exceptions.SubprocessOutputError

I get the above error when running "ocrmypdf --skip-text '.\input.pdf' output.pdf -v". I installed OCRmyPDF with Scoop on Windows 11. The PDF was originally a DjVu file, which I converted to a PostScript file and then to a PDF.

I followed this tutorial to install OCRmyPDF on Windows: .html

This has been a massive headache, and I haven't found a solution.
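
The traceback only reports that Tesseract returned exit status 1; the message Tesseract printed is not shown. One way to narrow this down might be to run the same Tesseract command by hand on a single page image and print everything it writes. The sketch below is only an illustration under stated assumptions: the helper names `build_tesseract_args` and `run_and_capture`, the script name `diagnose.py`, and the image path passed on the command line are all hypothetical, and the argument list simply mirrors the one in the traceback.

```python
import shutil
import subprocess
import sys
from pathlib import Path


def build_tesseract_args(image, output_base, lang="eng", exe="tesseract"):
    """Build the same argument list that appears in the traceback above."""
    return [exe, "-l", lang, str(image), str(output_base), "hocr", "txt"]


def run_and_capture(args, timeout=180):
    """Run a command without check=True, so a non-zero exit does not raise.

    Returns (exit_code, text), where text is stdout and stderr combined.
    """
    proc = subprocess.run(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout, as in the traceback
        timeout=timeout,
        check=False,
    )
    return proc.returncode, proc.stdout.decode(errors="replace")


if __name__ == "__main__":
    exe = shutil.which("tesseract")  # resolves shims found on PATH
    if exe is None:
        print("tesseract was not found on PATH")
    elif len(sys.argv) < 2:
        # diagnose.py is a hypothetical name for this script
        print("usage: python diagnose.py PAGE_IMAGE.png")
    else:
        image = Path(sys.argv[1])
        # Assumption: write output next to the image, using its name without suffix
        args = build_tesseract_args(image, image.with_suffix(""), exe=exe)
        code, output = run_and_capture(args)
        print("exit status:", code)
        print(output)
```

If the printed output names a specific problem (for example, a message about language data or about the image itself), that would be more useful to include in the question than the Python traceback alone.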
