I get a strange error message when running an OCRmyPDF command.

My setup:

  • macOS Sequoia 15.2
  • OCRmyPDF 16.8.0 (installed by Brew)
  • tesseract 5.5.0 (installed by Brew)
  • Command: ocrmypdf -l deu+fra+eng --clean --force-ocr test.pdf test-out.pdf 2>> debugOCR.txt

I should mention that the command is triggered by NoodleSoft Hazel, and as far as I understand, Hazel executes shell commands in a dedicated environment. My setup worked fine for a few weeks, but while processing a batch of PDF files, the following error started to occur. Since then I have not been able to get it working again.

The debug file debugOCR.txt shows the following error:

1 [tesseract] Error in fopenReadStream: failed to open locally with tail 000001_ocr.png for filename /tmp/ocrmypdf.io.81a_o2mw/000001_ocr.png
1 [tesseract] Leptonica Error in findFileFormat: image file not found: /tmp/ocrmypdf.io.81a_o2mw/000001_ocr.png
1 [tesseract] Error in fopenReadStream: failed to open locally with tail PNG for filename PNG
1 [tesseract] Leptonica Error in pixRead: image file not found: PNG
1 [tesseract] Image file PNG cannot be read!
1 [tesseract] Error during processing.
SubprocessOutputError

In /tmp I can't find any subfolder like /tmp/ocrmypdf.io.81a_o2mw/.

I should also mention that the following commands work fine when executed directly in Apple Terminal:

ocrmypdf -l deu+fra+eng --clean --force-ocr test.pdf test-out.pdf 2>> debugOCR.txt
tesseract test.tiff output --oem 1 -l eng pdf 

Any hints on where I should dig deeper? Is ocrmypdf or tesseract missing some environment variables in the Hazel environment? Any other ideas?

Thanks a lot

AJ

Asked Jan 29 at 22:16 by user9803071
  • The two commands are just to show that ocrmypdf and tesseract work when run separately, directly in Apple Terminal. I get the error message when I call ocrmypdf from Hazel. In Hazel I don't call tesseract directly; as far as I understand, ocrmypdf calls tesseract as a subprocess. – user9803071 Commented Jan 30 at 7:18

1 Answer

https://github.com/tesseract-ocr/tesseract/issues/4333

This is likely the issue.

I faced the same problem while using wcgw mcp, which also has a separate terminal environment.

Setting TMPDIR to //tmp helped me.
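
In a script run by a tool like Hazel, the workaround could be applied roughly as in the sketch below. The helper name run_with_tmpdir is made up for illustration and is not part of OCRmyPDF, tesseract, or Hazel. The value //tmp is simply the one that worked for me; whether the override is needed at all depends on what the calling environment provides, which is an assumption here.

```shell
#!/bin/sh
# Hypothetical helper (not part of OCRmyPDF or Hazel): run one command with
# TMPDIR overridden, leaving the rest of the script's environment untouched.
run_with_tmpdir() {
    env TMPDIR=//tmp "$@"
}

# Log what the calling environment provides, so the output can be compared
# with the same line printed from a normal Terminal session.
echo "TMPDIR before override: ${TMPDIR:-<unset>}" >&2
```

The command from the question would then be called as run_with_tmpdir ocrmypdf -l deu+fra+eng --clean --force-ocr test.pdf test-out.pdf 2>> debugOCR.txt. Using env keeps the override scoped to that single call instead of exporting it for the whole script.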
