Linux
- Run `apt-get install ocrmypdf`
- Install ghostscript > 9.55 by following these instructions or running `scripts/install/ghostscript_install.sh`.
- Run `pip install ocrmypdf`
- Install any tesseract language packages that you want (for example, `apt-get install tesseract-ocr-eng`)
- Set the tesseract data folder path
  - Find the tesseract data folder `tessdata` with `find / -name tessdata`. If you have multiple, use the one corresponding to the latest tesseract version.
  - Create a `local.env` file in the root `marker` folder with `TESSDATA_PREFIX=/path/to/tessdata` inside it
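The two sub-steps above can also be scripted. The following is a minimal sketch, not part of marker itself: the helper names `find_tessdata_dirs` and `write_local_env` are hypothetical, and the search roots in the usage section are assumed common locations rather than guaranteed ones.

```python
import os
from pathlib import Path


def find_tessdata_dirs(roots):
    """Return every directory named 'tessdata' found under the given roots."""
    found = []
    for root in roots:
        for dirpath, dirnames, _ in os.walk(root):
            if "tessdata" in dirnames:
                found.append(Path(dirpath) / "tessdata")
    return found


def write_local_env(marker_root, tessdata_dir):
    """Write TESSDATA_PREFIX into local.env in the root marker folder.

    Note: this overwrites any existing local.env file.
    """
    env_file = Path(marker_root) / "local.env"
    env_file.write_text(f"TESSDATA_PREFIX={tessdata_dir}\n")
    return env_file


if __name__ == "__main__":
    # Assumed search roots; widen them if nothing is found.
    for candidate in find_tessdata_dirs(["/usr/share", "/usr/local/share"]):
        print(candidate)
```

If several folders are printed, pick the one belonging to the latest tesseract version and pass it to `write_local_env` together with the path to your `marker` checkout.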
Mac
Only needed if using ocrmypdf as the OCR backend.
- Run `brew install ocrmypdf`
- Run `brew install tesseract-lang` to add language support
- Run `pip install ocrmypdf`
- Set the tesseract data folder path
  - Find the tesseract data folder `tessdata` with `brew list tesseract`
  - Create a `local.env` file in the root `marker` folder with `TESSDATA_PREFIX=/path/to/tessdata` inside it
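The output of `brew list tesseract` is a list of installed paths, so the `tessdata` folder has to be picked out by eye. As a rough sketch, the hypothetical helper below extracts it from that listing. It assumes each output line starts with an absolute path without spaces, and the sample path in the comment is illustrative only.

```python
import re
import subprocess


def tessdata_from_brew_listing(listing):
    """Return the first path ending in '/tessdata' found in the listing, or None."""
    for line in listing.splitlines():
        # Match a path prefix up to and including the 'tessdata' component.
        match = re.search(r"^(\S*/tessdata)(?:/|\s|$)", line.strip())
        if match:
            return match.group(1)
    return None


if __name__ == "__main__":
    # Requires Homebrew and the tesseract formula to be installed.
    result = subprocess.run(
        ["brew", "list", "tesseract"],
        capture_output=True, text=True, check=True,
    )
    # Prints something like /opt/homebrew/Cellar/tesseract/<version>/share/tessdata
    print(tessdata_from_brew_listing(result.stdout))
```

The printed path is the value to use for `TESSDATA_PREFIX` in `local.env`.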
Windows
- Install `ocrmypdf` and ghostscript by following these instructions
- Run `pip install ocrmypdf`
- Install any tesseract language packages you want
- Set the tesseract data folder path
  - Find the tesseract data folder `tessdata`, which is typically located inside the Tesseract installation directory
  - Create a `local.env` file in the root `marker` folder with `TESSDATA_PREFIX=/path/to/tessdata` inside it
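Before relying on the path in `local.env`, it can help to confirm that the folder really contains language data. Tesseract stores each language as a `<lang>.traineddata` file, so a quick check is to list those files. This is a minimal sketch with a hypothetical helper name, `installed_languages`, and it works the same way on Linux, Mac, and Windows.

```python
import os
from pathlib import Path


def installed_languages(tessdata_dir):
    """Return the sorted language codes found as *.traineddata files."""
    folder = Path(tessdata_dir)
    if not folder.is_dir():
        return []
    return sorted(p.stem for p in folder.glob("*.traineddata"))


if __name__ == "__main__":
    # Assumes TESSDATA_PREFIX is already exported in the current environment.
    prefix = os.environ.get("TESSDATA_PREFIX", "")
    print(installed_languages(prefix))
```

An empty list suggests the path is wrong or that no language packages have been installed yet.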