This repository contains the source code for Nougat OCR, a tool that performs Optical Character Recognition (OCR) with the Nougat model. Follow the instructions below to set up the environment and run it.
1. Clone this repository:
git clone https://github.com/cudanexus/nougat.git
2. Download the model files from Hugging Face using Git LFS:
- Make sure Git LFS is installed (see the Git LFS installation instructions).
- Run the following commands:
git lfs install
git clone https://huggingface.co/spaces/tomriddle/nougat
After cloning, the Space directory contains:
input
nougat
output
README.md
app.py
requirements.txt
3. Copy the nougat folder (which contains all model files) from the cloned Space to the root of this repository. Your updated structure should look like:
input
nougat
--- config.json
--- pytorch_model.bin
--- special_tokens_map.json
--- tokenizer.json
--- tokenizer_config.json
output
app.py
cog.yaml
output.txt
predict.py
requirements.txt
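Before installing dependencies, it can help to confirm that the copy in step 3 worked. The sketch below is not part of the repository: `check_model_dir` is a hypothetical helper that looks for the five model files listed above and flags any that are still Git LFS pointer stubs. That usually means `git lfs install` had not been run before cloning. It assumes the script is run from the repository root.

```python
from pathlib import Path

# File names taken from the directory listing above.
EXPECTED_MODEL_FILES = [
    "config.json",
    "pytorch_model.bin",
    "special_tokens_map.json",
    "tokenizer.json",
    "tokenizer_config.json",
]

# Git LFS pointer files are small text files that begin with this line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def check_model_dir(repo_root="."):
    """Return (missing, lfs_pointers) for the nougat/ folder under repo_root.

    Hypothetical helper for illustration; not part of this repository.
    """
    model_dir = Path(repo_root) / "nougat"
    missing, lfs_pointers = [], []
    for name in EXPECTED_MODEL_FILES:
        path = model_dir / name
        if not path.is_file():
            missing.append(name)
            continue
        # Read only the first bytes; real model files can be very large.
        with path.open("rb") as fh:
            if fh.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX:
                lfs_pointers.append(name)
    return missing, lfs_pointers
```

Calling `check_model_dir(".")` returns two lists. Both should be empty before you continue. A non-empty second list suggests re-running `git lfs install` followed by `git lfs pull` inside the cloned Space.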
4. Install the required dependencies:
pip install -r requirements.txt
5. Verify the installation by running:
python app.py --pdf_file input/nougat.pdf
If the installation is successful, you should see the OCR output.
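The command above processes a single PDF. To run it over every PDF in the input folder, a small wrapper along the following lines may be enough. This is a sketch, not part of the repository: `build_command` and `run_on_directory` are hypothetical names, and it assumes that `app.py` exits with a non-zero status when OCR fails, which the source does not confirm.

```python
import subprocess
import sys
from pathlib import Path


def build_command(pdf_path):
    """Build the verification command shown above for one PDF file."""
    # sys.executable keeps the call inside the current Python environment.
    return [sys.executable, "app.py", "--pdf_file", str(pdf_path)]


def run_on_directory(input_dir="input", runner=subprocess.run):
    """Run app.py once per PDF in input_dir; return the names that failed.

    Assumption: a non-zero exit status from app.py signals a failure.
    """
    failed = []
    for pdf in sorted(Path(input_dir).glob("*.pdf")):
        result = runner(build_command(pdf))
        if result.returncode != 0:
            failed.append(pdf.name)
    return failed
```

Run from the repository root, `run_on_directory("input")` invokes `app.py` once per file. The `runner` parameter exists only so the loop can be exercised without launching real processes.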
For any issues or questions, please refer to the repository or contact the repository owner.