LitOrganizer

Organize your academic literature efficiently

LitOrganizer is a tool for researchers, academics, and students that automatically organizes PDF literature collections. It extracts metadata from academic papers, renames files using an author-year citation pattern, sorts them into a directory structure by journal, author, year, or subject, and provides full-text search across the collection.

✨ Features

📚 Automatic Organization

Smart Metadata Extraction: Automatically extracts DOIs and retrieves complete metadata from multiple academic APIs
Citation-based Renaming: Renames PDF files with an APA 7-style Author_Year pattern for easy identification
Intelligent Categorization: Organizes PDFs into folders by journal, author, year, or subject
Reference List Generation: Creates a comprehensive bibliography of all processed papers
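
As an illustration of the DOI extraction step, the sketch below pulls a DOI-like string out of text that has already been extracted from a PDF. The regular expression and the `find_doi` helper are assumptions made for this example, not LitOrganizer's actual implementation; the pattern follows the common `10.<registrant>/<suffix>` shape of modern DOIs.

```python
import re
from typing import Optional

# Hypothetical pattern: "10.", a 4-9 digit registrant code, "/", then a suffix.
DOI_PATTERN = re.compile(r'\b10\.\d{4,9}/[^\s"<>]+', re.IGNORECASE)


def find_doi(text: str) -> Optional[str]:
    """Return the first DOI-like string in text, or None if none is found."""
    match = DOI_PATTERN.search(text)
    if match is None:
        return None
    # Trailing punctuation usually belongs to the sentence, not the DOI.
    return match.group(0).rstrip(".,;)")
```

A real pipeline would likely need extra handling for DOIs broken across lines or hidden in scanned pages, which is where OCR comes in.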

🔍 Advanced Search Capabilities

Full-text Search: Quickly find information across your entire PDF collection
Context Display: View search results with surrounding text for better understanding
Flexible Search Options: Use exact match, case sensitivity, or regular expressions
Export Results: Save search results to Word and Excel files with highlighted matches

📊 Comprehensive Statistics

Performance Metrics: Visual representation of processing speed and efficiency
Accuracy Analysis: Detailed breakdown of metadata quality and DOI detection rates
Publication Analytics: Distribution of papers by author, journal, year, and subject
Error Diagnostics: Identification of problematic files with detailed error analysis

💻 User-Friendly Interface

Modern Design: Clean, intuitive interface with Windows 11 design principles
Multi-tab Layout: Separate tabs for organization, search, and statistics
Progress Tracking: Real-time progress indicators and detailed logging
Customizable Options: Flexible settings to adapt to your workflow

🚀 Installation

Requirements

Python 3.8 or later
Required Python packages (see requirements.txt)
For OCR functionality: Tesseract OCR

Installation Steps

Clone or download this repository:

```
git clone https://github.com/bcankara/LitOrganizer.git
cd LitOrganizer
```

Install required dependencies:
```
pip install -r requirements.txt
```
(Optional) For OCR functionality, install Tesseract OCR:
- Windows: Download and install from Tesseract at UB Mannheim
- macOS: brew install tesseract
- Linux: sudo apt install tesseract-ocr
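
To confirm that the optional Tesseract install is visible to Python, a quick check along these lines may help. It is a minimal sketch that only looks for an executable named `tesseract` on `PATH`; the helper name `tesseract_available` is made up for this example, and how LitOrganizer itself locates Tesseract is not covered here.

```python
import shutil


def tesseract_available(executable: str = "tesseract") -> bool:
    """Return True if the given executable can be found on PATH."""
    return shutil.which(executable) is not None


if __name__ == "__main__":
    if tesseract_available():
        print("Tesseract found")
    else:
        print("Tesseract not found on PATH")
```

On Windows, the install directory may need to be added to `PATH` manually before this check succeeds.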

📖 Usage

GUI Mode

Run the application without arguments to start in GUI mode:

```
python litorganizer.py
```

Main Tab

Select a directory containing PDFs using the "Browse" button
Configure categorization options (by journal, author, year, subject)
Click "Start Processing" to begin organizing your files
Monitor progress in the log window

Search Keywords Tab

Select a directory containing PDFs
Enter a keyword to search for
Configure search options:
- Exact Match: Only match complete words
- Case Sensitive: Match exact letter case
- Use Regex: Use regular expressions for pattern matching
Click "Start Search" to begin
View results and save to Word/Excel if desired
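
The three search options above can be pictured as switches on a single regular expression. The sketch below is one possible way to combine them and to return each hit with surrounding context; `build_pattern` and `search_with_context` are hypothetical names for illustration and do not describe LitOrganizer's internals.

```python
import re
from typing import List


def build_pattern(keyword: str, exact: bool = False,
                  case_sensitive: bool = False,
                  use_regex: bool = False) -> re.Pattern:
    """Compile a search pattern from the three options."""
    # Without "Use Regex", the keyword is treated as literal text.
    body = keyword if use_regex else re.escape(keyword)
    if exact:
        # "Exact Match": only match complete words.
        body = rf"\b(?:{body})\b"
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.compile(body, flags)


def search_with_context(text: str, pattern: re.Pattern,
                        width: int = 30) -> List[str]:
    """Return each match with up to `width` characters on either side."""
    snippets = []
    for match in pattern.finditer(text):
        start = max(0, match.start() - width)
        end = min(len(text), match.end() + width)
        snippets.append(text[start:end])
    return snippets
```

For example, `build_pattern("learning", exact=True)` would match "learning" and "LEARNING" but not "learnings".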

Statistics Tabs

General Statistics: Overall performance metrics and accuracy analysis
Publication Statistics: Detailed breakdown by author, journal, year, and subject
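
A breakdown like the one in the Publication Statistics tab boils down to counting papers per field value. The following sketch assumes each processed paper is available as a dictionary with keys such as `year` or `journal`; that record layout and the `count_by_field` helper are assumptions for illustration only.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List, Tuple


def count_by_field(records: Iterable[Dict[str, Any]],
                   field: str) -> List[Tuple[str, int]]:
    """Count records per value of `field`, most frequent first."""
    # Records that lack the field are grouped under "Unknown".
    counts = Counter(str(record.get(field, "Unknown")) for record in records)
    return counts.most_common()
```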

Command Line Mode

Basic usage:

```
python litorganizer.py -d /path/to/pdfs
```

Additional options:

```
python litorganizer.py --help
```
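
If several folders need processing, the command line mode can be driven from a small script. This sketch uses only the `-d` option shown above and assumes it is run from the repository root, where `litorganizer.py` lives; the helper names are hypothetical.

```python
import subprocess
import sys
from pathlib import Path
from typing import Iterable, List


def build_command(pdf_dir: Path, script: str = "litorganizer.py") -> List[str]:
    """Build the argument list for one run with the -d option."""
    return [sys.executable, script, "-d", str(pdf_dir)]


def organize_all(pdf_dirs: Iterable[Path],
                 script: str = "litorganizer.py") -> None:
    """Run LitOrganizer once per directory, stopping on the first failure."""
    for pdf_dir in pdf_dirs:
        subprocess.run(build_command(pdf_dir, script), check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.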

⚙️ Configuration

API settings for DOI metadata retrieval can be configured in the API Settings tab or by editing config/api_config.json.
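
Before editing the file by hand, it can be useful to inspect what it currently contains. The sketch below loads `config/api_config.json` and returns its contents; it assumes only that the file holds a JSON object at the top level and makes no assumption about the key names inside it. The `load_api_config` helper is hypothetical.

```python
import json
from pathlib import Path
from typing import Any, Dict, Union


def load_api_config(
    path: Union[str, Path] = "config/api_config.json",
) -> Dict[str, Any]:
    """Load the API configuration file and return it as a dictionary."""
    with Path(path).open(encoding="utf-8") as handle:
        data = json.load(handle)
    # Assumption: the configuration is a JSON object, not a list or scalar.
    if not isinstance(data, dict):
        raise ValueError(f"{path} does not contain a JSON object")
    return data
```

Calling `sorted(load_api_config())` from the repository root would list the top-level settings without changing anything.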

🔄 Workflow Example

Input: Start with a folder of unorganized PDF files
Processing: LitOrganizer extracts DOIs and retrieves metadata
Organization: Files are renamed and categorized
Output: A well-structured directory with properly named files
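
The Organization step can be sketched as a function that maps a paper's metadata to its new location. The example assumes metadata is a dictionary with `author` and `year` keys plus an optional category key such as `journal`; these key names, the `Uncategorized` fallback folder, and both helpers are assumptions for illustration rather than LitOrganizer's actual behavior.

```python
import re
from pathlib import Path
from typing import Dict


def safe_name(text: str) -> str:
    """Replace characters that are awkward in file names with underscores."""
    return re.sub(r"[^\w\-]+", "_", text).strip("_")


def target_path(root: Path, meta: Dict[str, str],
                category: str = "year") -> Path:
    """Build <root>/<category value>/<Author>_<Year>.pdf for one paper."""
    filename = f"{safe_name(meta['author'])}_{meta['year']}.pdf"
    # Papers without the chosen category go to a fallback folder.
    folder = safe_name(str(meta.get(category, ""))) or "Uncategorized"
    return root / folder / filename
```

With `{"author": "Smith", "year": "2021", "journal": "Nature Physics"}` and `category="journal"`, this yields `Nature_Physics/Smith_2021.pdf` under the chosen root.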

🛠️ Technical Details

LitOrganizer is built with:

PyQt5: For the graphical user interface
PyMuPDF & pdfplumber: For PDF text extraction
Requests: For API communication with academic databases
pandas & python-docx: For exporting search results
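
The API communication listed above can be pictured with a single DOI lookup. The sketch below queries the public Crossref REST API, which is one example of an academic metadata service; whether LitOrganizer uses Crossref, and which fields it keeps, is an assumption here. It uses the standard library's `urllib` instead of Requests so the example runs without extra packages, and both function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict


def parse_crossref_message(message: Dict[str, Any]) -> Dict[str, str]:
    """Reduce a Crossref 'message' object to a few flat fields."""
    authors = message.get("author") or [{}]
    titles = message.get("title") or [""]
    journals = message.get("container-title") or [""]
    date_parts = (message.get("issued") or {}).get("date-parts") or [[]]
    first_date = date_parts[0]
    year = first_date[0] if first_date else None
    return {
        "author": authors[0].get("family", ""),  # first author's family name
        "year": str(year) if year is not None else "",
        "title": titles[0],
        "journal": journals[0],
    }


def fetch_metadata(doi: str, timeout: float = 10.0) -> Dict[str, str]:
    """Look up one DOI on Crossref (requires network access)."""
    url = "https://api.crossref.org/works/" + urllib.parse.quote(doi)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        payload = json.load(response)
    return parse_crossref_message(payload["message"])
```

Keeping the parsing separate from the network call makes it possible to test the field mapping offline.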

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

Built with PyQt5 for the user interface
Uses pdfplumber and PyMuPDF for PDF text extraction
Integrated with multiple academic APIs for metadata retrieval

📬 Contact

For questions, suggestions, or issues, please open an issue on GitHub or contact the maintainer.

Made with ❤️ for the academic community
