Scrape websites without writing a single line of code.
📚 Check out the docs for a comprehensive quickstart guide and detailed information.
- XPath-Based Extraction: Precisely target page elements (see the sketch after this list)
- Queue Management: Submit and manage multiple scraping jobs
- Domain Spidering: Option to scrape all pages within the same domain
- Custom Headers: Add JSON headers to your scraping requests
- Media Downloads: Automatically download images, videos, and other media
- Results Visualization: View scraped data in a structured table format
- Data Export: Export your results in markdown and csv formats
- Notification Channels: Send completion notifications through various channels
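As a rough illustration of what XPath-based extraction and custom JSON headers look like in practice (this is a generic sketch using `requests` and `lxml`, not Scraperr's internal API; the URL, header values, and XPath expression below are placeholder assumptions):

```python
import requests
from lxml import html

# Placeholder values -- substitute your own target URL, headers, and XPath.
URL = "https://example.com/articles"
CUSTOM_HEADERS = {
    "User-Agent": "my-scraper/1.0",
    "Accept-Language": "en-US",
}
XPATH = "//article//h2/a/text()"  # e.g. grab every article headline

response = requests.get(URL, headers=CUSTOM_HEADERS, timeout=10)
tree = html.fromstring(response.content)

# The XPath returns the matched text nodes as a list of strings.
for title in tree.xpath(XPATH):
    print(title.strip())
```

In Scraperr the same ideas apply without code: you supply the XPath selectors and an optional JSON block of headers when submitting a job.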
To start the app, simply run `make up`.
Refer to the docs for Helm deployment: https://scraperr-docs.pages.dev/guides/helm-deployment
When using Scraperr, please remember to:
- Respect `robots.txt`: Always check a website's `robots.txt` file to verify which pages permit scraping
- Terms of Service: Adhere to each website's Terms of Service regarding data extraction
- Rate Limiting: Implement reasonable delays between requests to avoid overloading servers (see the sketch after this list)
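As a minimal sketch of the `robots.txt` and rate-limiting points above (independent of Scraperr itself), Python's standard-library `urllib.robotparser` can check whether a path may be fetched, and a short delay between requests keeps the load reasonable. The site, paths, user agent, and delay below are placeholder assumptions:

```python
import time
from urllib.robotparser import RobotFileParser

# Placeholder site and paths -- replace with the site you intend to scrape.
SITE = "https://example.com"
PATHS = ["/products", "/blog", "/private"]
DELAY_SECONDS = 2  # assumed polite delay between requests

robots = RobotFileParser()
robots.set_url(f"{SITE}/robots.txt")
robots.read()  # fetches and parses the robots.txt file

for path in PATHS:
    url = f"{SITE}{path}"
    if robots.can_fetch("my-scraper/1.0", url):
        print(f"Allowed: {url}")
        # ... fetch and parse the page here ...
    else:
        print(f"Disallowed by robots.txt, skipping: {url}")
    time.sleep(DELAY_SECONDS)  # rate limiting between requests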
Disclaimer: Scraperr is intended for use only on websites that explicitly permit scraping. The creator accepts no responsibility for misuse of this tool.
Get support, report bugs, and chat with other users and contributors.
This project is licensed under the MIT License. See the LICENSE file for details.
Development made easier with the webapp template.
To get started, simply run `make build up-dev`.