bitmakerla/estela

estela

estela is an elastic web scraping cluster running on Kubernetes. It provides mechanisms to deploy, run and scale web scraping spiders via a REST API and a web interface.

Technologies

Docker, Python, React, Node.js

Project Structure

The project consists of three main modules:

  • REST API: built with Django REST framework, it exposes endpoints to manage projects, spiders, and jobs. It uses Celery for task processing and handles, among other things, the deployment of your Scrapy projects.
  • Queueing: estela needs a high-throughput, low-latency platform to handle real-time data feeds in a producer-consumer architecture. This module contains a consumer that collects the data produced by spider jobs and transports it into a database.
  • Web: a web interface implemented with React and TypeScript that lets you manage projects and spiders.

Each of these modules works independently of the rest and can be changed. Each module has a more detailed description in its corresponding directory.
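
To make the producer-consumer flow of the Queueing module more concrete, here is a minimal sketch using only the Python standard library. It is not estela's actual implementation: the in-process `queue.Queue` stands in for the real queueing platform, a plain list stands in for the database, and the names `run_spider_job`, `consume`, and `run_pipeline` are hypothetical, chosen only for illustration.

```python
import queue
import threading

# Sentinel telling the consumer that no more records will arrive.
_DONE = object()


def run_spider_job(job_id, items, feed):
    """Producer: hypothetical stand-in for a spider job emitting scraped items."""
    for item in items:
        feed.put({"job": job_id, "item": item})


def consume(feed, database):
    """Consumer: drains the feed and stores each record in the 'database'."""
    while True:
        record = feed.get()
        if record is _DONE:
            break
        database.append(record)


def run_pipeline(jobs):
    """Run one producer thread per job and a single consumer thread."""
    feed = queue.Queue()  # assumption: stands in for the real queueing platform
    database = []         # assumption: stands in for a real database

    consumer = threading.Thread(target=consume, args=(feed, database))
    consumer.start()

    producers = [
        threading.Thread(target=run_spider_job, args=(job_id, items, feed))
        for job_id, items in jobs.items()
    ]
    for producer in producers:
        producer.start()
    for producer in producers:
        producer.join()

    # All producers are finished, so it is safe to signal the consumer to stop.
    feed.put(_DONE)
    consumer.join()
    return database


if __name__ == "__main__":
    stored = run_pipeline({"job-1": ["a", "b"], "job-2": ["c"]})
    print(len(stored), "records stored")
```

The point of the sketch is the decoupling: the spider jobs only know how to put records on the feed, and the consumer only knows how to read from it and write to storage, so either side could be replaced without touching the other.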

estela-cli

estela-cli is a command-line interface for estela.

How to Contribute

Please read CONTRIBUTING.md and follow the steps described there. Remember to abide by our Code of Conduct as well.