LFX mentorship (2025/Term-1): Expanding the LitmusChaos Tutorials - Day 0, Day 1, and Day 2 User Flows #5037

Closed
@S-ayanide

Description

This task focuses on improving the LitmusChaos documentation by creating tutorials and structuring them into Day 0, Day 1, and Day 2 workflows tailored for different users. Instead of documenting individual faults (which would require constant maintenance), the goal is to create user-flow-based guides that help users understand chaos engineering principles at different levels of expertise, from beginners experimenting with sample apps to advanced users implementing chaos in real-world systems.

Additionally, this task will involve tech doc improvements, fixing structural issues, removing duplicates, and ensuring a clear and intuitive documentation experience for the community.

Prerequisites:

  • Strong technical writing and research skills.
  • Ability to understand user personas (SREs, Principal Engineers, Developers, etc.).
  • Familiarity with chaos engineering principles (experience with LitmusChaos is a plus).
  • Basic knowledge of Kubernetes and observability tools (Grafana, Prometheus, etc.).

Schedule: 3rd March, 2025 - 30th May, 2025

Previous Works & References:

What You Will Do:

  1. Develop Day 0, Day 1, and Day 2 Tutorials for LitmusChaos
  • Day 0 (Beginner-Level Chaos Engineering) [Already implemented; can be improved further]
    Goal: Introduce users to chaos engineering with a simple application.
    Application: Podtato Head, Online Boutique, or another microservices demo app.
    Experiment: Simulate pod deletion and observe recovery through Kubernetes deployment strategies.
    Outcome: Understand basic failure scenarios and how Kubernetes ensures resilience.

  • Day 1 (Intermediate-Level Chaos Engineering)
    Goal: Introduce chaos into real-world applications with stateful components.
    Application: Redis, Cassandra, or MongoDB.
    Experiments:
      • Simulate leader pod crashes to test leader-election mechanisms.
      • Perform network partitioning to evaluate how replicas handle failures.
    Outcome: Learn how distributed databases and services handle failures.

  • Day 2 (Advanced Chaos Workflows & Multi-Experiment Scenarios)
    Goal: Create a comprehensive chaos engineering workflow from start to finish.
    Scenario: A complex chaos workflow covering multiple failure scenarios.
    Experiments:
      • Pod delete → CPU spike → Network latency → Validate system recovery metrics in Grafana.
      • Extend this to multi-cluster failure scenarios for advanced users.
    Outcome: Understand system-wide resilience patterns and how to build automated chaos workflows.

  2. Research Chaos Experiment Needs for Different Personas
    Identify use cases for different users (SREs, Platform Engineers, Principal Engineers).
    Determine the right types of experiments and use-case tutorials for each group.

  3. Improve Documentation Structure and Fix Issues
    Work on the open issues from the tech docs analysis (structure changes, removing duplicates, improving clarity).
    Enhance navigation and make tutorials easier to follow.
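
To make the Day 2 sequence above (pod delete, then CPU spike, then network latency) more concrete, here is a minimal Python sketch that builds one ChaosEngine-style manifest per fault, in order. It is an illustration only, not the project's tooling. The experiment names `pod-cpu-hog` and `pod-network-latency`, the namespace `demo`, the label `app=podtato-head`, the service account `litmus-admin`, and the 30-second duration are all assumptions for the example, and the field layout should be checked against the current ChaosEngine reference before use.

```python
import json

# Assumed mapping of the Day 2 flow onto LitmusChaos experiment names.
DAY2_SEQUENCE = ["pod-delete", "pod-cpu-hog", "pod-network-latency"]


def chaos_engine(fault, namespace="demo", app_label="app=podtato-head", duration_s=30):
    """Return a ChaosEngine-style manifest (as a dict) for a single fault.

    All default values are hypothetical placeholders for a demo app.
    """
    return {
        "apiVersion": "litmuschaos.io/v1alpha1",
        "kind": "ChaosEngine",
        "metadata": {"name": f"{fault}-engine", "namespace": namespace},
        "spec": {
            "engineState": "active",
            "appinfo": {
                "appns": namespace,
                "applabel": app_label,
                "appkind": "deployment",
            },
            "chaosServiceAccount": "litmus-admin",  # assumed service account
            "experiments": [
                {
                    "name": fault,
                    "spec": {
                        "components": {
                            "env": [
                                {
                                    "name": "TOTAL_CHAOS_DURATION",
                                    "value": str(duration_s),
                                }
                            ]
                        }
                    },
                }
            ],
        },
    }


def day2_workflow(sequence=DAY2_SEQUENCE):
    """Build the manifests in the order the faults should be injected."""
    return [chaos_engine(fault) for fault in sequence]


if __name__ == "__main__":
    for manifest in day2_workflow():
        print(json.dumps(manifest, indent=2))
```

The sketch only generates manifests. Applying them to a cluster and validating recovery metrics in Grafana are left to the tutorial itself.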

Mentors

This task is ideal for those passionate about developer experience, documentation, and chaos engineering education. The tutorials created will serve as long-term learning resources for new and experienced LitmusChaos users!
