Athena: Refactor LLM Configuration to YAML-Based System #92


Open. Wants to merge 59 commits into base: main

Conversation

@LeonWehrhahn commented Apr 13, 2025

Motivation and Context

This PR rewrites the llm_core module configuration system to address current limitations. The core motivation behind these changes is threefold:

  1. Granular LLM Model Selection for Tasks: We need the ability to specify different LLM models for different tasks. For example, using a high-powered, but potentially more costly, LLM model for low-volume complex operations like generating initial structured grading instructions, while employing a faster, more economical LLM model for high-volume tasks like generating feedback on individual student submissions.
  2. Flexible and Comprehensive LLM Model Configuration: We need the ability to configure not only the LLM model to use but also its inherent capabilities (e.g., whether it supports function calling or structured output) and default settings (e.g., temperature, top_p). This is crucial for supporting a diverse range of LLM models.
  3. Preserved Dynamic Configuration Overrides via Headers: While not a new feature, we want to retain the existing ability to dynamically override LLM model configurations via x- headers in API requests, as used in the Athena playground.

Description

To achieve the outlined goals, we introduced two YAML files to manage model configurations and capabilities:

  • llm_capabilities.yml (llm_core): This file defines the core capabilities of different LLM models. It specifies default settings (like temperature, top_p) and flags for supported features (like supports_function_calling, supports_structured_output). Importantly, it also allows for LLM model-specific overrides to these defaults. This file resides at the top level of the llm_core directory and is therefore shared by every module (e.g., module_modeling_llm, module_programming_llm).

  • llm_config.yml (module-specific): Each module (e.g., module_modeling_llm, module_programming_llm) now has its own llm_config.yml located at the root level of the module. This file specifies the concrete models to be used for different tasks within that module. For example, the modeling module might specify a powerful model like openai_o1 for generating grading instructions and a faster, more economical model like openai_4o for generating feedback. Switching from environment variables to module-level YAML files for LLM configuration brings these settings under version control, ensuring consistent deployments and eliminating the risk of environment-specific discrepancies.

Many other aspects of the llm_core module were changed to support this new YAML-based configuration approach. These changes are outlined in more detail in the README.
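As a rough illustration, the two files might look like the sketch below. The exact schema is defined in the README; beyond the fields and model names mentioned above (temperature, top_p, supports_function_calling, supports_structured_output, openai_o1, openai_4o), all key names here are illustrative assumptions, not the actual schema.

```yaml
# llm_capabilities.yml (llm_core) -- hypothetical sketch
defaults:
  temperature: 0.0
  top_p: 1.0
  supports_function_calling: true
  supports_structured_output: true
models:
  openai_o1:
    # model-specific overrides of the defaults
    supports_function_calling: false
    temperature: 1.0
---
# llm_config.yml (e.g., in module_modeling_llm) -- hypothetical sketch
models:
  grading_instructions: openai_o1   # low-volume, complex task
  feedback_generation: openai_4o    # high-volume, economical task
```

Because the capabilities file is shared while each llm_config.yml is module-local, a module only has to name the models it uses per task; defaults and feature flags come from llm_core.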

Steps for Testing

  1. Verify Model Configuration:

    • Ensure that the llm_config.yml and llm_capabilities.yml files are correctly parsed.
    • Check that different modules (e.g., module_modeling_llm) can successfully load and use their specified model configurations.
  2. Test Feedback Generation:

    • Verify that feedback generation still works in each module (e.g., module_modeling_llm, module_programming_llm).
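For step 1, a minimal parse-and-sanity check could look like the sketch below. It assumes the parsed files are plain dicts (e.g., from yaml.safe_load) with a top-level models mapping; those key names are assumptions for illustration, not Athena's actual schema.

```python
# Minimal sanity check for a parsed llm_config.yml / llm_capabilities.yml pair.
# Key names ("models", task names) are illustrative assumptions.

def validate_config(config: dict, capabilities: dict) -> list[str]:
    """Return a list of problems; an empty list means the configs look consistent."""
    problems = []
    # Models declared in llm_capabilities.yml
    known_models = set(capabilities.get("models", {}))
    # Every task in llm_config.yml must reference a known model
    for task, model in config.get("models", {}).items():
        if model not in known_models:
            problems.append(f"task {task!r} references unknown model {model!r}")
    return problems

capabilities = {"models": {"openai_o1": {}, "openai_4o": {}}}
config = {"models": {"grading_instructions": "openai_o1",
                     "feedback_generation": "openai_4o"}}
print(validate_config(config, capabilities))  # prints: []
```

A check like this can run at module startup so a typo in a model name fails fast instead of surfacing mid-request.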

Summary by CodeRabbit

  • New Features

    • Introduced dynamic, provider-agnostic configuration and loading of language models (OpenAI, Azure, Ollama) via new YAML config files and modular provider support.
    • Added support for specifying multiple model roles (base, mini, fast reasoning, long reasoning) per module.
  • Improvements

    • Enhanced modularity and flexibility in model selection for all LLM-based modules.
    • Refactored model configuration and prompt handling for improved clarity, maintainability, and extensibility.
    • Simplified prompt construction and prediction calls by removing explicit output formatting and function calling flags.
    • Updated environment variable samples and documentation for easier integration with new providers.
    • Improved concurrency and error handling in feedback generation workflows.
    • Added dynamic argument handling to suggestion generation based on method signatures.
    • Added checks for null prediction results to prevent downstream errors.
  • Bug Fixes

    • Improved error handling and logging for model discovery and prediction processes.
  • Documentation

    • Added comprehensive README and configuration file documentation for LLM integration.
  • Style

    • Reformatted code and configuration files for consistency and readability.
    • Reformatted logging calls and function signatures for clarity.
  • Refactor

    • Replaced static model configuration with dynamic, runtime-loaded configurations.
    • Updated function and method signatures for clarity and explicitness.
    • Removed deprecated or redundant fields and logic related to output formatting.
    • Consolidated and simplified model configuration imports and exports.
    • Removed obsolete model integration modules and replaced with provider-specific configs.
    • Simplified prompt utilities by removing conditional formatting logic.
    • Replaced strategy factory pattern with direct approach implementation registry and dynamic dispatch.
  • Chores

    • Updated dependencies (e.g., OpenAI package version).
    • Added and updated example environment and configuration files.
    • Added Poetry virtual environment configuration files for multiple modules.
    • Updated VSCode workspace settings for consistent Python interpreter paths.
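The "dynamic argument handling based on method signatures" item above can be sketched with Python's standard inspect module; the generator functions below are hypothetical stand-ins, not Athena's actual API.

```python
import inspect

def call_with_supported_args(func, **available):
    """Pass only the keyword arguments that func actually declares."""
    params = inspect.signature(func).parameters
    accepted = {name: value for name, value in available.items() if name in params}
    return func(**accepted)

# Hypothetical suggestion generators with differing signatures:
def generate_basic(submission):
    return f"feedback for {submission}"

def generate_with_model(submission, model):
    return f"feedback for {submission} via {model}"

# The caller can offer a superset of arguments; each generator
# receives only the ones it declares.
print(call_with_supported_args(generate_basic, submission="s1", model="openai_4o"))
print(call_with_supported_args(generate_with_model, submission="s1", model="openai_4o"))
```

This lets the registry dispatch to approach implementations with different signatures without every implementation having to accept every argument. (Note the simple name check would miss functions that accept **kwargs; a fuller version would special-case VAR_KEYWORD parameters.)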

LeonWehrhahn and others added 30 commits November 16, 2024 19:30
…tionships; fix foreign key references and ensure proper inheritance structure.
…remove debug prints, update caching logic, and change serialization method for structured grading instructions
Contributor @ahmetsenturk left a comment

the following things are ok ✅

  • code lgtm,
  • tested locally with Artemis, request AI feedback worked with programming/modeling/text exercises,
  • playground with text

the following are failing ❌

  • programming and modeling on playground:
[Screenshots attached: 2025-06-16 at 16:18:17 and 16:17:08]

P.S. this would still require intensive testing, as it touches almost the whole of Athena :)


⚠️ Unable to deploy to test server ⚠️

"Athena - Test 1" is already in use by PR #172.


There hasn't been any activity on this pull request recently. Therefore, this pull request has been automatically marked as stale and will be closed if no further activity occurs within seven days. Thank you for your contributions.

Labels: athena, lock:athena-test1, stale
Projects
Status: Backlog
4 participants