    Repositories list

    • mle-bench

      Public
      MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering
      Python
      Other
      90001 Updated Apr 7, 2025
    • A framework for few-shot evaluation of language models.
      Python
      MIT License
      2.3k001 Updated Apr 7, 2025
    • template

      Public template
      Template project implementing Palisade's coding guide
      Python
      0001 Updated Apr 7, 2025
    • HTML
      42601 Updated Apr 7, 2025
    • Inspect: A framework for large language model evaluations
      Python
      MIT License
      206001 Updated Apr 7, 2025
    • hap-ctf

      Public
      Python
      0002 Updated Apr 7, 2025
    • ctfish

      Public
      Chess agent specification gaming
      HTML
      51001 Updated Apr 7, 2025
    • Voice interview agent passes a technical screen
      JavaScript
      MIT License
      1.2k600 Updated Feb 16, 2025
    • ci-action

      Public
      1000 Updated Feb 15, 2025
    • intercode

      Public
      Python
      MIT License
      445100 Updated Dec 5, 2024
    • HarmBench

      Public
      HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal
      Jupyter Notebook
      MIT License
      83000 Updated Aug 8, 2024
    • foxvox

      Public
      JavaScript
      41110 Updated Jul 5, 2024
    • AutoAWQ

      Public archive
      AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation:
      Python
      MIT License
      258000 Updated Jun 21, 2024
    • rasp-lab

      Public archive
      Transformers with handcrafted weights which implement 3-SAT and integer factoring verification, among other algorithms. It uses RASP and Tracr for programming the Transformers.
      Jupyter Notebook
      Apache License 2.0
      1000 Updated Mar 14, 2024
    • 1100 Updated Jan 29, 2024