Skip to content

ottosulin/awesome-ai-security

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

93 Commits
 
 
 
 

Repository files navigation

Awesome AI Security Awesome Track Awesome List

A curated list of awesome AI security related frameworks, standards, learning resources and tools.

If you want to contribute, create a PR or contact me @ottosulin / @ottosulin.

Learning resources

Governance

Frameworks and standards

Taxonomies and terminology

Offensive tools and frameworks

Guides & frameworks

ML

  • Malware Env for OpenAI Gym - makes it possible to write agents that learn to manipulate PE files (e.g., malware) to achieve some objective (e.g., bypass AV) based on a reward provided by taking specific manipulation actions
  • Deep-pwning - a lightweight framework for experimenting with machine learning models with the goal of evaluating their robustness against a motivated adversary
  • Counterfit - generic automation layer for assessing the security of machine learning systems
  • DeepFool - A simple and accurate method to fool deep neural networks
  • Snaike-MLFlow - MLflow red team toolsuite
  • HackingBuddyGPT - An automatic pentester (+ corresponding [benchmark dataset](https://github.com/ipa -lab/hacking-benchmark))
  • Charcuterie - code execution techniques for ML or ML adjacent libraries
  • OffsecML Playbook - A collection of offensive and adversarial TTP's with proofs of concept
  • BadDiffusion - Official repo to reproduce the paper "How to Backdoor Diffusion Models?" published at CVPR 2023
  • Exploring the Space of Adversarial Images
  • Adversarial Machine Learning Library(Ad-lib)](https://github.com/vu-aml/adlib) - Game-theoretic adversarial machine learning library providing a set of learner and adversary modules
  • Adversarial Robustness Toolkit - ART focuses on the threats of Evasion (change the model behavior with input modifications), Poisoning (control a model with training data modifications), Extraction (steal a model through queries) and Inference (attack the privacy of the training data)
  • cleverhans - An adversarial example library for constructing attacks, building defenses, and benchmarking both
  • foolbox - A Python toolbox to create adversarial examples that fool neural networks in PyTorch, TensorFlow, and JAX

LLM

  • garak - security probing tool for LLMs
  • agentic_security - Agentic LLM Vulnerability Scanner / AI red teaming kit
  • Agentic Radar - Open-source CLI security scanner for agentic workflows.
  • llamator - Framework for testing vulnerabilities of large language models (LLM).
  • whistleblower - Whistleblower is a offensive security tool for testing against system prompt leakage and capability discovery of an AI application exposed through API
  • LLMFuzzer - 🧠 LLMFuzzer - Fuzzing Framework for Large Language Models 🧠 LLMFuzzer is the first open-source fuzzing framework specifically designed for Large Language Models (LLMs), especially for their integrations in applications via LLM APIs. 🚀💥
  • vigil-llm - ⚡ Vigil ⚡ Detect prompt injections, jailbreaks, and other potentially risky Large Language Model (LLM) inputs
  • FuzzyAI - A powerful tool for automated LLM fuzzing. It is designed to help developers and security researchers identify and mitigate potential jailbreaks in their LLM APIs.
  • EasyJailbreak - An easy-to-use Python framework to generate adversarial jailbreak prompts.
  • promptmap - a prompt injection scanner for custom LLM applications
  • PyRIT - The Python Risk Identification Tool for generative AI (PyRIT) is an open source framework built to empower security professionals and engineers to proactively identify risks in generative AI systems.
  • PurpleLlama - Set of tools to assess and improve LLM security.
  • Giskard-AI - 🐢 Open-Source Evaluation & Testing for AI & LLM systems
  • promptfoo - Test your prompts, agents, and RAGs. Red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration.
  • HouYi - The automated prompt injection framework for LLM-integrated applications.
  • llm-attacks - Universal and Transferable Attacks on Aligned Language Models
  • Dropbox llm-security - Dropbox LLM Security research code and results
  • llm-security - New ways of breaking app-integrated LLMs
  • OpenPromptInjection - This repository provides a benchmark for prompt Injection attacks and defenses
  • Plexiglass - A toolkit for detecting and protecting against vulnerabilities in Large Language Models (LLMs).
  • ps-fuzz - Make your GenAI Apps Safe & Secure 🚀 Test & harden your system prompt
  • EasyEdit - Modify an LLM's ground truths
  • spikee) - Simple Prompt Injection Kit for Evaluation and Exploitation
  • Prompt Hacking Resources - A list of curated resources for people interested in AI Red Teaming, Jailbreaking, and Prompt Injection
  • mcp-injection-experiments - Code snippets to reproduce MCP tool poisoning attacks.

AI for offensive cyber

  • AI-Red-Teaming-Playground-Labs - AI Red Teaming playground labs to run AI Red Teaming trainings including infrastructure.
  • HackGPT - A tool using ChatGPT for hacking
  • mcp-for-security - A collection of Model Context Protocol servers for popular security tools like SQLMap, FFUF, NMAP, Masscan and more. Integrate security testing and penetration testing into AI workflows.
  • cai - Cybersecurity AI (CAI), an open Bug Bounty-ready Artificial Intelligence (paper)
  • AIRTBench - Code Repository for: AIRTBench: Measuring Autonomous AI Red Teaming Capabilities in Language Models

Defensive tools and frameworks

Guides & frameworks

Data security and governance

  • datasig - Dataset fingerprinting for AIBOM

Safety and prevention

  • Guardrail.ai - Guardrails is a Python package that lets a user add structure, type and quality guarantees to the outputs of large language models (LLMs)
  • CodeGate - An open-source, privacy-focused project that acts as a layer of security within a developers Code Generation AI workflow
  • MCP-Security-Checklist - A comprehensive security checklist for MCP-based AI tools. Built by SlowMist to safeguard LLM plugin ecosystems.
  • Awesome-MCP-Security - Everything you need to know about Model Context Protocol (MCP) security.
  • LlamaFirewall - LlamaFirewall is a framework designed to detect and mitigate AI centric security risks, supporting multiple layers of inputs and outputs, such as typical LLM chat and more advanced multi-step agentic operations.
  • awesome-ai-safety
  • ZenGuard AI - The fastest Trust Layer for AI Agents
  • llm-guard - LLM Guard by Protect AI is a comprehensive tool designed to fortify the security of Large Language Models (LLMs).
  • vibraniumdome - Full blown, end to end LLM WAF for Agents, allowing security teams govenrance, auditing, policy driven control over Agents usage of language models.
  • mcp-guardian - MCP Guardian manages your LLM assistant's access to MCP servers, handing you realtime control of your LLM's activity.

Detection & scanners

  • modelscan - ModelScan is an open source project from Protect AI that scans models to determine if they contain unsafe code.
  • rebuff - Prompt Injection Detector
  • langkit - LangKit is an open-source text metrics toolkit for monitoring language models. The toolkit various security related metrics that can be used to detect attacks
  • MCP-Scan - A security scanning tool for MCP servers

Privacy and confidentiality

  • Python Differential Privacy Library
  • Diffprivlib - The IBM Differential Privacy Library
  • PLOT4ai - Privacy Library Of Threats 4 Artificial Intelligence A threat modeling library to help you build responsible AI
  • TenSEAL - A library for doing homomorphic encryption operations on tensors
  • SyMPC - A Secure Multiparty Computation companion library for Syft
  • PyVertical - Privacy Preserving Vertical Federated Learning
  • Cloaked AI - Open source property-preserving encryption for vector embeddings
  • PrivacyRaven - privacy testing library for deep learning systems

About

A collection of awesome resources related AI security

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 9