
promacanthus/awesome-golang-ai


Awesome Golang.ai

Golang AI applications have incredible potential, thanks to features such as fast execution, easy debugging, built-in concurrency, and excellent libraries for ML, deep learning, and reinforcement learning.

Benchmark

  • ADeLe: ADeLe v1.0 is a comprehensive AI evaluation framework that combines explanatory analysis and predictive modeling capabilities to systematically assess AI system performance across multiple dimensions.
  • SWELancer: The SWE-Lancer-Benchmark is designed to evaluate the capabilities of frontier LLMs in solving real-world freelance software engineering tasks, exploring their potential to generate economic value through complex software development scenarios.

English

  • ARC-AGI: The Abstraction and Reasoning Corpus.
  • ARC-Challenge: AI2 Reasoning Challenge (ARC) Set.
  • BBH: Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them.
  • BIG-bench: Beyond the Imitation Game collaborative benchmark for measuring and extrapolating the capabilities of language models.
  • GPQA: GPQA: A Graduate-Level Google-Proof Q&A Benchmark.
  • HellaSwag: Can a Machine Really Finish Your Sentence?
  • IFEval: IFEval is designed to systematically evaluate the instruction-following capabilities of large language models by incorporating 25 verifiable instruction types (e.g., format constraints, keyword inclusion) and applying dual strict-loose metrics for automated, objective assessment of model compliance.
  • LiveBench: A Challenging, Contamination-Free LLM Benchmark.
  • MMLU: Measuring Massive Multitask Language Understanding (ICLR 2021).
  • MMLU-CF: A Contamination-free Multi-task Language Understanding Benchmark.
  • MMLU-Pro: [NeurIPS 2024] A More Robust and Challenging Multi-Task Language Understanding Benchmark.
  • MTEB: Massive Text Embedding Benchmark.
  • PIQA: PIQA is a dataset for commonsense reasoning, and was created to investigate the physical knowledge of existing models in NLP.
  • WinoGrande: An Adversarial Winograd Schema Challenge at Scale.

Chinese

  • C-Eval: [NeurIPS 2023] A Chinese evaluation suite for foundation models.
  • CMMLU: Measuring massive multitask language understanding in Chinese.
  • C-SimpleQA: A Chinese Factuality Evaluation for Large Language Models.

Math

  • AIME: Evaluation of LLMs on latest math competitions.
  • grade-school-math: The GSM8K dataset contains 8.5K grade school math word problems designed to evaluate multi-step reasoning capabilities in language models, revealing that even large transformers struggle with these conceptually simple yet procedurally complex tasks.
  • MATH: The MATH dataset (NeurIPS 2021) is a benchmark for evaluating mathematical problem-solving capabilities, offering dataset loaders, evaluation code, and pre-training data.
  • MathVista: MathVista: data, code, and evaluation for Mathematical Reasoning in Visual Contexts.
  • Omni-MATH: Omni-MATH is a comprehensive and challenging benchmark specifically designed to assess LLMs' mathematical reasoning at the Olympiad level.
  • TAU-bench: TauBench is an open-source benchmark suite designed to evaluate the performance of large language models (LLMs) on complex reasoning tasks across multiple domains.

Code

  • AIDER: The leaderboards page of aider presents a performance comparison of various LLMs in programming-related tasks, such as code writing and editing.
  • BFCL: BFCL aims to provide a thorough study of the function-calling capability of different LLMs.
  • BigCodeBench: [ICLR'25] BigCodeBench: Benchmarking Code Generation Towards AGI.
  • Code4Bench: A Multidimensional Benchmark of Codeforces Data for Different Program Analysis Techniques.
  • CRUXEval: Code Reasoning, Understanding, and Execution Evaluation.
  • HumanEval: Code for the paper "Evaluating Large Language Models Trained on Code".
  • LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code.
  • MBPP: The benchmark consists of around 1,000 crowd-sourced Python programming problems, designed to be solvable by entry level programmers, covering programming fundamentals, standard library functionality, and so on.
  • MultiPL-E: A multi-programming language benchmark for LLMs.
  • multi-swe-bench: The Multi-SWE-bench project, developed by ByteDance's Doubao team, is the first open-source multilingual dataset for evaluating and enhancing large language models' ability to automatically debug code, covering 7 major programming languages (e.g., Java, C++, JavaScript) with real-world GitHub issues to benchmark "full-stack engineering" capabilities.
  • SWE-bench: SWE-bench is a benchmark suite designed to evaluate the capabilities of large language models (LLMs) in solving real-world software engineering tasks, focusing on actual software bug-fixing challenges extracted from open-source projects.

Tool Use

  • BFCL: Training and Evaluating LLMs for Function Calls (Tool Calls).
  • T-Eval: [ACL2024] T-Eval: Evaluating Tool Utilization Capability of Large Language Models Step by Step.
  • WildBench: Benchmarking LLMs with Challenging Tasks from Real Users.

Open ended

  • Arena-Hard: Arena-Hard-Auto: An automatic LLM benchmark.

Safety

False refusal

  • XSTest: Röttger et al. (NAACL 2024): "XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models".

Multi-modal

  • DPG-Bench: The DPG benchmark tests a model’s ability to follow complex image generation prompts.
  • geneval: GenEval: An object-focused framework for evaluating text-to-image alignment.
  • LongVideoBench: [NeurIPS'24 D&B] Official Dataloader and Evaluation Scripts for LongVideoBench.
  • MLVU: Multi-task Long Video Understanding Benchmark.
  • perception_test: Perception Test, a diagnostic benchmark for multimodal video models, designed to comprehensively evaluate their perception and reasoning skills.
  • TempCompass: A benchmark to evaluate the temporal perception ability of Video LLMs.
  • VBench: VBench is an open-source project aiming to build a comprehensive evaluation benchmark for video generation models.
  • Video-MME: [CVPR 2025] Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis.

Model Context Protocol

  • mcp-go: A Go implementation of the Model Context Protocol (MCP), enabling seamless integration between LLM applications and external data sources and tools.
  • mcp-golang: Write Model Context Protocol servers in a few lines of Go code.
  • gateway: Universal MCP-Server for your Databases optimized for LLMs and AI-Agents.

Large Language Model

GPT

  • gpt-go: Tiny GPT implemented from scratch in pure Go. Trained on Jules Verne books.

ChatGPT Apps

  • feishu-openai: Feishu (Lark) integrated with (GPT-4 + GPT-4V + DALL·E-3 + Whisper) delivers an extraordinary work experience.
  • chatgpt-telegram: Run your own ChatGPT Telegram bot with a single command.

SDKs

  • openai-go: The official Go library for the OpenAI API.
  • go-openai: OpenAI ChatGPT, GPT-3, GPT-4, DALL·E, Whisper API wrapper for Go.
  • generative-ai-go: Go SDK for Google Generative AI.
  • anthropic-sdk-go: Access to Anthropic's safety-first language model APIs via Go.
  • go-anthropic: Anthropic Claude API wrapper for Go.
  • deepseek-go: A DeepSeek client written in Go supporting R-1, Chat V3, and Coder. Also supports external providers such as Azure, OpenRouter, and local Ollama.

DevTools

  • ollama: Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 2, and other large language models.
  • go-attention: A full attention mechanism and transformer in pure Go.
  • langchaingo: LangChain for Go, the easiest way to write LLM-based programs in Go.
  • gpt4all-bindings: GPT4All Language Bindings provide cross-language interfaces to easily integrate and interact with GPT4All's local LLMs, simplifying model loading and inference for developers.
  • llama.go: llama.go is like llama.cpp in pure Golang.
  • eino: The ultimate LLM/AI application development framework in Golang.
  • fabric: fabric is an open-source framework for augmenting humans using AI. It provides a modular framework for solving specific problems using a crowdsourced set of AI prompts that can be used anywhere.
  • genkit: An open source framework for building AI-powered apps with familiar code-centric patterns. Genkit makes it easy to develop, integrate, and test AI features with observability and evaluations. Genkit works with various models and platforms.
  • swarmgo: SwarmGo (agents-sdk-go) is a Go package that allows you to create AI agents capable of interacting, coordinating, and executing tasks.
  • orra: The orra-dev/orra project offers resilience for AI agent workflows.
  • core: A fast, agnostic, and powerful Go AI framework for one-shot workflows, building autonomous agents, and working with LLM providers.
  • gollm: Unified Go interface for Language Model (LLM) providers. Simplifies LLM integration with flexible prompt management and common task functions.

Vector Database

  • milvus: Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search.
  • weaviate: Weaviate is an open-source vector database that stores both objects and vectors, allowing vector search to be combined with structured filtering, with the fault tolerance and scalability of a cloud-native database.
  • tidb: TiDB - the open-source, cloud-native, distributed SQL database designed for modern applications.

Pipeline and Data Version

  • pachyderm: Data-Centric Pipelines and Data Versioning.

Embedding Benchmark

  • MTEB: MTEB (Massive Text Embedding Benchmark) is an open-source benchmarking framework for evaluating and comparing text embedding models across 8 tasks (e.g., classification, retrieval, clustering) using 58 datasets in 112 languages, providing standardized performance metrics for model selection.
  • BRIGHT: BRIGHT is a realistic, challenging benchmark for reasoning-intensive retrieval, featuring 12 diverse datasets (math, code, biology, etc.) to evaluate retrieval models across complex, context-rich queries requiring logical inference.

General Machine Learning libraries

  • goml: On-line Machine Learning in Go (and so much more).
  • golearn: A simple, customizable, batteries-included ML library in Go.
  • gonum: Gonum is a set of numeric libraries for the Go programming language. It contains libraries for matrices, statistics, optimization, and more.
  • gorgonia: Gorgonia is a library that helps facilitate machine learning in Go.
  • spago: Self-contained Machine Learning and Natural Language Processing library in Go.
  • goro: A High-level Machine Learning Library for Go.
  • goga: Golang Genetic Algorithm.
  • hep: hep is the mono repository holding all of go-hep.org/x/hep packages and tools.
  • hector: Golang machine learning lib.
  • sklearn: Bits of sklearn ported to Go.
  • tokenizer: NLP tokenizers written in Go.

Neural Networks

  • gobrain: Neural Networks written in Go.
  • go-neural: Neural network implementation in Go.
  • go-deep: Artificial Neural Network.
  • olivia: Your new best friend powered by an artificial neural network.
  • gomid: A simplistic Neural Network Library in Go.
  • neurgo: Neural Network toolkit in Go.
  • gonn: GoNN is an implementation of neural networks in Go, including BPNN, RBF, and PCN.
  • gosom: Self-organizing maps in Go.
  • go-perceptron-go: A single / multi layer / recurrent neural network written in Golang.

Linear Algebra

  • gosl: Linear algebra, eigenvalues, FFT, Bessel, elliptic, orthogonal polys, geometry, NURBS, numerical quadrature, 3D transfinite interpolation, random numbers, Mersenne twister, probability distributions, optimisation, differential equations.
  • sparse: Sparse matrix formats for linear algebra supporting scientific and machine learning applications.

Probability Distributions

  • godist: Probability distributions and associated methods in Go.

Decision Trees

  • CloudForest: CloudForest is a fast, flexible Go library for multi-threaded decision tree ensembles (Random Forest, Gradient Boosting, etc.) designed for high-dimensional heterogeneous data with missing values, emphasizing speed and robustness for real-world machine learning tasks.

Regression

  • regression: Multivariable regression library in Go.
  • ridge: Ridge regression in Go.

Bayesian Classifiers

  • bayesian: Naive Bayesian Classification for Golang.
  • multibayes: Multiclass Naive Bayesian Classification.

Recommendation Engines

  • regommend: Recommendation engine for Go.
  • gorse: Go Recommender System Engine.
  • too: Simple recommendation engine implementation built on top of Redis.

Evolutionary Algorithms

  • eaopt: Evolutionary optimization library for Go (genetic algorithm, particle swarm optimization, differential evolution).
  • evo: Evolutionary Algorithms in Go.

Graph

  • gogl: A graph library in Go.

Cluster

  • gokmeans: K-means algorithm implemented in Go (golang).
  • kmeans: k-means clustering algorithm implementation written in Go.

Anomaly Detection

  • morgoth: Metric anomaly detection.
  • anomalyzer: Probabilistic anomaly detection for time series data.
  • goanomaly: Golang library for anomaly detection. Uses the Gaussian distribution and the probability density formula.

DataFrames

  • gota: Gota: DataFrames and data wrangling in Go.
  • dataframe-go: DataFrames for Go: For statistics, machine-learning, and data manipulation/exploration.
  • qframe: Immutable data frame for Go.

Explaining Model

  • lime: Lime: Explaining the predictions of any machine learning classifier.

Books

Basic Knowledge

Reinforcement Learning

Datasets

