The official GitHub page for the survey paper "A Survey on Large Language Models: Applications, Challenges, Limitations, and Practical Usage"

anas-zafar/LLM-Survey


Large Language Models: A Comprehensive Survey of its Applications, Challenges, Limitations, And Future Prospects (Updated 2025)

The Large Language Models Survey repository is a comprehensive compendium dedicated to the exploration and understanding of Large Language Models (LLMs). It collects research papers, blog posts, tutorials, code examples, and related resources that trace the progression, methodologies, and applications of LLMs. The repository is aimed at AI researchers, data scientists, and enthusiasts interested in the advancements and inner workings of LLMs. Contributions from the wider community are encouraged to promote collaborative learning and push the boundaries of LLM research.

Timeline of LLMs

(Figure: evolution timeline of LLM releases)

List of LLMs (Updated July 2025)

2025 Latest Models

| Language Model | Organization | Release Date | Checkpoints | Paper/Blog | Params (B) | Context Length | Licence | Try it |
|---|---|---|---|---|---|---|---|---|
| Grok 3 / Grok 3 Mini | xAI | 2025/02 | Grok 3, Grok 3 Mini | Grok 3 Beta: The Age of Reasoning Agents | Undisclosed | 1M tokens | Proprietary | xAI Platform |
| Llama 4 Scout | Meta | 2025/04 | Llama 4 Scout | The Llama 4 herd: The beginning of a new era | 17B active (109B total) | 10M tokens | Llama 4 Community License | HuggingFace |
| Llama 4 Maverick | Meta | 2025/04 | Llama 4 Maverick | The Llama 4 herd | 17B active (400B total) | 1M tokens | Llama 4 Community License | HuggingFace |
| Llama 4 Behemoth | Meta | 2025/04 (in training) | In training | The Llama 4 herd | 288B active (~2T total) | TBD | TBD | TBD |
| Qwen 3 Family | Alibaba | 2025/04 | Qwen 3 Family | Alibaba unveils Qwen3 | 0.6B - 235B (22B active) | 32K - 131K tokens | Apache 2.0 | Qwen Chat |
| DeepSeek-R1 Family | DeepSeek | 2025/01-05 | DeepSeek-R1, R1-Zero, R1-0528 | DeepSeek-R1: Incentivizing Reasoning Capability | 37B active (671B total) | 128K tokens | MIT | DeepSeek Platform |
| o3 / o3-mini / o4-mini | OpenAI | 2025/01-04 | o3, o3-mini, o4-mini | Introducing OpenAI o3 and o4-mini | Undisclosed | 200K tokens | Proprietary | ChatGPT |
| Claude 4 (Sonnet & Opus) | Anthropic | 2025/05 | Claude Sonnet 4, Claude Opus 4 | Introducing Claude 4 | Undisclosed | 200K tokens | Proprietary | Claude.ai |
| Gemini 2.5 Family | Google | 2025/03-06 | Gemini 2.5 Pro, 2.5 Flash, 2.5 Flash-Lite | Gemini 2.5: Our newest Gemini model with thinking | Undisclosed | 1M tokens | Proprietary | Gemini |
Major 2024 Models

| Language Model | Organization | Release Date | Checkpoints | Paper/Blog | Params (B) | Context Length | Licence | Try it |
|---|---|---|---|---|---|---|---|---|
| GPT-4o / GPT-4o mini | OpenAI | 2024/05-07 | GPT-4o, GPT-4o mini | Hello GPT-4o; GPT-4o mini: advancing cost-efficient intelligence | Undisclosed | 128K tokens | Proprietary | ChatGPT |
| o1 / o1-mini | OpenAI | 2024/09 | o1, o1-mini | Learning to Reason with LLMs | Undisclosed | 200K / 128K tokens | Proprietary | ChatGPT |
| Claude 3 Family | Anthropic | 2024/03 | Claude 3 Haiku, Claude 3 Sonnet, Claude 3 Opus | Introducing the next generation of Claude | Undisclosed | 200K tokens | Proprietary | Claude.ai |
| Claude 3.5 Sonnet | Anthropic | 2024/06 | Claude 3.5 Sonnet | Claude 3.5 Sonnet | Undisclosed | 200K tokens | Proprietary | Claude.ai |
| Claude 3.7 Sonnet | Anthropic | 2025/02 | Claude 3.7 Sonnet | Claude 3.7 Sonnet | Undisclosed | 200K tokens | Proprietary | Claude.ai |
| Gemini 1.5 Pro / Flash | Google | 2024/02-05 | Gemini 1.5 Pro, Gemini 1.5 Flash | Our next-generation model: Gemini 1.5 | Undisclosed | 1M-2M / 1M tokens | Proprietary | Gemini |
| Gemini 2.0 Flash | Google | 2024/12 | Gemini 2.0 Flash | Gemini 2.0 Flash | Undisclosed | 1M tokens | Proprietary | Gemini |
| Gemma 2 | Google | 2024/06 | Gemma 2 Family | Gemma 2: Improving Open Language Models at a Practical Size | 9B, 27B | 8K tokens | Gemma Terms of Use | HuggingFace |
| Llama 3 Family | Meta | 2024/04 | Llama 3 Weights | Introducing Meta Llama 3 | 8B, 70B | 8K tokens | Custom | HuggingChat |
| Llama 3.1 | Meta | 2024/07 | Llama 3.1 Weights | The Llama 3 Herd of Models | 8B, 70B, 405B | 128K tokens | Custom | HuggingChat |
| Llama 3.2 | Meta | 2024/09 | Llama 3.2 Models | Llama 3.2: Revolutionizing edge AI and vision with open, customizable models | 1B, 3B, 11B, 90B | 128K tokens | Custom | HuggingChat |
| Llama 3.3 | Meta | 2024/12 | Llama 3.3 70B | Llama 3.3 70B | 70B | 128K tokens | Custom | HuggingChat |
| Phi-3 Family | Microsoft | 2024/04-08 | Phi-3 Mini, Phi-3 Small, Phi-3 Medium, Phi-3.5 Mini | Phi-3 Technical Report | 3.8B - 14B | 4K-128K tokens | MIT | Azure AI Studio |
| IBM Granite 3.0 / 3.1 | IBM | 2024/10-12 | Granite 3.0, Granite 3.1 | IBM Introduces Granite 3.0 | 2B, 8B | 4K / 128K tokens | Apache 2.0 | IBM watsonx |
| Command R / R+ | Cohere | 2024/03-04 | Command R, Command R+ | Command R: Cohere's scalable generative model | 35B / 104B | 128K tokens | CC BY-NC 4.0 | Cohere Platform |
| DeepSeek-V3 Family | DeepSeek | 2024/12-2025/03 | DeepSeek-V3, DeepSeek-V3-0324 | DeepSeek-V3 Technical Report | 37B active (671B total) | 128K tokens | MIT | DeepSeek Platform |
| Qwen 2.5 Family | Alibaba | 2024/09-2025/01 | Qwen 2.5 Family, Qwen 2.5-Max | Qwen2.5: A Party of Foundation Models | 0.5B - 72B / Undisclosed | 32K-128K tokens | Apache 2.0 / Proprietary | Qwen Chat |
| QwQ-32B | Alibaba | 2024/11 | QwQ-32B-Preview | QwQ-32B Technical Report | 32B | 32K tokens | Apache 2.0 | Qwen Chat |
| Mistral Family | Mistral AI | 2023/09-2025/05 | Mistral-7B, Mistral Large 2, Mistral Medium | Mistral 7B | 7B - 123B / Undisclosed | 4K-128K tokens | Apache 2.0 / Proprietary | Mistral Platform |
Previous Generation Models

| Language Model | Release Date | Checkpoints | Paper/Blog | Params (B) | Context Length | Licence | Try it |
|---|---|---|---|---|---|---|---|
| GPT-4 / GPT-4.5 | 2023/03-2024/06 | API Access | GPT-4 Technical Report | Undisclosed | 8K-128K tokens | Proprietary | ChatGPT |
| LLaMA 2 | 2023/07 | LLaMA 2 Weights | Llama 2: Open Foundation and Fine-Tuned Chat Models | 7B - 70B | 4K tokens | Custom | HuggingChat |
| PaLM 2 | 2023/05 | PaLM 2 | PaLM 2 Technical Report | Undisclosed | 8K tokens | Proprietary | Bard |
| Bard | 2023/03 | Bard | Bard: An experiment by Google | Undisclosed | 8K tokens | Proprietary | Bard |
| Chinchilla | 2022/03 | Chinchilla | Training Compute-Optimal Large Language Models | 70B | 2K tokens | Proprietary | [Research Only] |
| Sparrow | 2022/09 | Sparrow | Improving alignment of dialogue agents via targeted human judgements | 70B | 4K tokens | Proprietary | [Research Only] |
| Gopher | 2021/12 | Gopher | Scaling Language Models: Methods, Analysis & Insights from Training Gopher | 280B | 2K tokens | Proprietary | [Research Only] |
| YaLM | 2022/06 | YaLM 100B | YaLM 100B | 100B | 2K tokens | Apache 2.0 | GitHub |
| OPT | 2022/05 | OPT Family | OPT: Open Pre-trained Transformer Language Models | 0.125B - 175B | 2K tokens | MIT | HuggingFace |
| BLOOM | 2022/11 | BLOOM | BLOOM: A 176B-Parameter Open-Access Multilingual Language Model | 176B | 2K tokens | OpenRAIL-M v1 | HuggingFace |
| Jurassic-1 / Jurassic-2 | 2021/08 / 2023/03 | AI21 Studio | Jurassic-1: Technical Details And Evaluation | 178B | 2K / 8K tokens | Proprietary | AI21 Studio |
| Anthropic LM (v4-s3) | 2022/12 | Anthropic LM | Constitutional AI: Harmlessness from AI Feedback | 52B | 4K tokens | Proprietary | [Research Only] |
| GLaM | 2021/12 | GLaM | GLaM: Efficient Scaling of Language Models with Mixture-of-Experts | 1.2T (64B active) | 2K tokens | Proprietary | [Research Only] |
| GPT-J / GPT-NeoX | 2021/06 / 2022/04 | GPT-J-6B, GPT-NeoX-20B | GPT-J-6B: 6B JAX-Based Transformer | 6B / 20B | 2K tokens | Apache 2.0 | HuggingFace |
| Minerva | 2022/06 | Minerva | Solving Quantitative Reasoning Problems with Language Models | 540B | 2K tokens | Proprietary | [Research Only] |
| Galactica | 2022/11 | Galactica | Galactica: A Large Language Model for Science | 120B | 2K tokens | Apache 2.0 | [Removed] |
| Vicuna | 2023/03 | Vicuna | Vicuna: An Open-Source Chatbot Impressing GPT-4 | 7B, 13B, 33B | 2K tokens | Custom | FastChat |
| Alpaca | 2023/03 | Stanford Alpaca | Stanford Alpaca: An Instruction-following LLaMA Model | 7B | 2K tokens | Custom | GitHub |
Coding-Specialized Models

| Language Model | Release Date | Checkpoints | Paper/Blog | Params (B) | Context Length | Licence | Try it |
|---|---|---|---|---|---|---|---|
| Code Llama | 2023/08 | Code Llama Models | Code Llama: Open Foundation Models for Code | 7B - 34B | 4K tokens | Custom | HuggingChat |
| StarCoder / StarChat | 2023/05 | StarCoder, StarChat | StarCoder: A State-of-the-Art LLM for Code | 1.1B - 16B | 8K tokens | OpenRAIL-M v1 | HuggingFace |
| CodeGen2 / CodeGen2.5 | 2023/04-07 | CodeGen2, CodeGen2.5 | CodeGen2: Lessons for Training LLMs on Programming and Natural Languages | 1B - 16B | 2K tokens | Apache 2.0 | HuggingFace |
| CodeT5+ | 2023/05 | CodeT5+ | CodeT5+: Open Code Large Language Models for Code Understanding and Generation | 0.22B - 16B | 512 tokens | BSD-3-Clause | GitHub |
| Replit Code | 2023/05 | replit-code-v1-3b | Training a SOTA Code LLM in 1 week | 2.7B | Unbounded (ALiBi) | CC BY-SA-4.0 | HuggingFace |
| SantaCoder | 2023/01 | SantaCoder | SantaCoder: don't reach for the stars! | 1.1B | 2K tokens | OpenRAIL-M v1 | HuggingFace |
| DeciCoder | 2023/08 | DeciCoder-1B | Introducing DeciCoder: The New Gold Standard in Efficient and Accurate Code Generation | 1.1B | 2K tokens | Apache 2.0 | HuggingFace |
Additional Historical Models

| Language Model | Release Date | Checkpoints | Paper/Blog | Params (B) | Context Length | Licence | Try it |
|---|---|---|---|---|---|---|---|
| T5 / Flan-T5 | 2019/10 | T5 & Flan-T5 | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | 0.06B - 11B | 512 tokens | Apache 2.0 | HuggingFace |
| UL2 / Flan-UL2 | 2022/10 | UL2 & Flan-UL2 | UL2 20B: An Open Source Unified Language Learner | 20B | 512-2K tokens | Apache 2.0 | HuggingFace |
| InstructGPT | 2022/03 | API Access | Training language models to follow instructions with human feedback | 1.3B - 175B | 2K tokens | Proprietary | [OpenAI API] |
| ChatGPT | 2022/11 | API Access | ChatGPT: Optimizing Language Models for Dialogue | ~175B | 4K tokens | Proprietary | ChatGPT |
| Pythia | 2023/04 | Pythia 70M - 12B | Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling | 0.07B - 12B | 2K tokens | Apache 2.0 | HuggingFace |
| Dolly | 2023/04 | dolly-v2-12b | Free Dolly: Introducing the World's First Truly Open Instruction-Tuned LLM | 3B, 7B, 12B | 2K tokens | MIT | HuggingFace |
| RedPajama-INCITE | 2023/05 | RedPajama-INCITE | Releasing 3B and 7B RedPajama-INCITE family of models | 3B - 7B | 2K tokens | Apache 2.0 | HuggingFace |
| Falcon | 2023/05 | Falcon-180B, Falcon-40B, Falcon-7B | The RefinedWeb Dataset for Falcon LLM | 7B, 40B, 180B | 2K tokens | Apache 2.0 | HuggingFace |
| MPT Family | 2023/05-06 | MPT-7B, MPT-30B | Introducing MPT-7B | 7B, 30B | 2K-8K tokens | Apache 2.0 | MosaicML |
| OpenLLaMA | 2023/05 | OpenLLaMA Models | OpenLLaMA: An Open Reproduction of LLaMA | 3B, 7B, 13B | 2K tokens | Apache 2.0 | HuggingFace |
| h2oGPT | 2023/05 | h2oGPT | Building the World's Best Open-Source Large Language Model | 12B - 20B | 256-2K tokens | Apache 2.0 | h2oGPT |
| FastChat-T5 | 2023/04 | fastchat-t5-3b-v1.0 | FastChat-T5: Compact and Commercial-friendly Chatbot | 3B | 512 tokens | Apache 2.0 | HuggingFace |
| StableLM | 2023/04 | StableLM-Alpha | Stability AI Launches StableLM Suite | 3B - 65B | 4K tokens | CC BY-SA-4.0 | HuggingFace |
| Koala | 2023/04 | Koala | Koala: A Dialogue Model for Academic Research | 13B | 4K tokens | Custom | BAIR |
| OpenHermes | 2023/09 | OpenHermes-7B, OpenHermes-13B | Nous Research OpenHermes | 7B, 13B | 4K tokens | MIT | HuggingFace |
| SOLAR | 2023/12 | Solar-10.7B | SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-scaling | 10.7B | 4K tokens | Apache 2.0 | HuggingFace |
| Phi-2 | 2023/12 | phi-2 | Phi-2: The surprising power of small language models | 2.7B | 2K tokens | MIT | HuggingFace |
| OpenLM | 2023/09 | OpenLM 1B, OpenLM 7B | Open LM: a minimal but performative language modeling repository | 1B, 7B | 2K tokens | MIT | HuggingFace |
| RWKV | 2021/08 | RWKV Models | The RWKV Language Model | 0.1B - 14B | Unbounded (RNN) | Apache 2.0 | HuggingFace |
| DLite | 2023/05 | dlite-v2-1_5b | Announcing DLite V2: Lightweight, Open LLMs | 0.124B - 1.5B | 1K tokens | Apache 2.0 | HuggingFace |
| Open Assistant | 2023/03 | OA-Pythia-12B | Democratizing Large Language Model Alignment | 12B | 2K tokens | Apache 2.0 | HuggingFace |
| Cerebras-GPT | 2023/03 | Cerebras-GPT | Cerebras-GPT: A Family of Open, Compute-efficient, Large Language Models | 0.111B - 13B | 2K tokens | Apache 2.0 | HuggingFace |
| XGen | 2023/06 | XGen-7B-8K-Base | Long Sequence Modeling with XGen | 7B | 8K tokens | Apache 2.0 | HuggingFace |

Key Developments in 2024

The year 2024 was transformative for the LLM landscape, with multiple breakthrough releases that established new benchmarks and capabilities:

OpenAI's Major Releases: GPT-4o, launched in May 2024, brought true multimodal capabilities with audio response times as low as 232 ms, while o1 and o1-mini, released in September, introduced reasoning models that spend more time "thinking" through problems, scoring 83% on AIME math olympiad qualifier problems compared to GPT-4o's 13%.

Anthropic's Claude 3 Family: The Claude 3 series (Haiku, Sonnet, Opus), launched in March 2024, was the first model family to challenge GPT-4's dominance on leaderboards. Claude 3.5 Sonnet followed in June and an upgraded Claude 3.5 Sonnet in October; Claude 3.7 Sonnet, released in February 2025, became particularly popular for coding tasks.

Google's Gemini Evolution: Gemini 1.5 Pro debuted in February 2024 with up to 2M token context windows, followed by Gemini 1.5 Flash in May for faster performance, and Gemini 2.0 Flash in December 2024.

Meta's Llama Progression: Llama 3 (8B, 70B) launched in April 2024, followed by the Llama 3.1 series in July, including the 405B parameter model, at the time the largest open-source model. Llama 3.2 brought multimodal capabilities in September, and Llama 3.3 concluded the year in December.

Microsoft's Phi Revolution: Microsoft's Phi-3 family proved that smaller models could punch above their weight, with Phi-3 Mini (3.8B parameters) matching much larger models on benchmarks. The series expanded with Phi-3 Small (7B), Phi-3 Medium (14B), and Phi-3.5 Mini throughout 2024.

Enterprise-Focused Models: IBM Granite 3.0 launched in October 2024 focused on enterprise use cases, while Cohere's Command R and Command R+ models excelled in retrieval-augmented generation tasks.

Google's Open Models: Gemma 2 (9B and 27B parameters), launched in June 2024, became highly popular in the open-source community, consistently ranking high in community evaluations.

Key Developments in 2025

The year 2025 has been marked by several breakthrough releases in the LLM landscape. Grok 3, launched by xAI in February 2025, introduced a 1 million token context window and reached an Elo score of 1402 on Chatbot Arena, the first model to cross the 1400 mark. It was reportedly trained on 12.8 trillion tokens, using roughly 10x the training compute of its predecessor.

Meta's Llama 4 family represents a major leap forward with the introduction of Mixture-of-Experts (MoE) architecture. Llama 4 Scout features an unprecedented 10 million token context window, while Llama 4 Maverick achieves an Elo score of 1417 on the LMSYS Chatbot Arena, outperforming GPT-4o and Gemini 2.0 Flash.

DeepSeek-R1 emerged as the first major open-source reasoning model. Its precursor, DeepSeek-R1-Zero, was trained purely through reinforcement learning without supervised fine-tuning; DeepSeek-R1 adds a small cold-start supervised stage before reinforcement learning. The models demonstrate performance comparable to OpenAI's o1 across math, code, and reasoning tasks while being fully open under the MIT license.

Qwen 3, released by Alibaba in April 2025, features a family of "hybrid" reasoning models ranging from 0.6B to 235B parameters, supporting 119 languages and trained on over 36 trillion tokens. The models seamlessly integrate thinking and non-thinking modes, offering users flexibility to control the thinking budget.

OpenAI continued its reasoning model series with o3 and o4-mini in April 2025, while Anthropic launched Claude 4 (Opus 4 and Sonnet 4) in May 2025, setting new standards for coding and advanced reasoning with extended thinking capabilities and tool use.

Google's Gemini 2.5 Pro debuted as a thinking model with a 1 million token context window, leading on LMArena leaderboards and excelling in coding, math, and multimodal understanding tasks.

Notable Trends in 2025

  1. Reasoning Models: The emergence of models that can "think" through problems step-by-step, with extended reasoning capabilities becoming standard.

  2. Massive Context Windows: Models now support context windows ranging from 1M to 10M tokens, enabling processing of entire codebases and documents.

  3. Mixture-of-Experts (MoE) Architecture: More efficient model architectures that activate only a subset of parameters during inference.

  4. Open-Source Reasoning: DeepSeek-R1's success has democratized access to reasoning capabilities previously available only in proprietary models.

  5. Multimodal Integration: Native multimodality becoming standard, with models trained on text, images, audio, and video from the ground up.

  6. Tool Use and Agentic Capabilities: Enhanced ability to use tools, execute code, and perform complex multi-step tasks autonomously.
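
The MoE trend in item 3 can be sketched in a few lines. The snippet below is an illustrative, simplified router (all names are hypothetical; real models such as Llama 4 or DeepSeek-V3 use learned routers over transformer FFN blocks), showing why a model's "active" parameter count is far below its total:

```python
# Minimal sketch of top-k Mixture-of-Experts routing (illustrative only).
import math
import random

random.seed(0)

NUM_EXPERTS = 8   # total expert FFNs in the layer
TOP_K = 2         # experts activated per token ("active" parameters)

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def route(token_logits):
    """Pick the TOP_K highest-scoring experts and renormalize their weights."""
    probs = softmax(token_logits)
    top = sorted(range(NUM_EXPERTS), key=lambda i: probs[i], reverse=True)[:TOP_K]
    total = sum(probs[i] for i in top)
    return [(i, probs[i] / total) for i in top]

# One token's router scores (random stand-ins for a learned projection):
logits = [random.gauss(0, 1) for _ in range(NUM_EXPERTS)]
for expert, weight in route(logits):
    print(f"expert {expert}: weight {weight:.2f}")

# Only TOP_K of NUM_EXPERTS expert blocks run per token, which is why a
# 671B-parameter model can report only ~37B "active" parameters.
```

Per token, only the selected experts' weights participate in the forward pass, so inference cost scales with active rather than total parameters.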

Performance Benchmarks (2025)

Reasoning Benchmarks (AIME 2025)

  • Grok 3: 93.3%
  • DeepSeek-R1-0528: 87.5%
  • Gemini 2.5 Pro: 86.7%
  • o3-mini: 86.5%

Coding Benchmarks (SWE-bench Verified)

  • Claude Opus 4: 72.5%
  • Claude Sonnet 4: 72.7%
  • OpenAI Codex 1: 72.1%
  • Llama 4 Maverick: ~70%

Context Window Sizes (1M+ tokens)

  • Llama 4 Scout: 10M tokens
  • Grok 3: 1M tokens
  • Gemini 2.5 Pro: 1M tokens
  • Llama 4 Maverick: 1M tokens
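
To put these windows in concrete terms, a common rough heuristic is ~4 characters per token for English text and code (the exact ratio is tokenizer-dependent; the heuristic and the example corpus size below are assumptions, not measurements):

```python
# Rough check of whether a text corpus fits in a given context window,
# using the common ~4 characters/token heuristic (tokenizer-dependent).
CHARS_PER_TOKEN = 4

CONTEXT_WINDOWS = {           # context sizes from the list above
    "Llama 4 Scout": 10_000_000,
    "Grok 3": 1_000_000,
    "Gemini 2.5 Pro": 1_000_000,
}

def estimated_tokens(num_chars: int) -> int:
    """Estimate token count from character count via the 4-chars/token rule."""
    return num_chars // CHARS_PER_TOKEN

# Example: a hypothetical 2-million-character codebase (~500K tokens).
codebase_chars = 2_000_000
tokens = estimated_tokens(codebase_chars)
for model, window in CONTEXT_WINDOWS.items():
    verdict = "fits in" if tokens <= window else "exceeds"
    print(f"{model}: ~{tokens:,} tokens {verdict} the {window:,}-token window")
```

By this estimate, even a 1M-token window holds a few million characters of text, and a 10M-token window an order of magnitude more.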

Model Evolution Timeline

2022: Foundation Era

  • ChatGPT revolutionized conversational AI
  • InstructGPT introduced instruction following
  • Large proprietary models dominated (GPT-3, PaLM, Chinchilla)

2023: Open Source Awakening

  • LLaMA sparked the open-source revolution
  • Claude introduced constitutional AI
  • Specialized coding models emerged (Code Llama, StarCoder)
  • Model sizes optimized for efficiency (Phi, Mistral)

2024: Multimodal & Reasoning Breakthrough

  • GPT-4o achieved true multimodality
  • o1 introduced step-by-step reasoning
  • Claude 3 challenged GPT-4 dominance
  • Llama 3.1 405B became largest open model
  • Gemini 1.5 pushed context limits to 2M tokens

2025: The Reasoning Revolution

  • Grok 3 achieved highest Arena scores
  • DeepSeek-R1 democratized reasoning capabilities
  • Llama 4 introduced 10M token contexts
  • Claude 4 set new coding standards
  • Qwen 3 pioneered hybrid reasoning modes

Citation

If you find our survey useful for your research, please cite the following paper:

@article{hadi2024large,
  title={Large language models: a comprehensive survey of its applications, challenges, limitations, and future prospects},
  author={Hadi, Muhammad Usman and Al Tashi, Qasem and Shah, Abbas and Qureshi, Rizwan and Muneer, Amgad and Irfan, Muhammad and Zafar, Anas and Shaikh, Muhammad Bilal and Akhtar, Naveed and Wu, Jia and others},
  journal={Authorea Preprints},
  year={2024},
  publisher={Authorea}
}

Model Organization Summary

By Company/Organization:

πŸ”΄ Proprietary Models:

  • OpenAI: GPT-4, GPT-4.5, GPT-4o, o1, o3, o4-mini, ChatGPT, InstructGPT
  • Anthropic: Claude 3 Family, Claude 3.5, Claude 3.7, Claude 4, Anthropic LM
  • Google/DeepMind: Gemini 2.5, Gemini 2.0, Gemini 1.5, PaLM 2, Bard, T5, UL2, Chinchilla, Sparrow, Gopher, GLaM, Minerva
  • xAI: Grok 3, Grok 3 Mini
  • AI21 Labs: Jurassic-1, Jurassic-2
  • Mistral AI: Mistral 7B, Mistral Large 2, Mistral Medium

🟒 Open Source Models:

  • Meta: Llama 4, Llama 3.x, Llama 2, OPT, Code Llama, Galactica
  • Alibaba: Qwen 3, Qwen 2.5, QwQ-32B
  • DeepSeek: DeepSeek-R1, DeepSeek-V3
  • Microsoft: Phi-3 Family, Phi-2
  • IBM: Granite 3.0, Granite 3.1
  • Google: Gemma 2
  • Cohere: Command R, Command R+
  • BigScience: BLOOM
  • EleutherAI: GPT-J, GPT-NeoX, Pythia
  • BigCode: StarCoder, StarChat, SantaCoder
  • Salesforce: CodeGen2, CodeT5+, XGen
  • TIIUAE: Falcon
  • Upstage: SOLAR

πŸŽ“ Academic/Research:

  • LMSYS: Vicuna, FastChat-T5
  • Stanford: Alpaca
  • UC Berkeley: Koala
  • LAION: Open Assistant
  • OpenLM Research: OpenLLaMA
  • MLFoundations: OpenLM

🏒 Other Companies:

  • Yandex: YaLM
  • Replit: Replit Code
  • H2O.ai: h2oGPT
  • Databricks: Dolly
  • Together: RedPajama-INCITE
  • MosaicML: MPT Family
  • Stability AI: StableLM
  • Nous Research: OpenHermes
  • Cerebras: Cerebras-GPT
  • Deci AI: DeciCoder
  • AI Squared: DLite
  • BlinkDL: RWKV

By Model Type:

🧠 Reasoning Models (2024-2025):

  • OpenAI: o1, o1-mini, o3, o3-mini, o4-mini
  • DeepSeek: DeepSeek-R1 Family
  • Alibaba: QwQ-32B, Qwen 3 (hybrid reasoning)
  • Google: Gemini 2.5 (thinking models)

πŸ’¬ Conversational Models:

  • OpenAI: ChatGPT, GPT-4o
  • Anthropic: Claude 3/4 Family
  • Google: Bard, Gemini
  • xAI: Grok 3

πŸ’» Code-Specialized:

  • Meta: Code Llama
  • BigCode: StarCoder, SantaCoder
  • Salesforce: CodeGen2, CodeT5+
  • Replit: Replit Code
  • Deci AI: DeciCoder

🌐 Multimodal:

  • OpenAI: GPT-4o
  • Google: Gemini 2.0/2.5
  • Meta: Llama 4, Llama 3.2

⚑ Efficient/Small:

  • Microsoft: Phi-3 Family, Phi-2
  • Google: Gemma 2
  • AI Squared: DLite
  • Upstage: SOLAR

Last updated: July 2025
Survey paper: https://www.techrxiv.org/doi/full/10.36227/techrxiv.23589741.v3
