
Releases: NVIDIA/NeMo

NVIDIA Neural Modules 2.3.1

25 May 22:04
abddc85

Highlights

  • Collections
    • LLM
      • Llama 4: Fixed an accuracy issue caused by MoE probability normalization; improved pre-training and fine-tuning performance.
  • Export & Deploy
    • Updated vLLMExporter to use vLLM V1 to address a security vulnerability (a usage sketch follows this list).
  • AutoModel
    • Improved chat-template handling.
  • Fault Tolerance
    • Local checkpointing: Fixed support for auto-inserted metric names when resuming from local checkpoints.
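
A minimal sketch of the vLLM export flow is below. The import path, argument names, and file paths follow the exporter pattern from the NeMo export/deploy documentation and are assumptions for illustration, not a verified 2.3.1 API.

```python
# Minimal sketch: export a NeMo 2.0 checkpoint through vLLMExporter and run a test prompt.
# NOTE: import path, argument names, and file paths are illustrative assumptions;
# consult the NeMo export/deploy docs for the exact 2.3.1 signature.
from nemo.export.vllm_exporter import vLLMExporter

exporter = vLLMExporter()
exporter.export(
    nemo_checkpoint="/checkpoints/llama-3_1-8b-nemo2",  # hypothetical checkpoint path
    model_dir="/tmp/vllm_export",                        # hypothetical output directory
)

# Quick smoke test of the exported model.
outputs = exporter.forward(["What is the capital of France?"], max_output_len=32)
print(outputs)
```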

Detailed Changelogs:

Uncategorized:

Changelog
  • Bump to 2.3.1 by @chtruong814 :: PR: #13507
  • Cherry pick Use explicitly cached canary-1b-flash in CI tests (13237) into r2.3.0 by @ko3n1g :: PR: #13508
  • Cherry pick [automodel] bump liger-kernel to 0.5.8 + fallback (13260) into r2.3.0 by @ko3n1g :: PR: #13308
  • Cherry-pick Add recipe and ci scripts for qwen2vl to r2.3.0 by @romanbrickie :: PR: #13336
  • Cherry pick Fix skipme handling (13244) into r2.3.0 by @ko3n1g :: PR: #13376
  • Cherry pick Allow fp8 param gather when using FSDP (13267) into r2.3.0 by @ko3n1g :: PR: #13383
  • Cherry pick Handle boolean args for performance scripts and log received config (13291) into r2.3.0 by @ko3n1g :: PR: #13416
  • Cherry pick new perf configs (13110) into r2.3.0 by @ko3n1g :: PR: #13431
  • Cherry pick Adding additional unit tests for the deploy module (13411) into r2.3.0 by @ko3n1g :: PR: #13449
  • Cherry pick Adding more export tests (13410) into r2.3.0 by @ko3n1g :: PR: #13450
  • Cherry pick [automodel] add FirstRankPerNode (13373) into r2.3.0 by @ko3n1g :: PR: #13559
  • Cherry pick [automodel] deprecate global_batch_size dataset argument (13137) into r2.3.0 by @ko3n1g :: PR: #13560
  • Cherry-pick [automodel] fallback FP8 + LCE -> FP8 + CE (#13349) into r2.3.0 by @chtruong814 :: PR: #13561
  • Cherry pick [automodel] add find_unused_parameters=True for DDP (13366) into r2.3.0 by @ko3n1g :: PR: #13601
  • Cherry pick Add CI test for local checkpointing (#13012) into r2.3.0 by @ananthsub :: PR: #13472
  • Cherry pick [automodel] fix --mbs/gbs dtype and chat-template (13598) into r2.3.0 by @akoumpa :: PR: #13613
  • Cherry-pick Update t5.py (#13082) to r2.3.0 and bump mcore to f98b1a0 by @chtruong814 :: PR: #13642
  • [Automodel] Fix CP device_mesh issue, use PTL distsampler (#13473) by @akoumpa :: PR: #13636
  • [Llama4] Fix the recipe bug - cherrypick #13649 by @gdengk :: PR: #13650
  • build: Pin transformers (#13675) by @ko3n1g :: PR: #13692

NVIDIA Neural Modules 2.3.0

08 May 23:42
2b03b74

Highlights

  • Export & Deploy
    • NeMo 2.0 export path for NIM
    • ONNX and TensorRT Export for NIM Embedding Container
    • In-framework deployment for HF Models
    • TRT-LLM deployment for HF Models in NeMo Framework
  • Evaluation
    • Integrated nvidia-lm-eval into the NeMo Framework for evaluation against OpenAI-API-compatible in-framework deployments (a query sketch follows this list)
  • AutoModel
      • VLM AutoModelForImageTextToText
    • FP8 for AutoModel
    • Support CP with FSDP2
    • Support TP with FSDP2
    • Performance Optimization
        • Added support for Cut Cross-Entropy and Liger Kernel
      • Gradient Checkpointing
  • Fault Tolerance
    • Integrate NVRx v0.3 Local checkpointing
  • Collections
    • LLM
      • Llama4
      • Llama Nemotron Ultra
      • Llama Nemotron Super
      • Llama Nemotron Nano
      • Nemotron-h/5
      • DeepSeek V3 Pretraining
      • Evo2
      • Qwen 2.5
      • LoRA for Qwen3-32B and Qwen3-30B-A3B
    • MultiModal
      • FLUX
      • Gemma 3
      • Qwen2-VL
    • ASR
      • NeMo Run support for ASR training
      • N-Gram LM on GPU for AED
      • N-Gram LM on GPU + Transducer greedy decoding (RNN-T, TDT)
      • Timestamp support for AED models
      • Migrate SpeechLM to NeMo 2.0
      • Canary-1.1
      • Replace ClassificationModels class with LabelModels
  • Performance
    • Functional MXFP8 support for (G)B200
    • Current scaling recipe with TP communication overlap and FP8 param gathers
    • Custom FSDP support that fully utilizes GB200 NVL72
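
Once a model is served through the OpenAI-API-compatible in-framework deployment, nvidia-lm-eval (or any OpenAI-compatible client) can evaluate it over HTTP. A minimal sketch with the standard openai Python client follows; the endpoint URL and served-model name are assumptions for illustration.

```python
# Minimal sketch: query an OpenAI-API-compatible in-framework deployment.
# The base_url and model name are assumptions; substitute the values reported
# by your deployment when it starts.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # hypothetical local deployment endpoint
    api_key="not-needed",                 # local servers typically ignore the key
)

response = client.completions.create(
    model="megatron_model",               # hypothetical served-model name
    prompt="Explain context parallelism in one sentence.",
    max_tokens=64,
)
print(response.choices[0].text)
```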


NVIDIA Neural Modules 2.3.0rc4

21 Apr 23:24
b9abb0a
Pre-release

Prerelease: NVIDIA Neural Modules 2.3.0rc4 (2025-04-21)

NVIDIA Neural Modules 2.3.0rc3

15 Apr 18:22
3d04c86
Pre-release

Prerelease: NVIDIA Neural Modules 2.3.0rc3 (2025-04-15)

NVIDIA Neural Modules 2.3.0rc2

07 Apr 21:36
9ff7e75
Pre-release

Prerelease: NVIDIA Neural Modules 2.3.0rc2 (2025-04-07)

NVIDIA Neural Modules 2.2.1

31 Mar 21:31
132f217

Highlights

  • Training
    • Fixed training instability in MoE-based models.
    • Fixed a bug in the Llama exporter for Llama 3.2 1B and 3B.
    • Fixed a bug in the LoRA linear_fc1 adapter when different TP sizes are used for saving and loading the adapter checkpoint.

Detailed Changelogs:

Uncategorized:

Changelog
  • Re-add reverted commits after 2.2.0 and set next version to be 2.2.1 by @chtruong814 :: PR: #12587
  • Cherry pick Fix exporter for llama models with shared embed and output layers (12545) into r2.2.0 by @ko3n1g :: PR: #12608
  • Cherry pick Fix TP for LoRA adapter on linear_fc1 (12519) into r2.2.0 by @ko3n1g :: PR: #12607
  • Bump mcore to use 0.11.1 by @chtruong814 :: PR: #12634

NVIDIA Neural Modules 2.2.0

12 Mar 20:30
7192a2c

Highlights

  • Training
    • Blackwell and Grace Blackwell support
    • Pipeline parallel support for distillation
    • Improved NeMo Framework installation
  • Export & Deploy
    • vLLM export for NeMo 2.0
  • Evaluations
    • Integrate lm-eval-harness
  • Collections
    • LLM
      • DAPT example and best practices in NeMo 2.0
      • [NeMo 2.0] Enable Tool Learning and add a tutorial
      • Support GPT Embedding Model (Llama 3.2 1B/3B)
      • Qwen2.5 and Phi-4 (via AutoModel)
      • SFT for Llama 3.3 model (via AutoModel)
      • Support BERT Embedding Model with NeMo 2.0
      • DeepSeek SFT & PEFT Support
    • MultiModal
      • Clip
      • SP for NeVA
      • CP for NeVA
      • InternViT
  • Automodel
    • Preview release.
    • PEFT and SFT support for LLMs available via Hugging Face’s AutoModelForCausalLM (a LoRA sketch follows this list).
    • Support for Hugging Face-native checkpoints (full model and adapter only).
    • Support for distributed training via DDP and FSDP2.
  • ASR/TTS
    • Lhotse: TPS-free 2D bucket estimation and filtering
    • Updated model outputs so that all ASR outputs use a consistent format
    • Sortformer model release
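
The AutoModel preview builds on Hugging Face's own model classes, so the underlying flow resembles a standard transformers + peft LoRA setup. The sketch below shows that flow outside of NeMo; the model name and LoRA hyperparameters are placeholders, and NeMo's AutoModel recipes wrap the equivalent steps rather than requiring them by hand.

```python
# Minimal sketch of the Hugging Face flow the AutoModel path builds on:
# load a causal LM with AutoModelForCausalLM and attach a LoRA adapter via peft.
# The model name and LoRA hyperparameters are illustrative placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "meta-llama/Llama-3.2-1B"  # hypothetical example checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # attention projections commonly targeted by LoRA
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA adapter weights remain trainable
```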

Detailed Changelogs:

Bugfixes

Changelog
  • Added required installation of sox to process mp3 files by @Ssofja :: PR: #11709
  • Removed the line which caused a problem in nfa_tutorial by @Ssofja :: PR: #11710
  • Fixed a minor bug in TRT-LLM deployment by @oyilmaz-nvidia :: PR: #11714


NVIDIA Neural Modules 2.2.0rc3

25 Feb 12:47
b21e079
Pre-release

Prerelease: NVIDIA Neural Modules 2.2.0rc3 (2025-02-25)

NVIDIA Neural Modules 2.2.0rc2

17 Feb 17:04
798b676
Pre-release

Prerelease: NVIDIA Neural Modules 2.2.0rc2 (2025-02-17)

NVIDIA Neural Modules 2.2.0rc1

04 Feb 08:02
18e2bd8
Pre-release

Prerelease: NVIDIA Neural Modules 2.2.0rc1 (2025-02-04)