Is your feature request related to a problem? Please describe.
Describe the solution you'd like
- Extend the existing Pre-LN code path with an output LayerNorm (output-LN), applied to each sublayer's output before the residual addition, to enable Peri-LN support. A new command-line argument is also required to turn it on.
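
To make the request concrete, here is a minimal pure-Python sketch of the difference between the two placements. It is not Megatron-LM code: `layer_norm`, `pre_ln_sublayer`, `peri_ln_sublayer`, `apply_sublayer`, and the `--use-peri-ln` flag name are all hypothetical names chosen for illustration, and the normalization omits the learned scale and bias that a real layer would have.

```python
import argparse
import math


def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and (near) unit variance.

    Simplification: no learned scale/bias, unlike a real LayerNorm layer.
    """
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def pre_ln_sublayer(x, module):
    """Pre-LN: y = x + Module(LN(x))."""
    out = module(layer_norm(x))
    return [a + b for a, b in zip(x, out)]


def peri_ln_sublayer(x, module):
    """Peri-LN: y = x + LN(Module(LN(x))). The outer LN is the added output-LN."""
    out = layer_norm(module(layer_norm(x)))
    return [a + b for a, b in zip(x, out)]


def build_parser():
    parser = argparse.ArgumentParser()
    # Hypothetical flag name for illustration; not an existing Megatron-LM argument.
    parser.add_argument("--use-peri-ln", action="store_true",
                        help="Apply an output LayerNorm after each sublayer.")
    return parser


def apply_sublayer(x, module, use_peri_ln):
    """Select the sublayer wrapper based on the (hypothetical) flag."""
    if use_peri_ln:
        return peri_ln_sublayer(x, module)
    return pre_ln_sublayer(x, module)
```

In this sketch the flag defaults to off, so existing Pre-LN behavior is unchanged unless Peri-LN is requested explicitly; `module` stands in for the attention or MLP sublayer.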
Describe alternatives you've considered
- None
Proposed implementation
- To be updated (TBU)
Additional context
- A prior issue requesting support for the original Gemma architecture was closed as stale:
  - Does Megatron has plan to support Gemma? #707
- Adding Peri-LN would enable Megatron-LM to support the Gemma-family and Olmo 2 architectures.