Megatron-LLM enables pre-training and fine-tuning of large language models (LLMs) at scale. It supports architectures like Llama, Llama 2, Code Llama, Falcon, and Mistral. The library allows training of large models (up to 70B parameters) on commodity hardware using tensor, pipeline, and data parallelism. It provides features like grouped-query attention, rotary position embeddings, BF16/FP16 training, and integration with Hugging Face and WandB.
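To make the parallelism claim concrete, here is a minimal, illustrative Python sketch (not Megatron-LLM's actual API) of how the three parallelism degrees combine: in 3D parallelism, the total GPU count is the product of the tensor-, pipeline-, and data-parallel sizes. The function name and the example configuration are hypothetical.

```python
# Illustrative sketch only -- not Megatron-LLM's API.
# In 3D parallelism, the world size (total GPUs) is the product of the
# tensor-parallel, pipeline-parallel, and data-parallel degrees.

def required_gpus(tensor_parallel: int, pipeline_parallel: int, data_parallel: int) -> int:
    """Total GPUs needed for a given 3D-parallel configuration."""
    return tensor_parallel * pipeline_parallel * data_parallel

if __name__ == "__main__":
    # Hypothetical configuration for a large (e.g. 70B-parameter) model:
    # shard each layer across 8 GPUs (tensor parallelism), split the layer
    # stack into 4 pipeline stages, and replicate that arrangement twice
    # for data parallelism.
    tp, pp, dp = 8, 4, 2
    print(f"TP={tp} x PP={pp} x DP={dp} -> {required_gpus(tp, pp, dp)} GPUs")
```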