VentureBeat October 15, 2021
This week, Microsoft and Nvidia announced that they had trained what they claim is one of the largest and most capable AI language models to date: Megatron-Turing Natural Language Generation (MT-NLG). MT-NLG contains 530 billion parameters (the parts of the model learned from historical data) and achieves leading accuracy across a broad set of tasks, including reading comprehension and natural language inference.
But building it didn’t come cheap. Training took place across 560 Nvidia DGX A100 servers, each containing 8 Nvidia A100 80GB GPUs, for a total of 4,480 GPUs. Experts peg the cost in the millions of dollars.
Like other large AI systems, MT-NLG raises questions about the accessibility of cutting-edge research approaches in machine learning. AI training costs dropped 100-fold between 2017 and...