Last month, DeepMind published a paper, “Training Compute-Optimal Large Language Models”, which argues that OpenAI, DeepMind, Microsoft, and others have trained large language models with a deeply suboptimal use of compute. DeepMind also proposed new scaling laws for optimal compute use and trained a new 70-billion-parameter model, Chinchilla, that outperforms much larger language models, including GPT-3 (175 billion parameters) and Gopher (280 billion parameters).
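The paper's headline result can be summarised with a rough rule of thumb: training compute is approximately 6 × parameters × tokens FLOPs, and a compute-optimal model should be trained on roughly 20 tokens per parameter. The sketch below (an illustration of that heuristic, not code from the paper) shows how a FLOP budget splits into a model size and token count under those assumptions:

```python
def optimal_allocation(compute_flops):
    """Split a FLOP budget into a compute-optimal (params, tokens) pair.

    Assumes the common approximations C ~= 6 * N * D and the
    Chinchilla heuristic D ~= 20 * N, so C = 6 * N * (20 * N) = 120 * N**2.
    """
    n_params = (compute_flops / 120) ** 0.5
    n_tokens = 20 * n_params
    return n_params, n_tokens

# A Chinchilla-scale budget (~70B parameters, ~1.4T tokens):
n, d = optimal_allocation(6 * 70e9 * 1.4e12)
print(f"params ~ {n:.2e}, tokens ~ {d:.2e}")
```

Under this heuristic, GPT-3's 175 billion parameters would call for far more training tokens than it actually saw, which is the sense in which earlier models were "undertrained" for their size.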
Reacting to these developments, Russell Kaplan, Head of Nucleus at Scale AI, wrote a series of tweets about the “second-order effects” of the rise of large language models. Let’s break the thread down.
Pay tax to companies creating large language models
Russell said in his Twitter thread that companies building products will have to embed intelligence into them using large language models, such as adding Copilot to VS Code, DALL·E 2 to Photoshop, or GPT-3 to Google Docs. These companies will either need their own large language models or have to pay a “tax” to use them from OpenAI, Google, etc.
This tweet can be read alongside a paper by AI researcher Timnit Gebru and her colleagues, “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?”, which discusses the variety of costs and risks associated with ever-larger language models.
Poor compute will depend on rich compute
Russel said that the development of large language models could create a divide among the tech companies,...
Read Full Story: https://analyticsindiamag.com/it-will-get-weirder-with-the-rise-of-large-language-models/