
A brief look at emergent abilities in large language models

Much of the artificial intelligence research of the past two decades has focused on training neural networks to perform a single task on a specific training dataset: for example, classifying whether an image contains a cat, summarizing an article, or translating from English to Swahili.

In recent years, a new paradigm has emerged around language models: neural networks that simply predict the next word in a sentence given the preceding words.

After being trained on a large body of unlabeled text, language models can be "prompted" to perform arbitrary tasks by framing them as next-word prediction. For example, translating an English phrase into Swahili can be rephrased as predicting the next word: "The Swahili translation of 'artificial intelligence' is ..."
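To make the idea concrete, here is a toy sketch (not from the article) of next-word prediction: a bigram model trained on a tiny corpus that predicts the most likely next word given the previous one. A real large language model does the same thing at vastly larger scale, which is why a task can be recast as a prompt string to complete.

```python
from collections import Counter, defaultdict

# Toy illustration: a bigram "language model" that predicts the next word
# given the previous one, trained on a tiny hand-made corpus.
corpus = "the cat sat on the mat the cat ate the fish".split()

counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def predict_next(word):
    """Return the most frequent word observed after `word` in the corpus."""
    return counts[word].most_common(1)[0][0]

# A task can be recast as next-word prediction via a prompt string;
# a real large language model would complete this prompt, while our toy
# bigram model only knows its tiny corpus:
prompt = "The Swahili translation of 'artificial intelligence' is"
print(predict_next("the"))  # "cat" ("cat" follows "the" twice in the corpus)
```

The point of the sketch is only the interface: everything the model does, from translation to arithmetic, is expressed as "given these words, what comes next?".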

From task-specific to task-general

This new paradigm represents a shift from task-specific models, trained to perform a single task, to task-general models, which can perform many tasks. Moreover, task-general models can also perform new tasks that were never explicitly included in their training data. For example, GPT-3 showed that language models can successfully multiply two-digit numbers, even though they were not explicitly trained to do so. However, this ability to perform new tasks appeared only in models with a sufficient number of parameters trained on a sufficiently large dataset.
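As a hedged illustration (not the paper's evaluation protocol), the two-digit multiplication example can be framed as a prompt plus a simple grading check on the model's completion; the prompt template and grading rule below are assumptions for illustration only.

```python
# Illustrative sketch: frame two-digit multiplication as next-word
# prediction and grade a model's free-text completion.
def make_prompt(a, b):
    """Build a hypothetical prompt asking the model for a product."""
    return f"Q: What is {a} times {b}? A:"

def grade(completion, a, b):
    """True if the completion contains the correct product as a substring."""
    return str(a * b) in completion

prompt = make_prompt(23, 47)
# A hypothetical model completion that would be graded as correct:
print(grade("23 times 47 is 1081.", 23, 47))  # True, since 23 * 47 = 1081
```

No task-specific head or fine-tuning is involved: the same next-word interface is simply asked an arithmetic question.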

Emergence as a behavior

The idea that quantitative changes in a system can produce new behavior is known as emergence, a concept popularized by Nobel laureate Philip Anderson's 1972 essay "More Is Different". Emergent phenomena have been observed in complex systems across many disciplines, including physics, biology, economics, and computer science.

In a recent article published in Transactions on Machine Learning Research, Stanford University's HAI lab defines emergent abilities in large language models as follows:

An ability is emergent if it is not present in smaller models but is present in larger models.

To characterize the presence of emergent abilities, the article aggregates findings from various models and approaches that have appeared in the two years since the release of GPT-3. The paper examines research analyzing the influence of scale: models of different sizes trained with different computational resources. For many tasks, a model's behavior either grows predictably with scale or jumps unpredictably from random performance to above-random performance at a specific scale threshold.
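The scale-threshold pattern described above can be sketched in a few lines. The numbers below are illustrative assumptions, not data from the paper: accuracy hovers at the random baseline until a certain model size, then jumps above it.

```python
# Hedged sketch with made-up numbers: find the scale at which task
# performance first clearly exceeds a random-guessing baseline.
scales = [1e8, 1e9, 1e10, 1e11, 1e12]       # model parameters (hypothetical)
accuracy = [0.25, 0.26, 0.24, 0.55, 0.80]   # task accuracy (hypothetical)
random_baseline = 0.25                       # e.g. 4-way multiple choice
margin = 0.05                                # tolerance for noise

def emergence_threshold(scales, accuracy, baseline, margin):
    """Return the first scale whose accuracy clears baseline + margin."""
    for s, a in zip(scales, accuracy):
        if a > baseline + margin:
            return s
    return None

print(emergence_threshold(scales, accuracy, random_baseline, margin))
```

Under these made-up numbers, the first three model sizes perform at chance and the threshold falls at the fourth; this is the "unpredictable jump" shape that distinguishes an emergent ability from a smoothly improving one.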

To learn more, read the article on emergent abilities in language models.

Jason Wei is a research scientist at Google Brain. Rishi Bommasani is a second-year doctoral student in Stanford's Department of Computer Science who helped launch the Stanford Center for Research on Foundation Models (CRFM). Read their study, "Emergent Abilities of Large Language Models," written in collaboration with researchers from Google Research, Stanford University, UNC Chapel Hill, and DeepMind.

Staff BlogInnovazione.it
