― H.P. Lovecraft, At the Mountains of Madness
Horror fans might be familiar with author H.P. Lovecraft's fictional “shoggoths”, the shape-shifting and amorphous entities that he wrote about in his Cthulhu Mythos.
In the context of AI emergence, the term "shoggoth" is sometimes used to refer to a hypothetical, highly advanced future form of artificial intelligence. It highlights the idea of an AI system that can rapidly learn, evolve, and assimilate new information and skills, much as Lovecraft's shoggoths change their forms and abilities.
Much has been made of so-called emergent abilities in AI. These are skills that are observed to arise unexpectedly and unpredictably within AI systems – like a shoggoth, rising from the depths.
Over the past year, there has been a growing focus on the idea of emergent abilities. Intelligent machines have continued to acquire new skills while, at the same time, their inner workings have become less transparent and progressively harder for us to understand.
Recently, however, a new paper by Stanford researchers challenged the nature of the emergent abilities currently observed in large language models (LLMs) such as GPT-3, PaLM, and LaMDA.
What are 'emergent abilities'?
A previous study defined emergent abilities as abilities that are absent in smaller models but present in larger ones. The implication is that a machine learning model's performance remains near random until the model reaches a specific size threshold, after which it is expected to improve in a sudden leap.
Experts have cautioned that sudden, unexpected advances of this kind would be a matter of concern: they could mean losing control of the AI system in question. At the very least, a sudden and unpredictable emergence of capabilities could give rise to deception or malice in the models.
Then AI researchers and industry leaders began claiming that some current LLMs were unexpectedly exhibiting skills or knowledge beyond their intended programming…
The Stanford research
But the Stanford researchers say the observed emergent abilities are not genuine (yet, perhaps). Rather, they are a product of biased testing, cherry-picked examples, a lack of sufficient data, and the use of the wrong metrics to measure performance. They suggest that choosing a "non-linear" metric can create the appearance of sudden changes in performance when the improvement is in fact gradual, whereas "linear" metrics show more predictable progress.
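To see how metric choice alone can manufacture an apparent jump, consider a minimal toy sketch. This is not code from the Stanford paper; the model sizes, the accuracy curve, and the 20-token answer length below are all invented for illustration. The idea: if a hypothetical model's per-token accuracy improves smoothly with scale, an all-or-nothing metric such as exact match on a multi-token answer can still look like a sudden leap.

```python
import math

def per_token_accuracy(params: float) -> float:
    """Invented, smoothly increasing per-token accuracy as a function of model size."""
    # Maps 1e7 .. 1e11 parameters onto roughly 0.55 .. 0.99 accuracy, log-linearly.
    return min(0.99, 0.55 + 0.11 * math.log10(params / 1e7))

model_sizes = [1e7, 1e8, 1e9, 1e10, 1e11]
answer_length = 20  # number of tokens that must ALL be correct for an exact match

for n in model_sizes:
    p = per_token_accuracy(n)
    exact_match = p ** answer_length  # "non-linear", all-or-nothing metric
    print(f"{n:.0e} params | per-token accuracy {p:.2f} | exact match {exact_match:.4f}")
```

Running this, per-token accuracy climbs steadily from 0.55 to 0.99, while exact match sits near zero until the largest model and then leaps to roughly 0.82. The underlying improvement is gradual; only the metric makes it look like a threshold was crossed.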
This study is significant because it challenges the idea that emergent abilities are necessarily an inherent characteristic of scaling AI models.
While the researchers clarify that they are not saying large language models are incapable of demonstrating such emergent abilities, they emphasize that the emergent abilities reported in LLMs so far "are likely to be illusory".
Although the research focused on GPT-3, the team compared its findings with previous papers on the same model family.
Why the expectation of emergence at all?
Where did we get this expectation of emergent properties in LLMs in the first place? The Stanford paper explains that emergence, the manifestation of new properties as a complex system becomes more intricate, has been extensively studied across various disciplines. These include physics, biology, and mathematics. The authors cite P.W. Anderson's seminal work "More Is Different" (1972). Anderson claimed that with increasing complexity, unforeseen properties may arise that cannot be easily predicted, even with a precise understanding of the system's microscopic details.
The contrasting philosophy to this is known as reductionism. According to this viewpoint, the behavior and properties of complex systems can be explained and predicted solely by understanding the interactions and behaviors of their individual components.
Why ‘More is Different’
Anderson challenged the reductionist hypothesis. In his paper, he proposed that emergent properties and behaviors are qualitatively different from the properties of a system's individual components. In other words, the whole system exhibits properties that cannot be explained by simply studying its individual parts.
Anderson used examples from various scientific disciplines, such as solid-state physics, to illustrate his argument. He suggested that at certain levels of complexity, new phenomena arise that are not evident or predictable based solely on an understanding of the microscopic details. These emergent properties require a holistic perspective to be fully understood.
Various natural phenomena can be seen to produce such emergent properties. Examples include ants collaborating to build a bridge, birds flying in synchronized patterns, and the alignment of electron spins giving rise to magnetism. In physics, emergence plays a crucial role in understanding phenomena such as phase transitions, self-organization, and the behavior of complex materials. Such collective behavior cannot be deduced solely from the behavior of individual components.
Through the study of emergence, scientists aim to unravel the fundamental principles underlying complex behaviors and structures across various scales.
Emergence as it stands
These days, emergence is a widely recognized and studied phenomenon across disciplines such as physics, biology, complex systems and, of course, AI. Researchers are actively investigating it to better understand how complex systems exhibit novel properties and behaviors that cannot be easily predicted or explained by analyzing their individual components.
As AI systems, including large language models (LLMs), become more complex and sophisticated, researchers will keep exploring how emergent properties and capabilities may arise.
In the meantime, perhaps, the shoggoth lies waiting…