by Sebastian Schuchmann
Medium, August 17, 2019
Motivation
Industries and governments alike have invested significantly in the AI field, and many AI-related startups have been established in the last five years. If another AI winter were to come about, many people could lose their jobs and many startups might have to shut down, as has happened before. Moreover, the economic difference between an approaching winter period and ongoing success is estimated to be at least tens of billions of dollars by 2025, according to McKinsey & Company.
This paper does not aim to discuss whether progress in AI is to be desired or not. Instead, the purpose of the discussions and results presented herein is to inform the reader of how likely progress in AI research is.
Analysis: What Has Led to the AI Winters?
For a detailed overview of both AI winters, check out my first and second Medium articles on the topic.
In this section, the central causes of the AI winters are extracted from the above discussion of previous winters.
First, a recurring pattern can be observed: promises that kindled initial excitement but later turned out to be inflated were the leading cause of the AI winters. For instance, government funding was cut during both AI winters after honest assessments of the results compared to the promises. Progress was overestimated because AI initially led to significant improvements in various fields very quickly. This suggested that most of the work was done, with only some minor problems left to solve. However, as it later turned out, these problems were not so minor after all. The Lighthill report, a primary contributor to the first AI winter, stated: “in no part of the field have discoveries made so far produced the major impact that was then promised.” Similarly, the 1984 panel at AAAI expressed: “This unease is due to the worry that perhaps expectations about AI are too high [. . .].”
Second, the cut in funding had a major impact on research in both AI winters. In the first AI winter, the Lighthill report led to a cut of funding for all but two universities in the U.K. and further led to cuts in Europe and the U.S. In the second AI winter, funding from DARPA was reduced. Moreover, the commercial failure of many AI-related startups in the late 1980s marked the second AI winter.
Third, technological limitations, such as those the perceptron faced in the 1960s, inhibited progress. The perceptron, which was first expected to soon “be conscious of its existence,” could not solve the XOR problem at that time. Similarly, limitations were faced with expert systems in the 1980s. They could not solve fundamental problems like vision or speech and lacked common sense.
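The XOR limitation is easy to reproduce. The sketch below is a generic single-layer perceptron (not code from any of the sources discussed): it learns AND perfectly but can never fit XOR, because no single line separates XOR's classes.

```python
def train_perceptron(samples, epochs=100, lr=0.1):
    """Classic perceptron learning rule: one weight vector plus a
    hard-threshold activation, so only linearly separable functions fit."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = target - pred
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return w, b

def accuracy(samples, w, b):
    hits = sum(
        (1 if w[0] * x1 + w[1] * x2 + b > 0 else 0) == t
        for (x1, x2), t in samples
    )
    return hits / len(samples)

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

w, b = train_perceptron(AND)
print("AND accuracy:", accuracy(AND, w, b))   # reaches 1.0: linearly separable

w, b = train_perceptron(XOR)
print("XOR accuracy:", accuracy(XOR, w, b))   # stays below 1.0: not separable
```

Adding a hidden layer (i.e., a multilayer perceptron) removes this limitation, which is exactly the development that later revived neural-network research.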
Consequently, in evaluating the likelihood of another AI winter, the following aspects should be examined closely:
- Expectations and promises compared to the actual results;
- Funding from governments and industries;
- Technological limitations.
Many technologies exhibit similar patterns to those mentioned above. To further narrow the focus, it is necessary to figure out how AI deviates from other technologies. Though similar in some regard, AI appears to be very susceptible to inflated estimations and technological limitations. Some reasons why AI differs from other technologies are:
1. Intelligence is highly multidimensional:
At some point, AI researchers believed that by solving chess, the riddle of intelligence would be solved. This turned out to be wrong, because intelligence involves more than the dimension of conscious, strategic thinking. Chess is only a tiny particle in the cosmos of intelligence. Researchers gave it such a central position because it is hard for humans, which leads to reason number two.
2. Moravec’s Paradox
Chess, which requires higher-level thinking, is a very new skill in our evolutionary history, which might be the reason it is relatively difficult for humans and therefore associated with intelligence. Vision, on the other hand, is old and mainly subconscious, which leads people to believe it is easy, but there is no reason to assume it is not as hard as, or even harder than, higher-level thinking. This is Moravec’s Paradox, and one can argue AI researchers have fallen prey to it by underestimating the processes we perform subconsciously, like sensorimotor skills or common sense.
3. Hype and fear associated with achieving human-level intelligence
As I. Jordan pointed out, the hype and fear surrounding machines capable of achieving intelligence easily lead to exaggerations and create media attention less common in other fields.
With these reasons in mind, the possibility of a coming AI winter can be analyzed with the appropriate framing.
Probability of an Approaching AI Winter
Subsequently, the possibility of an upcoming AI winter is assessed. The current landscape of artificial intelligence and its public reception is studied. Furthermore, the present and the historical pre-winter times are compared regarding the key areas extracted beforehand. As a recap, these areas are:
- Expectations and promises compared to the actual results;
- Funding from governments and industries;
- Technological limitations.
Expectations and Promises
Many public figures are voicing claims reminiscent of those of early AI researchers in the 1950s. By doing so, they create excitement for future progress, or hype. Kurzweil, for instance, is famous for predicting not only that the singularity, a time when artificial superintelligence will be ubiquitous, will occur by 2045, but also that AI will exceed human intelligence by 2029. In a similar manner, Scott predicts that “there is no reason and no way that a human mind can keep up with an artificial Intelligent machine by 2035.” Additionally, Ng views AI as the new electricity.
Statements of this kind set high expectations for AI and spark hype. Consequently, the phenomenon of hype and how it relates to the present state of AI is investigated.
Hype and the Hype Cycle
A tool often used when looking at hype is Gartner’s Hype Cycle. It has practical applications that make predictions easy, but its validity is not scientifically established. First of all, it was not developed as a scientific tool; it is a stylized graph made for business decisions. That said, attempts to empirically validate the Hype Cycle for different technologies have been made. It can be concluded that the Hype Cycle exists, but that its specific patterns vary a lot.
The key phases of the cycle are the peak, where interest and excitement are at their highest, and the trough of disillusionment, where the initial expectations cannot be met. Here, interest in the field is at its lowest. Then, the field slowly recovers and reaches the plateau of productivity.
As Menzies demonstrates, the Hype Cycle is well represented in the AAAI conference attendee numbers in the 1980s. First, the conference started with a rapid increase in ticket sales leading to a peak, and then those numbers quickly dropped down. Currently, conference attendee numbers for conferences like NIPS reach or even exceed the peak of AAAI in the 1980s, and they are quickly gaining in size.
Similar patterns of interest in the field can be observed in venture capital funding for AI startups, job openings, and earnings-call mentions. Researchers of hype point out that the quantity of coverage is important, but that it has to be supported by qualitative sentiment. Sentiment analysis of media articles shows that AI-related articles became 1.5 times more positive from 2016 to 2018. Especially in the period from January 2016 to July 2016, the sentiment shifted. This improvement could be correlated with the public release of AlphaGo in January 2016 and its victory against world champion Lee Sedol in March.
Following the trend of the Hype Cycle, this could lead to another trough of disillusionment, with ticket sales, funding, and job openings quickly plummeting. However, AI is a very broad term describing many technologies. This further complicates the matter, as each technology under the umbrella term AI can have its own Hype Cycle, and the interactions of these Hype Cycles, both with each other and with AI in general, remain unclear.
Going further, a more in-depth look into these claims is taken, evaluating whether the quick rise in AI interest is just the consequence of exaggerated promises or whether the claims stand on firm ground.
Comparison to Expert Opinion
Now, the statements and promises made by public figures are compared to a survey of leading AI researchers. In 2017, a survey of 352 machine learning researchers, who published at leading conferences, was conducted. This survey forecasts high-level machine intelligence to happen within 45 years at a 50% chance and at a 10% chance within the next nine years. However, full automation of labor was predicted much later, with a 50% probability for it to happen within the next 122 years.
This study presents results far from the predictions of futurists like Kurzweil. Further, a meta-study on AI predictions found some evidence that most predictions of high-level machine intelligence fall around 20 years in the future, no matter when the prediction is made. In essence, this points to the unreliability of predictions about the future of AI. Every prediction of high-level machine intelligence has to be taken with a grain of salt.
In summary, a Hype Cycle pattern is present in the current AI landscape, leading to a potential decline in interest soon. Furthermore, optimistic predictions are made by public figures, but empirical evidence questions their validity.
Nevertheless, statements like those from Ng, who views AI as the new electricity, refer more to the current state of the industry. Accordingly, industry and government funding is examined next.
Investment and Funding
Funding has always had a significant role in AI research. As Hendler points out, cuts in government funding are only felt years later, since existing research programs continue. Thus, time passes until the lack of new research programs becomes evident. This means that a reduction in funding would need to be in place presently to be perceived in the years to come.
In April 2018, EU members agreed to cooperate on AI research. A communication on AI was issued that dedicated 1.7 billion dollars of funding for AI research between 2018 and 2020. Then, in June 2018, the European Commission proposed the creation of the Digital Europe funding program with a focus on five key areas and total funding of 9.2 billion euros, of which 2.5 billion is dedicated to AI research.
In March 2018, the U.S. Administration stated the goal of ensuring that the United States “remains the global leader in AI.” Later, in September 2018, DARPA announced a two billion dollar campaign to fund the next wave of AI technologies. In direct opposition, China has declared the goal of leading the world in AI by 2030, and several Chinese AI initiatives have been launched accordingly. These conflicting statements have motivated many to adopt the term “AI Race” for the battle for leadership in the AI realm between the U.S. and China. It is similar to the Space Race of the 20th century, in which the U.S. and the Soviet Union fought for dominance in space travel. Back then, the race sparked much funding and research. Likewise, the “AI Race” mentality could make any reduction in funding unlikely in the coming years. This is a strong point against an upcoming AI winter, as previous winters were accompanied by a decline in government funding.
Another key point is the growing AI industry. Past AI researchers have been very reliant on government funding but, according to McKinsey & Company, non-tech companies spent between $26 billion and $39 billion on AI in 2016, and tech companies spent between $20 billion and $30 billion on AI.
The market forecasts for 2025, on the other hand, have an enormous variance, ranging from $644 million to $126 billion. This disparity demonstrates the economic difference between an upcoming AI winter and another period of prosperity.
To summarize, government funding is very solid, and the “AI Race” mentality makes it likely that this situation will continue. Additionally, the industry is currently thriving. Market forecasts, however, diverge drastically.
To determine which forecasts are more convincing, the progress AI has made in the last years is related to the criticisms of current approaches.
Evaluating Progress
To view the criticisms of present AI techniques in the appropriate frame, the progress that has been made from 2012 until today (April 2019) is evaluated.
As we have seen before, AI and machine learning have risen in popularity across many measurements. A few key events stand out in the shaping of the landscape. In 2012, a convolutional neural network won the ImageNet competition by a wide margin. This, combined with progress in object detection, changed the field of computer vision completely from handcrafted feature-engineering to learned representations, thereby enabling autonomous cars to become viable in the foreseeable future. Similarly impressive results have been made in the natural-language understanding space. Deep learning has enabled all the popular voice assistants, from Alexa and Siri to Cortana.
Reinforcement learning with deep neural networks has had impressive results in game playing. In 2014, DeepMind used a deep Q-learner to solve 50 different Atari games without changing the model’s architecture or hyperparameters. This flexibility across tasks was unprecedented, which led to the company being acquired by Google soon after and to it subsequently leading the space of reinforcement learning with achievements like AlphaGo and AlphaStar.
Finally, in the last few years, generative adversarial networks (GANs) have achieved impressive results in generating images of, e.g., human faces. In essence, deep learning has had groundbreaking results across many industries.
The Criticism of Deep Learning
In this chapter, criticisms of deep learning are discussed. As demonstrated, deep learning is at the forefront of progress in the field of AI, which is why a skeptical attitude towards the potential of deep learning is also criticism of the prospects of AI in general. It is a situation similar to the 1980s, when expert systems dominated the field and their collapse led to a winter period. If deep learning methods face technological obstacles comparable to those of their historical counterpart, similar results can be expected.
There are a few categories that have been identified in which most criticisms of deep learning fall: limitations of deep learning, brittleness, and lack of unsupervised learning.
Limitations of Deep Learning
“Today more people are working on deep learning than ever before—around two orders of magnitude more than in 2014. And the rate of progress as I see it is the slowest in 5 years. Time for something new.”
Francois Chollet, creator of Keras, on Twitter
As this quote is taken from Twitter, its validity is questionable, but it falls in line with similar arguments he has made and captures the general feeling well. In his book “Deep Learning with Python,” Chollet has a chapter dedicated to the limitations of deep learning, in which he writes: “It [deep learning] will not solve the more fundamental problem that deep learning models are very limited in what they can represent, and that most of the programs that one may wish to learn cannot be expressed as a continuous geometric morphing of a data manifold.” As a thought experiment, he proposes a huge data set containing source code labeled with descriptions of the programs. He argues that a deep learning system would never be able to learn to program in this way, even with unlimited data, because tasks like these require reasoning, and there is no learnable mapping from description to source code. He further elaborates that adding more layers and data makes it seem like these limitations are vanishing, but only superficially.
He argues that practitioners can easily fall into the trap of believing that models understand the tasks they undertake. However, when models are presented with data that differs from the training data, they can fail in unexpected ways. He argues that these models don’t have an embodied experience of reality, and so they can’t make sense of their input. This is similar to arguments made in the 1980s by Dreyfus, who argued for the need for embodiment in AI. Unfortunately, a clear understanding of the role of embodiment in AI has not yet been achieved. In a similar manner, this points to fundamental problems not yet solved with deep learning approaches, namely reasoning and common sense.
In short, Chollet warns deep learning practitioners about inflating the capabilities of deep learning, as fundamental problems remain.
Deep Learning Is Brittle
Deep learning models are commonly described as brittle. There are several reasons why such a description is accurate, including adversarial attacks, a lack of ability to generalize, and a lack of data. A detailed discussion of these flaws and possible prevention mechanisms follows.
1. Adversarial attacks: It has been demonstrated that deep learning algorithms can be susceptible to attacks via adversarial examples. Adversarial examples use data modified in ways unrecognizable to humans to affect the behavior of deep learning models drastically. There are multiple methods for creating adversarial examples. In one technique, noise is added to the image by another learning algorithm in order to affect the classification without being visible.
With this technique, an image can be changed in such a way that a specified classification is achieved, even one very different from the original (such as “panda” and “gibbon,” which humans can easily distinguish). When the method of adversarial attack is known, it can be possible to defend against it by augmenting the training set with adversarial examples. To clarify, defending against a specific adversarial attack can be possible, but protecting against adversarials in general is hard. Nonetheless, successful methods developed recently show promise on the issue. A formal way to defend against general adversarials is to bound the output space of the model. Techniques like interval bound propagation achieve state-of-the-art accuracy on popular image data sets.
Alcorn et al. point out that extreme misclassification also happens when familiar objects appear in strange poses. Examples like these reveal that deep learning models’ understanding of objects can be quite naive.
Furthermore, adversarial attacks demonstrate an underlying problem that is more profound—the lack of explainability. Due to the black box nature of deep learning models, predicting what the network is doing is hard. These adversarial attacks show that a model may have found an optimal way to classify an object in the training data, but it may still fail to capture the vastness of the real world.
That said, there has been much work in improving the interpretability of models, mostly in the vision space through methods like semantic dictionaries, saliency maps, and activation atlases. These works represent attempts to gain insights into the hidden layers of deep learning models.
2. Lack of ability to generalize: Furthermore, deep learning models have problems generalizing beyond the training data provided. Kansky et al. demonstrated that a model trained on the Atari game Breakout failed when small changes were made to the environment. For example, changing the height of the paddle slightly resulted in very poor performance of the agent. Similar criticism can be applied to any reinforcement learning system.
Cobbe compares the evaluation of reinforcement learning agents with that of supervised learners and concludes that evaluating an agent in the environment it trained in is like evaluating a supervised learner on its training set. The difference is that the first practice is widely accepted, while the second would not be tolerated in any sense.
To solve this problem, Cobbe, as part of OpenAI, devised a benchmark for generalization to promote work in this area. Additionally, transfer learning in the domain of reinforcement learning has recently seen impressive results with OpenAI’s Dota agent. They announced that they were able to continue training the agent despite substantial changes in rules and model size by using a transfer learning technique. Using similar methods, the lack of generalization in agents could be addressed.
3. Lack of data: As “The Unreasonable Effectiveness of Data” demonstrates, data is essential in deep learning. Moreover, the rise in available data was one of the main contributors to the deep learning revolution. At the same time, not every field has access to vast amounts of data.
That said, there are two ways to tackle this problem: by creating more data or by creating algorithms that require less data. Lake et al. show that humans are able to learn visual concepts from just a few examples.
Recent approaches in one-shot or few-shot learning, where an algorithm is presented with only one or a few data points (e.g., one image of a given category), have made substantial improvements. At the same time, transfer learning approaches have improved immensely. By using a model pre-trained on a large data set as a basis, it is possible to significantly reduce training time on new data sets.
To summarize, deep learning models are fairly described as brittle. That said, researchers are working on promising solutions to this problem.
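The gradient-sign mechanism behind many adversarial attacks can be sketched on a toy model. The weights and input below are made up for illustration; real attacks target deep networks, but the core trick, nudging each input feature in the direction that increases the loss, is the same:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(w, b, x):
    """Probability of class 1 under a fixed logistic-regression 'model'."""
    return sigmoid(w @ x + b)

def fgsm(w, b, x, y, eps):
    """Fast-gradient-sign perturbation: for cross-entropy loss on a
    sigmoid output, the gradient of the loss w.r.t. x is (p - y) * w."""
    p = predict(w, b, x)
    grad = (p - y) * w
    return x + eps * np.sign(grad)   # small, uniform-size nudge per feature

# Illustrative model and input (not from the article).
w = np.array([1.0, -2.0, 0.5, 3.0])
b = 0.0
x = np.array([0.2, -0.1, 0.3, 0.1])   # clean input, correctly classified
y = 1.0                               # true label

p_clean = predict(w, b, x)            # comfortably above 0.5
x_adv = fgsm(w, b, x, y, eps=0.3)
p_adv = predict(w, b, x_adv)          # drops below 0.5: the label flips

print(f"clean: {p_clean:.3f}  adversarial: {p_adv:.3f}")
```

Each feature moves by at most 0.3, yet the prediction flips, which is the imperceptible-change, drastic-effect behavior described above; in high-dimensional image space the per-pixel change can be far smaller.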
The Dominance of Supervised Learning
Most achievements realized by deep learning have come through supervised or reinforcement learning. However, as LeCun points out, humans mostly learn in an unsupervised manner by observing the environment. Additionally, rough estimates suggest that around 95 percent of data is unstructured. Moreover, labeling is a time-consuming and expensive process, and labels contain only very little information about each data point. This is why LeCun believes the field has to shift more towards unsupervised learning.
A particular type of unsupervised learning, sometimes called self-supervised learning, has gained traction in the last couple of years. Self-supervised learning procedures exploit some property of the training data to create a supervision signal. In a video clip, for example, all frames are sequential, and researchers exploit this property by letting the model predict the next frame of the clip, which can easily be evaluated because the truth is inherent in the data. Similar methods can be used for text or audio signals. Additionally, different characteristics of the data can be used, such as rotating an image and predicting the correct angle. The intuition is that in order to turn a rotated image back to its original form, a model needs to learn properties of the world that would also be useful in different tasks like object recognition. This turns out to be correct, as such a model can achieve strong results in classification tasks via transfer learning. When looking at the first layer of the network, the filters are very similar to those of supervised models, and even more varied.
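The rotation pretext task described above can be sketched in a few lines. The images here are random arrays standing in for an unlabeled data set; the point is that the labels are manufactured from the data itself, with no human annotation:

```python
import numpy as np

def make_rotation_dataset(images, rng):
    """Rotate each unlabeled image by a random multiple of 90 degrees;
    the rotation index (0..3) becomes a 'free' classification label."""
    xs, ys = [], []
    for img in images:
        k = rng.integers(0, 4)          # 0, 90, 180, or 270 degrees
        xs.append(np.rot90(img, k))
        ys.append(k)
    return np.stack(xs), np.array(ys)

rng = np.random.default_rng(0)
unlabeled = rng.random((8, 32, 32))     # 8 fake 32x32 grayscale images

x, y = make_rotation_dataset(unlabeled, rng)
print(x.shape, y.shape)                 # (8, 32, 32) (8,)
```

A classifier trained to predict `y` from `x` must pick up orientation cues such as edges and object parts, and those learned features are what transfer to downstream tasks like object recognition.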
This criticism could be detrimental to deep learning and AI in general if researchers dismissed it, but that does not seem to be the case. OpenAI presented some promising results achieved using unsupervised learning earlier this year with the GPT-2 transformer language model. This model can generate very human-like text by using a very large model and vast amounts of data from Reddit. It uses a type of self-supervised learning, exploiting the sequentiality of text by letting the model predict the next word. Using the same architecture, MuseNet, a model that composes music, was created recently.
Unsupervised learning has the potential to solve significant obstacles in deep learning. Current research evidence suggests optimism regarding the progress of this learning technique.
Conclusion
A complex interplay is present between AI researchers, companies, the technology and the perception of AI on many different levels. Therefore, making any prediction is hard. However, there are a few key things we can observe about the field that differ from historical pre-winter times.
In the past, reliance on government funding was very strong and the industry weak. That is far from the case today; many large companies like Google, Facebook, and Alibaba are each investing more in AI technologies than the entire AI industry was worth during its boom times in the 1980s. Even more importantly, those companies have not only invested heavily in AI but have also incorporated it deeply into their products. This gives the field solid footing, even if public sentiment starts to shift. Similarly, stability is provided by the “AI Race” mentality, which reduces the risk of a decline in government funding.
Equally important are the criticisms of deep learning and its limitations. Though most of the criticism is valid, the evidence suggests that researchers are already working on solutions or are aware of the innate limitations of the technique.
Furthermore, unsupervised learning, especially self-supervised learning, presents promising opportunities by enabling the use of vast amounts of unlabeled data and by saving immense amounts of tedious labor.
That said, the expectations for the field are too high. Predictions about machines reaching human intelligence are unreliable. Furthermore, a Hype Cycle pattern can be observed in current conference attendee numbers, with the field growing quickly on many scales. As Hype Cycle patterns vary, no certain statements or predictions can be made.
Finally, the historical perspective demonstrates the wave-like nature of the field. New technologies are being created every day; a vast amount of them die out; and some are being revived. In this light, it seems adequate to be prepared for current methods to die out, as well as to be on the lookout for some forgotten technology worth reviving.
To summarize: The funding for further AI research appears stable at the moment. However, there are some technological limitations which may, coupled with very high expectations, lead to another AI winter.
“People worry that computers will get too smart and take over the world, but the real problem is that they’re too stupid and they’ve already taken over the world.”
Pedro Domingos