Week 1 - Artificial General Intelligence

AI Progress so far

The technological advancement in AI in the last five years has been staggering. Image generation has come a long way, from GANs to the diffusion models behind Stable Diffusion and Midjourney, and we can now generate audio and video from text prompts too. Games like Go, Dota 2 and StarCraft have AI agents that are superior to humans, and even natural-language games like Diplomacy have been played at a strong human level, which is quite scary in itself given the level of deception required. Most impressive has been the emergent few-shot learning of LLMs such as GPT-3 and Minerva's ability to solve hard mathematical problems. ChatGPT, an LLM fine-tuned on human feedback, has shown that chatbots are now more useful than ever, helping people speed up their coding and learning.

Source: Visualizing the deep learning revolution


Foundation models

Foundation models are powerful models trained with self-supervised learning on broad data at scale. You can easily fine-tune these models to get SOTA results on a wide variety of tasks. Emergence is when behaviour is implicitly induced rather than explicitly constructed; many foundation models show it, such as few-shot learning in GPT-3. Homogenisation is the consolidation of methodologies and models, such as transformers and BERT. Because these models are so accessible, it is very easy to fine-tune them to solve different tasks or to push capabilities further.
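
To make the fine-tuning point concrete, here is a minimal sketch (assuming the Hugging Face `transformers` and `datasets` libraries); the choice of `bert-base-uncased`, the IMDB sentiment task and the tiny training budget are my own illustrative assumptions rather than anything from the source.

```python
# Sketch: adapt a pretrained (homogenised) foundation model to a downstream task.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "bert-base-uncased"  # a widely reused foundation model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# A small sentiment-classification dataset stands in for "a downstream task".
dataset = load_dataset("imdb")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length")

dataset = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1),
    train_dataset=dataset["train"].shuffle(seed=42).select(range(2000)),
    eval_dataset=dataset["test"].select(range(500)),
)
trainer.train()  # a brief fine-tune adapts the general pretrained model to this one task
```

The point is that all the heavy lifting happened during self-supervised pretraining; the downstream task only needs a small labelled dataset and a short training run.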


Emergence and homogenisation go hand in hand and accelerate the development of foundation models, which look like a clear route towards AGI. However, the unsettling thing here is that any flaws in these foundation models are blindly inherited by everything adapted from them, including their emergent behaviour.


Source: On the Opportunities and Risks of Foundation Models


Background claims

Here are the four background claims that motivate work on human-level AI safety:

1. Humans have a very general ability to solve problems and achieve goals across diverse domains.
2. AI systems could become much more intelligent than humans.
3. If we create highly intelligent AI systems, their decisions will shape the future.
4. Highly intelligent AI systems won't be beneficial by default.

Source: Four Background Claims


Generalisation and AGI

Artificial General Intelligence - AIs that can generalise to human-level performance on a wide variety of tasks. You could argue that ChatGPT fits this bucket, but for me an AGI's intelligence should be among the best humans rather than the average. Furthermore, for me an AGI can take actions in an environment, so I think this basic definition of AGI has its problems. Here is a list of other things we thought an AGI needs in our AGI Safety Fundamentals session:


The task-based approach is when a model is specifically trained for one thing, as opposed to the generalisation approach, where the model can understand new tasks without task-specific training data.


Some people think that scaling task-based approaches will lead to superhuman performance across tasks. I think this is possible with enough data, but enough task-specific data is rarely available. A self-supervised approach that trains a foundation model seems much more likely to succeed.


Evolution has led to us having rapid learning capabilities, but what bad things has it caused? Greed, jealousy and anxiety are some that come to mind. A rational AGI would be unlikely to have these.

Some give the analogy of the role of a CEO when describing generalisation to many tasks. I think ChatGPT is almost at the point where it could be a relatively competent CEO. Almost certainly ChatGPT paired with a good prompt engineer could do the job, even if the engineer had zero experience in the role: ChatGPT would be able to give recommendations and feedback as the engineer describes the situations they find themselves in.


Some argue that an obstacle to the generalisation approach is that specific features of the ancestral environment or of human brains are required. I think this is unlikely given the emergent behaviour of LLMs.


Source: AGI safety from first principles


Biological Anchors

Transformative AI is the term used for AI that drives a transition as significant as the industrial revolution.


Ajeya’s report finds that there is a 10% chance this happens by 2031, a 50% chance by 2052 and an 80% chance by 2100. The analysis suggests six possible approaches to estimating the number of FLOPs required to train a transformative model:


Training FLOPs - Approach
10^30 - Neural net, short horizon (minutes)
10^33 - Neural net, medium horizon (hours)
10^36 - Neural net, long horizon (years)
10^24 - Training data in a human lifetime
10^41 - Training data in all of evolution
10^33 - Genome approach


Some other fun facts: training GPT-3 took ~10^24 FLOPs and it has ~10^11 parameters; running a human brain is estimated at ~10^15 FLOPs per second; and the top supercomputer in Japan manages ~10^17 FLOPs per second.
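
As a rough back-of-the-envelope check (my own illustrative arithmetic, using only the order-of-magnitude figures above), here is how long each anchor's training budget would take on a single ~10^17 FLOPs-per-second machine, ignoring algorithmic progress and hardware growth:

```python
# Illustrative only: years of compute needed for each anchor on one ~10^17 FLOP/s machine.
SUPERCOMPUTER_FLOPS_PER_SEC = 1e17
SECONDS_PER_YEAR = 3.15e7

anchors = {
    "GPT-3 (actual)": 1e24,
    "Neural net, short horizon": 1e30,
    "Neural net, medium horizon": 1e33,
    "Neural net, long horizon": 1e36,
    "Human lifetime": 1e24,
    "All of evolution": 1e41,
    "Genome approach": 1e33,
}

for name, flops in anchors.items():
    years = flops / SUPERCOMPUTER_FLOPS_PER_SEC / SECONDS_PER_YEAR
    print(f"{name:<28} ~{years:.1e} years")
```

Even the cheapest neural-net anchor (10^30 FLOPs) is about a million times more compute than GPT-3, which is why the forecast leans so heavily on hardware and algorithmic progress.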


Ajeya arrives at the timeline estimates by taking into account algorithmic and hardware improvements over time and then combining the six approaches as a probability-weighted mixture.
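
A minimal sketch of the mixture idea (the weights here are made up for illustration and are not Ajeya's actual values): each anchor proposes a FLOP requirement, and the final forecast combines them with subjective probabilities.

```python
import math

# anchor -> (estimated training FLOPs, assumed subjective weight); weights sum to 1.
anchors = {
    "short horizon":  (1e30, 0.20),
    "medium horizon": (1e33, 0.30),
    "long horizon":   (1e36, 0.15),
    "human lifetime": (1e24, 0.05),
    "evolution":      (1e41, 0.10),
    "genome":         (1e33, 0.20),
}

# Expected order of magnitude of the required compute under the mixture.
expected_log10 = sum(w * math.log10(f) for f, w in anchors.values())
print(f"weighted estimate: ~10^{expected_log10:.1f} FLOPs")
```

The actual report works with full probability distributions for each anchor (and shifts them over time as hardware gets cheaper and algorithms improve), but the weighted-mixture structure is the same idea.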


My one issue with the analysis is that there is no mention of how roadblocks such as nuclear disaster, financial crashes or the destruction of semiconductor fabs would affect the forecast. These kinds of events would majorly slow down progress.


Source: Biological Anchors: A Trick That Might Or Might Not Work


More is Different for AI

The claim is that quantitative changes in scale can produce qualitatively new behaviour; this applies to DNA, uranium, water and traffic. It also applies to scaling foundation models: there is emergent behaviour such as few-shot learning when we scale up, and in some cases there are “phase transitions” such as grokking.


Could LLMs grok if they were trained for long enough? What emergent behaviour would that lead to?


ML researchers often have an engineering worldview, in which they make predictions based on empirical results. The philosophy worldview is sometimes underrated, but the kind of thought experiments common in that approach provide a strong third anchor for anticipating AI safety failures (alongside existing systems and humans). The engineering worldview does have plenty of empirical findings that generalise surprisingly far too, so we should use both when thinking about AI safety.


ML systems in the future will be qualitatively different and will have weird failure modes.


Source: Future ML Systems Will Be Qualitatively Different, More is Different for AI


Intelligence Explosion

Intelligent AIs could swiftly lead to superintelligence through self-improvement. Superintelligence would mean significant advances on mathematical, scientific and engineering challenges. The positive feedback loop of self-improvement, plus freedom from the biological constraints of the human brain, could mean a quick takeoff, though some people disagree on this.


Why would there be an intelligence explosion? AI minds would have advantages that humans lack, free of the constraints that hold back how quickly we can become more intelligent:


I wonder if LLMs at further scale could become superintelligent. Could simulating these kinds of problems be emergent behaviour? Could an LLM extrapolate, or internally run experiments that solve problems we haven’t even written about before and that are beyond our intelligence? Imagine asking an LLM to cure cancer and it running the required experiments internally before continuing to predict the next word.


Also, would LLMs be motivated to get better? I’m unsure if they would, which goes against the instrumentally convergent goals where an AI wants to preserve itself, keep its goals intact, improve its own capabilities, and acquire resources.


Source: Intelligence explosion: evidence and import