“I am really quite close, I am very close, to the cutting edge in AI and it scares the hell out of me. It’s capable of vastly more than almost anyone knows and the rate of improvement is exponential.” – Elon Musk
“We have seen AI providing conversation and comfort to the lonely; we have also seen AI engaging in racial discrimination. Yet the biggest harm that AI is likely to do to individuals in the short term is job displacement, as the amount of work we can automate with AI is vastly larger than before. As leaders, it is incumbent on all of us to make sure we are building a world in which every individual has an opportunity to thrive.” -Andrew Ng
“The development of full artificial intelligence could spell the end of the human race… It would take off on its own, and re-design itself at an ever-increasing rate. Humans, who are limited by slow biological evolution, couldn’t compete, and would be superseded.” – Stephen Hawking
When a layman watches videos like the ones above and reads industry stalwarts warning about the dangers of AI’s coming of age, they would understandably be quite scared. Questions such as ‘Is AI gaining consciousness?’ and ‘Is Skynet being built, and are we moving towards a human-vs-AI, Terminator-style Armageddon?’ would pop up in their mind.
So, how close is the current state of the art research to Skynet?
Leading this pack is GPT-3 (Generative Pre-trained Transformer 3), a 175-billion-parameter model (you don’t need to understand what a parameter is, just know that it is really large!) from OpenAI, released in June 2020.
The ambition of the research behind GPT-3 was a bit different. The major reason GPT-3 was created was to show that, just as we humans can perform many English-language tasks (like sentence equivalence, sentiment analysis of a sentence, etc.) after seeing only a few examples, a model can do the same if given enough data. This is known in the field as few-shot learning. But once GPT-3 had been created, it produced some very striking results.
- GPT-3 could convert English instructions into actual snippets of code. (Limited by the sequence length, a technical constraint you don’t need to understand. Just know that it could write code on its own, but only for small stuff.)
- Only 52% of the articles written by GPT-3 could be correctly flagged by human checkers as not written by a human. (That’s barely better than a coin flip! GPT-3 fooled human evaluators about half the time!)
- This one is interesting but a bit technical.
GPT-3 could actually do arithmetic it had never seen before. It scored around 100 percent on both 2-digit addition and 2-digit subtraction, around 80 percent on 3-digit addition, and around 90 percent on 3-digit subtraction. What is impressive is that the model had seen only 0.85% (17 out of 2,000) of the experimental test cases during training, yet it still achieved near-perfect accuracy on the easier problems. So the answers are not the result of sequence memorization. What exactly is going on here?
- It is great at summarizing articles. Technically speaking: abstractive summarization.
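To make the few-shot idea above concrete: the model is given a handful of worked examples inside the prompt itself, then asked to complete a new case. No training happens; the examples live only in the input text. Here is a minimal sketch of what such a prompt looks like for the 2-digit arithmetic task. The exact format used to evaluate GPT-3 differs; the function name and Q/A layout here are my own illustrative choices.

```python
# A minimal sketch of few-shot prompting: worked examples are placed in the
# prompt, followed by an unanswered question for the model to complete.
# No weights are updated; the "learning" happens entirely in-context.

def build_few_shot_prompt(examples, query):
    """Format (question, answer) pairs plus a final unanswered question."""
    lines = [f"Q: {q}\nA: {a}" for q, a in examples]
    lines.append(f"Q: {query}\nA:")
    return "\n\n".join(lines)

# Three worked 2-digit additions, then a new one for the model to finish.
demos = [("What is 23 plus 45?", "68"),
         ("What is 17 plus 31?", "48"),
         ("What is 52 plus 19?", "71")]
prompt = build_few_shot_prompt(demos, "What is 34 plus 27?")
print(prompt)
```

The prompt ends with a bare `A:`, so a language model continuing the text is nudged to emit just the answer. That a model picks up the task from three demonstrations, rather than thousands of training examples, is the whole point of the few-shot framing.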
Another very interesting point is that GPT-3 is 116.67 times larger than its predecessor, GPT-2. The major reason OpenAI decided to build a larger model with more data is that they saw performance keep improving with size, and realized the gains were not hitting a law of diminishing returns. The interesting thing about the various sizes of GPT-3 is that they still have not found a plateau, so GPT-4 is definitely on the cards, probably in the multi-trillion-parameter range. So, has GPT-3 understood the basics of how to learn, and will we be closer to AI sentience with GPT-4?
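The “no plateau yet” observation is usually expressed as a power law: predicted loss keeps falling smoothly as parameter count grows. The sketch below plugs a few model sizes into a power-law curve of the form L(N) = (N_c / N)^alpha. The constants are assumptions chosen for illustration, loosely in the spirit of published scaling-law fits, not measured values for GPT-3.

```python
# Illustrative power-law scaling of loss with parameter count.
# N_C and ALPHA below are assumed constants for illustration only.
N_C = 8.8e13   # hypothetical reference parameter count
ALPHA = 0.076  # hypothetical power-law exponent

def predicted_loss(n_params):
    """Power-law loss curve: L(N) = (N_c / N) ** alpha."""
    return (N_C / n_params) ** ALPHA

# GPT-2 scale, GPT-3 scale, and a hypothetical multi-trillion model.
for n in [1.5e9, 175e9, 1.75e12]:
    print(f"{n:.2e} params -> predicted loss {predicted_loss(n):.3f}")
```

The point of the exponent being small is that each fresh drop in loss demands roughly a 10x jump in parameters, which is exactly why the next step up from 175 billion is discussed in trillions.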
If you ask my humble opinion, I still feel that it is finding patterns, mapping those patterns onto one another, and remembering them, rather than actually learning. However, that leads to a question: isn’t finding the underlying patterns in tasks exactly what we do to understand context as well? When we were really young, we did not understand exactly what was going on either; we just memorized number tables like
5 one za 5, 5 two za 10, 5 three za fifteeeeen…. ( If you can’t relate, you had a more privileged childhood than I did.)
Isn’t GPT-3 on the same track? It might not happen next year, it might not happen in the next five years, and it might not happen with a transformer architecture, but I feel natural language understanding is coming, and along with it, AI sentience. I am hopeful for the future. But that future will take some time to arrive.
Until next time, cheers.
Footnote (no need to read if you’re from a non-technical background): We already have larger models now, for example Google’s T5-XXL and Google’s Switch Transformer architecture. The Switch Transformer has 1.6 trillion (yes, trillion) parameters. But at the time of writing, we don’t have much visibility into what Switch Transformers can or cannot do, so I’m looking forward to that.