It's been a year since OpenAI set an important milestone in the development of generative AI with ChatGPT. A lot has happened since November 2022, but the next big evolutionary leap is still to come: interactive AI, meaning AI applications that communicate with each other. Our author Falk Hedemann provides an overview.
Shortly after the launch of ChatGPT, technology experts realised that they were not witnessing just another trend that would soon fade away. Comparisons were quickly drawn with another major technology launch: generative AI had its iPhone moment with ChatGPT.
If we take this comparison further, we are only at the beginning of a significant development. Compared to today's smartphone models, the first iPhone was more of a promise of the future. And this is exactly how we should view the first generation of generative AI: it is a preview of developments to come, and at best we can only guess today in which direction it will evolve.
As convincingly as today's AI tools may appear intelligent, they are not actually intelligent. They only seem so because they can recognise and reproduce patterns in gigantic mountains of data. That alone marks a major step forward in their development. However, today's AI tools are still a long way from being strong AI.
By this, AI researchers mean "Artificial General Intelligence" (AGI), which does not merely simulate human communication by recognising patterns. Rather, an AGI can understand or learn any intellectual task that a human being can perform. If the capabilities of an AI even exceed human intelligence, this is referred to as superintelligence or the singularity.
Both AGI and the singularity are currently regarded as theoretical constructs of technological futurology, and it is unclear whether they will ever become reality. The current generation of AI tools, on the other hand, is categorised as "weak AI": models that have no creativity of their own and cannot learn new skills independently. Their strength lies in recognising patterns in large amounts of data (machine learning). Weak AIs perform clearly defined tasks according to a fixed methodology and are particularly suited to recurring, complex tasks.
ChatGPT as a text AI and DALL-E 3 as an image AI both belong to the category of weak AI. Each fulfils its programmed task very well, but ChatGPT will not suddenly generate images, nor DALL-E 3 texts.
Since the development of an AGI is not yet in sight, but the tools of weak AI are already delivering very good results, combining them is a logical intermediate step. OpenAI, the provider of the two generative AI tools mentioned above, has already started working on this, as its developers explain in a blog post.
British AI researcher and entrepreneur Mustafa Suleyman calls this type of communication between AI tools "interactive AI". He is convinced that in future we will no longer instruct our AI via graphical interfaces and typed text, but will simply talk to it. More complex tasks will then also become possible, processed jointly by different AI tools. At least, that is how he sees the future of his own ChatGPT alternative, Pi.
Let's stay in the field of generative AI for now. With a whole set of AI tools that can communicate with each other to complete complex tasks together, we could have a complete content asset created with just one prompt. The digital assistants would not only write a text for us, but also produce the appropriate illustration, a podcast in the author's voice, an explanatory video with subtitles, and social media posts perfectly tailored to each platform and the target group reachable there. At least five different AI tools would be involved.
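Conceptually, such a pipeline is an orchestration problem: one prompt fans out to several specialised tools whose outputs build on each other. The following minimal sketch illustrates the idea; the tool functions are hypothetical stubs standing in for real AI services, and a real orchestrator would call the respective model APIs instead.

```python
# Sketch of an "interactive AI" content pipeline.
# Each function below is a hypothetical stub for a specialised AI tool;
# in practice these would be API calls to text, image, voice, video
# and social media models.

def write_article(prompt: str) -> str:
    return f"Article about: {prompt}"

def illustrate(article: str) -> str:
    return f"Illustration matching '{article}'"

def narrate(article: str) -> str:
    return f"Podcast audio of '{article}' in the author's voice"

def make_video(article: str) -> str:
    return f"Explainer video with subtitles for '{article}'"

def social_posts(article: str, platforms: list[str]) -> dict[str, str]:
    return {p: f"{p} post teasing '{article}'" for p in platforms}

def create_content_asset(prompt: str) -> dict:
    """One prompt in, a complete bundle of content assets out."""
    article = write_article(prompt)            # text AI
    return {
        "article": article,
        "image": illustrate(article),          # image AI
        "podcast": narrate(article),           # voice AI
        "video": make_video(article),          # video AI
        "social": social_posts(article, ["LinkedIn", "X"]),  # social AI
    }

asset = create_content_asset("Interactive AI")
print(sorted(asset))  # → ['article', 'image', 'podcast', 'social', 'video']
```

The key design point is that the downstream tools consume the article produced by the first tool rather than the raw prompt, which is what distinguishes cooperating AIs from five independent one-shot generations.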
In a CNBC interview, Suleyman, who is also a co-founder of DeepMind, which became Google's AI department following its acquisition in 2014, outlines a near future in which everyone has a personal AI assistant. This assistant would be highly intelligent, know its user very well and be able to organise their day. It would be much more than a research tool: an advisor and friendly companion, too.
According to Suleyman, this scenario could be a reality in five years' time. Whether it will actually happen that quickly and go that far remains to be seen. Until then, numerous legal and ethical questions still need to be answered. And last but not least, it will also be a question of user acceptance.
Text: Falk Hedemann