Unleashing the Power of Generative AI: A Game-Changer for Our Lives

Once again, the disruptive advances of scientific and technological innovation make the headlines. And like an information avalanche, they fill most communication channels, from blogs and social media to YouTube tutorials and webinars, with the fantastic potential of Artificial Intelligence (AI) and how it can change our lives forever. In fact, AI has been among us for a while now, optimizing the operations we perform every day at home, at work, and in our cars (even though these are not yet flying or driving on their own). And unfortunately, most of these writers and speakers spend very little time explaining what is inside what many call “the magic box”.

And maybe it is the mysterious nature of that “magic black box” that has been attracting audiences from all over the world, with the most varied range of technical skills. Generative AI can be defined as a machine learning system that generates new content from an often relatively simple input (e.g. a few keywords or prompts), based on deep learning algorithms and the data they have been trained on. One of its core techniques is the generative adversarial network (GAN), consisting of two neural networks working together: a generator that creates new data and a discriminator that evaluates it. The generator improves its output based on the discriminator's feedback until it produces convincing content. Among its superpowers you can count dataset enrichment, image and text generation from key phrases, improved image resolution, and video prediction, just to name a few.
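To make the adversarial idea concrete, here is a deliberately tiny sketch of that generator/discriminator loop, with both players reduced to one-dimensional linear models so the whole dynamic fits in a few lines of plain Python. The data distribution, learning rate and model shapes are illustrative assumptions for this sketch, not part of any real GAN implementation:

```python
import math
import random

random.seed(0)

def sigmoid(x):
    x = max(-60.0, min(60.0, x))  # clamp to avoid math.exp overflow
    return 1.0 / (1.0 + math.exp(-x))

# "Real" data: samples from a Gaussian centred at 4.0.
def real_sample():
    return random.gauss(4.0, 0.5)

# Generator: maps noise z to a*z + b (a tiny stand-in for a neural network).
gen = {"a": 1.0, "b": 0.0}
# Discriminator: logistic classifier d(x) = sigmoid(w*x + c).
disc = {"w": 0.0, "c": 0.0}

LR = 0.05

for step in range(2000):
    z = random.gauss(0.0, 1.0)
    fake = gen["a"] * z + gen["b"]
    real = real_sample()

    # Discriminator update: push d(real) toward 1 and d(fake) toward 0.
    for x, label in ((real, 1.0), (fake, 0.0)):
        p = sigmoid(disc["w"] * x + disc["c"])
        grad = p - label                    # cross-entropy gradient w.r.t. the logit
        disc["w"] -= LR * grad * x
        disc["c"] -= LR * grad

    # Generator update: push d(fake) toward 1, i.e. fool the discriminator.
    p = sigmoid(disc["w"] * fake + disc["c"])
    grad_fake = (p - 1.0) * disc["w"]       # chain rule through the discriminator
    gen["a"] -= LR * grad_fake * z
    gen["b"] -= LR * grad_fake

fakes = [gen["a"] * random.gauss(0, 1) + gen["b"] for _ in range(500)]
mean_fake = sum(fakes) / len(fakes)
print(f"mean of generated samples: {mean_fake:.2f} (real data mean: 4.0)")
```

In a real GAN both players are deep neural networks trained on images rather than numbers, but the feedback loop is the same: the discriminator's gradient tells the generator in which direction its outputs look “more real”.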


Trio of images generated by the Dall-E 2 algorithm released by Open AI, output of the query “aliens with carnival masks in a party, realistic photography”


One of the most popular breakthroughs in what we now call Generative AI (or GAI for short) was the creation of new images using deep learning methods trained on very large datasets of images. Already in 2018, NVIDIA researchers published a paper detailing their work on generating photo-realistic portraits of humans indistinguishable from images of real people. Many AI-driven image generators followed, offering text-to-image functionality, including well-known brands like Shutterstock and Adobe, and allowing artists like Milan Jaram to recreate the “humans” behind the Simpsons or Disney characters. The GAI system named Dall-E, announced by OpenAI in a blog post in January 2021, has been a popular example of this. It uses a version of the autoregressive language model released in 2020, the now famous Generative Pre-trained Transformer 3 (GPT-3), available at github.com/openai/gpt-3. Because we are in the Carnival season, I asked the Dall-E 2 algorithm to provide me with a realistic photograph of aliens having fun with carnival masks at a party, and there they are in the figure above. Somewhat impressive.
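To give a feel for what “autoregressive” means: the model generates text one token at a time, each new token conditioned on what has been generated so far. The toy sketch below does this with a character-level bigram table instead of a transformer; the corpus and the one-character context are illustrative simplifications, whereas GPT-3 conditions on thousands of tokens with billions of parameters:

```python
import random
from collections import Counter, defaultdict

random.seed(42)

corpus = "the cat sat on the mat and the cat saw the rat"

# Count how often each character follows each character (order-1 context).
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def sample_next(ch):
    """Sample the next character from the conditional distribution P(next | ch)."""
    chars, weights = zip(*counts[ch].items())
    return random.choices(chars, weights=weights)[0]

# Autoregressive generation: each new character is conditioned on the text so far
# (here only on the last character).
text = "t"
for _ in range(30):
    text += sample_next(text[-1])
print(text)
```

Scaling this same generate-one-token-then-condition-on-it loop up to a deep network trained on a large slice of the web is, in essence, what makes GPT-3 work.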

It is very important to note that generative AI, particularly in the technical domain of computer vision, has an invaluable impact in areas like cancer research, e.g. in applying unsupervised methods to distinguish cancerous from healthy tissue by generating healthy cell images that enrich the learning dataset, or in performing cell segmentation for early blood cancer diagnosis. Contributing to these global efforts is also a bright Slovenian team I have been engaged with in the past years through the European Commission project iPC, which is building virtual mechanistic models in a personalized medicine context for pediatric cancers, combining knowledge-based, machine learning and mechanistic models to predict optimal standard and experimental therapies.

And yes, we will also talk about ChatGPT (and use it), but we will be discussing many other marvelous topics around AI and how it is changing the world around us in this series of 10 articles prepared in collaboration with the SWForum. We also want to hear about your opinions and experiences on the topics discussed here, so feel free to express yourself and tell us your worries and expectations. My name is Joao Pita Costa. I am part of the International Research Centre on AI under the auspices of UNESCO, with several papers published, talks given and technologies released in machine learning and its applications, and I collaborate with several educational institutions on AI-related topics.

But enough about me and more about generative AI. If we now ask ChatGPT directly what it is, we get an answer along the following lines:

“I am ChatGPT, an AI language model developed by OpenAI. My main function is to generate human-like text in response to various inputs, making me a useful tool for a variety of applications, including customer service, chatbots, and virtual assistants. I have been trained on a massive amount of text data, allowing me to generate responses that are not only grammatically correct but also contextually appropriate. I am constantly learning and improving, and I am a cutting-edge example of the advancements being made in the field of artificial intelligence”.
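Asking ChatGPT programmatically looks roughly like the sketch below, using OpenAI's Python package. The model name, method and response shape reflect the API as available at the time of writing and may well change, so treat this as an assumption-laden sketch rather than a reference:

```python
import os

# The chat payload: a system instruction plus the user question.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What are you, in one paragraph?"},
]

def ask_chatgpt(messages):
    """Send the conversation to the ChatGPT API (requires an OPENAI_API_KEY)."""
    import openai  # assumes the `openai` package is installed
    openai.api_key = os.environ["OPENAI_API_KEY"]
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=messages,
    )
    return response["choices"][0]["message"]["content"]

# Only actually call the API when a key is configured.
if "OPENAI_API_KEY" in os.environ:
    print(ask_chatgpt(messages))
```

The `messages` list is the whole conversation state: to continue the dialogue, you append the assistant's reply and your next question, then send the list again.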

The concepts and code for machine learning on text have been around since the early years of AI research in the 1950s. I have been lucky to have a front-row seat to great achievements through collaborations over the past decade, enough to know how hard text mining problems are. After finishing my PhD in Mathematics, I joined the AI lab of the Jozef Stefan Institute, one of the largest science institutes in the region, with a fruitful production of research results and technologies in the area of machine learning on text. Also for this reason, I would like to see more blog authors and YouTubers trying to understand and communicate how the “magic box” works, so that we do not fall into a storyline similar to that of the “God Particle”, the nickname journalists gave to the hard-to-understand theoretical physics breakthrough related to the Higgs boson in 2013.

Code generation is certainly another hot topic, previously addressed by initiatives like the coding assistant GitHub Copilot and IBM's ongoing Project Wisdom, which together with Red Hat aims to ease cloud automation by allowing the use of plain English. The latter is based on CodeNet, the largest dataset of its kind, released in 2021 and aimed at teaching AI to code. It enabled leading AI research institutions like DeepMind to develop AlphaCode, ranked within the top 54% of participants in programming competitions with new problems requiring critical thinking and natural language understanding. On the other hand, we are now seeing more and more programmers using ChatGPT to take over their routine tasks. What will this make of the future of the software industry, even without AlphaCode's hypothesis of “the computer learning how to program itself”?

And who owns this AI-generated content? This week, the European Commission's IP Helpdesk published an article about it, exploring the ownership of the queries and outputs of ChatGPT in particular, but its conclusions can be extended to other GAI tools that reuse the content in their learning datasets (with lawsuits already occurring, as in Getty Images vs. Stability AI last month). The IP Helpdesk webinar on the 5th of April will focus on “IP and Artificial Intelligence” to address these kinds of legal issues around AI outputs.