Artificial intelligence generates realistic images based on one sentence

DALL-E 2, a program that uses artificial intelligence, is capable of generating images based on text descriptions.

tvxposed

Jun 3, 2022 - 22:32

Research firm OpenAI has developed a program that can turn simple text instructions into high-quality images.

The program is called DALL-E 2, a name that combines Salvador Dalí and WALL-E, the robot from the animated film of the same name. It uses artificial intelligence to create realistic photos or images based on text descriptions provided by the user.

Descriptions can be quite complex and can combine a variety of actions, artistic styles, and multiple subjects. Examples on the OpenAI blog include "an astronaut riding a horse in a photorealistic style" and "teddy bears mixing chemicals as mad scientists in a steampunk style".

DALL-E 2 is an upgrade of the company's previous tool, DALL-E, which was launched in January 2021. The new iteration delivers stunning results thanks to higher-resolution images, better text comprehension, faster processing, and some new features.

The artificial intelligence is trained on images and their text descriptions so that it understands the relationships between objects. "Through deep learning, it not only understands individual objects like koalas and motorcycles but learns from relationships between objects. When you ask for a picture of a koala riding a motorcycle, DALL-E knows how to create it, or anything else with a relationship to another object or action," the company says.

The program is not yet available to everyone. Currently, only a small number of users have access, and the company says it plans to add 1,000 new users a week.

A community of users on Reddit shares the results they have achieved with DALL-E 2.

Val Kilmer was able to speak in Top Gun: Maverick thanks to artificial intelligence

Sonantic, a London-based artificial intelligence company that builds voice models from recordings of actors reading lines from existing scripts, enabled Val Kilmer to speak in Top Gun: Maverick. The process involves feeding the recorded voices into Voice Engine, the company's proprietary engine for training the artificial intelligence model. Voice Engine itself offers a number of manual and automated features for validating the quality of the resulting model.

In Kilmer's case, some manual work was required because the source material consisted of recordings of his own voice from when he could still speak normally. The company had only a limited amount of material to work with, and it had to be meticulously cleaned to remove background noise without damaging the spoken content. Transcripts of the audio were then generated, and the audio and text were paired in short chunks to train the Voice Engine model. Even so, the resulting dataset was around ten times smaller than is normal for such a project.

Because the algorithms used in Voice Engine could not deliver the required results with so little data, Sonantic decided to research and develop new algorithms that could produce higher-quality output from the available material. The company eventually generated 40 alternative voice models and chose the one that performed best in terms of quality and expressiveness. The new algorithms developed to replicate Val Kilmer's voice have also been added to Voice Engine, so future clients can use them.