From the course: Introduction to Prompt Engineering for Generative AI

The AI-generated image landscape

- Advancements in AI have brought about the world of AI-generated video and images. We now have technology that can take a text description and generate imagery or video based on that prompt.

Now, this has brought about some controversies. One of the important questions around image and video generation is whether artists are getting enough credit and enough compensation for creating the images that were used to train these models. There's also the question of what is art? And this is similar to a debate that came about when cameras were first invented. Finally, there's the potential for misuse, the ability of people to create images to mislead or manipulate other people. In particular, this is a concern with fake news.

Now, two very popular systems are DALL-E and Midjourney. DALL-E was created by OpenAI, the company that created ChatGPT. It's a diffusion-based model, meaning it starts with random noise and then slowly removes that noise until it has a clean image. DALL-E was trained on millions of images, and it's available through Copilot as well as ChatGPT. Midjourney is extremely powerful. The way you interact with it is through the chat platform Discord. It has fewer free capabilities, so if you plan on using Midjourney a lot, you'll probably want to look into a paid subscription. Finally, there's the model Stable Diffusion by Stability AI. Now, Stable Diffusion is very powerful, and what makes it unique is that it's open source. This allows companies to use it as part of their own AI offerings.

When it comes to AI-generated videos, there are quite a few tools out there, but one that really stands out is OpenAI's Sora. Sora is also diffusion based, and it generates extremely realistic videos. At the time of this recording, Sora will occasionally struggle to represent some things. One that really stands out is hand movements. You can sometimes see that they're not accurately represented. There's also the issue of things coming in and out of the frame. Sometimes, the model can struggle to keep track of objects that are supposed to leave the frame and come back.
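
The diffusion idea mentioned earlier (start from noise, then remove a little of it at each step) can be pictured with a toy loop. This is only a sketch under heavy assumptions, not how DALL-E, Stable Diffusion, or Sora are implemented: the "image" is a short list of numbers, and the function names `toy_denoise` and `mean_abs_error` are made up for illustration. A real diffusion model uses a trained neural network to estimate the noise, while this sketch cheats by computing the noise from a known target.

```python
import random


def toy_denoise(target, steps=50, seed=0):
    """Toy illustration: start from pure noise and remove a little at each step.

    ASSUMPTION: the noise estimate is computed from a known clean target.
    A real diffusion model does not know the target; it uses a trained
    network to predict the noise instead.
    """
    rng = random.Random(seed)
    # Start from random noise with one value per "pixel" of the target.
    image = [rng.gauss(0.0, 1.0) for _ in target]
    history = [image]
    for step in range(steps):
        remaining = steps - step
        # Hypothetical stand-in for the model's noise prediction.
        predicted_noise = [x - t for x, t in zip(image, target)]
        # Remove an equal share of the remaining noise at each step.
        image = [x - n / remaining for x, n in zip(image, predicted_noise)]
        history.append(image)
    return history


def mean_abs_error(image, target):
    """Average distance between the current image and the clean target."""
    return sum(abs(x - t) for x, t in zip(image, target)) / len(target)


target = [0.0, 0.25, 0.5, 0.75, 1.0]
history = toy_denoise(target)
errors = [mean_abs_error(img, target) for img in history]
print(f"error at start: {errors[0]:.3f}, error at end: {errors[-1]:.3f}")
```

Printing the error at the start and at the end shows the noisy starting point being cleaned up gradually rather than in one jump, which is the part of the process the description above refers to.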
