From the course: Multimodal Programming Interfaces

Prompt engineering with visual context

- [Instructor] Let's use prompt engineering to try to get a better result when we're building something. Before, I did this where I wanted a moving object that tracks my cursor as I move around. Here's the screenshot example. I provided an example, and then it created this. Well, not exactly what I was looking for. What was I looking for? What I actually wanted was something that looks toward wherever the cursor is, kind of like a face, or an animated face, just like GitHub Copilot has on its landing page. Claude, in this case, thought, "Well, you wanted a glowing orb," and that's not exactly what I want.
So, a little bit of prompt engineering. What are we going to do? I'm going to scroll all the way down here to what I've written: "I want you to build an example logo of a face, it could be an emoji, but this emoji needs to move or tilt its head wherever the cursor is. I need you to do this instead of what you just built, where a light tracks the cursor." So I'm trying to specify a little bit better, right? Now, you may remember that before we had built something that looked really nice, and now this is slightly different, and things have changed drastically, so that's a problem.
Let's take a look. We can switch over to the latest version and see what it has built. It says, "Hey, that sounds like a good idea. Face reaction, eyes, head tilt, and expressions." So look at this. If I move the cursor, now I have something that makes this logo move its eyes. That's pretty cool. And I'm actually clicking here. Nothing. "Does random winks when cursor is at edges." Let's take a look at this. Yeah, there's a little bit of winking here, and that looks correct. So now we have something that is pretty good. What is the challenge with this approach?
Well, let's take a look at the code we have here. Again, this is probably a lot of Cascading Style Sheets (CSS) and perhaps a lot of JavaScript, which you will see here. The problem with this approach is that we're going to have to go back and forth with massive amounts of HTML and code. Now, we were finally able to get roughly what we want, but the approach I suggest is this: instead of taking a screenshot of everything we're seeing here, we could take a tiny screenshot of just the part we want to build. For example, maybe we just want the title, or this cool style that GitHub uses. I mean, GitHub is always at the forefront of nice design patterns. What I suggest is to focus on the smallest parts possible, so that when you're iterating, you can work on those instead of regenerating massive amounts of HTML.
So although I was able to put in more prompt engineering and generate something that looks closer to what I was looking for, this approach, especially when building websites, is going to be tremendously more challenging, because the amount of generated code is going to be much larger. This is probably the last time that we're going to use something like Claude or a web service. We will be looking into other, more specialized interfaces and tools so that we can see how it looks when we're closer to the code.
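
To make the behavior requested in the prompt concrete, here is a minimal sketch of the math behind a face that tilts its head and moves its eyes toward the cursor. This is not the code Claude generated in the video. The names `Point`, `FacePose`, `facePose`, and the default limits (`maxTiltDeg`, `maxPupilPx`, `reachPx`) are all assumptions chosen for illustration.

```typescript
// Hypothetical sketch: names and default values are illustrative assumptions,
// not the code shown in the video.
interface Point {
  x: number;
  y: number;
}

interface FacePose {
  tiltDeg: number; // head rotation in degrees (positive = cursor to the right)
  pupil: Point;    // pupil offset from the eye center, in pixels
}

// Keep a value inside the range [lo, hi].
const clamp = (v: number, lo: number, hi: number): number =>
  Math.min(hi, Math.max(lo, v));

// Compute how far the head should tilt and where the pupils should sit,
// given the center of the face and the current cursor position.
function facePose(
  center: Point,
  cursor: Point,
  maxTiltDeg = 15, // assumed maximum head tilt
  maxPupilPx = 6,  // assumed maximum pupil travel
  reachPx = 300    // assumed distance at which the effect saturates
): FacePose {
  const dx = cursor.x - center.x;
  const dy = cursor.y - center.y;

  // Head tilt follows the horizontal distance only and saturates at reachPx.
  const tiltDeg = clamp(dx / reachPx, -1, 1) * maxTiltDeg;

  // Pupils point along the cursor direction, with length capped at maxPupilPx.
  const dist = Math.hypot(dx, dy);
  const scale =
    dist === 0 ? 0 : (Math.min(dist / reachPx, 1) * maxPupilPx) / dist;

  return { tiltDeg, pupil: { x: dx * scale, y: dy * scale } };
}
```

In a browser, one way to use this would be to call `facePose` inside a `mousemove` listener, passing the event's `clientX` and `clientY` as the cursor, then apply `tiltDeg` as a CSS `rotate(...)` transform on the face and `pupil` as a `translate(...)` on the eyes. Keeping the math in a small pure function like this also fits the advice above: it is a small piece that can be iterated on without regenerating the whole page.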
