From the course: Hands-On AI: Build a RAG Model from Scratch with Open Source

Setting up a dev container

- [Instructor] Before we take a deep dive, we want to take the time to ensure that everyone is able to follow along, regardless of your coding environment or operating system. For this reason, we'll first cover how to run a dev container. A dev container sets up everything you need so you can focus more on the science and less on the engineering. And while this can be quite helpful, I advise that you also follow the rest of the course to ensure that you understand how to deploy everything yourself without using the dev container. But of course, if necessary, you can always come back and watch this video to spin up an environment that already has everything running in one place. This will require that you have VS Code downloaded, or that you're using VS Code through GitHub Codespaces, which you can find at the link here. All the code that I present throughout the course will be linked as exercise files in the GitHub repo. I've organized the code according to the video numbers. I'll be using a mix of Python and Bash commands to get everything up and running, with an estimated 90/10 split. So it'll be very helpful if you have experience with Python and some basic Bash commands. That said, I'm going to explain everything as thoroughly as possible and use the simplest versions of the code that I can, so that even someone without Python experience, but with some basic prior programming experience, can follow along. We'll be coding in a Unix environment on an Ubuntu operating system. While you don't need prior experience with shell commands, some familiarity with Unix-based systems will make it easier to follow along. Running a model locally is computationally taxing on your computer, so to allow even learners with the smallest computers to follow along, we're going to be running the smallest models possible in the dev container.
These models can also be helpful in GitHub Codespaces if you want to avoid exorbitantly high computational costs. But of course, we can't guarantee that they will eliminate all fees, as that depends on how long it takes each learner to complete the course and whether they've allocated their free hours elsewhere. Our first step is going to be to clone our repository, which you can find at the link here. Once you download it, you'll see that it has a folder called .devcontainer, and that's what we're going to use to spin everything up very quickly. But before we do that, we'll have to make sure to update the HF username and HF token variables on lines 15 and 16 of our devcontainer.json file, which lives within our .devcontainer directory. If you don't have Hugging Face credentials, you can visit huggingface.co and sign up for free. Now that we have that done, we're ready to move forward. So let's go ahead and run it by going to the Command Palette at the top of VS Code and typing a greater-than sign followed by "Dev Containers: Open Folder in Container". You see that it already popped up for us here. Once we do that, make sure to select this repository, rag models, and VS Code will find the dev container on its own and spin everything up for us. So we click Open. Now, this process is going to take a while. It took me about 10 minutes, and the time may vary depending on your setup, so be patient and give it some time to finish. For the sake of time, we'll fast-forward this video. We now see that our installation is just about complete; in the end, this took well over 10 minutes. So now let's make some use of it. Let's go ahead and click the plus sign here to create a new terminal. And here we see that the user is vscode, so we're inside the container now, and we need to add some final touches. At this point, these will seem like arbitrary commands that you're just going to follow, but as the course continues, you'll learn more about exactly what these commands are for.
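As a rough sketch of the edit described above, the relevant lines of devcontainer.json might look something like the following. The exact variable names and structure depend on the course repository, so treat the names HF_USERNAME and HF_TOKEN here as illustrative placeholders, not the repo's actual contents:

```json
{
  "name": "rag-models",
  "containerEnv": {
    "HF_USERNAME": "your-hugging-face-username",
    "HF_TOKEN": "your-hugging-face-access-token"
  }
}
```

Whatever the exact names in the repo's file, the idea is the same: replace the placeholder values with your own Hugging Face username and access token before opening the folder in the container.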
So now that we've opened up the terminal using the plus sign, let's go ahead and run ollama serve. And that's it. This needs to stay running somewhere in the background, so we're going to open up a new terminal. We can go back to the old terminal there, and we can go to the new one here. Let's go ahead and start Postgres, and we're going to do that with sudo service postgresql start. And that's it. Now let's log into our Postgres database as the user postgres with psql -U postgres, with a capital U. Great. And now we're going to create a database called text_embeddings. Awesome. Next, let's connect to that database, and we do that with \c text_embeddings. Note that psql meta-commands like this one start with a backslash. And now we see that we're connected to the database text_embeddings, which we just created, as the user postgres. In this database, we want to run CREATE EXTENSION vector; and hit Enter. And now that we've done all of that, let's go ahead and quit, again with a backslash: \q. And that's it. Now, we have one final bit of installation that we're going to complete through Python. So let's go ahead and open Python with python3. Now let's import nltk, and run nltk.download('punkt_tab'). And that's it, we've downloaded it. Finally, let's quit with quit(). And that's more or less all we need to do. We now have our environment ready, and we can use Ollama, Postgres, and Python all in one place. Now, please do keep in mind that it's quite important to understand how to get everything running yourself. So do pay close attention to the rest of this chapter so that you can start to understand why we were running these commands, why they're important, and how they contribute to the science at the end of the day.
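The terminal steps above can be collected into one short sketch, assuming the dev container already provides Ollama, PostgreSQL with the pgvector extension available, and the nltk Python package installed:

```shell
# Terminal 1: start the Ollama server and leave it running in the background.
ollama serve

# Terminal 2: start PostgreSQL.
sudo service postgresql start

# Create the database, connect to it, enable the pgvector extension, and quit.
# psql meta-commands such as \c and \q begin with a backslash.
psql -U postgres <<'SQL'
CREATE DATABASE text_embeddings;
\c text_embeddings
CREATE EXTENSION vector;
\q
SQL

# Finally, download the NLTK tokenizer data used later in the course.
python3 -c "import nltk; nltk.download('punkt_tab')"
```

These are environment-setup commands, so they only make sense inside the running dev container; running them elsewhere requires Ollama, PostgreSQL, pgvector, and nltk to be installed first.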