
RAG Chatbot (Python Fullstack template)

A RAG (Retrieval-Augmented Generation) based question-answering proof-of-concept (PoC) system that allows users to query target documents using natural language. The system uses local LLMs through Ollama for privacy and performance and provides a chat interface for easy interaction. The entire codebase, both backend and frontend, is written in Python 3.10.

🌟 Features

  • Local LLM Integration: Uses Ollama for running models locally, ensuring data privacy
  • Vector Search: Efficient document retrieval using FAISS
  • Modern Chat Interface: Built with Chainlit for a smooth user experience
  • Containerized Services: Easy deployment with Docker Compose
  • Async Processing: Built with FastAPI for high performance

πŸ”§ System Architecture

        
User ───────────────▢ Chainlit UI       Documents (txt/md/pdf)
                           β”‚                     β”‚
                           β”‚                     β”‚
                    Query  β”‚                     β”‚ Retrieve
                           β”‚                     β”‚ Documents
                           β–Ό                     β–Ό
                       FastAPI ◄───────────  RAG Model ◄─────── Local Ollama
                       Backend                   β”‚
                           β”‚                     β”‚
                           β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                                Return Answer
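
In code, the flow above roughly corresponds to the following minimal sketch using LangChain's community integrations. This is illustrative only; the actual implementation lives in backend/model.py and may differ, and the snippet assumes langchain, langchain-community, and faiss-cpu are installed:

    # Minimal RAG flow sketch -- not the project's actual backend code.
    # Assumes Ollama is serving the mistral and nomic-embed-text models.
    from langchain_community.embeddings import OllamaEmbeddings
    from langchain_community.llms import Ollama
    from langchain_community.vectorstores import FAISS
    from langchain.chains import RetrievalQA

    # Embed document chunks and index them in FAISS for vector search
    docs = ["FAISS performs efficient similarity search over embeddings."]
    vectorstore = FAISS.from_texts(docs, OllamaEmbeddings(model="nomic-embed-text"))

    # Retrieve relevant chunks and let the local LLM compose the answer
    qa = RetrievalQA.from_chain_type(
        llm=Ollama(model="mistral"),
        retriever=vectorstore.as_retriever(),
    )
    print(qa.invoke({"query": "What does FAISS do?"})["result"])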

πŸš€ Getting Started

Prerequisites

  • Docker
    • Linux: Follow the official Docker documentation for your distribution.
    • Windows: Download and install Docker Desktop for Windows.
    • macOS: Download and install Docker Desktop for Mac.
  • Docker Compose
  • Ollama (for running models locally)
  • Python 3.10+

Setup

Install Ollama

  • Install Ollama locally (for Mac):

    brew install ollama
    brew services start ollama
  • Install Ollama locally (for Linux):

    curl -fsSL https://ollama.com/install.sh | sh
  • Install Ollama locally (for Windows): Download and install Ollama from the official Ollama website.
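
To confirm the install worked (the Ollama server listens on port 11434 by default):

    ollama --version
    curl http://localhost:11434    # should reply "Ollama is running"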

Next Steps

  • Download required models (pull fetches them without starting a chat session; nomic-embed-text is an embedding model and cannot be run interactively):

    ollama pull mistral
    ollama pull nomic-embed-text
  • Clone the repository:

    git clone https://github.com/shantoroy/rag-chatbot-python-fullstack-template.git
    cd rag-chatbot-python-fullstack-template
  • Configure the .env file (see the Configuration section below for details)

  • Start the services (a quick health check follows this list):

    docker-compose build
    docker-compose up -d
  • Stop the services:

    docker-compose down
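
After docker-compose up -d, you can verify that both containers came up using the standard Compose commands:

    docker-compose ps         # both services should show as running
    docker-compose logs -f    # follow logs if something looks off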

Usage

  • Place your documents in the documents/ directory
  • Access the chat interface at http://localhost:8505
  • Start asking questions about your documents!
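
The Chainlit UI forwards questions to the FastAPI backend over HTTP. If you want to exercise the backend directly, a request along these lines should work; the port and route shown here are guesses for illustration, so check backend/api.py and docker-compose.yml for the real values:

    curl -X POST http://localhost:8000/ask \
         -H "Content-Type: application/json" \
         -d '{"question": "What is in test_file_1.txt?"}'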

πŸ—οΈ Project Structure

rag-chatbot-python-fullstack-template/
β”œβ”€β”€ backend/
β”‚   β”œβ”€β”€ model.py          # RAG model implementation
β”‚   └── api.py            # FastAPI backend
β”œβ”€β”€ frontend/
β”‚   └── app.py            # Chainlit chat interface
β”œβ”€β”€ docker/
β”‚   β”œβ”€β”€ backend.Dockerfile
β”‚   └── frontend.Dockerfile
β”œβ”€β”€ requirements/
β”‚   β”œβ”€β”€ backend_requirements.txt
β”‚   └── frontend_requirements.txt
β”œβ”€β”€ documents/            # Put/organize your documents here
β”‚   β”œβ”€β”€ test_file_1.txt 
β”‚   └── test_file_2.md
β”œβ”€β”€ .env.example          # Example file, rename to .env
β”œβ”€β”€ .gitignore
β”œβ”€β”€ docker-compose.yml    # Service orchestration
β”œβ”€β”€ requirements.txt      # Combined dependencies (unused; the per-service files above are used instead)
β”œβ”€β”€ chainlit.md           # Chainlit welcome screen
└── README.md


πŸ”’ Security

  • All processing is done locally through Ollama
  • No data leaves your infrastructure
  • Authentication can be added as needed

πŸ› οΈ Configuration

  • Rename the .env.example file to .env
  • Add your own secret key (see the notes below on generating one)

Environment variables (.env):
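
Only CHAINLIT_AUTH_SECRET is confirmed by this README; a sketch of what the file might look like (the commented-out names are illustrative, so check .env.example for the actual variables):

    # Required: secret used by Chainlit to sign auth tokens
    CHAINLIT_AUTH_SECRET=your-generated-secret
    # Illustrative placeholders -- verify against .env.example:
    # OLLAMA_BASE_URL=http://host.docker.internal:11434
    # LLM_MODEL=mistral
    # EMBEDDING_MODEL=nomic-embed-text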

Notes

To generate a CHAINLIT_AUTH_SECRET for your .env file, you can use the following command:

openssl rand -hex 32

This command uses OpenSSL to generate a secure random 32-byte hexadecimal string, which is suitable for use as an authentication secret. After running this command, you'll get a string that looks something like:

3d7c4e608f6df9a0e3e3ded3f1c3f384b9b3a9f9e5c1a0e2b4a8d1e0f2c3b4a7
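
If OpenSSL is not available (for example, on Windows), Python's standard library generates an equivalent 32-byte hex string:

    python -c "import secrets; print(secrets.token_hex(32))"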

You would then add this to your .env file:

CHAINLIT_AUTH_SECRET=3d7c4e608f6df9a0e3e3ded3f1c3f384b9b3a9f9e5c1a0e2b4a8d1e0f2c3b4a7

For Kubernetes, you'll need to encode this value as base64 before adding it to your secrets.yaml file:

echo -n "3d7c4e608f6df9a0e3e3ded3f1c3f384b9b3a9f9e5c1a0e2b4a8d1e0f2c3b4a7" | base64

Then use the resulting base64 string in your Kubernetes secrets configuration.
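
For reference, a minimal Secret manifest might look like the following (the metadata name here is illustrative, so match it to the templates under kubernetes-template):

    apiVersion: v1
    kind: Secret
    metadata:
      name: rag-chatbot-secrets    # illustrative name
    type: Opaque
    data:
      CHAINLIT_AUTH_SECRET: <base64-encoded value from the command above>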

Kubernetes Deployment

Sample Kubernetes config files are provided under the kubernetes-template folder. Modify the values before production use. Read the Deployment Steps guide for details.
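
Assuming the manifests fit your cluster after editing, they can be applied in one step:

    kubectl apply -f kubernetes-template/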

🀝 Contributing

  • Fork the repository
  • Create your feature branch (git checkout -b feature/amazing-feature)
  • Commit your changes (git commit -m 'Add amazing feature')
  • Push to the branch (git push origin feature/amazing-feature)
  • Open a Pull Request

πŸ“ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

  • Ollama for local LLM support
  • LangChain for RAG implementation
  • Chainlit for the chat interface
  • FastAPI for the backend framework
