First steps toward GenAI and LLMs: my first experiment with a RAG-based chatbot, with features aimed at making it faster, more accurate, and more robust. Main feature: this RAG-based document retrieval and question-answering system can operate fully offline, with no API key required (once the LLM and embedding models have been downloaded/pulled).
Older, now-obsolete versions are kept in the repository to track progress over time.
A few bugs still exist and need to be fixed:
- Web scraping should be replaced with web search via Tavily integration.
- The web layout needs to be updated.
- More models need to be downloaded and tested.
- Version 1: Basic RAG implementation with FastAPI (desktop version also available)
- Version 2: Enhanced chatbot with FAISS (desktop version also available)
- Version 3: Advanced features and optimizations (v0, v1, v2) [Streamlit versions]
- Multi-format document support (PDF, Text, Code)
- OCR capabilities with PyMuPDF and Tesseract
- Intelligent PDF parsing with fallback mechanisms
- Code syntax highlighting and language detection
- FAISS vector store integration
- Multiple embedding options:
  - Ollama embeddings
  - HuggingFace embeddings
- Configurable model selection
- Asynchronous document processing
- Advanced caching system with:
  - LRU cache
  - Disk cache
  - Document fingerprinting
- Multi-threaded operations
- Multi-language support
- Context-aware responses
- Source attribution
- Response optimization
- Conversation state management
- Progress tracking
- Detailed logging
- Cache statistics
- Performance metrics
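The caching layers above can be combined around a content fingerprint: hash each document's bytes and use the digest as the cache key, so an unchanged file never triggers re-processing. A minimal sketch of the idea (the dict stands in for the real LRU/disk cache layers; function names here are illustrative, not the project's actual API):

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """Stable content hash; identical bytes always map to the same key."""
    return hashlib.sha256(data).hexdigest()

_CACHE: dict[str, object] = {}  # stand-in for the LRU + disk cache layers

def get_or_compute(data: bytes, compute):
    """Return the cached result for this content, computing it at most once."""
    key = fingerprint(data)
    if key not in _CACHE:
        _CACHE[key] = compute(data)  # e.g. chunk and embed the document
    return _CACHE[key]
```

Keying on the content hash rather than the filename means a renamed copy of an already-indexed document is still a cache hit.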
- Multi-Version Support: Three distinct versions with incremental improvements
- Document Processing: Handle PDFs, TXTs, and other text-based formats
- Vector Storage: FAISS-based efficient similarity search
- Multiple Interfaces: FastAPI and Streamlit implementations
- Async Processing: Enhanced performance with asynchronous operations
- Caching System: Optimized response times
- Testing Suite: Comprehensive unit and integration tests
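The vector storage feature above reduces to nearest-neighbour search over embeddings. A conceptual sketch in plain NumPy of what a flat FAISS index computes (FAISS does the same with optimized kernels and, at scale, approximate indexes):

```python
import numpy as np

def top_k(query: np.ndarray, store: np.ndarray, k: int = 2):
    """Brute-force nearest-neighbour search over stored embeddings."""
    dists = np.linalg.norm(store - query, axis=1)  # L2 distance to every vector
    idx = np.argsort(dists)[:k]                    # indices of the k closest
    return idx, dists[idx]
```

In the real pipeline, `store` holds the document-chunk embeddings and `query` is the embedded user question; the returned indices map back to the source chunks, which is what makes source attribution possible.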
```
RAG_new/
├── src/
│   ├── v1/        # Base implementation
│   ├── v2/        # Enhanced features
│   └── v3/        # Latest upgrades
├── data/          # Document and vector stores
├── tests/         # Testing suite
├── config/        # Configuration files
├── utils/         # Helper utilities
└── web/           # Web interface assets
```
```bash
# Clone repository
git clone https://github.com/yourusername/RAG_new.git

# Install dependencies
pip install -r requirements.txt

# Run tests
python run_tests.py

# Start web interface
python src/v1/RAG_Search_web.py
```

- Set up the Ollama LLM
- Configure the vector store path in config/settings.py
- Add documents to data/documents/
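The configuration step above points at config/settings.py; a minimal sketch of what such a file might contain (all names and defaults here are assumptions, not the project's actual settings):

```python
# config/settings.py — illustrative layout only; actual option names may differ
from pathlib import Path

VECTOR_STORE_PATH = Path("data/vector_store")  # where FAISS indexes are persisted
DOCUMENTS_PATH = Path("data/documents")        # drop PDFs / text files here
EMBEDDING_MODEL = "nomic-embed-text"           # hypothetical Ollama embedding model
LLM_MODEL = "llama3"                           # hypothetical Ollama chat model
```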
```python
from src.v1.RAG_Search_new import RAG_search

# Simple query
response = RAG_search("How does RAG work?")

# Document ingestion
from src.v1.RAG_Search_new import create_vector_store_from_pdfs
create_vector_store_from_pdfs("path/to/docs")
```

```bash
# Run all tests
pytest tests -v

# Run specific test category
pytest tests/unit -v
pytest tests/integration -v
```

- V1: Basic RAG implementation with FastAPI
- V2: Enhanced chatbot with FAISS
- V3: Advanced features and optimizations
- See requirements.txt for the full dependency list.
MIT License
See CONTRIBUTING.md for guidelines.