bitsandbrainsai/enterprise-rag-multilingual-knowledge-engine

🏷️ Project Title

RAG Multilingual QA System (English & Arabic)

Enterprise-grade Retrieval-Augmented Generation (RAG) platform enabling bilingual question-answering with citation-verified document retrieval using vector embeddings and FAISS indexing.

🧾 Executive Summary

The RAG Multilingual QA System is a production-oriented AI knowledge engine designed to deliver fact-verified answers from structured English and Arabic document repositories. It implements a full Retrieval-Augmented Generation pipeline including ingestion, chunking, semantic embedding, vector indexing, query understanding, retrieval ranking, and citation-driven answer synthesis.

This system is designed for enterprise knowledge bases, regulatory compliance environments, multilingual customer support, internal documentation search, and AI-assisted information systems requiring explainability and traceability.


📑 Table of Contents

  • 🏷️ Project Title
  • 🧾 Executive Summary
  • 📑 Table of Contents
  • 🧩 Project Overview
  • 🎯 Objectives & Goals
  • ✅ Acceptance Criteria
  • 💻 Prerequisites
  • ⚙️ Installation & Setup
  • 🔗 API Documentation
  • 🖥️ UI / Frontend
  • 🔢 Status Codes
  • 🚀 Features
  • 🧱 Tech Stack & Architecture
  • 🛠️ Workflow & Implementation
  • 🧪 Testing & Validation
  • 🔍 Validation Summary
  • 🧰 Verification Testing Tools
  • 🧯 Troubleshooting & Debugging
  • 🔒 Security & Secrets
  • ☁️ Deployment
  • ⚡ Quick-Start Cheat Sheet
  • 🧾 Usage Notes
  • 🧠 Performance & Optimization
  • 🌟 Enhancements & Features
  • 🧩 Maintenance & Future Work
  • 🏆 Key Achievements
  • 🧮 High-Level Architecture
  • 🗂️ Project Structure
  • 🧭 How to Demonstrate Live
  • 💡 Summary, Closure & Compliance

🧩 Project Overview

This project implements a bilingual Retrieval-Augmented Generation system that answers user queries by dynamically retrieving the most relevant knowledge from an indexed corpus of English and Arabic documents.

Unlike a standalone LLM chatbot, the system is designed to avoid fabricated answers: every answer is grounded in retrieved document chunks and returned with citations to its sources.

Core Data Flow:

User → Language Detection → Query Embedding → FAISS Vector Search →
Top-K Chunks → Prompt Construction → LLM / Mock Generator → Answer + Citations
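
The first stage of this flow, language detection, can be illustrated with a small heuristic. This is only a sketch: the troubleshooting notes point to langdetect as the detector, whereas the hypothetical `detect_language` function below simply counts letters in the Arabic Unicode blocks, and the 50% threshold is an assumption.

```python
def detect_language(text: str) -> str:
    """Return "ar" if most letters are Arabic script, otherwise "en".

    Illustrative stand-in for a real detector; not the project's code.
    """
    arabic = 0
    letters = 0
    for ch in text:
        if ch.isalpha():
            letters += 1
            # Arabic block (U+0600..U+06FF) and Arabic Supplement (U+0750..U+077F).
            if "\u0600" <= ch <= "\u06FF" or "\u0750" <= ch <= "\u077F":
                arabic += 1
    if letters == 0:
        return "en"  # assumed default for empty or digits-only input
    return "ar" if arabic / letters > 0.5 else "en"


print(detect_language("What is the warranty period?"))
print(detect_language("ما هي مدة الضمان؟"))
```

A script-based check like this only separates Arabic from Latin-script text; a statistical detector is needed once more languages are added.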

🎯 Objectives & Goals

  • Build a bilingual (AR/EN) knowledge retrieval system
  • Guarantee answer grounding with citations
  • Support mock mode (no paid API required)
  • Enable CLI and Web-based interaction
  • Maintain low-latency, low-cost execution
  • Provide production-ready modular architecture

✅ Acceptance Criteria

| Requirement | Compliance |
| --- | --- |
| AR + EN documents indexed | Yes (10 files) |
| Semantic vector search | FAISS implemented |
| Citations provided | Yes |
| Mock mode supported | Yes |
| CLI & Web UI | FastAPI + CLI |
| Latency metrics | Included |

🔗 API Documentation

| Endpoint | Method | Description |
| --- | --- | --- |
| /query | POST | Accepts a user question and language code; returns an answer with citations |
| /health | GET | Service health check |

API Flow:

Client → FastAPI → Language Detection → Retriever → Generator → Response JSON
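
As a rough illustration of the request and response shapes, the sketch below builds a POST request for /query and parses a reply using only the standard library. The JSON keys (`question`, `language`, `answer`, `citations`) follow the field names listed in the UI section, but the exact schema, the helper names, and the `http://localhost:8000` base URL are assumptions rather than the documented contract.

```python
import json
from urllib.request import Request


def build_query_request(question: str, language: str,
                        base_url: str = "http://localhost:8000") -> Request:
    """Build (but do not send) a JSON POST request for the /query endpoint."""
    body = json.dumps({"question": question, "language": language},
                      ensure_ascii=False).encode("utf-8")
    return Request(
        url=f"{base_url}/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_query_response(raw: bytes) -> tuple[str, list[str]]:
    """Extract the answer text and citation list from a JSON response body."""
    payload = json.loads(raw.decode("utf-8"))
    return payload["answer"], payload.get("citations", [])


req = build_query_request("What is the warranty period?", "en")
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would send it to a running server; that step is left out here so the example runs offline.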

🖥️ UI / Frontend

  • CLI-based interactive prompt
  • FastAPI JSON-based web interface
  • Input fields: question, language
  • Output: answer, citations, source file list
  • Network calls handled via REST over HTTP
  • UI logic located in src/web_app.py

🔢 Status Codes

| Code | Meaning |
| --- | --- |
| 200 | Query processed successfully |
| 400 | Invalid query or missing parameters |
| 500 | Vector engine or model failure |
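
One way these three codes could map onto request handling is sketched below, independent of any web framework. The `handle_query` function, the error messages, and the choice to treat `language` as optional (since Arabic queries are auto-detected) are assumptions made for illustration.

```python
SUPPORTED_LANGUAGES = ("en", "ar")


def handle_query(payload: dict, answer_fn) -> tuple[int, dict]:
    """Return (status_code, body) for a /query-style payload."""
    question = payload.get("question")
    language = payload.get("language")
    # 400: the question is missing, empty, or not a string.
    if not isinstance(question, str) or not question.strip():
        return 400, {"error": "question is required"}
    # 400: a language was supplied but is not a supported code.
    if language is not None and language not in SUPPORTED_LANGUAGES:
        return 400, {"error": "language must be 'en' or 'ar'"}
    try:
        # 200: retrieval and generation both succeeded.
        return 200, answer_fn(question.strip(), language)
    except Exception as exc:
        # 500: the vector engine or the model raised an error.
        return 500, {"error": f"retrieval or generation failed: {exc}"}


def broken_backend(question, language):
    """Simulated backend failure, used only to exercise the 500 path."""
    raise RuntimeError("index not loaded")


print(handle_query({"question": ""}, broken_backend)[0])
print(handle_query({"question": "What is covered?"}, broken_backend)[0])
```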

🚀 Features

  • Multilingual embeddings for Arabic and English
  • FAISS vector similarity search with cosine similarity
  • Chunk-based retrieval for high recall
  • Document-level and chunk-level citation generation
  • Mock LLM for offline testing
  • FastAPI-powered REST interface
  • CLI-driven batch Q&A execution
  • Latency and cost observability
  • Pluggable embedding and LLM providers
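
To make the cosine-similarity search concrete, here is a pure-Python sketch of top-K retrieval. With FAISS, cosine similarity is commonly obtained by L2-normalizing vectors and searching an inner-product index; whether this repository does exactly that is not stated, and the `normalize` and `top_k` helpers and the toy 2-D vectors are illustrative only.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def top_k(query, vectors, k=2):
    """Return the k (score, index) pairs with the highest cosine similarity."""
    q = normalize(query)
    scored = []
    for idx, vec in enumerate(vectors):
        v = normalize(vec)
        # The inner product of unit vectors equals their cosine similarity.
        scored.append((sum(a * b for a, b in zip(q, v)), idx))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


chunk_vectors = [[1.0, 0.0], [0.7, 0.7], [0.0, 1.0]]
for score, idx in top_k([1.0, 0.1], chunk_vectors):
    print(idx, round(score, 3))
```

This brute-force loop is linear in the number of chunks; a vector index performs the same ranking much faster on large corpora.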

🧱 Tech Stack & Architecture

| Layer | Technology |
| --- | --- |
| Language | Python 3.10+ |
| Vector Engine | FAISS |
| Embeddings | SentenceTransformers / OpenAI |
| Web API | FastAPI |
| Testing | pytest |
| Packaging | Docker |

ASCII Architecture Diagram:

             ┌────────────┐
             │  Documents │
             └─────┬──────┘
                   │
            ┌──────▼───────┐
            │  Chunker     │
            └──────┬───────┘
                   │
            ┌──────▼───────┐
            │ Embeddings   │
            └──────┬───────┘
                   │
            ┌──────▼───────┐
            │ FAISS Index  │
            └──────┬───────┘
                   │
User Query → Embedding → Vector Search → Top-K Chunks → Generator → Answer + Sources
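
The Chunker stage in the diagram can be approximated with a sliding window over words. Note the substitution: the README describes semantic chunking, while this sketch uses fixed-size windows with overlap because that is simpler to show. The function name and the `size` and `overlap` defaults are assumptions.

```python
def chunk_words(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into windows of `size` words that share `overlap` words."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        # Stop once a window has reached the end of the text.
        if start + size >= len(words):
            break
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
for chunk in chunk_words(sample, size=4, overlap=1):
    print(chunk)
```

Overlap keeps a sentence that straddles a boundary retrievable from either neighboring chunk, at the cost of a slightly larger index.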

🛠️ Workflow & Implementation

  1. Load English and Arabic documents from the data directory
  2. Split each file into semantic chunks
  3. Convert each chunk into a vector embedding
  4. Store embeddings in FAISS vector index
  5. User submits a query (CLI or API)
  6. Query is embedded
  7. FAISS retrieves top-K closest chunks
  8. Chunks are injected into a generation prompt
  9. Mock or OpenAI LLM produces answer
  10. Citations are attached from source documents
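
The ten steps above can be sketched end to end in a few lines. Everything here is a toy stand-in: token counts replace a real embedding model, a sorted list replaces the FAISS index, the generator is a mock, and the document text is invented sample data. Only the file names come from the project structure.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase token counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def build_index(docs: dict[str, list[str]]):
    """Steps 1-4: embed every chunk and keep its source file alongside it."""
    return [(src, chunk, embed(chunk))
            for src, chunks in docs.items() for chunk in chunks]


def answer(query: str, index, k: int = 2) -> dict:
    """Steps 5-10: embed the query, rank chunks, return a mock answer with citations."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[2]), reverse=True)[:k]
    context = " ".join(chunk for _, chunk, _ in ranked)
    return {
        "answer": f"[mock] Based on the retrieved context: {context}",
        "citations": [src for src, _, _ in ranked],
    }


docs = {  # invented sample text, already split into chunks
    "warranty_policy_en.txt": ["the warranty period is two years"],
    "safety_manual_en.txt": ["wear gloves when handling the device"],
}
index = build_index(docs)
result = answer("how long is the warranty period", index, k=1)
print(result["citations"])
print(result["answer"])
```

Keeping the source file next to each chunk at indexing time is what makes the final citation step a simple lookup.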

🧪 Testing & Validation

| ID | Area | Test | Expected Result |
| --- | --- | --- | --- |
| T1 | Indexing | FAISS build | Vector index created |
| T2 | Query | English Q&A | Correct answer returned |
| T3 | Arabic | Arabic Q&A | Correct retrieval |
| T4 | Mock Mode | No API call | Offline success |

🔍 Validation Summary

All major system components were validated including ingestion, vector search, multilingual embeddings, citation accuracy, and mock-mode execution. Both Arabic and English pipelines achieved deterministic retrieval and reproducible responses.


🧰 Verification Testing Tools

  • pytest for automated regression testing
  • FAISS vector consistency validation
  • CLI-based functional testing
  • FastAPI request validation

🧯 Troubleshooting & Debugging

  • Missing FAISS index → rebuild vector store
  • Zero search results → verify embedding model
  • Wrong language output → check langdetect
  • Slow responses → reduce chunk size or top-K
  • API errors → verify environment variables

🔒 Security & Secrets

  • API keys stored in .env file
  • No secrets committed to GitHub
  • Mock mode avoids external calls
  • All network calls encrypted over HTTPS
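
A minimal sketch of how a .env-based key and the mock fallback might fit together is shown below. The variable name `OPENAI_API_KEY`, the `parse_dotenv` and `resolve_mode` helpers, and the rule "no key means mock mode" are assumptions; the repository may use a dedicated dotenv library or an explicit mode flag instead.

```python
import os


def parse_dotenv(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_mode(env=None) -> str:
    """Return "openai" when a key is configured, otherwise "mock"."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()  # assumed variable name
    return "openai" if key else "mock"


settings = parse_dotenv("# local secrets, never committed\nOPENAI_API_KEY=\n")
print(resolve_mode(settings))
```

Falling back to mock mode when the key is absent means a fresh checkout runs without any secret being present.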

☁️ Deployment

  • Local: Python + FastAPI
  • Dockerized deployment for production
  • Cloud compatible with AWS, DigitalOcean, GCP
  • Stateless API with persistent FAISS volume

⚡ Quick-Start Cheat Sheet

  • Build index
  • Run CLI for Q&A
  • Start FastAPI for web usage
  • Use mock mode for offline testing

🧾 Usage Notes

  • Always rebuild index after document changes
  • Arabic queries auto-detected
  • Top-K chunks configurable

🧠 Performance & Optimization

  • FAISS IVF indexes for large corpora
  • Batch embedding for faster ingestion
  • GPU-accelerated FAISS supported

🌟 Enhancements & Features

  • PDF and DOCX ingestion
  • Multilingual expansion
  • Hybrid BM25 + vector search
  • Role-based access control

🧩 Maintenance & Future Work

  • Scheduled index rebuilds
  • Document versioning
  • Semantic caching
  • LLM fine-tuning

🏆 Key Achievements

  • Full bilingual RAG pipeline
  • Explainable AI via citations
  • Mock + production modes
  • Enterprise-grade modular design

🧮 High-Level Architecture

User → API / CLI → Language Detection → Embedding Engine → FAISS Index → Top-K Chunks →
Prompt Assembler → LLM / Mock Generator → Answer + Source Files

🗂️ Project Structure

rag-multilingual-qa-system/
│
├── data/
│   ├── product_catalog_en.txt
│   ├── product_catalog_ar.txt
│   ├── warranty_policy_en.txt
│   ├── warranty_policy_ar.txt
│   ├── safety_manual_en.txt
│   ├── safety_manual_ar.txt
│   ├── company_policy_en.txt
│   ├── company_policy_ar.txt
│   ├── technical_specs_en.txt
│   └── technical_specs_ar.txt
│
├── src/
│   ├── config.py
│   ├── ingest.py
│   ├── chunker.py
│   ├── embedder.py
│   ├── indexer.py
│   ├── retriever.py
│   ├── generator.py
│   ├── cli_app.py
│   └── web_app.py
│
├── tests/
├── build_index.py
├── qa_cli.py
├── Dockerfile
├── requirements.txt
└── README.md

💡 Summary, Closure & Compliance

This RAG Multilingual QA System is designed to meet the core requirements of an enterprise AI knowledge system: explainability, multilingual support, deterministic retrieval, testability, and deployment readiness.

The architecture aligns with modern GenAI compliance standards for:

  • Source traceability
  • Model governance
  • Data integrity
  • Regulatory-safe AI usage

This solution is suitable for regulated industries, enterprise knowledge bases, legal research, support automation, and multilingual document intelligence platforms.

About

Enterprise-grade multilingual RAG knowledge engine implementing FAISS-powered dense vector retrieval, semantic chunking, embedding pipelines, and citation-grounded LLM inference. Supports Arabic & English Q&A via FastAPI microservices, hybrid mock/OpenAI execution, and scalable document intelligence with production-ready retrieval orchestration.
