This project is a FastAPI-based application that downloads PDFs, extracts text, builds a FAISS vector index using sentence-transformers/all-MiniLM-L6-v2, and answers questions using the Mistral model via Ollama. It processes various documents (e.g., Indian Constitution, insurance policies, technical manuals) and handles queries through a REST API endpoint.
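The retrieval step can be pictured with a toy example. The sketch below is not the project's code: it swaps the all-MiniLM-L6-v2 embeddings for plain word counts and FAISS for a brute-force cosine-similarity scan, only to show the shape of "embed the chunks, embed the question, return the closest chunks". The names `embed`, `cosine`, and `top_k` and the sample chunks are made up for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a sentence embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_k(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question (brute-force scan)."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


# Illustrative chunks only; real chunks would come from the extracted PDF text.
chunks = [
    "Article 17 abolishes untouchability and forbids its practice.",
    "The spark plug gap should be checked during servicing.",
    "Article 1 names the Union: India, that is Bharat.",
]
best = top_k("What does Article 17 abolish?", chunks, k=1)
```

In the real pipeline the selected chunks would be passed to the LLM as context; a dense embedding model matches on meaning rather than on shared words, which is why the project uses one.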
- PDF Processing: Downloads and extracts text from PDFs.
- Vector Indexing: Uses FAISS and HuggingFace embeddings for efficient document retrieval.
- Question Answering: Leverages the Mistral model (Ollama) to answer questions based on PDF content.
- Concurrency Control: Limits parallel requests to prevent server overload.
- Retry Logic: Handles transient failures (e.g., timeouts) with retries.
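The retry behaviour can be sketched with the standard library alone. requirements.txt lists tenacity, which presumably provides the project's actual retry logic; the version below only illustrates retrying a flaky call with exponential backoff. `retry_call`, its parameters, and `flaky` are hypothetical names.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_call(
    fn: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    retry_on: tuple[type[BaseException], ...] = (TimeoutError, ConnectionError),
) -> T:
    """Call fn, retrying on transient errors with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")


# Usage: a call that times out twice, then succeeds on the third attempt.
calls = {"n": 0}


def flaky() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("simulated timeout")
    return "ok"


result = retry_call(flaky, attempts=3, base_delay=0.01)
```

Errors outside `retry_on` propagate immediately, so only failures that are plausibly transient cost extra time.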
- OS: Windows (tested), Linux, or macOS
- Python: 3.10
- GPU: NVIDIA GPU with CUDA for optimal performance (CPU fallback supported)
- Ollama: Installed and running with the Mistral model
- Dependencies: Listed in requirements.txt
git clone https://github.com/abhinai2244/LLM-Powered-Intelligent-Query-Retrieval-System.git
cd LLM-Powered-Intelligent-Query-Retrieval-System
python -m venv venv
# Windows
.\venv\Scripts\activate
# Linux/macOS
# source venv/bin/activate
pip install -r requirements.txt
Requirements include:
- fastapi
- uvicorn
- llama-index
- llama-index-embeddings-huggingface
- llama-index-llms-ollama
- llama-index-vector-stores-faiss
- requests
- sentence-transformers
- faiss-gpu (or faiss-cpu if no GPU)
- tenacity
- fastapi-concurrency
- Download and install Ollama: ollama.com
- Start the Ollama server:
  ollama serve
- Pull the Mistral model:
  ollama pull mistral
- Verify Ollama is running:
  curl http://localhost:11434
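The same reachability check can be scripted, for example before starting the API server. This is a minimal sketch using only the standard library; `ollama_is_up` is a hypothetical helper, and it only confirms that something answers HTTP 200 at the given address, not that the Mistral model has been pulled.

```python
import urllib.error
import urllib.request


def ollama_is_up(base_url: str = "http://localhost:11434", timeout: float = 3.0) -> bool:
    """Return True if an HTTP server answers with status 200 at base_url."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or a non-2xx response.
        return False
```

Calling `ollama_is_up()` with no arguments checks the default address used in the curl command above.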
- Check GPU availability:
  nvidia-smi
- Ensure faiss-gpu is installed for GPU acceleration. Use faiss-cpu if no GPU is available:
  pip install faiss-cpu
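The choice between the two FAISS packages can be automated roughly as follows. The sketch assumes that a successful `nvidia-smi` run is a good enough signal that a usable NVIDIA GPU is present; `pick_faiss_package` is a hypothetical helper, not part of the project.

```python
import shutil
import subprocess


def pick_faiss_package() -> str:
    """Suggest 'faiss-gpu' if nvidia-smi runs successfully, else 'faiss-cpu'."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return "faiss-cpu"  # tool not on PATH: assume no NVIDIA driver
    try:
        ok = subprocess.run([exe], capture_output=True, timeout=10).returncode == 0
    except (OSError, subprocess.TimeoutExpired):
        ok = False
    return "faiss-gpu" if ok else "faiss-cpu"


choice = pick_faiss_package()
print(f"pip install {choice}")
```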
cd <your-repo>
uvicorn new:app --host 127.0.0.1 --port 8000 --workers 1
The server will be available at http://127.0.0.1:8000.
Use curl, Postman, or a script to send a POST request to /api/v1/hackrx/run. Example:
curl -X POST http://127.0.0.1:8000/api/v1/hackrx/run -H "Content-Type: application/json" -d '{
"documents": "https://hackrx.blob.core.windows.net/assets/indian_constitution.pdf?sv=2023-01-03&st=2025-07-28T06%3A42%3A00Z&se=2026-11-29T06%3A42%3A00Z&sr=b&sp=r&sig=5Gs%2FOXqP3zY00lgciu4BZjDV5QjTDIx7fgnfdz6Pu24%3D",
"questions": [
"What is the official name of India according to Article 1 of the Constitution?",
"What is abolished by Article 17 of the Constitution?"
]
}'
Example response:
{
"answers": [
"The official name of India, according to Article 1 of the Constitution, is 'India, that is Bharat'.",
"Article 17 of the Constitution abolishes 'untouchability' and forbids its practice in any form."
]
}
The API has been tested with:
- Indian Constitution: Legal questions (e.g., rights, articles).
- Arogya Sanjeevani Policy: Insurance claims (e.g., root canal, IVF coverage).
- Super Splendor Manual: Technical questions (e.g., spark plug gap).
- Family Medicare Policy: Health insurance queries.
- Principia Newton: Physics questions (e.g., laws of motion).
- UNI Group Health Insurance: Coverage details.
Ensure the PDF URL matches the question context to avoid irrelevant answers.
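For scripted use, the curl request shown earlier translates to a few lines of Python. This is a minimal standard-library sketch: the endpoint and JSON field names come from the example, while `build_request`, `ask`, and the generous default timeout are assumptions.

```python
import json
import urllib.request

API_URL = "http://127.0.0.1:8000/api/v1/hackrx/run"


def build_request(
    document_url: str, questions: list[str], api_url: str = API_URL
) -> urllib.request.Request:
    """Build the POST request with the JSON body used in the curl example."""
    body = json.dumps({"documents": document_url, "questions": questions})
    return urllib.request.Request(
        api_url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(document_url: str, questions: list[str], timeout: float = 1200.0) -> list[str]:
    """Send the request and return the 'answers' list from the response."""
    req = build_request(document_url, questions)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)["answers"]
```

With the server running, `ask(pdf_url, ["What is abolished by Article 17 of the Constitution?"])` should return one answer string per question, in the order the questions were sent.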
- Increase timeout in new.py:
  Settings.llm = Ollama(model="mistral", request_timeout=1200.0)
- Restart Ollama:
  # Windows
  taskkill /IM ollama.exe /F
  ollama serve
- Check GPU usage:
  nvidia-smi
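- Use a smaller model:
  ollama run mistral:7b
  Note that, as far as the Ollama library lists it, mistral:7b is the same 7B model as the default mistral tag, so this command alone is unlikely to reduce memory use. A genuinely smaller model means choosing a different one from the Ollama library and updating the model name in new.py.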
If pulling Mistral fails (no such host):
nslookup dd20bb891979d25aebc8bec07b2b3bbc.r2.cloudflarestorage.com
netsh interface ip set dns name="Wi-Fi" source=static address=8.8.8.8
ipconfig /flushdns
ollama pull mistral
The server limits concurrency to 1 request. Ask clients to avoid parallel requests, or increase max_concurrent in fastapi-concurrency.
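The effect of a one-request limit can be seen in a small simulation. The sketch below uses `asyncio.Semaphore` as a stand-in, since the exact fastapi-concurrency API is not shown here; `handle`, `main`, and the `max_concurrent` parameter are illustrative names only.

```python
import asyncio


async def handle(request_id: int, slots: asyncio.Semaphore, log: list[str]) -> None:
    """Simulate one request; only the holder of a slot runs the slow section."""
    async with slots:
        log.append(f"start {request_id}")
        await asyncio.sleep(0.01)  # stand-in for index build + LLM call
        log.append(f"end {request_id}")


async def main(max_concurrent: int = 1) -> list[str]:
    """Fire three requests at once and record the order in which they run."""
    slots = asyncio.Semaphore(max_concurrent)
    log: list[str] = []
    await asyncio.gather(*(handle(i, slots, log) for i in range(3)))
    return log


log = asyncio.run(main())
```

With `max_concurrent=1` every "start" entry is immediately followed by its own "end", meaning requests are served strictly one after another; a higher value lets the slow sections overlap, at the cost of more load on Ollama.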
If port 11434 (Ollama) or 8000 (FastAPI) is in use:
netstat -aon | findstr :11434
taskkill /PID <PID> /F
Ensure compatible versions:
pip install langchain==0.2.16 langchain-community==0.2.16 pydantic==2.8.2
<your-repo>/
├── new.py # FastAPI application
├── requirements.txt # Dependencies
├── docs/ # Directory for downloaded PDFs
├── index_<hash>/ # Cached FAISS indices
└── README.md # This file
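One way a directory name like index_<hash>/ could be derived is by hashing the document URL, so that a repeated request for the same PDF finds its cached index. What new.py actually hashes is not stated here, so treat the sketch below as an assumption; `index_dir_for` and the example URL are hypothetical.

```python
import hashlib
from pathlib import Path


def index_dir_for(document_url: str, root: Path = Path(".")) -> Path:
    """Map a document URL to a stable cache directory named index_<hash>."""
    digest = hashlib.sha256(document_url.encode("utf-8")).hexdigest()[:16]
    return root / f"index_{digest}"


path = index_dir_for("https://example.com/policy.pdf")
needs_build = not path.exists()  # build and persist the index only on a cache miss
```

If the URL carries a signed query string, as in the request example, hashing the full URL produces a new directory whenever the signature changes; hashing only the part before the `?` would keep the cache stable across renewed links.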
- Fork the repository.
- Create a feature branch (git checkout -b feature/your-feature).
- Commit changes (git commit -m "Add your feature").
- Push to the branch (git push origin feature/your-feature).
- Open a pull request.
MIT License. See LICENSE for details.
For issues or questions, open a GitHub issue or contact <your-email>.