- Python 3.10+
- Node.js 18+ and npm
- Git
- A valid OpenAI API key (if using GPT models)
Backend:

```bash
cd backend
pip install -r requirements.txt
python -m spacy download en_core_web_sm
```

Frontend:

```bash
cd frontend
npm install
```

Terminal 1:

```bash
cd backend
python -m uvicorn app.main:app --reload --host 127.0.0.1 --port 8000
```

You should see:

```
Uvicorn running on http://127.0.0.1:8000
```
Terminal 2:

```bash
cd frontend
npm run dev
```

You should see:

```
VITE v5.x.x ready in 1234 ms
Local: http://localhost:5173/
```
- Frontend: Open your browser to http://localhost:5173
- API Docs: Visit http://localhost:8000/docs
- Go to the upload section
- Drag & drop PDF, TXT, or MD files (or click to select)
- Click "Upload & Index Documents"
- You should see:
- ✅ Progress bar animation
- ✅ Success message
- ✅ File validation (size, type)
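The validation step above can be sketched roughly as follows. This is an illustrative snippet, not the app's actual code: the size cap, function name, and allowed extensions are assumptions.

```python
from pathlib import Path

# Illustrative limits -- the app's real values may differ
ALLOWED_EXTENSIONS = {".pdf", ".txt", ".md"}
MAX_SIZE_BYTES = 10 * 1024 * 1024  # assume a 10 MB cap

def validate_upload(filename: str, size_bytes: int) -> tuple[bool, str]:
    """Return (ok, reason) for a candidate upload."""
    ext = Path(filename).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        return False, f"unsupported file type: {ext or '(none)'}"
    if size_bytes > MAX_SIZE_BYTES:
        return False, "file too large"
    return True, "ok"
```

For example, `validate_upload("notes.md", 2048)` passes, while an `.exe` file or an oversized PDF is rejected with a reason string.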
- After uploading documents, go to the Query section
- Type your question
- Click "Ask"
- You should see:
- ✅ Answer generated from your documents
- ✅ Knowledge graph visualization
- ✅ Entity extraction and relationships
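Under the hood, the extracted entities and relationships form a graph. A minimal sketch of how (subject, relation, object) triples become an adjacency map — the triples and function name here are illustrative, not the actual output of `graph_builder.py`:

```python
from collections import defaultdict

def build_graph(triples):
    """Build an adjacency map from (subject, relation, object) triples."""
    graph = defaultdict(list)
    for subj, rel, obj in triples:
        graph[subj].append((rel, obj))
    return dict(graph)

# Toy triples of the kind an entity-extraction pass might emit
triples = [
    ("Marie Curie", "won", "Nobel Prize"),
    ("Marie Curie", "worked_in", "Paris"),
]
graph = build_graph(triples)
# graph maps each entity to its outgoing (relation, target) edges
```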
Problem: "uvicorn: The term 'uvicorn' is not recognized"

```bash
# Use Python module syntax instead of the bare command:
python -m uvicorn app.main:app --reload --host 127.0.0.1 --port 8000
```

Problem: "No module named 'spacy'" or "FAISS not found"

```bash
cd backend
pip install -r requirements.txt --force-reinstall
python -m spacy download en_core_web_sm
```

Problem: "Connection refused" - backend not running
- Make sure the backend server is running in its terminal
- Check that port 8000 is not already in use:

```bash
# Windows
netstat -ano | findstr :8000
# macOS/Linux
lsof -i :8000
```
Problem: "Cannot find module" or npm errors
```bash
cd frontend
rm -r node_modules package-lock.json
npm install
npm run dev
```

Problem: "Cannot reach http://127.0.0.1:8000"
- Backend server must be running first
- Check CORS settings are enabled (they are by default)
Create a .env file in the backend/ directory:
```env
OPENAI_API_KEY=your-key-here
EMBEDDING_MODEL=all-MiniLM-L6-v2
LLM_MODEL=gpt-4o-mini
CHUNK_SIZE=300
```

To run with Docker instead:

```bash
docker-compose up
```

This starts both services:
- Frontend: http://localhost:3000
- Backend: http://localhost:8000
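The `CHUNK_SIZE` value (300 above) controls how documents are split before indexing. A minimal word-based sketch of what such a setting does — the real `preprocessing.py` may chunk by tokens or add overlap:

```python
def chunk_text(text: str, chunk_size: int = 300) -> list[str]:
    """Split text into chunks of at most chunk_size words."""
    words = text.split()
    return [
        " ".join(words[i : i + chunk_size])
        for i in range(0, len(words), chunk_size)
    ]

chunks = chunk_text("word " * 700, chunk_size=300)
# Three chunks of 300, 300, and 100 words
```

Smaller chunks give more precise retrieval hits; larger ones preserve more context per hit.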
```
Dataforge/
├── backend/
│   ├── app/
│   │   ├── main.py               # FastAPI app & endpoints
│   │   ├── models/schemas.py     # Data models
│   │   └── modules/              # Core RAG pipeline
│   │       ├── preprocessing.py
│   │       ├── retrieval.py
│   │       ├── entity_extraction.py
│   │       ├── graph_builder.py
│   │       └── answer_generator.py
│   └── requirements.txt
│
├── frontend/
│   ├── src/
│   │   ├── components/
│   │   │   ├── DocumentUpload.jsx       # Upload interface
│   │   │   ├── QueryForm.jsx            # Query interface
│   │   │   ├── GraphVisualization.jsx   # Graph display
│   │   │   └── ResultsPanel.jsx         # Results display
│   │   ├── services/api.js       # API calls
│   │   └── store/appStore.js     # State management
│   └── package.json
│
└── docker-compose.yml
```
```bash
cd backend
pytest -v

# Run a specific test
pytest tests/test_preprocessing.py::TestChunking::test_chunk_text_basic -v
```

Linting:

```bash
# Backend
cd backend
black . --check
ruff check .

# Frontend
cd frontend
npm run lint
```

| Command | Purpose |
|---|---|
| `python -m uvicorn app.main:app --reload` | Start backend dev server |
| `npm run dev` | Start frontend dev server |
| `pytest -v` | Run backend tests |
| `npm run lint` | Lint frontend code |
| `docker-compose up` | Run with Docker |
| `docker-compose down` | Stop Docker services |
- First upload/query may take longer (models initializing)
- Subsequent uploads are faster (models cached)
- Typical query response: 3-10 seconds
- Graph visualization works best with <500 entities
- Check logs: Look at terminal output for error messages
- API Docs: Visit http://localhost:8000/docs for interactive API testing
- Browser Console: Press F12 → Console tab for frontend errors
- Review: Check `backend/app/main.py` for available endpoints
Happy exploring! 🚀