A comprehensive AI-powered student analytics platform that leverages Retrieval-Augmented Generation (RAG) to provide intelligent insights into student performance, engagement, and feedback analysis.
- Intelligent Chat Interface: AI-powered conversational interface for querying student data
- Automated Data Processing: Upload and process student survey data and demographics
- Multi-Modal Analysis: Support for various analysis types, including:
  - Individual student analysis
  - Weekly performance reports
  - Demographic segmentation analysis
  - Course aspect analysis
- Real-time WebSocket Communication: Live updates for file processing and analysis completion
- Vector Database Integration: ChromaDB for efficient similarity search and retrieval
- Individual Analysis: Detailed insights for specific students
- Weekly Reports: Temporal analysis of student performance trends
- Segmentation Analysis: Demographic-based student grouping and analysis
- Aspect Analysis: Course-specific feedback and rating analysis
- Multi-format Support: CSV, XLS, XLSX file uploads
- Automated Data Mapping: Intelligent header mapping using LLM
- PostgreSQL Database: Robust data storage with async operations
- Background Processing: Celery-based asynchronous task processing for file processing, data analysis, and embedding generation
- Task Queue Management: Redis as message broker for reliable task distribution and result storage
- Framework: FastAPI with async/await support
- Database: PostgreSQL with SQLAlchemy ORM
- Vector Store: ChromaDB for embeddings and similarity search
- LLM Integration: Multi-provider support (Google Gemini, Groq, OpenAI)
- Background Tasks: Celery with Redis for asynchronous file processing, data analysis, and embedding generation
- Message Broker: Redis for task queue management and caching
- WebSocket: Real-time communication
- Framework: React 19 with TypeScript
- Styling: TailwindCSS
- State Management: Zustand
- Charts: Chart.js and Recharts
- Build Tool: Vite
- UI Components: Custom components with Lucide React icons
The RAG Student Sense application uses Celery with Redis as the message broker to handle computationally intensive and time-consuming tasks asynchronously. This architecture ensures that the main FastAPI application remains responsive while processing large datasets and generating embeddings.
- File Upload Processing (`/api/upload/both-files`, `/api/upload/file`):
  - Parse and validate uploaded CSV/Excel files
  - Transform data according to the intelligent header mapping
  - Store processed data in the PostgreSQL database
  - Generate embeddings for survey comments and student data
  - Add embeddings to the ChromaDB vector store
- Data Analysis Tasks:
  - Generate comprehensive student analysis reports
  - Process demographic segmentation analysis
  - Create weekly performance reports
  - Calculate NPS trends and risk assessments
- Embedding Generation:
  - Convert text data (comments, feedback) into vector embeddings
  - Update ChromaDB collections with new embeddings
  - Maintain vector store consistency
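The parse-and-map step of the upload pipeline can be sketched with the standard library. `parse_and_map` and the hard-coded `HEADER_MAP` below are illustrative stand-ins; in the real pipeline the mapping is produced by the LLM-based header-mapping step.

```python
import csv
import io

# Illustrative header mapping; the real pipeline derives this with an LLM.
HEADER_MAP = {"Student ID": "student_id", "Week": "week", "Comment": "comment"}

def parse_and_map(raw_csv: str) -> list:
    """Parse an uploaded CSV and rename columns to canonical names."""
    reader = csv.DictReader(io.StringIO(raw_csv))
    return [{HEADER_MAP.get(k, k): v for k, v in row.items()} for row in reader]

sample = "Student ID,Week,Comment\n12345,3,Great pace\n"
print(parse_and_map(sample))
# [{'student_id': '12345', 'week': '3', 'comment': 'Great pace'}]
```

The mapped rows can then be bulk-inserted into PostgreSQL and queued for embedding generation.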
Redis serves multiple purposes in the application:
- Message Broker: Queues tasks for Celery workers
- Result Backend: Stores task results and status
- Caching: Improves performance for frequently accessed data
- Session Storage: Maintains task progress and status information
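A minimal Celery configuration matching this setup is sketched below; the app name is illustrative, and the actual configuration lives in `celery_worker.py`.

```python
from celery import Celery

# The REDIS_URL from .env serves as both the message broker and the
# result backend, so one Redis instance covers queuing and task status.
celery_app = Celery(
    "rag_student_sense",                # illustrative app name
    broker="redis://localhost:6379/0",  # task queue
    backend="redis://localhost:6379/0", # task results and status
)
```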
Celery tasks integrate with WebSocket connections to provide real-time progress updates:
- File processing progress notifications
- Analysis completion alerts
- Error handling and status updates
- Background task monitoring
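One common pattern for wiring workers to WebSockets (a sketch, not the app's actual code) is for the task to serialize progress updates as JSON and publish them on a Redis channel that the WebSocket endpoint relays to the browser; the message fields and channel name below are assumptions.

```python
import json

def progress_message(task_id: str, stage: str, percent: int) -> str:
    """Build a progress update for the WebSocket layer to relay.

    In the worker this string would be published on a Redis channel,
    e.g. redis_client.publish("task_progress", msg); the channel name
    is illustrative.
    """
    return json.dumps({"task_id": task_id, "stage": stage, "percent": percent})

print(progress_message("abc123", "embedding_generation", 40))
```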
- Python: 3.8 or higher
- Node.js: 16 or higher
- PostgreSQL: 12 or higher
- Redis: 6 or higher (required for Celery task queue and caching)
- API Keys: Google AI, Groq, and/or OpenAI API keys
```bash
git clone <repository-url>
cd RAG-Student-sense
```

Set up the backend:

```bash
cd backend
python -m venv .venv

# On Windows
source .venv/Scripts/activate
# On macOS/Linux
source .venv/bin/activate

pip install -r requirements.txt
```

- Copy the example environment file:

```bash
cp .env.example .env
```

- Configure your `.env` file with the following variables:
```env
# Database Configuration
DATABASE_URL=postgresql://username:password@localhost:5432/database_name
NEON_DATABASE_URL=postgresql://username:password@localhost:5432/database_name

# Google API Configuration
GOOGLE_API_KEY=your_google_api_key_here

# ChromaDB Configuration
CHROMA_CLOUD_HOST=https://api.chromadb.com
CHROMA_TENANT=your_chroma_tenant_id
CHROMA_DATABASE=your_database_name
CHROMADB_API_KEY=your_chromadb_api_key

# RAG Collections
RAG_COLLECTION_HISTORICAL=historical_knowledge
RAG_COLLECTION_INTERVENTIONS=interventions_knowledge
RAG_COLLECTION_COMMENTS=weekly_comments_vectorized

# LLM API Keys
GROQ_API_KEY=your_groq_api_key_here
NOMIC_API_KEY=your_nomic_api_key_here

# Redis Configuration (for Celery)
REDIS_URL=redis://localhost:6379/0
```

Set up the database:

```bash
# Navigate to backend directory
cd backend
source .venv/Scripts/activate  # Windows
# source .venv/bin/activate    # macOS/Linux

# Initialize the database
python init_db.py

# Load sample data (optional)
python tests/load_sample_data.py
```

Set up the frontend:

```bash
cd frontend
npm install
```

Install and start the Redis server:
On Windows:
- Download Redis from the official website or use WSL
- Start the Redis server

On macOS:

```bash
brew install redis
brew services start redis
```

On Linux:

```bash
sudo apt-get install redis-server
sudo systemctl start redis-server
```

Alternatively, start Redis manually:

```bash
redis-server
```

Start the Celery worker:

```bash
cd backend
source .venv/Scripts/activate  # Windows
# source .venv/bin/activate    # macOS/Linux
celery -A celery_worker worker --loglevel=info
```

Start the backend server:

```bash
cd backend
source .venv/Scripts/activate  # Windows
# source .venv/bin/activate    # macOS/Linux
uvicorn main:app --reload --host 0.0.0.0 --port 8000
```

Start the frontend dev server:

```bash
cd frontend
npm run dev
```

- Frontend: http://localhost:3000
- Backend API: http://localhost:8000
- API Documentation: http://localhost:8000/docs
- Navigate to the Upload page
- Upload your student survey data (CSV/Excel format)
- Upload demographics data
- Wait for processing completion
- Go to the Chat page
- Ask questions about student data:
- "Show me individual analysis for student ID 12345"
- "Generate a weekly report for Course A"
- "Analyze demographic segments for working professionals"
- "What are the aspect scores for CSBT course?"
- View overall system status
- Monitor file processing progress
- Access quick analytics
- `POST /api/chat/query` - Process chat queries with auto-classification
- `GET /api/chat/stream` - Stream chat responses
- `WebSocket /api/chat/ws` - Real-time chat communication
- `POST /api/upload/both-files` - Upload survey and demographics files
- `POST /api/upload/file` - Upload a single file
- `GET /api/upload/status/{file_id}` - Check upload status
- `GET /api/data/students` - Retrieve student data
- `GET /api/data/surveys` - Retrieve survey data
- `GET /api/data/demographics` - Retrieve demographics data
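As an example, a chat query could be sent with the standard library as below; the payload shape is an assumption, and the actual request schema is defined by the backend's Pydantic models (see the `/docs` page).

```python
import json
from urllib import request

# Hypothetical request body; check /docs for the real schema.
payload = {"query": "Show me individual analysis for student ID 12345"}
req = request.Request(
    "http://localhost:8000/api/chat/query",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# With the backend running:
# with request.urlopen(req) as resp:
#     print(json.loads(resp.read()))
```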
Backend tests:

```bash
cd backend
source .venv/Scripts/activate
python -m pytest tests/
```

Frontend tests:

```bash
cd frontend
npm test
```

Project structure:

```
RAG-Student-sense/
├── backend/
│   ├── routers/              # FastAPI route handlers
│   ├── celery_tasks/         # Background task definitions
│   ├── tests/                # Backend tests and utilities
│   ├── uploads/              # File upload directory
│   ├── static/               # Static files
│   ├── main.py               # FastAPI application entry point
│   ├── database.py           # Database models and configuration
│   ├── llm_integration.py    # LLM and RAG implementation
│   ├── vector_store.py       # ChromaDB integration
│   ├── celery_worker.py      # Celery worker configuration
│   ├── init_db.py            # Database initialization
│   ├── run.py                # Application runner
│   └── requirements.txt      # Python dependencies
├── frontend/
│   ├── src/
│   │   ├── components/       # React components
│   │   ├── pages/            # Page components
│   │   ├── stores/           # Zustand state management
│   │   ├── assets/           # Static assets
│   │   ├── App.tsx           # Main React application
│   │   └── main.tsx          # Application entry point
│   ├── index.html            # HTML template
│   ├── package.json          # Node.js dependencies
│   └── vite.config.ts        # Vite configuration
├── sample_data_10/           # Sample data files
└── README.md                 # This file
```
- Environment Variables: Never commit `.env` files to version control
- API Keys: Store all API keys securely in environment variables
- Database: Use strong passwords and enable SSL connections
- CORS: Configure CORS settings appropriately for production
- Input Validation: All user inputs are validated using Pydantic models
- Backend Deployment:
  - Use a production ASGI server setup (e.g., Gunicorn with Uvicorn workers; FastAPI is an ASGI framework)
  - Configure environment variables
  - Set up the PostgreSQL database
  - Configure Redis for Celery
  - Set up a reverse proxy (Nginx)
- Frontend Deployment:
  - Build the production bundle: `npm run build`
  - Serve the static files with a web server
  - Configure API endpoints for production
Create Docker containers for both backend and frontend services with proper orchestration using Docker Compose.
1. Fork the repository
2. Create a feature branch: `git checkout -b feature-name`
3. Make your changes and add tests
4. Commit your changes: `git commit -am 'Add feature'`
5. Push to the branch: `git push origin feature-name`
6. Submit a pull request
[Add your license information here]
For support and questions:
- Create an issue in the GitHub repository
- Check the API documentation at the `/docs` endpoint
- Review the logs in the `backend/logs/` directory
- v1.0.0: Initial release with core RAG functionality
- v1.1.0: Added multi-provider LLM support
- v1.2.0: Enhanced analysis types and WebSocket integration
Built with ❤️ using FastAPI, React, and modern AI technologies