This guide shows the exact steps to run when setting up the Student Performance Predictor project for the first time.
pip install -r requirements.txt
What it does: Installs all required libraries:
- streamlit
- pandas
- numpy
- scikit-learn
- joblib
- plotly
- statsmodels
Expected output: "Successfully installed [packages]"
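If you want to confirm the install from Python rather than by reading pip's output, a minimal sketch like the one below can help. It only checks that each library can be found, not which version is installed. The list of import names is an assumption based on the packages above (scikit-learn is imported as sklearn), and missing_modules is a hypothetical helper, not part of the project.

```python
import importlib.util

# Import names for the packages listed above (assumed). scikit-learn is
# imported as "sklearn"; the others use the same name as their pip package.
REQUIRED_MODULES = [
    "streamlit", "pandas", "numpy", "sklearn",
    "joblib", "plotly", "statsmodels",
]

def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing:", ", ".join(missing) if missing else "none")
```

If anything is reported as missing, re-run pip install -r requirements.txt in the same environment you use to run the scripts.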
python train_advanced.py
What it does:
- ✅ Loads StudentPerformanceFactors.csv (6,607 records)
- ✅ Performs data preprocessing & feature engineering
- ✅ Creates 35 features (19 original + 16 engineered)
- ✅ Trains 3 models:
  - Linear Regression
  - Random Forest
  - Gradient Boosting
- ✅ Saves trained models to disk:
  - student_performance_model.pkl (main model)
  - all_models.pkl (backup models)
  - scaler.pkl (feature scaler)
  - feature_importance.json (feature analysis)
Expected output:
Loading data...
Data shape: (6607, 34)
Preprocessing and feature engineering...
Final shape with engineered features: (6607, 35)
Training Linear Regression model...
Model accuracy: 1.0000 (100%)
Saving models...
All models saved successfully!
Time: 2-5 minutes
python verify_system.py
What it does:
- ✅ Checks Python version (3.8+)
- ✅ Verifies all required packages installed
- ✅ Checks if trained models exist
- ✅ Tests model loading & prediction
- ✅ Validates data files
- ✅ Tests feature engineering
- ✅ Runs sample prediction
Expected output:
✓ Python version OK: 3.x.x
✓ All required packages installed
✓ Model files found
✓ Model loads successfully
✓ Feature scaler loads successfully
✓ Data file valid
✓ Prediction working
✓ All tests passed!
Important: Fix any ❌ marks before continuing!
Time: 1-2 minutes
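To give a feel for how a check script of this kind can be organized, here is a minimal sketch. It is not the code in verify_system.py; run_checks and python_version_ok are hypothetical names, and the ✓/❌ output format is only modeled on the expected output above.

```python
import sys

def python_version_ok(minimum=(3, 8)):
    """True when the running interpreter is at least `minimum`."""
    return sys.version_info[:2] >= minimum

def run_checks(checks):
    """Run (label, callable) pairs and print one ✓ or ❌ line per check.

    A check passes when its callable returns a truthy value without
    raising. Returns True only if every check passed.
    """
    all_ok = True
    for label, check in checks:
        try:
            ok = bool(check())
        except Exception:
            ok = False
        print(("✓ " if ok else "❌ ") + label)
        all_ok = all_ok and ok
    return all_ok

if __name__ == "__main__":
    run_checks([("Python version OK", python_version_ok)])
```

Treating an exception as a failed check keeps one broken step from hiding the results of the others.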
python test_app.py
What it does:
- ✅ Tests app functionality
- ✅ Validates prediction logic
- ✅ Checks data loading
- ✅ Tests UI components
Expected output:
Test 1: Data loading... PASSED
Test 2: Model prediction... PASSED
Test 3: Feature engineering... PASSED
Test 4: Confidence calculation... PASSED
All tests passed! ✓
Time: 2-3 minutes
python model_analysis.py
What it does:
- ✅ Analyzes model performance
- ✅ Generates model comparison
- ✅ Shows feature importance
- ✅ Calculates metrics
Output: Prints model statistics and analysis
Time: 1-2 minutes
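This guide does not say which metrics model_analysis.py reports. As an illustration only, the sketch below computes three metrics commonly used for regression models (MAE, RMSE, and R²) in plain Python. The function name regression_metrics is hypothetical, and it assumes the true scores are not all identical (otherwise R² is undefined).

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE and R^2 for two equal-length lists of numbers."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                    # unexplained variation
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)     # total variation
    r2 = 1 - ss_res / ss_tot
    return {"mae": mae, "rmse": rmse, "r2": r2}

if __name__ == "__main__":
    print(regression_metrics([60, 70, 80], [62, 69, 79]))
```

An R² of 1.0 means the predictions match the true values exactly on the data being scored.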
streamlit run app.py
Note: If streamlit run app.py doesn't work on your system, try:
python -m streamlit run app.py
Features:
- Prediction Dashboard
- Next Semester Score
- Model Details
streamlit run app_advanced.py
Note: If streamlit run app_advanced.py doesn't work on your system, try:
python -m streamlit run app_advanced.py
Features:
- Prediction Dashboard
- Feature Importance Analysis
- Prediction Confidence Visualization
- Student Analytics
- Model Performance Metrics
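A common reason the plain streamlit command fails is that the streamlit executable is not on your PATH, while the package is still importable by Python. The sketch below picks between the two launch forms shown above. streamlit_command is a hypothetical helper, and the which parameter exists only so the choice can be tested.

```python
import shutil
import sys

def streamlit_command(app_file, which=shutil.which):
    """Build the launch command for a Streamlit app.

    Uses the `streamlit` executable when it is on PATH, otherwise falls
    back to running the module with the current Python interpreter.
    """
    if which("streamlit"):
        return ["streamlit", "run", app_file]
    return [sys.executable, "-m", "streamlit", "run", app_file]

if __name__ == "__main__":
    print(" ".join(streamlit_command("app_advanced.py")))
```

The resulting list can be passed to subprocess.run if you want to launch the app from a script.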
START
↓
1. pip install -r requirements.txt (5 min)
↓
2. python train_advanced.py (5 min) ← MODEL TRAINING
↓
3. python verify_system.py (2 min) ← VERIFY SUCCESS
↓
4. [OPTIONAL] python test_app.py (3 min) ← UNIT TESTS
↓
5. [OPTIONAL] python model_analysis.py (2 min) ← ANALYSIS
↓
6. streamlit run app_advanced.py ← READY TO USE!
↓
END - Application running at http://localhost:8501
Total Time: 15-25 minutes (including optional steps)
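The app is served at http://localhost:8501, so the launch can fail if something else is already using that port. Here is a minimal sketch for checking this before you start the app. is_port_free and first_free_port are hypothetical helpers, and checking only 127.0.0.1 is a simplifying assumption. The result is a snapshot: another program could still take the port between the check and the launch.

```python
import socket

def is_port_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can bind to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True

def first_free_port(start=8501, attempts=20):
    """Return the first free port in [start, start + attempts), or None."""
    for port in range(start, start + attempts):
        if is_port_free(port):
            return port
    return None

if __name__ == "__main__":
    print("Suggested port:", first_free_port())
```

If the suggested port is not 8501, pass it to Streamlit with --server.port, for example streamlit run app_advanced.py --server.port 8502.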
If you just want to run it ASAP:
# Step 1: Install packages
pip install -r requirements.txt
# Step 2: Train model
python train_advanced.py
# Step 3: Run app
streamlit run app_advanced.py
# Alternative if streamlit run doesn't work:
python -m streamlit run app_advanced.py
Time: 10-15 minutes
| File | Purpose | Phase | Command |
|---|---|---|---|
| requirements.txt | Dependencies | Setup | pip install -r requirements.txt |
| train_advanced.py | Train ML model | Training | python train_advanced.py |
| verify_system.py | Check everything works | Verification | python verify_system.py |
| test_app.py | Unit tests | Testing | python test_app.py |
| model_analysis.py | Model statistics | Analysis | python model_analysis.py |
| app.py | Simple app (3 tabs) | Usage | streamlit run app.py or python -m streamlit run app.py |
| app_advanced.py | Advanced app (5 tabs) | Usage | streamlit run app_advanced.py or python -m streamlit run app_advanced.py |
After running train_advanced.py, these files are created automatically:
✓ student_performance_model.pkl (Main trained model)
✓ all_models.pkl (Backup models)
✓ scaler.pkl (Feature scaler)
✓ feature_importance.json (Feature analysis)
✓ model_results.json (Model metrics)
Important: Don't delete these files! The app needs them to make predictions.
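A quick way to confirm that training produced everything is to look for the five files by name. The sketch below does that; missing_artifacts is a hypothetical helper, and it assumes the files are written to the directory you ran train_advanced.py from.

```python
from pathlib import Path

# File names taken from the list above.
GENERATED_FILES = [
    "student_performance_model.pkl",
    "all_models.pkl",
    "scaler.pkl",
    "feature_importance.json",
    "model_results.json",
]

def missing_artifacts(directory=".", names=GENERATED_FILES):
    """Return the expected file names that are not present in `directory`."""
    base = Path(directory)
    return [name for name in names if not (base / name).is_file()]

if __name__ == "__main__":
    missing = missing_artifacts()
    print("Missing files:", ", ".join(missing) if missing else "none")
```

If any file is reported missing, re-run python train_advanced.py.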
Solution:
pip install -r requirements.txt
If training seems slow, this is normal! Training on 6,607 records takes 2-5 minutes. Be patient.
Solution: Check error messages and fix issues:
- Missing package? Run pip install -r requirements.txt
- Missing data? Check StudentPerformanceFactors.csv exists
- Missing models? Run python train_advanced.py
Solution: Make sure you ran Steps 1-3 first! If streamlit run doesn't work on your system, try:
python -m streamlit run app_advanced.py
Solution: Use a different port:
streamlit run app_advanced.py --server.port 8502
After first-time setup, back up these files (they take time to generate):
student_performance_model.pkl
all_models.pkl
scaler.pkl
If you delete these, you must re-run:
python train_advanced.py
To retrain with new data or an updated StudentPerformanceFactors.csv:
python train_advanced.py
This will:
- ✅ Load updated data
- ✅ Recreate all features
- ✅ Retrain all models
- ✅ Update model files
- ✅ Overwrite old models
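Because retraining overwrites the old models, it can be worth copying the current files aside first. Here is a minimal sketch; backup_models and the model_backup folder name are hypothetical, so adjust them to your own layout.

```python
import shutil
from pathlib import Path

# The three model files this guide recommends backing up.
MODEL_FILES = ["student_performance_model.pkl", "all_models.pkl", "scaler.pkl"]

def backup_models(src_dir=".", dest_dir="model_backup", names=MODEL_FILES):
    """Copy each existing model file into dest_dir; return the names copied."""
    src, dest = Path(src_dir), Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in names:
        if (src / name).is_file():
            shutil.copy2(src / name, dest / name)  # copy2 keeps timestamps
            copied.append(name)
    return copied

if __name__ == "__main__":
    print("Backed up:", backup_models())
```

Files that do not exist yet are skipped rather than treated as errors, so the helper is safe to run before the first training as well.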
Once setup is complete:
- Read Documentation:
  - README.md - How to use the app
  - TECHNICAL.md - How it works
- Explore the App:
  - Try different student profiles
  - Check feature importance
  - View prediction confidence
  - Analyze student patterns
- Customize (Optional):
  - Modify app.py or app_advanced.py
  - Add new features
  - Change model parameters
  - Retrain with different settings
- Deploy (Optional):
  - Run on Streamlit Cloud
  - Deploy to server
  - Share with others
Why this order:
- Install packages first (Phase 1)
  - Apps need libraries to run
  - Must be done before anything else
- Train model second (Phase 2)
  - Model training creates required files
  - Apps can't run without trained model
  - Takes longest time
- Verify it works (Phase 3)
  - Check everything is working
  - Catch errors early
  - Ensure no silent failures
- Run tests (Phase 4)
  - Validate functionality
  - Extra safety check
  - Optional but recommended
- Run app (Phase 6)
  - Everything is ready
  - Application runs smoothly
  - No errors or warnings
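The ordering above can be sketched as a small driver that runs each step in turn and stops at the first failure, so a later step never runs against a broken earlier one. run_in_order is a hypothetical helper, and python -m pip is used here as an equivalent of the pip command so that the same interpreter handles every step.

```python
import subprocess
import sys

# Required steps in order; the script names are the ones used in this guide.
STEPS = [
    [sys.executable, "-m", "pip", "install", "-r", "requirements.txt"],
    [sys.executable, "train_advanced.py"],
    [sys.executable, "verify_system.py"],
]

def run_in_order(steps):
    """Run each command in turn and stop at the first non-zero exit code.

    Returns the number of steps that completed successfully.
    """
    completed = 0
    for cmd in steps:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            print("Step failed:", " ".join(cmd))
            break
        completed += 1
    return completed

if __name__ == "__main__":
    if run_in_order(STEPS) == len(STEPS):
        print("Setup steps finished; the app can now be started.")
```

Launching the app is left out of the list on purpose, since streamlit run keeps running until you stop it.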
Use this checklist when setting up for the first time:
- Python 3.8+ installed
- pip install -r requirements.txt completed
- All packages installed successfully
- python train_advanced.py completed
- student_performance_model.pkl created
- all_models.pkl created
- scaler.pkl created
- No errors during training
- python verify_system.py shows all ✓
- No ❌ marks in output
- Sample prediction works
- python test_app.py shows PASSED
- All tests completed
- No errors
- StudentPerformanceFactors.csv exists
- All model files present
- Documentation files exist
- Ready to run: streamlit run app_advanced.py
Tips:
- Don't skip verification: Run verify_system.py; it catches problems early
- Be patient during training: 2-5 minutes is normal for 6,607 records
- Keep model files safe: Don't delete .pkl files; they take time to recreate
- Use advanced app: app_advanced.py has more features than app.py
- Check requirements.txt: Make sure you have all dependencies
- Read error messages: They usually tell you exactly what's wrong
- Restart if stuck: Close terminal and start fresh
- Keep backup: Save model files before making changes
You've completed setup successfully when:
✅ All packages installed (no errors)
✅ Model trained (2-5 minutes)
✅ verify_system.py shows all checks passing
✅ App runs at http://localhost:8501
✅ Can make predictions
✅ No errors or warnings
# Full Setup (everything in order)
pip install -r requirements.txt
python train_advanced.py
python verify_system.py
python test_app.py
streamlit run app_advanced.py
# Alternative if streamlit run doesn't work:
python -m streamlit run app_advanced.py
# Quick Setup (just essentials)
pip install -r requirements.txt
python train_advanced.py
streamlit run app_advanced.py
# Alternative if streamlit run doesn't work:
python -m streamlit run app_advanced.py
# Just Run App (if already trained)
streamlit run app_advanced.py
# Alternative if above doesn't work:
python -m streamlit run app_advanced.py
# Retrain Model
python train_advanced.py
# Verify Setup
python verify_system.py
# Run Tests
python test_app.py
# View Analysis
python model_analysis.py
🎉 Your first-time setup guide is ready! Follow this sequence and you'll have everything working smoothly.
Total time: 15-25 minutes from zero to fully functional application!