Skip to content

Latest commit

 

History

History
490 lines (364 loc) · 11 KB

File metadata and controls

490 lines (364 loc) · 11 KB

🚀 First Time Setup Guide - Complete Sequence

This guide shows the exact steps to run when setting up the Student Performance Predictor project for the first time.


📋 Complete Setup Sequence

Phase 1: Environment Setup (5-10 minutes)

Step 1.1: Install Python Packages

pip install -r requirements.txt

What it does: Installs all required libraries:

  • streamlit
  • pandas
  • numpy
  • scikit-learn
  • joblib
  • plotly
  • statsmodels

Expected output: "Successfully installed [packages]"


Phase 2: Model Training (5-10 minutes)

Step 2.1: Train the ML Model

python train_advanced.py

What it does:

  • ✅ Loads StudentPerformanceFactors.csv (6,607 records)
  • ✅ Performs data preprocessing & feature engineering
  • ✅ Creates 35 features (19 original + 16 engineered)
  • ✅ Trains 3 models:
    • Linear Regression
    • Random Forest
    • Gradient Boosting
  • ✅ Saves trained models to disk:
    • student_performance_model.pkl (main model)
    • all_models.pkl (backup models)
    • scaler.pkl (feature scaler)
    • feature_importance.json (feature analysis)

Expected output:

Loading data...
Data shape: (6607, 34)
Preprocessing and feature engineering...
Final shape with engineered features: (6607, 35)
Training Linear Regression model...
Model accuracy: 1.0000 (100%)
Saving models...
All models saved successfully!

Time: 2-5 minutes


Phase 3: System Verification (2-5 minutes)

Step 3.1: Verify Everything Works

python verify_system.py

What it does:

  • ✅ Checks Python version (3.8+)
  • ✅ Verifies all required packages installed
  • ✅ Checks if trained models exist
  • ✅ Tests model loading & prediction
  • ✅ Validates data files
  • ✅ Tests feature engineering
  • ✅ Runs sample prediction

Expected output:

✓ Python version OK: 3.x.x
✓ All required packages installed
✓ Model files found
✓ Model loads successfully
✓ Feature scaler loads successfully
✓ Data file valid
✓ Prediction working
✓ All tests passed!

Important: Fix any ❌ marks before continuing!

Time: 1-2 minutes


Phase 4: Testing (Optional - 3-5 minutes)

Step 4.1: Run Unit Tests (Optional)

python test_app.py

What it does:

  • ✅ Tests app functionality
  • ✅ Validates prediction logic
  • ✅ Checks data loading
  • ✅ Tests UI components

Expected output:

Test 1: Data loading... PASSED
Test 2: Model prediction... PASSED
Test 3: Feature engineering... PASSED
Test 4: Confidence calculation... PASSED
All tests passed! ✓

Time: 2-3 minutes


Phase 5: Analysis (Optional - 2-3 minutes)

Step 5.1: View Model Analysis (Optional)

python model_analysis.py

What it does:

  • ✅ Analyzes model performance
  • ✅ Generates model comparison
  • ✅ Shows feature importance
  • ✅ Calculates metrics

Output: Prints model statistics and analysis

Time: 1-2 minutes


Phase 6: Run the App (Ready to use!)

Step 6.1: Run Simple Version (Basic)

streamlit run app.py

Note: If streamlit run app.py doesn't work on your system, try:

python -m streamlit run app.py

Features:

  • Prediction Dashboard
  • Next Semester Score
  • Model Details

URL: http://localhost:8501


Step 6.2: Run Advanced Version (Full Features - RECOMMENDED)

streamlit run app_advanced.py

Note: If streamlit run app_advanced.py doesn't work on your system, try:

python -m streamlit run app_advanced.py

Features:

  • Prediction Dashboard
  • Feature Importance Analysis
  • Prediction Confidence Visualization
  • Student Analytics
  • Model Performance Metrics

URL: http://localhost:8501


📊 Complete Setup Timeline

START
  ↓
1. pip install -r requirements.txt (5 min)
  ↓
2. python train_advanced.py (5 min) ← MODEL TRAINING
  ↓
3. python verify_system.py (2 min) ← VERIFY SUCCESS
  ↓
4. [OPTIONAL] python test_app.py (3 min) ← UNIT TESTS
  ↓
5. [OPTIONAL] python model_analysis.py (2 min) ← ANALYSIS
  ↓
6. streamlit run app_advanced.py ← READY TO USE!
  ↓
END - Application running at http://localhost:8501

Total Time: 15-25 minutes (including optional steps)


🎯 Minimum Setup (Quick Start)

If you just want to run it ASAP:

# Step 1: Install packages
pip install -r requirements.txt

# Step 2: Train model
python train_advanced.py

# Step 3: Run app
streamlit run app_advanced.py
# Alternative if streamlit run doesn't work:
python -m streamlit run app_advanced.py

Time: 10-15 minutes


✅ What Each File Does

File Purpose Phase Command
requirements.txt Dependencies Setup pip install -r requirements.txt
train_advanced.py Train ML model Training python train_advanced.py
verify_system.py Check everything works Verification python verify_system.py
test_app.py Unit tests Testing python test_app.py
model_analysis.py Model statistics Analysis python model_analysis.py
app.py Simple app (3 tabs) Usage streamlit run app.py or python -m streamlit run app.py
app_advanced.py Advanced app (5 tabs) Usage streamlit run app_advanced.py or python -m streamlit run app_advanced.py

📁 Files Generated During Setup

After running train_advanced.py, these files are created automatically:

✓ student_performance_model.pkl      (Main trained model)
✓ all_models.pkl                     (Backup models)
✓ scaler.pkl                         (Feature scaler)
✓ feature_importance.json            (Feature analysis)
✓ model_results.json                 (Model metrics)

Important: Don't delete these files! The app needs them to make predictions.


🚨 Troubleshooting First Time Setup

Problem: "ModuleNotFoundError: No module named..."

Solution:

pip install -r requirements.txt

Problem: "train_advanced.py" takes too long

This is normal! Training on 6,607 records takes 2-5 minutes. Be patient.

Problem: "verify_system.py shows ❌ marks"

Solution: Check error messages and fix issues:

  • Missing package? Run pip install -r requirements.txt
  • Missing data? Check StudentPerformanceFactors.csv exists
  • Missing models? Run python train_advanced.py

Problem: "streamlit run app_advanced.py" shows error

Solution: Make sure you ran Steps 1-3 first! If streamlit run doesn't work on your system, try:

python -m streamlit run app_advanced.py

Problem: "Port 8501 already in use"

Solution: Use different port:

streamlit run app_advanced.py --server.port 8502

💾 Backup Important Files

After first-time setup, backup these files (they take time to generate):

student_performance_model.pkl
all_models.pkl
scaler.pkl

If you delete these, you must re-run:

python train_advanced.py

🔄 Retraining the Model

To retrain with new data or updated StudentPerformanceFactors.csv:

python train_advanced.py

This will:

  • ✅ Load updated data
  • ✅ Recreate all features
  • ✅ Retrain all models
  • ✅ Update model files
  • ✅ Overwrite old models

📖 After Setup - Next Steps

Once setup is complete:

  1. Read Documentation:

    • README.md - How to use the app
    • TECHNICAL.md - How it works
  2. Explore the App:

    • Try different student profiles
    • Check feature importance
    • View prediction confidence
    • Analyze student patterns
  3. Customize (Optional):

    • Modify app.py or app_advanced.py
    • Add new features
    • Change model parameters
    • Retrain with different settings
  4. Deploy (Optional):

    • Run on Streamlit Cloud
    • Deploy to server
    • Share with others

🎓 Understanding the Setup Process

Why This Sequence?

  1. Install packages first (Phase 1)

    • Apps need libraries to run
    • Must be done before anything else
  2. Train model second (Phase 2)

    • Model training creates required files
    • Apps can't run without trained model
    • Takes longest time
  3. Verify works (Phase 3)

    • Check everything is working
    • Catch errors early
    • Ensure no silent failures
  4. Run tests (Phase 4)

    • Validate functionality
    • Extra safety check
    • Optional but recommended
  5. Run app (Phase 6)

    • Everything is ready
    • Application runs smoothly
    • No errors or warnings

📋 First-Time Setup Checklist

Use this checklist when setting up for the first time:

Environment Setup

  • Python 3.8+ installed
  • pip install -r requirements.txt completed
  • All packages installed successfully

Model Training

  • python train_advanced.py completed
  • student_performance_model.pkl created
  • all_models.pkl created
  • scaler.pkl created
  • No errors during training

Verification

  • python verify_system.py shows all ✓
  • No ❌ marks in output
  • Sample prediction works

Testing (Optional)

  • python test_app.py shows PASSED
  • All tests completed
  • No errors

Ready to Use

  • StudentPerformanceFactors.csv exists
  • All model files present
  • Documentation files exist
  • Ready to run streamlit run app_advanced.py

💡 Pro Tips for First-Time Users

  1. Don't skip verification: Run verify_system.py - it catches problems early

  2. Be patient during training: 2-5 minutes is normal for 6,607 records

  3. Keep model files safe: Don't delete .pkl files; they take time to recreate

  4. Use advanced app: app_advanced.py has more features than app.py

  5. Check requirements.txt: Make sure you have all dependencies

  6. Read error messages: They usually tell you exactly what's wrong

  7. Restart if stuck: Close terminal and start fresh

  8. Keep backup: Save model files before making changes


🎯 Success Indicators

You've completed setup successfully when:

✅ All packages installed (no errors) ✅ Model trained (2-5 minutes) ✅ verify_system.py shows all checks passing ✅ App runs at http://localhost:8501 ✅ Can make predictions ✅ No errors or warnings


📞 Quick Reference Commands

# Full Setup (everything in order)
pip install -r requirements.txt
python train_advanced.py
python verify_system.py
python test_app.py
streamlit run app_advanced.py
# Alternative if streamlit run doesn't work:
python -m streamlit run app_advanced.py

# Quick Setup (just essentials)
pip install -r requirements.txt
python train_advanced.py
streamlit run app_advanced.py
# Alternative if streamlit run doesn't work:
python -m streamlit run app_advanced.py

# Just Run App (if already trained)
streamlit run app_advanced.py
# Alternative if above doesn't work:
python -m streamlit run app_advanced.py

# Retrain Model
python train_advanced.py

# Verify Setup
python verify_system.py

# Run Tests
python test_app.py

# View Analysis
python model_analysis.py

🎉 Your first-time setup guide is ready! Follow this sequence and you'll have everything working smoothly.

Total time: 15-25 minutes from zero to fully functional application!