Stroke Risk Predictor

Overview

Stroke Risk Predictor is a machine learning-based web application that assesses stroke risk from health and lifestyle factors. It runs patient data through a CatBoost model to produce a risk assessment, helping healthcare professionals identify stroke risk early enough for timely intervention.

Interface

[Screenshot: Web App Interface]

Features

  • Comprehensive health data analysis
  • Advanced feature engineering implementation
  • Multiple model evaluation framework
  • High-recall optimization
  • Flask-based web interface
  • Google Cloud Platform deployment
  • Automated testing suite
  • Containerized deployment
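
The README does not describe how the high-recall optimization is done. One common approach, shown here only as a minimal sketch, is to lower the classifier's decision threshold until recall on a validation set reaches a target. The function names, the toy labels and scores, and the target values are illustrative assumptions and are not taken from this repository.

```python
from typing import Sequence


def recall_at(y_true: Sequence[int], scores: Sequence[float], threshold: float) -> float:
    """Recall when every score >= threshold is predicted positive (labels are 0/1)."""
    positives = sum(1 for y in y_true if y == 1)
    if positives == 0:
        raise ValueError("recall is undefined without positive examples")
    hits = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    return hits / positives


def threshold_for_recall(
    y_true: Sequence[int], scores: Sequence[float], target_recall: float = 0.9
) -> float:
    """Return the highest threshold whose recall is at least target_recall."""
    if not 0.0 < target_recall <= 1.0:
        raise ValueError("target_recall must be in (0, 1]")
    # Walk candidate thresholds from strictest to loosest; recall only grows.
    for candidate in sorted(set(scores), reverse=True):
        if recall_at(y_true, scores, candidate) >= target_recall:
            return candidate
    # Not reached for valid input: the lowest score always yields recall 1.0.
    raise RuntimeError("no threshold satisfies the target recall")


# Toy validation data (hypothetical, for illustration only).
toy_labels = [0, 0, 1, 0, 1, 1]
toy_scores = [0.05, 0.20, 0.30, 0.40, 0.60, 0.90]
```

With this toy data, `threshold_for_recall(toy_labels, toy_scores, 1.0)` returns `0.3`, the highest cut-off that still catches every positive case. Lowering the threshold raises recall at the cost of more false positives, so the target would normally be agreed with the clinicians using the tool.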

Model Details

Models implemented and evaluated:

  1. Logistic Regression
  2. XGBoost
  3. CatBoost (selected as final model)
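
The README does not state which metric drove the final choice. Assuming recall was the primary criterion (the feature list mentions high-recall optimization), a comparison of candidate models on a shared validation set could be sketched as below. The model names `model_a` and `model_b` and their predictions are hypothetical placeholders and are not results from this project.

```python
from typing import Dict, List, Sequence, Tuple


def recall_precision(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (recall, precision) for binary 0/1 labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return recall, precision


def rank_by_recall(
    y_true: Sequence[int], predictions: Dict[str, Sequence[int]]
) -> List[Tuple[str, float, float]]:
    """Sort models by recall, breaking ties by precision (both descending)."""
    rows = [(name, *recall_precision(y_true, pred)) for name, pred in predictions.items()]
    return sorted(rows, key=lambda row: (row[1], row[2]), reverse=True)


# Hypothetical validation labels and predictions, for illustration only.
val_labels = [1, 1, 1, 0, 0, 0, 0, 0]
val_predictions = {
    "model_a": [1, 0, 0, 0, 0, 0, 0, 0],  # cautious: misses two positives
    "model_b": [1, 1, 1, 1, 1, 0, 0, 0],  # catches all positives, two false alarms
}
ranking = rank_by_recall(val_labels, val_predictions)
```

Here `ranking[0]` is `("model_b", 1.0, 0.6)`: the model that finds every positive case ranks first even though its precision is lower, which is the trade-off a screening tool usually accepts.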

Prerequisites

  • Python 3.9+ (check .python-version file for the current required version)
  • Docker
  • Google Cloud SDK
  • Flask
  • scikit-learn
  • CatBoost

Installation

Using uv (Recommended)

  1. Install uv:

    # On Unix/macOS
    curl -LsSf https://astral.sh/uv/install.sh | sh
    
    # On Windows (PowerShell)
    irm https://astral.sh/uv/install.ps1 | iex
  2. Clone the Repository:

    git clone https://github.com/vytautas-bunevicius/stroke-risk-predictor.git
    cd stroke-risk-predictor
  3. Install Dependencies and Set Up Virtual Environment:

    uv sync

    This command creates a virtual environment and installs all dependencies from pyproject.toml.

  4. Activate the Virtual Environment:

    source .venv/bin/activate  # On Unix/macOS
    # or
    .venv\Scripts\activate     # On Windows

Using pip (Alternative)

  1. Clone the Repository:

    git clone https://github.com/vytautas-bunevicius/stroke-risk-predictor.git
    cd stroke-risk-predictor
  2. Create and Activate a Virtual Environment:

    python -m venv venv
    source venv/bin/activate  # On Unix/macOS
    # or
    venv\Scripts\activate     # On Windows
  3. Install Dependencies:

    pip install -e .

Development Setup

Environment Configuration

  1. Create .env file in project root:

    FLASK_ENV=development
    MODEL_PATH=models/catboost_model.pkl
    PORT=5000
  2. Configure Google Cloud services:

    • App Engine
    • Cloud Storage (for model storage)
    • Secret Manager
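
The three variables from the .env file above could be read and validated at startup roughly as follows. This is a sketch under stated assumptions: the `Settings` class, the `load_settings` function, and the fallback defaults are hypothetical, and the README does not say how the .env file is loaded into the process environment (a tool such as python-dotenv, or exporting the variables in the shell, would be needed).

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the values defined in .env."""
    flask_env: str
    model_path: str
    port: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read FLASK_ENV, MODEL_PATH and PORT, failing early on a bad port."""
    port_raw = env.get("PORT", "5000")
    try:
        port = int(port_raw)
    except ValueError:
        raise ValueError(f"PORT must be an integer, got {port_raw!r}") from None
    if not 0 < port < 65536:
        raise ValueError(f"PORT out of range: {port}")
    return Settings(
        # Defaults below are assumptions, not values confirmed by the project.
        flask_env=env.get("FLASK_ENV", "production"),
        model_path=env.get("MODEL_PATH", "models/catboost_model.pkl"),
        port=port,
    )
```

Passing the environment as a mapping keeps the function easy to test: `load_settings({"PORT": "8080"})` works without touching the real process environment.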

Local Execution

Run the application locally:

python src/stroke_risk_predictor/app.py

Visit http://localhost:5000 in your browser.
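
To confirm from a script that the local server is answering, a small standard-library check could look like this. The `is_up` helper is hypothetical and only tests that the URL responds without a server error. It assumes nothing about the application's routes beyond the root URL given above.

```python
import urllib.error
import urllib.request


def is_up(url: str = "http://localhost:5000", timeout: float = 3.0) -> bool:
    """Return True if the URL answers with a non-5xx HTTP status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status < 500
    except urllib.error.HTTPError as exc:
        # The server replied, but with an error status (4xx or 5xx).
        return exc.code < 500
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return False
```

Calling `is_up()` after starting the app returns `True` once the server accepts requests, which can be handy in a container health check or a CI smoke test.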

Deployment

The application is deployed on Google Cloud Platform App Engine:

  1. Configure deployment:

    gcloud config set project your-project-id
  2. Deploy:

    gcloud app deploy

Testing

Run the test suite:

python -m pytest tests/

License

This project is released under the Unlicense. This means you can copy, modify, publish, use, compile, sell, or distribute this software, either in source code form or as a compiled binary, for any purpose, commercial or non-commercial, and by any means.

See the UNLICENSE file for more details.
