Classification of fault in an induction motor using vibrations

ayushraj09/vibration-analysis

🔧 Hierarchical Motor Fault Detection System


A production-ready web application implementing a two-stage hierarchical machine learning system for automated motor fault detection and classification from vibration signals. On the MAFAULDA dataset it reports 100% accuracy for binary (normal vs fault) classification and 99.48% accuracy for multi-class fault identification.



🎯 Overview

This project implements an intelligent predictive maintenance system for rotating machinery, specifically designed for industrial motor fault diagnosis. By leveraging wavelet-based feature extraction and hierarchical classification, the system provides:

  • Real-time fault detection - Binary classification (Normal vs Fault)
  • Precise fault diagnosis - Multi-class identification of 9 specific fault types
  • Explainable AI - SHAP analysis for feature importance
  • Production-ready deployment - Web-based Streamlit interface

🔍 Detected Fault Types

| Fault Category | Type | Description |
|---|---|---|
| Rotor Faults | Imbalance | Unbalanced mass distribution |
| Rotor Faults | Horizontal Misalignment | Shaft misalignment in horizontal plane |
| Rotor Faults | Vertical Misalignment | Shaft misalignment in vertical plane |
| Bearing Faults (Overhang) | Ball Fault | Rolling element defect |
| Bearing Faults (Overhang) | Cage Fault | Bearing cage damage |
| Bearing Faults (Overhang) | Outer Race Fault | Outer raceway defect |
| Bearing Faults (Underhang) | Ball Fault | Rolling element defect |
| Bearing Faults (Underhang) | Cage Fault | Bearing cage damage |
| Bearing Faults (Underhang) | Outer Race Fault | Outer raceway defect |

✨ Key Features

🚀 State-of-the-Art Performance

  • 100% accuracy in fault vs normal detection
  • 99.48% accuracy in specific fault classification
  • Zero false alarms in binary classification

🧠 Intelligent Feature Engineering

  • Biorthogonal 3.1 wavelet decomposition (level 4)
  • 273 statistical features from time-frequency domain:
    • Time-domain: Mean, std dev, variance, RMS, percentiles
    • Shape: Kurtosis, skewness
    • Signal characteristics: Zero/mean crossing rates, entropy
    • Transform features: Hilbert magnitude

🏗️ Hierarchical Architecture

  • Two-stage classification for improved robustness
  • Stage 1: Binary classifier with SMOTE for class balancing
  • Stage 2: Multi-class classifier for fault identification
  • Computational efficiency: Stage 2 only invoked when fault detected

📊 Model Explainability

  • SHAP analysis for feature importance visualization
  • Interpretable predictions for maintenance decision-making
  • Trust and transparency for industrial deployment

🌐 Easy Deployment

  • Streamlit web interface - No coding required
  • CSV file upload - Simple data input
  • Real-time predictions - Instant results
  • Visualization - Confusion matrices and confidence scores

🏗️ Architecture

┌─────────────────────────────────────────────────────────────┐
│                    Input: Vibration Signal                   │
│                    (250,000 samples, 50kHz)                  │
└────────────────────────┬────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────┐
│              Wavelet Decomposition (Bior3.1, L4)            │
│  Approximation + 4 Detail Coefficients × 8 Sensor Channels  │
└────────────────────────┬────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────┐
│              Feature Extraction (273 features)               │
│   Statistical + Shape + Signal + Transform Features         │
└────────────────────────┬────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────┐
│         Stage 1: Binary Classification (LightGBM)           │
│                  Normal vs Fault                             │
│                  Accuracy: 100%                              │
└────────────────────────┬────────────────────────────────────┘
                         │
                    Is Fault? ──────No──────> Normal
                         │
                        Yes
                         │
                         ▼
┌─────────────────────────────────────────────────────────────┐
│        Stage 2: Multi-class Classification (LightGBM)       │
│              Identify Specific Fault Type                    │
│                  Accuracy: 99.48%                            │
└────────────────────────┬────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────┐
│                  Fault Type Prediction                       │
│         B/C/D/E/F/G/H/I/J with Confidence Score             │
└─────────────────────────────────────────────────────────────┘

📊 Performance

Binary Classification Results

Binary Classification Confusion Matrix

Metrics:

  • Accuracy: 100%
  • Precision: 1.00
  • Recall: 1.00
  • F1-Score: 1.00

Multi-class Classification Results

Multi-class Classification Confusion Matrix

Metrics:

  • Overall Accuracy: 99.48%
  • Macro-avg Precision: 0.995
  • Macro-avg Recall: 0.995
  • Macro-avg F1-Score: 0.995

Per-Class Performance:

| Fault Type | Samples | Accuracy | Notes |
|---|---|---|---|
| Imbalance (B) | 67 | 100% | ✅ Perfect |
| Horizontal Misalignment (C) | 39 | 100% | ✅ Perfect |
| Vertical Misalignment (D) | 60 | 96.7% | ⚠️ 2 misclassified |
| Overhang Ball (E) | 27 | 100% | ✅ Perfect |
| Overhang Cage (F) | 38 | 100% | ✅ Perfect |
| Overhang Outer Race (G) | 38 | 100% | ✅ Perfect |
| Underhang Ball (H) | 37 | 100% | ✅ Perfect |
| Underhang Cage (I) | 38 | 100% | ✅ Perfect |
| Underhang Outer Race (J) | 37 | 100% | ✅ Perfect |
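As a quick consistency check, the overall accuracy can be recomputed from the per-class rows. The sketch below uses only the sample counts and misclassification counts from the table, and reads the per-class "Accuracy" column as per-class recall (the fraction of each class's test samples that were classified correctly), which is an interpretation rather than something the table states.

```python
# Test-set sizes and misclassification counts taken from the per-class table.
per_class = {
    "B": (67, 0), "C": (39, 0), "D": (60, 2),
    "E": (27, 0), "F": (38, 0), "G": (38, 0),
    "H": (37, 0), "I": (38, 0), "J": (37, 0),
}

total = sum(n for n, _ in per_class.values())
errors = sum(e for _, e in per_class.values())
overall_accuracy = (total - errors) / total

# Per-class recall: share of each class's samples that were classified correctly.
recall = {k: (n - e) / n for k, (n, e) in per_class.items()}

print(total, errors)              # 381 2
print(f"{overall_accuracy:.2%}")  # 99.48%
print(f"{recall['D']:.1%}")       # 96.7%
```

The two misclassified Vertical Misalignment samples account for the whole gap between 99.48% and 100%.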

🚀 Installation

Prerequisites

  • Python 3.8 or higher
  • pip package manager
  • Virtual environment (recommended)

Quick Start

  1. Clone the repository
git clone https://github.com/ayushraj09/vibration-analysis.git
cd vibration-analysis
  2. Create a virtual environment
# Using venv
python -m venv venv

# Activate virtual environment
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate
  3. Install dependencies
pip install -r requirements.txt

Dependencies

streamlit>=1.28.0
numpy>=1.24.0
pandas>=2.0.0
pywavelets>=1.4.1
lightgbm>=4.0.0
scikit-learn>=1.3.0
shap>=0.42.0
imbalanced-learn>=0.11.0
joblib>=1.3.0
matplotlib>=3.7.0
seaborn>=0.12.0

💻 Usage

1. Launch the Application

streamlit run src/app.py

The application will open in your default web browser at http://localhost:8501

2. Upload Vibration Data

  • Click "Browse files" or drag-and-drop a CSV file
  • Supported format: CSV with 8 columns (one per sensor channel)
  • Expected data: 250,000 rows per measurement (5 seconds at 50 kHz)
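Checking the shape of an uploaded file before feature extraction gives clearer errors than a failure deep in the pipeline. The following is a minimal sketch using only the standard library; `read_measurement` is a hypothetical helper, and it assumes the CSV has no header row. The actual checks in `src/app.py` may differ.

```python
import csv
import io


def read_measurement(fileobj, n_channels=8, n_samples=None):
    """Parse a header-less CSV of floats and check its shape.

    Returns a list of `n_channels` columns (one list of floats per sensor).
    Passing `n_samples=None` skips the row-count check.
    """
    columns = [[] for _ in range(n_channels)]
    n_rows = 0
    for line_no, row in enumerate(csv.reader(fileobj), start=1):
        if not row:
            continue  # tolerate blank lines
        if len(row) != n_channels:
            raise ValueError(
                f"line {line_no}: expected {n_channels} columns, got {len(row)}"
            )
        for column, value in zip(columns, row):
            column.append(float(value))
        n_rows += 1
    if n_samples is not None and n_rows != n_samples:
        raise ValueError(f"expected {n_samples} rows, got {n_rows}")
    return columns


# Tiny synthetic file: 3 rows x 8 columns (a real measurement has 250,000 rows).
demo = io.StringIO(
    "\n".join(",".join(str(r + c / 10) for c in range(8)) for r in range(3))
)
channels = read_measurement(demo, n_samples=3)
print(len(channels), len(channels[0]))  # 8 3
```

For full-size files, `pandas.read_csv(file, header=None)` would be the more typical choice; the same shape checks apply to the resulting DataFrame.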

3. View Results

The system will:

  1. Extract features using wavelet decomposition
  2. Classify as Normal or Fault (Stage 1)
  3. If Fault, identify specific fault type (Stage 2)
  4. Display prediction with confidence score

4. Interpret Results

  • Normal: No maintenance required
  • Fault Type B-J: Review fault description and schedule appropriate maintenance

🔬 Technical Details

Wavelet Decomposition

The system uses Biorthogonal 3.1 (Bior3.1) mother wavelet, selected based on comprehensive analysis by Das & Das (2023) showing superior performance for rotating machinery fault detection.

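Mathematical Formulation:

The wavelet transform of a signal $x(t)$ at scale $a$ and translation $b$ is given below; the discrete wavelet transform (DWT) evaluates it on the dyadic grid $a = 2^j$, $b = k\,2^j$ for decomposition level $j$ and integer shift $k$:

$$ W(a,b) = \frac{1}{\sqrt{|a|}} \int_{-\infty}^{\infty} x(t)\,\psi^*\left(\frac{t-b}{a}\right)dt $$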

Where:

  • $\psi(t)$ = Mother wavelet (Bior3.1)
  • $a$ = Scaling parameter
  • $b$ = Translation parameter
  • $\psi^*$ = Complex conjugate

Decomposition Parameters:

  • Level: 4
  • Output: 1 approximation + 4 detail coefficient sets
  • Sensor channels: 8 (underhang accelerometer 3-axis, overhang accelerometer 3-axis, tachometer, microphone)
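To relate the five coefficient sets to physical frequencies, the nominal band of each set follows from the 50 kHz sampling rate by repeated halving. The sketch below is a worked calculation, not part of the project code: `dwt_band_edges` is a hypothetical helper, and it assumes ideal half-band filters, whereas real wavelet filters such as Bior3.1 overlap between neighbouring bands.

```python
def dwt_band_edges(fs_hz: float, level: int) -> dict:
    """Nominal frequency band (low, high) in Hz for each DWT coefficient set.

    Assumes ideal half-band filters; real wavelet filters overlap between bands.
    """
    nyquist = fs_hz / 2
    bands = {}
    for j in range(1, level + 1):
        # Detail level j covers the upper half of what remains after j-1 splits.
        bands[f"d{j}"] = (nyquist / 2 ** j, nyquist / 2 ** (j - 1))
    bands["approx"] = (0.0, nyquist / 2 ** level)
    return bands


for name, (lo, hi) in dwt_band_edges(50_000, 4).items():
    print(f"{name:>6}: {lo:8.1f} - {hi:8.1f} Hz")
```

With PyWavelets, `pywt.wavedec(signal, 'bior3.1', level=4)` performs the decomposition itself and returns the coefficient arrays in the order `[cA4, cD4, cD3, cD2, cD1]`.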

Feature Engineering

Statistical Features (per coefficient set):

  • Central tendency: Mean, Median
  • Dispersion: Standard deviation, Variance, RMS
  • Distribution shape: Kurtosis, Skewness
  • Percentiles: 5th, 25th, 75th, 95th
  • Signal characteristics: Zero crossing rate, Mean crossing rate
  • Information theory: Shannon entropy
  • Transform domain: Hilbert transform magnitude

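Total Features: 273. The full grid of the 15 statistics listed above × 5 coefficient sets × 8 channels would give 600 features, so the models evidently use a subset of these combinations; the exact feature list is determined by the extraction code in src/utils.py.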
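The statistics listed above can be computed for one coefficient array as sketched below. This is an illustration with NumPy only, not the code in `src/utils.py`: the function and key names are hypothetical, the histogram binning used for Shannon entropy is an assumption, kurtosis is the non-excess form, and summarising the Hilbert magnitude by its mean is also an assumption.

```python
import numpy as np


def coefficient_features(c, bins=64):
    """Summary statistics for one 1-D coefficient array (names are illustrative)."""
    c = np.asarray(c, dtype=float)
    mean, std = c.mean(), c.std()  # note: a constant input gives std == 0
    centered = c - mean

    # Shannon entropy of a histogram of the values (the binning is an assumption).
    counts, _ = np.histogram(c, bins=bins)
    p = counts[counts > 0] / counts.sum()

    # Envelope via the analytic signal, built with the FFT (Hilbert transform).
    n = c.size
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    envelope = np.abs(np.fft.ifft(np.fft.fft(c) * weights))

    p5, p25, p75, p95 = np.percentile(c, [5, 25, 75, 95])
    return {
        "mean": mean, "median": np.median(c), "std": std, "var": c.var(),
        "rms": np.sqrt(np.mean(c ** 2)),
        "kurtosis": np.mean(centered ** 4) / std ** 4,  # non-excess (normal = 3)
        "skewness": np.mean(centered ** 3) / std ** 3,
        "p5": p5, "p25": p25, "p75": p75, "p95": p95,
        # Fraction of consecutive sample pairs whose sign changes.
        "zero_cross_rate": np.mean(np.diff(np.signbit(c).astype(int)) != 0),
        "mean_cross_rate": np.mean(np.diff(np.signbit(centered).astype(int)) != 0),
        "entropy": float(-np.sum(p * np.log2(p))),
        "hilbert_mean": envelope.mean(),
    }


# Sanity check on a unit-amplitude sine: RMS is about 0.707 and the envelope about 1.
t = np.arange(1000) / 1000.0
features = coefficient_features(np.sin(2 * np.pi * 5 * t))
print(len(features), round(features["rms"], 3), round(features["hilbert_mean"], 3))
```

Applying such a function to each of the five coefficient sets of each channel, and prefixing the keys with the channel and coefficient name, would produce feature names of the kind shown in the SHAP section.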

Model Configuration

Stage 1: Binary Classifier

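from lightgbm import LGBMClassifier

# Stage 1: normal vs fault
binary_model = LGBMClassifier(
    boosting_type='gbdt',
    objective='binary',
    learning_rate=0.05,
    n_estimators=100,
    num_leaves=31,
    max_depth=-1,            # no depth limit; tree size is bounded by num_leaves
    min_child_samples=20,
    random_state=42
)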

Class Balancing: SMOTE (Synthetic Minority Over-sampling Technique)

  • Original: 49 normal, 1,902 faulty
  • Balanced: Equal representation
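SMOTE balances the classes by interpolating between a minority sample and one of its nearest minority-class neighbours. The sketch below re-implements that core idea in NumPy to show the mechanism; it is a simplified stand-in, not the imbalanced-learn implementation the project depends on, and `smote_oversample` and the random toy data are hypothetical.

```python
import numpy as np


def smote_oversample(X_min, n_new, k=5, seed=42):
    """Create `n_new` synthetic minority samples by neighbour interpolation."""
    X_min = np.asarray(X_min, dtype=float)
    rng = np.random.default_rng(seed)

    # Pairwise distances within the minority class; a point is not its own neighbour.
    dist = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    k = min(k, len(X_min) - 1)
    neighbours = np.argsort(dist, axis=1)[:, :k]

    base = rng.integers(0, len(X_min), size=n_new)           # sample to start from
    pick = neighbours[base, rng.integers(0, k, size=n_new)]  # one of its k neighbours
    gap = rng.random((n_new, 1))                             # position along the segment
    return X_min[base] + gap * (X_min[pick] - X_min[base])


# Toy stand-in for the 49 normal vs 1,902 faulty measurements (3 features each).
rng = np.random.default_rng(0)
X_normal = rng.normal(size=(49, 3))
X_synth = smote_oversample(X_normal, n_new=1902 - 49)
X_balanced_normal = np.vstack([X_normal, X_synth])
print(X_balanced_normal.shape)  # (1902, 3)
```

With imbalanced-learn, the equivalent call is `SMOTE(random_state=42).fit_resample(X, y)`. Resampling is normally fitted on the training split only, so that synthetic points derived from test samples do not leak into training.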

Stage 2: Multi-class Classifier

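from lightgbm import LGBMClassifier

# Stage 2: which of the 9 fault types
multi_model = LGBMClassifier(
    boosting_type='gbdt',
    objective='multiclass',
    num_class=9,             # one class per fault type (B-J)
    learning_rate=0.05,
    n_estimators=100,
    num_leaves=31,
    max_depth=-1,            # no depth limit; tree size is bounded by num_leaves
    min_child_samples=20,
    random_state=42
)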

Training Strategy: 80-20 train-test split on fault samples only

Why LightGBM?

| Feature | Advantage |
|---|---|
| Speed | Faster training than Random Forest |
| Memory | Lower memory footprint |
| Accuracy | Competitive with other ensemble methods |
| Scalability | Handles large datasets efficiently |
| Industry-ready | Proven in production environments |

📚 Dataset

MAFAULDA - Machinery Fault Database

Source: Federal University of Rio de Janeiro (UFRJ)
Link: https://www02.smt.ufrj.br/~offshore/mfs/page_01.html

Experimental Setup:

  • Machine: SpectraQuest Machinery Fault Simulator (ABVT)
  • Motor: 1/4 HP DC, 700-3600 RPM
  • Sensors:
    • IMI 601A01 accelerometers (underhang: 3-axis)
    • IMI 604B31 accelerometer (overhang: 3-axis)
    • Monarch MT-190 tachometer
    • Shure SM81 microphone
  • Sampling: 50 kHz, 5 seconds (250,000 samples/measurement)
  • Bearing specs: 8 rolling elements, 0.7145 cm ball diameter

Dataset Composition:

| Fault Type | Measurements | Variations |
|---|---|---|
| Normal | 49 | 737-3686 RPM |
| Imbalance | 333 | 6-35 g weights |
| Horizontal Misalignment | 197 | 0.5-2.0 mm shifts |
| Vertical Misalignment | 301 | 0.51-1.90 mm shifts |
| Bearing Faults (Underhang) | 558 | Ball/Cage/Outer race × (0, 6, 20, 35 g) |
| Bearing Faults (Overhang) | 513 | Ball/Cage/Outer race × (0, 6, 20, 35 g) |
| Total | 1,951 | - |

Key Characteristic: Bearing-fault measurements are recorded with added imbalance masses (0, 6, 20, and 35 g) because bearing faults are hard to perceive without imbalance.


🔍 Model Explainability

SHAP Feature Importance

Top Contributing Features:

  1. col6_mean_d4 - Mean of detail coefficient 4 (column 6)
  2. col3_mean_approx - Mean of approximation coefficient (column 3)
  3. col6_mean_d3 - Mean of detail coefficient 3 (column 6)
  4. col3_std_d4 - Standard deviation of detail coefficient 4 (column 3)
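The feature names above appear to follow a `col<column>_<statistic>_<coefficient>` pattern. A small parser makes it easy to group importances by sensor column or by coefficient set. This is a sketch inferred from the four names listed; `parse_feature_name` is hypothetical, and whether column numbers start at 0 or 1 is not stated here.

```python
import re
from typing import NamedTuple

# Pattern inferred from names such as "col6_mean_d4" and "col3_mean_approx".
FEATURE_NAME = re.compile(
    r"^col(?P<column>\d+)_(?P<statistic>.+)_(?P<coefficient>approx|d\d+)$"
)


class FeatureName(NamedTuple):
    column: int        # sensor column in the input CSV
    statistic: str     # e.g. "mean", "std"
    coefficient: str   # "approx" or a detail level such as "d4"


def parse_feature_name(name: str) -> FeatureName:
    """Split a feature name into its column, statistic, and coefficient parts."""
    match = FEATURE_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognised feature name: {name!r}")
    return FeatureName(int(match["column"]), match["statistic"], match["coefficient"])


for name in ["col6_mean_d4", "col3_mean_approx", "col6_mean_d3", "col3_std_d4"]:
    print(parse_feature_name(name))
```

Grouping the parsed names shows that the top four features come from only two columns (3 and 6).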

Key Insights:

  • ✅ Deeper decomposition levels (d3, d4, the lower-frequency detail bands) capture critical fault signatures
  • ✅ Mean and standard deviation are most discriminative
  • ✅ Multiple sensor channels contribute synergistically
  • ✅ Both approximation and detail coefficients are important

Feature Interpretation

  • High values (red/pink): Push predictions toward specific fault classes
  • Low values (blue): Indicate different fault characteristics
  • Bidirectional impact: Demonstrates complex, non-linear decision boundaries

📁 Project Structure

vibration-analysis/
│
├── src/
│   ├── app.py                      # Streamlit web application
│   ├── utils.py                    # Feature extraction utilities
│   ├── lgbm_binary_model.joblib    # Trained binary classifier
│   ├── lgbm_multi_model.joblib     # Trained multi-class classifier
│   └── label_encoder.joblib        # Label encoder for fault classes
│
├── screenshots/
│   ├── binary_cm.png               # Binary confusion matrix
│   ├── multi_cm.png                # Multi-class confusion matrix
│   ├── shap.png                    # SHAP feature importance
│
├── notebooks/
│   └── training.ipynb              # Model training notebook
│
├── docs/
│   └── technical_report.pdf        # Detailed technical report
│
├── requirements.txt                # Python dependencies
├── README.md                       # This file
├── LICENSE                         # MIT License
└── .gitignore                      # Git ignore rules

Reference Paper

This work is based on the methodology from:

@article{das2023smart,
  title={Smart machine fault diagnostics based on fault specified discrete wavelet transform},
  author={Das, Oguzhan and Bagci Das, Duygu},
  journal={Journal of the Brazilian Society of Mechanical Sciences and Engineering},
  volume={45},
  number={55},
  year={2023},
  publisher={Springer},
  doi={10.1007/s40430-022-03975-0}
}

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Research Foundation

  • Das & Das (2023) - Wavelet-based fault detection methodology
  • UFRJ Team - MAFAULDA dataset creation and maintenance

Open Source Libraries

  • LightGBM - Microsoft Research
  • Streamlit - Streamlit Inc.
  • PyWavelets - PyWavelets Development Team
  • scikit-learn - scikit-learn developers
  • SHAP - Scott Lundberg & Su-In Lee

Dataset

  • MAFAULDA Database - Federal University of Rio de Janeiro
  • Ribeiro et al. (2018) - Dataset curation and documentation

Original Kaggle Notebook
