forked from nhahub/NHA-187

Smart Complaint System (NHA-187): An intelligent Big Data platform that uses Apache Spark, Kafka, and a fine-tuned BERT model for real-time classification and sentiment analysis of Arabic customer complaints.


📢 Smart Complaint System


DataFlow Team



📖 About The Project

Smart Complaint System is an intelligent, event-driven Big Data platform designed to automate the handling of customer complaints. Developed as part of the Digital Egypt Pioneers Initiative (DEPI) - Round 3 (Huawei Big Data Program), this system addresses the challenges of manual complaint sorting by leveraging real-time stream processing and Natural Language Processing (NLP).

The platform ingests unstructured complaints, uses a fine-tuned BERT model to categorize them and assess sentiment severity in real time, and generates automated reports for stakeholders. The entire infrastructure is containerized for deployment with Docker Compose.

✨ Core Features

  • 🚀 Real-Time Ingestion: A user-friendly Streamlit interface captures complaints and streams them instantly to Apache Kafka.
  • 🧠 AI-Powered Inference:
    • Classification: Routes complaints to 5 departments (e.g., Bread Quality, Staff Behavior, System Down) using a fine-tuned BERT model.
    • Sentiment Analysis: Calculates a "Severity Score" (High/Medium/Low) based on the emotional tone of the text.
  • ⚡ Scalable Stream Processing: Apache Spark Structured Streaming processes data in micro-batches, applying "Lazy Loading" for efficient model inference.
  • 🗄️ Persistent Storage: Processed data is stored in MySQL for historical analysis.
  • 📧 Automated Reporting: Apache Airflow orchestrates a weekly workflow to generate Excel reports and email them directly to stakeholders.
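
The "Lazy Loading" idea mentioned above can be illustrated with a minimal sketch: the model is loaded the first time a batch needs it and reused afterwards, instead of being reloaded for every micro-batch. The names `get_classifier`, `process_batch`, and `LOAD_CALLS` are hypothetical, and the toy keyword rule stands in for the fine-tuned BERT model so the example runs without any ML dependency. It is not the project's actual loader.

```python
from functools import lru_cache
from typing import Callable, List

# Counts how often the (simulated) expensive load runs. Illustration only.
LOAD_CALLS = {"count": 0}


def _load_classifier() -> Callable[[str], str]:
    """Hypothetical stand-in for loading a fine-tuned model from disk."""
    LOAD_CALLS["count"] += 1

    def classify(text: str) -> str:
        # Toy rule, NOT the real model: keeps the sketch dependency-free.
        return "System Down" if "system" in text.lower() else "Unclassified"

    return classify


@lru_cache(maxsize=1)
def get_classifier() -> Callable[[str], str]:
    """Load the model on first use, then return the cached instance."""
    return _load_classifier()


def process_batch(texts: List[str]) -> List[str]:
    """Classify one micro-batch; the model loads at most once per process."""
    classifier = get_classifier()
    return [classifier(text) for text in texts]
```

In a Spark job, a pattern like this would typically live in the worker-side code so that each executor process pays the loading cost once rather than once per batch.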

👥 Team Members

  • Medhat Mohamed Ezzat
  • Rawan Samy Nada
  • George Ezzat Hosni
  • Farah Maurice Wanis
  • David Sameh Fouad
  • Jana Amr Abdelhamed

🏗️ Architecture Overview

The system follows a microservices architecture orchestrated by Docker Compose:

  1. Producer: Streamlit App → Validates input → Sends JSON to Kafka.
  2. Message Broker: Apache Kafka buffers high-velocity data.
  3. Processor: Spark Streaming consumes data → Loads AI Models → Writes to MySQL.
  4. Storage: MySQL database stores the raw text and AI predictions.
  5. Orchestrator: Airflow runs weekly jobs to extract data from MySQL and send email summaries.
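
As a rough illustration of the producer step in the list above, the sketch below validates a complaint and serializes it to the JSON bytes that a Kafka producer would send. The field names (`text`, `submitted_at`), the `MIN_LENGTH` threshold, and the function name `build_message` are assumptions made for this example; the project's real schema and validation rules may differ.

```python
import json
from datetime import datetime, timezone
from typing import Optional

# Assumed minimum complaint length; not taken from the project.
MIN_LENGTH = 10


def build_message(text: str, now: Optional[datetime] = None) -> bytes:
    """Validate a complaint and serialize it as UTF-8 encoded JSON."""
    cleaned = text.strip()
    if len(cleaned) < MIN_LENGTH:
        raise ValueError("complaint text is too short")
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    # Hypothetical payload fields, chosen for illustration.
    payload = {"text": cleaned, "submitted_at": timestamp}
    # ensure_ascii=False keeps Arabic characters readable in the payload.
    return json.dumps(payload, ensure_ascii=False).encode("utf-8")
```

The returned bytes could then be passed as the message value to whichever Kafka client the Streamlit app uses.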

🛠️ Technologies & Tools

| Category | Technology | Purpose |
|---|---|---|
| Containerization | Docker & Docker Compose | Orchestration of the 5-service stack (Kafka, Spark, MySQL, Airflow, Streamlit). |
| User Interface | Streamlit | Interactive web form for complaint submission with validation logic. |
| Message Broker | Apache Kafka | Decoupled, fault-tolerant message buffering. |
| Stream Processing | Apache Spark (PySpark) | Distributed processing engine for real-time AI inference. |
| AI & NLP | Hugging Face (PyTorch) | Fine-tuned BERT models for Arabic text classification. |
| Storage | MySQL 8.0 | Relational database for storing analyzed complaints. |
| Orchestration | Apache Airflow | Scheduling weekly reporting and email notifications. |
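
The weekly report scheduled by Airflow presumably summarizes the complaints stored in MySQL. A minimal sketch of such an aggregation in plain Python follows; the row layout `(category, severity, created_at)` and the function name `weekly_summary` are assumptions for illustration, and the Excel export and email steps are intentionally left out.

```python
from collections import Counter
from datetime import datetime, timedelta
from typing import Dict, Iterable, Tuple

# Assumed row shape: (category, severity, created_at). Not the real schema.
Row = Tuple[str, str, datetime]


def weekly_summary(rows: Iterable[Row], now: datetime) -> Dict[Tuple[str, str], int]:
    """Count complaints per (category, severity) over the 7 days before `now`."""
    cutoff = now - timedelta(days=7)
    counts = Counter(
        (category, severity)
        for category, severity, created_at in rows
        if cutoff <= created_at <= now
    )
    return dict(counts)
```

In practice the same grouping could be pushed down into a SQL query; the Python version only shows the shape of the result a report would be built from.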

🚀 Getting Started

Prerequisites

  • Docker Desktop installed and running.
  • At least 4 GB of RAM available for containers.

Installation

  1. Clone the repository

    git clone https://github.com/medhat2525548/smart-complaint-system.git
    cd smart-complaint-system
  2. Download AI Models. Ensure the fine-tuned models are placed in the processor/models/ directory:

    • processor/models/category_model/
    • processor/models/sentiment_model/
  3. Start the Services

    docker-compose up --build -d
  4. Access the Interfaces

    • Complaint Portal: http://localhost:8501
    • Spark UI: http://localhost:8080
    • Airflow UI: http://localhost:8081 (User/Pass: admin/admin)
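
Before starting the services, it may help to confirm that the model directories from step 2 exist and are not empty. The following pre-flight check is a sketch under that assumption; the function name `missing_model_dirs` is hypothetical and the script is not part of the repository.

```python
from pathlib import Path
from typing import List

# Directory names taken from the installation steps above.
REQUIRED_MODEL_DIRS = ("category_model", "sentiment_model")


def missing_model_dirs(models_root: Path) -> List[str]:
    """Return the required model directories that are absent or empty."""
    missing = []
    for name in REQUIRED_MODEL_DIRS:
        model_dir = models_root / name
        if not model_dir.is_dir() or not any(model_dir.iterdir()):
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Run from the repository root so the relative path resolves.
    problems = missing_model_dirs(Path("processor/models"))
    if problems:
        print("Missing or empty model directories:", ", ".join(problems))
    else:
        print("All model directories are in place.")
```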
