Smart Complaint System is an intelligent, event-driven Big Data platform designed to automate the handling of customer complaints. Developed as part of the Digital Egypt Pioneers Initiative (DEPI) - Round 3 (Huawei Big Data Program), this system addresses the challenges of manual complaint sorting by leveraging real-time stream processing and Natural Language Processing (NLP).
The platform ingests unstructured complaints, uses a fine-tuned BERT model to categorize them and assess sentiment severity in real time, and generates automated reports for stakeholders. The entire infrastructure is containerized, so the full stack can be started with a single Docker Compose command.
- 🚀 Real-Time Ingestion: A user-friendly Streamlit interface captures complaints and streams them instantly to Apache Kafka.
- 🧠 AI-Powered Inference:
  - Classification: Routes complaints to 5 departments (e.g., `Bread Quality`, `Staff Behavior`, `System Down`) using a fine-tuned BERT model.
  - Sentiment Analysis: Calculates a "Severity Score" (High/Medium/Low) based on the emotional tone of the text.
- ⚡ Scalable Stream Processing: Apache Spark Structured Streaming processes data in micro-batches, applying "Lazy Loading" for efficient model inference.
- 🗄️ Persistent Storage: Processed data is stored in MySQL for historical analysis.
- 📧 Automated Reporting: Apache Airflow orchestrates a weekly workflow to generate Excel reports and email them directly to stakeholders.
The system follows a microservices architecture orchestrated by Docker Compose:
- Producer: Streamlit App $\rightarrow$ Validates input $\rightarrow$ Sends JSON to Kafka.
- Message Broker: Apache Kafka buffers high-velocity data.
- Processor: Spark Streaming consumes data $\rightarrow$ Loads AI Models $\rightarrow$ Writes to MySQL.
- Storage: MySQL database stores the raw text and AI predictions.
- Orchestrator: Airflow runs weekly jobs to extract data from MySQL and send email summaries.
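
The producer step (validate input, then send JSON to Kafka) might look roughly like the sketch below. It is an illustration under assumptions, not the repository's code: the `MIN_LENGTH` rule, the `text` and `submitted_at` field names, and the `complaints` topic name are all hypothetical, and the Kafka client is replaced by an injected `send` callable so the example runs without a broker.

```python
import json
from datetime import datetime, timezone
from typing import Callable

MIN_LENGTH = 10  # hypothetical validation rule: reject very short complaints


def build_message(text: str) -> bytes:
    """Validate a complaint and serialize it as UTF-8 encoded JSON."""
    cleaned = text.strip()
    if len(cleaned) < MIN_LENGTH:
        raise ValueError(f"complaint must be at least {MIN_LENGTH} characters")
    payload = {
        "text": cleaned,  # hypothetical field name
        "submitted_at": datetime.now(timezone.utc).isoformat(),
    }
    # ensure_ascii=False keeps Arabic text readable in the serialized message.
    return json.dumps(payload, ensure_ascii=False).encode("utf-8")


def submit(text: str,
           send: Callable[[str, bytes], None],
           topic: str = "complaints") -> None:
    """Validate the complaint, then hand the bytes to a Kafka-like sender."""
    send(topic, build_message(text))


if __name__ == "__main__":
    # Stand-in sender that prints instead of talking to a broker.
    submit("The bread was stale today", lambda t, m: print(t, m.decode("utf-8")))
```

Passing the sender in as a callable keeps the validation logic testable on its own; in the running system that callable would wrap whichever Kafka producer client the Streamlit app uses.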
| Category | Technology | Purpose |
|---|---|---|
| Containerization | Docker & Docker Compose | Orchestration of the 5-service stack (Kafka, Spark, MySQL, Airflow, Streamlit). |
| User Interface | Streamlit | Interactive web form for complaint submission with validation logic. |
| Message Broker | Apache Kafka | Decoupled, fault-tolerant message buffering. |
| Stream Processing | Apache Spark (PySpark) | Distributed processing engine for real-time AI inference. |
| AI & NLP | Hugging Face (PyTorch) | Fine-tuned BERT models for Arabic text classification. |
| Storage | MySQL 8.0 | Relational database for storing analyzed complaints. |
| Orchestration | Apache Airflow | Scheduling weekly reporting and email notifications. |
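
To make the weekly reporting row more concrete, here is a minimal sketch of the kind of aggregation a report task could run over rows pulled from MySQL. The `category` and `severity` column names and the `summarize` function are assumptions for illustration; the actual schema and Airflow task are not shown in this README.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping


def summarize(rows: Iterable[Mapping[str, str]]) -> Dict[str, Dict[str, int]]:
    """Count complaints per category and per severity level."""
    by_category: Counter = Counter()
    by_severity: Counter = Counter()
    for row in rows:
        by_category[row["category"]] += 1  # hypothetical column name
        by_severity[row["severity"]] += 1  # hypothetical column name
    return {
        "by_category": dict(by_category),
        "by_severity": dict(by_severity),
    }


if __name__ == "__main__":
    sample = [
        {"category": "Bread Quality", "severity": "High"},
        {"category": "Bread Quality", "severity": "Low"},
        {"category": "System Down", "severity": "High"},
    ]
    print(summarize(sample))
```

The resulting dictionaries could then be written to a spreadsheet and attached to the stakeholder email.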
- Docker Desktop installed and running.
- 4GB+ RAM available for containers.
1. Clone the repository: run `git clone https://github.com/medhat2525548/smart-complaint-system.git`, then `cd smart-complaint-system`.
2. Download AI Models: Ensure the fine-tuned models are placed in the `processor/models/` directory:
   - `processor/models/category_model/`
   - `processor/models/sentiment_model/`
3. Start the Services: run `docker-compose up --build -d`.
4. Access the Interfaces:
   - Complaint Portal: http://localhost:8501
   - Spark UI: http://localhost:8080
   - Airflow UI: http://localhost:8081 (User/Pass: `admin`/`admin`)
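
After starting the stack, a quick way to confirm that the three interfaces are listening is a small port check like the sketch below. It only tests whether a TCP connection can be opened on the ports listed above and says nothing about whether each service is fully initialized. The function names are illustrative.

```python
import socket
from typing import Dict

# Ports taken from the interface list above.
SERVICES: Dict[str, int] = {
    "Complaint Portal": 8501,
    "Spark UI": 8080,
    "Airflow UI": 8081,
}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_services(host: str = "localhost") -> Dict[str, bool]:
    """Map each service name to whether its port accepts connections."""
    return {name: is_port_open(host, port) for name, port in SERVICES.items()}


if __name__ == "__main__":
    for name, up in check_services().items():
        print(f"{name}: {'up' if up else 'not reachable'}")
```

Containers can take a little while to come up after `docker-compose up`, so a "not reachable" result shortly after startup may simply mean the service is still booting.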