BleeGleeWee/Spam-SMS-Detection

📩 Spam SMS Detection ML Project

  • Deployed link✨ Here

📌 Project Objective

The goal of this project is to build and deploy a machine learning model that can classify SMS messages as Spam or Ham (Not Spam).
The model is trained using a labeled dataset and deployed for real-world testing.


🛠️ Tech Stack

  • Python
  • scikit-learn
  • Pandas, NumPy
  • Natural Language Processing (NLP)
  • NLTK
  • Streamlit (for deployment)

📚 Dataset


📊 Project Stages

  1. Data Cleaning
  2. Exploratory Data Analysis (EDA)
  3. Text Preprocessing (tokenization, stemming, etc.)
  4. Model Building (Naive Bayes, Logistic Regression, etc.)
  5. Model Improvement (TF-IDF vectorization, hyperparameter tuning with GridSearchCV)
  6. Model Evaluation (Accuracy, Precision, Recall, F1 Score)
  7. App Development (Streamlit app, developed in PyCharm)
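
As a rough illustration of stage 3, the sketch below lowercases a message, tokenizes it, drops stopwords, and strips a few suffixes. It uses only the standard library so it runs anywhere. The stopword set, the `crude_stem` helper, and the `preprocess` name are hypothetical stand-ins, not the project's code; the project lists NLTK, whose stopword list and stemmers would normally take their place.

```python
import re

# Tiny illustrative stopword set (assumption); NLTK's English list is far larger.
STOPWORDS = {"a", "an", "the", "is", "are", "to", "you", "your",
             "for", "of", "and", "in", "on", "i", "it"}

# Hypothetical, deliberately crude suffix list standing in for a real stemmer.
SUFFIXES = ("ing", "ed", "s")


def crude_stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(message):
    """Lowercase, tokenize on letters/digits, drop stopwords, then stem."""
    tokens = re.findall(r"[a-z0-9]+", message.lower())
    return [crude_stem(t) for t in tokens if t not in STOPWORDS]


if __name__ == "__main__":
    print(preprocess("Claim your prizes now"))
```

The resulting token list is what a vectorizer such as TF-IDF would then turn into numeric features.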

📊 Model Performance

| Metric    | Score |
|-----------|-------|
| Accuracy  | 97.9% |
| Precision | 97.5% |
| Recall    | 96%   |
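
For reference, these metrics all derive from the confusion-matrix counts, and the F1 score mentioned in the project stages is the harmonic mean of precision and recall. The sketch below shows the arithmetic; the function names and the example counts are illustrative assumptions, not values from this project. Applying `f1_from` to the reported precision and recall gives roughly 0.967, which is only approximate if the reported figures are rounded.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1, treating spam as positive."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1_from(precision, recall),
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(classification_metrics(tp=8, fp=2, fn=2, tn=88))
    # F1 implied by the precision and recall reported in the table.
    print(round(f1_from(0.975, 0.96), 4))
```

For a spam filter, precision is often the metric to watch most closely, since a false positive means a legitimate message is flagged as spam.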

⚙️ Steps to Run the Project

1. Clone the repository:

git clone https://github.com/BleeGleeWee/Spam-SMS-Detection.git
cd Spam-SMS-Detection

2. Install dependencies:

pip install -r requirements.txt

3. Run the Jupyter Notebook:

jupyter notebook spam_sms_detection.ipynb

4. Run the Streamlit app locally:

streamlit run app.py

🌟 FINAL SHOWDOWN:

[Screenshots: "image", "Screenshot 2025-12-16 035317"]

Email/SMS-spam-classifier
│
├── data/
│   └── spam.csv                         # Original dataset (or link to download in README)
│
├── notebooks/
│   ├── 01_data_cleaning.ipynb           # Handling nulls, duplicates, formatting
│   ├── 02_eda.ipynb                     # Visualizations and exploratory analysis
│   ├── 03_text_preprocessing.ipynb      # Tokenization, stemming, stopword removal
│   ├── 04_model_building.ipynb          # Naive Bayes, Logistic Regression, etc.
│   └── 05_model_improvement.ipynb       # TF-IDF, hyperparameter tuning, evaluation
│
├── models/
│   ├── model.pkl                        # Serialized trained model (pickle)
│   └── vectorizer.pkl                   # Serialized fitted vectorizer (pickle)
│
├── app/
│   ├── app.py                           # App entry point
│   ├── main.py                          # Utility to load the model
│   └── train_model.py                   # Trains the model before testing
│
├── .gitignore                           # Ignore notebook checkpoints, model files, etc.
├── requirements.txt                     # All dependencies (Streamlit, scikit-learn, etc.)
├── nltk.txt                             # NLTK dependencies (stopwords, punkt)
├── README.md                            # Full documentation
└── LICENSE                              # MIT or any preferred open-source license
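
The layout above stores the trained model and the fitted vectorizer as separate pickle files. A minimal sketch of how they might be loaded and used at prediction time follows. It assumes both objects follow the scikit-learn convention (`transform` on the vectorizer, `predict` on the model) and that label `1` means spam. The function names `load_artifacts` and `classify` are hypothetical and not taken from `app.py` or `main.py`.

```python
import pickle
from pathlib import Path


def load_artifacts(models_dir="models"):
    """Load the pickled vectorizer and model from the models/ directory."""
    models_dir = Path(models_dir)
    with open(models_dir / "vectorizer.pkl", "rb") as f:
        vectorizer = pickle.load(f)
    with open(models_dir / "model.pkl", "rb") as f:
        model = pickle.load(f)
    return vectorizer, model


def classify(message, vectorizer, model, spam_label=1):
    """Vectorize one message and map the predicted label to 'Spam' or 'Ham'.

    Assumption: the model predicts `spam_label` for spam messages.
    """
    features = vectorizer.transform([message])
    prediction = model.predict(features)[0]
    return "Spam" if prediction == spam_label else "Ham"
```

Whatever text preprocessing was applied during training would need to be applied to `message` before calling `classify`, and pickle files should only be loaded from sources you trust.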

About

🚨 End-to-End SMS Spam Detection Using Machine Learning. This repository contains a complete machine learning pipeline for classifying SMS/Mail messages as Spam or Not Spam, built using a real-world dataset from Kaggle.
