🛒 E-Commerce Intelligence Suite

The Smart Manager's Dashboard for Data-Driven Decisions

A powerful Streamlit-based analytics dashboard that helps e-commerce managers make smarter inventory and sales decisions through:

  • 🛒 Cross-Selling Engine - Discover which products are bought together using Market Basket Analysis
  • 📈 Inventory Forecaster - Predict future demand using ARIMA time series forecasting
  • 📤 Upload Module - Analyze your own sales data with easy CSV upload

🚀 Quick Start

Prerequisites

  • Python 3.8 or higher
  • pip package manager
  • Your own CSV sales data (see format requirements below)

🎯 Setup

  1. Clone the repository and navigate to the project directory:
git clone https://github.com/equaan/E-Commerce-Intelligence-Suite.git
cd E-Commerce-Intelligence-Suite
  2. Create a virtual environment:
python -m venv venv
venv\Scripts\activate        # On Windows
source venv/bin/activate     # On macOS/Linux
  3. Run the setup wizard:
python setup.py

🧩 What Happens Next

  1. The setup script will:

    • ✅ Check Python version
    • ✅ Install all required dependencies
    • ✅ Configure your CSV file path
    • ✅ Initialize the database and start the app
  2. ⚠️ Important: While installing dependencies, the process may pause for a few minutes.

    • Wait patiently; it's downloading libraries in the background.
    • If it seems stuck for too long, open a new terminal and run the command below again. The setup will continue from where it left off.
      python setup.py
  3. Once dependencies are installed, a data/ folder will be created.

    • Move your CSV file into that folder.
    • When prompted, enter the path as:
      data/your_file_name.csv
    • The setup will automatically detect the file, validate it, and begin initialization.
  4. If initialization fails, the script will show manual recovery steps. Follow those instructions and re-run the command.

  5. After successful initialization, the app will automatically launch in your browser.

📊 CSV Data Requirements

Your CSV file must contain the columns below. The column names in your file may differ, as long as you map them in config.py:

Required Columns:

| Column | Description | Example Values |
|--------|-------------|----------------|
| InvoiceNo | Transaction/Order ID | "536365", "536366" |
| StockCode | Product ID/SKU | "85123A", "71053" |
| Description | Product name | "WHITE HANGING HEART T-LIGHT HOLDER" |
| Quantity | Number of items sold | 6, 8, 2 |
| InvoiceDate | Transaction date | "2010-12-01 08:26:00" |
| UnitPrice | Price per item | 2.55, 3.39 |

Optional Columns:

| Column | Description | Example Values |
|--------|-------------|----------------|
| CustomerID | Customer identifier | "17850", "13047" |
| Country | Customer country | "United Kingdom", "France" |
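
Before running setup, it can help to confirm that a file's header contains the required columns. The snippet below is a minimal sketch using only the standard library. `missing_columns` is a hypothetical helper, not part of this project, and it assumes the file uses the default column names rather than mapped ones.

```python
import csv
import io

# Required column names as listed in this README.
REQUIRED_COLUMNS = [
    "InvoiceNo", "StockCode", "Description",
    "Quantity", "InvoiceDate", "UnitPrice",
]


def missing_columns(csv_file, required=REQUIRED_COLUMNS):
    """Return the required column names absent from the CSV header row."""
    reader = csv.reader(csv_file)
    header = next(reader, [])  # empty file -> empty header
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]


# Small in-memory sample standing in for a real file (UnitPrice is left out).
sample = io.StringIO(
    "InvoiceNo,StockCode,Description,Quantity,InvoiceDate\n"
    "536365,85123A,WHITE HANGING HEART T-LIGHT HOLDER,6,2010-12-01 08:26:00\n"
)
print(missing_columns(sample))  # ['UnitPrice']
```

For a real file, pass an open file handle instead of the in-memory sample, for example `open("data/your_file_name.csv", newline="")`.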

📁 Sample Datasets

Don't have your own data yet? Try these popular e-commerce datasets:

🛍️ Online Retail Dataset (Recommended)

  • Source: UCI Machine Learning Repository
  • Description: UK-based online retail transactions (2010-2011)
  • Size: ~540K transactions
  • Perfect for: Testing all features of the suite

🛒 E-commerce Data

  • Source: Kaggle E-commerce Datasets
  • Various datasets available with different formats
  • Note: May require column mapping in config.py

🏪 Retail Sales Data

  • Source: Kaggle Retail Analytics
  • Multiple options for different retail scenarios
  • Good for: Testing forecasting capabilities

📊 Custom Data Requirements

Your data should represent transactional sales records where each row is a product sold in a transaction. The system works best with:

  • Multiple products per transaction (for cross-selling analysis)
  • Time series data spanning several months (for forecasting)
  • Consistent product identifiers (for accurate analysis)
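
To gauge how well a dataset fits these guidelines, you could measure the share of transactions containing more than one product and the number of days the data covers. This is a rough sketch, not project code: `summarize` is a hypothetical helper, the sample rows are made up, and the date format is assumed to match the example value shown for InvoiceDate.

```python
from collections import defaultdict
from datetime import datetime


def summarize(rows):
    """Return (share of invoices with 2+ distinct products, days covered)."""
    products_per_invoice = defaultdict(set)
    dates = []
    for row in rows:
        products_per_invoice[row["InvoiceNo"]].add(row["StockCode"])
        dates.append(datetime.strptime(row["InvoiceDate"], "%Y-%m-%d %H:%M:%S"))
    if not dates:
        return 0.0, 0
    multi = sum(1 for items in products_per_invoice.values() if len(items) > 1)
    share = multi / len(products_per_invoice)
    span_days = (max(dates) - min(dates)).days
    return share, span_days


# Illustrative rows only; the second invoice and its date are invented.
rows = [
    {"InvoiceNo": "536365", "StockCode": "85123A", "InvoiceDate": "2010-12-01 08:26:00"},
    {"InvoiceNo": "536365", "StockCode": "71053", "InvoiceDate": "2010-12-01 08:26:00"},
    {"InvoiceNo": "536366", "StockCode": "85123A", "InvoiceDate": "2011-03-01 09:00:00"},
]
share, span_days = summarize(rows)
print(f"{share:.0%} multi-product invoices over {span_days} days")
```

A low share of multi-product invoices leaves little for cross-selling analysis to work with, and a short date span gives the forecaster little history.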

📊 Features (see USER_MANUAL.md for a complete feature overview)

🏠 Home Dashboard

  • Business overview with key metrics
  • Daily sales trends
  • Top-performing products
  • Key insights and analytics

🛒 Cross-Selling Engine

  • Market Basket Analysis using Apriori algorithm
  • Product recommendation system
  • Association rules with confidence, lift, and support metrics
  • Interactive parameter tuning
  • Visual charts and insights
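
The three rule metrics can be worked out by hand on a toy example. The sketch below is not the project's implementation (the tech stack lists mlxtend for Apriori). It only illustrates what support, confidence, and lift mean for a single rule, and the baskets and product names are made up.

```python
def rule_metrics(baskets, antecedent, consequent):
    """Compute support, confidence, and lift for the rule antecedent -> consequent.

    Assumes both itemsets occur in at least one basket.
    """
    n = len(baskets)
    a, c = set(antecedent), set(consequent)
    count_a = sum(1 for b in baskets if a <= b)
    count_c = sum(1 for b in baskets if c <= b)
    count_both = sum(1 for b in baskets if (a | c) <= b)

    support = count_both / n            # P(A and C)
    confidence = count_both / count_a   # P(C | A)
    lift = confidence / (count_c / n)   # P(C | A) / P(C)
    return support, confidence, lift


# Four made-up transactions.
baskets = [
    {"tea", "mug"},
    {"tea", "mug", "candle"},
    {"tea"},
    {"candle"},
]
support, confidence, lift = rule_metrics(baskets, {"tea"}, {"mug"})
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
# support=0.50 confidence=0.67 lift=1.33
```

A lift above 1 indicates that the two products appear together more often than they would if purchases were independent.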

📈 Inventory Forecaster

  • ARIMA-based demand forecasting
  • Configurable forecast periods (7-90 days)
  • Confidence intervals and trend analysis
  • Stock level recommendations
  • Model accuracy metrics (MAPE)
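
MAPE (mean absolute percentage error) is the average of |actual - forecast| / |actual| across periods, expressed as a percentage. The sketch below shows the calculation. Skipping periods with zero actual demand is an assumption made here to avoid division by zero; the README does not say how the project handles that case.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Periods where the actual value is zero are skipped (an assumption).
    """
    pairs = [(a, f) for a, f in zip(actual, forecast) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when every actual value is zero")
    return 100 * sum(abs((a - f) / a) for a, f in pairs) / len(pairs)


# Made-up daily demand and forecasts.
actual = [100, 120, 80, 90]
forecast = [110, 114, 80, 99]
print(f"MAPE: {mape(actual, forecast):.2f}%")  # MAPE: 6.25%
```

Lower values mean the forecast tracked actual demand more closely.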

📤 Upload Your Data

  • CSV file validation and processing
  • Sample data format download
  • Real-time data cleaning and analysis
  • Support for custom datasets

🗂️ Project Structure

DWM Project/
├── app.py                      # Main Streamlit application
├── initialize_data.py          # Database initialization script
├── requirements.txt            # Python dependencies
├── README.md                   # This file
├── setup.py                    # Installs deps, cleans data, inits DB, starts app
├── config.py                   # Column mapping & validation
├── USER_MANUAL.md              # Feature usage guide
├── models/                     # ML models
│   ├── market_basket.py        # Market Basket Analysis
│   └── inventory_forecaster.py # ARIMA Forecasting
└── utils/                      # Utility modules
    ├── database_setup.py       # Database management
    └── data_processing.py      # Data cleaning & validation

🔧 Technical Details

Tech Stack

  • Frontend: Streamlit, Plotly
  • Backend: Python, Pandas, NumPy
  • ML Libraries: mlxtend (Apriori), statsmodels (ARIMA)
  • Database: SQLite with star schema

Database Schema

  • FactSales: Main transaction table
  • DimProduct: Product dimension
  • DimCustomer: Customer dimension
  • DimDate: Date dimension
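
To illustrate the star-schema idea, here is a minimal in-memory SQLite sketch. Only the four table names come from this README. Every column name, key, and sample value layout below is an assumption, so the actual schema in utils/database_setup.py may look different.

```python
import sqlite3

# Only the table names are from the README; all columns are hypothetical.
SCHEMA = """
CREATE TABLE DimProduct  (product_key  INTEGER PRIMARY KEY, stock_code TEXT, description TEXT);
CREATE TABLE DimCustomer (customer_key INTEGER PRIMARY KEY, customer_id TEXT, country TEXT);
CREATE TABLE DimDate     (date_key     INTEGER PRIMARY KEY, full_date TEXT);
CREATE TABLE FactSales (
    invoice_no   TEXT,
    product_key  INTEGER REFERENCES DimProduct(product_key),
    customer_key INTEGER REFERENCES DimCustomer(customer_key),
    date_key     INTEGER REFERENCES DimDate(date_key),
    quantity     INTEGER,
    unit_price   REAL
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO DimProduct VALUES (1, '85123A', 'WHITE HANGING HEART T-LIGHT HOLDER')")
conn.execute("INSERT INTO DimCustomer VALUES (1, '17850', 'United Kingdom')")
conn.execute("INSERT INTO DimDate VALUES (20101201, '2010-12-01')")
conn.execute("INSERT INTO FactSales VALUES ('536365', 1, 1, 20101201, 6, 2.55)")

# Revenue per product: the fact table joins to a dimension on its key.
revenue = conn.execute("""
    SELECT p.description, SUM(f.quantity * f.unit_price)
    FROM FactSales AS f
    JOIN DimProduct AS p ON p.product_key = f.product_key
    GROUP BY p.description
""").fetchall()
print(revenue)
```

Keeping descriptive attributes in dimension tables means the fact table stores only keys and measures, which keeps aggregations like the one above simple.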

Algorithms Used

  • Apriori Algorithm: For market basket analysis and association rules
  • ARIMA Model: For time series forecasting with automatic parameter selection

📈 Performance Notes

  • Optimized for datasets up to 100K transactions
  • Uses caching for improved performance
  • Responsive design for various screen sizes
  • Real-time analysis with progress indicators

⚙️ Configuration Guide (only needed if your dataset uses different column names or a different structure)

Step 1: Update CSV Path

Edit config.py and change the CSV_FILE_PATH:

# Examples of different data sources:
CSV_FILE_PATH = "data/my_sales_data.csv"                    # Local file
CSV_FILE_PATH = "C:/Users/YourName/Downloads/sales.csv"     # Full path
CSV_FILE_PATH = "https://example.com/sales_data.csv"       # URL (if accessible)

Step 2: Map Your Column Names

If your CSV has different column names, update the COLUMN_MAPPING:

COLUMN_MAPPING = {
    'InvoiceNo': 'order_id',        # Your CSV has 'order_id' instead of 'InvoiceNo'
    'StockCode': 'product_sku',     # Your CSV has 'product_sku' instead of 'StockCode'
    'Description': 'product_name',   # Your CSV has 'product_name' instead of 'Description'
    'Quantity': 'qty',              # Your CSV has 'qty' instead of 'Quantity'
    'InvoiceDate': 'purchase_date', # Your CSV has 'purchase_date' instead of 'InvoiceDate'
    'UnitPrice': 'price',           # Your CSV has 'price' instead of 'UnitPrice'
    'CustomerID': 'customer_id',    # Your CSV has 'customer_id' instead of 'CustomerID'
}
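
One way a mapping of this shape could be applied is to rename each row's keys from your CSV's names back to the standard ones. The sketch below is illustrative only: `to_standard_names` is a hypothetical helper, and this README does not show how config.py actually consumes COLUMN_MAPPING.

```python
# Same shape as above: standard name -> name used in your CSV (shortened here).
COLUMN_MAPPING = {
    "InvoiceNo": "order_id",
    "StockCode": "product_sku",
    "Quantity": "qty",
}


def to_standard_names(row, mapping=COLUMN_MAPPING):
    """Rename a row's keys to the standard names; unmapped keys pass through."""
    reverse = {custom: standard for standard, custom in mapping.items()}
    return {reverse.get(key, key): value for key, value in row.items()}


row = {"order_id": "536365", "product_sku": "85123A", "qty": "6", "Country": "France"}
print(to_standard_names(row))
# {'InvoiceNo': '536365', 'StockCode': '85123A', 'Quantity': '6', 'Country': 'France'}
```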

Step 3: Adjust Data Validation (Optional)

Modify validation rules in config.py if needed:

VALIDATION_RULES = {
    'min_quantity': 1,              # 1 excludes returns (negative quantities); lower it to keep them
    'min_price': 0.01,              # Minimum price threshold
    'max_price': 5000,              # Maximum price threshold (adjust for your products)
    'remove_cancelled_orders': True, # Set to False if you want to keep cancelled orders
}
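
As a rough illustration of what rules like these do, the sketch below filters a few rows against them. `is_valid` is a hypothetical helper, the sample rows are made up, and identifying cancelled orders by an InvoiceNo starting with "C" is an assumption borrowed from the UCI Online Retail dataset's convention. The project's own cleaning logic may differ.

```python
VALIDATION_RULES = {
    "min_quantity": 1,
    "min_price": 0.01,
    "max_price": 5000,
    "remove_cancelled_orders": True,
}


def is_valid(row, rules=VALIDATION_RULES):
    """Return True if a sales row passes every rule."""
    if row["Quantity"] < rules["min_quantity"]:
        return False
    if not rules["min_price"] <= row["UnitPrice"] <= rules["max_price"]:
        return False
    # Assumption: cancelled orders carry an InvoiceNo that starts with "C".
    if rules["remove_cancelled_orders"] and str(row["InvoiceNo"]).startswith("C"):
        return False
    return True


rows = [
    {"InvoiceNo": "536365", "Quantity": 6, "UnitPrice": 2.55},    # passes all rules
    {"InvoiceNo": "C536379", "Quantity": -1, "UnitPrice": 27.5},  # negative quantity
    {"InvoiceNo": "536366", "Quantity": 2, "UnitPrice": 0.0},     # price below minimum
]
clean = [row for row in rows if is_valid(row)]
print([row["InvoiceNo"] for row in clean])  # ['536365']
```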
