The Smart Manager's Dashboard for Data-Driven Decisions
A powerful Streamlit-based analytics dashboard that helps e-commerce managers make smarter inventory and sales decisions through:
- Cross-Selling Engine - Discover which products are bought together using Market Basket Analysis
- Inventory Forecaster - Predict future demand using ARIMA time series forecasting
- Upload Module - Analyze your own sales data with easy CSV upload
- Python 3.8 or higher
- pip package manager
- Your own CSV sales data (see format requirements below)
- Clone the repository and navigate to the project directory:
git clone https://github.com/equaan/E-Commerce-Intelligence-Suite.git
cd E-Commerce-Intelligence-Suite
- Create a virtual environment and activate it:
python -m venv venv
venv\Scripts\activate # On Windows
source venv/bin/activate # On macOS/Linux
- Run the setup wizard:
python setup.py
What Happens Next
The setup script will:
- Check the Python version
- Install all required dependencies
- Configure your CSV file path
- Initialize the database and start the app

Important: While installing dependencies, the process may pause for a few minutes.
- Wait patiently; it is downloading libraries in the background.
- If it seems stuck for too long, open a new terminal and run python setup.py again. The setup will continue from where it left off.

Once dependencies are installed, a data/ folder will be created.
- Move your CSV file into that folder.
- When prompted, enter the path as: data/your_file_name.csv
- The setup will automatically detect the file, validate it, and begin initialization.

If initialization fails, the script will show manual recovery steps. Follow those instructions and re-run the command.

After successful initialization, the app will automatically launch in your browser.
Your CSV file must contain these columns (the exact names can differ; if they do, update the mapping in config.py):
| Column | Description | Example Values |
|---|---|---|
| InvoiceNo | Transaction/Order ID | "536365", "536366" |
| StockCode | Product ID/SKU | "85123A", "71053" |
| Description | Product name | "WHITE HANGING HEART T-LIGHT HOLDER" |
| Quantity | Number of items sold | 6, 8, 2 |
| InvoiceDate | Transaction date | "2010-12-01 08:26:00" |
| UnitPrice | Price per item | 2.55, 3.39 |

Optional columns:

| Column | Description | Example Values |
|---|---|---|
| CustomerID | Customer identifier | "17850", "13047" |
| Country | Customer country | "United Kingdom", "France" |
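
As a rough illustration of the column requirements above, the following sketch checks a CSV header for the required column names using only the standard library. The function name `missing_columns` is hypothetical and is not part of the project; the suite's own validation lives in its setup and config code and may work differently.

```python
import csv
import io

# Required column names, taken from the table above.
REQUIRED_COLUMNS = [
    "InvoiceNo", "StockCode", "Description",
    "Quantity", "InvoiceDate", "UnitPrice",
]


def missing_columns(csv_file, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the CSV header."""
    reader = csv.reader(csv_file)
    header = next(reader, [])  # an empty file yields an empty header
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]


# Example: an in-memory CSV whose header lacks UnitPrice.
sample = io.StringIO(
    "InvoiceNo,StockCode,Description,Quantity,InvoiceDate\n"
    "536365,85123A,WHITE HANGING HEART T-LIGHT HOLDER,6,2010-12-01 08:26:00\n"
)
print(missing_columns(sample))  # ['UnitPrice']
```

If the list is non-empty, either rename the columns in the CSV or map them in config.py before running the setup.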
Don't have your own data yet? Try these popular e-commerce datasets:
- Source: UCI Machine Learning Repository
- Description: UK-based online retail transactions (2010-2011)
- Size: ~540K transactions
- Perfect for: Testing all features of the suite
- Source: Kaggle E-commerce Datasets
- Various datasets available with different formats
- Note: May require column mapping in config.py
- Source: Kaggle Retail Analytics
- Multiple options for different retail scenarios
- Good for: Testing forecasting capabilities
Your data should represent transactional sales records where each row is a product sold in a transaction. The system works best with:
- Multiple products per transaction (for cross-selling analysis)
- Time series data spanning several months (for forecasting)
- Consistent product identifiers (for accurate analysis)
- Business overview with key metrics
- Daily sales trends
- Top-performing products
- Key insights and analytics
- Market Basket Analysis using Apriori algorithm
- Product recommendation system
- Association rules with confidence, lift, and support metrics
- Interactive parameter tuning
- Visual charts and insights
- ARIMA-based demand forecasting
- Configurable forecast periods (7-90 days)
- Confidence intervals and trend analysis
- Stock level recommendations
- Model accuracy metrics (MAPE)
- CSV file validation and processing
- Sample data format download
- Real-time data cleaning and analysis
- Support for custom datasets
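
To make the association-rule metrics in the feature list above concrete, here is a minimal sketch that computes support, confidence, and lift for a single rule by hand. It is not the Apriori algorithm and not the project's code (the suite uses mlxtend for that); the sample rows and the helper names `build_baskets` and `rule_metrics` are made up for illustration.

```python
from collections import defaultdict

# Hypothetical rows: (InvoiceNo, StockCode), one per product line in a transaction.
rows = [
    ("1", "A"), ("1", "B"),
    ("2", "A"), ("2", "B"), ("2", "C"),
    ("3", "A"),
    ("4", "B"), ("4", "C"),
]


def build_baskets(rows):
    """Group product codes by invoice so each basket is one transaction."""
    baskets = defaultdict(set)
    for invoice, product in rows:
        baskets[invoice].add(product)
    return list(baskets.values())


def rule_metrics(baskets, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(baskets)
    count_a = sum(1 for b in baskets if antecedent in b)
    count_c = sum(1 for b in baskets if consequent in b)
    count_both = sum(1 for b in baskets if antecedent in b and consequent in b)
    support = count_both / n
    confidence = count_both / count_a if count_a else 0.0
    lift = confidence / (count_c / n) if count_c else 0.0
    return support, confidence, lift


baskets = build_baskets(rows)
support, confidence, lift = rule_metrics(baskets, "A", "B")
print(round(support, 2), round(confidence, 2), round(lift, 2))  # 0.5 0.67 0.89
```

A lift below 1, as here, means the two products appear together less often than they would if purchases were independent; rules worth acting on usually have a lift well above 1.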
DWM Project/
├── app.py # Main Streamlit application
├── initialize_data.py # Database initialization script
├── requirements.txt # Python dependencies
├── README.md # This file
├── setup.py # Installs deps, cleans data, inits DB, starts app
├── config.py # Column mapping & validation
├── USER_MANUAL.md # Feature usage guide
├── models/ # ML models
│   ├── market_basket.py # Market Basket Analysis
│   └── inventory_forecaster.py # ARIMA Forecasting
└── utils/ # Utility modules
    ├── database_setup.py # Database management
    └── data_processing.py # Data cleaning & validation
- Frontend: Streamlit, Plotly
- Backend: Python, Pandas, NumPy
- ML Libraries: mlxtend (Apriori), statsmodels (ARIMA)
- Database: SQLite with star schema
- FactSales: Main transaction table
- DimProduct: Product dimension
- DimCustomer: Customer dimension
- DimDate: Date dimension
- Apriori Algorithm: For market basket analysis and association rules
- ARIMA Model: For time series forecasting with automatic parameter selection
- Optimized for datasets up to 100K transactions
- Uses caching for improved performance
- Responsive design for various screen sizes
- Real-time analysis with progress indicators
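
The star schema listed above can be pictured with a small in-memory SQLite sketch. Only the four table names come from this README; every column name, key, and the sample row below are assumptions for illustration and may not match what utils/database_setup.py actually creates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical column layout; only the table names are taken from the README.
conn.executescript("""
CREATE TABLE DimProduct (product_key INTEGER PRIMARY KEY, stock_code TEXT, description TEXT);
CREATE TABLE DimCustomer (customer_key INTEGER PRIMARY KEY, customer_id TEXT, country TEXT);
CREATE TABLE DimDate (date_key INTEGER PRIMARY KEY, full_date TEXT);
CREATE TABLE FactSales (
    invoice_no TEXT,
    product_key INTEGER REFERENCES DimProduct(product_key),
    customer_key INTEGER REFERENCES DimCustomer(customer_key),
    date_key INTEGER REFERENCES DimDate(date_key),
    quantity INTEGER,
    unit_price REAL
);
""")

# One sample sale, using the example values from the CSV format table.
conn.execute("INSERT INTO DimProduct VALUES (1, '85123A', 'WHITE HANGING HEART T-LIGHT HOLDER')")
conn.execute("INSERT INTO DimCustomer VALUES (1, '17850', 'United Kingdom')")
conn.execute("INSERT INTO DimDate VALUES (20101201, '2010-12-01')")
conn.execute("INSERT INTO FactSales VALUES ('536365', 1, 1, 20101201, 6, 2.55)")

# The fact table holds the measures; dimensions are joined in for labels.
revenue_by_product = conn.execute("""
    SELECT p.description, SUM(f.quantity * f.unit_price) AS revenue
    FROM FactSales AS f
    JOIN DimProduct AS p ON p.product_key = f.product_key
    GROUP BY p.description
""").fetchall()
print(revenue_by_product)
```

Keeping measures in the fact table and descriptive attributes in the dimension tables is what lets dashboard queries like this one stay short.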
Configuration Guide (only needed if your column names or dataset type differ from the defaults)
Edit config.py and change the CSV_FILE_PATH:
# Examples of different data sources (keep only one assignment):
CSV_FILE_PATH = "data/my_sales_data.csv" # Local file
CSV_FILE_PATH = "C:/Users/YourName/Downloads/sales.csv" # Full path
CSV_FILE_PATH = "https://example.com/sales_data.csv" # URL (if accessible)
If your CSV has different column names, update the COLUMN_MAPPING:
COLUMN_MAPPING = {
'InvoiceNo': 'order_id', # Your CSV has 'order_id' instead of 'InvoiceNo'
'StockCode': 'product_sku', # Your CSV has 'product_sku' instead of 'StockCode'
'Description': 'product_name', # Your CSV has 'product_name' instead of 'Description'
'Quantity': 'qty', # Your CSV has 'qty' instead of 'Quantity'
'InvoiceDate': 'purchase_date', # Your CSV has 'purchase_date' instead of 'InvoiceDate'
'UnitPrice': 'price', # Your CSV has 'price' instead of 'UnitPrice'
'CustomerID': 'customer_id', # Your CSV has 'customer_id' instead of 'CustomerID'
}
Modify validation rules in config.py if needed:
VALIDATION_RULES = {
'min_quantity': 1, # Minimum quantity per row; a value of 1 excludes returns
'min_price': 0.01, # Minimum price threshold
'max_price': 5000, # Maximum price threshold (adjust for your products)
'remove_cancelled_orders': True, # Set to False if you want to keep cancelled orders
}
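
To show how settings like these might be used, the sketch below renames CSV columns to the standard names and then filters rows with the validation rules. The setting names mirror the config.py examples above, but the helpers `standardize` and `is_valid` are hypothetical, and treating an InvoiceNo that starts with "C" as a cancelled order is an assumption borrowed from the UCI Online Retail dataset, not something this README confirms about the project's cleaning logic.

```python
# Stand-ins for the settings in config.py (a shortened mapping for brevity).
COLUMN_MAPPING = {
    "InvoiceNo": "order_id",
    "Quantity": "qty",
    "UnitPrice": "price",
}
VALIDATION_RULES = {
    "min_quantity": 1,
    "min_price": 0.01,
    "max_price": 5000,
    "remove_cancelled_orders": True,
}


def standardize(row, mapping=COLUMN_MAPPING):
    """Rename a row's keys from the CSV's names to the standard names."""
    reverse = {csv_name: standard for standard, csv_name in mapping.items()}
    return {reverse.get(key, key): value for key, value in row.items()}


def is_valid(row, rules=VALIDATION_RULES):
    """Return True if the row passes the quantity, price, and cancellation rules."""
    if int(row["Quantity"]) < rules["min_quantity"]:
        return False
    price = float(row["UnitPrice"])
    if not rules["min_price"] <= price <= rules["max_price"]:
        return False
    # Assumption: cancelled orders carry an InvoiceNo starting with "C".
    if rules["remove_cancelled_orders"] and str(row["InvoiceNo"]).startswith("C"):
        return False
    return True


raw_rows = [
    {"order_id": "536365", "qty": "6", "price": "2.55"},
    {"order_id": "C536379", "qty": "-1", "price": "27.50"},  # return / cancellation
    {"order_id": "536370", "qty": "2", "price": "0.00"},     # below min_price
]
clean_rows = [row for row in map(standardize, raw_rows) if is_valid(row)]
print([row["InvoiceNo"] for row in clean_rows])  # ['536365']
```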