🧩 AI Comprehensive Project with multiple submodules focusing on a specific data domain from images, text to tabular and signal

anhduckkzz/Comprehensive-Project-Thesis-Practice

Multi DataType Comprehensive Project

This repository demonstrates end-to-end preprocessing, modeling, and evaluation workflows across four data modalities: text, images, audio, and tabular records. Each modality folder contains:

  • Raw or auto-downloaded data assets
  • Training scripts that implement preprocessing and simple models from scratch or lightweight deep learning architectures
  • Persisted metrics and 50-epoch train/validation loss curves (SVG)
  • A modality-specific README explaining design decisions, exploratory analysis, and outcomes

Project Structure

Folder | Description
text/ | BBC-style news snippets with bag-of-words, TF-IDF, and a shallow neural classifier.【F:text/README.md†L1-L28】【F:text/src/train_text_models.py†L95-L170】
image/ | MNIST digit recognition using a handcrafted CNN and torchvision data utilities.【F:image/README.md†L1-L22】【F:image/src/train_mnist_cnn.py†L22-L133】
audio/ | YESNO spoken command classification with MFCC features and a 2D CNN.【F:audio/README.md†L1-L23】【F:audio/src/train_audio_model.py†L20-L164】
Tabular/ | Kaggle Titanic passenger survival modeling via scratch logistic regression.【F:Tabular/README.md†L1-L20】【F:Tabular/src/train_titanic_model.py†L32-L134】
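
To give a feel for what "scratch logistic regression" can look like, here is a minimal sketch in pure Python. It is not the code in Tabular/src/train_titanic_model.py; the function names, the learning rate, and the toy one-feature data are assumptions made for illustration. The 50-epoch default only mirrors the 50-epoch loss curves mentioned in this README.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logistic_regression(X, y, lr=0.5, epochs=50):
    """Full-batch gradient descent on mean binary cross-entropy.

    X is a list of feature lists, y a list of 0/1 labels.
    Returns (weights, bias, per-epoch mean losses).
    """
    n, n_features = len(X), len(X[0])
    w, b, losses = [0.0] * n_features, 0.0, []
    for _ in range(epochs):
        grad_w, grad_b, loss = [0.0] * n_features, 0.0, 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep log() finite
            loss += -(yi * math.log(p) + (1 - yi) * math.log(1.0 - p))
            err = p - yi  # gradient of the loss w.r.t. the logit
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
        losses.append(loss / n)
    return w, b, losses


def predict(X, w, b, threshold=0.5):
    """Label each row 1 when the predicted probability reaches the threshold."""
    return [
        int(sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= threshold)
        for xi in X
    ]


if __name__ == "__main__":
    # Hypothetical toy data: one standardized feature, linearly separable.
    X = [[-2.0], [-1.0], [1.0], [2.0]]
    y = [0, 0, 1, 1]
    w, b, losses = train_logistic_regression(X, y)
    print(predict(X, w, b), round(losses[0], 3), round(losses[-1], 3))
```

The returned loss list is the kind of series a train/validation loss curve would be drawn from. A real Titanic pipeline would also need feature encoding and scaling, which this sketch leaves out.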

Cross-Modal Highlights

  • Consistent Splits: Every training script enforces explicit train/validation/test partitions to standardize evaluation.【F:text/src/train_text_models.py†L86-L115】【F:image/src/train_mnist_cnn.py†L67-L86】【F:audio/src/train_audio_model.py†L72-L122】【F:Tabular/src/train_titanic_model.py†L68-L91】
  • From-Scratch Models: Simple neural or statistical models are implemented without relying on prebuilt pipelines, keeping the focus on core math and preprocessing logic.【F:text/src/train_text_models.py†L117-L170】【F:image/src/train_mnist_cnn.py†L22-L44】【F:audio/src/train_audio_model.py†L49-L71】【F:Tabular/src/train_titanic_model.py†L56-L134】
  • Transparent Metrics: Accuracy, precision, recall, and F1 are logged per modality to contextualize performance across balanced and imbalanced settings.【F:text/text_metrics.json†L1-L16】【F:image/mnist_metrics.json†L1-L6】【F:audio/audio_metrics.json†L1-L6】【F:Tabular/titanic_metrics.json†L1-L6】
  • Training Curves: SVG plots visualize 50-epoch loss trajectories for reproducibility and quick diagnostics.【F:text/plots/text_loss_curve.svg†L1-L23】【F:image/plots/mnist_loss_curve.svg†L1-L23】【F:audio/plots/audio_loss_curve.svg†L1-L23】【F:Tabular/plots/titanic_loss_curve.svg†L1-L23】
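
The split and metric ideas above can be sketched in a few lines of standard-library Python. This is an illustration only: the function names, the 80/10/10 fractions, and the fixed seed are assumptions, and the actual scripts may partition data and compute metrics differently.

```python
import random


def train_val_test_split(items, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle indices with a fixed seed and cut three disjoint partitions."""
    idx = list(range(len(items)))
    random.Random(seed).shuffle(idx)
    n_test = int(len(items) * test_frac)
    n_val = int(len(items) * val_frac)
    test = [items[i] for i in idx[:n_test]]
    val = [items[i] for i in idx[n_test:n_test + n_val]]
    train = [items[i] for i in idx[n_test + n_val:]]
    return train, val, test


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for 0/1 labels (positive class = 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": correct / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    train, val, test = train_val_test_split(list(range(100)))
    print(len(train), len(val), len(test))
    # Imbalanced toy labels: accuracy looks fine while recall is poor.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
    print(binary_metrics(y_true, y_pred))
```

On the imbalanced toy labels, accuracy is 0.8 while recall is only one third, which is why reporting precision, recall, and F1 alongside accuracy matters when classes are uneven.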

Getting Started

  1. Create a Python virtual environment and install PyTorch, torchaudio, torchvision, pandas, NumPy, and scikit-learn.
  2. Choose a modality folder and run the corresponding script under src/ (e.g., python text/src/train_text_models.py).
  3. Review the generated metrics JSON file (for example, text/text_metrics.json) and the SVG loss curve under plots/; consult the modality README for interpretation guidance.【F:text/README.md†L29-L34】【F:image/README.md†L23-L27】【F:audio/README.md†L24-L27】【F:Tabular/README.md†L21-L24】
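
For step 3, a small helper can gather the metrics files from all modality folders in one pass. The sketch below assumes each folder holds a file whose name ends in `_metrics.json` (as in text/text_metrics.json and Tabular/titanic_metrics.json); the `load_metrics` function is hypothetical and makes no assumption about the keys stored inside each file.

```python
import json
from pathlib import Path


def load_metrics(root="."):
    """Return {relative path: parsed JSON} for every */*_metrics.json under root."""
    root = Path(root)
    results = {}
    for path in sorted(root.glob("*/*_metrics.json")):
        with path.open(encoding="utf-8") as fh:
            results[path.relative_to(root).as_posix()] = json.load(fh)
    return results


if __name__ == "__main__":
    # Run from the repository root after at least one training script has finished.
    for name, metrics in load_metrics().items():
        print(name, json.dumps(metrics, indent=2))
```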

Each pipeline is intentionally lightweight to encourage experimentation with feature engineering, model capacity, and evaluation techniques across different data types.
