# Neuromodulatory Control Networks (NCN) – 18M TinyStories Model (v1.0)
Release tag: `NCN-18M-TinyStories-v1.0`
## Overview
This release introduces the first public implementation and trained checkpoint of the Neuromodulatory Control Network (NCN) architecture. NCNs are biologically inspired control modules that dynamically modulate attention precision, residual-stream gain, and FFN gating within a Transformer. The goal is adaptive, context-dependent processing with negligible runtime overhead.
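As a rough illustration of the control loop, here is a minimal PyTorch sketch under assumptions; the module and variable names are hypothetical, not the repository's actual code:

```python
import torch
import torch.nn as nn


class NCNController(nn.Module):
    """Hypothetical sketch: pool a tonic (global) context vector, mix it with
    the phasic (per-token) signal, and emit three modulation scalars per layer:
    residual gain g, attention precision beta, and FFN gate gamma."""

    def __init__(self, d_model: int, n_layers: int, hidden: int = 64):
        super().__init__()
        self.n_layers = n_layers
        self.net = nn.Sequential(
            nn.Linear(2 * d_model, hidden),
            nn.Tanh(),
            nn.Linear(hidden, 3 * n_layers),
        )

    def forward(self, h: torch.Tensor):
        # h: (batch, seq, d_model) hidden states feeding the controller
        tonic = h.mean(dim=1, keepdim=True).expand_as(h)  # salience pool (global context)
        raw = self.net(torch.cat([tonic, h], dim=-1))     # phasic signal = per-token h
        mods = 1.0 + torch.tanh(raw)                      # bounded in (0, 2), neutral at 1
        g, beta, gamma = mods.chunk(3, dim=-1)            # each (batch, seq, n_layers)
        return g, beta, gamma
```

Each Transformer layer would then scale its residual update by g, multiply its attention logits by β, and gate its FFN output with γ.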
This release includes:
- Full PyTorch implementation of the NCN architecture, with optional custom CUDA kernels
- Training utilities, analysis scripts, and convergence diagnostics
- An 18M-parameter NCN model trained on 1 epoch of TinyStories, achieving PPL ≈ 4.5
## Model Summary
- Parameters: ~18M
- Architecture: Transformer + NCN controller (layer-wise gain, precision, and gating modulation)
- Training Data: TinyStories (1 epoch)
- Validation Perplexity: ~4.52
- Tokenizer: GPT-2 BPE (custom loader included)
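For calibration, a validation perplexity of ~4.52 corresponds to a cross-entropy loss of ln(4.52) ≈ 1.51 nats per token.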
## Key Features
- **Neuromodulatory Control Network:** Produces layer-wise scalar modulation signals for gain, attention precision β, and FFN gating γ (see the controller sketch in the Overview above).
- **Dynamic Attention Precision:** Controls attention entropy via an inverse temperature, without interfering with KV-cache efficiency (sketched after this list).
- **Homeostatic Regularization:** Penalizes deviations from neutral modulation, ensuring numerical and training stability (sketched after this list).
- **Salience Pooling Mechanism:** Extracts a global context vector as tonic input and combines it with token-level phasic inputs.
- **Training & Analysis Tools:**
  - Distributed Data Parallel (DDP) support
  - Automatic Mixed Precision (AMP)
  - Perplexity/convergence analysis via `ppl_analyze.py`
  - Loss and sample-efficiency analysis via `sample_efficiency_analyze.py`
  - Inference and neuromodulation testing of checkpoints via `analyze_ncn.py`
  - `.txt`-to-`.bin` data conversion for training via `prepare_data.py` (sketched after this list)
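To make the precision mechanism concrete, here is a minimal sketch (function name and shapes are assumptions, not the release's CUDA kernel). β acts as an inverse temperature on the attention logits, and because keys and values are consumed unmodified, cached KV entries need no changes:

```python
import math
import torch
import torch.nn.functional as F


def precision_attention(q, k, v, beta):
    """Scaled dot-product attention with an inverse-temperature scalar beta.

    q, k, v: (batch, heads, seq, d_head); beta broadcastable to the score shape.
    beta > 1 sharpens attention (lower entropy); beta < 1 flattens it. beta only
    rescales logits at score time, so cached keys/values are used as stored.
    """
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    return F.softmax(beta * scores, dim=-1) @ v
```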
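The homeostatic term can likewise be sketched as a quadratic pull toward the neutral modulation value of 1 (the function name and weight `lam` are hypothetical):

```python
def homeostatic_penalty(g, beta, gamma, lam: float = 1e-3):
    """Quadratic penalty on deviations of the modulation signals from 1."""
    dev = ((g - 1) ** 2).mean() + ((beta - 1) ** 2).mean() + ((gamma - 1) ** 2).mean()
    return lam * dev

# e.g. during training:
# loss = lm_loss + homeostatic_penalty(g, beta, gamma)
```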
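For the `.txt`-to-`.bin` step, this is what such a conversion typically does; the sketch below is an assumption about `prepare_data.py`'s behavior, not its actual interface, and the Hugging Face tokenizer stands in for the bundled custom loader:

```python
import numpy as np
from transformers import GPT2TokenizerFast  # stand-in for the bundled custom loader

# Encode raw text with GPT-2 BPE and dump token ids as uint16 for fast mmap loading.
tok = GPT2TokenizerFast.from_pretrained("gpt2")
ids = tok(open("stories.txt").read())["input_ids"]       # file name is hypothetical
np.asarray(ids, dtype=np.uint16).tofile("stories.bin")   # GPT-2 vocab (50257) fits in uint16
```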
## Contents
- `ncn_architecture/` – NCN + modulated Transformer implementation
- `gpt2_tokenizer_files` – GPT-2 tokenizer files
- `train.py` – Training script with resumption and logging
- `assets/` – Training graphs and analysis outputs
- `logs/` – Full training `.log` file for the 18M TinyStories run
## Checkpoint Details
The included checkpoint is the final state of the 18M NCN model after 1 training epoch on TinyStories.
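A minimal inspection sketch, assuming a standard `torch.save` artifact (the file name below is a placeholder and the state-dict layout unverified):

```python
import torch

# File name is a placeholder; the state-dict layout is an assumption to verify.
ckpt = torch.load("ncn_18m_tinystories.pt", map_location="cpu")
print(list(ckpt.keys()) if isinstance(ckpt, dict) else type(ckpt))
```

For supported inference and neuromodulation probing of the checkpoint, `analyze_ncn.py` is the entry point shipped with this release.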
## Intended Use
This release is intended for:
- Researchers exploring biologically inspired control mechanisms in LLMs
- Developers investigating dynamic modulation approaches beyond static Transformers
- Anyone experimenting with small, interpretable architectures
This checkpoint is not optimized for production or safety-critical applications.
## License
This release is provided under Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0).
The bundled GPT-2 tokenizer files may carry their own license terms, consistent with the original GPT-2 tokenizer release.