# Neuromodulatory Control Networks (NCN) – 18M TinyStories Model (v1.0)
Release tag: `NCN-18M-TinyStories-v1.0`
## Overview
This release introduces the first public implementation and trained checkpoint of the Neuromodulatory Control Network (NCN) architecture. NCNs are biologically inspired control modules that dynamically modulate attention precision, residual-stream gain, and FFN gating within a Transformer. The goal is adaptive, context-dependent processing with negligible runtime overhead.
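As a rough illustration of the control loop, here is a minimal PyTorch sketch under assumptions; the module and variable names are hypothetical, not the repository's actual code:

```python
import torch
import torch.nn as nn


class NCNController(nn.Module):
    """Hypothetical sketch: pool a tonic (global) context vector, mix it with
    the phasic (per-token) signal, and emit three modulation scalars per layer:
    residual gain g, attention precision beta, and FFN gate gamma."""

    def __init__(self, d_model: int, n_layers: int, hidden: int = 64):
        super().__init__()
        self.n_layers = n_layers
        self.net = nn.Sequential(
            nn.Linear(2 * d_model, hidden),
            nn.Tanh(),
            nn.Linear(hidden, 3 * n_layers),
        )

    def forward(self, h: torch.Tensor):
        # h: (batch, seq, d_model) hidden states feeding the controller
        tonic = h.mean(dim=1, keepdim=True).expand_as(h)  # salience pool (global context)
        raw = self.net(torch.cat([tonic, h], dim=-1))     # phasic signal = per-token h
        mods = 1.0 + torch.tanh(raw)                      # bounded in (0, 2), neutral at 1
        g, beta, gamma = mods.chunk(3, dim=-1)            # each (batch, seq, n_layers)
        return g, beta, gamma
```

Each Transformer layer would then scale its residual update by g, multiply its attention logits by β, and gate its FFN output with γ.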
This release includes:
- Full PyTorch implementation of the NCN architecture, with optional custom CUDA kernels
- Training utilities, analysis scripts, and convergence diagnostics
- An 18M-parameter NCN model trained on 1 epoch of TinyStories, achieving PPL ≈ 4.5
## Model Summary
- Parameters: ~18M
- Architecture: Transformer + NCN controller (layer-wise gain, precision, and gating modulation)
- Training Data: TinyStories (1 epoch)
- Validation Perplexity: ~4.52
- Tokenizer: GPT-2 BPE (custom loader included)
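For calibration, a validation perplexity of ~4.52 corresponds to a cross-entropy loss of ln(4.52) ≈ 1.51 nats per token.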
## Key Features
- **Neuromodulatory Control Network:** Produces layer-wise scalar modulation signals for gain, attention precision β, and FFN gating γ (see the controller sketch in the Overview above).
- **Dynamic Attention Precision:** Controls attention entropy via an inverse temperature, without interfering with KV-cache efficiency (sketched after this list).
- **Homeostatic Regularization:** Penalizes deviations from neutral modulation, ensuring numerical and training stability (sketched after this list).
- **Salience Pooling Mechanism:** Extracts a global context vector as tonic input and combines it with token-level phasic inputs.
- **Training & Analysis Tools:**
  - Distributed Data Parallel (DDP) support
  - Automatic Mixed Precision (AMP)
  - Perplexity/convergence analysis via `ppl_analyze.py`
  - Loss and sample-efficiency analysis via `sample_efficiency_analyze.py`
  - Inference and neuromodulation testing of checkpoints via `analyze_ncn.py`
  - `.txt`-to-`.bin` data conversion for training via `prepare_data.py` (sketched after this list)
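To make the precision mechanism concrete, here is a minimal sketch (function name and shapes are assumptions, not the release's CUDA kernel). β acts as an inverse temperature on the attention logits, and because keys and values are consumed unmodified, cached KV entries need no changes:

```python
import math
import torch
import torch.nn.functional as F


def precision_attention(q, k, v, beta):
    """Scaled dot-product attention with an inverse-temperature scalar beta.

    q, k, v: (batch, heads, seq, d_head); beta broadcastable to the score shape.
    beta > 1 sharpens attention (lower entropy); beta < 1 flattens it. beta only
    rescales logits at score time, so cached keys/values are used as stored.
    """
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    return F.softmax(beta * scores, dim=-1) @ v
```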
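The homeostatic term can likewise be sketched as a quadratic pull toward the neutral modulation value of 1 (the function name and weight `lam` are hypothetical):

```python
def homeostatic_penalty(g, beta, gamma, lam: float = 1e-3):
    """Quadratic penalty on deviations of the modulation signals from 1."""
    dev = ((g - 1) ** 2).mean() + ((beta - 1) ** 2).mean() + ((gamma - 1) ** 2).mean()
    return lam * dev

# e.g. during training:
# loss = lm_loss + homeostatic_penalty(g, beta, gamma)
```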
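For the `.txt`-to-`.bin` step, this is what such a conversion typically does; the sketch below is an assumption about `prepare_data.py`'s behavior, not its actual interface, and the Hugging Face tokenizer stands in for the bundled custom loader:

```python
import numpy as np
from transformers import GPT2TokenizerFast  # stand-in for the bundled custom loader

# Encode raw text with GPT-2 BPE and dump token ids as uint16 for fast mmap loading.
tok = GPT2TokenizerFast.from_pretrained("gpt2")
ids = tok(open("stories.txt").read())["input_ids"]       # file name is hypothetical
np.asarray(ids, dtype=np.uint16).tofile("stories.bin")   # GPT-2 vocab (50257) fits in uint16
```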
## Contents
- `ncn_architecture/` – NCN + modulated Transformer implementation
- `gpt2_tokenizer_files` – GPT-2 tokenizer files
- `train.py` – Training script with resumption and logging
- `assets/` – Training graphs and analysis outputs
- `logs/` – Full training `.log` file for the 18M TinyStories run
## Checkpoint Details
The included checkpoint is the final state of the 18M NCN model after 1 training epoch on TinyStories.
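A minimal inspection sketch, assuming a standard `torch.save` artifact (the file name below is a placeholder and the state-dict layout unverified):

```python
import torch

# File name is a placeholder; the state-dict layout is an assumption to verify.
ckpt = torch.load("ncn_18m_tinystories.pt", map_location="cpu")
print(list(ckpt.keys()) if isinstance(ckpt, dict) else type(ckpt))
```

For supported inference and neuromodulation probing of the checkpoint, `analyze_ncn.py` is the entry point shipped with this release.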
## Intended Use
This release is intended for:
- Researchers exploring biologically inspired control mechanisms in LLMs
- Developers investigating dynamic modulation approaches beyond static Transformers
- Anyone experimenting with small, interpretable architectures
This checkpoint is not optimized for production or safety-critical applications.
## License
This release is provided under Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0).
The bundled GPT-2 tokenizer files may carry their own license terms, consistent with the original GPT-2 tokenizer release.