
NCN-18M-TinyStories-v1.0

07 Dec 21:01
3c6ef7a

Neuromodulatory Control Networks (NCN) – 18M TinyStories Model (v1.0)

Overview

This release introduces the first public implementation and trained checkpoint of the Neuromodulatory Control Network (NCN) architecture. NCNs are biologically inspired control modules that dynamically modulate attention precision, residual-stream gain, and FFN gating within a Transformer. The goal is adaptive, context-dependent processing with negligible runtime overhead.

This release includes:

  • Full PyTorch implementation of the NCN architecture, with optional custom CUDA kernels
  • Training utilities, analysis scripts, and convergence diagnostics
  • An 18M-parameter NCN model trained for one epoch on TinyStories, reaching a validation perplexity of ≈ 4.5

Model Summary

  • Parameters: ~18M
  • Architecture: Transformer + NCN controller (layer-wise gain, precision, and gating modulation)
  • Training Data: TinyStories (1 epoch)
  • Validation Perplexity: ~4.52
  • Tokenizer: GPT-2 BPE (custom loader included)
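
For reference, perplexity is the exponential of the mean per-token cross-entropy, so a validation PPL of ≈ 4.52 corresponds to roughly 1.51 nats per token. A minimal sketch of that relationship (the helper name below is illustrative, not part of the repository):

```python
import math

def perplexity_from_nll(mean_nll_nats: float) -> float:
    """Perplexity is exp(mean negative log-likelihood per token)."""
    return math.exp(mean_nll_nats)

print(perplexity_from_nll(1.508))  # ~4.52, matching the reported validation PPL
print(math.log(4.52))              # ~1.51 nats per token
```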

Key Features

  • Neuromodulatory Control Network:
    Produces layer-wise scalar modulation signals for gain, attention precision β, and FFN gating γ (see the controller sketch after this list).

  • Dynamic Attention Precision:
    Controls attention entropy via a per-layer inverse temperature β, without interfering with KV-cache efficiency (see the block sketch after this list).

  • Homeostatic Regularization:
    Penalizes deviations of the modulation signals from their neutral values, promoting numerical and training stability.

  • Salience Pooling Mechanism:
    Pools a global context signal to form the tonic input and combines it with token-level phasic inputs (included in the controller sketch below).

  • Training & Analysis Tools:

    • Distributed Data Parallel (DDP) support
    • Automatic Mixed Precision (AMP)
    • Perplexity/convergence analysis via ppl_analyze.py
    • Loss and sample-efficiency analysis via sample_efficiency_analyze.py
    • Inference and neuromodulation testing of checkpoints via analyze_ncn.py
    • .txt to .bin conversion for training via prepare_data.py
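
To make the feature descriptions above concrete, here is a minimal, illustrative sketch of a controller with salience pooling and a homeostatic penalty. All class, function, and parameter names are hypothetical and simplified; they do not reflect the actual code in ncn_architecture/.

```python
import torch
import torch.nn as nn

class NCNController(nn.Module):
    """Illustrative controller: pools a global (tonic) context signal, combines it
    with token-level (phasic) features, and emits per-layer scalar modulation
    signals for residual gain, attention precision (beta), and FFN gating (gamma)."""

    def __init__(self, d_model: int, n_layers: int, hidden: int = 64):
        super().__init__()
        self.salience = nn.Linear(d_model, 1)          # salience-pooling scores
        self.phasic_proj = nn.Linear(d_model, hidden)  # token-level (phasic) path
        self.tonic_proj = nn.Linear(d_model, hidden)   # pooled global (tonic) path
        self.head = nn.Linear(hidden, n_layers * 3)    # gain, beta, gamma per layer
        self.n_layers = n_layers

    def forward(self, h: torch.Tensor):
        # h: (batch, seq, d_model) hidden states feeding the controller.
        w = torch.softmax(self.salience(h), dim=1)        # (B, T, 1) salience weights
        tonic = (w * h).sum(dim=1, keepdim=True)          # (B, 1, d_model) global context
        z = torch.tanh(self.phasic_proj(h) + self.tonic_proj(tonic))   # (B, T, hidden)
        # Keep signals close to their neutral value of 1.0.
        mods = 1.0 + 0.1 * torch.tanh(self.head(z)).mean(dim=1)        # (B, n_layers * 3)
        gain, beta, gamma = mods.view(-1, self.n_layers, 3).unbind(dim=-1)
        return gain, beta, gamma                          # each (B, n_layers)


def homeostatic_penalty(gain, beta, gamma, weight: float = 1e-3):
    """Penalize deviation from neutral modulation (all signals == 1)."""
    dev = (gain - 1).pow(2) + (beta - 1).pow(2) + (gamma - 1).pow(2)
    return weight * dev.mean()
```

The 1 + 0.1·tanh(…) parameterization is just one way to keep the outputs near the neutral value of 1; the repository may use a different scheme.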

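And a correspondingly simplified view of how the per-layer signals could be applied inside a Transformer block. This assumes β scales the queries (equivalently, the pre-softmax attention logits), which leaves cached keys and values untouched; it is a sketch of the idea, not the repository's block implementation.

```python
import torch
import torch.nn as nn

class ModulatedBlock(nn.Module):
    """Illustrative Transformer block modulated by per-layer (gain, beta, gamma)."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                 nn.Linear(4 * d_model, d_model))
        self.ln1 = nn.LayerNorm(d_model)
        self.ln2 = nn.LayerNorm(d_model)

    def forward(self, x, gain, beta, gamma):
        # beta acts as an inverse temperature on attention: scaling the queries is
        # equivalent to scaling the pre-softmax scores, so cached keys/values are
        # untouched and KV-cache efficiency is preserved.
        h = self.ln1(x)
        q = h * beta.view(-1, 1, 1)
        attn_out, _ = self.attn(q, h, h, need_weights=False)
        x = x + gain.view(-1, 1, 1) * attn_out                 # gain scales the residual update
        x = x + gamma.view(-1, 1, 1) * self.ffn(self.ln2(x))   # gamma gates the FFN branch
        return x
```

Here gain, beta, and gamma would each be one per-layer slice of the controller output above, e.g. gain[:, layer_idx].
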
Contents

  • ncn_architecture/ – NCN + modulated Transformer implementation
  • gpt2_tokenizer_files/ – GPT-2 tokenizer files
  • train.py – Training script with resumption and logging
  • assets/ – Training graphs and analysis outputs
  • logs/ – Full training .log file for the 18M TinyStories run

Checkpoint Details

The included checkpoint is the final state of the 18M NCN model after 1 training epoch on TinyStories.
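
If you want to inspect the checkpoint, a hedged loading example is below; the file name and dictionary keys are hypothetical, so consult train.py and analyze_ncn.py for the actual format.

```python
import torch

# Hypothetical file name and keys; the real checkpoint layout may differ.
ckpt = torch.load("ncn_18m_tinystories.pt", map_location="cpu")
print(list(ckpt.keys()))  # e.g. model weights, optimizer state, step counter

# model.load_state_dict(ckpt["model_state_dict"])  # hypothetical key
```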

Intended Use

This release is intended for:

  • Researchers exploring biologically inspired control mechanisms in LLMs
  • Developers investigating dynamic modulation approaches beyond static Transformers
  • Anyone experimenting with small, interpretable architectures

This checkpoint is not optimized for production or safety-critical applications.

License

This release is provided under Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0).
Tokenizer files may carry license terms consistent with the original GPT-2 tokenizer release.