graph LR
    Configuration_System["Configuration System"]
    Data_Pipeline["Data Pipeline"]
    Model_Core_GNNs_["Model Core (GNNs)"]
    Training_Evaluation_Engine["Training & Evaluation Engine"]
    Utility_Services["Utility Services"]
    Configuration_System -- "Defines Parameters For" --> Data_Pipeline
    Configuration_System -- "Defines Parameters For" --> Model_Core_GNNs_
    Configuration_System -- "Defines Parameters For" --> Training_Evaluation_Engine
    Data_Pipeline -- "Provides Data To" --> Training_Evaluation_Engine
    Data_Pipeline -- "Configured By" --> Configuration_System
    Model_Core_GNNs_ -- "Used By" --> Training_Evaluation_Engine
    Model_Core_GNNs_ -- "Configured By" --> Configuration_System
    Training_Evaluation_Engine -- "Consumes Data From" --> Data_Pipeline
    Training_Evaluation_Engine -- "Trains/Evaluates" --> Model_Core_GNNs_
    Training_Evaluation_Engine -- "Utilizes" --> Utility_Services
    Utility_Services -- "Loads From" --> Configuration_System
    Utility_Services -- "Supports" --> Training_Evaluation_Engine
    click Configuration_System href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/GearNet/Configuration_System.md" "Details"
    click Data_Pipeline href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/GearNet/Data_Pipeline.md" "Details"
    click Model_Core_GNNs_ href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/GearNet/Model_Core_GNNs_.md" "Details"
    click Utility_Services href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/GearNet/Utility_Services.md" "Details"


Details

High-level data flow overview for GearNet, a deep learning research framework for protein representation learning.

Configuration System

This component is the central place where all experimental parameters are defined and managed. YAML files specify model hyperparameters, dataset paths, training schedules, and task-specific settings, which keeps experiment configurations flexible and reproducible.
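As a rough illustration of how such settings might be layered, the sketch below merges a set of experiment overrides into a base configuration. It uses plain dictionaries to stand in for parsed YAML; the key names (`dataset`, `model`, `optimizer`, `train`) and the `merge_config` helper are hypothetical and are not taken from the GearNet configuration files.

```python
import copy


def merge_config(base, override):
    """Return a new dict in which `override` replaces or extends `base`, recursively."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are sections: merge them key by key.
            merged[key] = merge_config(merged[key], value)
        else:
            # Scalars, lists, and new keys simply replace the base value.
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical keys, standing in for what a parsed YAML file might contain.
BASE_CONFIG = {
    "dataset": {"path": "data/proteins", "split": [0.8, 0.1, 0.1]},
    "model": {"hidden_dims": [512, 512, 512], "num_relation": 7},
    "optimizer": {"lr": 1e-4},
    "train": {"num_epoch": 50, "batch_size": 2},
}

# A short debugging run that changes only two values.
experiment = merge_config(
    BASE_CONFIG,
    {"optimizer": {"lr": 1e-3}, "train": {"num_epoch": 5}},
)
```

Because the merge works on a deep copy, the base configuration is left untouched, so the same defaults can be reused across several experiments.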

Related Classes/Methods:

Data Pipeline

This component covers the entire data lifecycle: it acquires and preprocesses raw protein data (e.g., HDF5 files), parses it, featurizes atoms and residues, and transforms the result into structured protein graphs. It also splits datasets into training, validation, and test sets.
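To make the graph-construction and splitting steps concrete, here is a minimal sketch that assumes each residue is represented by a single 3D coordinate. The function names, the two edge types (`sequential` and `spatial`), the 10.0 cutoff, and the split fractions are illustrative assumptions, not the project's actual featurization or defaults.

```python
import math
import random


def build_residue_graph(coords, radius=10.0, seq_window=1):
    """Return directed edges (src, dst, relation) over residues.

    Residues close in the sequence get a "sequential" edge; other pairs
    within `radius` of each other get a "spatial" edge.
    """
    edges = []
    n = len(coords)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if abs(i - j) <= seq_window:
                edges.append((i, j, "sequential"))
            elif math.dist(coords[i], coords[j]) <= radius:
                edges.append((i, j, "spatial"))
    return edges


def split_indices(n, fractions=(0.8, 0.1, 0.1), seed=0):
    """Shuffle sample indices reproducibly and cut them into train/valid/test."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = int(fractions[0] * n)
    n_valid = int(fractions[1] * n)
    return (
        indices[:n_train],
        indices[n_train:n_train + n_valid],
        indices[n_train + n_valid:],
    )


# Four residues on a line; the last one is far from the rest.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (50.0, 0.0, 0.0)]
edges = build_residue_graph(coords)
train_idx, valid_idx, test_idx = split_indices(10)
```

The quadratic pair loop is fine for a sketch; a real pipeline would typically use a spatial index or a batched tensor operation instead.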

Related Classes/Methods:

Model Core (GNNs)

This component implements the fundamental graph neural network layers and orchestrates their combination into complete deep learning models for protein representation. It defines how information propagates across protein graphs, incorporating geometric and relational inductive biases.
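The idea of relation-aware propagation can be sketched in a few lines of dependency-free Python. The layer below is a generic relational graph convolution, where each edge type has its own weight matrix and each node sums the transformed features of its incoming neighbors before a ReLU. It is a simplified stand-in chosen for illustration, not the layer implemented in this repository, and all names are hypothetical.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def relational_message_passing(features, edges, weights):
    """One layer: h_i' = ReLU(sum over edges (j -> i, r) of W_r @ h_j).

    features: list of per-node feature vectors
    edges:    list of (src, dst, relation) triples
    weights:  dict mapping relation name to a weight matrix
    """
    out_dim = len(next(iter(weights.values())))
    updated = [[0.0] * out_dim for _ in features]
    for src, dst, relation in edges:
        # Transform the source features with the relation-specific weights.
        message = matvec(weights[relation], features[src])
        updated[dst] = [a + m for a, m in zip(updated[dst], message)]
    return [[max(0.0, v) for v in node] for node in updated]


features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
edges = [(0, 1, "sequential"), (2, 1, "spatial")]
weights = {
    "sequential": [[1, 0], [0, 1]],   # identity
    "spatial": [[0, -1], [2, 0]],
}
hidden = relational_message_passing(features, edges, weights)
```

Here node 1 receives `[1, 0]` over the sequential edge and `[-1, 2]` over the spatial edge, so its updated feature is `[0, 2]`; nodes without incoming edges stay at zero.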

Related Classes/Methods:

Training & Evaluation Engine

This component manages the entire model lifecycle: pre-training, fine-tuning, and evaluation on various protein-related tasks. It sets up and runs training loops and handles validation, testing, metric computation, learning rate scheduling, and model checkpointing, often through torchdrug solvers.
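The interplay of training, validation, learning rate scheduling, and checkpointing can be shown on a toy problem. The sketch below fits a one-parameter model `y = w * x` with plain gradient descent, decays the learning rate in steps, and keeps the best state by validation loss. It does not use torchdrug; the function names, the dictionary-based "model", and the hyperparameter values are assumptions made for illustration.

```python
import copy


def mean_squared_error(model, data):
    """Average squared error of the toy model y = w * x on (x, y) pairs."""
    return sum((model["w"] * x - y) ** 2 for x, y in data) / len(data)


def train(model, train_data, valid_data, num_epoch=20, lr=0.1, decay=0.5, step_size=10):
    """Train in place and return the best checkpoint by validation loss."""
    best = {"epoch": -1, "loss": float("inf"), "state": None}
    for epoch in range(num_epoch):
        # Step schedule: multiply the rate by `decay` every `step_size` epochs.
        current_lr = lr * decay ** (epoch // step_size)
        for x, y in train_data:
            grad = 2 * (model["w"] * x - y) * x
            model["w"] -= current_lr * grad
        valid_loss = mean_squared_error(model, valid_data)
        if valid_loss < best["loss"]:
            # "Checkpoint": keep a copy of the best parameters seen so far.
            best = {"epoch": epoch, "loss": valid_loss, "state": copy.deepcopy(model)}
    return best


model = {"w": 0.0}
best = train(model, [(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)])
```

Selecting the checkpoint on validation loss rather than on the final epoch is the same design choice a full solver makes when it reloads the best model before testing.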

Related Classes/Methods:

Utility Services

This component provides the helper functions the rest of the framework depends on: setting up logging, managing working directories, parsing command-line arguments, and dynamically constructing core framework elements such as torchdrug solvers and learning rate schedulers from the loaded configuration.
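One common way to construct objects dynamically from a configuration is a name-to-class registry. The sketch below assumes a configuration section that carries a `class` key plus constructor arguments; the `REGISTRY`, `register`, `build`, and `StepLR` names are hypothetical and do not reproduce the torchdrug API or this repository's helpers.

```python
REGISTRY = {}


def register(name):
    """Class decorator that records a class under a string name."""
    def decorator(cls):
        REGISTRY[name] = cls
        return cls
    return decorator


@register("StepLR")
class StepLR:
    """Toy scheduler: scales the learning rate by `gamma` every `step_size` epochs."""

    def __init__(self, step_size, gamma=0.1):
        self.step_size = step_size
        self.gamma = gamma

    def factor(self, epoch):
        return self.gamma ** (epoch // self.step_size)


def build(config):
    """Instantiate the class named by config["class"] with the remaining keys."""
    kwargs = dict(config)          # copy so the caller's config is not mutated
    name = kwargs.pop("class")
    if name not in REGISTRY:
        raise KeyError(f"unknown class in config: {name!r}")
    return REGISTRY[name](**kwargs)


scheduler_config = {"class": "StepLR", "step_size": 5, "gamma": 0.5}
scheduler = build(scheduler_config)
```

With this pattern, swapping a scheduler or solver becomes a one-line change in the configuration file rather than a code change.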

Related Classes/Methods: