graph LR
Configuration_System["Configuration System"]
Data_Pipeline["Data Pipeline"]
Model_Core_GNNs_["Model Core (GNNs)"]
Training_Evaluation_Engine["Training & Evaluation Engine"]
Utility_Services["Utility Services"]
Configuration_System -- "Defines Parameters For" --> Data_Pipeline
Configuration_System -- "Defines Parameters For" --> Model_Core_GNNs_
Configuration_System -- "Defines Parameters For" --> Training_Evaluation_Engine
Data_Pipeline -- "Provides Data To" --> Training_Evaluation_Engine
Data_Pipeline -- "Configured By" --> Configuration_System
Model_Core_GNNs_ -- "Used By" --> Training_Evaluation_Engine
Model_Core_GNNs_ -- "Configured By" --> Configuration_System
Training_Evaluation_Engine -- "Consumes Data From" --> Data_Pipeline
Training_Evaluation_Engine -- "Trains/Evaluates" --> Model_Core_GNNs_
Training_Evaluation_Engine -- "Utilizes" --> Utility_Services
Utility_Services -- "Loads From" --> Configuration_System
Utility_Services -- "Supports" --> Training_Evaluation_Engine
click Configuration_System href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/GearNet/Configuration_System.md" "Details"
click Data_Pipeline href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/GearNet/Data_Pipeline.md" "Details"
click Model_Core_GNNs_ href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/GearNet/Model_Core_GNNs_.md" "Details"
click Utility_Services href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/GearNet/Utility_Services.md" "Details"
This diagram gives a high-level data flow overview of GearNet, a deep learning research framework for protein representation learning.
Configuration System
This component serves as the central control for defining and managing all experimental parameters. It leverages YAML files to specify model hyperparameters, dataset paths, training schedules, and task-specific settings, ensuring flexible and reproducible experiment configurations.
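As a rough illustration of the kind of settings such a file groups together, the sketch below uses a plain Python dictionary as a stand-in for a parsed YAML document. Every key and value is hypothetical and is not taken from GearNet's actual configuration files. The `override` helper is also an assumption, shown only to illustrate how one experiment parameter might be changed while the base configuration stays intact.

```python
import copy

# Stand-in for a parsed YAML config; all keys and values are hypothetical.
BASE_CONFIG = {
    "dataset": {"path": "~/protein-datasets/", "split": "default"},
    "model": {"hidden_dims": [512, 512, 512], "num_relation": 7},
    "train": {"num_epoch": 50, "batch_size": 2, "lr": 1e-4},
}


def override(config, dotted_key, value):
    """Return a copy of `config` with the nested key `dotted_key` set to `value`."""
    result = copy.deepcopy(config)
    node = result
    *parents, leaf = dotted_key.split(".")
    for name in parents:
        node = node[name]  # raises KeyError if a section is missing
    if leaf not in node:
        raise KeyError(f"unknown config key: {dotted_key}")
    node[leaf] = value
    return result


# Derive one experiment variant without mutating the base configuration.
experiment = override(BASE_CONFIG, "train.lr", 5e-4)
```

Returning a copy rather than editing in place keeps the base settings available for comparison, which is one simple way to support reproducible runs.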
Related Classes/Methods:
config (1:1), util (1:1)
Data Pipeline
This component is responsible for the entire data lifecycle: acquiring and preprocessing raw protein data (e.g., HDF5 files), parsing it, featurizing atoms and residues, and transforming the result into structured protein graphs. It also handles dataset splitting for training, validation, and testing.
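The splitting step can be sketched in a few lines. This is a minimal example that assumes a seeded random split; the framework itself may rely on predefined benchmark splits instead, and the function name, fractions, and protein identifiers below are all hypothetical.

```python
import random


def split_dataset(items, fractions=(0.8, 0.1, 0.1), seed=0):
    """Shuffle `items` reproducibly and cut them into train/valid/test lists."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # local RNG, global state untouched
    n_train = int(len(shuffled) * fractions[0])
    n_valid = int(len(shuffled) * fractions[1])
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]  # remainder, so no item is dropped
    return train, valid, test


protein_ids = [f"protein_{i}" for i in range(20)]  # hypothetical identifiers
train, valid, test = split_dataset(protein_ids)
```

Fixing the seed makes the split repeatable across runs, so results from different experiments are evaluated on the same held-out items.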
Related Classes/Methods:
Model Core (GNNs)
This component implements the fundamental graph neural network layers and orchestrates their combination into complete deep learning models for protein representation. It defines how information propagates across protein graphs, incorporating geometric and relational inductive biases.
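To make "information propagates across protein graphs" concrete, here is a deliberately simplified sketch of one relational message-passing round. It is not GearNet's layer: features are single floats rather than vectors, each relation type gets one scalar weight instead of a learned matrix, and the relation names are hypothetical.

```python
from collections import defaultdict


def relational_message_passing(features, edges, relation_weights, self_weight=1.0):
    """One round of sum aggregation with a separate scalar weight per relation type.

    features: {node: float}; edges: [(source, target, relation)].
    """
    incoming = defaultdict(float)
    for source, target, relation in edges:
        # Messages along different relation types are scaled differently.
        incoming[target] += relation_weights[relation] * features[source]
    # ReLU over the node's own term plus its aggregated incoming messages.
    return {
        node: max(0.0, self_weight * value + incoming[node])
        for node, value in features.items()
    }


# Three residues with hypothetical scalar features and two edge types.
features = {0: 1.0, 1: 2.0, 2: -3.0}
edges = [(0, 1, "sequential"), (1, 2, "sequential"), (0, 2, "spatial")]
weights = {"sequential": 0.5, "spatial": 0.25}
updated = relational_message_passing(features, edges, weights)
```

Keeping one weight per relation type is what lets a model treat, for example, sequence neighbours and spatial neighbours differently, which is the relational inductive bias mentioned above.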
Related Classes/Methods:
Training & Evaluation Engine
This component manages the entire model lifecycle, covering pre-training, fine-tuning, and evaluation on various protein-related tasks. It sets up and executes training loops and handles validation, testing, metric evaluation, learning rate scheduling, and model checkpointing, often leveraging torchdrug solvers.
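The interplay of training, validation, learning rate scheduling, and checkpointing can be sketched on a toy problem. The example below does not use torchdrug; it fits a single weight by gradient descent, and the function name, the step-decay schedule, and the in-memory "checkpoint" are all assumptions chosen for illustration.

```python
def run_training(train_data, valid_data, num_epoch=20, lr=0.1, decay=0.5, decay_every=5):
    """Fit y = w * x by gradient descent, keeping the weight with the best validation loss."""

    def mse(w, data):
        return sum((w * x - y) ** 2 for x, y in data) / len(data)

    w = 0.0
    best = {"epoch": -1, "w": w, "valid_loss": mse(w, valid_data)}
    for epoch in range(num_epoch):
        # Step decay: multiply the learning rate by `decay` every `decay_every` epochs.
        current_lr = lr * decay ** (epoch // decay_every)
        grad = sum(2 * (w * x - y) * x for x, y in train_data) / len(train_data)
        w -= current_lr * grad
        valid_loss = mse(w, valid_data)
        if valid_loss < best["valid_loss"]:
            # In-memory stand-in for saving a checkpoint of the best model so far.
            best = {"epoch": epoch, "w": w, "valid_loss": valid_loss}
    return best


train_data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # toy data with y = 2x
valid_data = [(4.0, 8.0)]
best = run_training(train_data, valid_data)
```

Selecting the checkpoint by validation loss rather than by the final epoch is a common way to guard against late-stage overfitting.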
Related Classes/Methods:
Utility Services
This component provides foundational helper functions essential for the framework's operation. These include setting up logging, managing working directories, parsing command-line arguments, and dynamically constructing core framework elements, such as torchdrug solvers and learning rate schedulers, based on the loaded configuration.
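One common way to construct objects dynamically from a configuration is a name-to-class registry. The sketch below illustrates that pattern only; the classes, the `REGISTRY` mapping, and the `build` helper are hypothetical and do not reproduce the torchdrug or GearNet API.

```python
class StepLR:
    """Hypothetical scheduler: scales a base learning rate every `step_size` epochs."""

    def __init__(self, step_size, gamma=0.1):
        self.step_size = step_size
        self.gamma = gamma

    def rate(self, base_lr, epoch):
        return base_lr * self.gamma ** (epoch // self.step_size)


class ConstantLR:
    """Hypothetical scheduler that never changes the learning rate."""

    def rate(self, base_lr, epoch):
        return base_lr


# Maps the names allowed in a config file to the classes they refer to.
REGISTRY = {"StepLR": StepLR, "ConstantLR": ConstantLR}


def build(spec):
    """Instantiate the class named by spec["class"], passing the remaining keys as kwargs."""
    kwargs = dict(spec)  # copy so the caller's config is not modified
    name = kwargs.pop("class")
    if name not in REGISTRY:
        raise ValueError(f"unknown class: {name}")
    return REGISTRY[name](**kwargs)


scheduler = build({"class": "StepLR", "step_size": 10, "gamma": 0.5})
```

With this arrangement, swapping a scheduler means changing one entry in the configuration rather than editing training code.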
Related Classes/Methods: