Stores Metaflow state, acting as Metaflow's remote Datastore. The data stored includes, but is not limited to:
- for each flow
  - for each version
    - conda environments
    - dependencies
    - artifacts
      - input
      - output

No duplicate data is stored thanks to automatic deduplication built into Metaflow.

To read more, see the Metaflow docs.

| Name | Description | Type | Default | Required |
|---|---|---|---|---|
| db_engine | n/a | string | "postgres" | no |
| db_engine_version | n/a | string | "16" | no |
| db_instance_type | RDS instance type to launch for PostgreSQL database. | string | "db.t3.small" | no |
| db_name | Name of PostgreSQL database for Metaflow service. | string | "metaflow" | no |
| db_username | PostgreSQL username; defaults to 'metaflow' | string | "metaflow" | no |
| enable_key_rotation | Enable key rotation for KMS keys | bool | false | no |
| force_destroy_s3_bucket | Empty S3 bucket before destroying via terraform destroy | bool | false | no |
| metadata_service_security_group_id | The security group ID used by the MetaData service. We'll grant this access to our DB. | string | n/a | yes |
| metaflow_vpc_id | ID of the Metaflow VPC this SageMaker notebook instance is to be deployed in | string | n/a | yes |
| resource_prefix | Prefix given to all AWS resources to differentiate between applications | string | n/a | yes |
| resource_suffix | Suffix given to all AWS resources to differentiate between environment and workspace | string | n/a | yes |
| standard_tags | The standard tags to apply to every AWS resource. | map(string) | n/a | yes |
| subnet1_id | First subnet used for availability zone redundancy | string | n/a | yes |
| subnet2_id | Second subnet used for availability zone redundancy | string | n/a | yes |

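
As a rough illustration of how the inputs fit together, the sketch below calls the module with every required input set and two optional defaults overridden. The module label `metaflow_datastore`, the `source` path, and all IDs, tags, and prefix/suffix values are placeholders assumed for the example; none of them come from this module's documentation.

```hcl
module "metaflow_datastore" {
  # Hypothetical path: point this at wherever the datastore module actually lives.
  source = "./modules/datastore"

  # Required inputs (all values below are placeholders).
  resource_prefix                    = "myapp-"
  resource_suffix                    = "-dev"
  metaflow_vpc_id                    = "vpc-0123456789abcdef0"
  subnet1_id                         = "subnet-0123456789abcdef0"
  subnet2_id                         = "subnet-0fedcba9876543210"
  metadata_service_security_group_id = "sg-0123456789abcdef0"

  standard_tags = {
    application = "metaflow"
    environment = "dev"
  }

  # Optional inputs; omit these to keep the defaults listed in the table.
  db_instance_type    = "db.t3.small"
  enable_key_rotation = true
}
```

Leaving `force_destroy_s3_bucket` at its default of `false` means `terraform destroy` will not empty the bucket first, which is usually the safer choice for anything other than a throwaway environment.
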
| Name | Description |
|---|---|
| METAFLOW_DATASTORE_SYSROOT_S3 | Amazon S3 URL for Metaflow DataStore |
| METAFLOW_DATATOOLS_S3ROOT | Amazon S3 URL for Metaflow DataTools |
| database_name | The database name |
| database_password | The database password |
| database_username | The database username |
| datastore_s3_bucket_kms_key_arn | The ARN of the KMS key used to encrypt the Metaflow datastore S3 bucket |
| rds_master_instance_endpoint | The database connection endpoint in address:port format |
| s3_bucket_arn | The ARN of the bucket we'll be using as blob storage |
| s3_bucket_name | The name of the bucket we'll be using as blob storage |
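
The outputs can be referenced from the calling configuration as `module.<label>.<output name>`. The sketch below assumes the module was instantiated under the hypothetical label `metaflow_datastore`; the root-level output names are likewise made up for the example.

```hcl
# Re-export the S3 root that Metaflow clients use as their datastore location.
output "metaflow_datastore_sysroot_s3" {
  value = module.metaflow_datastore.METAFLOW_DATASTORE_SYSROOT_S3
}

# Database endpoint in address:port format, e.g. for wiring up the metadata service.
output "metaflow_db_endpoint" {
  value = module.metaflow_datastore.rds_master_instance_endpoint
}

# Marked sensitive so the password is not printed in plan/apply output.
output "metaflow_db_password" {
  value     = module.metaflow_datastore.database_password
  sensitive = true
}
```

Marking the password output as `sensitive` only hides it from CLI output; it is still stored in the Terraform state, so the state backend should be access-controlled accordingly.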