Config File
Cloudfuse is configured with a YAML file. The file controls the components that run in the pipeline, component options (cache, streaming, storage backends, etc.), and global settings such as logging and health monitoring.
- Start from the base configuration for all available options and defaults: https://github.com/Seagate/cloudfuse/blob/main/setup/baseConfig.yaml
Each configuration file must contain a pipeline section that defines the components used in the pipeline. A pipeline is an ordered set of components. Order matters and must follow priority from highest to lowest:
- libfuse
- stream
- block_cache
- file_cache
- attr_cache
- s3storage
- azstorage

Follow the guidance below when building the components section.
- libfuse is required in every pipeline. It connects Cloudfuse to the OS filesystem (FUSE).
- Choose exactly one data access layer (these components are mutually exclusive):
  - file_cache
  - block_cache
  - stream
- attr_cache should be included to cache file and directory attributes and speed up metadata operations.
- Choose exactly one storage backend:
  - s3storage (for S3-compatible clouds)
    - See S3 storage configuration for information about configuring S3 storage.
  - azstorage (for Azure Blob Storage / Data Lake Storage Gen2)
    - See Azure storage configuration for information about configuring Azure storage.

- file_cache
  - Best for write-heavy workloads and workloads with lots of small files.
  - Caches reads and writes to local disk for faster repeated access.
  - Use when you want a durable local cache and better write performance.
  - See the File Cache guide to configure file_cache.
- block_cache
  - Best for read-heavy workloads with small or large files.
  - Fetches only the ranges you read, reducing I/O and latency for partial reads.
  - Caches reads and writes in memory or on local disk for faster repeated access.
  - Use when you want high-performance reads with quick access to repeated blocks.
  - See the Block cache guide to configure block_cache.
- stream
  - Best for read-heavy access to large objects.
  - Fetches only the ranges you read, reducing I/O and latency for partial reads.
  - Not ideal for write-heavy workloads.
  - See the Streaming guide to configure stream.
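
The selection rules above can be read as slots in the components list. The following is a minimal annotated sketch, assuming block_cache is the chosen data access layer and s3storage the chosen backend. The comments only restate the rules from this page. Per-component options are omitted here and are covered by the guides referenced above.

```yaml
# Illustrative pipeline only; component names are taken from the lists above.
components:
  - libfuse       # required in every pipeline (FUSE interface to the OS)
  - block_cache   # exactly one of: file_cache, block_cache, stream
  - attr_cache    # recommended: caches file and directory attributes
  - s3storage     # exactly one of: s3storage, azstorage
```

Swapping the data access layer or the storage backend changes only the corresponding line; the relative order of the slots stays the same.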

For example, to use file caching with S3 storage, you would have the following in your configuration file:

    components:
      - libfuse
      - file_cache
      - attr_cache
      - s3storage

Or, to use streaming with Azure storage, you would have the following in your configuration file:

    components:
      - libfuse
      - stream
      - attr_cache
      - azstorage

Read the logging documentation for information about setting up logging in the configuration file.
Read the health monitor documentation for information about setting up the health monitor in the configuration file.
Please see the base configuration file for a list of all settings and a brief explanation of each setting.
The following are some sample configuration files to help get you started.