
S3 Compression #5515

@chibenwa

Description


Why ?

Cost simulation for a 250K user deployment:

  • S3 costs = 75.000 euros per year
  • server costs = 50.000 euros per year

And by trading a bit of CPU (say 2.500 euros yearly) we can save 22.500 euros yearly, assuming a conservative 1.5 compression ratio (conservative because base64 already yields a 33% overhead): a 1.5:1 ratio cuts the 75.000 euros of S3 spend by a third, i.e. 25.000 euros, which minus the 2.500 euros of extra CPU leaves 22.500 euros net.

What ?

blob.properties

compression.enabled=true
compression.threshold=16KB
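A minimal sketch of how these two properties could be parsed; the class and method names are hypothetical, not James' actual configuration API:

```java
// Hypothetical holder for the proposed blob.properties entries.
public final class CompressionConfig {
    final boolean enabled;
    final long thresholdBytes;

    CompressionConfig(boolean enabled, long thresholdBytes) {
        this.enabled = enabled;
        this.thresholdBytes = thresholdBytes;
    }

    // Accepts plain byte counts plus KB/MB suffixes, e.g. "16KB" -> 16384.
    static long parseSize(String value) {
        String v = value.trim().toUpperCase();
        if (v.endsWith("KB")) {
            return Long.parseLong(v.substring(0, v.length() - 2).trim()) * 1024;
        }
        if (v.endsWith("MB")) {
            return Long.parseLong(v.substring(0, v.length() - 2).trim()) * 1024 * 1024;
        }
        return Long.parseLong(v);
    }
}
```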

How ?

The idea is to be fully backward compatible.

Upon S3 save, we would compress if:

  • compression is enabled
  • the blob size meets the threshold

Then we would set metadata on the compressed object: content-encoding=zstd and content-original-size=...
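The save-side decision and metadata could look like the sketch below. java.util.zip stands in for zstd-jni so the sketch carries no native dependency; the real implementation would call Zstd.compress and the names here are illustrative, not James' API:

```java
import java.io.ByteArrayOutputStream;
import java.util.HashMap;
import java.util.Map;
import java.util.zip.Deflater;

final class CompressingSave {
    record Payload(byte[] bytes, Map<String, String> metadata) {}

    static Payload prepare(byte[] data, boolean enabled, long thresholdBytes) {
        if (!enabled || data.length < thresholdBytes) {
            return new Payload(data, Map.of()); // stored flat, no extra metadata
        }
        byte[] compressed = deflate(data);
        Map<String, String> metadata = new HashMap<>();
        metadata.put("content-encoding", "zstd"); // zstd in the real implementation
        metadata.put("content-original-size", String.valueOf(data.length));
        return new Payload(compressed, metadata);
    }

    // Deflate stand-in for Zstd.compress(data).
    private static byte[] deflate(byte[] data) {
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        while (!deflater.finished()) {
            out.write(buffer, 0, deflater.deflate(buffer));
        }
        deflater.end();
        return out.toByteArray();
    }
}
```

Keeping the flat path metadata-free is what makes this backward compatible: old blobs and small blobs carry no marker and are served unchanged.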

Upon read:

  • if content-encoding=zstd, decompress
  • otherwise, serve the blob flat (as stored)

Be sure to decompress on a parallel processor, off the I/O threads.
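The read-path dispatch plus the off-thread decompression can be sketched as follows. Inflater stands in for Zstd.decompress and the names are illustrative; the point is that the CPU-bound work runs on a dedicated executor, not the I/O threads:

```java
import java.io.ByteArrayOutputStream;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.zip.Inflater;

final class DecompressingRead {
    static CompletableFuture<byte[]> read(byte[] stored, Map<String, String> metadata,
                                          ExecutorService cpuPool) {
        if (!"zstd".equals(metadata.get("content-encoding"))) {
            return CompletableFuture.completedFuture(stored); // serve flat
        }
        int originalSize = Integer.parseInt(metadata.get("content-original-size"));
        // CPU-bound decompression offloaded to the dedicated pool.
        return CompletableFuture.supplyAsync(() -> inflate(stored, originalSize), cpuPool);
    }

    // Inflate stand-in for Zstd.decompress(compressed, originalSize).
    private static byte[] inflate(byte[] compressed, int originalSize) {
        try {
            Inflater inflater = new Inflater();
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream(originalSize);
            byte[] buffer = new byte[4096];
            while (!inflater.finished()) {
                out.write(buffer, 0, inflater.inflate(buffer));
            }
            inflater.end();
            return out.toByteArray();
        } catch (java.util.zip.DataFormatException e) {
            throw new IllegalStateException("Corrupt compressed blob", e);
        }
    }
}
```

In James' reactive code the same offload would typically be expressed by subscribing the decompression step on a parallel scheduler rather than via CompletableFuture.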

Use com.github.luben:zstd-jni

Cf.:

import com.github.luben.zstd.Zstd;

byte[] compressed = Zstd.compress(data);
// originalSize is read back from the content-original-size metadata
byte[] original = Zstd.decompress(compressed, originalSize);

Evolution

Leverage GC generations to compress only past generations, reducing long-term storage costs without impacting recent data.
