Skip to content

Conversation

@wgtmac
Copy link
Member

@wgtmac wgtmac commented Dec 31, 2025

What is the purpose of the change

  • Add support to specify compression level to the C++ writer.
  • Make all codec enums visible and add runtime availability check.

Verifying this change

  • Extended test/DataFileTests.cc to test compression levels.

Documentation

  • Does this pull request introduce a new feature? yes
  • If yes, how is the feature documented? not applicable

@github-actions github-actions bot added the C++ Pull Requests for C++ binding label Dec 31, 2025
@wgtmac
Copy link
Member Author

wgtmac commented Dec 31, 2025

Could you please take a look? @thiru-mg @martin-g

Copy link
Contributor

@thiru-mg thiru-mg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.
But we can localize all code-specific functionality (stringification, validation, etc.) into a common place by having codec_traits (or other means) and avoid switch statements at multiple places. Another advantage is that if a new codec is added in the future, it forces the developer to think in a structured way.

syncInterval_(syncInterval),
codec_(codec),
compressionLevel_(compressionLevel),
stream_(fileOutputStream(filename)),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This seems to be an old issue: fileOutputStream(filename) creates/truncates the output file before running the validation checks in init(), i.e. an invalid codec/compression_level/syncInterval will lead to an exception but the output file will be already created.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this is an issue. Another constructor DataFileWriter(std::unique_ptr<OutputStream> outputStream, ...) also have the same issue. However, this is a separate issue and fixing it may complicate this PR.

@wgtmac
Copy link
Member Author

wgtmac commented Jan 2, 2026

@thiru-mg @martin-g Thanks for your review! Not sure if the latest change is what you want. Let me know what you think :)

@martin-g martin-g merged commit 11fb555 into apache:main Jan 9, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

C++ Pull Requests for C++ binding

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants