Introduce PAX header support #5
Open
piurafunk wants to merge 9 commits into jakubboucek:master
jakubboucek
requested changes
Jan 2, 2025
Author
9de2244 and 27fcf44 were meant to fix a PHPStan error. By introducing checks for 0-length tar entries, those commits break legitimate entries that have 0 length, such as directories. I'm fine rolling those commits back and instead instructing PHPStan to ignore that specific error on that line with an ignore comment.
Author
I went ahead and applied that PHPStan ignore comment. I also fixed another bug I found, related to null byte padding.
Owner
Thanks. I am still very busy, and I expect a delay of a few months before I can review and merge. Sorry about that.
In my use case, I'm pulling very large files from a Docker volume using the Docker HTTP API endpoint. Because the files I'm pulling are larger than 8 GiB, the Docker engine prefixes the data stream with a PAX header (docs). It does this to indicate that the file is larger than 8 GiB (the `tar` format has a limitation that means it can only record file sizes up to 8 GiB).

This PR should introduce support for the `size` parameter in a PAX header. There are other headers that can be implemented, but I wanted to get this feature in because it's the core of how the header can be processed. As other use cases come up, they can be added.

The test added takes quite a long time to run. It tests tar archives that contain 1 x 10 GiB file, 2 x 10 GiB files, and 3 x 10 GiB files, to ensure it works on a directory of very large files. Currently the test is configured to run if the `CI` env var is set, which GitHub Actions sets on its workflows (docs). For local development, the tests will be marked as skipped, but a local env var of `TEST_MASSIVE_FILE=true` allows them to run as well. This way we know the tests pass before merging a PR, but development of features unrelated to these large files is faster to iterate on. If this should be changed so that CI runs are faster, I am happy to do so.
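
For readers unfamiliar with the format, here is a rough sketch of how a PAX `size` record can override the classic header's size field. It is written in Python purely for illustration and is not the PHP code in this PR. The function names `parse_pax_records` and `effective_size` are hypothetical. It assumes the standard pax record layout `"<length> <key>=<value>\n"`, where `<length>` counts the whole record including its own digits.

```python
# Illustrative sketch only; names are hypothetical and not part of this library.

# The classic size field holds 11 octal digits, so the largest value it can
# store is 8**11 - 1 bytes, i.e. one byte short of 8 GiB.
MAX_USTAR_SIZE = 8**11 - 1


def parse_pax_records(payload: bytes) -> dict:
    """Parse the data block of a PAX extended header into a key/value dict.

    Each record looks like b"<length> <key>=<value>\\n", where <length> is the
    decimal byte length of the entire record, including the length digits.
    """
    records = {}
    pos = 0
    while pos < len(payload):
        # The data block is padded with NUL bytes to a multiple of 512.
        if payload[pos] == 0:
            break
        space = payload.index(b" ", pos)
        length = int(payload[pos:space])
        record = payload[space + 1 : pos + length]
        if not record.endswith(b"\n"):
            raise ValueError("malformed PAX record: missing trailing newline")
        key, _, value = record[:-1].partition(b"=")
        records[key.decode("utf-8")] = value.decode("utf-8")
        pos += length
    return records


def effective_size(header_size: int, pax: dict) -> int:
    """Prefer the PAX 'size' record over the size stored in the classic header."""
    return int(pax["size"]) if "size" in pax else header_size
```

For example, a 10 GiB file would carry the record `20 size=10737418240\n`; passing the parsed records to `effective_size` yields `10737418240` regardless of what the classic size field holds.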